Camera Lidar Calibration

These are notes from my internship at Hesaitech.

Aligning a camera with a lidar matters for many computer vision applications. Given known camera intrinsic parameters, we need to obtain the extrinsic parameters from the lidar to the camera, that is, the rigid-body transformation from the lidar coordinate system to the camera coordinate system, consisting of the rotation matrix R and the translation t.

The correspondence between a point of the point cloud in the lidar coordinate system and a point on the pixel plane is as follows.

Assume that the lidar coordinate system is the world coordinate system, and let a point in it be
$$ (x_w,y_w,z_w)^T. $$
Applying R and t transforms it into the camera coordinate system:
$$ (x_c,y_c,z_c)^T=R(x_w,y_w,z_w)^T+t $$
Projecting onto the imaging plane Z=1 gives the normalized coordinates
$$ p=(x_1,y_1)^T=(x_c/z_c,y_c/z_c)^T $$

Applying the distortion transformation gives new coordinates
$$ (x_D,y_D)^T=\phi_D(x_1,y_1)^T $$
Finally, the intrinsic matrix K maps these to coordinates on the pixel plane (in homogeneous form):
$$ (x_p,y_p,1)^T=K(x_D,y_D,1)^T. $$
We denote the entire mapping above by f:
$$ (x_p,y_p)^T=f(x_w,y_w,z_w) $$
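The mapping f can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `project` is hypothetical, and the distortion model used here (two radial coefficients `k1`, `k2`) is an assumption, since the notes do not specify the form of the distortion transformation.

```python
import numpy as np

def project(points_w, R, t, K, dist=(0.0, 0.0)):
    """Map Nx3 lidar points to Nx2 pixel coordinates (sketch of f)."""
    points_w = np.asarray(points_w, dtype=float).reshape(-1, 3)
    # Rigid transform into the camera frame: p_c = R p_w + t.
    pc = points_w @ np.asarray(R, dtype=float).T + np.asarray(t, dtype=float)
    # Projection onto the Z = 1 plane (the nonlinear step).
    xn = pc[:, :2] / pc[:, 2:3]
    # Assumed distortion model: radial terms only, with coefficients k1, k2.
    k1, k2 = dist
    r2 = np.sum(xn ** 2, axis=1, keepdims=True)
    xd = xn * (1.0 + k1 * r2 + k2 * r2 ** 2)
    # Apply the intrinsic matrix K in homogeneous coordinates.
    xd_h = np.hstack([xd, np.ones((len(xd), 1))])
    pix = xd_h @ np.asarray(K, dtype=float).T
    return pix[:, :2]
```

Points with a camera-frame depth of zero or less are not handled here; a real pipeline would filter them out before projecting.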

To obtain the extrinsic parameters R and t, we use our own calibration method, which we call L2L (3D-2D). It relies on the correspondence between a 3D line and a 2D line: the pixels marked on a line in the image should lie on the projection of the matching 3D line. (A 3D line is still a line after being projected to 2D. We determine each 3D line as the intersection of two planes in 3D, whose parameters are computed from several point-cloud points lying on them, and we represent each 2D line by several pixels on it.) The distance between these pixels and the projected 3D line should be as small as possible, so the distance from each pixel to the 3D line projected onto the pixel plane forms our objective function. Given an initial value, we minimize the objective summed over a set of corresponding line pairs to obtain the optimized extrinsic parameters.
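One possible way to extract such a 3D line is sketched below. It assumes the two surfaces are planes, fits each one to its point-cloud points by least squares (via SVD), and intersects them. The function names `fit_plane` and `intersect_planes` are hypothetical, and the notes do not say which fitting method was actually used.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane n.x + d = 0 through Nx3 points; n is a unit normal."""
    points = np.asarray(points, dtype=float)
    centroid = points.mean(axis=0)
    # The normal is the direction of least variance of the centered points.
    _, _, vt = np.linalg.svd(points - centroid)
    n = vt[-1]
    return n, -float(n @ centroid)

def intersect_planes(n1, d1, n2, d2):
    """Return one point on the intersection line and its unit direction."""
    direction = np.cross(n1, n2)
    # Solve n1.x = -d1, n2.x = -d2, direction.x = 0 for a point on the line.
    # The system is singular when the planes are (nearly) parallel.
    A = np.vstack([n1, n2, direction])
    b = np.array([-d1, -d2, 0.0])
    point = np.linalg.solve(A, b)
    return point, direction / np.linalg.norm(direction)
```

Any two distinct points on the returned line, for example `point` and `point + direction`, can then serve as the two 3D points that are projected into the image.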

Suppose
$$ (x_w^{1},y_w^{1},z_w^{1})^T,(x_w^{2},y_w^{2},z_w^{2})^T $$
are two points on the 3D line. Then the projection of the 3D line in 2D can be represented by the projections of these two points (we assume by default that the distortion is small and does not bend the line):
$$ (x_p^{1},y_p^{1})^T=f(x_w^{1},y_w^{1},z_w^{1}),(x_p^{2},y_p^{2})^T=f(x_w^{2},y_w^{2},z_w^{2}). $$

Then we can find the equation of the straight line
$$ Ax+By+C=0. $$
For several pixels known to lie on the corresponding 2D line,
$$ (x_p^a,y_p^a)^T,(x_p^b,y_p^b)^T,\cdots, $$
their distances to the projected line are
$$ d^a=\left\vert\frac{Ax_p^a+By_p^a+C}{\sqrt{A^2+B^2}}\right\vert,\cdots, $$
and the objective function for one line is their sum:
$$ L_{line1}=d^a+d^b+\cdots, $$
The objective function over all lines then becomes
$$ totalLoss=L_{line1}+L_{line2}+\cdots. $$
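The loss above can be written compactly once the two projected endpoints of each 3D line are available. The sketch below works purely in pixel space; the names `line_through`, `line_loss`, and `total_loss` are hypothetical, and each line pair is assumed to be a tuple of two projected endpoints plus an array of marked pixels.

```python
import numpy as np

def line_through(p1, p2):
    """Coefficients (A, B, C) of the line Ax + By + C = 0 through two pixels."""
    (x1, y1), (x2, y2) = p1, p2
    return y1 - y2, x2 - x1, x1 * y2 - x2 * y1

def line_loss(p1, p2, pixels):
    """Sum of distances from the marked pixels to the line through p1 and p2."""
    A, B, C = line_through(p1, p2)
    pixels = np.asarray(pixels, dtype=float).reshape(-1, 2)
    d = np.abs(A * pixels[:, 0] + B * pixels[:, 1] + C) / np.hypot(A, B)
    return float(d.sum())

def total_loss(line_pairs):
    """Sum of the per-line losses over all (p1, p2, pixels) tuples."""
    return sum(line_loss(p1, p2, pixels) for p1, p2, pixels in line_pairs)
```

In the calibration itself, `p1` and `p2` would be recomputed from the current R and t at every evaluation, so that the loss becomes a function of the extrinsic parameters.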

Note that the projection onto the Z=1 plane is a nonlinear operation, which makes the entire loss a nonlinear and very complicated function of R and t that cannot be minimized analytically. We therefore use the global optimization function basinhopping from scipy.optimize in Python to minimize the objective function with respect to R and t.

Although basinhopping is a global optimization function, its result is still strongly affected by the initial value of the search. In addition, R is not well suited to direct optimization, because it has fewer degrees of freedom (3) than entries (9). We therefore describe R by the Euler angles roll, pitch, and yaw (RPY); the map from Euler angles to rotation matrices is surjective, so every rotation remains reachable. We enumerate points in the feasible RPY region and use them as initial values for basinhopping. We try 0 and 180° as the initial value of each angle, 8 combinations in total; empirically, good results are obtained from 4 of these initial values.
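The multi-start search described above might look like the following sketch. Several things here are assumptions rather than the method actually used: the Euler convention R = Rz(yaw) Ry(pitch) Rx(roll), the zero initial translation, the small `niter`, and the helper names `rpy_to_matrix`, `initial_guesses`, and `calibrate`. The local minimizer is left at SciPy's default.

```python
import itertools
import numpy as np
from scipy.optimize import basinhopping

def rpy_to_matrix(roll, pitch, yaw):
    """Rotation matrix Rz(yaw) @ Ry(pitch) @ Rx(roll); the convention is assumed."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def initial_guesses(t0=(0.0, 0.0, 0.0)):
    """The 8 combinations of 0 and 180 degrees for (roll, pitch, yaw), plus t0."""
    return [np.array(rpy + tuple(t0))
            for rpy in itertools.product((0.0, np.pi), repeat=3)]

def calibrate(loss, t0=(0.0, 0.0, 0.0), niter=20):
    """Run basinhopping from every initial guess and keep the lowest loss.

    `loss` takes a 6-vector (roll, pitch, yaw, tx, ty, tz) and returns a scalar.
    """
    best = None
    for x0 in initial_guesses(t0):
        res = basinhopping(loss, x0, niter=niter)
        if best is None or res.fun < best.fun:
            best = res
    return best
```

Inside `loss`, the first three entries of the parameter vector would be turned into R with `rpy_to_matrix`, the last three would be used as t, and the point-to-line loss would be evaluated on the projected lines. Because basinhopping is stochastic, repeated runs can return slightly different results.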

We also tried solving for the extrinsic parameters with point-to-point correspondences, that is, replacing the point-to-line distance with a point-to-point distance. However, we found it difficult to determine the correspondence between 3D points and 2D points. After verifying that the results of this approach were very poor, we abandoned it.