High precision robot-to-camera calibration

BramHeerschap · April 13, 2026, 10:00am

In our setup, high precision is required because the parts we handle are very small. Even a slight misalignment between the camera and the robot leads to incorrect positioning. To solve this, we implemented a calibration pipeline that maps camera pixels to board coordinates and then to robot coordinates.
Concept
The system uses two calibration steps:

1. Camera → board mapping
   a. This determines where a pixel lies on a known flat surface (the workspace).
2. Board → robot mapping
   a. This converts positions on the workspace into robot coordinates.
By separating the problem, we simplify calibration and make it easier to debug. The result is a transformation chain from image pixel to robot movement.
How we did it:
Camera to board calibration
We place four ArUco markers on a flat surface in known positions. These positions are defined in millimeters:

WORLD_POINTS_MM = {
    2: (0.0, 0.0),
    0: (100.0, 0.0),
    1: (0.0, 80.0),
    3: (100.0, 80.0),
}

Instead of only using the center of each marker (which would give just 4 points), we used multiple points per marker:

4 corners per marker
1 center point per marker
This results in 5 points x 4 markers = 20 points used for calibration. These are matched with the detected image points and used in solvePnP:

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, camera_matrix, dist_coeffs)

Using more points significantly improves accuracy because:

Noise in detection is averaged out
Small errors in individual points have less impact
The solution becomes more stable
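As a rough illustration of how the 20 object points could be assembled, here is a minimal sketch. It makes several assumptions that are not stated in the post: that WORLD_POINTS_MM holds the marker centers, that the markers have a side length of MARKER_SIZE_MM (a hypothetical value), that they lie unrotated on the board, and that the board's y axis points downward as seen in the image, so the corner order matches the usual ArUco order (top-left, top-right, bottom-right, bottom-left). The helper names are made up for this example.

```python
import numpy as np

WORLD_POINTS_MM = {
    2: (0.0, 0.0),
    0: (100.0, 0.0),
    1: (0.0, 80.0),
    3: (100.0, 80.0),
}
MARKER_SIZE_MM = 20.0  # hypothetical side length, measure your own markers


def marker_object_points(center_xy, size_mm):
    """Return 4 corners + 1 center of one marker on the board plane (Z = 0)."""
    cx, cy = center_xy
    h = size_mm / 2.0
    return np.array(
        [
            [cx - h, cy - h, 0.0],  # top-left
            [cx + h, cy - h, 0.0],  # top-right
            [cx + h, cy + h, 0.0],  # bottom-right
            [cx - h, cy + h, 0.0],  # bottom-left
            [cx, cy, 0.0],          # center
        ],
        dtype=np.float32,
    )


def build_object_points(world_points, size_mm):
    """Stack the points of all markers in ascending marker-id order."""
    ids = sorted(world_points)
    pts = np.vstack([marker_object_points(world_points[i], size_mm) for i in ids])
    return ids, pts


marker_ids, object_points = build_object_points(WORLD_POINTS_MM, MARKER_SIZE_MM)
print(object_points.shape)  # 5 points x 4 markers, each with (X, Y, Z)
```

The detected image points then have to be stacked in exactly the same marker and corner order before both arrays are passed to solvePnP, otherwise the correspondences are wrong.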
After estimating the pose, we compute a homography to map image pixels to the board plane:
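import cv2
import numpy as np

def pixel_to_board(x, y, H):
    # perspectiveTransform expects a 3-D float array of points, here shape (1, 1, 2)
    pt = np.array([[[x, y]]], dtype=np.float32)
    out = cv2.perspectiveTransform(pt, H)
    return out[0, 0]  # (x_mm, y_mm) on the board plane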
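The post does not show how H is derived from the pose, so the following is only one possible way to do it. For points on the board plane (Z = 0), the pinhole projection reduces to a 3x3 matrix built from the camera matrix, the first two columns of the rotation matrix, and the translation; inverting it gives the pixel → board homography. The function name homography_from_pose is hypothetical, and the sketch assumes lens distortion is negligible or that pixels have already been undistorted.

```python
import numpy as np


def homography_from_pose(R, tvec, camera_matrix):
    """Return a 3x3 matrix mapping (undistorted) pixels to board-plane mm.

    R is the 3x3 rotation matrix of the board pose, e.g. obtained with
    R, _ = cv2.Rodrigues(rvec) from the solvePnP result.
    """
    R = np.asarray(R, dtype=np.float64).reshape(3, 3)
    t = np.asarray(tvec, dtype=np.float64).reshape(3)
    K = np.asarray(camera_matrix, dtype=np.float64).reshape(3, 3)

    # A board point (X, Y, 0) projects to K @ [r1 r2 t] @ (X, Y, 1)^T,
    # so this matrix maps board coordinates to homogeneous pixels.
    board_to_image = K @ np.column_stack((R[:, 0], R[:, 1], t))

    # The inverse maps pixels back onto the board plane.
    return np.linalg.inv(board_to_image)
```

The result can be passed as H to pixel_to_board. If the lens distortion is not negligible, the pixel would first need to be undistorted (for example with cv2.undistortPoints, passing the camera matrix as P), since a homography only models the ideal pinhole projection.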

Board to robot calibration
For the robot calibration we also used more than the minimum number of points. Instead of just 3 to 4 points, we used multiple measurements per marker:

Each marker is touched twice
   a. Once on the vertical edge
   b. Once on the horizontal edge
This gives 8 calibration points in total, spread across the workspace.
These known board positions are paired with robot coordinates:
M, _ = cv2.estimateAffine2D(board_pts, robot_pts)
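Using more points here improves precision in the same way as in the camera step: the error of a single touch measurement has less influence on the fitted transformation.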
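One way to check how well the fitted transformation matches the touched points is to map the board points through M and look at the per-point error. This is a minimal sketch, not part of the original pipeline; affine_residuals is a hypothetical helper, and it assumes M is the 2x3 matrix returned by cv2.estimateAffine2D.

```python
import numpy as np


def affine_residuals(M, board_pts, robot_pts):
    """Return the distance between predicted and measured robot points."""
    M = np.asarray(M, dtype=np.float64).reshape(2, 3)
    board = np.asarray(board_pts, dtype=np.float64).reshape(-1, 2)
    robot = np.asarray(robot_pts, dtype=np.float64).reshape(-1, 2)

    # Apply the affine map: linear part (2x2) plus translation (last column)
    predicted = board @ M[:, :2].T + M[:, 2]

    # One Euclidean error per calibration point, in robot units
    return np.linalg.norm(predicted - robot, axis=1)
```

A single point with a much larger residual than the others usually indicates a bad touch measurement. Note that cv2.estimateAffine2D uses RANSAC by default, and its second return value (discarded as `_` above) is the inlier mask, which can also be inspected to see which points were rejected.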

Final conversion
After both calibrations are completed, the full pipeline is straightforward:
Pixel → board
x_mm, y_mm = pixel_to_board(px, py, H)
Board → robot
rx, ry = board_to_robot(x_mm, y_mm)
This allows the robot to move to a point detected by the camera.
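The post does not show board_to_robot itself. A minimal sketch of what it could look like is given below, assuming M is the 2x3 affine matrix from cv2.estimateAffine2D. Unlike the call shown above, this version takes M as an explicit third argument so that it does not depend on a global variable.

```python
import numpy as np


def board_to_robot(x_mm, y_mm, M):
    """Map a board position (mm) to robot coordinates with a 2x3 affine matrix."""
    M = np.asarray(M, dtype=np.float64).reshape(2, 3)
    # Homogeneous board point (x, y, 1) times the affine matrix
    rx, ry = M @ np.array([x_mm, y_mm, 1.0])
    return float(rx), float(ry)
```

With this signature, the last step of the chain becomes rx, ry = board_to_robot(x_mm, y_mm, M).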

How can you do it?
To reproduce this setup, you only need a camera, a robot and four markers placed in known positions. First, detect the markers using OpenCV and compute the camera pose with solvePnP. Then convert that pose into a plane mapping (homography). After that, manually collect a few corresponding board and robot points and compute the affine transformation.
Finally, test the system by selecting points in the image and verifying that the robot reaches the expected location. If necessary, accuracy can be improved by using more calibration points or refining detection.

Conclusion
This method provides a simple and effective way to achieve high-precision robot-to-camera calibration. By splitting the problem into two steps and using standard OpenCV functions, the system remains easy to implement while still achieving reliable results. The overall accuracy depends mainly on the quality of marker detection and the precision of the manually recorded robot points.