This post is the third, and final, in a series of posts on mathematical camera representation. The following are links to the earlier two entries in this series:
- Camera Representation Part 1: Homogenous Coordinate Systems and the Simplest Camera Imaginable
- Camera Representation Part 2: Moving the Camera and the Image
This post extends the model developed in the previous two posts with two final concepts: the ability to handle non-square pixels in an image and the ability to handle skewed images.
For the rest of this discussion, the form of the solution for finding the projection matrix will remain the same as in Part 2. That is, the projection matrix $P$ can be found by incorporating the camera rotation matrix $R$, the 3-vector $\tilde{C}$, and the upper-triangular intrinsic camera parameter matrix $K$ as
$$P = K R \, [\, I \mid -\tilde{C} \,]$$
The intrinsic camera parameter matrix $K$ defined in Part 2 will be updated to take into account non-square pixels and skew. It hopefully makes sense that $K$ is where these changes take place, since pixel dimensions and image skew are intrinsic to the camera and do not relate to the camera's extrinsic location in the world.
Digital camera sensors can have rectangular rather than square pixels. When the pixels are rectangular, the camera model must scale the image by different amounts along the x- and y-axes. We now update the definition of the intrinsic camera parameter matrix to be:
$$K = \begin{bmatrix} \alpha_x & 0 & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{bmatrix}$$
Here, $\alpha_x = f m_x$ and $\alpha_y = f m_y$, where $f$ is the focal length, $m_x$ is the number of pixels per unit distance in x, and $m_y$ is the number of pixels per unit distance in y. The principal point $(x_0, y_0)$ is now measured in pixels.
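To make the effect of rectangular pixels concrete, here is a minimal NumPy sketch. It assumes the intrinsic matrix has the usual form with $\alpha_x = f m_x$ and $\alpha_y = f m_y$ on the diagonal and the principal point in the last column. The function name `intrinsic_matrix` and all sensor values are made up for illustration.

```python
import numpy as np

def intrinsic_matrix(f, m_x, m_y, x_0, y_0):
    """Build K for a camera with rectangular pixels and no skew.

    f        : focal length in world units (e.g. mm)
    m_x, m_y : pixels per unit distance along x and y
    x_0, y_0 : principal point, in pixels
    """
    alpha_x = f * m_x  # focal length expressed in x-pixels
    alpha_y = f * m_y  # focal length expressed in y-pixels
    return np.array([[alpha_x, 0.0, x_0],
                     [0.0, alpha_y, y_0],
                     [0.0, 0.0, 1.0]])

# Hypothetical sensor: 4 mm focal length, 100 px/mm in x, 80 px/mm in y.
K = intrinsic_matrix(f=4.0, m_x=100.0, m_y=80.0, x_0=320.0, y_0=240.0)

# A point expressed in the camera frame, projected to pixel coordinates.
X_cam = np.array([0.5, 0.5, 2.0])
x_h = K @ X_cam            # homogeneous pixel coordinates
pixel = x_h[:2] / x_h[2]   # divide out the homogeneous scale
print(pixel)
```

The point is displaced equally in x and y in the camera frame, yet it lands 100 pixels from the principal point in x and only 80 pixels in y, because the two axes are scaled by different amounts.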
The final parameter we will add to our model is the skew parameter $s$. The skew parameter models how the x- and y-axes are aligned in the image plane. In most cases, the axes are perpendicular and $s = 0$. If the x- and y-axes are not perpendicular, then $s \neq 0$.
Incorporating the skew parameter into the intrinsic camera parameter matrix, we get
$$K = \begin{bmatrix} \alpha_x & s & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{bmatrix}$$
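As a quick illustration of what the skew term does, the sketch below projects the same camera-frame point with and without skew. It assumes the skew parameter sits in the first row, second column of the intrinsic matrix, as in the standard upper-triangular form. The helper names and numeric values are hypothetical.

```python
import numpy as np

def intrinsic_matrix_with_skew(alpha_x, alpha_y, s, x_0, y_0):
    """Upper-triangular intrinsic matrix with the skew term s (assumed layout)."""
    return np.array([[alpha_x, s, x_0],
                     [0.0, alpha_y, y_0],
                     [0.0, 0.0, 1.0]])

def project(K, X_cam):
    """Project a camera-frame 3D point to inhomogeneous pixel coordinates."""
    x_h = K @ X_cam
    return x_h[:2] / x_h[2]

# A point directly "above" the optical axis in the camera frame.
X_cam = np.array([0.0, 1.0, 2.0])

K_no_skew = intrinsic_matrix_with_skew(400.0, 320.0, 0.0, 320.0, 240.0)
K_skewed = intrinsic_matrix_with_skew(400.0, 320.0, 20.0, 320.0, 240.0)

p_no_skew = project(K_no_skew, X_cam)
p_skewed = project(K_skewed, X_cam)
print(p_no_skew, p_skewed)
```

With a non-zero skew, the pixel x-coordinate picks up a contribution proportional to the point's y-coordinate, while the pixel y-coordinate is unchanged.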
Final Note on Degrees of Freedom
The camera projection matrix $P$ is a homogeneous transform, which means that two projection matrices are equivalent if the only difference between them is a non-zero scaling coefficient. That is, $P_1 \sim P_2$ if $P_1 = c P_2$, where $c$ is a non-zero constant. Practically, this means that a projection matrix has 11 degrees of freedom despite being a $3 \times 4$ matrix with 12 entries.
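The scale equivalence is easy to check numerically. The sketch below uses an arbitrary, made-up $3 \times 4$ matrix as a stand-in for a projection matrix and shows that multiplying it by a non-zero constant leaves the projected pixel unchanged, since the constant cancels in the homogeneous division.

```python
import numpy as np

def project(P, X_world):
    """Project a 3D world point with a 3x4 projection matrix P."""
    X_h = np.append(X_world, 1.0)  # homogeneous world point
    x_h = P @ X_h                  # homogeneous image point
    return x_h[:2] / x_h[2]        # any overall scale cancels here

# Arbitrary example matrix (not derived from a real camera).
P = np.array([[400.0, 0.0, 320.0, 100.0],
              [0.0, 320.0, 240.0, 50.0],
              [0.0, 0.0, 1.0, 2.0]])

X_world = np.array([1.0, 2.0, 3.0])

p_original = project(P, X_world)
p_scaled = project(-7.5 * P, X_world)  # same matrix up to a non-zero scale
print(p_original, p_scaled)
```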
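Going into a bit more depth, we can expand out our projection matrix as
$$P = \begin{bmatrix} \alpha_x & s & x_0 \\ 0 & \alpha_y & y_0 \\ 0 & 0 & 1 \end{bmatrix} R \, [\, I \mid -\tilde{C} \,]$$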
We can now count our degrees of freedom:
- $K$ has 5 degrees of freedom: it has 6 non-zero elements, but it is homogeneous and defined only up to scale. That is, $K$ has only five independent elements.
- $R$ is a rotation matrix, and therefore has only 3 degrees of freedom (roll, pitch, and yaw).
- $\tilde{C}$ has 3 degrees of freedom since it defines a translation in 3-dimensional space which links the camera position with the world origin.
Thus, by simple addition, the camera projection matrix has 11 degrees of freedom.
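The count above can be sketched in code by building a projection matrix from exactly 11 numbers. This is an illustrative sketch, not a canonical parameterization. It assumes the projection matrix is composed as $K R [I \mid -\tilde{C}]$, that the third 3-vector is the camera position in world coordinates, and that roll, pitch, and yaw are rotations about x, y, and z applied in that order. The function names and the Euler-angle convention are assumptions.

```python
import numpy as np

def rotation_from_rpy(roll, pitch, yaw):
    """Rotation matrix from roll (x), pitch (y), yaw (z); assumed convention."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    R_x = np.array([[1.0, 0.0, 0.0], [0.0, cr, -sr], [0.0, sr, cr]])
    R_y = np.array([[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]])
    R_z = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    return R_z @ R_y @ R_x

def projection_matrix(params):
    """Assemble a 3x4 projection matrix from 11 parameters (5 + 3 + 3)."""
    alpha_x, alpha_y, s, x_0, y_0, roll, pitch, yaw, c_x, c_y, c_z = params
    # 5 intrinsic parameters
    K = np.array([[alpha_x, s, x_0],
                  [0.0, alpha_y, y_0],
                  [0.0, 0.0, 1.0]])
    # 3 rotation parameters
    R = rotation_from_rpy(roll, pitch, yaw)
    # 3 translation parameters (camera position, assumed world coordinates)
    C = np.array([c_x, c_y, c_z])
    return K @ R @ np.hstack([np.eye(3), -C.reshape(3, 1)])

# Made-up parameter values for illustration.
params = [400.0, 320.0, 0.0, 320.0, 240.0,  # intrinsics
          0.1, -0.2, 0.3,                   # roll, pitch, yaw in radians
          1.0, 2.0, 3.0]                    # camera position
P = projection_matrix(params)
print(P.shape, len(params))
```

Under these assumptions, one sanity check is that the camera position itself, written as a homogeneous point, maps to the zero vector under `P`.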
And with that, we are finished with our discussion of the mathematical camera model. I hope that you have found this useful!