# Camera Representation Part 2: Moving the Camera and the Image

This post is the second in a series on representing cameras mathematically. If you have not read Part 1 yet, or need a quick refresher, please read it here.

## Intrinsic vs. Extrinsic Camera Properties

To move the camera in the world and to move the image on the image plane, we must distinguish between properties that are intrinsic to the camera and those that are extrinsic to it. Extrinsic properties describe the camera’s position and orientation in the world, while intrinsic properties describe things like the location of the image plane origin and image scaling.

To separate out the intrinsic from the extrinsic parameters, we define the camera calibration matrix $\mathbf{K}$ which describes the camera’s intrinsic parameters.  The camera calibration matrix for the simple pinhole camera described in Part 1 is

$$\mathbf{K}= \left[ \begin{array}{ccc} f & 0 & 0\\ 0 & f & 0\\ 0 & 0 & 1\end{array} \right].$$

This camera calibration matrix only takes into account the focal length $f$. However, we now have a description of the intrinsic parameters that is separate from the camera’s position in the world. Let’s now change the camera’s position.

## Setting the Camera Location

The above diagram was introduced back in Part 1, but the projection matrix $\mathbf{P}$ was then calculated assuming that the camera center $\mathbf{C}_w$ was at the origin and the camera points along the z-axis.  We will now generalize and assume that $\mathbf{C}_w$ can be any location in the world, and that the camera can be rotated arbitrarily.

The rotation of the camera is described by a $3 \times 3$ rotation matrix $\mathbf{R}$. Rotation matrices are a common way to mathematically describe an object’s roll, pitch, and yaw in 3-dimensional space. They are used whenever a linear model of 3D orientation is needed; vision, robotics, and graphics are example sub-fields of computer science that use rotation matrices regularly.
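As a small illustration of what such a matrix looks like, here is a minimal sketch in Python with NumPy. The helper name `rotation_z` is made up for this example, and it only covers a rotation about the z-axis; whether that axis corresponds to roll, pitch, or yaw depends on the axis convention you adopt, which this post does not fix.

```python
import numpy as np

def rotation_z(angle_rad):
    """Hypothetical helper: 3x3 rotation by angle_rad about the z-axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

R = rotation_z(np.pi / 2)

# A valid rotation matrix is orthonormal and has determinant +1.
is_orthonormal = np.allclose(R @ R.T, np.eye(3))
has_unit_det = np.isclose(np.linalg.det(R), 1.0)

# A quarter turn about z carries the x-axis onto the y-axis.
rotated = R @ np.array([1.0, 0.0, 0.0])
```

The two checks are a handy sanity test for any $\mathbf{R}$ you build before plugging it into a projection matrix.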

To apply the rotation matrix $\mathbf{R}$ and the camera position $\mathbf{C}_w$, we must define a transformation that translates and rotates the camera in terms of the world frame. That is, we need the rotation and translation of the camera from the origin of the world frame to its position and orientation in the world. The rotation is straightforward, as it is described by the rotation matrix $\mathbf{R}$. The translation is a bit trickier: to find the translation to use in the projection matrix $\mathbf{P}$, we need to “correct” for the rotation. Thus, the translation is described as

$$\mathbf{t}= -\mathbf{RC}_w,$$

where $\mathbf{t}$ is the resulting 3-dimensional translation vector.
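One way to see where the minus sign and the rotation come from is a short worked derivation. It assumes, as is conventional but not stated explicitly in this post, that $\mathbf{R}$ rotates world-frame coordinates into the camera frame. The symbols $\mathbf{X}_w$ (a point in world coordinates) and $\mathbf{X}_c$ (the same point in camera-centered coordinates) are introduced here only for illustration. First shift the world so that the camera center sits at the origin, then rotate:

$$\mathbf{X}_c = \mathbf{R}\left(\mathbf{X}_w - \mathbf{C}_w\right) = \mathbf{R}\mathbf{X}_w - \mathbf{R}\mathbf{C}_w = \mathbf{R}\mathbf{X}_w + \mathbf{t}.$$

As a check, setting $\mathbf{X}_w = \mathbf{C}_w$ gives $\mathbf{X}_c = \mathbf{0}$, so the camera center maps to the origin of the camera frame, as it should.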

Given all of this, we can solve for the projection matrix using the following equation:

$$\mathbf{P}= \mathbf{K} \left[ \mathbf{R} | \mathbf{t} \right].$$
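The construction above can be sketched in a few lines of Python with NumPy. This is a minimal sketch, not a definitive implementation: the function names `projection_matrix` and `project` are made up for this example, and the numeric values are arbitrary.

```python
import numpy as np

def projection_matrix(K, R, C_w):
    """Hypothetical helper: build the 3x4 matrix P = K [R | t], t = -R C_w."""
    K = np.asarray(K, dtype=float)
    R = np.asarray(R, dtype=float)
    C_w = np.asarray(C_w, dtype=float).reshape(3)
    t = -R @ C_w                          # translation "corrected" for rotation
    Rt = np.hstack([R, t.reshape(3, 1)])  # the 3x4 block [R | t]
    return K @ Rt

def project(P, X_w):
    """Hypothetical helper: project a 3D world point to 2D image coordinates."""
    X_h = np.append(np.asarray(X_w, dtype=float), 1.0)  # homogeneous 4-vector
    x = P @ X_h
    return x[:2] / x[2]                   # divide out the homogeneous scale

f = 2.0
K = np.diag([f, f, 1.0])            # the simple pinhole calibration matrix
R = np.eye(3)                       # camera axes aligned with the world axes
C_w = np.array([1.0, 0.0, 0.0])     # camera center shifted along world x

P = projection_matrix(K, R, C_w)

# A point directly in front of the shifted camera lands on the image origin.
point = project(P, [1.0, 0.0, 5.0])
```

A useful sanity check for any $\mathbf{P}$ built this way is that the camera center itself, written as a homogeneous 4-vector, is mapped to the zero vector, since $\mathbf{R}\mathbf{C}_w + \mathbf{t} = \mathbf{0}$.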

## Setting the Image Location

Now that we can move the camera around to any arbitrary location and orientation in the world, we will focus on moving the principal point of the image to an arbitrary point in the image plane. The principal point is the point in the 2D image plane that corresponds to point $\mathbf{C}_i$ in the diagram above. Moving it matters because, so far, the principal point is the origin, $(0, 0)$, of the image. Most digital image formats put the origin in a corner of the image, but unless we move the principal point, the origin stays at the center of the image. This must be changed!

Image plane diagram. Shows the location of the principal point and associated axes in the camera image plane ($C_{cam}$) and the x,y axes of the actual image.

To shift the principal point away from the image origin, we add the offset $x_0$ along the x-axis and the offset $y_0$ along the y-axis, so that the principal point lands at $(x_0, y_0)$ in image coordinates. This is a fairly straightforward modification of the camera calibration matrix $\mathbf{K}$ above. Once we make this change, we get:

$$\mathbf{K}= \left[ \begin{array}{ccc} f & 0 & x_0\\ 0 & f & y_0\\ 0 & 0 & 1 \end{array} \right].$$

This change simply adds an offset to the scaled coordinates in the image plane. To illustrate this with an example, let’s solve for $\mathbf{K X}_{cam}$ where $\mathbf{X}_{cam}$ is a 3-element homogeneous vector containing a point in the camera’s image plane:

$$\mathbf{K X}_{cam}= \left[ \begin{array}{ccc} f & 0 & x_0\\ 0 & f & y_0\\ 0 & 0 & 1 \end{array} \right] \left[ \begin{array}{c} x_{cam}\\ y_{cam}\\ 1 \end{array} \right]= \left[ \begin{array}{c} fx_{cam}+x_0\\ fy_{cam}+y_0\\ 1 \end{array} \right].$$
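The same product can be checked numerically. The sketch below uses Python with NumPy; the helper name `calibration_matrix` and the values of $f$, $x_0$, and $y_0$ are assumptions chosen purely for illustration.

```python
import numpy as np

def calibration_matrix(f, x0, y0):
    """Hypothetical helper: K with focal length f and principal point (x0, y0)."""
    return np.array([[f,   0.0, x0],
                     [0.0, f,   y0],
                     [0.0, 0.0, 1.0]])

K = calibration_matrix(f=2.0, x0=320.0, y0=240.0)

# A homogeneous point in the camera's image plane: (x_cam, y_cam, 1).
X_cam = np.array([0.5, -0.25, 1.0])

# Each coordinate is scaled by f and then shifted by the offset:
# (2.0 * 0.5 + 320, 2.0 * -0.25 + 240, 1) = (321, 239.5, 1)
x_img = K @ X_cam

# The principal point itself, (0, 0, 1), lands exactly on (x0, y0).
principal = K @ np.array([0.0, 0.0, 1.0])
```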

### Images with Origin in the Upper-Left-Hand Corner

One final thought to consider:  many digital image formats put the origin of the image in the upper left-hand corner of the image, with the y-axis pointed down.  If you are dealing with images like that, you will need to correct your camera calibration matrix as follows:

$$\mathbf{K}'= \left[ \begin{array}{ccc}1 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & 1 \end{array} \right] \mathbf{K}.$$

This correction flips the y-axis so that it lines up with the downward-pointing y-axis of the image.
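Here is a minimal sketch of the flip in Python with NumPy. One detail worth noticing is that multiplying on the left negates the entire second row of $\mathbf{K}$, so the offset becomes $-y_0$ along with the scale. The numbers below are illustrative assumptions; in particular, $y_0$ is chosen negative here so that the principal point ends up at a positive row after the flip.

```python
import numpy as np

f, x0, y0 = 2.0, 320.0, -240.0   # illustrative values; y0 < 0 is an assumption

K = np.array([[f,   0.0, x0],
              [0.0, f,   y0],
              [0.0, 0.0, 1.0]])

flip_y = np.diag([1.0, -1.0, 1.0])
K_prime = flip_y @ K             # second row becomes [0, -f, -y0]

# A point above the principal point (positive y in the camera's image plane)...
above = K_prime @ np.array([0.0, 0.5, 1.0])
# ...now gets a smaller y value than the principal point itself,
# matching an image whose y-axis points down.
center = K_prime @ np.array([0.0, 0.0, 1.0])
```

If you prefer to keep $y_0$ positive, you would need to account for the sign change of the offset in some other way; the formula above does not do that for you.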

And that is where we will leave off for today.  Come back next time for Part 3 of this series where we will add in more intrinsic camera parameters to think about.

Edit 8/16/2013:  You can find Part 3 of this series here.
