In this project, we're tasked with taking images and stitching them together. To do this we go through the steps of:
- Defining correspondence points between the images
- Using the correspondence points to create a transform that warps one image to another
- Blending the images into a mosaic (think of Project 2!)

Before continuing, we can formalize this a bit. If we have `im1` and `im2`, here's the math for 1 set of their correspondence points assuming we are going from `im1` to `im2` (note `im1`'s point is `u` and `im2`'s point is `v`):
Blue highlights correspond to translation, yellow highlights to scaling and rotation (and potential shearing), and red highlights to the effects of perspective distortion.
This formulation tells us the value of w (the dot product of the bottom row with the RHS vector).
We can plug this value of w into the equations for v1 and v2, and do some algebra...
Rewriting this, we see that we get 2 equations per correspondence point, and that we are solving for a through h (the parameters of the homography matrix).
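For reference, the steps described above can be written out as follows. This is a reconstruction under two assumptions: the bottom-right entry of the homography matrix is fixed to 1 (which leaves the 8 parameters a through h), and the points are written as u = (u_1, u_2) and v = (v_1, v_2). In this layout, c and f are the translation terms, a, b, d, e carry scaling, rotation and shear, and g, h are the perspective terms.

```latex
\begin{aligned}
\begin{bmatrix} w v_1 \\ w v_2 \\ w \end{bmatrix}
  &= \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix}
     \begin{bmatrix} u_1 \\ u_2 \\ 1 \end{bmatrix} \\[4pt]
w &= g u_1 + h u_2 + 1 \\[4pt]
(g u_1 + h u_2 + 1)\, v_1 &= a u_1 + b u_2 + c \\
(g u_1 + h u_2 + 1)\, v_2 &= d u_1 + e u_2 + f \\[4pt]
v_1 &= a u_1 + b u_2 + c - g u_1 v_1 - h u_2 v_1 \\
v_2 &= d u_1 + e u_2 + f - g u_1 v_2 - h u_2 v_2
\end{aligned}
```

The last two lines are linear in a through h, which is what makes a least-squares solve possible.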
With more than 4 correspondence points defined, the system becomes overdetermined, so we must use least squares to find an approximate best solution for these parameters.
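As a minimal sketch of the least-squares step, the two equations per correspondence can be stacked into a matrix and handed to NumPy. The function name `compute_homography` and the (x, y) row layout of the point arrays are assumptions made for illustration, not necessarily what the project code uses.

```python
import numpy as np

def compute_homography(u_pts, v_pts):
    """Estimate the 3x3 homography (bottom-right entry fixed to 1)
    that maps points u_pts in im1 to points v_pts in im2.

    u_pts, v_pts: array-likes of shape (N, 2), N >= 4, rows are (x, y).
    """
    u_pts = np.asarray(u_pts, dtype=float)
    v_pts = np.asarray(v_pts, dtype=float)
    n = u_pts.shape[0]

    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((u1, u2), (v1, v2)) in enumerate(zip(u_pts, v_pts)):
        # Two rows per correspondence; unknowns are ordered a, b, c, d, e, f, g, h.
        A[2 * i] = [u1, u2, 1, 0, 0, 0, -u1 * v1, -u2 * v1]
        A[2 * i + 1] = [0, 0, 0, u1, u2, 1, -u1 * v2, -u2 * v2]
        b[2 * i] = v1
        b[2 * i + 1] = v2

    # Least-squares solution of the (possibly overdetermined) system A p = b.
    params, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(params, 1.0).reshape(3, 3)
```

With exactly 4 correspondences the system has 8 equations for 8 unknowns; each extra correspondence adds 2 more rows and lets the fit average out clicking error.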
Here are the pictures we'll be using.
![]() Mob Psycho Image 1 |
![]() Mob Psycho Image 2 |
![]() Window Image 1 |
![]() Window Image 2 |
![]() Among Us Image 1 |
![]() Among Us Image 2 |
We've defined correspondence points between the images as shown below, and are now solving for the homography matrix, `H`, as described in the beginning.
Note again that because we get 2 "constraints"/equations per correspondence point, the system
becomes overdetermined (with > 4 correspondence points) and thus we must use least squares.
![]() Mob Psycho Image 1 with Correspondences |
![]() Mob Psycho Image 2 with Correspondences |
![]() Window Image 1 with Correspondences |
![]() Window Image 2 with Correspondences |
![]() Among Us Image 1 with Correspondences |
![]() Among Us Image 2 with Correspondences |
Here are the warped images (note that the middle image, the Window one, is heavily warped because the pair covers a very large field of view, almost 180 degrees):
![]() Warped Mob Psycho Image |
![]() Warped Window Image |
![]() Warped Among Us Image |
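One common way to produce warps like these is inverse warping: for every pixel of the output canvas, apply the inverse homography to find where it came from in the source image. The sketch below is an illustration only, not necessarily how the images above were generated. It assumes nearest-neighbor sampling, a homography that maps source (x, y) to output (x, y), and that the homogeneous coordinate w never reaches zero inside the output canvas. The name `warp_image` is hypothetical.

```python
import numpy as np

def warp_image(im, H, out_shape):
    """Inverse-warp `im` with homography H (source (x, y) -> output (x, y)).

    out_shape is (height, width). Nearest-neighbor sampling is used, and
    output pixels whose source falls outside `im` are left at 0.
    """
    out_h, out_w = out_shape
    ys, xs = np.mgrid[0:out_h, 0:out_w]

    # Homogeneous coordinates of every output pixel, shape (3, out_h * out_w).
    dst = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])

    # Map output pixels back into the source image and divide out w.
    src = np.linalg.inv(H) @ dst
    src_x = np.rint(src[0] / src[2]).astype(int)
    src_y = np.rint(src[1] / src[2]).astype(int)

    # Keep only the pixels that land inside the source image.
    valid = (
        (src_x >= 0) & (src_x < im.shape[1])
        & (src_y >= 0) & (src_y < im.shape[0])
    )

    out = np.zeros((out_h, out_w) + im.shape[2:], dtype=im.dtype)
    flat = out.reshape((-1,) + im.shape[2:])  # view onto `out`
    flat[valid] = im[src_y[valid], src_x[valid]]
    return out
```

Going backwards from output to source avoids the holes that appear when source pixels are pushed forward onto a stretched canvas. Bilinear interpolation would give smoother results than the rounding used here.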
On a more specific note (and to illustrate the concept better), we can use what we've done above to "rectify" a known rectangle in an image. The idea is to pick the 4 corners of a rectangle we would like to view "head-on" and map them to the corners of an actual rectangle. In this image, note how the rectangle appears as a trapezoid because of the viewing angle. After warping its corners to a rectangle (with some reasonable aspect ratio), it looks rectangular again.
![]() iPhone Case |
![]() iPhone Case with Correspondences |
![]() Rectified iPhone Case |
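To make the rectification idea concrete, here is a small sketch that solves for the homography from exactly 4 corner pairs and checks that each trapezoid corner lands on its rectangle corner. The corner coordinates and the 300 x 600 target size are made-up values for illustration, and the helper names `homography_from_4_points` and `apply_h` are hypothetical.

```python
import numpy as np

def homography_from_4_points(src, dst):
    """Exact homography (bottom-right entry = 1) from 4 point pairs (x, y)."""
    A, b = [], []
    for (u1, u2), (v1, v2) in zip(src, dst):
        A.append([u1, u2, 1, 0, 0, 0, -u1 * v1, -u2 * v1])
        A.append([0, 0, 0, u1, u2, 1, -u1 * v2, -u2 * v2])
        b.extend([v1, v2])
    # 8 equations, 8 unknowns: a square system, so no least squares is needed.
    params = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(params, 1.0).reshape(3, 3)

def apply_h(H, pt):
    """Apply H to a single (x, y) point and divide out w."""
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return x / w, y / w

# Made-up corners clicked on the trapezoid (top-left, top-right,
# bottom-right, bottom-left) and the upright rectangle they should map to.
trapezoid = [(120, 80), (380, 110), (400, 520), (90, 560)]
rectangle = [(0, 0), (300, 0), (300, 600), (0, 600)]

H = homography_from_4_points(trapezoid, rectangle)
mapped = [apply_h(H, p) for p in trapezoid]
```

The target rectangle's aspect ratio is a free choice here, since a single view does not tell us the object's true proportions.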
To create a mask that will blend the images correctly, we can:
- Binarize each image to mark where the actual image content is and where the unused background is
- Use a distance transform, which gives each pixel's distance to the nearest background (black) pixel
- Obtain a mask by computing `dist_transform1 > dist_transform2`
- Blur this mask to soften the edges
- Finally, blend the images using the Laplacian stack as described in Project 2. We can use as many layers as we'd like, but 2 layers is sufficient.
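The steps above can be sketched with SciPy as follows. This is an illustration under several assumptions: the two images are already warped onto the same canvas, they are grayscale float arrays in which 0 means background, and the two-layer stack is a Gaussian low-pass layer plus its residual. The function name `blend_pair` and the sigma values are hypothetical, and the Project 2 implementation may differ in its details.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt, gaussian_filter

def blend_pair(im1, im2, mask_sigma=5.0, band_sigma=2.0):
    """Blend two aligned grayscale float images of the same shape.

    Returns the blended image and the blurred mask.
    """
    # Binarize, then measure each content pixel's distance to the
    # nearest background (zero) pixel.
    dist1 = distance_transform_edt(im1 > 0)
    dist2 = distance_transform_edt(im2 > 0)

    # Prefer im1 wherever it sits deeper inside its own content than im2.
    mask = (dist1 > dist2).astype(float)
    soft_mask = gaussian_filter(mask, mask_sigma)

    # Two-layer Laplacian stack: low-pass layer plus residual detail.
    low1 = gaussian_filter(im1, band_sigma)
    low2 = gaussian_filter(im2, band_sigma)
    high1, high2 = im1 - low1, im2 - low2

    # The low-frequency layer gets an extra-smoothed mask so that broad
    # intensity differences fade over a wider region than fine detail.
    low_mask = gaussian_filter(soft_mask, band_sigma)
    low = low_mask * low1 + (1 - low_mask) * low2
    high = soft_mask * high1 + (1 - soft_mask) * high2
    return low + high, soft_mask
```

Note that `distance_transform_edt` measures distance to zero-valued pixels only, so the array border itself does not count as background. For color images, one option is to binarize on "any channel is nonzero" and apply the same mask to each channel.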
![]() Mob Psycho Distance Transform 1 |
![]() Mob Psycho Distance Transform 2 |
![]() Mob Psycho Blurred Mask |
![]() Window Distance Transform 1 |
![]() Window Distance Transform 2 |
![]() Window Blurred Mask |
![]() Among Us Distance Transform 1 |
![]() Among Us Distance Transform 2 |
![]() Among Us Blurred Mask |
![]() Mob Psycho Blended Image |
![]() Window Blended Image |
![]() Among Us Blended Image |
Again, note how the middle (Window) image looks very warped at the left end. This is likely because the two images cover a very large field of view (almost 180 degrees), and homographies are better suited to smaller fields of view.