Project 4A

Shoot and Digitize Pictures

The pictures to merge are:

kitchen1 / kitchen2
Kitchen (left) / Kitchen (right)
sofa1 / sofa2
Living room (left) / Living room (right)
desk1 / desk2
Desk with tarot cards (left) / Desk with tarot cards (right)

Pictures to rectify are:

computer
Laptop (16:9)
TV
Television (16:10)

Recover Homographies

In this part, we want to find the matrix $H$ such that

$$H\begin{bmatrix}x\\y\\1\end{bmatrix}=\begin{bmatrix}a&b&c\\d&e&f\\g&h&1\end{bmatrix}\begin{bmatrix}x\\y\\1\end{bmatrix}=\begin{bmatrix}wx'\\wy'\\w\end{bmatrix},$$

where $(x, y)$ is the point coordinate in the source image and $(x', y')$ is the point coordinate in the target image.

Expanding the equation, we have:

$$\begin{cases}ax+by+c=wx'\\dx+ey+f=wy'\\gx+hy+1=w\end{cases}$$

then substitute $w = gx+hy+1$ and rearrange:

$$\begin{cases}ax+by+c=(gx+hy+1)x'\\dx+ey+f=(gx+hy+1)y'\end{cases}
\quad\Longrightarrow\quad
\begin{cases}ax+by+c-gxx'-hyx'=x'\\dx+ey+f-gxy'-hyy'=y'\end{cases}$$

Now we can estimate the matrix $H$ by least squares: each corresponding point pair contributes two such equations, so at least 4 pairs are needed to solve for the 8 unknowns.
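The least-squares setup above can be sketched as follows (a minimal NumPy sketch; the function name `compute_homography` is illustrative, not necessarily the project's):

```python
import numpy as np

def compute_homography(src, dst):
    """Estimate H (with H[2, 2] = 1) from n >= 4 point pairs via least squares.

    src, dst: (n, 2) arrays of (x, y) coordinates in the source and target image.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = src.shape[0]
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((x, y), (xp, yp)) in enumerate(zip(src, dst)):
        # ax + by + c - g*x*x' - h*y*x' = x'
        A[2 * i] = [x, y, 1, 0, 0, 0, -x * xp, -y * xp]
        b[2 * i] = xp
        # dx + ey + f - g*x*y' - h*y*y' = y'
        A[2 * i + 1] = [0, 0, 0, x, y, 1, -x * yp, -y * yp]
        b[2 * i + 1] = yp
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(h, 1.0).reshape(3, 3)
```

With exactly 4 pairs this solves the system exactly; with more, it minimizes the squared error over all equations.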

Warp the Images

After computing the matrix H, I used inverse warping with bilinear interpolation to project each image onto the target projection plane.

The coordinates after inverse warping must be translated to fit in a new canvas. Therefore, I find the minimal x and y values, translate the whole image by those amounts, and determine the new canvas size from (xmax, ymax).
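The inverse warp with the canvas translation can be sketched like this (a NumPy sketch under the assumption that images are arrays; the `offset` argument implements the translation described above):

```python
import numpy as np

def warp_image(img, H, out_shape, offset=(0, 0)):
    """Inverse-warp img by homography H onto a canvas of out_shape = (H, W).

    offset (ox, oy): translation added to canvas coordinates so that
    negative warped coordinates fit on the canvas.
    """
    Hh, Ww = out_shape
    ox, oy = offset
    # target pixel grid, shifted back by the canvas offset
    xs, ys = np.meshgrid(np.arange(Ww), np.arange(Hh))
    pts = np.stack([xs.ravel() + ox, ys.ravel() + oy, np.ones(xs.size)])
    # map target coordinates back to source coordinates with H^-1
    src = np.linalg.inv(H) @ pts
    sx, sy = src[0] / src[2], src[1] / src[2]
    h, w = img.shape[:2]
    x0 = np.clip(np.floor(sx).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(sy).astype(int), 0, h - 2)
    dx, dy = sx - x0, sy - y0
    # bilinear interpolation of the four neighbouring pixels
    val = (img[y0, x0].T * (1 - dx) * (1 - dy)
           + img[y0, x0 + 1].T * dx * (1 - dy)
           + img[y0 + 1, x0].T * (1 - dx) * dy
           + img[y0 + 1, x0 + 1].T * dx * dy).T
    # pixels that map outside the source image stay black
    inside = (sx >= 0) & (sx <= w - 1) & (sy >= 0) & (sy <= h - 1)
    val[~inside] = 0
    return val.reshape(Hh, Ww, *img.shape[2:])
```

The transposes let the same code handle grayscale (H, W) and color (H, W, 3) arrays.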

We are able to rectify the images now:

TV / TV_rec
Television / Rectified television
computer / computer_rec
Laptop / Rectified laptop

Blend Images into a Mosaic

Finally, we can build a panorama by warping corresponding points to the same coordinates.

I first tried naive averaging in the intersecting part of two images:

full_kitchen_mean
full_sofa_mean
full_desk_mean

Naive averaging did not give satisfactory results: visible seam lines appear in the blended images where the two exposures meet.

Therefore, I used weighted averaging as my second approach. The weights in the intersecting region are determined by a linear function: the leftmost column of the overlap has a weight of 0 and the rightmost column has a weight of 1. The weighting masks for the second image are shown below:

lmask / rmask
Left mask / Right mask

The subtle noise on the right mask results from the way I implemented it: I construct the mask with mask = np.any(image, axis=2), so pure-black pixels in the image produce black pixels in the mask. However, this has no negative effect on the blending, since those pixels are black and need no blending anyway.
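The linear-ramp blend can be sketched as follows (assuming both warped images already live on the same canvas, zero outside their footprints; using a single column-wise ramp across the overlap is a simplification of the masks shown above):

```python
import numpy as np

def blend_linear(left, right):
    """Blend two aligned (H, W, C) canvases with a linear ramp over the overlap.

    Pixels outside each image's warped footprint are assumed to be zero (black).
    """
    lmask = np.any(left, axis=2)
    rmask = np.any(right, axis=2)
    overlap = lmask & rmask
    alpha = np.zeros(left.shape[:2])      # weight of the *right* image
    alpha[rmask & ~lmask] = 1.0           # right-only region: take right image
    cols = np.where(overlap.any(axis=0))[0]
    if cols.size:
        # ramp from 0 at the left edge of the overlap to 1 at its right edge
        ramp = np.linspace(0.0, 1.0, cols.size)
        alpha[:, cols] = np.where(overlap[:, cols], ramp, alpha[:, cols])
    return (1 - alpha)[..., None] * left + alpha[..., None] * right
```

Left-only pixels keep weight 0 (left image), right-only pixels weight 1, and the overlap fades smoothly between them, which removes the hard seam of naive averaging.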

The results are:

full_kitchen
full_sofa
full_desk

The quality of these panoramas is much better than that of the naively averaged ones.

Project 4B

In part B, the objective is to create an automatic stitching algorithm. It consists of several steps:

  1. Detect corner features using Harris corner detector

  2. Reduce feature points with Adaptive Non-Maximal Suppression (ANMS)

  3. Extract feature descriptors from each feature point

  4. Match feature descriptors

  5. Use RANSAC to compute a homography

Harris Corners

I first applied the Harris corner detector to find feature points that look like corners. This results in a very large number of feature points per image.

auto_sofa_harris_l / auto_sofa_harris_r
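The detector can be sketched in pure NumPy (an illustrative sketch, not the exact implementation used in the project; the corner constant k = 0.05, the smoothing sigma, and the thresholds are assumptions):

```python
import numpy as np

def harris_response(im, sigma=1.0, k=0.05):
    """Harris corner response R = det(M) - k * trace(M)^2 for a grayscale image."""
    Iy, Ix = np.gradient(im.astype(float))      # image gradients

    def smooth(a):
        # separable Gaussian smoothing of the structure-tensor entries
        r = int(3 * sigma)
        x = np.arange(-r, r + 1)
        g = np.exp(-x**2 / (2 * sigma**2))
        g /= g.sum()
        a = np.apply_along_axis(lambda v: np.convolve(v, g, mode='same'), 0, a)
        return np.apply_along_axis(lambda v: np.convolve(v, g, mode='same'), 1, a)

    Sxx, Syy, Sxy = smooth(Ix * Ix), smooth(Iy * Iy), smooth(Ix * Iy)
    det = Sxx * Syy - Sxy**2
    trace = Sxx + Syy
    return det - k * trace**2

def harris_corners(im, thresh_rel=0.1):
    """Return an (N, 2) array of (row, col) positions of local response maxima."""
    R = harris_response(im)
    # 3x3 local maxima above a relative threshold
    pad = np.pad(R, 1, mode='constant', constant_values=-np.inf)
    neigh = np.stack([pad[i:i + R.shape[0], j:j + R.shape[1]]
                      for i in range(3) for j in range(3)])
    is_max = (R >= neigh.max(axis=0)) & (R > thresh_rel * R.max())
    return np.argwhere(is_max)
```

Corners score high because both eigenvalues of the structure tensor are large there; edges give a negative response and flat regions give a near-zero one.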

Adaptive Non-Maximal Suppression

Since there are too many feature points from the previous step, I used Adaptive Non-Maximal Suppression (ANMS) to reduce their number while keeping the "useful" points that are spread uniformly across the image.

I computed the pair-wise L2 distance between feature points, and updated the suppression radius of each point using the formula given in the paper:

$$r_i = \min_j \lVert x_i - x_j \rVert, \quad \text{s.t. } f(x_i) < c_{\text{robust}} \, f(x_j), \; x_j \in \mathcal{I}$$

Then, I set the minimum suppression radius to the 200th-largest suppression radius, keeping the 200 most "useful" feature points in each image.

auto_sofa_anms_l / auto_sofa_anms_r
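The ANMS step can be sketched as follows (a vectorized NumPy sketch; the defaults n_keep = 200 and c_robust = 0.9 are the values commonly used with this formula, assumed here):

```python
import numpy as np

def anms(points, scores, n_keep=200, c_robust=0.9):
    """Adaptive Non-Maximal Suppression.

    points: (N, 2) corner coordinates; scores: (N,) corner responses f(x).
    Returns the indices of the n_keep points with the largest suppression
    radii, plus all radii.
    """
    pts = np.asarray(points, dtype=float)
    f = np.asarray(scores, dtype=float)
    # pair-wise squared L2 distances
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)
    # x_j suppresses x_i only if f(x_i) < c_robust * f(x_j)
    stronger = f[:, None] < c_robust * f[None, :]
    d2 = np.where(stronger, d2, np.inf)
    # r_i = distance to the nearest sufficiently stronger point
    radii = np.sqrt(d2.min(axis=1))
    order = np.argsort(-radii)        # largest suppression radius first
    return order[:n_keep], radii
```

The strongest point is never suppressed (its radius is infinite), and weak points crowded next to stronger ones get small radii, so keeping the largest radii yields points that are both strong and well spread out.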

Generate Feature Descriptors

The feature descriptor of each feature point is a 64-dimensional vector. For each feature point, I take the surrounding axis-aligned 40×40-pixel patch and downsample it (with Gaussian blur) to 8×8. Each patch is then normalized so that its mean is 0 and its standard deviation is 1, and finally flattened into a 64-dimensional feature descriptor.

It looks like:

descriptor
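One way to sketch the descriptor extraction (here simple 5×5 block averaging stands in for the Gaussian blur before subsampling; `extract_descriptor` is an illustrative name):

```python
import numpy as np

def extract_descriptor(im, r, c):
    """64-D descriptor: the 40x40 patch around (r, c), anti-aliased and
    subsampled to 8x8, then bias/gain normalized.

    Assumes the feature point is at least 20 px away from the image border.
    """
    patch = im[r - 20:r + 20, c - 20:c + 20].astype(float)
    # average 5x5 blocks: a crude low-pass filter before the 8x8 subsampling
    small = patch.reshape(8, 5, 8, 5).mean(axis=(1, 3))
    vec = small.ravel()
    # normalize to zero mean, unit standard deviation (bias/gain invariance)
    return (vec - vec.mean()) / (vec.std() + 1e-8)
```

The normalization makes the descriptor invariant to brightness and contrast changes between the two photographs.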

Match Feature Descriptors

I computed pair-wise L2 distances between feature descriptors, and only retained the descriptor pairs $(i, j)$ such that

$$d_{i,j} < D_{\text{mean}} - 1.5 \times D_{\text{std}},$$

where $d_{i,j}$ is the L2 distance between the $i$th feature descriptor of the first image and the $j$th feature descriptor of the second image, and $D_{\text{mean}}, D_{\text{std}}$ are the mean and standard deviation of the L2 distances over all descriptor pairs. This filtered out roughly 93% of the noisy pairs.

Furthermore, I used Lowe's 1-NN/2-NN ratio test to filter out more noisy pairs, keeping only the pairs with $e_{1\text{NN}}/e_{2\text{NN}} < 0.25$. The result is shown below:

auto_sofa_lowe_l / auto_sofa_lowe_r
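Both filters can be sketched together (the threshold constant 1.5 and the ratio 0.25 come from the text above; combining them in one function is my own arrangement):

```python
import numpy as np

def match_descriptors(D1, D2, std_k=1.5, lowe_ratio=0.25):
    """Match rows of D1 (first image) to rows of D2 (second image).

    A pair (i, j) is kept only if d_ij < mean(d) - std_k * std(d) and the
    1-NN/2-NN error ratio for descriptor i is below lowe_ratio.
    """
    # pair-wise L2 distances between all descriptor pairs
    d = np.linalg.norm(D1[:, None, :] - D2[None, :, :], axis=2)
    cutoff = d.mean() - std_k * d.std()
    matches = []
    for i in range(d.shape[0]):
        order = np.argsort(d[i])
        nn1, nn2 = d[i, order[0]], d[i, order[1]]
        if nn1 < cutoff and nn1 / (nn2 + 1e-12) < lowe_ratio:
            matches.append((i, int(order[0])))
    return matches
```

The ratio test exploits the fact that a correct match is much closer than the second-best candidate, whereas an ambiguous descriptor has two similarly close neighbors.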

RANSAC

Finally, I implemented RANSAC to compute a robust homography. As described in the lecture, I repeatedly sampled 4 random pairs, computed a homography from them, counted the inliers, and used the largest inlier set to compute the final homography.
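A compact sketch of that loop (the iteration count, the inlier threshold `eps`, and the seeding are illustrative assumptions, not the project's exact settings):

```python
import numpy as np

def fit_homography(src, dst):
    """Least-squares homography (h33 = 1) from >= 4 point pairs."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp]); b.append(xp)
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp]); b.append(yp)
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def ransac_homography(src, dst, n_iters=1000, eps=2.0, seed=0):
    """4-point RANSAC: keep the largest inlier set, then refit H on it."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    n = len(src)
    rng = np.random.default_rng(seed)
    src_h = np.hstack([src, np.ones((n, 1))])   # homogeneous source points
    best = np.zeros(n, dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(n, 4, replace=False)   # minimal 4-point sample
        H = fit_homography(src[idx], dst[idx])
        proj = src_h @ H.T
        proj = proj[:, :2] / proj[:, 2:3]       # back to inhomogeneous coords
        inliers = np.linalg.norm(proj - dst, axis=1) < eps
        if inliers.sum() > best.sum():          # keep the largest consensus set
            best = inliers
    return fit_homography(src[best], dst[best]), best
```

Because a single gross outlier can arbitrarily corrupt a least-squares fit, refitting only on the consensus set is what makes the final homography robust.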

After RANSAC, we are able to mosaic the images fully automatically.

Below are the final results:

auto_sofa_ransac / full_sofa
Auto / Manual
auto_desk_ransac / full_desk
Auto / Manual
auto_kitchen_ransac / full_kitchen
Auto / Manual

What have you learned?

I found that it is very interesting to use a random algorithm (RANSAC) to generate robust results. The power of randomness is amazing.