CS 280A HW2 Report

1.1: Finite Difference Operator

To obtain the difference between adjacent pixels of an image $I$, we can define the filters $D_x = \begin{bmatrix} 1 & -1 \end{bmatrix}$ and $D_y = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ to compute the gradients in the x- and y-directions by convolving them with the original image.

[Figures 1.1.dx, 1.1.dy: $I * D_x$ and $I * D_y$]

Note that the pixel values of the convolved images lie in $[-1, 1]$, so I normalized them with the function $f(x) = \frac{x + 1}{2}$ for better visualization.

Next, the gradient magnitude image $I_{grad}$ can be computed as $I_{grad} = \sqrt{(I * D_x)^2 + (I * D_y)^2}$, where $*$ denotes convolution. Lastly, by applying a binary threshold, we obtain the binarized edge image $I_{bin}$, which shows the pixels with larger gradients.

[Figures 1.1.grad, 1.1.binary: $I_{grad}$ and $I_{bin}$]
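The pipeline above can be sketched as follows. This is a minimal sketch using scipy; the threshold value here is an illustrative assumption, not the one used for the figures.

```python
import numpy as np
from scipy.signal import convolve2d

def gradient_edges(I, threshold=0.25):
    """Finite-difference gradient magnitude and binarized edge image."""
    Dx = np.array([[1.0, -1.0]])    # horizontal finite difference
    Dy = np.array([[1.0], [-1.0]])  # vertical finite difference
    Ix = convolve2d(I, Dx, mode="same", boundary="symm")
    Iy = convolve2d(I, Dy, mode="same", boundary="symm")
    I_grad = np.sqrt(Ix**2 + Iy**2)               # gradient magnitude
    I_bin = (I_grad > threshold).astype(float)    # binary thresholding
    return I_grad, I_bin
```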

1.2: Derivative of Gaussian Filter

To remove the noise generated by the lawn in the image, I convolved the original image $I$ with a Gaussian filter $G$ to create a blurred image $I' = I * G$. Similarly, we can compute $I'_{grad}$ and $I'_{bin}$ by the method described in section 1.1.

[Figures 1.2.1.dx, 1.2.1.dy: $I' * D_x$ and $I' * D_y$]
[Figures 1.2.1.grad, 1.2.2.binary: $I'_{grad}$ and $I'_{bin}$]

By applying Gaussian blur to the image, I saw that the noise generated by the lawn is suppressed. This is mainly because the gradient regions produced by the lawn are small, so the Gaussian blur smooths them out and they are no longer considered "edges."

Notice that $I'_{grad} = \sqrt{(I * G * D_x)^2 + (I * G * D_y)^2}$. Since convolution is associative, we can compute $G * D_x$ and $G * D_y$ first; the resulting filter is called the Derivative of Gaussian (DoG) filter.

$$DoG_x = G * D_x, \qquad DoG_y = G * D_y$$

Now, compute $I'_{grad}$ with the DoG filters: $I'_{grad} = \sqrt{(I * DoG_x)^2 + (I * DoG_y)^2}$.
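The associativity argument can be checked numerically. The sketch below uses full-size convolutions so both orderings produce identically shaped outputs; the kernel size and $\sigma$ are arbitrary choices for illustration.

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_kernel(size, sigma):
    """2D Gaussian kernel built from the outer product of a 1D Gaussian,
    normalized to sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    g1 = np.exp(-ax**2 / (2 * sigma**2))
    G = np.outer(g1, g1)
    return G / G.sum()

# Associativity: (I * G) * Dx == I * (G * Dx)
rng = np.random.default_rng(0)
I = rng.random((32, 32))
G = gaussian_kernel(9, sigma=1.5)
Dx = np.array([[1.0, -1.0]])

lhs = convolve2d(convolve2d(I, G), Dx)  # blur first, then differentiate
DoGx = convolve2d(G, Dx)                # precomputed DoG filter
rhs = convolve2d(I, DoGx)               # single convolution with DoG
assert np.allclose(lhs, rhs)
```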

[Figures 1.2.2.dx, 1.2.2.dy: $I * DoG_x$ and $I * DoG_y$]
[Figures 1.2.2.grad, 1.2.2.binary: $I'_{grad}$ and $I'_{bin}$ (by DoG filter)]

This verifies that the $I'_{bin}$ images obtained from the two methods are essentially the same.

2.1: Image "Sharpening"

Based on the fact that the Gaussian filter behaves like a low-pass filter, we can obtain the high frequencies $I_{hf}$ of an image by subtraction.

$$I_{hf} = I - (I * G)$$

Since $I_{hf}$ contains the "details" of an image, adding them to the original image yields a "sharpened" image $I_{sharp}$.

$$I_{sharp} = I + \alpha I_{hf}$$

where $\alpha$ is a parameter controlling the extent of sharpening; $\alpha = 0$ keeps the image unchanged.

In the following examples, I chose α=3.
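Unsharp masking can be sketched as below; the Gaussian's $\sigma$ is an assumed value, not necessarily the one used for the figures.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sharpen(I, alpha=3.0, sigma=2.0):
    """Unsharp masking: add back the high frequencies scaled by alpha."""
    low = gaussian_filter(I, sigma)  # low-pass component I * G
    high = I - low                   # high-frequency "details" I_hf
    return np.clip(I + alpha * high, 0.0, 1.0)
```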

[Figures: taj + 3 × taj_hf = sharpened taj (2.1.taj_hf, 2.1.sharpened)]
[Figures: campanile + 3 × campanile_hf = sharpened campanile (2.1.campanile_hf, 2.1.campanile)]

I picked an image of Sather Gate, blurred it first, and then tried to resharpen the blurred image.

sather_gate
Original
2.1.sather_blurred
Blurred
2.1.sather_blurred
Resharpened

Unfortunately, the resharpening process failed to "restore" the blurred image. The characters on Sather Gate are still blurry after resharpening.

2.2: Hybrid Images

To generate a hybrid image from two images $I_1$ and $I_2$, we can extract the low frequencies of one image and the high frequencies of the other and average them. In my implementation, instead of a simple average, I used a weighted average for better visual effect. Formally, it can be expressed as:

$$I_{hybrid} = \frac{a (I_1 * G) + b (I_2 - I_2 * G)}{a + b}$$

where $a$, $b$ are hand-tuned parameters.
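The weighted-average construction can be sketched as follows; the cutoff $\sigma$ values and the weights are illustrative assumptions, not the hand-tuned values used for the figures.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid(I1, I2, sigma1=5.0, sigma2=5.0, a=1.0, b=1.0):
    """Weighted blend of I1's low frequencies with I2's high frequencies."""
    low = gaussian_filter(I1, sigma1)        # I1 * G
    high = I2 - gaussian_filter(I2, sigma2)  # I2 - I2 * G
    return (a * low + b * high) / (a + b)
```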

For example, take the following images as I1, I2:

[Figures frieren_aligned, anya_aligned: $I_1$ (Frieren) and $I_2$ (Anya)]

I kept the low frequencies of Frieren and the high frequencies of Anya to generate the hybrid image "Frierenya":

2.2.frierenya
"Frierenya"

The image looks more like Frieren when you look far away, while it looks more like Anya when you look close.

Now, let's inspect the frequencies of these images by Fourier analysis.
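The spectra shown below were presumably computed as a centered log-magnitude Fourier transform; a minimal sketch:

```python
import numpy as np

def log_magnitude_spectrum(I):
    """Log-magnitude of the centered 2D Fourier transform, the standard
    frequency-domain visualization (DC component at the center)."""
    F = np.fft.fftshift(np.fft.fft2(I))
    return np.log1p(np.abs(F))
```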

[Figures low_original, low_freqs: Frieren's frequencies; low frequencies of Frieren]
[Figures high_original, high_freqs: Anya's frequencies; high frequencies of Anya]
[Figure hybrid_freqs: Frierenya's frequencies]

Similarly, I applied the merge process to two more pairs of images.

[Figures einstein_aligned, efros_aligned: $I_1$ (Albert Einstein) and $I_2$ (Alexei Efros)]
2.2.albert_efros
"Albert Efros"

The result seems acceptable, though Prof. Efros' collar aligns with Einstein's chin because of Einstein's large head (note that their eyes are aligned!).

The following is a failed example:

[Figures campanile_aligned, bigben_aligned: Campanile and Big Ben]
2.2.bigbenile
"???????"

To me, it looks like some strange texture on the surface of the Campanile; I can't discern Big Ben in this image.

2.3: Gaussian and Laplacian Stacks

To blend images, we first need to compute their Gaussian and Laplacian stacks. Let the image be $I$; its Gaussian stack is obtained by repeatedly convolving $I$ with a Gaussian filter $G$. A Gaussian stack of $I$ with $N$ levels can be expressed as:

$$\mathcal{I}_G = \{I_{G,i} \mid i = 0, 1, 2, \ldots, N-1\}$$

where

$$I_{G,0} = I, \qquad I_{G,i} = I_{G,i-1} * G$$

In my implementation, the kernel size of the Gaussian filter doubles each time the level increases by one, to capture the image's features at various scales.
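A minimal sketch of the Gaussian stack; here $\sigma$ doubles per level as a stand-in for the doubling kernel size described above.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def gaussian_stack(I, N=5, sigma=1.0):
    """Gaussian stack: repeated blurring with no downsampling.
    Level 0 is the original image; the blur scale doubles per level."""
    stack = [I]
    for i in range(1, N):
        stack.append(gaussian_filter(stack[-1], sigma * 2 ** (i - 1)))
    return stack
```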

The Laplacian stack can be derived from the Gaussian stack. It is defined as:

$$\mathcal{I}_L = \{I_{L,i} \mid i = 0, 1, 2, \ldots, N-1\}$$

where

$$I_{L,i} = I_{G,i} - I_{G,i+1} \ \ (i < N-1), \qquad I_{L,N-1} = I_{G,N-1}$$

The original image can be reconstructed by summing the Laplacian stack.

$$\sum_{i=0}^{N-1} I_{L,i} = \left(\sum_{i=0}^{N-2} I_{L,i}\right) + I_{L,N-1} = (I_{G,0} - I_{G,N-1}) + I_{L,N-1} = (I_{G,0} - I_{G,N-1}) + I_{G,N-1} = I_{G,0} = I \tag{1}$$
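
The telescoping identity (1) can be checked directly: a Laplacian stack sums back to the first Gaussian level no matter how the Gaussian stack was built.

```python
import numpy as np

def laplacian_stack(g_stack):
    """Laplacian stack: differences of consecutive Gaussian levels,
    with the last Gaussian level kept as the low-frequency residual."""
    L = [g_stack[i] - g_stack[i + 1] for i in range(len(g_stack) - 1)]
    L.append(g_stack[-1])
    return L
```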

The following are the Laplacian stack examples of an apple and orange:

[Figures 2.3.g_apple, 2.3.l_apple: $\mathcal{I}_G$ and $\mathcal{I}_L$ for apple]

For the first five levels of the Laplacian stack (levels 0 through 4), since each pixel's value lies in $[-1, 1]$, I linearly transformed it into $[0, 1]$ for better visualization (as in section 1.1).

[Figures 2.3.g_orange, 2.3.l_orange: $\mathcal{I}_G$ and $\mathcal{I}_L$ for orange]

We also need a Gaussian stack of a mask to blend two images. In the "oraple" example, a mask that vertically divides the image is needed:

2.3.mask

The Laplacian stack of the blended image can be computed by the formula:

$$I^A_{L,i} = M_{G,i} \times I^B_{L,i} + (1 - M_{G,i}) \times I^C_{L,i} \tag{2}$$

where $I^A$ is the blended image, $I^B$ and $I^C$ are the images to be blended (the apple and orange in this example), $M$ is the mask, and $\times$ is element-wise multiplication.

Finally, $I^A$ can be reconstructed from $\mathcal{I}^A_L$ via formula (1).
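Putting formula (2) together with the stacks gives a compact blending sketch. For simplicity this uses a fixed $\sigma$ per level rather than the doubling schedule described above.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blend(IB, IC, M, N=5, sigma=2.0):
    """Multiresolution blending of IB and IC with mask M (1 selects IB)."""
    def g_stack(x):
        s = [x]
        for _ in range(1, N):
            s.append(gaussian_filter(s[-1], sigma))
        return s

    def l_stack(g):
        return [g[i] - g[i + 1] for i in range(N - 1)] + [g[-1]]

    GM = g_stack(M.astype(float))                 # Gaussian stack of the mask
    LB, LC = l_stack(g_stack(IB)), l_stack(g_stack(IC))
    # formula (2): mask-weighted combination at every level
    LA = [GM[i] * LB[i] + (1 - GM[i]) * LC[i] for i in range(N)]
    return np.clip(sum(LA), 0.0, 1.0)             # formula (1): sum the stack
```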

Blending process

The first three rows are levels 0, 2, and 4. The last row is the summation of the stack. The columns are the different levels of $M_G \times \mathcal{I}^B_L$, $M_G \times \mathcal{I}^C_L$, and $\mathcal{I}^A_L$ (apple, orange, oraple).

For better visualization, the image in the last level of the Laplacian stack is added to the images in the first three rows. For instance, the top-left image is $M_{G,0} \times I^B_{L,0} + M_{G,N-1} \times I^B_{L,N-1}$. I did this because these images look more reasonable than images normalized by other methods.

The final result is:

2.3.orple

2.4: Multiresolution Blending

As described in section 2.3, we can blend arbitrary image pairs with an appropriate mask.

Mr. Bean

[Figures: real, anime, mask]

2.4.bean_process

2.4.bean_blended

Hell Rock's Kitchen

What makes Hell's Kitchen even more intense? Gordon "The Rock" Ramsay!

[Figures: hell, rock, mask]

2.4.hell_rock_process

2.4.hell_rock_kitchen

Sadly, since their poses are not exactly the same, the white region on the right shoulder is hard to remove with this method.