CS4551 Spring 2020 HW #2

CS4551 Multimedia Software Systems

Homework2 (15%) – Vector Quantization and DCT Coding

• Due: Electronic submission via CSNS by Sunday, 04/19/2020.

• What to turn in:

o Submit source code with necessary files for “compile and run”.

o Do NOT submit data files.

o You MUST provide a readme.txt file containing all information to help with the grading process.

• If your program produces any compile errors, you will receive 0 automatically no matter how close your

program is to the solution.

• Programming requirements:

o You are not allowed to use any Java built-in image class methods, library, or tools to complete this

homework.

o Do not create one mega-size main class.

o Do not change any given methods of MImage class nor create a new class that duplicates MImage

class. Treat MImage as a part of imported library.

o Test your program with all test data.

o If you do not meet any of the requirements above, you will receive a significant reduction.

0. What your program should do

Name your main application CS4551_[YourLastName].java (e.g. CS4551_Doe.java) and expand the given

template program to perform the following tasks.

Receive the input file as command line arguments.

On Command Prompt

java CS4551_Doe Ducky.ppm

Read a 24bit input PPM image and display the following main menu to the user and receive the user’s input.

Main Menu-----------------------------------

1. Vector Quantization

2. DCT-based Coding

3. Quit

Please enter the task number [1-3]:

After performing a selected task, go back to display the menu again.

1. Task 1 – Vector Quantization (50 pts)

Compress the 24 bits per pixel input image to 2 bits per pixel using Vector Quantization (VQ). Implement VQ

encoding/decoding using the requirements below.

CS4551 Spring 2020 HW #2

Encoding:

• Input vectors are formed by 2×2 blocks of RGB pixels. Each input vector 𝒙 consists of RGB values of

FOUR pixels, P1, P2, P3, and P4, and therefore 𝒙 is 12 dimensional.

P1 P2

P3 P4

Diagram of a 2×2 pixel block of the input image

𝒙 = {𝑃1𝑅,𝑃1𝐺,𝑃1𝐵,𝑃2𝑅, 𝑃2𝐺,𝑃2𝐵,𝑃3𝑅,𝑃3𝐺,𝑃3𝐵, 𝑃4𝑅,𝑃4𝐺,𝑃4𝐵

}

• Codebook and codebook vectors: The 2-bits per pixel quantization is equivalent to using 8 bits per 4

pixels. Therefore, the VQ should the vector space into 256 (=28

) cells and the codebooks should have

256 entries that are centroids of the 256 cells. After the vector quantization, each vector 𝒙 belongs to one

cell and each cell number is represented by 1 byte. In order words, after the quantization, each 2×2 block

(4×3=12 bytes) is encoded by a 1-byte codebook index. So, the compression ratio is 12.

• Codebook generation: Use K-means clustering algorithm to generate codebook vectors (centroids of

cells).

K-means Clustering Algorithm

Inputs: K, number of clusters and the data set (input vectors 𝒙)

K is 256 in our case.

Assume that 𝒄[𝑖] store the K centroids. 𝑖 = 0, 1, ⋯ , 255.

Each 𝒄[𝑖] is a 12-dimensional vector.

1. Assign randomly generated initial values for the 𝐾 centroids.

2. For each 𝒙,

For each 𝑖 = 0 to 255

If 𝒄[𝑖] is the closest cell (cluster) to 𝒙 based on the Euclidean distance between 𝒙 and 𝒄[𝑖],

assign 𝒙 to 𝒄[𝑖] cluster

3. Update the 𝐾 centroids.

4. Iterate 2 & 3 until the algorithm meets a stopping condition (i.e. no data points change clusters, the

sum of the distance is minimized, or the maximum number of iterations is reached.)

• Display the codebook. This is equivalent to displaying 𝒄[𝑖], 𝑖 = 0, 1, ⋯ ,255.

• Extra credit (10 pts) – Save the quantized image (i.e. indices of all 2×2 blocks) into a PPM file. Given

𝑊 × 𝐻 input image, the quantized image resolution is 𝑊/2 × 𝐻/2. The quantized image is a grayscale

image because each pixel is an 8-bit index.

Decoding:

• Given the quantized image and the codebook, for each pixel of the quantized image, recover RGB pixel

values of 4 pixels.

• Save the decoded image so that you can compare the output with the input.

2. Task 2 – DCT-based Coding (50 pts)

Implement a DCT-based transform coding. Notice that this is different from the standard JPEG steps. The

encoder/decoder steps are shown below.

CS4551 Spring 2020 HW #2

Encoding Steps Decoding Steps

E1. Read and resize the input image

Read the input ppm file containing RGB pixel values

for encoding. First, if the image size is not a multiple

of 8 in each dimension, make (increase) it become a

multiple of 8 and pad with zeros. For example, if your

input image size is 21×14, make it become 24×16 and

fill the extra pixels with zeros (black pixels).

D4. Remove Padding and Display the image

Display the decompressed image. Remember that you

padded with zeros if the input image size is not

multiple of 8 in both dimensions (width and height).

Restore the original input image size by removing

extra padded rows and columns.

E1. Resize Input Image

E2. Color Conversion &

Subsampling

E3. Forward DCT

E4. Quantization

D4. Restore Original Size

D3. Supersampling & Inverse

Color Conversion

D2. Inverse DCT

D1. De-quantization

Input RGB Image (PPM) Decompressed RGB Image (PPM)

Compressed Image 01001…

Entropy Encoding/Decoding

CS4551 Spring 2020 HW #2

E2. Color space transformation and Subsampling

Transform each pixel from RGB to YCbCr using the

equation below:

(

𝑌

𝐶𝑏

𝐶𝑟

) = (

0.2990 0.5870 0.1140

−0.1687 −0.3313 0.5000

0.5000 −0.4187 −0.0813

)(

𝑅

𝐺

𝐵

)

Initially, RGB value ranges from 0 and 255. After

color transformation, Y should range from 0 to 255,

while Cb and Cr should range from -127.5 to 127.5.

(Truncate if necessary.)

Subtract 128 from Y and 0.5 from Cb and Cr so that

they span the same range of values [-128,127]

Subsample Cb and Cr using 4:2:0 (MPEG1)

chrominance subsampling scheme. If Cb(Cr) is not

divisible by 8, pad with zeros.

D3. Supersampling and Inverse Color space

transformation

Supersample Cb and Cr so that each pixel has Cb and

Cr.

Add 128 to the values of the Y component and 0.5 to

the values of the Cb and Cr components.

If using a color image, transform from the YCbCr

space to the RGB space according to the following

equation:

(

𝑅

𝐺

𝐵

) = (

1.0000 0 1.4020

1.0000 −0.3441 −0.7141

1.0000 1.7720 0

) (

𝑌

𝐶𝑏

𝐶𝑟

)

Common mistake: After this step, you have to make

sure that the resulting RGB values are in the range

between 0 and 255. Truncate if necessary.

E3. Forward DCT

Perform the forward DCT for Y image using the

following steps:

• Divide the image into 8×8 blocks. Scan each

block in the image in raster order (left to right,

top to bottom)

• For each 8×8 block, perform the DCT

transform to get the values 𝐹𝑢𝑣 from the values

𝑓𝑥𝑦. The elements 𝐹𝑢𝑣 range from −2

10 to 2

Check max and min and assign −2

10 or 2

for the values outside of the range so that the

values range from −2

10 to 2

Perform the DCT for Cb and Cr images, too.

D2. Inverse DCT

Perform the inverse DCT to recover the values 𝑓𝑥𝑦

from the values 𝐹𝑢𝑣 and recover Y, Cb, Cr images.

CS4551 Spring 2020 HW #2

Forward DCT Formula

𝐹𝑢𝑣 =

𝐶𝑢𝐶𝑣 ∑ ∑𝑓𝑥𝑦 cos (

(2𝑥 + 1)𝑢𝜋

16 )cos (

(2𝑦 + 1)𝑣𝜋

16 )

𝑦=0

𝑥=0

𝐶𝑢 = {

1/√2, 𝑢 = 0

1, otherwise

𝐶𝑣 = {

1/√2, 𝑣 = 0

1, otherwise

𝑓𝑥𝑦 is the 𝑥-th row and 𝑦-th column pixel of the 8×8 image block (𝑥 and 𝑦 range from 0 to 7); 𝐹𝑢𝑣 is the DCT

coefficient value in the 𝑢-th row and 𝑣-th column (𝑢 and 𝑣 range from 0 to 7).

Inverse DCT Formula

𝑓𝑥𝑦

′ =

∑ ∑𝐶𝑢 𝐶𝑣 𝐹𝑢𝑣

′

cos (

(2𝑥 + 1)𝑢𝜋

16 ) cos (

(2𝑦 + 1)𝑣𝜋

16 )

𝑣=0

𝑢=0

E4. Quantization

Given 𝐹𝑢𝑣 in an 8×8 DCT block, quantize 𝐹𝑢𝑣 using:

Quantized(𝐹𝑢𝑣) = round (

𝐹𝑢𝑣

𝑄𝑢𝑣

)

The default intervals 𝑄𝑢𝑣 corresponding 𝑢 and 𝑣 are

specified in Table 1 and Table 2.

In this homework, we want to provide a variety of

compression quality options (high compression or

low compression). User shall specify one parameter

𝑛 (0 ≤ 𝑛 ≤ 5 ), which controls the quality of the

compression by changing the quantization intervals.

The actual quantization is done by

Quantized(𝐹𝑢𝑣) = round (

𝐹𝑢𝑣

𝑄𝑢𝑣

′

)

𝑄𝑢𝑣

′ = 𝑄𝑢𝑣 × 2

𝑛

For example, if 𝑛 = 0, 𝑄𝑢𝑣

′

is same as 𝑄𝑢𝑣; if 𝑛 = 1,

𝑄𝑢𝑣

′

is double of 𝑄𝑢𝑣 , which will divide 𝐹𝑢𝑣 with

bigger values and result in more compression.

D1. De-quantization

Assume that the quantization tables (basis ones) and

the compression quality parameter 𝑛 are available for

decoding. Given the quantized value for DCT

coefficient 𝐹𝑢𝑣 , multiply it by the corresponding

quantization interval 𝑄𝑢𝑣

′

𝐹𝑢𝑣

′ = Quantized(𝐹𝑢𝑣) × 𝑄𝑢𝑣

′

Notice that the recovered 𝐹𝑢𝑣

′ will be different from

the original 𝐹𝑢𝑣.

CS4551 Spring 2020 HW #2

Default Quantization Tables

The following table gives the default quantization intervals for each element in the 8×8 DCT block for the

luminance (Y) and chrominance (Cb and Cr).

Table 1. Luminance Y Quantization

Table

Table 2. Chrominance Cb and Cr

Quantization Table

4 4 4 8 8 16 16 32 8 8 8 16 16 32 32 64

4 4 8 8 16 16 32 32 8 8 16 16 32 32 64 64

4 8 8 16 16 32 32 32 8 16 16 32 32 64 64 64

8 8 16 16 32 32 32 32 16 16 32 32 64 64 64 64

8 16 16 32 32 32 32 48 16 32 32 64 64 64 64 96

16 16 32 32 32 32 48 48 32 32 64 64 64 64 96 96

16 32 32 32 32 48 48 48 32 64 64 64 64 96 96 96

32 32 32 32 48 48 48 48 64 64 64 64 96 96 96 96

• E1/D4 – 10 pts

• E2/D3 – 10 pts

• E3/D2 – 20 pts

• E4/D1 – 10 pts

An important requirement – After each encoding step, implement the corresponding decoding step

immediately and check if your output is correct or not.

You will receive credits for each encoding step if only if you complete to implement the corresponding

decoding step.

Sample results will be posted on CSNS.

联系我们

QQ：99515681
邮箱：99515681@qq.com
工作时间：8:00-21:00
微信：codinghelp

热点文章

辅导 comm2000 creating socia... 2026-01-08
讲解 isen1000 – introductio... 2026-01-08
讲解 cme213 radix sort讲解 c... 2026-01-08
辅导 csc370 database讲解迭代 2026-01-08
讲解 ca2401 a list of colleg... 2026-01-08
讲解 nfe2140 midi scale play... 2026-01-08
讲解 ca2401 the universal li... 2026-01-08
辅导 engg7302 advanced compu... 2026-01-08
辅导 comp331/557 – class te... 2026-01-08
讲解 soft2412 comp9412 exam辅... 2026-01-08
讲解 scenario # 1 honesty讲解... 2026-01-08
讲解 002499 accounting infor... 2026-01-08
讲解 comp9313 2021t3 project... 2026-01-08
讲解 stat1201 analysis of sc... 2026-01-08
辅导 stat5611: statistical m... 2026-01-08
辅导 mth2010-mth2015 - multi... 2026-01-08
辅导 eeet2387 switched mode ... 2026-01-08
讲解 an online payment servi... 2026-01-08
讲解 textfilter辅导 r语言 2026-01-08
讲解 rutgers ece 434 linux o... 2026-01-08

热点标签

msinm014/msing014/msing014b

联系我们 - QQ: 99515681 微信：codinghelp

程序辅导网！