EIE4435
Image and Audio Processing
Semester 1, 2020/21
Q1. (Image Enhancement and Segmentation, 20 marks)
(a) Figure Q1a shows a 7x5 digital image with 3-bit amplitude resolution (the allowable gray levels of this image are {0, 1, 2, …, 7}).
Figure Q1a
Suppose that you want to use the thresholding technique to segment the image shown in Fig. Q1a. Derive an appropriate threshold value and threshold the image accordingly. The initial threshold can be set to the average of the pixel intensities. Apply the following rule when thresholding or grouping: pixel (i,j) belongs to group A if g(i,j) ≤ the current threshold T; otherwise it belongs to group B, where g(i,j) is the intensity value of pixel (i,j). (7 marks)
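The iterative threshold selection described in part (a) can be sketched as follows. This is a minimal illustration only: the pixel values are hypothetical (they are not those of Fig. Q1a), and the function name `iterative_threshold`, the `max_iter` guard and the stopping rule (stop when the threshold no longer changes) are assumptions rather than part of the question.

```python
def iterative_threshold(pixels, max_iter=100):
    """Iteratively refine a global threshold (assumed stopping rule: no change)."""
    t = sum(pixels) / len(pixels)          # initial threshold: mean intensity
    for _ in range(max_iter):
        group_a = [p for p in pixels if p <= t]   # g(i,j) <= T
        group_b = [p for p in pixels if p > t]    # otherwise
        if not group_a or not group_b:
            return t                       # one group is empty: cannot refine
        mean_a = sum(group_a) / len(group_a)
        mean_b = sum(group_b) / len(group_b)
        new_t = (mean_a + mean_b) / 2      # midpoint of the two group means
        if new_t == t:
            return t                       # threshold has converged
        t = new_t
    return t


# Hypothetical 3-bit pixel values, NOT the image of Fig. Q1a.
pixels = [0, 0, 0, 1, 1, 2, 5, 7]
T = iterative_threshold(pixels)
mask = [0 if p <= T else 1 for p in pixels]   # 0 = group A, 1 = group B
```

For these sample values the threshold starts at 2.0 and settles at 10/3, so only the pixels with values 5 and 7 fall in group B.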
(b) Enhance the image shown in Fig. Q1a by increasing its amplitude resolution to 8-bit and equalizing its histogram simultaneously. Show the first two rows of the output image. (10 marks)
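One common way to combine the two steps in part (b) is to map each input level k to round(255 × CDF(k)), where CDF is the cumulative histogram of the 3-bit image. The sketch below assumes that mapping and round-half-up rounding; the course may use a slightly different convention. The 2x2 image is hypothetical and the name `equalize_to_8bit` is illustrative.

```python
def equalize_to_8bit(image, in_levels=8, out_max=255):
    """Histogram-equalize an image and stretch it to the range 0..out_max."""
    flat = [p for row in image for p in row]
    n = len(flat)
    hist = [0] * in_levels
    for p in flat:
        hist[p] += 1
    mapping = []
    running = 0
    for count in hist:
        running += count                   # cumulative pixel count up to this level
        cdf = running / n
        mapping.append(int(out_max * cdf + 0.5))   # assumed: round half up
    output = [[mapping[p] for p in row] for row in image]
    return output, mapping


# Hypothetical 3-bit image, NOT the image of Fig. Q1a.
image = [[0, 1],
         [1, 7]]
output, mapping = equalize_to_8bit(image)
```

Here the cumulative distribution is 0.25 at level 0, 0.75 at levels 1 to 6, and 1.0 at level 7, so the output levels are 64, 191 and 255.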
(c) Figure Q1c shows an image polluted by salt and pepper noise. Suggest a filter to remove the noise. (3 marks)
Figure Q1c
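A median filter is a standard choice for salt and pepper noise, because isolated extreme values rarely survive as the median of their neighbourhood. The sketch below is one possible 3x3 version; the replicate-border handling and the test image are assumptions, not taken from Fig. Q1c.

```python
def median_filter_3x3(image):
    """3x3 median filter with replicated borders (an assumed border policy)."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for i in range(rows):
        for j in range(cols):
            window = []
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    r = min(max(i + di, 0), rows - 1)   # clamp row index
                    c = min(max(j + dj, 0), cols - 1)   # clamp column index
                    window.append(image[r][c])
            window.sort()
            out[i][j] = window[4]          # middle of the 9 sorted values
    return out


# Hypothetical flat patch with a single "salt" pixel in the centre.
noisy = [[3, 3, 3],
         [3, 7, 3],
         [3, 3, 3]]
clean = median_filter_3x3(noisy)
```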
Q2. (Image Representation, 20 marks)
(a) Draw the medial axis of the shaded object shown in Fig. Q2a. (5 marks)
Figure Q2a
(b) Approximate the boundary of the shaded object shown in Fig. Q2b with a polygon by using the splitting technique. The splitting criterion is that the maximum distance of the boundary from the corresponding side of the approximate polygon is larger than 0.3r. (5 marks)
Figure Q2b
(c) Figure Q2c shows the contour of an object and the grid applied. Also shown in the figure is the 4-directional code and the 8-directional code definitions.
Figure Q2c
(i) Represent the contour of the object in a clockwise direction with a 4-directional chain code and an 8-directional chain code, respectively. Use the top-left corner as the starting point. (4 marks)
(ii) Normalize your 8-directional code with respect to the orientation of the object (the first difference code). (2 marks)
(iii) Now the 8-directional code of an input object is 7756535533131101. Determine whether it represents the same object as that shown in Fig. Q2c, but in a different orientation. Show your steps clearly. (4 marks)
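The comparison in parts (ii) and (iii) can be sketched as below. The first difference is taken circularly (the first element is compared with the last), and the start point is normalized by taking the smallest circular rotation. The direction numbering is an assumption (0 = east, increasing counter-clockwise), since the definitions in Fig. Q2c are not reproduced here, and the sample codes are hypothetical squares and rectangles rather than the contour of Fig. Q2c. Note that this test only detects rotations by multiples of 45 degrees.

```python
def first_difference(code, directions=8):
    """Circular first difference of a chain code (rotation normalization)."""
    return [(code[i] - code[i - 1]) % directions for i in range(len(code))]


def normalize_start(seq):
    """Pick the smallest circular rotation (start-point normalization)."""
    return min(seq[k:] + seq[:k] for k in range(len(seq)))


def same_shape(code_a, code_b, directions=8):
    """True if the two chain codes differ only by rotation and start point."""
    if len(code_a) != len(code_b):
        return False
    diff_a = normalize_start(first_difference(code_a, directions))
    diff_b = normalize_start(first_difference(code_b, directions))
    return diff_a == diff_b


# Hypothetical contours, NOT the object of Fig. Q2c.
square = [0, 0, 6, 6, 4, 4, 2, 2]
square_rotated = [(c + 2) % 8 for c in square]      # rotated by 90 degrees
rectangle = [0, 0, 0, 6, 4, 4, 4, 2]
```

A digit string such as the one in part (iii) can be converted with `[int(ch) for ch in "7756535533131101"]` before being passed to `same_shape`.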
Q3. (Digital Image Coding, 20 marks)
(a) Suppose that a 256 gray-level image is divided into 4 × 4 blocks and encoded using a JPEG baseline encoder. The quantized Discrete Cosine Transform (DCT) coefficients of the first two 4 × 4 blocks in the image are given in Fig. Q3a.
Fig. Q3a
(i) Using the DC Coding Table and the AC Coding Table in Table 1 and Table 2 respectively, find the output bitstream for each 4 × 4 block. (10 marks)
(ii) Determine the compression ratio for each 4 × 4 block. (2 marks)
(b) Encode the message ‘ABCB’ with the arithmetic coding scheme. Show or illustrate your steps clearly. (8 marks)
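The interval-narrowing steps of arithmetic coding can be sketched as follows. The question does not state a probability model here, so the sketch assumes the symbol probabilities are taken from the message itself (A: 1/4, B: 1/2, C: 1/4) and that the unit interval is partitioned in the order A, B, C; both are assumptions. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def arithmetic_encode(message, probs):
    """Return the final [low, high) interval after encoding the message."""
    # Build the cumulative sub-interval of each symbol, in dict order.
    ranges = {}
    cum = Fraction(0)
    for sym, p in probs.items():
        ranges[sym] = (cum, cum + p)
        cum += p
    low, high = Fraction(0), Fraction(1)
    for sym in message:
        width = high - low
        s_low, s_high = ranges[sym]
        # Narrow the current interval to the symbol's sub-interval.
        low, high = low + width * s_low, low + width * s_high
    return low, high


# Assumed model: probabilities estimated from the message 'ABCB' itself.
probs = {"A": Fraction(1, 4), "B": Fraction(1, 2), "C": Fraction(1, 4)}
low, high = arithmetic_encode("ABCB", probs)
```

Under these assumptions the interval narrows as [0, 1/4), [1/16, 3/16), [5/32, 3/16), and finally [21/128, 23/128); any number inside the final interval identifies the message.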
Q4. (Noise Shaping, 20 marks)
(a) Figure Q4a shows a block diagram of a noise shaping circuit in an analog-to-digital conversion system.
Figure Q4a
(i) Show that, in Fig. Q4a, signal Y is the 2nd-order sigma-delta modulation output of signal X. (6 marks)
(ii) Derive and sketch the frequency response of the noise transfer function of this 2nd-order sigma-delta modulator. (4 marks)
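As a numerical check for part (ii), the sketch below evaluates the magnitude of the noise transfer function on the unit circle. It assumes the standard form NTF(z) = (1 − z⁻¹)² for a 2nd-order sigma-delta modulator, which is what the derivation in part (i) is expected to yield; the circuit of Fig. Q4a is not reproduced here. For this form the magnitude equals 4 sin²(ω/2), rising from 0 at DC to 4 at ω = π.

```python
import cmath
import math


def ntf_magnitude(omega, order=2):
    """|NTF(e^{j*omega})| for the assumed form NTF(z) = (1 - z^-1)^order."""
    z_inv = cmath.exp(-1j * omega)
    return abs((1 - z_inv) ** order)


# Sample the response from DC to the Nyquist frequency (omega = pi).
omegas = [k * math.pi / 8 for k in range(9)]
response = [ntf_magnitude(w) for w in omegas]
```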
(b) Figure Q4b is a noise shaper. The input is bounded by 0 V and 1 V. A constant 0.75 V input is fed into the noise shaper.
Figure Q4b
(i) Show that this is a 1st-order sigma-delta noise shaper. (3 marks)
(ii) Derive the output sequence by completing Table Q4b. (3 marks)
Table Q4b
(iii) Is the output sequence periodic? If so, what is its period? (2 marks)
(iv) What is the problem with a periodic output? Suggest a solution to solve this problem. (2 marks)
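The behaviour asked about in part (b) can be explored with a small simulation. The structure below is an assumption, since Fig. Q4b and Table Q4b are not reproduced here: an integrator accumulates the difference between the input and the previous 1-bit output, and a quantizer with a 0.5 V threshold produces 0 or 1. The initial state (zero integrator, zero output) is also assumed, so the internal values may differ from those expected in the table.

```python
def first_order_sigma_delta(x, n_samples, threshold=0.5):
    """Simulate an assumed 1st-order sigma-delta loop with a constant input x."""
    v = 0.0          # integrator state (assumed to start at 0)
    y = 0            # previous 1-bit output (assumed to start at 0)
    outputs = []
    for _ in range(n_samples):
        v = v + x - y                      # integrate input minus fed-back output
        y = 1 if v >= threshold else 0     # 1-bit quantizer
        outputs.append(y)
    return outputs


sequence = first_order_sigma_delta(0.75, 8)
average = sum(sequence) / len(sequence)
```

With these assumptions the output repeats the pattern 1, 1, 0, 1, and its average equals the 0.75 V input.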
Q5. (Digital Audio Coding, 20 marks)
A perceptual audio codec is used to compress an audio signal. The codec groups every 8 barks into a subband and then allocates bits to the subbands according to the result of a spectrum analysis based on a psychoacoustic model. All samples in the same subband are quantized with the same quantizer, whose bit resolution is allocated by the codec. Fig. Q5a shows the frequency spectrum of a windowed segment of the audio signal. The psychoacoustic model shown in Fig. Q5b is used in the audio codec to derive the masking threshold for the audio segment.
(a) Locate the potential maskers. (4 marks)
(b) Based on the given psychoacoustic model, show the masking threshold in the figure. (6 marks)
(c) Determine the Signal-to-Mask levels of each subband. (4 marks)
(d) Suppose that allocating one additional bit to a subband results in a 6 dB drop of the noise floor in that subband. Allocate an appropriate number of bits to all subbands. (2 marks)
(e) Explain briefly why increasing the number of subbands can improve the perceptual quality of the audio signal. (2 marks)
(f) Give reasons why samples in Subband 4 generally require fewer bits to encode. (2 marks)
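The 6 dB per bit rule in part (d) suggests a simple allocation: give each subband just enough bits to push the quantization noise below the masking threshold, that is, ceil(SMR / 6) bits, and none when the signal-to-mask level is not positive. The sketch below assumes that rule; the signal-to-mask values are hypothetical and are not read from Fig. Q5a or Fig. Q5b.

```python
import math


def allocate_bits(smr_db, db_per_bit=6.0):
    """Bits per subband so that the noise floor drops below the mask (assumed rule)."""
    return [max(0, math.ceil(smr / db_per_bit)) for smr in smr_db]


# Hypothetical signal-to-mask levels in dB for four subbands.
smr_db = [20, 13, -3, 6]
bits = allocate_bits(smr_db)
```

For these sample levels the allocation is 4, 3, 0 and 1 bits; a subband whose signal lies entirely below the mask needs no bits at all.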