COMP3065 Computer vision Coursework (40% of Module Mark)

Submit an electronic copy via Moodle

In class we have learned many techniques that help solve computer vision problems. Some
techniques were discussed in detail in the lectures or labs, while others were only briefly
covered. In this coursework, you are required to apply these techniques to solve practical
problems of interest to you. You will implement or use the techniques discussed in class, or
any computer vision algorithm you find in textbooks or published papers, depending
on the project you select.

1. Select a project. First, you need to select one of the following projects to work on. Note
that the descriptions below cover only the basic requirements of each project. You need to
implement additional features of your choice in order to obtain higher marks for the
coursework (see the marking rubric on Moodle). Additional features could include supporting
more input images/videos, adding steps/algorithms to improve results, etc.

a) Stereo vision. In this project, you are required to write a program that can
successfully compute the depth map from two images capturing the same scene from
different positions. You can rectify the images before searching for corresponding
points to produce the disparity. Note that the inputs to your program are pairs of images
captured by yourself (at least 3 pairs). The output of your program should be a
depth map similar to those shown in the lecture notes (e.g. see the following figure:
a grayscale image in which white pixels indicate small depth and black pixels indicate
large depth).
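
The correspondence search described above can be illustrated with a minimal sketch. It assumes the pair is already rectified (so matches lie on the same row), uses sum-of-absolute-differences block matching as one possible matching cost, and represents images as nested lists of grayscale values. The names `sad_disparity`, `disparity_to_gray`, `max_disp` and `half` are hypothetical, and the synthetic images only stand in for real photographs. For a rectified pair, depth is inversely proportional to disparity, so mapping large disparity to white gives the "white is near" convention mentioned above.

```python
def sad_disparity(left, right, max_disp, half=1):
    """Per-pixel disparity for a rectified pair via block matching.

    left, right: equally sized 2-D lists of grayscale intensities.
    A left pixel (y, x) is compared with right pixels (y, x - d) for
    d = 0..max_disp, using the sum of absolute differences (SAD) over a
    (2*half+1) x (2*half+1) window. Border pixels keep disparity 0.
    """
    h, w = len(left), len(left[0])
    disp = [[0] * w for _ in range(h)]
    for y in range(half, h - half):
        for x in range(half, w - half):
            best_cost, best_d = None, 0
            for d in range(max_disp + 1):
                if x - d - half < 0:  # window would leave the right image
                    break
                cost = 0
                for dy in range(-half, half + 1):
                    for dx in range(-half, half + 1):
                        cost += abs(left[y + dy][x + dx]
                                    - right[y + dy][x + dx - d])
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y][x] = best_d
    return disp


def disparity_to_gray(disp, max_disp):
    """Scale disparity to 0..255 so near points (large disparity) are bright."""
    return [[round(255 * d / max_disp) for d in row] for row in disp]


# Synthetic stand-in for a rectified pair: the left view equals the right
# view shifted 2 pixels to the right, so the true disparity is 2.
right = [[x * x + y for x in range(12)] for y in range(6)]
left = [[right[y][x - 2] if x >= 2 else 0 for x in range(12)] for y in range(6)]

disp = sad_disparity(left, right, max_disp=4)
gray = disparity_to_gray(disp, max_disp=4)
print(disp[2])
```

On real photographs a larger window, a left-right consistency check, or smoothing of the disparity map would probably be needed; those are the kind of additional steps the project description leaves open.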

b) Sparse optical flow for tracking. In this project, you are required to write a program
that can track objects in a given video via sparse optical flow. You can use SIFT
or other features to identify good feature points and then compute optical flow only
for those points to track their locations in the next frame of the video. Note that the
inputs to your program are short videos (possibly a few seconds long) captured by yourself.
The output of your program should be the same video with the trajectories of your tracked
points drawn on it (e.g. a green line indicating the sequence of locations of each point).
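
As an illustration of computing flow only at selected points, the sketch below performs a single Lucas-Kanade step for one point, which is one common way to estimate sparse optical flow. It assumes small motion between frames, grayscale frames stored as nested lists of floats, and a feature point that has already been chosen by some detector. The names `lucas_kanade_step`, `make_frame` and `half` are hypothetical, and the synthetic frames only stand in for real video frames.

```python
def lucas_kanade_step(prev, curr, x, y, half=2):
    """Estimate the flow (u, v) of point (x, y) between two grayscale frames.

    prev, curr: 2-D lists of floats. The point must lie at least half + 1
    pixels from the border. The 2x2 Lucas-Kanade normal equations are
    accumulated over a (2*half+1) x (2*half+1) window and solved directly.
    Returns None when the window has too little texture to solve them.
    """
    sxx = sxy = syy = sxt = syt = 0.0
    for j in range(y - half, y + half + 1):
        for i in range(x - half, x + half + 1):
            ix = (prev[j][i + 1] - prev[j][i - 1]) / 2.0  # gradient in x
            iy = (prev[j + 1][i] - prev[j - 1][i]) / 2.0  # gradient in y
            it = curr[j][i] - prev[j][i]                  # temporal change
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-9:  # flat or edge-only window (aperture problem)
        return None
    # Solve [[sxx, sxy], [sxy, syy]] [u, v]^T = [-sxt, -syt]^T.
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


def make_frame(shift_x, shift_y, size=12):
    """Synthetic smooth image, translated by (shift_x, shift_y) pixels."""
    return [[(i - shift_x) ** 2 + (j - shift_y) ** 2 for i in range(size)]
            for j in range(size)]


frame0 = make_frame(0.0, 0.0)
frame1 = make_frame(0.2, -0.1)  # true motion: 0.2 px right, 0.1 px up

u, v = lucas_kanade_step(frame0, frame1, x=5, y=5)
trajectory = [(5, 5), (5 + u, 5 + v)]  # points to join with a line
print(trajectory)
```

Repeating the step for every tracked point and every consecutive pair of frames, and appending each new location to its trajectory, gives the polylines to draw on the output video. Larger motions would likely need an iterative or pyramidal variant.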

c) Image search. In this project, you are required to write a program that can search
within a set of images for a given image, using features from techniques such as
bag-of-words or CNN. You need to prepare a small set of images of any kind by
downloading them from the internet or using a subset of an existing public image dataset.
The dataset should contain at least no more than 20 images. You need to consider what
features or techniques should be used, how you define a match, etc.
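
One possible way to define a match with bag-of-words features is sketched below. It assumes each image has already been reduced to a list of visual-word indices (for example by quantising local descriptors against a vocabulary, a step not shown here), and ranks database images by the cosine similarity of their word histograms. The function names, the file names and the tiny vocabulary of 5 words are all hypothetical.

```python
import math
from collections import Counter


def bow_histogram(word_ids, vocab_size):
    """L2-normalised histogram of visual-word occurrences for one image."""
    counts = Counter(word_ids)
    hist = [float(counts.get(k, 0)) for k in range(vocab_size)]
    norm = math.sqrt(sum(c * c for c in hist))
    return [c / norm for c in hist] if norm else hist


def cosine(a, b):
    """Cosine similarity of two already-normalised histograms."""
    return sum(x * y for x, y in zip(a, b))


def search(query_words, database, vocab_size, top_k=3):
    """Return the top_k (score, name) pairs, best match first."""
    q = bow_histogram(query_words, vocab_size)
    scored = [(cosine(q, bow_histogram(words, vocab_size)), name)
              for name, words in database.items()]
    scored.sort(reverse=True)
    return scored[:top_k]


# Hypothetical database: image name -> visual-word indices of its features.
database = {
    "cat.jpg": [0, 0, 1, 2],
    "car.jpg": [3, 3, 4, 4],
    "dog.jpg": [0, 1, 1, 2],
}

results = search([0, 0, 1, 2, 2], database, vocab_size=5)
for score, name in results:
    print(f"{name}: {score:.3f}")
```

A threshold on the similarity score, or a comparison of the best and second-best scores, could then decide whether the top result counts as a match. CNN features would replace the histograms with embedding vectors while the ranking step stays the same.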

d) Your own idea. You can write a program to solve a computer vision problem that
you are interested in. This can be any topic covered in class, or one not covered
in class but relevant to computer vision (see the project ideas given in the first lab).
You can also select a computer vision paper to implement. You do not need to
implement the full paper as long as your program captures its main idea. The scope of
your own idea should be similar to that of project a or b, or harder. If you select your
own idea, make sure you discuss with me what you want to do (see following). The idea is
subject to my approval. In general, I will allow it as long as it is not too simple to
implement.

2. Write a program implementing your design. Python is recommended,
although any programming language is acceptable. You can use any libraries that help you
achieve your tasks, such as OpenCV, as long as they do not directly give you the output
of the project you are working on. You cannot directly use code you found online or
the lab sample code. Please consult me if you are not sure whether a certain library
is allowed.

3. Write a report (max 2500 words) which:
Describes the main objectives of your project and the key functionalities/features
implemented.
Describes detailed steps included in your method and specific computer vision
techniques employed.
Presents and explains the results obtained on the test images/videos.
Critically evaluates your method on the basis of those results; what are its strengths
and weaknesses? This section of the report should make explicit reference to
features of the results you obtained and how they compare to the expectations you
had of your design.

Assessment criteria:
Code: 50%
Report:
Description of key features of the implementation: 25%
Explanation of the results obtained: 10%
Discussion of the strengths and weaknesses of the chosen approach and
methods: 15%

What to submit: two files: 1) a zip file containing the source code and test
images/videos; 2) a report of max 2500 words as described above. Due 23:59, May 06, 2022.