辅导COMP/ENGN4528 Computer Vision

COMP/ENGN4528 Computer Vision - 2022
Computer-Lab 2(C-Lab2)
COMP/ENGN4528, 2022
March 27, 2022
Objectives:
This is CLab-2 for COMP/ENGN4528 Computer Vision. This Lab focuses on features, develop-
ing mid-level computer vision features, and using Deep Learning to learn features to perform a
classification task.
For this the second part of this lab we highly recommend that you use PyTorch. Please discuss
with your tutor if you plan to use anything else.
Special Notes:
1. Each computer lab has three weeks for submission, but only two weeks of classes: session-A
and session-B. Tutors/Lab instructor will provide basic supervision to both sessions. Please
attend both sessions to get support. Piazza support is available throughout.
2. Your Lab will be marked based on the overall quality of your Lab Report (PDF). The report
is to be uploaded to Wattle site before the due time, which is usually on Sunday 11:59pm of
Week-3 session of your lab.
3. Your submission includes the lab report in PDF format as well as the Lab code that generating
the experimental result.
4. It is normal if you cannot finish all the tasks within two 2-hour sessions — these tasks are
designed so that you will have to spend about 9 hours to finish all tasks including finishing
your Lab report. This suggests that, before attending the third lab session (in Week-2 of
each CLab), you must make sure that you have almost complete 80%. It may take longer if
you are not familiar with programming in Python and with object oriented concepts, or less
time if you are already familiar with pytorch.
5. In your Lab Report, you need to list your complete source code with detailed comments for
each task. You should also show corner detection results and their comparisons for each of
the test images for the first task.
Academic Integrity:
You are expected to comply with the University Policy on Academic Integrity and Plagiarism. You
are allowed to talk with / work with other students on lab and project assignments. You can share
ideas but not code, you should submit your own work. Your course instructors reserve the right
to determine an appropriate penalty based on the violation of academic dishonesty that occurs.
Violations of the university policy can result in severe penalties.
1
Figure 1: For Matlab Users.
CLab-2 Tasks
Task 1 Harris Corner Detector. (5 marks)
For Matlab Users:
1. Read and understand the corner detection code ‘harris.m’ in Fig 1.
2. Complete the missing parts, rewrite ‘harris.m’ into a Matlab function, and design appropriate
function signature (1 mark).
3. Please provide comments on line #13 and every line of your solution after line #20 (0.5
mark) in your report. Specifically, you need to provide short comments on your code, which
should make your code readable. Please make sure you include the comments in your report
to get marks.
4. Test this function on the provided four test images (Harris-[1,2,3,4].jpg, they can be down-
loaded from Wattle). Display your results by marking the detected corners on the input
images (using circles or crosses, etc) (0.5 mark for each image, 2 marks in total).
Please make sure that your code can be run successfully on a local machine and generate
results. If your submitted code cannot replicate your results, you may need to explain and
demonstrate the results in person to tutors.
5. Compare your results with that from Matlab’s built-in function corner() (0.5 mark), and
discuss the factors that affect the performance of Harris corner detection (1 mark).
For Python users:
2
1. Read and understand the below corner detection code (Fig. 2).
2. Complete the missing parts, rewrite them to ‘harris.py’ as a python script, and design ap-
propriate function signature (1 mark).
3. Comment on block #5 (corresponding to line #13 in “harris.m”) and every line of your
solution in block #7 (0.5 mark) in your report. Specifically, you need to provide short
comments on your code, which should make your code readable. Please make sure you
include the comments in your report to get marks.
4. Test this function on the provided four test images (Harris-[1,2,3,4].jpg, they can be down-
loaded from Wattle). Display your results by marking the detected corners on the input
images (using circles or crosses, etc) (0.5 mark for each image, 2 marks in total).
Please make sure that your code can be run successfully on a local machine and generate
results. If your submitted code cannot replicate your results, you may need to explain and
demonstrate the results in person to tutors.
5. Compare your results with that from python’s built-in function cv2.cornerHarris() (0.5 mark),
and discuss the factors that affect the performance of Harris corner detection (1 mark).
Task 2 - Deep Learning Classification (10 Marks)
In this lab we will train a CNN with the Kuzushiji-MNIST dataset using the PyTorch deep learning
framework. The Kuzushiji-MNIST dataset contains 70000 images: 59000 training images, 1000
validation images and 10000 testing images1. Images are 28×28 Greyscale.
Complete the following exercises:
1. Download the Kuzushiji-MNIST dataset from google drive: link
2. After loading the data using numpy, normalize the data to the range between (-1, 1). Also
perform the following data augmentation when training:
randomly flip the image left and right.
zero-pad 4 pixels on each side of the input image and randomly crop 28x28 as input to
the network.
3. Build a CNN with the following architecture:
5×5 Convolutional Layer with 32 filters, stride 1 and padding 2.
ReLU Activation Layer.
2×2 Max Pooling Layer with a stride of 2.
3×3 Convolutional Layer with 64 filters, stride 1 and padding 1.
ReLU Activation Layer.
2×2 Max Pooling Layer with a stride of 2.
Fully-connected layer with 1024 output units.
ReLU Activation Layer.
Fully-connected layer with 10 output units.
4. Set up cross-entropy loss.
5. Set up Adam optimizer, with 1e-3 learning rate and betas=(0.9, 0.999). (3 marks to here)
6. Train your model. Draw the following plots:
Training loss vs. epochs.
Training accuracy vs. epochs.
1The reference for the dataset is: Tarin Clanuwat, Mikel Bober-Irizar, Asanobu Kitamoto, Alex Lamb, Kazuaki
Yamamoto, David Ha, "Deep Learning for Classical Japanese Literature", arXiv:1812.01718.
3
Validation loss vs. epochs.
Validation accuracy vs. epochs.
You can either use Tensorboard to draw the plots or you can save the data (e.g. in a dictio-
nary) then use Matplotlib to plot the curve. (2 marks)
7. Train a good model. Marks will be awarded for high performance and high efficiency in
training time and parameters (there may be a trade-off), good design, and your discussion.
You are not allowed to use a pre-trained model, you should train the model yourself. You need
to describe what exactly you did to the base model to improve your results in your report,
and your motivation for your approach (no more that 1 page of text). Please include plots as
above for training and validation loss and accuracy vs. epochs, as well as the final accuracy on
the test set. Please submit the code and your trained model for this. Your performance will
be verified. Please ensure you follow the test/train/validation splits as provided. You also
must show your training and validation accuracy vs. epochs for this model in your report.
Note that you may be asked to run training of your model to demonstrate its training and
performance. (4 marks)
8. The main dataset site on github (https://github.com/rois-codh/kmnist) includes a series
of results for other network models (under Benchmark & Results). How does your model
compare? Explain why the ResNet18 model may produce better results than yours (or the
other way around if this is the case). (1 mark).
Resources:
PyTorch training a classifier tutorial: https://pytorch.org/tutorials/beginner/blitz/
cifar10_tutorial.html (Note this uses CIFAR-10, but you need to train Kuzushiji-MNIST.)
PyTorch Documentation: https://pytorch.org/docs/stable/index.html
Deeper models implementation: https://github.com/pytorch/vision/tree/master/torchvision/models
Tensorboard tutorial: https://www.tensorflow.org/tensorboard/get_started
1 Notes
1. The lab report will be due on Sunday, 1 May 2022, 11:59pm. You are required to complete
all tasks and answer all questions. Please submit a single zip file containing your report in
pdf format and attach all your code. You are required to submit code including your best
performing trained model, and Latex is recommended for the report (but not required). Name
your report as Lab_2_uxxxxxxx.pdf, and name your submission as Lab_2_uxxxxxxx.zip,
replacing uxxxxxxx with your uni-ID.
2. Marks will be deducted for poor presentation such as code and report quality. Failing to
follow the above instructions will lead to mark deductions.
3. This Lab is worth 15% of the total course assessment.
4. Late penalty will be 10% per day after the deadline, which is Sunday, 1 May 2022, 11:59pm.