
Implement a Machine Learning Model and Test the Training Algorithm on Data


Introduction
The goal of this assignment is to develop your hands-on skills in learning from data and to deepen your understanding of the practical technical details of the learning procedure.
You will implement a simple machine learning algorithm from scratch, for example the ID3 decision tree building algorithm, the perceptron training algorithm, or an ensemble method. You can also implement another algorithm of interest to you that solves a supervised learning problem. The algorithm needs to be tested on a simple but practical dataset and under an appropriate learning framework as instructed in the course. The task also includes a report reflecting on the development, the testing scheme and the results.
Alternatively, unsupervised learning algorithms can also be considered, but you must specify the testing scheme and the criteria clearly in the report (see below) if you choose to implement an unsupervised learning algorithm.
Specification
Implementation from scratch: The implementation should include the detailed computational steps of the algorithm. We do NOT consider straightforward use of off-the-shelf toolboxes as implementing the algorithm details. For example, if you choose to build a decision tree, the implementation of the tree-building algorithm should address the construction of the tree structure and the computation for splitting the data (a subset of the training dataset) at a tree node to create the child nodes; in ID3, this means computing the entropy and the information gain and deciding the split accordingly. However, you may use basic auxiliary tools, such as libraries for matrix or linear algebra operations or for loading and parsing data files.
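As an illustration of the level of detail expected, the ID3 split computation mentioned above could be sketched as follows. This is a minimal sketch, not a reference solution: it assumes categorical attributes, with each sample stored as a dict mapping attribute name to value, and the function names (entropy, information_gain, best_attribute) are hypothetical choices made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting on one categorical attribute.

    rows   -- list of dicts (attribute name -> value); an assumed layout
    labels -- list of class labels, aligned with rows
    """
    total = len(labels)
    # Group the labels by the value each sample takes for this attribute.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the child nodes after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Pick the attribute with the highest information gain at this node."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))
```

A full ID3 implementation would call best_attribute recursively on each partition until a node is pure or no attributes remain; that recursion and the tree data structure are left out here.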
Practical dataset: The dataset should contain sufficient samples to represent practical relationships between the attributes and the target to be predicted. Typical examples include the Iris flower dataset, the image dataset of hand-written digits, or other examples that have been used in the tutorial demos. Manually crafted toy datasets are not recommended.
Learning framework and test scheme: A proper training and validation scheme must be set up to test the implementation. More sophisticated evaluation schemes are also welcome. For formal and detailed information on the learning framework, refer to the related sections in the course materials.
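One possible training and validation scheme is k-fold cross-validation. The sketch below is an assumption about what such a scheme might look like, not the scheme prescribed by the course materials. The names k_fold_indices and cross_validate are hypothetical, and fit and predict stand for whatever training and prediction functions your own implementation provides.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    folds = [indices[i::k] for i in range(k)]  # k near-equal, disjoint folds
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


def cross_validate(fit, predict, X, y, k=5, seed=0):
    """Return (mean accuracy, per-fold accuracies) over k folds.

    fit(X_train, y_train) -> model   (hypothetical callable you supply)
    predict(model, x)     -> label   (hypothetical callable you supply)
    """
    scores = []
    for train_idx, val_idx in k_fold_indices(len(X), k, seed):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in val_idx)
        scores.append(correct / len(val_idx))
    return sum(scores) / len(scores), scores
```

Each sample is used for validation exactly once, so the mean accuracy is less sensitive to a single lucky or unlucky split than one fixed hold-out set would be. The sketch assumes the dataset has at least k samples.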
What to Submit
You will submit a PDF file, including a link to a cloud-based source code hosting service where your implementation can be assessed and evaluated. An example template of the PDF report is attached at the bottom of this document.
Note that the link to the project implementation is a VETO criterion in grading this assignment: 1) failing to set it up will prevent the assessors from marking the rest of your report, and 2) modifying the setup after the due date will incur a late penalty (likely a heavy one by the time a link is found not to work).
The project must be self-contained, i.e. all data collection, data processing and environment preparation steps must be included. Assessors will NOT modify their computer environment to enable execution of the project. This is best ensured by using a cloud computing service (Google Colab is recommended) and verifying your project by accessing it from an anonymous account (different from your own, to confirm that you have given access permission to third-party users, i.e. the assessors).
It is difficult to give an estimate of the number of lines of code. However, you should be able to code it all up using 3 or 4 classes in an object-oriented design. You can use the demo tutorials in this course as a reference.
Software
We recommend using Python, for which the software setup is readily provided on Google Colab. You are free to develop in another language, but all of the above requirements still apply.
 