
Tutoring: CS3TM Text Mining and Natural Language Processing (Debugging Python Programs)

Department of Computer Science

Summative Coursework Set Front Page

Module Title: Text Mining and Natural Language Processing

Module Code: CS3TM

Type of Assignment (e.g., technical report, set exercise, in-class test): Technical report

Individual or Group Assignment: Individual

Weighting of the Assignment: 50%

Word count/page limit: 12 pages (excluding appendix)

Expected hrs spent for the assignment (set by lecturer): 8 hours

Items to be submitted: Individual report in PDF with commented code

Work to be submitted on-line via Blackboard Ultra by: 20/5/2025 noon

Work will be marked and returned by: 15 working days after the submission deadline

Artificial Intelligence Tools (select one of these): May not be used

Note

By submitting this work, you are certifying that you have read the assessment guidelines, which are displayed in the Assessment folder on the Blackboard course for this module, and that you understand and have conformed to the associated policies and practices, including those on:

• Submitting your own work, not that of other people or systems (including those using artificial intelligence), and the associated penalties for Academic Misconduct

• Submitting by the specified deadline, and the penalties associated with late submission (if allowed)

• The exceptional circumstances system

• For students with relevant needs, attaching a green sticker

1. Assessment classifications

First Class (>= 70%)

The coursework demonstrates:

· Exceptional understanding of the principles of natural language processing

· Solid knowledge of the techniques/algorithms used for text processing and excellent technical skills in implementing these algorithms.

· Comprehensive analysis of results from the implemented algorithms

· Excellent presentation of the report

Upper Second (60-69%)

The coursework demonstrates:

· Good understanding of the principles of natural language processing

· Appropriate use of techniques/algorithms for text processing.

· Good technical skills in implementing these algorithms, with good analysis of the results.

· Clear presentation of the report

Lower Second (50-59%)

The coursework demonstrates:

· Basic understanding of the principles of natural language processing

· Basic use of text processing algorithms.

· Moderate technical skills in implementation

· Clear presentation of the report

Third (40-49%)

The coursework demonstrates:

· Satisfactory understanding of the principles of natural language processing

· Satisfactory use of text processing algorithms.

· Satisfactory technical skills in implementation.

Pass (35-39%)

The coursework demonstrates:

· Satisfactory understanding of the principles of natural language processing

· Satisfactory knowledge of how to implement these algorithms.

Fail (0-34%)

The coursework fails to demonstrate understanding of NLP techniques or the skills to implement them.

2. Assignment description

Summary:

A technical report is required. Please refer to the report structure and marking scheme below.

This report should describe the related concepts of text mining and NLP techniques, and the experimental results of two tasks. The experiments include:

Task 1: Apply NLP analysis methods at the linguistic levels of morphology, lexicon, syntax, and semantics to process text inputs and extract features.

Task 2: Train a logistic regression classifier (or another classifier) on two newsgroups and predict the group labels of your own two-class data set.

· Use a tf-idf weighted unigram bag-of-words model as the baseline model.

· Add more text extraction methods (optional)
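For Task 1, a feature-extraction pass over a text input might look like the following minimal sketch. It uses only the Python standard library as a stand-in: the suffix list, the `crude_stem` helper, and the `extract_features` function are hypothetical names introduced for illustration, not part of the skeleton code. It touches morphology (suffix stripping), lexicon (stem counts and type-token ratio), and a rough proxy for syntax (word bigrams). Semantic analysis is not covered here and would normally need a dedicated NLP library.

```python
import re
from collections import Counter

# Hypothetical suffix list for a crude stemmer; this is an assumption for
# illustration, not a real morphological analyser.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def extract_features(text):
    """Collect simple morphological, lexical, and word-order features."""
    tokens = tokenize(text)
    stems = [crude_stem(t) for t in tokens]
    return {
        "tokens": tokens,                      # surface word forms
        "stems": stems,                        # morphology: suffix-stripped forms
        "stem_counts": Counter(stems),         # lexicon: stem frequencies
        "type_token_ratio": len(set(tokens)) / len(tokens) if tokens else 0.0,
        "bigrams": list(zip(tokens, tokens[1:])),  # rough proxy for word order
    }


features = extract_features(
    "The cats are chasing the mice and the dogs chased the cats."
)
print(features["stem_counts"].most_common(3))
```

The crude stemmer maps "chasing" and "chased" to the same stem, which is the kind of normalisation a proper stemmer or lemmatiser would perform more reliably.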

Skeleton code is provided in the Blackboard Week 7 folder to assist your implementation. The original code (with detailed comments) should be attached at the end of the report as an appendix.

You will obtain your own version of the scikit-learn 20 newsgroups text dataset by entering your student number at the beginning of the skeleton code below. You need to modify the code accordingly to complete the two tasks above.

You will download your two newsgroups based on your student number.
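The Task 2 baseline, a tf-idf weighted unigram bag-of-words model feeding a logistic regression classifier, could be sketched with scikit-learn as below. This is an illustration under stated assumptions rather than the skeleton code: `build_baseline` is a hypothetical helper, and the tiny in-memory corpus with "space" and "hockey" labels is a made-up stand-in for the two newsgroups assigned by student number.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def build_baseline():
    """tf-idf weighted unigram bag-of-words followed by logistic regression."""
    return make_pipeline(
        TfidfVectorizer(ngram_range=(1, 1), lowercase=True),  # unigrams only
        LogisticRegression(max_iter=1000),
    )


# Toy stand-in corpus (an assumption for illustration). In the coursework the
# training texts would come from the two assigned newsgroups, for example via
# sklearn.datasets.fetch_20newsgroups(subset="train", categories=[...]).
train_texts = [
    "the rocket reached orbit after launch",
    "nasa plans a new moon mission",
    "the shuttle crew docked with the station",
    "the goalie stopped every shot in the game",
    "the team won the hockey playoff",
    "a late goal decided the hockey match",
]
train_labels = ["space", "space", "space", "hockey", "hockey", "hockey"]

model = build_baseline()
model.fit(train_texts, train_labels)

# Stand-in for "your own two-class data set".
own_texts = [
    "the rocket launch carried the crew to orbit",
    "the hockey team scored a late goal",
]
predictions = model.predict(own_texts)
print(list(zip(own_texts, predictions)))
```

Wrapping the vectoriser and classifier in one pipeline means the tf-idf vocabulary and weights learned from the training newsgroups are reused unchanged when predicting on new texts.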


