INFS7410 ASSIGNMENT 4
Semester 2/2018

Marks: 20 marks (20%)
Assessment Date: Tutorial Session on 16 October 2018 (No later than 16 October)
Submission Due Date: 11:59 PM, 19 October 2018 (No late submission is allowed)
What to Submit: Zipped source code with detailed comments and reports
Where to Submit: Electronic submission via Blackboard

The goal of this project is to build a simple but practical search engine using
Nutch, Solr and Lucene. You must work on this project individually. The
standard academic honesty rules apply.
Task 1 – Crawl websites using Nutch: Crawl any website you are interested in.
Crawl at least 200 webpages and store them in a database such as MongoDB.
Write a report describing the crawling process and results. (2 marks)
Task 2 – Use Lucene and Solr to build a search engine: Create an index for
the crawled webpages using Solr. Based on the index, write code with the
Lucene library to process a given query and return a ranked list of relevant
pages. The marks for this task are based on the following three parts.
Part 1: Retrieval Model and Ranking Algorithm
(1) If you implement this task using the default retrieval model and ranking
algorithm in Lucene, you will get 5 marks for this part.
(2) If you implement this task using the advanced retrieval models and ranking
algorithms provided by Lucene, which means you need to investigate and
evaluate a range of the methods Lucene provides and compare their
performance, you will get 7 marks for this part.
(3) If you modify the source code of Lucene (especially the ranking algorithm
or retrieval model) to implement some of your own ideas, and achieve better
search results, you will get 9 marks for this part.
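To make the idea of comparing retrieval models concrete, the sketch below scores a toy corpus with two formulas in plain Java: a BM25 scorer and a simplified TF-IDF variant. It is an illustration only and does not use the Lucene API (Lucene provides its own implementations, such as BM25Similarity and ClassicSimilarity). The class name RankingSketch, the toy documents, and the parameter values k1 = 1.2 and b = 0.75 are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RankingSketch {
    // BM25 parameters; these values are commonly used defaults (assumed here).
    static final double K1 = 1.2;
    static final double B = 0.75;

    // Number of times the term occurs in one tokenized document.
    static int termFreq(List<String> doc, String term) {
        int n = 0;
        for (String t : doc) if (t.equals(term)) n++;
        return n;
    }

    // Number of documents in the corpus that contain the term.
    static int docFreq(List<List<String>> corpus, String term) {
        int n = 0;
        for (List<String> d : corpus) if (d.contains(term)) n++;
        return n;
    }

    // BM25 score of one document for a query.
    static double bm25(List<String> query, List<String> doc, List<List<String>> corpus) {
        double avgLen = corpus.stream().mapToInt(List::size).average().orElse(1.0);
        double score = 0.0;
        for (String term : query) {
            int tf = termFreq(doc, term);
            if (tf == 0) continue;
            int df = docFreq(corpus, term);
            double idf = Math.log(1.0 + (corpus.size() - df + 0.5) / (df + 0.5));
            double norm = tf + K1 * (1.0 - B + B * doc.size() / avgLen);
            score += idf * tf * (K1 + 1.0) / norm;
        }
        return score;
    }

    // A simplified TF-IDF variant with square-root length normalization.
    static double tfIdf(List<String> query, List<String> doc, List<List<String>> corpus) {
        double score = 0.0;
        for (String term : query) {
            int tf = termFreq(doc, term);
            if (tf == 0) continue;
            int df = docFreq(corpus, term);
            double idf = 1.0 + Math.log((corpus.size() + 1.0) / (df + 1.0));
            score += Math.sqrt(tf) * idf * idf / Math.sqrt(doc.size());
        }
        return score;
    }

    // Document indices ordered by descending BM25 score.
    static List<Integer> rank(List<String> query, List<List<String>> corpus) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < corpus.size(); i++) order.add(i);
        order.sort((x, y) -> Double.compare(
                bm25(query, corpus.get(y), corpus),
                bm25(query, corpus.get(x), corpus)));
        return order;
    }

    public static void main(String[] args) {
        List<List<String>> corpus = Arrays.asList(
                Arrays.asList("lucene", "search", "engine"),
                Arrays.asList("solr", "search", "server", "search"),
                Arrays.asList("web", "crawler", "nutch"));
        List<String> query = Arrays.asList("search");
        for (int i = 0; i < corpus.size(); i++) {
            System.out.printf("doc %d: bm25=%.4f tfidf=%.4f%n", i,
                    bm25(query, corpus.get(i), corpus),
                    tfIdf(query, corpus.get(i), corpus));
        }
        System.out.println("BM25 ranking: " + rank(query, corpus));
    }
}
```

Printing both scores side by side for the same query is one simple way to see where two models disagree before running a fuller evaluation on the crawled pages.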
Part 2: User Interface
(1) If your developed search engine works in the command line interface, you will
get 2 marks for this part.
(2) If you develop a graphical user interface for this search engine, you will
get 4 marks for this part.
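A command line interface can be as small as a read-query, print-results loop. The sketch below shows one possible shape in plain Java. The class name CliSketch, the ":q" quit command, and the substring-matching search method are hypothetical stand-ins; in the real project the search call would go to the Lucene or Solr index instead.

```java
import java.io.PrintWriter;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

public class CliSketch {
    // Hypothetical stand-in for the real index lookup: returns the URLs of
    // pages whose text contains the query (case-insensitive substring match).
    static List<String> search(String query, Map<String, String> pages) {
        List<String> hits = new ArrayList<>();
        String q = query.toLowerCase();
        for (Map.Entry<String, String> e : pages.entrySet()) {
            if (e.getValue().toLowerCase().contains(q)) hits.add(e.getKey());
        }
        return hits;
    }

    // Reads one query per line until end of input or the ":q" command,
    // printing a numbered result list for each query.
    static void run(Scanner in, PrintWriter out, Map<String, String> pages) {
        while (true) {
            out.print("query> ");
            out.flush();
            if (!in.hasNextLine()) break;
            String line = in.nextLine().trim();
            if (line.equals(":q")) break;
            if (line.isEmpty()) continue;
            List<String> hits = search(line, pages);
            if (hits.isEmpty()) out.println("No results.");
            for (int i = 0; i < hits.size(); i++) {
                out.println((i + 1) + ". " + hits.get(i));
            }
        }
        out.flush();
    }

    public static void main(String[] args) {
        // Toy pages with placeholder URLs, used only for this illustration.
        Map<String, String> pages = new LinkedHashMap<>();
        pages.put("http://a.example/", "Apache Solr guide");
        pages.put("http://b.example/", "Nutch crawling tutorial");
        run(new Scanner(System.in), new PrintWriter(System.out), pages);
    }
}
```

Passing the input and output streams into run keeps the loop testable without a terminal, and the same search method could later be called from a graphical front end.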
Part 3: Advanced Functions
If you implement some extra interesting and useful functions such as keyword
highlighting and query correction or suggestion, you will get 2 marks for this part.
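As a rough illustration of the two extra functions named above, the sketch below wraps keyword matches in bold tags and suggests the closest vocabulary word by edit distance. It is plain Java and not the Lucene API (Lucene ships its own highlighting and spell-checking modules). The class name ExtrasSketch, the `<b>` markup, and the small vocabulary are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ExtrasSketch {
    // Wraps every case-insensitive occurrence of the keyword in <b>...</b>.
    static String highlight(String text, String keyword) {
        Pattern p = Pattern.compile(Pattern.quote(keyword), Pattern.CASE_INSENSITIVE);
        return p.matcher(text).replaceAll("<b>$0</b>");
    }

    // Levenshtein edit distance computed with two rolling rows.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            int[] cur = new int[b.length() + 1];
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            prev = cur;
        }
        return prev[b.length()];
    }

    // Returns the vocabulary word closest to the input, or null if the
    // vocabulary is empty. Ties go to the earlier word in the list.
    static String suggest(String word, List<String> vocabulary) {
        String best = null;
        int bestDist = Integer.MAX_VALUE;
        for (String candidate : vocabulary) {
            int d = editDistance(word.toLowerCase(), candidate.toLowerCase());
            if (d < bestDist) {
                bestDist = d;
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(highlight("Apache Lucene is a search library", "lucene"));
        // prints: Apache <b>Lucene</b> is a search library
        System.out.println(suggest("lucen", Arrays.asList("lucene", "solr", "nutch")));
        // prints: lucene
    }
}
```

In a real search engine the vocabulary would typically come from the indexed terms rather than a fixed list.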
Code: Your implementation should be coded in Java and should allow users to
enter the keywords of a search query and return a list of relevant documents.
Deliverables: Your submission includes the following components:
1) Program: (15 marks maximum)
• Source code and its brief description
• Search Engine-User Interface
2) Reports: (5 marks maximum)
• Describe how you crawl the website and store the crawled web pages in detail. (2 marks)
• Describe how you use Lucene and Solr to implement the search engine in detail. (3 marks)
