课程名称 (Course Name): Introduction to Web Search and Mining
课程代码 (Course Code): F033583
学分/学时 (Credits/Credit Hours): 2 / 32
开课时间 (Course Term): Spring
开课学院 (School Providing the Course): SEIEE
任课教师 (Teacher): ZHU Qili Kenny
课程讨论时数 (Course Discussion Hours): 0
课程实验数 (Lab Hours): 0
课程内容简介 (Course Introduction):
The World Wide Web (WWW) is the largest source of open-domain information today, and its popularization has revolutionized the way people search for and retrieve information. This course presents the fundamental theory and practice behind web search engines and introduces basic techniques for extracting information and mining knowledge from the web, with an emphasis on text documents. After completing this course, you should understand the basic internals of a web search engine and perhaps be able to build a small search engine of your own. You should also gain enough hands-on experience to write a crawler that extracts data from the web and to perform various analyses on the acquired data.
教学大纲 (Course Teaching Outline):
1. Information retrieval models (Boolean, vector space, language models)
2. Indexing and index compression
3. Link analysis (PageRank, HITS)
4. Semantic search
5. Web crawling
6. Recommender systems
课程进度计划 (Course Schedule):
TBD
课程考核要求 (Course Assessment Requirements):
In-class quizzes 30%
Assignments 40%
Projects 30%
参考文献 (Course References):
1. Introduction to Information Retrieval, Jul 7, 2008, by Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze
2. Mining the Web: Discovering Knowledge from Hypertext Data, Oct 23, 2002, by Soumen Chakrabarti
3. Web Information Retrieval (Data-Centric Systems and Applications), Aug 30, 2013, by Stefano Ceri and Alessandro Bozzon
4. Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data (Data-Centric Systems and Applications), Aug 6, 2013, by Bing Liu
预修课程 (Prerequisite Courses):
Discrete Math, Probability and Statistics, Database Systems, Machine Learning or Data Mining