COURSE UNIT TITLE: INTRODUCTION TO TEXT AND WEB MINING

Description of Individual Course Units

Course Unit Code Course Unit Title Type Of Course D U L ECTS
ELECTIVE

Offered By

Computer Science

Level of Course Unit

First Cycle Programmes (Bachelor's Degree)

Course Coordinator

PROFESSOR DOCTOR EFENDI NASIBOĞLU

Offered to

Computer Science

Course Objective

This course, Introduction to Text and Web Mining, covers queries and documents, document preprocessing, word distributions, vectorization, automatic indexing and tagging, sentence matching, social network analysis, natural language processing, deep learning-based models, and large language models.

Learning Outcomes of the Course Unit

1   Have general information about text mining techniques,
2   Have general information about web mining techniques,
3   Be capable of analysing text-based documents,
4   Have general information about natural language processing techniques,
5   Have information about web search and indexing.

Mode of Delivery

Face-to-Face

Prerequisites and Co-requisites

None

Recommended Optional Programme Components

None

Course Contents

Week Subject Description
1 Introduction to text mining, Boolean retrieval
2 Dictionaries
3 Index construction, compression
4 Scoring, term weighting
5 Computing scores
6 Information retrieval
7 XML retrieval
8 Language models
9 Language models (cont.)
10 Text classification
11 Vector space classification
12 Support vector machines, Machine learning on documents
13 Flat and hierarchical clustering
14 Web search basics, web crawling and indexes, link analysis

Recommended or Required Reading

Textbook(s):
Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze, An Introduction to Information Retrieval, Cambridge University Press, 2009.
Supplementary Book(s):
Song, M., Wu, Y.-F. B., Handbook of Research on Text and Web Mining Technologies, Volume I-II, 2007.
Jurafsky, D., Martin, J. H., An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, 3rd ed., Stanford University, 2022.

Planned Learning Activities and Teaching Methods

The course is taught through lectures, class presentations and discussions. In addition to the lectures, assigned groups prepare presentations and deliver them in a discussion session. In some weeks, the results of previously assigned homework are discussed.

Assessment Methods

SORTING NUMBER SHORT CODE LONG CODE FORMULA
1 MTE MIDTERM EXAM
2 ASG ASSIGNMENT
3 FIN FINAL EXAM
4 FCG FINAL COURSE GRADE MTE * 0.30 + ASG * 0.30 + FIN * 0.40
5 RST RESIT
6 FCGR FINAL COURSE GRADE (RESIT) MTE * 0.30 + ASG * 0.30 + RST * 0.40
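
The two formulas above can be illustrated with a minimal sketch. The function name, the example scores and the 0-100 scale are assumptions made for illustration; only the weights (0.30, 0.30, 0.40) come from the table.

```python
def final_course_grade(mte: float, asg: float, exam: float) -> float:
    """Weighted grade: 30% midterm, 30% assignment, 40% final or resit exam."""
    return mte * 0.30 + asg * 0.30 + exam * 0.40


# Hypothetical scores on an assumed 0-100 scale.
mte, asg, fin, rst = 70, 80, 60, 75

fcg = final_course_grade(mte, asg, fin)   # FCG: the final exam score is used
fcgr = final_course_grade(mte, asg, rst)  # FCGR: the resit score replaces FIN

print(round(fcg, 2), round(fcgr, 2))
```

In the resit formula, only the exam term changes; the midterm and assignment scores carry over unchanged.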


Further Notes About Assessment Methods

None

Assessment Criteria

Assignment: 30%
Midterm exam: 30%
Final exam: 40%

Language of Instruction

Turkish

Course Policies and Rules

Students are expected to arrive at class on time. Attendance at 70% of the classes is mandatory.

Contact Details for the Lecturer(s)

efendi.nasibov@deu.edu.tr

Office Hours

Will be announced.

Work Placement(s)

None

Workload Calculation

Activities Number Time (hours) Total Work Load (hours)
Lectures 14 3 42
Preparations before/after weekly lectures 14 1 14
Preparing assignments 1 15 15
Preparation for final exam 1 30 30
Preparation for midterm exam 1 15 15
Final 1 2 2
Midterm 1 2 2
TOTAL WORKLOAD (hours) 120
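
As a quick check of the table above, each activity's load is its number multiplied by its time in hours, and the total is the sum over all rows. The sketch below reproduces that arithmetic; the variable names are illustrative, and the figures are copied from the table.

```python
# (activity, number, hours per occurrence), copied from the workload table
workload = [
    ("Lectures", 14, 3),
    ("Preparations before/after weekly lectures", 14, 1),
    ("Preparing assignments", 1, 15),
    ("Preparation for final exam", 1, 30),
    ("Preparation for midterm exam", 1, 15),
    ("Final", 1, 2),
    ("Midterm", 1, 2),
]

# Total work load = sum of number * time over all activities
total_hours = sum(number * hours for _, number, hours in workload)
print(total_hours)
```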

Contribution of Learning Outcomes to Programme Outcomes

PO/LO PO.1 PO.2 PO.3 PO.4 PO.5 PO.6 PO.7 PO.8 PO.9 PO.10 PO.11 PO.12 PO.13
LO.1 3
LO.2 3
LO.3 3 4 4 4
LO.4 3 4 4 4
LO.5 4 3 5 4 4