Courses of the second year of the Data Science curriculum

Exit year (M2)

Semester 1

FST - Machine Learning II (code S9IOFSTU, 5 ects)

In this course, students learn how to design machine learning pipelines for classification tasks (binary or multiclass). We first cover the theoretical issues related to this task (definition, learning step, evaluation step, overfitting), and then explain how to use different algorithms suited to it (logistic regression, Bayes classifier, nearest neighbour, decision trees and random forests, SVM, neural networks).

Students are evaluated through two written exams and homework assignments.

FSY - Symbolic Data Mining (code S9IOFSYU, 5 ects)

This course presents the symbolic data mining process: data preparation, application of different data mining algorithms, and interpretation of the results.

The techniques presented in this course are: frequent itemset mining, association rules, pattern mining in transactions or sequences, pattern selection, and multi-relational data mining.

Students are evaluated through written exams and a final individual project.

EDD - Data Warehouse (code S9IENDDU, 4 ects)

This course covers the main steps in designing a data warehouse for business intelligence purposes: study and design of dimensional models, R-OLAP and Mondrian, relational database design with the star schema, MDX querying, and data integration with ETL.

Students are evaluated through one written exam and a final individual project.

CLD - Cloud and Big Data Management (code S9IPCLDU, 3 ects)

The aim of this course is to introduce techniques for big data management. Different models for data storage and transformation are presented (MapReduce, Hadoop, Spark), together with the criteria needed to compare these technical solutions.

Techniques for big data distribution are also presented (for example, for streaming).

Students are evaluated through one written exam.

INV - Indexing and Visualization (code S9IOINVU, 3 ects)

This course gives students techniques for dealing with huge quantities of multimedia data. Indexing techniques for image and text data are presented. At the end of this course, students should be able to design search engines for text and image data, and to evaluate the performance of such engines.

Data visualization techniques are also presented in the second part of this course.

IES - I & E Study (code, 6 ects)

SSC - Summer School, Case Study (code S9IDECOU, 4 ects)

Semester 2

STA - Master Thesis (code S0IMATGU, 30 ects)

This course is the final internship, which takes place from mid-March to the end of August. At the end of the internship, students must submit a written report explaining their work and give a 30-minute oral presentation.