Machine Learning for Data Science
- Course type
- STATISTICS
- Contact person
- François PORTIER
- Unit
- UE-MSD01: Machine Learning
- Number of ECTS
- 3.5
- Course code
- MSD 01-1
- Distribution of courses
- Lecture hours: 30
- Language of teaching
- English
Objectives
Upon completing this course, students should be able to:
– select the appropriate methods;
– implement these statistical methods;
– compare leading procedures based on statistical arguments;
– assess the prediction performance of a learning algorithm;
– apply these key insights in class activities using statistical software.
Course outline
This course focuses on supervised learning methods for regression and classification. Starting from elementary algorithms such as ordinary least squares, we will cover regularization methods (crucial in large-scale learning) and nonparametric decision rules such as support vector machines, the nearest neighbor algorithm, and CART.
Finally, bagging and boosting techniques will be discussed through the random forest and XGBoost algorithms.
We shall focus on methodological and algorithmic aspects, while trying to give an idea of the underlying theoretical foundations. Practical sessions will give students the opportunity to apply the methods to real data sets using either R or Python. The course will alternate between lectures and practical lab sessions.
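To give a flavor of the practical sessions, here is a minimal Python sketch of one method named in the outline, the nearest neighbor algorithm, with its prediction accuracy estimated on held-out data. It is an illustration only, not course material: the synthetic two-class data, the helper names `knn_predict` and `make_toy_data`, the choice k = 5, and the 70/30 split are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query point x.
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def make_toy_data(n_per_class, rng):
    """Hypothetical data: two Gaussian clouds centred at (0, 0) and (3, 3)."""
    X, y = [], []
    for label, centre in enumerate([(0.0, 0.0), (3.0, 3.0)]):
        for _ in range(n_per_class):
            X.append((rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0)))
            y.append(label)
    return X, y


rng = random.Random(0)  # fixed seed so the run is reproducible
X, y = make_toy_data(100, rng)

# Shuffle, then hold out 30% of the points to estimate prediction accuracy.
indices = list(range(len(X)))
rng.shuffle(indices)
split = int(0.7 * len(indices))
train_idx, test_idx = indices[:split], indices[split:]
train_X = [X[i] for i in train_idx]
train_y = [y[i] for i in train_idx]

# Accuracy is computed only on points the classifier has never seen.
correct = sum(knn_predict(train_X, train_y, X[i], k=5) == y[i] for i in test_idx)
accuracy = correct / len(test_idx)
print(f"hold-out accuracy: {accuracy:.2f}")
```

Scoring the classifier on held-out points rather than on the training set is the simplest way to avoid an overly optimistic estimate of prediction performance; cross-validation follows the same idea with several splits.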
Prerequisites
Linear algebra, probability, optimization