Smart Data Prerequisites

Below you will find a list of relevant prerequisite knowledge that an ideal candidate for the Master for Smart Data Science program should have. Exhaustive knowledge of all the notions is not required. Candidates who master a majority of the items mentioned are encouraged to apply.

Prerequisites in Analysis

Sets: Sets, subset and inclusion. Union and intersection of sets, difference of two sets, complement of a subset. Countable sets. Mapping between two sets. Direct and inverse image of a set by a mapping. Inverse mapping. Injective, surjective and bijective mappings. Composition of two mappings.

Sequences and series of real numbers: Limit of real valued sequences, convergence and divergence. Equivalent sequences. Monotone sequences. Partial sums and convergence of series. Absolute convergence of a series. Geometric and exponential series.

Functions defined on the real line: Definition of a function, graph of a function. Limit of a function at a given point and continuity. Derivative of a function, equation of the tangent line and usual differentiation rules (derivative of a product, a quotient or a composition of functions, derivative of the inverse function). Odd, even and periodic functions. Derivative and variations of a function. Usual functions: exponential, logarithm, power, sine and cosine. Convex functions. Antiderivative (primitive) of a function. Riemann sums, Riemann integral and area under a curve. Basic properties of the Riemann integral. Integration by parts and change of variables formula.

Functions of several variables: Partial derivatives. Gradient. Hessian matrix. Taylor expansions. Optimization without constraints: first-order and second-order conditions. The case of convex functions. Integration of functions of several variables. Fubini's theorem.

Norm, scalar product and Euclidean space: Definition of a norm, of a distance. Scalar product and associated norm. Cauchy-Schwarz inequality. Orthogonal vectors. Euclidean spaces. Orthonormal basis. Gram-Schmidt process. Distance between a point and a vector subspace, orthogonal projection and expression in an orthonormal basis. Orthogonal complement of a vector subspace. Affine hyperplanes in Euclidean spaces.
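
As an indication of the expected level, a candidate should be comfortable with a computation such as the one sketched below: building an orthonormal basis with the Gram-Schmidt process, then using it to project a point onto a subspace and measure its distance to that subspace. The helper names (dot, gram_schmidt, project) and the example vectors are illustrative assumptions, not part of the program material.

```python
import math

def dot(u, v):
    """Standard scalar product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal family spanning the same subspace as `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the components of w along the vectors already in the basis.
        for e in basis:
            coeff = dot(w, e)
            w = [wi - coeff * ei for wi, ei in zip(w, e)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip vectors that are (numerically) dependent
            basis.append([wi / norm for wi in w])
    return basis

def project(x, basis):
    """Orthogonal projection of x onto span(basis); basis must be orthonormal."""
    p = [0.0] * len(x)
    for e in basis:
        c = dot(x, e)
        p = [pi + c * ei for pi, ei in zip(p, e)]
    return p

# Illustrative data: a plane in R^3 and a point outside it.
basis = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]])
x = [1.0, 2.0, 3.0]
p = project(x, basis)
residual = [xi - pi for xi, pi in zip(x, p)]
distance = math.sqrt(dot(residual, residual))
```

The residual x - p is orthogonal to the subspace, so its norm is the distance between the point and the subspace.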

Prerequisites in Algebra

Complex numbers: Real and imaginary part of a complex number. Modulus and trigonometric form. Elementary algebraic operations. Exponential of a complex number. Quadratic equations.

Polynomials: Roots of a polynomial, divisibility. Polynomial functions. Degree of a polynomial and roots multiplicity.

Vector spaces: Notion of vector space and vector subspace. Vector subspace generated by a family of vectors. The vector space ℝⁿ. Linearly independent vectors. Basis. Coordinates of a vector in a basis. Sum of vector subspaces. Complements of a vector subspace. Vector spaces of finite dimension. Dimension of a vector subspace, rank of a system of vectors. Linear maps. Kernel and image of a linear map. Rank-nullity theorem. Linear form and hyperplane. Affine subspaces of a vector space.

Matrices: Sum and product of matrices. Transpose of a matrix. Inverse matrix. Rank of a matrix. Trace. Link between matrices and linear maps. Link between matrices and linear systems. Determinant of matrices and properties.

Spectral decomposition of a square matrix: Eigenvalues, eigenvectors. Basis of eigenvectors. Diagonalization of a matrix. Spectral decomposition of symmetric matrices, positive semi-definite matrices and orthogonal projection matrices.

Prerequisites in Probability & Statistics

Combinatorics: Cardinality of a set. Factorial of an integer, binomial coefficients. Binomial expansion.

Probability space: Random experiments, events and probability measures. Basic properties of a probability measure. Conditional probabilities. Bayes' formula. Law of total probability. Independent events.
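
For illustration, the kind of exercise these notions cover can be sketched as follows: combining the law of total probability with Bayes' formula to update the probability of an event after an observation. The scenario (a screening test) and all numerical values are hypothetical and chosen only to keep the arithmetic simple.

```python
from fractions import Fraction

# Hypothetical values for a screening test.
prior = Fraction(1, 100)            # P(D): probability of the condition
sensitivity = Fraction(95, 100)     # P(+ | D)
false_positive = Fraction(5, 100)   # P(+ | not D)

# Law of total probability: P(+) = P(+ | D) P(D) + P(+ | not D) P(not D)
p_positive = sensitivity * prior + false_positive * (1 - prior)

# Bayes' formula: P(D | +) = P(+ | D) P(D) / P(+)
posterior = sensitivity * prior / p_positive

print(posterior, float(posterior))
```

With these values the posterior is 19/118, roughly 0.16: even after a positive result the condition remains unlikely, because the prior is small.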

Random variables: Discrete or continuous random variables. Usual probability distributions (Bernoulli, Poisson, uniform, exponential, normal). Independence of random variables. Expectation of a random variable. Variance and covariance. Conditional distribution and conditional expectation.

Convergence and limit theorems: convergence in probability, convergence in law, law of large numbers, central limit theorem.

Exploratory Statistics and Data analysis: Mean, median, mode, range, standard deviation, interquartile range, quartiles and percentiles; interpretation of data in tables and graphs (line graphs, bar graphs, circle graphs, boxplots, scatterplots and frequency distributions); principal components analysis.

Statistical inference: statistical model, likelihood, method of moments, point estimation, bias, mean square error, confidence intervals, tests, p-value, chi-squared tests, Student’s t-test, Kolmogorov-Smirnov test.

Regression analysis and Time series: linear regression, analysis of variance, logistic regression, regression trees, autoregressive linear process, moving averages.

Prerequisites in Programming, Algorithms & Data Structures

Abstract data types: sets, arrays, trees, dictionaries.

Classic algorithmic patterns: greedy approach, divide and conquer, dynamic programming.
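
The three patterns can be sketched on small textbook problems, as below. The problems chosen (coin change and sorting) and the function names are assumptions made for illustration; they are not prescribed by the program.

```python
def greedy_change(amount, coins):
    """Greedy approach: always take the largest coin that fits.
    Fast, but not guaranteed to be optimal for every coin system."""
    count = 0
    for c in sorted(coins, reverse=True):
        count += amount // c
        amount %= c
    return count if amount == 0 else None

def dp_change(amount, coins):
    """Dynamic programming: best[a] is the minimum number of coins
    needed to make amount a, built up from smaller amounts."""
    INF = float("inf")
    best = [0] + [INF] * amount
    for a in range(1, amount + 1):
        for c in coins:
            if c <= a and best[a - c] + 1 < best[a]:
                best[a] = best[a - c] + 1
    return best[amount] if best[amount] != INF else None

def merge_sort(items):
    """Divide and conquer: sort each half recursively, then merge."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

# With coins {1, 3, 4} and amount 6, greedy uses 4+1+1 while DP finds 3+3.
print(greedy_change(6, [1, 3, 4]), dp_change(6, [1, 3, 4]))
print(merge_sort([5, 2, 9, 1, 5]))
```

The coin example shows why the choice of pattern matters: the greedy answer is valid but uses one coin more than the dynamic programming answer.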

Programming: Writing and compiling a program, debugging programs, input/output using files, use of variables, control structures and loops, exception handling, writing functions, methods, procedures.

Languages and statistical software: intermediate level in Python or R.

Relational database management system: SQL language, variable type, create a table, update a table, SELECT/DELETE/INSERT INTO queries, SQL scripts.
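
As a rough indication of the expected SQL level, the sketch below creates a table with typed columns and runs INSERT INTO, UPDATE, DELETE and SELECT queries. It uses Python's built-in sqlite3 module with an in-memory database; the table name, columns and rows are hypothetical.

```python
import sqlite3

# In-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create a table with typed columns.
cur.execute(
    "CREATE TABLE students ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, grade REAL)"
)

# INSERT INTO: add rows using parameterized queries.
cur.executemany(
    "INSERT INTO students (name, grade) VALUES (?, ?)",
    [("Ada", 17.5), ("Alan", 12.0), ("Grace", 15.0)],
)

# UPDATE: modify existing rows.
cur.execute("UPDATE students SET grade = grade + 1 WHERE name = ?", ("Alan",))

# DELETE: remove rows matching a condition.
cur.execute("DELETE FROM students WHERE grade < ?", (14.0,))

# SELECT: read the remaining rows, best grade first.
rows = cur.execute(
    "SELECT name, grade FROM students ORDER BY grade DESC"
).fetchall()

conn.commit()
conn.close()
print(rows)
```

The same statements could be saved in a .sql file and run as a script by any relational database system, up to small dialect differences.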