A thorough introduction to the basic ideas in supervised statistical learning, with a focus on regression and a brief introduction to classification. Methods covered will include multiple linear regression and its extensions; k-NN regression; variable selection and regularization via AIC, BIC, ridge and lasso penalties; non-parametric methods including basis expansions, local regression and splines; generalized additive models; tree-based methods; and bagging, boosting and random forests. Content will be discussed from a statistical angle, with emphasis on uncertainty quantification and the impact of randomness in the data on the outcome of any learning procedure. A detailed discussion of the main statistical ideas behind cross-validation, sample splitting and re-sampling methods will be given. R will be used as the software throughout the course; a brief introduction to it will be given at the beginning.
Priority is given to students enrolled in Statistics Specialist or Major programs.