The goal of statistical machine learning and data mining is not to test a specific hypothesis or construct a confidence interval; instead, the goal is to find and understand an unknown systematic component within the realm of noisy, complex data.
Lectures: MW 12:20 – 1:35pm, Hanes 130
Instructor: Yufeng Liu
Office Hours: Mondays 1:35-2:30pm (Hanes 354); Fridays 11:30am-12:30pm (GSB 4250)
TAs: Weibin Mo (Ph.D. student in Statistics) Email: harrymok@email.unc.edu Office: Hanes Hall B-40 Office Hours: Tuesdays and Thursdays 3:30-4:30pm
Jianyu Liu (Ph.D. student in Statistics) Email: liuoo@live.unc.edu Office: Hanes Hall B-1
Zhengling Qi (Ph.D. student in Statistics) Email: qizl1027@live.unc.edu Office: Hanes Hall B-26
Textbook: The Elements of Statistical Learning: Data Mining, Inference, and Prediction, by Hastie, Tibshirani, and Friedman (2009). The electronic version can be downloaded for free.
Additional References:
- The Nature of Statistical Learning Theory, by Vapnik (1999).
- Statistics for High-Dimensional Data, by Bühlmann and van de Geer (2011).
- An Introduction to Support Vector Machines and Other Kernel-Based Learning Methods, by Cristianini and Shawe-Taylor (2000).
- Learning with Kernels, by Schölkopf and Smola (2002).
- Convex Optimization, by Boyd and Vandenberghe (2004).
- An Introduction to Statistical Learning, by James, Witten, Hastie, and Tibshirani (2013).
- Linear Models with R, 2nd edition, by Faraway (2014).
Statistical Software:
We will use R for this course. R is free, so you can use it anytime and anywhere; it can be downloaded from the R website. RStudio, a recommended interface for R, is also free and runs on Windows, Mac, and Linux. We will also use R Markdown, which can produce high-quality documents and reports.
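As a quick check that an installation works, here is a minimal sketch in R; the built-in `cars` dataset and the packages named in the comment are illustrative, not course requirements:

```r
# Fit a simple linear regression on a built-in dataset
# to verify that R is installed and working.
fit <- lm(dist ~ speed, data = cars)  # stopping distance vs. speed
summary(fit)$coefficients             # estimated intercept and slope

# Methods covered later in the course (e.g., penalized regression)
# live in add-on packages, installed once with, for example:
# install.packages(c("glmnet", "rmarkdown"))
```

Running the same code inside an R Markdown document, then knitting it, is a good way to test the report-generation workflow as well.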
Reference: W. N. Venables, D. M. Smith, and the R Core Team. 2017. An Introduction to R: Notes on R: A Programming Environment for Data Analysis and Graphics (version 3.4.3).
Evaluation & Grading: Homework assignments throughout the semester will cover both the theoretical and computational aspects of the course.
The course grade will be based on class participation, homework grades, the project (presentation & final report), and an exam.
The grade distribution is as follows:
• Presentation 20%
• Final report 20%
Homework Policy: Homework assignments will be posted on the course web page. Each homework assignment will be graded: late/missed homework assignments without permission will receive a grade of zero. Assignments will be collected at the beginning of class on the day they are due, so please be prepared to turn in your homework at that time.
Honor Code: Students are expected to adhere to the UNC honor code at all times. Violations of the honor code will be prosecuted.
Announcements, Assignments & Lectures:
Lectures | Date | Tentative Plan | Remark |
1 | Jan 9 W | Introduction & Overview of Supervised Learning Reading: Ch1&2 | Notes 1 |
2 | Jan 14 M | Overview of Supervised Learning Reading: Ch1&2 | |
3 | Jan 16 W | Linear Regression and Extensions Reading: Ch3 | Notes 2 Homework 2 |
  | Jan 21 M | No class | |
4 | Jan 23 W | Linear Regression and Extensions Reading: Ch3 | Homework 1 due |
5 | Jan 28 M | Linear Regression and Extensions Reading: Ch3 | |
6 | Jan 30 W | Linear Classification Methods Reading: Ch4 | Notes 3 |
7 | Feb 4 M | Linear Classification Methods Reading: Ch4 | Homework 3 |
8 | Feb 6 W | Splines Reading: Ch5&6 | Notes 4 Homework 2 due |
9 | Feb 11 M | Splines Reading: Ch5&6 | |
10 | Feb 13 W | Smoothing&Kernel Methods Reading: Ch5&6 | Notes 5 |
11 | Feb 18 M | Smoothing&Kernel Methods& Wavelet Reading: Ch5&6 | Homework 3 due Homework 4 |
12 | Feb 20 W | Density estimation & Additive Models Reading: Ch5&6 | |
13 | Feb 25 M | Cross Validation & Beyond Reading: Ch7&8 | Notes 6 |
14 | Feb 27 W | Support Vector Machines Reading: Ch4.5, 5.8, 12 | Notes 7 Homework 4 due |
15 | Mar 4 M | Support Vector Machines Reading: Ch4.5, 5.8, 12 | |
16 | Mar 6 W | Tree-based Methods and Beyond Reading: Ch8, 9, 10 | Notes 8 |
  | Mar 11–15 | UNC spring break | |
17 | Mar 18 M | Tree-based Methods and Beyond Reading: Ch8, 9, 10 | Homework 5 due Homework 6 |
18 | Mar 20 W | Unsupervised Learning: clustering Reading: Ch14 | Notes 9 |
19 | Mar 25 M | Unsupervised Learning: dimension reduction Reading: Ch14 &17 | |
20 | Mar 27 W | Unsupervised Learning: Graphical models | Notes 10 Homework 6 due |
21 | Apr 1 M | In-class Exam | |
22 | Apr 3 W | Guest Lecture on Neural Networks and Deep Learning, by Dr. Tao Wang, Senior Manager, AI and Machine Learning, R&D, SAS Institute | |
23 | Apr 8 M | In-class presentations | Xiaoyang Chen; Miheer Dewaskar; Zhenghan Fang; Gang Li |
24 | Apr 10 W | In-class presentations | Tianshe He; Kentaro Hoffman; Dayton Steele; Benjamin Leinwand; Bohan Li |
25 | Apr 15 M | In-class presentations | Daiqi Gao; Zichao Li; Deyi Liu; Wei Liu; Yiyun Luo |
26 | Apr 17 W | In-class presentations | Carson Mosso; Robert Niewoehner; Kevin O’connor; Yifeng Shi; Nhan Pham |
27 | Apr 22 M | In-class presentations | Jack Prothero; Yukai Huang; Aleksandr Touzov; Haodong Wang |
28 | Apr 24 W | In-class presentations Final Report Due (both hardcopy and email) | Mingyi Wang; Xi Yang; Ai Ye; Jonghwan Yoo; Hang Yu |
- Overview
Data mining and statistics: what is the connection?, by Friedman (1997)
- Linear Regression & Extensions
Ridge regression: biased estimation for nonorthogonal problems, by Hoerl and Kennard (1970)
Ridge regression: applications to nonorthogonal problems, by Hoerl and Kennard (1970)
Regression shrinkage and selection via the lasso, by Tibshirani (1996)
Better subset regression using the nonnegative garrote, by Breiman (1995)
Continuum Regression: Cross-Validated Sequentially Constructed Prediction Embracing Ordinary Least Squares, Partial Least Squares and Principal Components Regression, by Stone and Brooks (1990)
- Classification
Flexible Discriminant Analysis by Optimal Scoring, by Hastie, Tibshirani, and Buja (1994)
- Splines and Smoothing
Spline Models for Observational Data, by Wahba (1990)
Nonparametric Regression and Generalized Linear Models: A Roughness Penalty Approach, by Green and Silverman (1994), Chapman and Hall, London
Ideal spatial adaptation by wavelet shrinkage, by Donoho and Johnstone (1994), Biometrika 81: 425–455
Multivariate Density Estimation: Theory, Practice, and Visualization, by Scott (1992), Wiley, New York
- Cross Validation
Linear model selection by cross-validation, by Shao (1993)
Estimation of Prediction Error, by Efron (2004)
Improvements on Cross-Validation: The .632+ Bootstrap Method, by Efron and Tibshirani (1997)
- Support Vector Machines
Support Vector Machines and the Bayes Rule in Classification, by Lin (2002)
Support Vector Machines for Classification in Nonstandard Situations, by Lin et al. (2002)
Support Vector Machines, Reproducing Kernel Hilbert Spaces and the Randomized GACV, by Wahba (1998)
- Tree-based Methods
Classification and Regression Trees, by Breiman et al. (1984)
Multivariate Adaptive Regression Splines (MARS), by Friedman (1991)
Experiments with a new boosting algorithm, by Freund and Schapire (1996)
Additive logistic regression: a statistical view of boosting, by Friedman, Hastie, and Tibshirani (2000)
- Unsupervised Learning Methods
Clustering Algorithms, by Hartigan (1975)
A Brief Introduction to Independent Component Analysis, by Stone (2005)
Introductory chapter of Independent Component Analysis, by Hyvärinen, Karhunen, and Oja (2001)
Multidimensional Scaling, by Kruskal and Wish (1978), Sage
http://www.personality-project.org/r/mds.html