These are all resources you may find helpful in your first few years (e.g. to supplement the core courses and/or starting research). Unless noted otherwise, they are all freely available online.

Probability, Statistics, Machine Learning and Optimization

Probability Theory and Examples by Rick Durrett (http://services.math.duke.edu/~rtd/PTE/PTE4_1.pdf)
– One of the preferred grad level probability textbooks.

Jeff Miller has an excellent series of YouTube videos
– Probability primer (https://www.youtube.com/playlist?list=PL17567A1A3F5DB5E4). This course covers some topics in probability (634-635) and stat theory (654-655).
– Machine learning (https://www.youtube.com/playlist?list=PLD0F06AA0D2E8FFBA). This course covers many topics in stat theory (654-655) and applied stats (664-665). It also covers topics in machine learning and Bayesian stat courses.

Introduction to Statistical Learning (http://www-bcf.usc.edu/~gareth/ISL/ISLR%20Seventh%20Printing.pdf) and Elements of Statistical Learning (https://web.stanford.edu/~hastie/Papers/ESLII.pdf)
– These are great places to turn for your first (and second) foray into applied statistics and machine learning.

Michael Jordan’s suggested reading list for statistics PhD: https://honglangwang.wordpress.com/2014/12/30/machine-learning-books-suggested-by-michael-i-jordan-from-berkeley/ (not free)

The deep learning book (http://www.deeplearningbook.org/)
– Introductory/intermediate level textbook from some of the masters.
– Also a good introduction to machine learning and optimization.

Convex Optimization by Boyd and Vandenberghe (https://web.stanford.edu/~boyd/cvxbook/bv_cvxbook.pdf)
– The standard introduction to optimization.
– Also see the course webpage (http://www.seas.ucla.edu/~vandenbe/ee236b/ee236b.html) and Stephen Boyd’s YouTube lectures (https://www.youtube.com/view_play_list?p=3940DD956CDF0622).

Optimization Methods for Large-Scale Machine Learning (https://arxiv.org/pdf/1606.04838.pdf)
– Overview of many of the modern optimization methods that statisticians/machine learning researchers should at least be aware of.

Computation

These are helpful resources for getting started in R/Python and for learning some more advanced topics.

Introductory R

R for Data Science by Hadley Wickham (http://r4ds.had.co.nz/)
– Fantastic, free, online textbook for introductory to intermediate R.

STOR 320: Intro to Data Science (https://idc9.github.io/stor390/)
– Undergrad course at UNC which introduces R and data science.

Introductory Python

Python Data Science Handbook by Jake Vanderplas (https://jakevdp.github.io/PythonDataScienceHandbook/)
– Introduction to doing statistics/machine learning in Python.

Computational Statistics in Python by Cliburn Chan (http://people.duke.edu/~ccc14/sta-663-2017/)
– Covers a huge number of topics in computational statistics from advanced python to MCMC to GPU computing.

Other Helpful Resources and More Advanced Topics

Computational Linear Algebra by fast.ai (https://github.com/fastai/numerical-linear-algebra)
– Covers things like PCA, robust PCA, non-negative matrix factorization, and large-scale linear regression, all in Python.

Lots of small coding examples in Python/R: https://chrisalbon.com/
