My Favorite Books & Online Courses for Statistical Learning, Data Visualization & Machine Learning

Get started today!

modololw
Better Programming

Photo by Ruthson Zimmerman on Unsplash

I really enjoy learning statistics, and I find that we make wiser decisions when we use data. Here are some books and online courses that have helped me a lot over the last several years.

Statistics and Probability

I took my statistics and probability courses at university, but if you’re looking for an online course, I believe that the MIT course Introduction to Probability and Statistics is an excellent choice.

This course introduces basic probability definitions and theorems, and it also covers core statistics topics: Bayesian inference, frequentist inference (null hypothesis significance testing, or NHST), confidence intervals, and regression.
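To give a small taste of two of those topics, here is a minimal sketch in plain Python. It is not taken from the course: the coin-flip data are made up, the function names are hypothetical, and I assume a normal-approximation (Wald) interval on the frequentist side and a uniform Beta(1, 1) prior on the Bayesian side.

```python
import math


def wald_interval(successes, trials, z=1.96):
    """Approximate 95% confidence interval for a proportion (frequentist view)."""
    p_hat = successes / trials
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials)
    return p_hat - half_width, p_hat + half_width


def beta_posterior(successes, trials, prior_a=1.0, prior_b=1.0):
    """Beta posterior parameters after binomial data (Bayesian view).

    The default Beta(1, 1) prior is an assumption made for illustration.
    """
    return prior_a + successes, prior_b + (trials - successes)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Made-up data: 60 heads in 100 coin flips.
low, high = wald_interval(60, 100)
a, b = beta_posterior(60, 100)

print(f"95% confidence interval: ({low:.3f}, {high:.3f})")
print(f"Posterior: Beta({a:.0f}, {b:.0f}), mean {beta_mean(a, b):.3f}")
```

Both approaches start from the same 60 heads in 100 flips. The first reports an interval around the observed proportion, while the second updates a prior into a full posterior distribution.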

For Bayesian inference, I also enjoy two Coursera courses from the University of California, Santa Cruz: Bayesian Statistics: From Concept to Data Analysis and Bayesian Statistics: Techniques and Models.

Want to know something about the history of statistics? The book The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century talks about the revolutionary ideas that changed our lives.

Statistical Learning, Machine Learning, and Deep Learning

For statistical learning, I highly recommend the book The Elements of Statistical Learning: Data Mining, Inference, and Prediction, from authors at Stanford University. If you want a conceptual framework for the field of statistical learning, this book will be very helpful.

For machine learning, I love the famous Machine Learning course by Andrew Ng on Coursera. This course is a very good introduction to machine learning algorithms (both supervised and unsupervised learning), and it also offers plenty of practical advice.

For more detail on machine learning in practice, I recommend the book Hands-On Machine Learning with Scikit-Learn and TensorFlow. If you’re ready to start your first machine learning project, this book will definitely help you a lot.

For deep learning, I would like to talk about Deep Learning by Ian Goodfellow. What I really like about this book is that it covers linear algebra, probability theory, and information theory to give a mathematical and conceptual explanation of deep learning algorithms.

Handbooks in Data Visualization, Data Science, and Python

Data visualization is fantastic, and it is also essential in data projects. For me, ggplot2 does it best. If you love ggplot2 like I do, I recommend checking out the book R Graphics Cookbook: Practical Recipes for Visualizing Data. Its content is so well organized that you can easily find out how to create your dream plots with ggplot2. It also covers plot design topics.

New to R coding? No problem: check out the book R Cookbook: Proven Recipes for Data Analysis, Statistics, and Graphics. It takes you from installing R, data structures, and data transformations through general statistics and, last but not least, some useful tricks!

For coding in Python for machine learning and deep learning, I frequently use the book Python Data Science Handbook: Tools and Techniques for Developers. It’s as well organized as the R cookbooks mentioned above, and it will teach you how to use the well-known, fundamental tools you need for data science in Python: IPython, NumPy, Pandas, Matplotlib, Scikit-Learn, and other related tools.

Although writing the most efficient Python code is not the primary goal for data scientists, I still believe a nice, clean program always helps. If you have any questions about Python, just grab the book Automate the Boring Stuff with Python: Practical Programming for Total Beginners. I believe you can find joy in building programs step by step.

I work with data in the insurance industry. I like books and music. I want to share stuff. ;)