Machine learning has bestowed upon humanity the power to automate tasks and to improve things we already do by learning from a continuous stream of data about those same tasks. It has a wide array of applications across fields such as healthcare, finance, and manufacturing. As a branch of artificial intelligence, it powers chatbots, customizes the shows that Netflix recommends for you, and determines your TikTok feed. It also plays a powerful role in healthcare technology, where machine learning systems can help diagnose diseases, recommend treatments, and even assist in surgery.

To become an expert in machine learning, you need to develop a strong foundation in three main areas: mathematics, coding, and machine learning theory. There is an overwhelming number of resources and courses out there dedicated to teaching these skills. Good books let you learn at your own pace and, unlike online courses, tend to go into more detail and explanation. They also serve as a great reference: machine learning is a very broad field, and owning a good book means you can always flick through a few chapters for a quick refresher from a concise, well-detailed source.

In this article, five machine learning books are recommended to help you build a strong foundation, ranging from beginner to intermediate to advanced level. They also explain the algorithms from different perspectives and with different focuses: some concentrate on the basics of machine learning concepts and algorithms, others go deeper into the theory, and others approach the subject from a technical and coding perspective.

To get the most out of these books, it is important to incorporate active learning techniques. This means not settling for simply finishing a book, but making sure to solve the exercises at the end of each chapter, summarize the main takeaways of each chapter, and share what you learn with others by writing blog and social media posts. This will not only increase your understanding but also benefit others.


1. The Hundred-Page Machine Learning Book

The first book is the Hundred-Page Machine Learning Book by Andriy Burkov. This is probably one of the best introductory books to machine learning.

Figure 1. The Hundred-Page Machine Learning Book by Andriy Burkov.

Despite its name, the book runs to about 136 pages, so it can be read in a couple of days or a little more. It is a perfect choice for beginners, as well as for machine learning practitioners who have been applying machine learning with built-in tools and would like to understand what is going on behind them.

From page one to page 136, the author, Andriy Burkov, does not waste a single word in distilling the most practical concepts in machine learning. He does not assume any background, although a little knowledge of math, probability, and statistics is helpful. It is important to mention that this book will not teach you everything about data science; it builds the basics and eases you gently into the field. If you would like to go further into machine learning, check out the books below.

The book assumes no prior technical background and builds the theoretical foundation step by step. It avoids intensive mathematical derivations of the machine learning algorithms, settling instead for explanations in a friendly, accessible style. Nor does it require a programming background, as it barely uses code to explain the different machine learning algorithms and concepts.

The book also provides an excellent overview of the machine learning landscape: it explains the main tasks and algorithms in machine learning, shows how models are trained and evaluated, and offers technical tips for each concept. At approximately 136 pages, it is short and direct to the point, so every word adds new knowledge and increases your understanding.

This is an important advantage, since technical books are usually very long and need a long-term commitment to finish and get the most out of. This book, however, can be finished in a couple of days to a week, and by the end you will have a very good understanding of the machine learning landscape, its algorithms, and its tasks.

Table of Contents:

The book consists of two parts: supervised learning and unsupervised learning.

Part I: Supervised Learning

  • Chapter 1 — Introduction: This chapter discusses the landscape of machine learning and the main learning tasks.
  • Chapter 2 — Notation and Definitions: This chapter speaks about different data structures (matrices, arrays, sets…), defines the notation used throughout the book, and introduces different concepts that are fundamental to the mathematics of machine learning, such as derivatives, gradients, and various statistical quantities and terms.
  • Chapter 3 — Fundamental Algorithms: Starting from the simplest algorithm, linear regression, this chapter follows a path through logistic regression, decision trees, support vector machines, and k-nearest neighbors (KNN), all with simple mathematical explanations and great visualizations.
  • Chapter 4 — Anatomy of a Learning algorithm: This chapter describes gradient descent in depth with an example to illustrate how learning works. After this, the main way of implementing these models is described: using libraries like Scikit-Learn or TensorFlow instead of coding the algorithms from scratch by yourself.
  • Chapter 5 — Basic Practice: This chapter describes the data science pipeline, starting with feature engineering practices such as null-value imputation and feature scaling, and the main criteria for choosing one machine learning algorithm over another. It explores train, test, and validation sets, explains what over-fitting and under-fitting are, and describes regularization techniques to avoid the former. Finally, it covers model assessment using different metrics such as the confusion matrix, accuracy, and the ROC curve.
  • Chapter 6 — Neural Networks and Deep Learning: This chapter describes the main elements of artificial neural networks, starting from the perceptron, and introduces the two most popular neural network architectures: convolutional and recurrent neural networks.
  • Chapter 7 — Problems and Solutions: This chapter is a curious one but provides great value. It covers specific machine learning problems and their solutions, like high dimensionality, multi-class classification, ensemble methods such as bagging and boosting, attention models, and semi-supervised learning.
  • Chapter 8 — Advanced Practice: This chapter describes specific techniques that are useful in certain contexts, such as transfer learning and handling imbalanced datasets. It also explores algorithm efficiency and why it matters.

Part II — Unsupervised Learning

  • Chapter 9 — Unsupervised Learning: This chapter explores unsupervised learning: different clustering techniques and how to use them properly, as well as dimensionality reduction techniques like Principal Component Analysis, and outlier detection.
  • Chapter 10 — Other Forms of Learning: This chapter is, again, a curious one. It covers content that most machine learning practitioners will not have heard much about, like metric learning and learning to recommend, and introduces some more familiar concepts like word embeddings.
  • Chapter 11 — Conclusion: This chapter closes the book by briefly exploring concepts not covered in the previous chapters, such as topic modeling, generalized linear models (GLMs), GANs, and reinforcement learning.
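As a concrete companion to Chapter 4's gradient-descent walkthrough, here is a minimal sketch of how learning works for a simple linear model. The data, learning rate, and epoch count below are illustrative choices, not taken from the book.

```python
# Gradient descent fitting y = w*x + b by minimizing mean squared error,
# in the spirit of Chapter 4's "anatomy of a learning algorithm".

def fit_linear(xs, ys, lr=0.01, epochs=2000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Partial derivatives of the mean squared error w.r.t. w and b
        dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * dw          # step against the gradient
        b -= lr * db
    return w, b

xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]         # generated from y = 2x + 1, no noise
w, b = fit_linear(xs, ys)     # converges close to w ≈ 2, b ≈ 1
```

Libraries like Scikit-Learn solve this same problem in a single call, which is exactly the point the chapter makes: understand the loop once, then use the library.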

Throughout its pages, The Hundred-Page Machine Learning Book packs in an enormous amount of knowledge, and the explanations are so elegant and simple that it is worth handing this book to anyone willing to enter the amazing field of machine learning, and carrying it around for a while yourself. Even after a first read, it remains a great tool for refreshing your knowledge of various concepts.

2. Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems

The second book is Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron, which has already had two editions, with a third to be published soon.

Figure 2. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron.

Aurélien Géron worked as a Product Manager at YouTube where he led the development of machine learning for video classification. His experience as a practitioner is evident in Hands-On Machine Learning as each chapter is filled with practical advice and realistic techniques for building machine learning models in the industry.

The book is not for absolute beginners: you need basic knowledge of Python, linear algebra, and calculus. To understand the more advanced topics, you'll need a firm grasp of Python, including useful tricks such as list comprehensions and lambda functions, as well as basic knowledge of the key data science libraries NumPy, pandas, and Matplotlib. You also need a solid command of algebra, calculus, and the basics of data science. Hands-On Machine Learning assumes you know your math well and won't hold your hand on partial derivatives and gradients when you reach the deep learning section.

The book has a unique approach. It usually starts with a high-level description of a machine learning concept to give you a general idea; then you go through hands-on coding with Python libraries without diving into the details; finally, once you are comfortable with the coding and concepts, you lift the hood and get into the nitty-gritty of how the math works for each algorithm. It breaks down the algorithms in a way that nicely balances the necessary theory and mathematics with applied concepts.

The book goes through all the basic machine learning and deep learning algorithms, in addition to advanced topics such as transformers and generative adversarial networks (GANs). It not only covers the basic concepts and algorithms of machine and deep learning in detail but also gives a very good introduction to a wide range of advanced topics, such as training and deploying TensorFlow models at scale, deep learning for computer vision, natural language processing with recurrent neural networks (RNNs) and attention models, representation learning, and generative learning using autoencoders and GANs.

Another strong point is that the chapters are filled with examples demonstrating how to use Python libraries like pandas, scikit-learn, and TensorFlow to preprocess data, split datasets for training and validation, and build models. For every explained concept or algorithm there is Python code that shows how to use it or how to implement the algorithm itself. Finally, it is worth mentioning that the book is written in friendly, easy-to-understand language, with no cryptic abbreviations or unexplained mathematical symbols. It always starts with a high-level introduction and then covers each topic in the machine learning life cycle in deeper detail, so it is easy to follow along without the need for further resources or research.
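As a taste of that style, here is a hypothetical mini-example of the split-fit-evaluate workflow using scikit-learn; the dataset and model are illustrative choices, not an excerpt from the book.

```python
# Split a dataset, fit a model, and evaluate it on held-out data --
# the basic workflow the book's examples revolve around.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)  # hold out 20% for evaluation

clf = DecisionTreeClassifier(max_depth=3, random_state=42)
clf.fit(X_train, y_train)
acc = accuracy_score(y_test, clf.predict(X_test))
```

The book layers progressively more realistic preprocessing and tuning on top of exactly this skeleton.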

Table of Contents:

The book consists of two parts: The fundamentals of machine learning and Neural Networks and Deep Learning.

The first part of the book consists of nine chapters:

  1. The Machine Learning Landscape
  2. End-to-End Machine Learning Project
  3. Classification
  4. Training Models
  5. Support Vector Machines
  6. Decision Trees
  7. Ensemble Learning and Random Forests
  8. Dimensionality Reduction
  9. Unsupervised Learning Techniques

The second part of the book consists of ten chapters:

  1. Introduction to Artificial Neural Networks with Keras
  2. Training Deep Neural Networks
  3. Custom Models and Training with TensorFlow
  4. Loading and Preprocessing Data with TensorFlow
  5. Deep Computer Vision Using Convolutional Neural Networks
  6. Processing Sequences Using RNNs and CNNs
  7. Natural Language Processing with RNNs and Attention
  8. Representation Learning and Generative Learning Using Autoencoders and GANs
  9. Reinforcement Learning
  10. Training and Deploying TensorFlow Models at Scale

3. An Introduction to Statistical Learning: With Applications in R

The third book is An Introduction to Statistical Learning: With Applications in R by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani.

Figure 3. An Introduction to Statistical Learning: With Applications in R by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani.

This book provides an accessible overview of the field of statistical learning, an essential toolset for making sense of the vast and complex data sets that have emerged in fields ranging from biology to finance to marketing to astrophysics in the past twenty years.

It presents a perfect introduction to the intersection of statistics and machine learning, covering topics from the most basic algorithms and concepts, such as linear regression, to more advanced ones like support vector machines and clustering techniques. Overall, the book offers a clear treatment of the mathematics and of the application of the R programming language to statistical learning, with beautifully written explanations of each topic that nonetheless call for a solid mathematical background.

One of the main goals of this book is to present the machine learning algorithms mentioned above in a practical, applicable manner. That is why each chapter contains a tutorial on implementing the analysis and prediction techniques in the R programming language. While Python is used more widely in data science, R offers some great analysis tools and statistical methods that are hard to find in any other programming language.

At the end of each chapter there are exercises that cover both the theoretical and the practical side of what has been taught. Solving them is a perfect way to test your knowledge and understanding up to that point, so it is strongly recommended to dedicate some time to completing these exercises slowly and properly, working them through by hand.

This book is appropriate for advanced undergraduates or master's students in statistics or related quantitative fields, and for individuals in other disciplines who wish to use statistical learning tools to analyze their data. To advance through the book without difficulty, it is better to have a strong background in statistics, linear algebra, and calculus. You can proceed without one, but you will at least have to brush up on the basics of these topics, and you can keep learning them as you go.

Table of Contents:

  1. Introduction: This chapter gives an overview and brief history of statistical learning, a vast set of tools for understanding data, and some examples.
  2. Statistical Learning: This chapter covers what statistical learning is, inference, parametric and non-parametric methods, the trade-off between prediction accuracy and model interpretability, the bias-variance trade-off, and a lot more!
  3. Linear Regression: This chapter explains the simplest approach to supervised learning, how to estimate the coefficients, the different errors, and all you need to know.
  4. Classification: This chapter discusses approaches for predicting discrete target variables, with the details of logistic regression, Linear Discriminant Analysis, and Bayes' theorem.
  5. Resampling Methods: This chapter discusses how to repeatedly draw samples from a training set and refit a model of interest on each sample in order to obtain additional information about the fitted model.
  6. Linear Model Selection and Regularization: This chapter introduces approaches for extending the framework of linear models and for reducing model variance through regularization: the lasso and ridge regression.
  7. Moving Beyond Linearity: This chapter will discuss how to go beyond linear models to polynomial regression, smoothing splines, generalized additive models, and more!
  8. Tree-Based Methods: This chapter explains decision trees for classification and regression, bagging, and random forest.
  9. Support Vector Machines: This chapter explains maximum margin classifiers and their evolution into Support Vector Machines: how they are trained, how they are used to make predictions, and their advantages and disadvantages.
  10. Unsupervised Learning: The final chapter discusses clustering and dimensionality reduction: k-means, hierarchical clustering, and Principal Component Analysis.
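The book's labs are in R, but the least-squares coefficient estimates at the heart of Chapter 3 can be sketched in a few lines of Python with NumPy; the synthetic data and true coefficients below are purely illustrative.

```python
# Estimate the intercept and slope of a simple linear regression by
# ordinary least squares, as introduced in Chapter 3.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 4.0 + 1.5 * x + rng.normal(0, 1, 100)   # true intercept 4, slope 1.5

X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept column
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = beta                     # estimates near 4 and 1.5
```

The R labs in the book do the same thing with `lm()`, and then go further into standard errors and confidence intervals for the estimates.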

4. The Elements of Statistical Learning

The fourth book is The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition by Trevor Hastie, Robert Tibshirani, and Jerome Friedman. Professors Hastie and Tibshirani are Professors of Statistics and Biomedical Data Science at Stanford University, and Jerome H. Friedman is a Professor of Statistics at Stanford University. All three authors have made important contributions to the fields of statistics and machine learning.

Figure 4. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Second Edition by Trevor Hastie, Robert Tibshirani, and Jerome Friedman.

This book is seen by many as the bible of machine learning, and despite its age it remains the definitive text for becoming a serious expert in the theory underlying machine learning. It is a very conceptual and theoretical book with many examples, accompanied by very illustrative, high-quality figures. It covers topics ranging from supervised and unsupervised learning to artificial neural networks.

Despite being very theoretical, The Elements of Statistical Learning avoids circling around the same topic or indulging in tedious, lengthy derivations, going straight to the point on each subject, which makes it a great reference manual for refreshing the deepest corners of machine learning algorithms and concepts. Because of this, it is a great document to have for researchers as well as for those who use machine learning techniques in the business world. Overall, it is a must-have manual on the shelf of anybody who aims to be a machine learning expert, and definitely a good resource for reaching cutting-edge knowledge in this field.

It is very important to mention that The Elements of Statistical Learning is a highly theoretical book: it does not discuss programming, and the mathematics required to understand it is at a medium-to-high level. This book is therefore not recommended for beginners in statistics, or for those looking to learn to implement machine learning algorithms in R or Python. Rather, it is best for those who have a good foundation in statistics and mathematics, have been implementing and working with machine learning for a while, and want to deepen their knowledge of the theoretical concepts underlying the different algorithms. If you would like to learn similar topics with fewer mathematical details and more focus on applications and software implementations, the previous book, An Introduction to Statistical Learning, is a better choice.

Table of Contents:

  1. Introduction
  2. Overview of Supervised Learning
  3. Linear Methods for Regression
  4. Linear Methods for Classification
  5. Basis Expansions and Regularization
  6. Kernel Smoothing Methods
  7. Model Assessment and Selection
  8. Model Inference and Averaging
  9. Additive Models, Trees, and Related Methods
  10. Boosting and Additive Trees
  11. Neural Networks
  12. Support Vector Machines and Flexible Discriminants
  13. Prototype Methods and Nearest-Neighbors
  14. Unsupervised Learning
  15. Random Forests
  16. Ensemble Learning
  17. Undirected Graphical Models
  18. High-Dimensional Problems: p ≫ N

5. Pattern Recognition and Machine Learning

The final book is Pattern Recognition and Machine Learning by Dr. Christopher Bishop. Dr. Bishop is the Laboratory Director at Microsoft Research Cambridge and a Professor of Computer Science at the University of Edinburgh. He is an expert in artificial intelligence and artificial neural networks, holds a Ph.D. in theoretical physics, and in 2017 was elected a Fellow of the Royal Society.

Figure 5. Pattern Recognition and Machine Learning by Dr. Christopher Bishop.

This book is a loosely organized collection of topics, but the discussion of each topic is extremely clear. The loose organization has the advantage that you can open the book and read different sections without having to read the earlier chapters. However, a beginner to machine learning should start by reading Chapters 1 through 4 very carefully and then read the opening sections of the remaining chapters to get an idea of the topics they cover.

The choice of topics hits most of the major areas of machine learning, and the pedagogical and writing styles are quite clear. There are lots of great exercises, great color illustrations, intuitive explanations, relevant but not excessive mathematical notation, and numerous comments that are extremely relevant for applying these ideas in practice.

To read this textbook, a background in linear algebra, calculus (ideally multivariate calculus), and calculus-based probability theory is required. Without this background the book may be a little challenging, but it is certainly accessible to students with this relatively minimal mathematical preparation. If you have a Ph.D. in statistics, computer science, engineering, or physics, you will find the book extremely useful, because it will connect with topics you already know.

The book focuses on the theoretical aspects of machine learning as well as statistical concepts in machine learning and pattern recognition. If you want to learn more about these concepts, this is the book for you. Ideas covered include basic probability theory, pattern recognition, the Bayesian method, and approximate inference algorithms. The book also includes practice exercises surrounding statistical pattern recognition.

Table of Contents:

  • Chapters 1 and 2: provide a brief overview of relevant topics in probability theory.
  • Chapters 3 and 4: discuss methods for parameter estimation for linear regression modeling.
  • Chapter 5: discusses parameter estimation for feedforward neural network models.
  • Chapters 6 and 7: discuss kernel methods and Support Vector Machines (SVMs).
  • Chapter 8: discusses Bayesian networks and Markov random fields.
  • Chapter 9: discusses mixture models and Expectation Maximization (EM) methods.
  • Chapter 10: discusses variational inference methods.
  • Chapter 11: discusses sampling algorithms that are useful for seeking global minima as well as numerically evaluating high-dimensional integrals.
  • Chapter 12: discusses various types of Principal Component Analysis (PCA) including PCA, Probabilistic PCA, and Kernel PCA.
  • Chapter 13: discusses hidden Markov models.
  • Chapter 14: discusses Bayesian Model Averaging and other methods for modeling mixtures of experts.

Conclusion

Machine learning is a fast-growing field, and the array of tools out there can seem overwhelming. The collection of books in this article should, however, serve as an excellent foundation for your journey. These textbooks have proved to be among the best in this rapidly developing field and should be considered valuable references as you move ahead. Select whichever matches your interest and experience, and start your learning journey.


Thanks for reading! If you like the article, make sure to clap (up to 50!), connect with me on LinkedIn, and follow me on Medium to stay updated with my new articles.