History of Machine Learning

Learn about the history of machine learning in detail.

Machine learning might sound like a niche area of science, and you might wonder why there is now so much interest in this discipline in both academia and industry. The reason is that machine learning is really about modeling data. Modeling is the basis for advanced object recognition, data mining, and, ultimately, intelligent systems. Machine learning is the analytic engine in areas such as data science, big data, and data analytics, and, to some extent, in science in general, in the sense of building quantitative models.


Machine learning has a long history with traces far back in time. The first genuine public recognition and accompanying widespread excitement among scientists about learning machines came in the late 1950s and early 1960s with work like Arthur Samuel’s self-learning checkers program. Samuel devised a program with reinforcement learning that ultimately learned to outperform its creator. Around this time, Richard Bellman established much of the mathematical foundation of reinforcement learning.

One of the first general learning machines, which is now considered to be a neural network, was invented by Karl Steinbuch in Germany. Frank Rosenblatt developed much of the systematic foundation of neural networks and, together with Charles Wightman, started to build neural network computers such as the Mark I Perceptron. Neural networks were popularized again in the 1980s, influenced by David Rumelhart and Geoffrey Hinton, while Terry Sejnowski studied their connection to the brain. Of course, there are many more inspiring researchers, such as Yoshua Bengio, Yann LeCun, and Jürgen Schmidhuber, to name but a few.

Era of machine learning

We are now in an era of deep learning, with important recent developments that are responsible for the popularity of machine learning today. This has a great deal to do with the availability of appropriate data and of faster computers, but also with smart techniques that make it possible to scale models to much larger domains. A great example of recent progress in deep reinforcement learning is the ability of a computer to learn to play video games.

Video games from the old Atari platform have become a useful paradigm for a new class of benchmarks. These go beyond the classical data sets from the University of California, Irvine (UCI) machine learning repository, which dominated benchmarking in the past. Atari games are somewhat simplified worlds, yet they still present learning environments that humans have to figure out for themselves. In these benchmarks, the only input is visual, made up of the frames of the video game, and the only feedback is how well the player performed in the game.
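To make this benchmark setup concrete, here is a minimal sketch of the interaction loop it implies. It is not an Atari emulator and it contains no learning algorithm. The class ToyPixelGame, the function run_episode, and both policies are hypothetical names invented for this illustration. The sketch only shows the protocol: the agent sees nothing but a frame of pixels and receives nothing but a score as feedback.

```python
import random


class ToyPixelGame:
    """Hypothetical stand-in for an Atari-style game.

    The world is a strip of 5 pixels. The player starts at the left edge
    and scores one point on reaching the right edge.
    """

    WIDTH = 5

    def __init__(self, max_steps=20):
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self):
        """Start a new episode and return the first frame."""
        self.position = 0
        self.steps = 0
        return self._frame()

    def _frame(self):
        # The observation is only the raw frame: 1 marks the player's pixel.
        return [1 if i == self.position else 0 for i in range(self.WIDTH)]

    def step(self, action):
        """Apply an action (0 = left, 1 = right); return (frame, reward, done)."""
        move = 1 if action == 1 else -1
        self.position = min(self.WIDTH - 1, max(0, self.position + move))
        self.steps += 1
        reached_goal = self.position == self.WIDTH - 1
        # The feedback is only the game score, never the "correct" action.
        reward = 1.0 if reached_goal else 0.0
        done = reached_goal or self.steps >= self.max_steps
        return self._frame(), reward, done


def run_episode(env, policy):
    """Play one episode and return the total score collected."""
    frame = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(frame)  # the policy sees pixels only
        frame, reward, done = env.step(action)
        total_reward += reward
    return total_reward


def random_policy(frame):
    """Baseline that ignores the frame and acts at random."""
    return random.choice([0, 1])


def always_right_policy(frame):
    """Hand-written policy that happens to solve this toy game."""
    return 1


if __name__ == "__main__":
    print("random policy score:", run_episode(ToyPixelGame(), random_policy))
    print("always-right score:", run_episode(ToyPixelGame(), always_right_policy))
```

A reinforcement learning method would replace the fixed policies with one that is adjusted from the frames and scores it observes. The loop itself stays the same.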

While the above examples have been widely popularized as the new forefront in AI, much of the scientific progress in machine learning is related to its embedding in probabilistic methods and statistical learning theory. Some pioneers in this domain are Vladimir Vapnik and Judea Pearl. The development of statistical machine learning and Bayesian networks has influenced the field strongly in the last twenty years, and the domain of Bayesian reasoning is essential for a deeper understanding of machine learning.

Some scientists are now working on more general probabilistic programming methods that, to some extent, go beyond the current standard in machine learning applications. The aim of this course is to introduce machine learning at a practical level so that practitioners can apply it immediately, at least in its basic form, and then to discuss the foundations in more general terms so that practitioners can learn more about the theoretical underpinnings of machine learning.

In the illustration above, from top-left to bottom-right, we have the following people: Arthur Samuel playing checkers; Richard Bellman, who formalized reinforcement learning; Karl Steinbuch, who invented the 'learn matrix'; Frank Rosenblatt and Charles Wightman, who implemented a neural computer; Terry Sejnowski and Geoffrey Hinton discussing the Boltzmann machine circa 1983 (Courtesy of Geoffrey Hinton); and David Rumelhart.

In the next three chapters, we'll learn how to apply machine learning with the help of Python libraries such as scikit-learn (sklearn) and Keras.

The next several chapters explore the principles behind supervised learning by discussing regression and classification. Along the way, we switch frequently between a functional and a probabilistic framework. A refresher on the basic probability formalism is included in this discussion.

In the last few chapters, we will discuss some more advanced machine learning issues and methods, including recurrent networks and reinforcement learning.