Basics of AI I

In this lesson, you will gain an understanding of the basic concepts of AI.


Computers are incredible at storing, organizing, fetching and processing huge volumes of data. This is perfect for things like e-commerce websites with millions of items for sale and for storing billions of health records for quick access by doctors.

However, what if we want to use computers not just to fetch and display data, but to actually make decisions about data? What if we want algorithms that allow computers to learn from data and then make predictions and decisions? What if we want machines to perform cognitive functions we associate with human minds, like perceiving, reasoning, learning, interacting with the environment, problem-solving, and even exercising creativity?

This is where AI comes into the picture. You might have heard the terms artificial intelligence (AI), machine learning (ML), and deep learning (DL) being used interchangeably. ML and AI, in particular, are often treated as synonyms. So what is the difference between them?

Let’s start by briefly discussing what each of these terms means.

What is machine learning (ML)?

One of the main differences between humans and computers is that humans learn from past experiences, whereas computers need to be told what to do; they need to be programmed so that they can follow instructions. The question is, how can we get computers to learn from experience? The answer is machine learning. Of course, in the context of computers, experience goes by a different name: data.

The main idea

In machine learning, the main idea is that there are generic algorithms that can tell you something interesting about a set of data without you having to write any custom code specific to the problem.

In a standard program, such as one that sorts data, we provide a very specific set of instructions, and the computer simply follows them. With machine learning, instead of writing explicit code (as on the left-hand side in the figure below), you feed data to a generic algorithm, and the algorithm builds its own logic from that data.
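The contrast can also be sketched in a few lines of Python. This is only an illustration under stated assumptions: the spam rule, its thresholds, the helper name predict_nearest, and both tiny data sets are made up for this lesson. The generic algorithm used here is nearest neighbor, chosen because it is short, not because it is the method any particular system uses.

```python
# Standard program: the programmer writes the logic by hand.
# The thresholds below are invented for illustration.
def is_spam_by_rule(exclamation_marks, links):
    return exclamation_marks > 3 or links > 2


# Machine learning: one generic algorithm, no problem-specific logic.
def predict_nearest(examples, query):
    """Return the label of the labeled example closest to `query`.

    `examples` is a list of (features, label) pairs, where features
    is a tuple of numbers with the same length as `query`.
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, closest_label = min(
        examples, key=lambda pair: squared_distance(pair[0], query)
    )
    return closest_label


# Problem 1 (toy data): features are (exclamation marks, links).
emails = [
    ((0, 0), "not spam"),
    ((1, 0), "not spam"),
    ((6, 3), "spam"),
    ((5, 4), "spam"),
]

# Problem 2 (toy data): features are (weight in grams, diameter in cm).
fruits = [
    ((150, 7), "apple"),
    ((170, 8), "apple"),
    ((5, 2), "grape"),
    ((7, 2), "grape"),
]

# The same function handles both problems; only the data changes.
print(predict_nearest(emails, (7, 2)))    # spam
print(predict_nearest(fruits, (160, 7)))  # apple
```

Notice that is_spam_by_rule can only ever answer the spam question, while predict_nearest never mentions emails or fruit at all. Everything it knows about either problem comes from the labeled examples it is given.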


A scalable approach

Say we want to recognize objects in a picture. In the old days, programmers would have to write code for every object they wanted to recognize, e.g., person, cat, vehicle. This is not a scalable approach.

Today, thanks to machine learning algorithms, a single system can learn to recognize all of these objects simply by being shown many examples of each:

  • For instance, the algorithm learns what a cat looks like by processing pictures labeled “this is a cat” or “this is not a cat” and by being corrected every time it makes a wrong guess about the object in the picture. When it is then shown a series of new pictures, it begins to identify the cat photos in the new set.

  • Think of when you were a toddler. How did you learn to recognize objects? A child learns to call a cat “a cat” and a dog “a dog” by seeing many examples and by being corrected after each wrong guess. Machine learning works in much the same way.
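The "guess, get corrected, try again" loop described above can be sketched with a perceptron, one of the simplest learning algorithms. This is a minimal sketch, not how real image recognizers are built: instead of pictures, each example is a made-up tuple of three yes/no features (has whiskers, has pointy ears, barks), and the function names and data are assumptions introduced for this illustration.

```python
def predict(weights, bias, features):
    """Guess 1 ("cat") if the weighted score is positive, else 0 ("not a cat")."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score > 0 else 0


def train_perceptron(examples, epochs=20):
    """Learn weights from (features, label) pairs by correcting wrong guesses."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for features, label in examples:
            error = label - predict(weights, bias, features)
            if error != 0:
                # Wrong guess: nudge the weights toward the correct answer.
                mistakes += 1
                weights = [w + error * x for w, x in zip(weights, features)]
                bias += error
        if mistakes == 0:
            break  # every training example is now classified correctly
    return weights, bias


# Toy labeled data: (has whiskers, has pointy ears, barks) -> 1 = cat, 0 = not a cat.
labeled_examples = [
    ((1, 1, 0), 1),
    ((1, 0, 0), 1),
    ((0, 1, 0), 1),
    ((1, 1, 1), 0),
    ((0, 1, 1), 0),
    ((1, 0, 1), 0),
]

weights, bias = train_perceptron(labeled_examples)

# An example the algorithm has not seen during training: it barks.
print(predict(weights, bias, (0, 0, 1)))  # 0, i.e. "not a cat"
```

Nothing in the code says that barking animals are not cats. The algorithm arrives at a strongly negative weight for the "barks" feature only because it was corrected each time it guessed wrong.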

Machine learning is an umbrella term for these generic algorithms that enable computers to learn from data. Whether we want to classify images or predict housing prices, the same basic idea applies.
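To show that the same idea also covers predicting numbers rather than categories, here is a small sketch that fits a straight line to house sizes and prices using ordinary least squares. The function name fit_line and the figures are invented for illustration (the prices are deliberately set to exactly three times the size), so treat it as a toy example rather than a realistic pricing model.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up training data: size in square meters, price in thousands.
sizes = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

slope, intercept = fit_line(sizes, prices)

# Predict the price of a house size that was not in the data.
predicted_price = slope * 90 + intercept
print(predicted_price)
```

As with the classification examples, the program contains no pricing rule. The relationship between size and price is recovered from the data alone.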