Understanding Machine Learning

Let's learn about the components and applications of machine learning in this lesson.

Image Credits: https://medicalfuturist.com/top-ai-algorithms-healthcare/

In the first lesson, you learned the main idea of machine learning: there are generic algorithms that can tell you something interesting about a set of data without you having to write any custom code specific to the problem. Instead of writing explicit code, you feed data to a generic algorithm, which then builds its own logic based on the data. This magic of learning from data is possible because the algorithm learns from the properties, or features, of the object it is being asked to learn about.

For example, when learning to distinguish between apples and oranges in a very rudimentary way, color could be used as a feature. All the red-colored fruits would then be labeled “apple,” while the ones with an orange color would be labeled “orange.”
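
The apple-versus-orange idea can be sketched in a few lines of Python. This is only a minimal illustration, not a real machine learning library: the function names `train` and `predict` and the tiny training set are made up for this example. The point is that the rule "red means apple" is never written by hand; it is derived from the labeled examples.

```python
from collections import Counter, defaultdict

def train(examples):
    """For each color, remember which label was seen most often."""
    counts = defaultdict(Counter)
    for color, label in examples:
        counts[color][label] += 1
    # The "model" is just a mapping from color to its most common label.
    return {color: labels.most_common(1)[0][0] for color, labels in counts.items()}

def predict(model, color):
    """Look up the learned label; colors never seen in training are 'unknown'."""
    return model.get(color, "unknown")

# Hypothetical training data: (color feature, label) pairs.
training_data = [
    ("red", "apple"),
    ("red", "apple"),
    ("orange", "orange"),
    ("orange", "orange"),
]

model = train(training_data)
print(predict(model, "red"))     # apple
print(predict(model, "orange"))  # orange
print(predict(model, "purple"))  # unknown
```

If the training data changed, for instance by adding green apples, the learned mapping would change with it, without any change to `train` or `predict`.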

Main components of machine learning

Based on our examples, can you spot the three main components of machine learning?
We need three components to train a machine learning system:

  • Data: Data is the most essential component of machine learning, which is why it’s called the “new oil.” Data can be collected both manually and automatically. For example, users’ personal details (like age and gender), their clicks, and their purchase histories are valuable data for an online store. Do you recall “reCAPTCHA,” which forces you to “Select all the street signs”? That’s an example of free manual labor used to collect data. By selecting the right blocks, you help the algorithm learn to recognize street signs! Data is not always images; it could be tables of data with many variables (features), text, sensor recordings, sound samples, etc., depending on the problem at hand.
Image Credits: xkcd.com
  • Features: Features are often also called variables or parameters. These are the factors a machine looks at, that is, the properties of the “object” in question. Examples include a user’s age, a stock price, the area of a rental property, the number of words in a sentence, petal length, and cell size.
    Choosing meaningful features is very important. Continuing with our example of distinguishing apples from oranges, say we take features like ripeness and seed count. Since these properties do not really differ between the two fruits, our machine learning system won’t be able to effectively distinguish between apples and oranges based on them.
    Remember that it takes practice and thought to figure out what features to use because they are not always as intuitive as in this trivial example.

  • Algorithms: Machine learning is based on general-purpose algorithms. For example, one kind of algorithm is classification. Classification allows us to put data into different groups. The interesting thing is that the same classification algorithm used to recognize handwritten numbers can also be used to classify emails into spam and not-spam without changing a line of code! How is this possible? Although the algorithm is the same, it’s fed different input data, so it comes up with different classification logic. However, this is not meant to imply that one algorithm can be used to solve all kinds of problems! The choice of the algorithm depends on the type of problem at hand, e.g., are we predicting stock prices, or do we want to assign labels like spam or not-spam?
    The choice of the algorithm is important in determining the quality of the final machine learning model. We will get into the details of algorithms in the coming sections. However, one very important thing to remember is that if the data is crappy, even the best algorithm won’t help. "Garbage in, garbage out" is what they always say. This is why acquiring as much good-quality data as possible is a very important first step in getting started with machine learning systems.
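
To make the "same algorithm, different data" point concrete, here is a small sketch using a nearest-neighbor classifier, one simple classification technique chosen purely for illustration. The function name `nearest_neighbor`, the feature choices, and all the numbers are assumptions made up for this example. The same function is used for fruits and for emails; only the data (and therefore the features) changes.

```python
import math

def nearest_neighbor(training_set, point):
    """Return the label of the training example closest to `point`.

    `training_set` is a list of (features, label) pairs, where
    `features` is a tuple of numbers of the same length as `point`.
    """
    _, label = min(training_set, key=lambda example: math.dist(example[0], point))
    return label

# Hypothetical fruit data. Features: (red intensity, green intensity).
fruits = [
    ((0.9, 0.1), "apple"),
    ((0.8, 0.2), "apple"),
    ((0.9, 0.6), "orange"),
    ((1.0, 0.5), "orange"),
]

# Hypothetical email data. Features: (number of links, number of "!").
emails = [
    ((0, 1), "not-spam"),
    ((1, 0), "not-spam"),
    ((7, 9), "spam"),
    ((5, 6), "spam"),
]

# The same, unchanged function classifies both kinds of objects.
print(nearest_neighbor(fruits, (0.85, 0.15)))  # apple
print(nearest_neighbor(fruits, (0.95, 0.55)))  # orange
print(nearest_neighbor(emails, (6, 8)))        # spam
print(nearest_neighbor(emails, (0, 0)))        # not-spam
```

Note how much depends on the data and the features: if the fruit examples were described by ripeness and seed count instead of color, the apples and oranges would sit close together and this classifier would have little to go on.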

Machine learning applications

Machine learning can provide us with both predictions and prescriptions. What does this mean?

  • Predictions: Data-driven organizations use machine learning predictions as a key source of insights to anticipate what will happen next.
  • Prescriptions: Prescriptions power recommendation engines; machine learning algorithms recommend what to do next to move closer to a set goal.

Can you think of some examples of machine learning that you use every day?

Here are some popular applications:

  • Virtual personal assistants: Siri, Cortana, Alexa, and Google Now
  • Finance: Fraud detection, prediction and execution of trades at speeds and volumes that humans can’t compete with
  • Social media: Face recognition, People You May Know, and Pages You Might Like
  • Retail: Product recommendations; maximization of revenue by learning customers’ habits
  • Online customer support: Customer support representatives are being increasingly replaced by chatbots.
  • Medicine: Medical diagnosis, drug discovery, and understanding of risk factors for diseases in large populations.
  • Search results: When you search on Google, the backend keeps an eye on whether you clicked on the first result or went on to the second page. This data is used to learn from mistakes so that relevant information can be found more quickly next time.
Image Credits: Introduction to Machine Learning - Scientific Figure on ResearchGate. Available from: https://www.researchgate.net/figure/Machine-Learning-Application_fig1_323108787