AutoML Overview

Learn about the AutoML workflow and how it differs from the traditional ML process in ML.NET.

AutoML, short for automated machine learning, is a set of techniques and tools that aim to automate various aspects of the ML pipeline, making it easier and faster for individuals without extensive ML expertise to build and deploy ML models. AutoML systems automate tasks such as data preprocessing, feature engineering, model selection, hyperparameter tuning, and model evaluation.

AutoML aims to reduce the time and effort required for developing ML models by automating repetitive and time-consuming tasks. It democratizes ML by enabling nonexperts to leverage the power of ML in their respective domains, allowing them to focus on domain-specific problems rather than the intricacies of model building and optimization.

AutoML workflow

The general workflow of AutoML typically involves the following steps:

  1. Data preprocessing: AutoML tools handle tasks such as data cleaning, handling missing values, feature scaling, and encoding categorical variables. These steps ensure that the data is in a suitable format for training ML models.

  2. Feature engineering: AutoML systems automatically generate or select relevant features from the available data. They can perform techniques such as feature extraction, feature selection, or feature transformation to improve model performance.

  3. Model selection: AutoML systems explore a wide range of ML algorithms to identify the best-performing one for the given task. To assess model quality, they typically combine statistical techniques, such as cross-validation or information criteria, with heuristics.

  4. Hyperparameter tuning: Each ML model has hyperparameters that control its behavior and performance. AutoML systems automatically search the hyperparameter space for the combination that gives the best performance on validation data. This can involve techniques like grid search, random search, or more advanced optimization algorithms like Bayesian optimization.

  5. Model evaluation: AutoML systems assess the performance of trained models using evaluation metrics appropriate to the task, such as accuracy, precision, recall, or F1 score for classification. They can also compare models and select among them based on these metrics.
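To make the steps above concrete, here is a minimal sketch in plain Python (standard library only). It is not how ML.NET or any particular AutoML tool implements them. The function names, the toy dataset, and the choice of a k-nearest-neighbors classifier as the only candidate model are all assumptions made for illustration. Feature engineering (step 2) is omitted, and for brevity the scaling is computed on the full dataset before cross-validation, which a real pipeline would avoid.

```python
from statistics import mean


def preprocess(rows):
    """Step 1: fill missing values (None) with the column mean, then min-max scale."""
    columns = []
    for col in zip(*rows):
        fill = mean(v for v in col if v is not None)
        filled = [fill if v is None else v for v in col]
        lo, hi = min(filled), max(filled)
        span = (hi - lo) or 1.0  # avoid dividing by zero on constant columns
        columns.append([(v - lo) / span for v in filled])
    return [list(row) for row in zip(*columns)]


def knn_predict(train_X, train_y, x, n_neighbors):
    """Candidate model: majority vote among the nearest training points."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = sum(train_y[i] for i in order[:n_neighbors])
    return 1 if 2 * votes > n_neighbors else 0


def cross_val_accuracy(X, y, n_neighbors, folds=3):
    """Steps 3 and 5: score one configuration with k-fold cross-validation."""
    scores = []
    for fold in range(folds):
        train = [i for i in range(len(X)) if i % folds != fold]
        test = [i for i in range(len(X)) if i % folds == fold]
        train_X, train_y = [X[i] for i in train], [y[i] for i in train]
        hits = sum(knn_predict(train_X, train_y, X[i], n_neighbors) == y[i] for i in test)
        scores.append(hits / len(test))
    return mean(scores)


def grid_search(X, y, grid):
    """Step 4: exhaustive (grid) search over the candidate hyperparameter values."""
    scored = [(cross_val_accuracy(X, y, n), n) for n in grid]
    return max(scored, key=lambda pair: pair[0])  # (best score, best value)


def precision_recall_f1(y_true, y_pred):
    """Step 5: common metrics for binary classification."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy data (made up): two features, two missing values, binary labels.
raw = [
    [1.0, 10.0], [1.2, None], [0.8, 12.0], [1.1, 9.0], [0.9, 11.0], [1.3, 10.5],
    [5.0, 50.0], [5.2, 48.0], [None, 52.0], [4.8, 49.0], [5.1, 51.0], [4.9, 50.5],
]
labels = [0] * 6 + [1] * 6

X = preprocess(raw)
best_score, best_k = grid_search(X, labels, grid=(1, 3, 5))
print("best n_neighbors:", best_k, "cross-validated accuracy:", best_score)
print("precision, recall, F1:", precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0]))
```

Replacing the loop in `grid_search` with random sampling of the grid would turn it into random search. Bayesian optimization would instead use the scores seen so far to decide which value to try next.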

Essentially, AutoML performs the same steps as a traditional ML process. However, because most of those steps are automated, only three are left for the developer:

  • Defining the problem: We need to configure the AutoML pipeline to perform a specific type of task and solve a specific type of problem.

  • Collecting data: We need to collect appropriate data for the specific task.

  • Building and running the AutoML pipeline: Once the pipeline runs, AutoML automatically performs all the steps listed above, from data preprocessing and feature engineering through model selection, hyperparameter tuning, and model evaluation.
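The three developer-facing steps can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the ML.NET API. `ExperimentConfig`, `run_experiment`, the made-up housing data, and the two kinds of candidate model (a mean baseline and one-feature linear fits) are all hypothetical. The point is only that the developer supplies a problem definition and data, and the search over candidate models happens inside a single call.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class ExperimentConfig:
    """Step 1, defining the problem (hypothetical type, not an ML.NET class)."""
    task: str        # only "regression" is supported in this sketch
    label: str       # name of the column to predict
    max_trials: int  # how many candidate models the search may try


def fit_line(rows, feature, label):
    """Least-squares line: label = slope * feature + intercept."""
    xs = [r[feature] for r in rows]
    ys = [r[label] for r in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    var = sum((x - x_bar) ** 2 for x in xs)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = cov / var if var else 0.0
    intercept = y_bar - slope * x_bar
    return lambda row: slope * row[feature] + intercept


def run_experiment(config, rows):
    """Step 3: try candidate models, score each on held-out rows, return the best."""
    if config.task != "regression":
        raise ValueError("this sketch only handles regression")
    train, valid = rows[0::2], rows[1::2]  # simple alternating hold-out split
    baseline = mean(r[config.label] for r in train)
    candidates = [("mean-baseline", lambda row: baseline)]
    for feature in rows[0]:
        if feature != config.label:
            candidates.append((f"linear({feature})", fit_line(train, feature, config.label)))
    results = []
    for name, predict in candidates[: config.max_trials]:
        # Mean absolute error on the validation rows; lower is better.
        error = mean(abs(predict(r) - r[config.label]) for r in valid)
        results.append((error, name))
    best_error, best_name = min(results)
    return best_name, best_error


# Step 1: define the problem.
config = ExperimentConfig(task="regression", label="price", max_trials=3)

# Step 2: collect data (made-up values; price is exactly 3 * size here).
sizes = [50, 60, 70, 80, 90, 100, 110, 120]
rooms = [2, 3, 2, 3, 4, 3, 4, 5]
rows = [{"size": s, "rooms": r, "price": 3 * s} for s, r in zip(sizes, rooms)]

# Step 3: build and run the pipeline.
best_name, best_error = run_experiment(config, rows)
print(best_name, best_error)
```

Because the price in this toy dataset depends only on size, the search should settle on the linear model over `size`.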

The following diagram demonstrates the difference between a traditional ML process and AutoML:
