What is a variable?

In machine learning, data is collected in datasets and described by features, or variables, which can be of different types. This lesson explains what a variable is and which types of variables appear in datasets, with examples to illustrate each.

Variable types

This section introduces the types of variables found in a dataset so that you can choose suitable feature engineering steps for each of them.

What is a variable?

A variable is any characteristic, number, or quantity that can be measured or counted. It is called a variable because the value it takes may vary from one observation to another.

Here are a few examples of variables:

  • Marital status (single, married, divorced, widowed)
  • Number of siblings (1, 3, 6, …)
  • Identification Number (88524136, 80254700, …)
  • Phone Brand (Apple, Samsung, Huawei, …)
  • Height (1.65m, 1.72m, 1.86m, …)

Dataset variables are classified into one of these types:

  • Numerical variables
  • Categorical variables
  • Datetime variables
  • Mixed variables
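
To make the four types concrete, here is a minimal sketch that builds a tiny dataframe with one column per type and prints how pandas stores each one. The column names and values are invented for illustration, and the check for a mixed column is only a rough heuristic, not a standard pandas feature.

```python
import pandas as pd

# Hypothetical toy data: one column per variable type listed above.
df = pd.DataFrame({
    "height_m": [1.65, 1.72, 1.86],                     # numerical
    "phone_brand": ["Apple", "Samsung", "Huawei"],      # categorical
    "signup_date": pd.to_datetime(
        ["2021-01-05", "2021-02-11", "2021-03-20"]      # datetime
    ),
    "ticket": ["A12", 7, "C85"],                        # mixed: labels and numbers
})

# Numerical columns get a numeric dtype and datetime columns a datetime64 dtype.
# Text columns usually appear as object (or a string dtype in newer pandas versions).
print(df.dtypes)

# Rough heuristic for a mixed column: some values parse as numbers, others do not.
as_number = pd.to_numeric(df["ticket"], errors="coerce")
is_mixed = bool(as_number.notna().any() and as_number.isna().any())
print("ticket looks mixed:", is_mixed)
```

Note that pandas has no dedicated "mixed" type, so such columns have to be spotted by inspecting their values, as the heuristic above does.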

Feature types with Python

The following code snippet prints the type of each variable in a pandas dataframe. The example uses the public Titanic dataset, loaded through the seaborn library:

Python 3.8
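import seaborn as sns

# Load the Titanic example dataset as a pandas dataframe
# (seaborn fetches it online, so this needs internet access on first use)
titanic = sns.load_dataset('titanic')

# Print the dtype that pandas assigned to each column
print(titanic.dtypes)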
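
Once the dtypes are known, the columns can be grouped by kind before feature engineering. The sketch below is one possible way to do this with `select_dtypes`; the helper name `group_columns_by_kind` and the sample dataframe are hypothetical and are not part of pandas or seaborn. It uses a small hand-built dataframe so that it runs without downloading anything.

```python
import pandas as pd

def group_columns_by_kind(df):
    """Group column names into rough variable kinds (hypothetical helper)."""
    numerical = df.select_dtypes(include="number").columns.tolist()
    datetimes = df.select_dtypes(include="datetime").columns.tolist()
    # Everything else (text, category, boolean) may be categorical or mixed
    # and needs a closer look at the actual values.
    other = [c for c in df.columns if c not in numerical and c not in datetimes]
    return {
        "numerical": numerical,
        "datetime": datetimes,
        "categorical_or_mixed": other,
    }

# Invented sample values, used only to exercise the helper.
sample = pd.DataFrame({
    "age": [22.0, 38.0, 26.0],
    "fare": [7.25, 71.28, 7.92],
    "sex": ["male", "female", "female"],
    "boarded_on": pd.to_datetime(["1912-04-10", "1912-04-10", "1912-04-10"]),
})

groups = group_columns_by_kind(sample)
print(groups)
```

The same helper could be called on the `titanic` dataframe loaded above. The dtype alone does not settle the variable type: a numeric column such as an identification number may still be better treated as categorical.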
