The pandas **`get_dummies`** function converts categorical variables into dummy or indicator variables.

> A dummy or indicator variable takes a value of 1 when a category is present and 0 when it is absent.

# How `get_dummies` works

The `get_dummies` function works as follows:

* It takes a data frame, series, or list.
* It turns each unique value present in the object into a column heading.
* For every row, it checks whether the element at that index matches each column heading.
* If it matches, the cell is encoded as 1.
* Otherwise, the cell is encoded as 0.
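
The steps above can be sketched by hand. The snippet below is a minimal illustration, not the actual pandas implementation: `manual_dummies` is a hypothetical helper written for this example, and it assumes a flat list of labels with no missing values. It passes `dtype=int` to `get_dummies` so that both results hold 0s and 1s regardless of the installed pandas version.

```python
import pandas as pd

def manual_dummies(values):
    """Hypothetical helper that mimics the listed steps for a flat list of labels."""
    # Each unique element becomes a column heading (sorted, which matches
    # the column order pandas produces for plain string labels)
    columns = sorted(set(values))
    # For each element, write 1 under the matching heading and 0 elsewhere
    rows = [[1 if value == column else 0 for column in columns]
            for value in values]
    return pd.DataFrame(rows, columns=columns)

labels = ['a', 'b', 'c', 'b']
manual = manual_dummies(labels)
builtin = pd.get_dummies(labels, dtype=int)

print(manual)
# Compare the cell values and headings of the two results
print(manual.values.tolist() == builtin.values.tolist())
print(list(manual.columns) == list(builtin.columns))
```

Both comparisons should agree for this input, which shows that the built-in function produces one column per unique label and one row per input element.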
 
# Illustration

[Illustration: an example of how the `get_dummies` function works]


# Syntax

The syntax of the `get_dummies` function is as follows:

```python
pandas.get_dummies(data, prefix=None, prefix_sep='_', dummy_na=False, columns=None, sparse=False, drop_first=False, dtype=None)
```

> Only the first parameter is compulsory. The rest are optional.

# Parameters

The table below describes the parameters:

| Parameter  | Description  |
| - | - |
| `data`  | Refers to a data frame, series, or list.  |
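| `prefix`  | String, list of strings, or dictionary of strings to prepend to the names of the new columns. It is `None` by default. |
| `prefix_sep`  | The separator or delimiter placed between the prefix and the encoded value. It is `_` by default.  |
| `dummy_na`  | Whether to add a column that indicates `NaN` values. It is `False` by default, in which case `NaN` values are ignored.  |
| `columns`  | Column names in the data frame to be encoded. It is `None` by default, in which case all columns with an `object`, `string`, or `category` dtype are encoded.  |
| `sparse`  | Whether a SparseArray should back the dummy-encoded columns. It is `False` by default.   |
| `drop_first`  | Whether to drop the first level of each encoded variable, leaving k-1 columns for k categories. It is `False` by default.  |
| `dtype`  | Data type for the new columns. Only a single dtype is allowed. The default is `bool` from pandas 2.0 onward and `uint8` in earlier versions.  |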
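
As a rough illustration of how several of these parameters combine, the sketch below encodes a single column of a small data frame. The column names and values (`color`, `size`, `price`) are made up for this example, and `dtype=int` is passed explicitly so the output contains 0s and 1s on any pandas version.

```python
import pandas as pd

# Hypothetical sample data for illustration only
df = pd.DataFrame({'color': ['red', 'blue', 'red'],
                   'size': ['S', 'M', 'L'],
                   'price': [10, 20, 30]})

# Encode only 'color'; 'size' and 'price' pass through unchanged.
# The new column names are built as <prefix><prefix_sep><value>.
encoded = pd.get_dummies(df, columns=['color'], prefix='c',
                         prefix_sep='-', dtype=int)

print(encoded)
print(list(encoded.columns))
```

The untouched columns come first in the result, followed by the new `c-blue` and `c-red` indicator columns.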

# Return value

The `get_dummies` function returns a data frame with categorical encodings (0s and 1s).
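
The short check below sketches what the return value looks like when the input is a series rather than a data frame. The labels and index values are assumptions chosen for this example.

```python
import pandas as pd

# A small series with a custom index (made-up values)
s = pd.Series(['x', 'y', 'x'], index=['r1', 'r2', 'r3'])
result = pd.get_dummies(s, dtype=int)

print(type(result).__name__)  # the result is a DataFrame even for Series input
print(result.shape)           # one row per element, one column per unique value
print(list(result.index))     # the original index is kept
```

The result has one row per input element and one column per unique value, and it keeps the index of the input series.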

# Example

The code snippet below shows how the `get_dummies` function is used in Pandas:


```python
import pandas as pd
import numpy as np

# Creating a series from a list
s = pd.Series(list('abcb'))
print(s)
print('\n')
# Encoding using the function
print(pd.get_dummies(s))
print('\n')

# With dummy_na = True
s1 = ['a', 'b', np.nan]
print("With NA column as well")
print(pd.get_dummies(s1, dummy_na=True))
print('\n')

# On a dataframe with column prefixes
df = pd.DataFrame({'A': ['a', 'b', 'a'],
                   'B': ['b', 'a', 'c'],
                   'C': [1, 2, 3]})

print(pd.get_dummies(df, prefix=['col1', 'col2']))
print('\n')

# With drop_first = True
print(pd.get_dummies(pd.Series(list('abcaa')), drop_first=True))
print('\n')
```
