How to convert continuous values into discrete values in pandas

The pandas.cut() function sorts the values of an array into discrete buckets (bins). It works only on one-dimensional input. It is useful when we have a continuous numeric variable, such as age, and want to turn it into a categorical one for statistical analysis.

Syntax

pandas.cut(x, bins, right=True, labels=None, retbins=False, precision=3, include_lowest=False, duplicates='raise', ordered=True)

Parameters

  • x: This is the input data/array to be segregated into bins. This has to be one-dimensional.
  • bins: This defines the criteria used to bucket the data. It can be specified in the following ways:
    • If bins is an integer, it defines the number of equal-width bins in the range of x.
    • If bins is a sequence of scalar values, it defines bin boundaries for the distribution.
    • If bins is an IntervalIndex, it defines the exact bins to be used.
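  • right: This is a boolean value. When True (the default), each bin includes its right edge and excludes its left edge, for example, (20, 40]. When False, each bin includes its left edge instead, for example, [20, 40).
  • labels: This specifies the labels for the returned bins. It must have the same length as the number of resulting bins. If False, only integer indicators of the bins are returned.
  • retbins: This is a boolean value. When True, the computed bin edges are returned along with the binned data.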
  • precision: This is the precision at which to store and display the bin labels.
  • include_lowest: This is a boolean value that indicates whether the first interval should be left-inclusive or not.
  • duplicates: This specifies what to do if the bin edges are not unique: 'raise' raises a ValueError, and 'drop' drops the non-unique edges.
  • ordered: This is a boolean value that indicates whether the resulting categories are ordered. When False, the result is unordered, and labels must be provided.
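
The following is a minimal sketch of how a few of these parameters behave. The ages Series and the bin edges [0, 20, 40, 80] are illustrative values chosen for this sketch, not part of the example in this article.

```python
import pandas as pd

# Illustrative data (an assumption for this sketch).
ages = pd.Series([21, 34, 54, 76, 23, 10, 23])

# Integer bins: three equal-width bins over the range of the data.
# retbins=True also returns the computed bin edges.
binned, edges = pd.cut(ages, bins=3, retbins=True)
print(edges)

# right=False closes each interval on the left instead of the right.
left_closed = pd.cut(ages, bins=[0, 20, 40, 80], right=False)
print(left_closed)

# labels=False returns integer bin indicators instead of interval labels.
codes = pd.cut(ages, bins=[0, 20, 40, 80], labels=False)
print(codes.tolist())
```

With an integer for bins, pandas computes the edges itself, so retbins=True is a convenient way to see which boundaries were used.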

Example

import pandas as pd
data = {'name': ['Sam', 'John', 'Tim', 'Tom', 'Singh', 'Song', 'Gold'],
        'age': [21, 34, 54, 76, 23, 10, 23]}
df = pd.DataFrame(data)
categories = pd.cut(df['age'], bins=[20, 40, 70], labels=["Young", "Middle Aged"])
print(categories)

Explanation

  • Line 1: We import the pandas module.
  • Lines 2–3: We define the data for the pandas DataFrame.
  • Line 4: We create a pandas DataFrame.
  • Line 5: We categorize the age column into labeled bins using the cut() function. Ages in (20, 40] are labeled Young, and ages in (40, 70] are labeled Middle Aged. Ages that fall outside every bin (here, 76 and 10) become NaN.
  • Line 6: We print the categories to the console.
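
Because the bins [20, 40, 70] do not cover every age in the data, some rows end up without a category. The sketch below shows one possible way to check for this and to widen the bins. The extra edges (0 and infinity) and the labels "Minor" and "Senior" are hypothetical additions for illustration only.

```python
import pandas as pd

ages = pd.Series([21, 34, 54, 76, 23, 10, 23])

# Same bins as the example: intervals (20, 40] and (40, 70].
narrow = pd.cut(ages, bins=[20, 40, 70], labels=["Young", "Middle Aged"])
# True marks ages that fall outside every bin.
print(narrow.isna().tolist())

# Hypothetical wider edges and extra labels so every age lands in a bin.
wide = pd.cut(
    ages,
    bins=[0, 20, 40, 70, float("inf")],
    labels=["Minor", "Young", "Middle Aged", "Senior"],
)
print(wide.tolist())
```

Whether open-ended outer bins are appropriate depends on the data. Leaving out-of-range values as NaN can be the better choice when they should be excluded from the analysis.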

Copyright ©2026 Educative, Inc. All rights reserved