Trusted answers to developer questions

How to make a histogram in pandas

Educative Answers Team

pandas is a popular Python-based data analysis toolkit that can be imported using:

import pandas as pd

It offers a wide range of utilities, from parsing multiple file formats to converting an entire data table into a NumPy array. This breadth makes pandas a trusted ally in data science and machine learning.

pandas can also create several types of data analysis graphs. One such graph is the histogram: a graph of vertical bars in which the area of each bar is proportional to the frequency of a class and the width of each bar equals the class interval.
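
To make the definition concrete, the sketch below counts how many values fall into each class interval, which is the bookkeeping a histogram performs before any bars are drawn. The scores are hypothetical sample data, and NumPy's histogram function is used here only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: ten exam scores
scores = pd.Series([52, 55, 61, 64, 68, 70, 71, 79, 85, 93])

# Split the range 50-100 into five class intervals of width 10
# and count how many scores fall into each interval
counts, edges = np.histogram(scores.to_numpy(), bins=5, range=(50, 100))

# Print each class interval next to its frequency
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.0f}-{right:.0f}: {count}")
```

Each count becomes the height of one bar, and each pair of neighbouring edges gives that bar's width.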

The signature of DataFrame.hist, with its default values, is:

DataFrame.hist(column=None, by=None, grid=True, xlabelsize=None, xrot=None, ylabelsize=None, yrot=None, ax=None, sharex=False, sharey=False, figsize=None, layout=None, bins=10, backend=None, legend=False, **kwargs)


  • column: str or list of str - The columns that should be plotted. If omitted, all numeric columns are plotted.

  • by: object - Used to form histograms for separate groups.

  • grid: bool - Whether or not to show the axis grid lines.

  • xlabelsize: int - The fontsize of the x-axis labels.

  • xrot: float - The rotation for the x-axis labels.

  • ylabelsize: int - The fontsize of the y-axis labels.

  • yrot: float - The rotation for the y-axis labels.

  • ax: Matplotlib axes object - The axes on which to plot the histogram.

  • sharex: bool, default True if ax is None, else False - In case subplots = True, share the x-axis and set some x-axis labels to invisible.

  • sharey: bool, default False - In case subplots = True, share the y-axis and set some y-axis labels to invisible.

  • figsize: tuple (width, height) - The size of the output image.

  • layout: tuple (rows, columns) - The layout in which the output graphs must be, for example, (4, 1) gives the figures in a single column and four rows.

  • bins: int or sequence, default 10 - Number of histogram bins to be used. If a sequence is given, its values are used as the bin edges.

  • backend: str - Backend to use instead of the backend specified in the option plotting.backend, for instance, 'matplotlib'. Alternatively, set pd.options.plotting.backend to specify the plotting backend for the whole session.

  • legend: bool - Whether or not to show the legend.

  • **kwargs: All other plotting keyword arguments, which are passed on to matplotlib.pyplot.hist().
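
As a rough illustration of how several of these parameters combine, the sketch below builds a small DataFrame of randomly generated values and plots it twice. The column names height, weight, and group are made up for this example, and the Agg backend is selected only so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import numpy as np
import pandas as pd

# Hypothetical data: two numeric columns and one grouping column
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "height": rng.normal(170, 10, 200),
    "weight": rng.normal(70, 12, 200),
    "group": rng.choice(["a", "b"], 200),
})

# Plot two columns side by side: 15 bins each, no grid lines,
# x-axis labels rotated by 45 degrees, and a shared y-axis
axes = df.hist(
    column=["height", "weight"],
    bins=15,
    grid=False,
    xrot=45,
    sharey=True,
    figsize=(8, 3),
    layout=(1, 2),
)

# Plot one column split into a separate histogram per group
grouped = df.hist(column="height", by="group", bins=10)
```

The first call returns the Matplotlib axes arranged according to layout, so individual plots can be adjusted afterwards, for example with axes[0, 0].set_title(...).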


The following code shows how a histogram can be created with pandas. Change the parameters to see how the output varies.

# import libraries
import pandas as pd
import matplotlib.pyplot as plt

# load the CSV file into a DataFrame
df = pd.read_csv('dataset.csv')

# create one histogram per numeric column, using 7 bins each
histogram = df.hist(bins=7)

# display the figure (needed when running as a script)
plt.show()
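
Since dataset.csv is not included here, the following self-contained sketch uses randomly generated values instead. The column name score and the chosen bin edges are assumptions made for illustration. It compares an integer bins value with a sequence of bin edges.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption for headless runs
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: 500 scores drawn uniformly from [0, 100)
rng = np.random.default_rng(42)
df = pd.DataFrame({"score": rng.uniform(0, 100, 500)})

# An integer asks for that many equal-width bins
ax_int = df.hist(column="score", bins=7)[0, 0]

# A sequence gives the bin edges explicitly: four bins of width 25
ax_seq = df.hist(column="score", bins=[0, 25, 50, 75, 100])[0, 0]

# Each bar is a rectangle patch; its height is the bin's frequency
bars_int = len(ax_int.patches)
bars_seq = len(ax_seq.patches)
total_seq = sum(p.get_height() for p in ax_seq.patches)

plt.close("all")  # release the figures once they are no longer needed
```

Explicit edges are useful when the class intervals should line up with meaningful boundaries, such as grade bands, rather than being derived from the range of the data.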


Copyright ©2022 Educative, Inc. All rights reserved
