Histograms and Probability Density Function

Explore how to represent data with histograms and understand the probability density function of distributions. Learn to use matplotlib's hist() for plotting and scipy's norm.pdf to overlay normal distribution curves. This lesson helps you visualize and analyze random variable data with practical Python tools.

Representing data

One of the most common ways to represent a data set is to draw a histogram. To build one, you divide the range of the data into intervals, called bins, and count how many data points fall within each interval, for example, how many lie between 5 and 6. The bar graph of these counts is the histogram. The function that computes and plots a histogram is hist(), which is part of the matplotlib package. The simplest way to plot a histogram is to let hist() decide which bins to use; by default it uses nbin = 10 bins, a number you can change with the bins keyword.
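The idea of counting points per interval and then drawing the bars can be sketched as follows. This is a minimal illustration, not part of the original lesson: the data set (100 normally distributed values), the seed, and the variable names `rng`, `data`, and `in_bin` are assumptions made for the example.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical data set: 100 values drawn from a normal distribution
rng = np.random.default_rng(seed=0)
data = rng.normal(loc=5.0, scale=2.0, size=100)

# Count by hand how many data points fall in one interval, here [5, 6)
in_bin = np.sum((data >= 5) & (data < 6))
print("data points between 5 and 6:", in_bin)

# Let hist() choose the bins (10 by default) and draw the bar graph
plt.hist(data)
plt.xlabel("value")
plt.ylabel("number of data points")
# Call plt.show() to display the figure when running this as a script
```

The manual count covers only the single interval from 5 to 6, whereas hist() repeats the same counting for every bin it chooses, so its bin limits will generally not line up with whole numbers.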

hist() even figures out where to put the limits of the bins. The hist() function creates a histogram graph and returns a tuple of three items:

  1. The first item is an array of length nbin with the number of data points in each bin.
  2. The second item is an
...