Python Bokeh histogram
Bokeh is a Python library used for creating interactive visualizations in a web browser. It provides powerful tools that offer flexibility, interactivity, and scalability for exploring various data insights.
What is a histogram?
A histogram is a graphical representation of a grouped frequency distribution with continuous classes. Its rectangles are adjacent to one another because the base of each rectangle spans the interval between two class boundaries, and its height shows the frequency (or density) of that class.
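As a rough illustration of grouping values into continuous classes, the following sketch counts made-up measurements per class with numpy. The data and the class boundaries are assumptions chosen only for this example.

```python
import numpy as np

# Made-up heights in centimetres, used purely for illustration.
heights_cm = np.array([152, 158, 161, 164, 167, 169, 171, 175, 178, 183])

# Class boundaries: four continuous classes, each 10 cm wide.
edges = np.array([150, 160, 170, 180, 190])

# counts[i] is the number of values in [edges[i], edges[i + 1]);
# the last class also includes its right boundary.
counts, _ = np.histogram(heights_cm, bins=edges)

for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left}-{right} cm: {count}")
```

Each count becomes the height of one rectangle, and because the classes share boundaries, the rectangles touch.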
Real-life applications
Histograms are widely used in industry and research to examine how data values are distributed across categories in various domains.
Required imports
import numpy as np
from bokeh.io import output_file, save
from bokeh.plotting import figure, show
numpy: To generate random data.
bokeh.io: To control the output and display of the plots. We specifically import the output_file and save functions from it.
bokeh.plotting: To create and customize plots without working directly with the lower-level Bokeh models. We specifically import the figure and show functions from it.
Example code
import numpy as np
from bokeh.io import output_file, save
from bokeh.plotting import figure, show

# generate sample data
rng = np.random.default_rng()
sampleData = rng.normal(loc=0, scale=1, size=500)

# create plot
myPlot = figure(width=670, height=400, toolbar_location=None,
                title="Impact of spice level on CRC")

# histogram
bins = np.linspace(-4, 4, 30)
hist, edges = np.histogram(sampleData, density=True, bins=bins)
myPlot.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:],
            fill_color="purple", line_color="white",
            legend_label="500 random samples")

# labels
myPlot.y_range.start = 0
myPlot.xaxis.axis_label = "Spice level"
myPlot.yaxis.axis_label = "CRC"

output_file("output.html")
show(myPlot)
Code explanation
Lines 1–3: Import the required libraries and modules.
Line 6: Create a random number generator using the random.default_rng() function from numpy.
Line 7: Generate an array of 500 samples using normal(), passing the mean (loc), standard deviation (scale), and size as parameters. Note that, in this case, the samples are drawn from a normal distribution.
Lines 10–11: Create myPlot using the figure() function and pass all the specifications as parameters. Set the width and height, hide the toolbar with toolbar_location=None, and specify the title for the plot.
Line 14: Specify the x-axis range and the bin edges using the linspace() function from numpy and assign the result to the bins variable. Thirty evenly spaced edges between -4 and 4 produce 29 bins.
Line 15: Calculate the histogram using the histogram() function from numpy by passing the sample data, density, and bins as parameters. In this case, density is set to True to normalize the histogram so that its total area equals 1.
Lines 16–18: Draw the histogram bars using the quad() function, passing the coordinates of each rectangle, the fill color, the line color, and the legend label as parameters. These styling options enhance the visual representation and can be modified as needed.
Lines 21–23: Set the starting point of the y-axis to 0 and assign the x-axis and y-axis labels to myPlot.
Lines 25–26: Set the output file to output.html, where the plot will be written, and use show() to display the created plot.
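To make the effect of density=True more concrete, the sketch below compares raw counts with the normalized values and checks that the bar areas add up to 1. The fixed seed is an assumption added only so that repeated runs give the same numbers; the original example does not seed the generator.

```python
import numpy as np

# Seed is an assumption for repeatability; the example above is unseeded.
rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=0, scale=1, size=500)

# 30 evenly spaced edges between -4 and 4 define 29 bins.
bins = np.linspace(-4, 4, 30)

counts, edges = np.histogram(samples, bins=bins)                 # raw frequencies
density, _ = np.histogram(samples, bins=bins, density=True)      # normalized heights

# With density=True, each height is count / (total count in range * bin width),
# so height * width summed over all bins equals 1.
widths = np.diff(edges)
area = np.sum(density * widths)

print("number of edges:", len(edges))
print("number of bins:", len(density))
print("total area under the bars:", area)
```

Because the bar areas sum to 1, the normalized heights can be compared directly with a probability density curve, whereas raw counts depend on the sample size and bin width.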
Code output
A histogram of normally distributed samples is written to output.html and displayed in the browser, with 29 purple-filled bins spanning the range (-4, 4) on the x-axis, as specified in the code.
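The example imports save but only calls show(). As a minimal sketch of the alternative, save() can write the HTML file without opening a browser, which may be preferable in scripts or on a server. The output path, the seed, and the plot title below are assumptions made for this illustration.

```python
import os
import tempfile

import numpy as np
from bokeh.io import output_file, save
from bokeh.plotting import figure

# Same kind of histogram as in the example; the seed is an assumption.
rng = np.random.default_rng(seed=0)
hist, edges = np.histogram(rng.normal(loc=0, scale=1, size=500),
                           density=True, bins=np.linspace(-4, 4, 30))

plot = figure(width=670, height=400, title="Histogram written with save()")
plot.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:],
          fill_color="purple", line_color="white")

# Hypothetical output location; any writable path would do.
out_path = os.path.join(tempfile.gettempdir(), "histogram_saved.html")
output_file(out_path, title="Histogram")

# save() writes the file configured by output_file() and does not open a browser.
save(plot)
print("written to", out_path)
```

The resulting file can then be opened manually or served like any other static HTML page.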
Relation between a histogram and PDF
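PDF stands for probability density function, which describes the probability distribution of a continuous random variable. A histogram computed with density=True is an empirical estimate of the PDF: the bar areas sum to 1, so the bar heights can be compared directly with the density curve. Comparing the two helps examine properties of the data such as its skewness and spread. The above example code shows a simple histogram of random data generated using the normal() function.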
Common Query
Can we modify the histogram and add the probability density function to it?
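One possible way to do this is sketched below: compute the normal density on a fine grid and draw it over the bars with line(). The helper normal_pdf(), the seed, the curve color, and the axis labels are assumptions introduced for this illustration and are not part of the original example.

```python
import numpy as np
from bokeh.plotting import figure


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and std sigma at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))


mu, sigma = 0.0, 1.0
rng = np.random.default_rng(seed=0)  # seed is an assumption for repeatability
sample_data = rng.normal(loc=mu, scale=sigma, size=500)

# Normalized histogram, as in the example (30 edges, 29 bins).
bins = np.linspace(-4, 4, 30)
hist, edges = np.histogram(sample_data, density=True, bins=bins)

# Theoretical density evaluated on a finer grid over the same range.
x = np.linspace(-4, 4, 200)
pdf = normal_pdf(x, mu, sigma)

plot = figure(width=670, height=400, toolbar_location=None,
              title="Histogram with normal PDF overlay")
plot.quad(top=hist, bottom=0, left=edges[:-1], right=edges[1:],
          fill_color="purple", line_color="white",
          legend_label="500 random samples")
plot.line(x, pdf, line_color="orange", line_width=3,
          legend_label="Normal PDF")

plot.y_range.start = 0
plot.xaxis.axis_label = "x"
plot.yaxis.axis_label = "Density"
```

Passing plot to show(), after a call to output_file(), would display the result as in the original example. The overlay only lines up with the bars because the histogram uses density=True; with raw counts, the curve would need to be rescaled by the sample size and bin width.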