Testing for Normality in Our Data
Learn about normal and error distributions along with normality tests.
How do we know if our data fits a normal distribution? This is an essential but somewhat tricky question, and there are several ways to address it. The first and most important step is to plot a histogram of the data and look at the spread of values.
In addition, we’ll discuss two different approaches:
- We’ll look at using normality tests.
- We’ll look at using maximum likelihood to estimate the most appropriate error distribution.
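As a rough preview of both approaches, here is a minimal sketch in base R. It uses a simulated right-skewed sample rather than the lesson's dataset, and the variable names (x, sw, ll_normal, ll_lognormal) are made up for illustration. The choice of the Shapiro-Wilk test and of the normal and lognormal candidates is an assumption, not necessarily what the lesson uses later.

```r
# Simulated right-skewed sample (assumption: stand-in data, not from the lesson)
set.seed(1)
x <- rlnorm(200, meanlog = 3, sdlog = 0.8)

# Approach 1: a normality test. The null hypothesis is that x is normally
# distributed, so a small p-value is evidence against normality.
sw <- shapiro.test(x)
sw$p.value

# Approach 2: maximum likelihood. Fit two candidate distributions using their
# closed-form maximum likelihood estimates, then compare the log-likelihoods.
mu_hat <- mean(x)
sd_hat <- sqrt(mean((x - mu_hat)^2))   # the ML estimate divides by n, not n - 1
ll_normal <- sum(dnorm(x, mean = mu_hat, sd = sd_hat, log = TRUE))

mlog_hat <- mean(log(x))
slog_hat <- sqrt(mean((log(x) - mlog_hat)^2))
ll_lognormal <- sum(dlnorm(x, meanlog = mlog_hat, sdlog = slog_hat, log = TRUE))

# Both candidates have two parameters, so the higher log-likelihood fits better
c(normal = ll_normal, lognormal = ll_lognormal)
```

Because the two candidates have the same number of parameters, their log-likelihoods can be compared directly. With differing parameter counts, a penalized measure such as AIC would be the more usual comparison.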
Looking at the data
If we want to check if our data is normal or not, the first and easiest thing to do is to plot the histogram of values. For this purpose, we can use either the hist() function or the qplot() function found in the ggplot2 package. Either way, a histogram should give us a general sense of what our data looks like. Remember that normally distributed data will have a relatively even spread of values above and below the mean, as shown in the figure generated by the code below.
Let’s look at the variables Age.FromEmergence and SVL.final and plot them using the qplot() function from the ggplot2 package.
library(ggplot2)
library(cowplot)

SVL.final.dist <- qplot(data = RxP.byTank, x = SVL.final,
                        ylab = "occurrences", geom = "histogram", bins = 8)
Age.FromEmergence.dist <- qplot(data = RxP.byTank, x = Age.FromEmergence,
                                ylab = "occurrences", geom = "histogram", bins = 10)

plot_grid(SVL.final.dist, Age.FromEmergence.dist, ncol = 2)
Note: Remember that we have to load a package with the library() function before we can use the functions it contains.
Notice in the code given above that we’re using the bins= argument to specify how finely the histogram breaks up the data. Also notice that we’ve loaded the cowplot package, which contains the plot_grid() function. This function allows us to plot multiple different figures together in a single window.
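The same ideas can be tried with the base R hist() function mentioned earlier. The sketch below is only illustrative: svl_sim and age_sim are simulated stand-ins (an assumption, not the real RxP.byTank data), and the sample size of 78 is arbitrary. It shows how the number of bins changes the picture, and how par(mfrow = ...) can place two base R plots side by side, much as plot_grid() does for ggplot2 figures.

```r
# Simulated stand-ins for the lesson's variables (assumption: not the real data)
set.seed(42)
svl_sim <- rnorm(78, mean = 20, sd = 1.5)   # roughly symmetric values
age_sim <- rexp(78, rate = 1 / 30)          # strongly right-skewed values

# Compute histograms without drawing them to compare coarse and fine binning.
# In hist(), a single number for breaks is only a suggestion, so the actual
# number of bins may differ from the value requested.
coarse <- hist(svl_sim, breaks = 5, plot = FALSE)
fine <- hist(svl_sim, breaks = 20, plot = FALSE)
length(coarse$counts)
length(fine$counts)

# Draw the two histograms side by side in one window
op <- par(mfrow = c(1, 2))
hist(svl_sim, breaks = 8, main = "Simulated SVL",
     xlab = "SVL", ylab = "occurrences")
hist(age_sim, breaks = 10, main = "Simulated age",
     xlab = "Age", ylab = "occurrences")
par(op)   # restore the previous plotting layout
```

Too few bins can hide a tail, while too many can make a small sample look ragged, so it is worth trying a few values before judging the shape.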
Looking at the two histograms that were generated by the code is quite informative. The SVL data appears almost normal, although the data has a slight tail to the right. However, the Age data is very skewed. Most froglets emerged very early in the period of metamorphosis, but the ...