# Data Processing

In this lesson, we will learn about data processing tools using arrays.

NumPy arrays provide a wide range of tools for manipulating raw data and extracting meaning from it.

## Mean and standard deviation

Suppose you measure the peak current passing through a transformer every two hours over one day, giving 12 readings. Because of temperature changes, the peak value fluctuates from one reading to the next. To get a single representative value for the data, we calculate the **mean** of all 12 readings. The mathematical formula for the mean is given below:

$\overline{a}=\frac{1}{n}\sum_{i=1}^n{a_i}$

where $a_{i}$ are the elements of the data set, $n$ is the total number of elements, and $\overline{a}$ is the mean.
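As a quick check of the formula, the sketch below computes the mean of a small, made-up set of readings twice: once by summing the elements and dividing by $n$, and once with `np.mean`. The values in `readings` are illustrative assumptions, not measured data.

```python
import numpy as np

# Hypothetical peak-current readings (illustrative values only)
readings = np.array([10.0, 12.0, 11.0, 13.0])

# Mean from the formula: sum of all elements divided by their count
n = readings.size
mean_by_formula = readings.sum() / n

# Mean using NumPy's built-in function
mean_numpy = np.mean(readings)

print(mean_by_formula, mean_numpy)
```

Both approaches give the same value; `np.mean` simply saves writing the sum and division by hand.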

The data’s **standard deviation** and **variance** measure how much the peak value fluctuates around the mean. The mathematical formula for standard deviation is given below:

$\sigma=\sqrt{\frac{1}{n}\sum_{i=1}^{n}(a_i-\overline{a})^2}$

$\text{variance}=\sigma^2$

where $a_{i}$ are elements in the data set, $n$ is the total number of elements, $\overline{a}$ is the mean, and $\sigma$ is the standard deviation. Variance is simply the square of the standard deviation.

In the example below, we will see how easily we can compute mean and standard deviation.
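A minimal sketch of such an example, assuming 12 made-up readings (the values in `peak_current` are hypothetical and stand in for real measurements): `np.mean`, `np.std`, and `np.var` each take the array and return a single number. By default, `np.std` and `np.var` divide by $n$ (`ddof=0`), which matches the formula above.

```python
import numpy as np

# Hypothetical peak-current readings, one every two hours (12 in total)
peak_current = np.array([10.2, 9.8, 10.5, 10.1, 9.9, 10.4,
                         10.0, 9.7, 10.3, 10.6, 9.6, 10.1])

mean = np.mean(peak_current)      # average of the readings
std = np.std(peak_current)        # population standard deviation (ddof=0)
variance = np.var(peak_current)   # population variance (ddof=0)

# Cross-check the standard deviation against the formula written out by hand
std_by_formula = np.sqrt(np.sum((peak_current - mean) ** 2) / peak_current.size)

print("mean:", mean)
print("standard deviation:", std)
print("variance:", variance)
print("matches formula:", np.isclose(std, std_by_formula))
```

If the readings are instead treated as a sample drawn from a larger population, passing `ddof=1` to `np.std` or `np.var` divides by $n-1$ rather than $n$.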
