Trusted answers to developer questions

Related Tags

hypothesis
testing
data science

# What is hypothesis testing?

Hassaan Waqar


Hypothesis testing is a statistical method for assessing a hypothesis put forward by researchers or analysts. It tests whether findings drawn from sample data can be generalized to the wider population.

Several methods of hypothesis testing exist. The right choice depends on the nature of the data and the purpose of the analysis.

## How does hypothesis testing work?

Hypothesis testing begins by stating a null hypothesis and an alternate hypothesis.

The null hypothesis defines the status quo. It states that there is no significant difference between data obtained before and after an intervention or condition, and that any difference occurs merely by chance.

The null hypothesis is denoted by $H_{0}$.

The alternate hypothesis states that there is a significant difference between data obtained before and after an intervention or condition, and that this difference is unlikely to have occurred by chance alone.

The alternate hypothesis is denoted by $H_{a}$.

The null and alternate hypotheses are mutually exclusive: only one of them can be true.

Data is then collected and analyzed. Finally, a decision is made regarding the null hypothesis: either it is rejected, or the researchers fail to reject it given the data.

The illustration below gives a summary of the hypothesis testing cycle:

[Figure: Hypothesis testing cycle]

## Example

Suppose that a researcher wants to determine whether a coin toss is equally likely to result in heads or tails. The null and alternate hypotheses are then stated as follows:

Null hypothesis: There is an equal chance of getting heads or tails. The counts of heads and tails obtained through an experiment should not differ significantly from an even split.

Alternate hypothesis: There is an unequal chance of getting heads or tails. The counts of heads and tails obtained through an experiment should differ significantly from an even split.
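The two statements can also be written in symbols. As an illustrative sketch, let $\theta$ be the probability that a single toss lands heads (the symbol $\theta$ is an assumption introduced here and does not appear in the original statement):

$$
H_{0}: \theta = 0.5 \qquad \text{versus} \qquad H_{a}: \theta \neq 0.5
$$

Written this way, it is clear that the two hypotheses cannot both hold and that the test is two-sided: a surplus of heads or a surplus of tails both count as evidence against $H_{0}$.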

The researcher then carries out an experiment. A coin is tossed 500 times, and the numbers of heads and tails are recorded. The results are then analyzed.

An equal chance of heads and tails would mean that around 50% of the tosses result in heads and around 50% result in tails.
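To get a feel for what "around 50%" means, the experiment can be simulated. The following is a minimal sketch, not a prescribed method: the helper `toss_coin`, its parameters, and the seed value are hypothetical names chosen for illustration. It also computes the expected number of heads and its standard deviation under the null hypothesis, using the standard formulas for a binomial distribution.

```python
import math
import random


def toss_coin(n_tosses, p_heads=0.5, seed=None):
    """Simulate n_tosses coin tosses and return the (heads, tails) counts.

    toss_coin is a hypothetical helper written for this example.
    """
    rng = random.Random(seed)
    heads = sum(1 for _ in range(n_tosses) if rng.random() < p_heads)
    return heads, n_tosses - heads


n = 500  # number of tosses, as in the experiment described above

# Under the null hypothesis the number of heads follows Binomial(n, 0.5).
expected_heads = n * 0.5                # mean: 250
std_heads = math.sqrt(n * 0.5 * 0.5)    # standard deviation: about 11.2

# The seed is arbitrary; it only makes the simulated run repeatable.
heads, tails = toss_coin(n, seed=42)
print(f"observed: heads={heads}, tails={tails}")
print(f"expected heads under H0: {expected_heads:.0f} +/- {std_heads:.1f}")
```

Even a perfectly fair coin rarely produces exactly 250 heads. Counts within one or two standard deviations of 250 are entirely ordinary, which is why a formal criterion is needed to decide what counts as a significant difference.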

If the results differ significantly from an even split, one outcome will make up a much greater proportion of the tosses than the other. In that case, we say that the result is significantly different, and we reject the null hypothesis. If the results are close to an equal ratio, we say that there is no significant difference, and we fail to reject the null hypothesis.

The criterion for rejecting or failing to reject the null hypothesis depends on two values: the p-value, which is derived from the analysis, and the significance level, which is chosen before the data is analyzed. The null hypothesis is rejected when the p-value does not exceed the significance level.
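For the coin example, one common way to obtain a p-value is an exact two-sided binomial test. The sketch below is one possible approach under stated assumptions, not the only valid test: the function names `binomial_two_sided_p_value` and `decide` are hypothetical, and the significance level of 0.05 is a conventional choice that the article does not specify.

```python
from math import comb


def binomial_two_sided_p_value(heads, n_tosses, p_heads=0.5):
    """Exact two-sided p-value for a binomial test (hypothetical helper).

    Sums the probability, under H0, of every outcome that is no more
    likely than the observed number of heads. Intended for modest
    n_tosses; very large values would overflow the float conversion.
    """
    pmf = [
        comb(n_tosses, k) * p_heads**k * (1 - p_heads) ** (n_tosses - k)
        for k in range(n_tosses + 1)
    ]
    observed = pmf[heads]
    # A small relative tolerance guards against floating-point ties.
    total = sum(p for p in pmf if p <= observed * (1 + 1e-7))
    return min(1.0, total)


def decide(p_value, alpha=0.05):
    """Apply the decision rule; alpha = 0.05 is an assumed convention."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# Three hypothetical outcomes of the 500-toss experiment.
for observed_heads in (250, 260, 290):
    p_value = binomial_two_sided_p_value(observed_heads, 500)
    print(observed_heads, round(p_value, 4), decide(p_value))
```

With these illustrative counts, 250 or 260 heads out of 500 give p-values well above 0.05, so we fail to reject the null hypothesis, while 290 heads gives a p-value far below 0.05, so the null hypothesis is rejected.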
