How to perform a z-test in Python

A z-test is a statistical method used to determine whether a sample mean differs significantly from a known population mean (or whether the means of two groups differ), assuming the sample size is large and the population variance is known. But before delving into it, let’s first understand what inferential statistics is:

Inferential statistics is a branch of statistics that allows us to make inferences or predictions about a population based on sample data. It involves drawing conclusions about an entire population using data collected from a subset of that population. Quantifying the level of confidence associated with the estimated population parameters helps researchers assess the reliability and uncertainty of these predictions.

What is a z-test?

The z-test is a hypothesis test that compares a sample mean or proportion to a known population mean or proportion when the population standard deviation is known and the sample size is 30 or more. It calculates the z-statistic, which measures how many standard errors the sample statistic lies from the hypothesized population value.

Example

Imagine you’re a teacher, and you know that the average score (mean) for all students who took a certain exam last year was 75. This average score of 75 is the population mean.

This year, you have a class of 40 students, and their average score on the same exam is 78. You want to know if this higher average score is just due to random chance or if it’s a significant difference that suggests this year's students did better overall.

To figure this out, you can use a z-test. The z-test will compare your class’s average score (78) to the population mean (75) to see if the difference is big enough to be significant. The test considers how spread out the scores are (using the population standard deviation) and how many students are in the class, and calculates a z-statistic, which tells you how many standard errors the class average (78) lies above the population mean (75).

If the z-statistic shows a large enough difference, you might conclude that your class truly performed better than average, not just by chance.
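To make the arithmetic concrete, here is a minimal sketch of the calculation for the class example. The example above does not state the exam's population standard deviation, so the value of 10 points used here is purely an assumption for illustration. NormalDist comes from Python's standard statistics module.

```python
from math import sqrt
from statistics import NormalDist

population_mean = 75   # last year's average score (from the example)
sample_mean = 78       # this year's class average (from the example)
sample_size = 40       # number of students in the class (from the example)
population_std = 10    # ASSUMED for illustration; not stated in the example

# Standard error: how much the mean of 40 scores varies from sample to sample
standard_error = population_std / sqrt(sample_size)

# z-statistic: distance of the class mean from 75, measured in standard errors
z = (sample_mean - population_mean) / standard_error

# Right-tailed p-value: probability of a class mean at least this high
# if this year's students were no different from last year's
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

A different standard deviation would change the z-statistic, which is why the z-test requires it to be known in advance.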

Note: The z-test is typically performed on numerical (quantitative) data, such as heights, weights, or test scores, to compare means or proportions.

The role of the z-test in inferential statistics

The z-test is a statistical method used to determine whether there's a significant difference between a sample statistic (like the sample mean or proportion) and a known population parameter. It helps assess whether observed differences are due to chance or if they indicate a real effect.

Assumptions of z-test

Before performing a z-test, it is important to understand the assumptions that must be met to ensure valid and reliable results.

Here are the key assumptions:

Data follows a normal distribution: The z-test assumes that the analyzed data follows a normal distribution, which is characterized by symmetry, a bell-shaped curve, and defined by its mean and standard deviation.

It is essential to assess the normality of the data, especially for smaller samples.

Population standard deviation is known: The z-test requires prior knowledge of the population standard deviation. This assumption is typically met when population parameters (numerical values that describe a characteristic of an entire population) are well-defined and accurately known.

If the population standard deviation is known, a one-sample z-test can be easily applied; otherwise, a t-test should be used.

Random sampling from the population: The z-test assumes that the sample data is obtained from the population of interest through a random sampling process, that is, a technique for selecting a subset of individuals or items from a larger population. This ensures that the sample is representative of the population, minimizing bias and enabling valid statistical inferences.
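One way to check the normality assumption in practice is a Shapiro-Wilk test. The sketch below is only an illustration: it assumes SciPy is installed, and the 40 exam scores are simulated (hypothetical) rather than real data. Other checks, such as a Q-Q plot, are equally reasonable.

```python
import numpy as np
from scipy import stats

# Hypothetical sample: 40 simulated exam scores (seeded for reproducibility)
rng = np.random.default_rng(seed=0)
scores = rng.normal(loc=75, scale=10, size=40)

# Shapiro-Wilk test: the null hypothesis is that the data come from
# a normal distribution
statistic, p_value = stats.shapiro(scores)

alpha = 0.05
if p_value < alpha:
    print(f"W = {statistic:.3f}, p = {p_value:.3f}: normality is doubtful")
else:
    print(f"W = {statistic:.3f}, p = {p_value:.3f}: no evidence against normality")
```

A large p-value does not prove that the data are normal; it only means this test found no evidence against it.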

The illustration below describes when to use a z-test:

The one-sample z-statistic is computed as $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$. Below is an explanation of the terms used in the formula:

$\bar{x}$ : It denotes the sample mean, which is the average of the observations in your sample.
$\mu$ : It denotes the population mean.
$\sigma$ : It denotes the population standard deviation, which measures the dispersion of population data.
$n$ : It denotes the sample size, which is the number of observations in your sample.
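These four terms combine into the usual one-sample z-statistic, (x̄ − μ) / (σ / √n). A minimal sketch of that calculation follows; the function name z_statistic and the numbers passed to it are hypothetical and chosen only for illustration.

```python
from math import sqrt


def z_statistic(sample_mean, population_mean, population_std, sample_size):
    """Return (sample_mean - population_mean) / (population_std / sqrt(n))."""
    if population_std <= 0:
        raise ValueError("population_std must be positive")
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    # The denominator is the standard error of the sample mean
    standard_error = population_std / sqrt(sample_size)
    return (sample_mean - population_mean) / standard_error


# Hypothetical values: x̄ = 52, μ = 50, σ = 8, n = 64
z = z_statistic(sample_mean=52.0, population_mean=50.0,
                population_std=8.0, sample_size=64)
print(f"z = {z:.2f}")
```

Here the standard error is 8 / √64 = 1, so the sample mean sits two standard errors above the population mean.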

Two types of one-tailed, one-sample z-tests are defined below:

Right-tailed z-test

A right-tailed test determines whether a sample mean exceeds the population mean. For instance, if a researcher aims to assess whether their new medication increases the average recovery rate compared to the existing one, a right-tailed z-test should be conducted.

Left-tailed z-test

A left-tailed z-test determines whether a sample mean is less than the population mean. For instance, if a researcher aims to assess whether a new medication decreases the average recovery rate compared to the existing one, a left-tailed test would be appropriate.
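The direction of the test only changes which tail of the standard normal distribution is used for the p-value. The sketch below shows this for a given z-statistic using Python's standard library. The function name p_value_from_z and the labels "larger", "smaller", and "two-sided" are assumptions made for this illustration, and the z value of 1.5 is hypothetical.

```python
from statistics import NormalDist


def p_value_from_z(z, alternative="two-sided"):
    """Convert a z-statistic into a p-value for the chosen alternative."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "larger":      # right-tailed: area to the right of z
        return 1 - std_normal.cdf(z)
    if alternative == "smaller":     # left-tailed: area to the left of z
        return std_normal.cdf(z)
    if alternative == "two-sided":   # both tails beyond |z|
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError(f"unknown alternative: {alternative!r}")


z = 1.5  # hypothetical z-statistic
for alternative in ("larger", "smaller", "two-sided"):
    print(f"{alternative}: p = {p_value_from_z(z, alternative):.4f}")
```

For the same z-statistic, the right-tailed and left-tailed p-values always add up to 1, so a result that strongly supports one direction offers no support for the other.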

Here is an illustration that describes two types of z-tests:

Code example

Consider a case study in which a pharmaceutical company has developed a new medication to improve recovery rates for patients with a chronic illness. The existing medication has an average recovery rate of 60% and a standard deviation of 10%.

The company conducted a clinical trial with 30 patients using the new medication and recorded their recovery rates.

Research questions:

Right-tailed z-test: Does the new medication improve the recovery rate for patients with chronic illness compared to the existing medication?
Left-tailed z-test: Is there a risk that the new medication could result in a lower recovery rate for patients than the existing medication?
Two-tailed z-test: Is there a significant difference in the recovery rates of patients using the new medication compared to those using the existing medication?

We will perform right-tailed, left-tailed, and two-tailed z-tests on the data. Below, we assume data on recovery rates for 30 patients treated with the new medication:

import numpy as np
import statsmodels.stats.weightstats as st

# Assumed sample data values of recovery rates for 30 patients
new_medication_recovery_rates = [
    0.62, 0.65, 0.67, 0.64, 0.66, 0.68, 0.69, 0.70, 0.71, 0.73,
    0.72, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82,
    0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91, 0.92
]

# Known values
population_mean = 0.6  # Known average recovery rate for the existing medication
population_std = 0.1   # Known standard deviation for the existing medication (not passed to st.ztest, which estimates it from the sample)

# Calculating the z-test for right-tailed test
test_statistic_right, p_value_right = st.ztest(new_medication_recovery_rates, value=population_mean, alternative='larger')

# Calculating the z-test for left-tailed test
test_statistic_left, p_value_left = st.ztest(new_medication_recovery_rates, value=population_mean, alternative='smaller')

# Calculating the z-test for two-tailed test
test_statistic_two_tailed, p_value_two_tailed = st.ztest(new_medication_recovery_rates, value=population_mean)

# Printing the results for right-tailed test
print(f"Right-Tailed Z-Test: Z-Score = {test_statistic_right:.4f}, P-Value = {p_value_right:.4f}\n")

# Printing the results for left-tailed test
print(f"Left-Tailed Z-Test: Z-Score = {test_statistic_left:.4f}, P-Value = {p_value_left:.4f}\n")

# Printing the results for two-tailed test
print(f"Two-Tailed Z-Test: Z-Score = {test_statistic_two_tailed:.4f}, P-Value = {p_value_two_tailed:.4f}\n")

# Drawing conclusions based on the p-values
alpha = 0.05

if p_value_right < alpha:
    print("Right-Tailed Test: Reject the null hypothesis, there is evidence that the new medication increases the average recovery rate.\n")
else:
    print("Right-Tailed Test: Fail to reject the null hypothesis, there is no evidence that the new medication increases the average recovery rate.\n")

if p_value_left < alpha:
    print("Left-Tailed Test: Reject the null hypothesis, there is evidence that the new medication decreases the average recovery rate.\n")
else:
    print("Left-Tailed Test: Fail to reject the null hypothesis, there is no evidence that the new medication decreases the average recovery rate.\n")

if p_value_two_tailed < alpha:
    print("Two-Tailed Test: Reject the null hypothesis, there is evidence that the new medication changes the average recovery rate.\n")
else:
    print("Two-Tailed Test: Fail to reject the null hypothesis, there is no evidence that the new medication changes the average recovery rate.\n")

Code explanation

Lines 1–2: We imported the numpy library for numerical operations (it is not used further in this example) and statsmodels.stats.weightstats for statistical analysis and hypothesis testing.
Lines 5–9: Defined a Python list named new_medication_recovery_rates containing assumed recovery rates of 30 patients who used a new medication.
Lines 12–13: population_mean represents the known average recovery rate (mean) for the existing medication, and population_std represents the known standard deviation of that recovery rate. Note that population_std is not passed to st.ztest(), which estimates the standard deviation from the sample instead.
Line 16: We conducted a right-tailed z-test using st.ztest(). This test evaluates whether the average recovery rate of patients on the new medication is significantly greater than the known population mean population_mean. The alternative='larger' parameter specifies that the alternative hypothesis (H1) is that the sample mean is greater than population_mean. The result includes test_statistic_right (the z-score) and p_value_right (the p-value).
Line 19: Similarly, we conducted a left-tailed z-test using st.ztest() with alternative='smaller'. This test checks if the average recovery rate of patients on the new medication is significantly less than population_mean. The test_statistic_left (z-score) and p_value_left (p-value) are calculated based on this hypothesis.
Line 22: A two-tailed z-test is conducted by default with st.ztest(). This test examines whether the average recovery rate differs significantly from population_mean in either direction. The test_statistic_two_tailed (z-score) and p_value_two_tailed (p-value) are computed to assess this hypothesis.
Lines 25–49: The results of each z-test are printed with formatted output, displaying the z-score and associated p-value. Based on a predefined significance level (alpha = 0.05), conclusions are drawn for each hypothesis:
- For the right-tailed test: The null hypothesis is rejected if p_value_right < alpha, which indicates evidence that the new medication increases the recovery rate.
- For the left-tailed test: The null hypothesis is rejected if p_value_left < alpha, which suggests evidence that the new medication decreases the recovery rate.
- For the two-tailed test: The null hypothesis is rejected if p_value_two_tailed < alpha, which indicates evidence of a change in the recovery rate with the new medication.

Note: The above code example is intended to demonstrate the concept of z-tests using synthetic data. The recovery rates new_medication_recovery_rates and known values such as population_mean and population_std used in this code are assumed and do not represent actual clinical data. Consequently, the calculated z-scores (test_statistic_right, test_statistic_left, test_statistic_two_tailed) and p-values (p_value_right, p_value_left, p_value_two_tailed) follow from this synthetic data and do not reflect real clinical findings. The primary goal of this code is to illustrate the process of performing z-tests in Python and interpreting the outcomes based on hypothetical data.
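Because st.ztest() estimates the standard deviation from the sample, population_std is never used in the code above. As a complement, the following is a minimal sketch of a z-test that does use the known population standard deviation, computed directly from the formula with Python's standard library. The data and variable names follow the example above; standard_error, z, and the three p-value names are introduced here only for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean

# Same assumed (synthetic) recovery rates as in the example above
new_medication_recovery_rates = [
    0.62, 0.65, 0.67, 0.64, 0.66, 0.68, 0.69, 0.70, 0.71, 0.73,
    0.72, 0.74, 0.75, 0.76, 0.77, 0.78, 0.79, 0.80, 0.81, 0.82,
    0.83, 0.84, 0.85, 0.86, 0.87, 0.88, 0.89, 0.90, 0.91, 0.92
]
population_mean = 0.6  # known mean for the existing medication
population_std = 0.1   # known standard deviation, used directly here

n = len(new_medication_recovery_rates)
sample_mean = mean(new_medication_recovery_rates)

# Standard error built from the KNOWN population standard deviation
standard_error = population_std / sqrt(n)
z = (sample_mean - population_mean) / standard_error

std_normal = NormalDist()
p_right = 1 - std_normal.cdf(z)                 # H1: mean > population_mean
p_left = std_normal.cdf(z)                      # H1: mean < population_mean
p_two_sided = 2 * (1 - std_normal.cdf(abs(z)))  # H1: mean != population_mean

print(f"z = {z:.4f}")
print(f"right-tailed p = {p_right:.4f}")
print(f"left-tailed p = {p_left:.4f}")
print(f"two-tailed p = {p_two_sided:.4f}")
```

The z-score from this sketch will generally differ from the one reported by st.ztest(), since the two approaches use different standard deviations in the denominator.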

Conclusion

In conclusion, the z-test is a valuable tool for hypothesis testing and decision-making using sample data.

It allows researchers to compare sample statistics with known population parameters, especially when the population standard deviation is known, enabling them to draw reliable conclusions about the broader population.
