Z-test

Use the z-test to determine whether the difference between the proportions of two samples is statistically significant.

The test assumes that the two samples are independent and that each sample is large enough for the normal approximation to apply. The null hypothesis is that there is no difference between the two population proportions, while the alternative hypothesis (two-tailed) is that the proportions are different.

Z-test formula
z_score = (p1 - p2) / sqrt( p * (1 - p) * (1/n1 + 1/n2) )
Variable Description
z_score The z-score.
p1, p2 The sample proportions of the two groups.
p The pooled proportion of the two samples: p = (x1 + x2) / (n1 + n2), where x1 and x2 are the numbers of successes in each sample.
n1, n2 The sample sizes of the two groups.
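
As an illustration of the formula above, here is a minimal Python sketch that computes the z-score directly from the two sample proportions and sample sizes. The function name `two_proportion_z_score` and the example numbers are hypothetical and are not part of the original text. The pooled proportion is rebuilt as (p1*n1 + p2*n2) / (n1 + n2), which equals (x1 + x2) / (n1 + n2) because x = p * n.

```python
from math import sqrt


def two_proportion_z_score(p1: float, n1: int, p2: float, n2: int) -> float:
    """Pooled two-proportion z-score (hypothetical helper, for illustration)."""
    # Pooled proportion: same value as (x1 + x2) / (n1 + n2), since x = p * n.
    p = (p1 * n1 + p2 * n2) / (n1 + n2)
    # Standard error of the difference under the null hypothesis.
    standard_error = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error


# Made-up example: 12% of 1000 users versus 9% of 1000 users.
z_score = two_proportion_z_score(p1=0.12, n1=1000, p2=0.09, n2=1000)
print(round(z_score, 3))
```

The sketch does not guard against a pooled proportion of exactly 0 or 1, where the standard error is zero and the z-score is undefined.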

Minimum sample size formula

n = (Zα/2 + Zβ)^2 * (p1(1-p1) + p2(1-p2)) / (p1-p2)^2
Variable Description
n The minimum sample size for each group.
Zα/2 The critical value of the standard normal distribution for a two-tailed test at significance level α (for example, 1.96 for α = 0.05).
Zβ The critical value of the standard normal distribution for the desired power of the test (1-β). Power is the probability that the test detects an effect when there is an effect to be found. The target power can be adjusted; a common choice is 0.8, for which Zβ ≈ 0.84.
p1, p2 The estimated proportions of the two groups.
(p1-p2) The expected difference in proportions between the two groups.
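
The calculation can be sketched in Python as follows, assuming the unpooled-variance form n = (Zα/2 + Zβ)^2 * (p1(1-p1) + p2(1-p2)) / (p1-p2)^2 and using the standard library's `statistics.NormalDist` to look up the critical values. The function name `min_sample_size`, its default arguments, and the example proportions are hypothetical choices made for illustration.

```python
from math import ceil
from statistics import NormalDist


def min_sample_size(p1: float, p2: float,
                    alpha: float = 0.05, power: float = 0.8) -> int:
    """Minimum sample size per group (hypothetical helper, for illustration)."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = standard_normal.inv_cdf(power)           # about 0.84 for power = 0.8
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2
    # Round up: a fractional observation cannot be collected.
    return ceil(n)


# Made-up example: detect a change from 10% to 12%.
print(min_sample_size(p1=0.10, p2=0.12))
```

Raising the power or shrinking the expected difference (p1-p2) increases the required sample size, since the difference enters the denominator squared.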

Z-test proportion formulas

Z-test proportion calculation for the two-tailed hypothesis:

z = (p1 - p2) / sqrt( p * (1 - p) * (1/n1 + 1/n2) )
p = (x1 + x2) / (n1 + n2)
Variable Description
z The z-test score.
p1, p2 The sample proportions of the two groups: p1 = x1 / n1 and p2 = x2 / n2.
p The pooled proportion of the two samples: p = (x1 + x2) / (n1 + n2).
x1, x2 The number of successes (observations with the outcome of interest) in each sample.
n1, n2 The sample sizes of the two groups.
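
To show how the z-score leads to a two-tailed decision, the following Python sketch starts from the raw counts x1, x2 and converts the z-score into a p-value with the standard normal identity P(|Z| >= |z|) = erfc(|z| / sqrt(2)). The function name `z_test_two_tailed`, the significance level of 0.05, and the example counts are assumptions made for illustration and do not come from the original text.

```python
from math import erfc, sqrt


def z_test_two_tailed(x1: int, n1: int, x2: int, n2: int,
                      alpha: float = 0.05) -> tuple[float, float, bool]:
    """Return (z, p_value, reject_null) for a two-tailed two-proportion z-test.

    Hypothetical helper, for illustration only.
    """
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)  # pooled proportion
    z = (p1 - p2) / sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    # Two-tailed p-value: probability of a z-score at least this extreme
    # in either direction under the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value, p_value < alpha


# Made-up example: 45 successes out of 500 versus 30 out of 500.
z, p_value, reject_null = z_test_two_tailed(x1=45, n1=500, x2=30, n2=500)
print(f"z = {z:.3f}, p-value = {p_value:.4f}, reject null: {reject_null}")
```

Comparing the p-value with α is equivalent to comparing |z| with the critical value Zα/2; the p-value form is used here so that no lookup table is needed.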