Kruskal-Wallis Test in Python

Kruskal-Wallis H statistic, Chi-Square Critical Value, non-parametric test

tnathu-ai
3 min read · Nov 25, 2022

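The Kruskal-Wallis test is a non-parametric, rank-based test of whether 3 or more independent groups come from populations with the same location. Under the same-shape assumption listed below, this amounts to testing the equality of the population means. The test statistic H is compared against the χ² distribution with k - 1 degrees of freedom, where k is the number of groups.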

Figure: Kruskal-Wallis Test: group1 versus group4 using Minitab

Assumptions for the test:

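1. At least one of the two large-sample conditions is met. These are commonly stated as: k = 3 with every sample size at least 6, or k > 3 with every sample size at least 5.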

2. All the samples are random.

3. All the sampled populations have the same shaped probability density function, with possibly different means.

4. The populations are independent.

H0: μ1 = μ2 = μ3 = ··· = μk

H1: at least two μi differ.

NOTE: The larger the differences, the larger the test statistic H. This is why the test is only an upper-tail test.
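To see why larger differences between groups inflate H, it can help to compute the statistic by hand from the pooled ranks. The sketch below assumes the textbook formula H = 12 / (N(N + 1)) · Σ(R_i² / n_i) - 3(N + 1), where R_i is the rank sum of group i, n_i its size, and N the total sample size, followed by the usual tie correction. The helper name kruskal_h_by_hand is hypothetical, and the data are the four groups used in this article.

```python
import numpy as np
from scipy import stats


def kruskal_h_by_hand(*groups):
    """Compute the tie-corrected Kruskal-Wallis H statistic from pooled ranks."""
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    # Rank all observations together; tied values share their average rank.
    ranks = stats.rankdata(np.concatenate(groups))
    # Split the pooled ranks back into groups and sum the ranks of each group.
    split_points = np.cumsum(sizes)[:-1]
    rank_sums = [r.sum() for r in np.split(ranks, split_points)]
    # Uncorrected H: grows as the rank sums drift away from each other.
    h = 12.0 / (n_total * (n_total + 1)) * sum(
        rs ** 2 / n for rs, n in zip(rank_sums, sizes)
    ) - 3 * (n_total + 1)
    # Tie correction factor: 1 - sum(t^3 - t) / (N^3 - N) over tied blocks.
    return h / stats.tiecorrect(ranks)


group1 = [624, 680, 454, 510, 539]
group2 = [425, 595, 737, 459, 709, 482]
group3 = [397, 794, 595, 539, 680, 652]
group4 = [482, 510, 369, 567, 595]

h_manual = kruskal_h_by_hand(group1, group2, group3, group4)
h_scipy = stats.kruskal(group1, group2, group3, group4).statistic
print(f"H by hand: {h_manual:.6f}, H from scipy: {h_scipy:.6f}")
```

If the groups had very similar rank sums, the weighted sum of squared rank sums would be close to its minimum and H would be near zero, which is why only large values of H count as evidence against H0.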

# Find Kruskal-Wallis H Statistic Value
from scipy import stats

# Sample data: one list of observations per group
group1 = [624, 680, 454, 510, 539]
group2 = [425, 595, 737, 459, 709, 482]
group3 = [397, 794, 595, 539, 680, 652]
group4 = [482, 510, 369, 567, 595]


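# Find the Kruskal-Wallis H critical value, which is the chi-square critical value

# Chi-square critical value for an upper-tail hypothesis test
# Note: alpha = 0.01 here is stricter than the 0.05 used for the p-value check;
# in practice, use the same significance level for both decisions.
alpha = 0.01
k = 4  # number of groups
degree_freedom = k - 1
# X² critical value for the upper tail
upper_tail_critical_value = stats.chi2.ppf(1 - alpha, df=degree_freedom)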


alpha_value = 0.05

# Perform the Kruskal-Wallis test once and unpack the H statistic and p-value
h_statistic, p_value = stats.kruskal(group1, group2, group3, group4)


def check_Kruskal_Wallis_test_pvalue(p_value, alpha_value):
    """Report the decision reached by comparing the p-value with alpha."""
    print(f'The p-value is {p_value}')
    if p_value < alpha_value:
        print(f"The p-value is less than alpha {alpha_value}, so the result is significant -> Reject the null hypothesis. There is evidence that at least two of the groups differ.\n")
    else:
        print(f"The p-value is not less than alpha {alpha_value}, so the result is not significant -> Fail to reject the null hypothesis. There is not enough evidence that the groups differ.\n")


check_Kruskal_Wallis_test_pvalue(p_value, alpha_value)
print()


def check_Kruskal_Wallis_test_Hvalue(h_statistic, upper_tail_critical_value):
    """Report the decision reached by comparing H with the critical value."""
    print(f'The Kruskal-Wallis H statistic is {h_statistic}')
    print(f'The critical value X²U for the upper tail is {upper_tail_critical_value}')
    if h_statistic < upper_tail_critical_value:
        print("The Kruskal-Wallis H statistic is smaller than the critical value -> Fail to reject the null hypothesis. There is not enough evidence that the groups differ.\n")
    else:
        print("The Kruskal-Wallis H statistic is not smaller than the critical value -> Reject the null hypothesis. There is evidence that at least two of the groups differ.\n")


check_Kruskal_Wallis_test_Hvalue(h_statistic, upper_tail_critical_value)
print()

Output:

The p-value is 0.5643408523150142

The p-value is not less than alpha 0.05, so the result is not significant -> Fail to reject the null hypothesis. There is not enough evidence that the groups differ.

The Kruskal-Wallis H statistic is 2.0390527509926217

The critical value X²U for the upper tail is 11.344866730144373

The Kruskal-Wallis H statistic is smaller than the critical value -> Fail to reject the null hypothesis. There is not enough evidence that the groups differ.
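The p-value rule and the critical-value rule are two views of the same comparison, so at any single significance level they should lead to the same decision. The sketch below illustrates this under the chi-square approximation, assuming the H value reported in this article and k = 4 groups. The variable names are illustrative only.

```python
from scipy import stats

h_statistic = 2.0390527509926217  # H value reported in this article
k = 4                             # number of groups
df = k - 1

# Upper-tail probability of the chi-square distribution at H.
p_value = stats.chi2.sf(h_statistic, df)
print(f"p-value from chi-square upper tail: {p_value:.6f}")

decisions = []
for alpha in (0.01, 0.05):
    critical_value = stats.chi2.ppf(1 - alpha, df)
    reject_by_p = p_value < alpha
    reject_by_h = h_statistic > critical_value
    decisions.append((alpha, reject_by_p, reject_by_h))
    print(f"alpha={alpha}: critical value={critical_value:.4f}, "
          f"reject by p-value={reject_by_p}, reject by H={reject_by_h}")
```

Because both rules agree at each level, mixing significance levels between them (0.01 for one and 0.05 for the other) is the only way they could appear to disagree.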

