Kruskal-Wallis test in Python
Kruskal-Wallis H statistic, Chi-Square Critical Value, non-parametric test
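The Kruskal-Wallis test is a non-parametric, rank-based test of whether 3 or more groups share the same location (stated below in terms of means). Its test statistic H is approximately χ²-distributed with k − 1 degrees of freedom, where k is the number of groups.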
Assumptions for the test:
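1. At least one of the two large-sample conditions is met, so that the χ² approximation for H is reasonable (a common rule of thumb is at least 5 observations per group).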
2. All the samples are random.
3. All the sampled populations have probability density functions of the same shape, with possibly different means.
4. The populations are independent.
H0: μ1 = μ2 = μ3 = ··· = μk
H1: at least two μi differ.
NOTE: The more the groups' mean ranks differ from one another, the larger the test statistic H. This is why the test is always an upper-tail test.
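To make the NOTE concrete, here is a minimal sketch of how H can be computed by hand from pooled ranks, using the standard formula H = 12 / (N(N + 1)) · Σ Rᵢ² / nᵢ − 3(N + 1), divided by the tie correction 1 − Σ(t³ − t) / (N³ − N). Rᵢ is the rank sum of group i, nᵢ its size, N the total sample size, and t the size of each group of tied values. The helper name `kruskal_h_by_hand` is hypothetical and only for illustration; the data are the four groups used in this post.

```python
import numpy as np
from scipy import stats

# Same data as the example in this post
group1 = [624, 680, 454, 510, 539]
group2 = [425, 595, 737, 459, 709, 482]
group3 = [397, 794, 595, 539, 680, 652]
group4 = [482, 510, 369, 567, 595]


def kruskal_h_by_hand(*groups):
    """Hypothetical helper: tie-corrected Kruskal-Wallis H from pooled ranks."""
    sizes = [len(g) for g in groups]
    pooled = np.concatenate([np.asarray(g, dtype=float) for g in groups])
    n_total = pooled.size

    # Rank all observations together; tied values receive their average rank
    ranks = stats.rankdata(pooled)

    # Split the pooled ranks back into one array per group
    rank_groups = np.split(ranks, np.cumsum(sizes)[:-1])

    # Uncorrected H: large when the group rank sums are far apart
    rank_term = sum(r.sum() ** 2 / len(r) for r in rank_groups)
    h = 12.0 / (n_total * (n_total + 1)) * rank_term - 3.0 * (n_total + 1)

    # Tie correction: 1 - sum(t^3 - t) / (N^3 - N) over groups of tied values
    _, counts = np.unique(pooled, return_counts=True)
    correction = 1.0 - np.sum(counts ** 3 - counts) / (n_total ** 3 - n_total)
    return h / correction


h_manual = kruskal_h_by_hand(group1, group2, group3, group4)
h_scipy = stats.kruskal(group1, group2, group3, group4).statistic
print(f"H by hand: {h_manual}")
print(f"H from scipy.stats.kruskal: {h_scipy}")
```

The two printed values should agree up to floating-point rounding, since `stats.kruskal` applies the same tie correction.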
# Find Kruskal-Wallis H Statistic Value
from scipy import stats
# data instance
group1 = [624, 680, 454, 510, 539]
group2 = [425, 595, 737, 459, 709, 482]
group3 = [397, 794, 595, 539, 680, 652]
group4 = [482, 510, 369, 567, 595]
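# Find the Kruskal-Wallis critical value, which is the Chi-Square critical value
# for an upper-tail test with k - 1 degrees of freedom
alpha = 0.01  # significance level used for the critical value
k = 4  # number of groups
degree_freedom = k - 1
# X² for upper tail (chi2 is reached through the `stats` module imported above)
upper_tail_critical_value = stats.chi2.ppf(1 - alpha, df=degree_freedom)
# significance level used for the p-value check; note that it differs from alpha above,
# so the two checks below run at different levels (0.05 vs 0.01)
alpha_value = 0.05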
# Perform the Kruskal-Wallis test once; stats.kruskal returns (statistic, pvalue)
h_statistic, p_value = stats.kruskal(group1, group2, group3, group4)
def check_Kruskal_Wallis_test_pvalue(p_value, alpha_value):
    # Decision rule based on the p-value
    print(f'The p-value is {p_value}')
    if p_value < alpha_value:
        print(f"The p-value is less than alpha {alpha_value}, so the result is significant -> Reject the null hypothesis. There is statistically significant evidence that at least two of the groups differ.\n\n")
    else:
        print(f"The p-value is not less than alpha {alpha_value}, so the result is not significant -> Fail to reject the null hypothesis. There is no statistically significant evidence that the groups differ.\n\n")

check_Kruskal_Wallis_test_pvalue(p_value, alpha_value)
print('\n')

def check_Kruskal_Wallis_test_Hvalue(h_statistic, upper_tail_critical_value):
    # Decision rule based on the upper-tail critical value
    print(f'The Kruskal-Wallis H statistic is {h_statistic}')
    print(f'The critical value X²U for the upper tail is {upper_tail_critical_value}')
    if h_statistic < upper_tail_critical_value:
        print("The Kruskal-Wallis H statistic is smaller than the critical value -> Fail to reject the null hypothesis. There is no statistically significant evidence that the groups differ.\n\n")
    else:
        print("The Kruskal-Wallis H statistic is not smaller than the critical value -> Reject the null hypothesis. There is statistically significant evidence that at least two of the groups differ.\n\n")

check_Kruskal_Wallis_test_Hvalue(h_statistic, upper_tail_critical_value)
print('\n\n')

Output:

The p-value is 0.5643408523150142
The p-value is not less than alpha 0.05, so the result is not significant -> Fail to reject the null hypothesis. There is no statistically significant evidence that the groups differ.

The Kruskal-Wallis H statistic is 2.0390527509926217
The critical value X²U for the upper tail is 11.344866730144373
The Kruskal-Wallis H statistic is smaller than the critical value -> Fail to reject the null hypothesis. There is no statistically significant evidence that the groups differ.
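The p-value check and the critical-value check are two views of the same upper-tail test: the p-value is the area of the χ² distribution to the right of H, so both rules give the same decision whenever they use the same significance level. The sketch below illustrates this, assuming the χ² approximation with k − 1 degrees of freedom; it repeats the post's data so that it runs on its own and tries both levels used above (0.01 and 0.05). The names `p_from_h` and `decisions` are introduced here for illustration only.

```python
from scipy import stats

# Same data as the example in this post
group1 = [624, 680, 454, 510, 539]
group2 = [425, 595, 737, 459, 709, 482]
group3 = [397, 794, 595, 539, 680, 652]
group4 = [482, 510, 369, 567, 595]
groups = [group1, group2, group3, group4]

h_statistic, p_value = stats.kruskal(*groups)
degree_freedom = len(groups) - 1

# Upper-tail area of the chi-square distribution beyond H
p_from_h = stats.chi2.sf(h_statistic, degree_freedom)
print(f"p-value from kruskal: {p_value}")
print(f"p-value from chi2.sf(H, df): {p_from_h}")

# Compare the two decision rules at each significance level
decisions = []
for alpha in (0.01, 0.05):
    critical_value = stats.chi2.ppf(1 - alpha, df=degree_freedom)
    reject_by_h = h_statistic > critical_value
    reject_by_p = p_value < alpha
    decisions.append((alpha, reject_by_h, reject_by_p))
    print(f"alpha={alpha}: critical value={critical_value}, "
          f"reject by H: {reject_by_h}, reject by p: {reject_by_p}")
```

Because the two rules agree at any single level, one significance level is normally chosen up front and used for both checks.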