Essay about Non-parametric Statistics

Submitted By Caged-Lioness

Week 3

p-values
• The calculated probability values corresponding to a particular test statistic given some set of circumstances
– Sample size
– Degrees of freedom
– Etc

• Calculated for us by SPSS
• p < 0.05 is generally considered significant
– Corresponds to alpha level
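
To illustrate how the same test statistic can give different p-values under different circumstances, here is a minimal sketch using the chi-squared statistic introduced later in these notes. It relies on the closed-form upper-tail probabilities for 1 and 2 degrees of freedom only. The function name `chi2_p_value` and the statistic value 4.0 are assumptions chosen for illustration, not part of the lecture; in practice SPSS reports the p-value directly.

```python
import math

def chi2_p_value(statistic, df):
    """Upper-tail probability of a chi-squared statistic.

    Only the closed forms for df = 1 and df = 2 are covered in this sketch.
    """
    if df == 1:
        # For 1 df, P(X > x) equals the two-sided normal tail beyond sqrt(x)
        return math.erfc(math.sqrt(statistic / 2))
    if df == 2:
        # For 2 df, the chi-squared distribution is exponential with mean 2
        return math.exp(-statistic / 2)
    raise ValueError("this sketch only covers df = 1 or df = 2")

ALPHA = 0.05  # conventional significance level

p_df1 = chi2_p_value(4.0, 1)
p_df2 = chi2_p_value(4.0, 2)
print(f"df=1: p={p_df1:.3f}, significant={p_df1 < ALPHA}")
print(f"df=2: p={p_df2:.3f}, significant={p_df2 < ALPHA}")
```

The same statistic falls below alpha with 1 df but not with 2 df, which is why the p-value always has to be read together with the degrees of freedom.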

χ² and Other Non-Parametric Tests

Parametric vs Non-Parametric Tests
• Parametric tests
– used when making a generalization about at least one parameter (population measure)
– Generally requires
• Normal distribution
• At least interval-level data
– Or that “approaches” interval level

– More powerful than non-parametric tests

Parametric vs Non-Parametric Tests
• Non-Parametric Tests
– Don’t test hypotheses about population parameters
– Don’t require normal distributions
– May be used for all levels of measurement
• Nominal to ratio

Chi-squared (χ²) Test

• The basic non-parametric test
• A test of independence
– Are two categorical variables independent? or
– Is there a relationship between two categorical variables?

Chi-squared (χ²) Hypotheses

• H0: there is no relationship between the categorical variables
• H1: there is a relationship between the categorical variables

χ² Assumptions and Requirements
• May be used when:
– Both IV and DV are nominal
• Ordinal sometimes if few categories
• Interval or ratio data is sometimes grouped to form nominal or ordinal variables
– Age in years into {0-15, 16-25, 26-35}
• Assumes random and independent sampling
• Each subject must qualify for ONE cell
• No assumptions made about distribution shape
• No assumptions about homogeneity
• Expected frequency for each cell must be >0
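
The grouping of interval data into categories mentioned above can be sketched as follows. The function name `age_group` is an assumption for illustration, and the source does not say how ages outside the three listed groups should be handled, so this sketch simply returns `None` for them.

```python
# Age groups exactly as listed on the slide: (lower bound, upper bound), inclusive
AGE_BINS = [(0, 15), (16, 25), (26, 35)]

def age_group(age_years):
    """Map an age in whole years to one of the slide's groups.

    The bins do not overlap, so each subject qualifies for exactly one
    category, as the chi-squared test requires.
    """
    for low, high in AGE_BINS:
        if low <= age_years <= high:
            return f"{low}-{high}"
    return None  # outside the listed groups; handling is not specified in the source

print([age_group(a) for a in (3, 15, 16, 30)])
```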

χ² Contingency Table

In SPSS: Crosstabs

subject has diabetes * subject had stroke Crosstabulation
Count

                             subject had stroke
                             no      yes     Total
subject has diabetes   no    67      10      77
                       yes   15       8      23
Total                        82      18      100

The 2×2 table is the simplest but… can go to n1×n2
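
As a rough sketch of what Crosstabs does, the code below tallies one (diabetes, stroke) record per subject into cell counts and marginal totals. The individual records are rebuilt from the cell counts in the table above, and the helper name `crosstab` is an assumption for illustration; this is not SPSS's own procedure.

```python
from collections import Counter

# One (has diabetes, had stroke) pair per subject, rebuilt from the table's cell counts
records = (
    [("no", "no")] * 67
    + [("no", "yes")] * 10
    + [("yes", "no")] * 15
    + [("yes", "yes")] * 8
)

def crosstab(pairs):
    """Return cell counts, row totals, and column totals for (row, column) pairs."""
    cells = Counter(pairs)
    row_totals = Counter(row for row, _ in pairs)
    col_totals = Counter(col for _, col in pairs)
    return cells, row_totals, col_totals

cells, row_totals, col_totals = crosstab(records)
print(cells[("yes", "yes")], row_totals["yes"], col_totals["yes"], len(records))
```

Because the tally is keyed on whatever category labels appear in the data, the same function works unchanged for larger n1×n2 tables.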

Calculating χ²

$$\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \qquad \text{with } df = (r-1)(c-1)$$

Where,
$O_{ij}$ = Observed cell frequencies
$E_{ij}$ = Expected cell frequencies:

$$E_{ij} = \frac{\text{Row Count}_i \times \text{Column Count}_j}{\text{Total Count } (N)}$$
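
The formula can be sketched in a few lines of plain Python for a table of any size. The function name `chi_square` and its return layout are assumptions for illustration; the input is the diabetes-by-stroke table from the contingency table slide.

```python
def chi_square(observed):
    """Pearson chi-squared for a table given as a list of rows of counts.

    Returns (statistic, df, expected), where expected has the same shape
    as observed.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)

    # E_ij = (row count x column count) / total count
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected

# Rows: diabetes no/yes; columns: stroke no/yes
statistic, df, expected = chi_square([[67, 10], [15, 8]])
print(round(statistic, 3), df, expected)
```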

χ² Distribution

• One-tailed
• Skewed to the right
• Similar to F distribution

Calculating χ²

subject has diabetes * subject had stroke Crosstabulation
Count

                             subject had stroke
                             no      yes     Total
subject has diabetes   no    67      10      77
                       yes   15       8      23
Total                        82      18      100

E11 = (77 × 82)/100 = 63.1
E12 = (77 × 18)/100 = 13.9
E21 = (23 × 82)/100 = 18.9
E22 = (23 × 18)/100 = 4.1

*Remember: This can be expanded to almost any reasonably sized table.

In Our Example…

$$\chi^2 = \sum_{i,j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \frac{(67 - 63.1)^2}{63.1} + \frac{(10 - 13.9)^2}{13.9} + \frac{(15 - 18.9)^2}{18.9} + \frac{(8 - 4.1)^2}{4.1} = 5.8$$

with df = (R - 1)(C - 1) = (2 - 1)(2 - 1) = 1
General rule of thumb for a 2×2 table with 1 df: χ² > 4 is significant (the exact critical value at α = 0.05 is 3.84)
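
The hand calculation uses expected counts rounded to one decimal, which is why it comes out slightly higher than the 5.700 SPSS reports. The sketch below repeats the arithmetic with both the rounded and the unrounded expected counts. The helper name `pearson` is an assumption for illustration, and 3.841 is the standard tabulated critical value for 1 df at alpha = 0.05.

```python
observed = [67, 10, 15, 8]
expected_rounded = [63.1, 13.9, 18.9, 4.1]    # as shown in the hand calculation
expected_exact = [63.14, 13.86, 18.86, 4.14]  # (row total x column total) / 100, unrounded

def pearson(obs, exp):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))

chi2_rounded = pearson(observed, expected_rounded)
chi2_exact = pearson(observed, expected_exact)

CRITICAL_1DF = 3.841  # chi-squared critical value, df = 1, alpha = 0.05

print(f"rounded expected counts:   {chi2_rounded:.2f}")
print(f"unrounded expected counts: {chi2_exact:.2f}")
print("significant at 0.05:", chi2_exact > CRITICAL_1DF)
```

Either way the statistic clears the critical value, so the rounding does not change the conclusion in this example.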

SPSS Chi-Square Results

Chi-Square Tests

                               Value      df   Asymp. Sig.   Exact Sig.   Exact Sig.
                                               (2-sided)     (2-sided)    (1-sided)
Pearson Chi-Square             5.700(b)   1    .017
Continuity Correction(a)       4.319      1    .038
Likelihood Ratio               5.093      1    .024
Fisher's Exact Test                                          .028         .023
Linear-by-Linear Association   5.643      1    .018
N of Valid Cases               100

a. Computed only for a 2x2 table
b. 1 cells (25.0%) have expected count less than 5. The minimum expected count is 4.14.
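
The Asymp. Sig. column can be checked against the Value column. A minimal sketch, assuming each statistic is referred to a chi-squared distribution with 1 df (as the df column indicates); the function name `p_value_1df` is an assumption for illustration, not an SPSS routine.

```python
import math

def p_value_1df(chi2_value):
    """Upper-tail probability of a chi-squared statistic with 1 df."""
    return math.erfc(math.sqrt(chi2_value / 2))

# Values copied from the SPSS output above
spss_values = {
    "Pearson Chi-Square": 5.700,
    "Continuity Correction": 4.319,
    "Likelihood Ratio": 5.093,
    "Linear-by-Linear Association": 5.643,
}

p_values = {name: p_value_1df(value) for name, value in spss_values.items()}
for name, p in p_values.items():
    print(f"{name}: {p:.3f}")
```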

Fisher’s Exact Test
• Used when an expected value is <5
– Takes small sample sizes into account

• Can be used only with a 2×2 table
– Sometimes necessary to collapse cells if χ² cannot be adequately calculated
– If cells are not collapsed, χ² can provide an estimate but NOT the actual significance
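
For a sense of what the exact test computes, here is a minimal sketch based on the hypergeometric distribution with the margins held fixed. The function name `fisher_exact_2x2` is an assumption for illustration, and the two-sided rule used here (summing every table no more probable than the observed one) is one common convention; it is assumed rather than confirmed to be the one SPSS applies.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Return (one_sided_p, two_sided_p) for the table [[a, b], [c, d]].

    The one-sided p-value is the probability of a bottom-right cell
    at least as large as d, given the observed row and column totals.
    """
    row1, row2 = a + b, c + d
    col2 = b + d
    n = row1 + row2
    total = comb(n, col2)

    def prob(k):
        # Hypergeometric probability that the bottom-right cell equals k
        return comb(row2, k) * comb(row1, col2 - k) / total

    lowest = max(0, col2 - row1)
    highest = min(row2, col2)
    p_observed = prob(d)

    one_sided = sum(prob(k) for k in range(d, highest + 1))
    # Small relative tolerance guards against floating-point ties
    two_sided = sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )
    return one_sided, two_sided

# Diabetes-by-stroke table from the running example
one_sided_p, two_sided_p = fisher_exact_2x2(67, 10, 15, 8)
print(f"one-sided: {one_sided_p:.3f}, two-sided: {two_sided_p:.3f}")
```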

Yates’ Correction
• Yates’ correction for continuity
• Used in 2×2 tables, generally when any expected cell frequency is <10
– Do not apply when expected frequencies are small
• Some disagreement about its use
– Reduces power

• Provides more conservative estimate
– Sometimes desirable, particularly with small numbers

Yates’ Correction

$$\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E}$$
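
Applied to the running diabetes-by-stroke example, the corrected formula can be sketched as below. The function name `yates_chi_square` is an assumption for illustration; the result can be compared with the Continuity Correction row of the SPSS output.

```python
def yates_chi_square(a, b, c, d):
    """Chi-squared with Yates' continuity correction for the table [[a, b], [c, d]]."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d

    total = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            # Shrink each |O - E| by 0.5 before squaring
            total += (abs(observed[i][j] - expected) - 0.5) ** 2 / expected
    return total

corrected = yates_chi_square(67, 10, 15, 8)
print(round(corrected, 3))
```

The corrected value is smaller than the uncorrected Pearson statistic for the same table, which is the sense in which the correction is more conservative.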

Other Non-Parametric Tests
• Used when DV is nominal or ordinal OR
• When the assumptions for more