Chapter 1 Essay

Submitted By jackykk101

Chapter 1
Linear regression with one predictor variable


Regression (Historically)





Regression means ‘going back’
Francis Galton (1822‐1911) wrote “Hereditary Genius” (1869) and studied the inheritance of other traits
Heights of fathers and sons







Sons of the tallest fathers tended to be taller than average, but shorter than their fathers
Sons of the shortest fathers tended to be shorter than average, but taller than their fathers

The same pattern was observed for many other traits.
Galton was deeply concerned about “regression to mediocrity.”
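
The father and son pattern above is what any correlation below 1 produces. As a minimal sketch, assume both generations share the same mean height and the same spread, with a father-son correlation r. The linear prediction for a son is then mean + r × (father − mean). The mean of 69 inches and r = 0.5 used below are illustrative assumptions, not Galton's data.

```python
def predicted_son_height(father, mean=69.0, r=0.5):
    """Linear prediction of a son's height from his father's height.

    Assumes (hypothetically) equal means and standard deviations in both
    generations and a father-son correlation r between 0 and 1.
    """
    return mean + r * (father - mean)


# Short, average, and tall fathers (heights in inches, made up for illustration).
for father in (63.0, 69.0, 75.0):
    print(father, predicted_son_height(father))
```

With these assumed numbers, the son of a tall father is predicted to be above average but below his father, and the son of a short father is predicted to be below average but above his father.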

Types of Data


Typically, data come to us in one of four forms: 




Categorical (Nominal)
Ordinal
Interval
Ratio


Categorical variables


Take on several levels, none of which have any natural ordering








Sex (M, F, …)
Race (Black, White, Asian, …)
Program major (Stat, CS, Math, Psych, Bio, …)
Type of fertilizer (A, B, …)
Drug (Active, Placebo)

When controlled by the experimenter, called a Factor


Important nomenclature for R
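
To make the idea of levels concrete, here is a small sketch in plain Python, loosely in the spirit of a factor. The helper names `as_factor` and `indicator_columns` and the fertilizer data are hypothetical, chosen only for illustration. The integer codes are labels: their numeric order carries no meaning for a categorical variable.

```python
def as_factor(values):
    """Return (levels, codes): the sorted distinct levels and one integer code per value."""
    levels = sorted(set(values))
    index = {level: i for i, level in enumerate(levels)}
    codes = [index[v] for v in values]
    return levels, codes


def indicator_columns(values):
    """Build one 0/1 indicator column per level."""
    levels, codes = as_factor(values)
    return {level: [1 if c == i else 0 for c in codes]
            for i, level in enumerate(levels)}


fertilizer = ["B", "A", "B", "C", "A"]  # made-up observations
levels, codes = as_factor(fertilizer)
print(levels, codes)
print(indicator_columns(fertilizer))
```

Indicator columns are one common way to feed a categorical variable into a regression, since they avoid pretending that the codes are quantities.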


Ordinal variables


Take on several levels which have a natural order, but no consistent distance metric



Grade (A+, A, A-, B+, …)
Professor Rating (5, 4, 3, 2, 1)








Likert item

Level of education (PhD, Masters, Bachelors, HS, Primary, None)
Sports (Rugby, Football, Soccer, … Basketball) 
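
For ordinal examples like the ones above, comparisons and medians make sense because they use only the order of the levels. A minimal sketch, using a subset of the grade scale listed above; the names `GRADE_ORDER`, `higher_grade`, and `median_grade` are hypothetical.

```python
# Levels listed from lowest to highest; only this order is used below.
GRADE_ORDER = ["B+", "A-", "A", "A+"]
RANK = {grade: i for i, grade in enumerate(GRADE_ORDER)}


def higher_grade(g1, g2):
    """Return whichever of two grades ranks higher."""
    return g1 if RANK[g1] >= RANK[g2] else g2


def median_grade(grades):
    """Lower median of a list of grades, computed from their order alone."""
    ordered = sorted(grades, key=RANK.__getitem__)
    return ordered[(len(ordered) - 1) // 2]


grades = ["A", "B+", "A+", "A-", "A"]  # made-up observations
print(higher_grade("A-", "B+"))
print(median_grade(grades))
```

The ranks 0, 1, 2, 3 are not measurements: nothing here says the gap between B+ and A- equals the gap between A and A+, so averaging the ranks would assume a distance metric the data do not have.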

Difficult to deal with, so we usually consider them as either Categorical, or


Interval variables


Numerical variable with a consistent distance metric, but no proper zero point






IQ
Temperature (in °C)
SAT score

Slope and difference are meaningful, but ratios are not
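
A quick check with temperature shows why. Celsius and Fahrenheit describe the same temperatures with different zero points, so a ratio of two readings changes with the scale, while a comparison of differences does not. This is a small sketch; the readings of 10, 20 and 30 °C are made up.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


low, mid, high = 10.0, 20.0, 30.0  # made-up readings in °C

# Ratio of two readings: depends on which scale is used.
ratio_c = mid / low
ratio_f = c_to_f(mid) / c_to_f(low)

# Ratio of two differences: the same on both scales.
diff_ratio_c = (high - mid) / (mid - low)
diff_ratio_f = (c_to_f(high) - c_to_f(mid)) / (c_to_f(mid) - c_to_f(low))

print(ratio_c, ratio_f)
print(diff_ratio_c, diff_ratio_f)
```

So 20 °C is not "twice as hot" as 10 °C: that ratio of 2 becomes 68/50 once the same temperatures are written in Fahrenheit.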

Ratio variables


Interval variable with a proper