Today

Course Overview
Statistical review (continued)
Brief history of testing
Norms and sampling

About the Course

E-411-PRMA Course Website
Cohen, R. J., Swerdlik, M. E., & Sturman, E. D. (2013). Psychological testing and assessment: An introduction to tests and measurement. New York, NY: McGraw-Hill
Website readings on item response theory and generalizability theory
Will be using R in class and on assignments
http://kennslubanki.hi.is/tolfraedi

First-Day Test

Thoughts?
What was I measuring?
Pros/cons of the tests?
Will return to this test throughout the course

Testing and measurement

What is measurement?

Assignment of numerical values based on a set of rules

What is a test?

Instrument used to measure

Statistical Review

Scales - Nominal, ordinal, interval, and ratio
An item measuring ...

Temperature (in Celsius)
Temperature (in Kelvin)
Speed
Gender
Happiness
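As a side note on how scale type shows up in software, the sketch below shows one way R can distinguish nominal from ordinal variables. The variables `eye_colour` and `education` are hypothetical examples, not items from the list above.

```r
# Hypothetical nominal variable: categories with no inherent order
eye_colour <- factor(c("blue", "brown", "green", "brown"))

# Hypothetical ordinal variable: categories with a meaningful order
education <- factor(c("primary", "university", "secondary"),
                    levels = c("primary", "secondary", "university"),
                    ordered = TRUE)

# Order comparisons are defined for ordered factors only
education[1] < education[2]   # TRUE: "primary" comes before "university"
is.ordered(eye_colour)        # FALSE: eye colour is treated as nominal
```

Interval and ratio variables are usually stored as plain numeric vectors, so R cannot tell them apart; that judgment is left to the analyst.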

Probability distribution, Skewness, kurtosis

What is a probability distribution?

Assigns a probability (likelihood of occurrence) to each possible score
May be parametric or non-parametric
Normal

What skew would you expect for each of these outcomes?

Reaction time in a psychological experiment
Number of children in a family
Scores on an easy test
Height in Reykjavík

Platykurtic, mesokurtic, and leptokurtic
Plot your data, rely less on statistics!
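A minimal sketch of plotting first and computing shape statistics second. The data are simulated (an assumed gamma distribution, chosen only because it is right-skewed), and the moment-based formulas are written out by hand rather than taken from a package.

```r
# Hypothetical simulated scores from a right-skewed distribution
set.seed(411)
scores <- rgamma(1000, shape = 2, rate = 1)

# Look at the data first
hist(scores, breaks = 30, main = "Simulated scores", xlab = "Score")

# Standardize using the population-style SD (divide by n, not n - 1)
z <- (scores - mean(scores)) / sqrt(mean((scores - mean(scores))^2))

skewness <- mean(z^3)  # > 0 means a longer right tail
kurtosis <- mean(z^4)  # a normal distribution has a value of about 3

skewness
kurtosis
```

A kurtosis well above 3 suggests a leptokurtic shape, well below 3 a platykurtic one, and a value near 3 a mesokurtic one.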

Homework Problems

GRE exam: Mean = 500, SD = 100, and 1000 participants
Jarl gets a 650 on the GRE. About how many students scored below him?
330 students scored below Þöll. What was her score?

Letting `R` do the work

        pnorm(650, mean = 500, sd = 100) * 1000
        # 933.1928
        qnorm(330/1000, mean = 500, sd = 100)
        # 456.0087

Percentiles

330 students scored below Þöll. What percentile is she in?
Percentiles represent a reasonably simple and intuitive way to classify scores relative to other test takers
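A percentile rank is the percentage of test takers scoring below a given score. The sketch below works this out for the two homework cases, assuming the same GRE setup (mean 500, SD 100, 1000 test takers); the variable names are illustrative.

```r
n_total <- 1000

# Þöll: 330 of 1000 test takers scored below her
n_below_tholl <- 330
percentile_tholl <- 100 * n_below_tholl / n_total
percentile_tholl   # 33

# Jarl: percentile implied by a score of 650 under the normal model
percentile_jarl <- 100 * pnorm(650, mean = 500, sd = 100)
percentile_jarl    # about 93.3
```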

Correlation

$$ r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum(X - \bar{X})^2 \sum(Y - \bar{Y})^2}}$$

Calculating correlations

	X	Y
	5	6
	3	0
	1	0
Mean	3	2

Calculating the correlation in `R`

        # Assign X the values of 5, 3, 1 and Y the values of 6, 0, and 0
        x <- c(5, 3, 1)
        y <- c(6, 0, 0)
        # Calculate the correlation
        cor(x , y)
        # 0.8660254
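To connect the formula to the `cor()` result, the sketch below computes the same correlation step by step from the deviations. The intermediate names (`dx`, `dy`, `r_manual`) are illustrative.

```r
x <- c(5, 3, 1)
y <- c(6, 0, 0)

# Deviations from the means (mean of x is 3, mean of y is 2)
dx <- x - mean(x)   #  2  0 -2
dy <- y - mean(y)   #  4 -2 -2

# Numerator: sum of cross-products; denominator: root of product of sums of squares
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
r_manual
# 0.8660254

# Agrees with the built-in function
all.equal(r_manual, cor(x, y))
```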

Simple linear regression

Pretend we are interested in predicting height given someone's weight
We might consider a linear regression model

$$Height = \beta_0 + \beta_1*Weight$$

How would we know if this is appropriate?
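One common way to check is to plot the data and the residuals. The sketch below uses simulated data, since the course data are not shown here; the sample size, units, and generating values are all assumptions made for illustration.

```r
# Hypothetical simulated data (not the course data set)
set.seed(411)
weight <- rnorm(50, mean = 75, sd = 15)                      # assumed kg
height <- 150 + 0.3 * weight + rnorm(50, mean = 0, sd = 4)   # assumed cm

# Fit the simple linear regression of height on weight
fit <- lm(height ~ weight)
coef(fit)

# Scatterplot with the fitted line: does a straight line describe the trend?
plot(weight, height)
abline(fit)

# Residuals vs fitted values: look for curvature or a funnel shape
plot(fitted(fit), resid(fit))
abline(h = 0, lty = 2)
```

If the residuals scatter evenly around zero with no visible pattern, a linear model is at least plausible.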

Model Summary

Coefficient	Estimate
Intercept	25.723456
Slope	0.287249

$$\hat{Height} = 25.723456 + 0.287249*Weight$$

How does this relate to the correlation?

Slope and the correlation

In a simple linear regression (SLR), the estimated slope and the correlation between the two variables are directly related

$$ r = b\frac{SD_X}{SD_Y} $$

What is r if b is 0.287249, the SD of weight is 15.49869 and the SD of height is 4.472136?
Does that parallel what you thought based on the plot?
What would b become if r = 0.4?
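The arithmetic can be checked in R. This sketch plugs the values above into $r = b\frac{SD_X}{SD_Y}$, taking weight as X and height as Y, and then rearranges the formula to solve for b; the variable names are illustrative.

```r
b <- 0.287249
sd_weight <- 15.49869   # SD of X
sd_height <- 4.472136   # SD of Y

# Correlation implied by the slope
r <- b * sd_weight / sd_height
r       # about 0.995

# Rearranged: b = r * SD_Y / SD_X, here with r = 0.4
b_new <- 0.4 * sd_height / sd_weight
b_new   # about 0.115
```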

History and timeline of test development

2200 BCE, the Chinese are believed to have used testing to determine who would get governmental jobs
Greeks and Romans categorized individuals based on personality type ("blood" or "phlegm")
Francis Galton's classification based on "natural gift" (i.e. eugenics)

Contributed to the development of questionnaires, rating scales, and self-report inventories

Wilhelm Wundt's laboratory and his focus on "standardization"

James Cattell's mental tests
Charles Spearman - reliability and factor analysis

Testing in the 20th Century

1905, Binet and Simon publish a test measuring intelligence in Paris schoolchildren with intellectual disabilities
1939, Wechsler publishes a test to measure intelligence in adults (would become the WAIS)
Group intelligence tests administered by the US military during WWI and WWII
WWI personality tests used to screen recruits

Assumptions of Psychological Testing

Psychological traits and states exist
Psychological traits and states can be measured
Behavior on tests predicts non-test behavior
Measurement error is part of the process
Tests can be fair
Tests can benefit society

Norm-Referenced Testing

An individual's score is interpreted only relative to some reference group
This group should represent the entire pool of test takers for the tested construct
Collectively, this group is known as a normative sample and data from them are the norms
Understanding the normative sample is very important

Sampling: PTSD in Iceland

Sampling Techniques

Simple random sample
Stratified random sample
Cluster random sample
Purposive sample
Convenience sample
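To make the first two techniques concrete, here is a minimal sketch in R. The sampling frame, the region names, and the 10% sampling fraction are all hypothetical.

```r
set.seed(411)

# Hypothetical sampling frame of 1000 people in four regions
population <- data.frame(
  id = 1:1000,
  region = rep(c("Capital", "North", "South", "West"),
               times = c(600, 150, 150, 100))
)

# Simple random sample: every person has the same chance of selection
srs <- population[sample(nrow(population), size = 100), ]

# Stratified random sample: draw 10% at random within each region
strata <- split(population, population$region)
stratified <- do.call(rbind, lapply(strata, function(s) {
  s[sample(nrow(s), size = round(0.1 * nrow(s))), ]
}))

table(srs$region)         # region counts vary from sample to sample
table(stratified$region)  # region counts fixed at 60, 15, 15, 10
```

Stratifying guarantees that each region appears in proportion to its size, which a simple random sample achieves only on average.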

Different Norms

Developmental Norms

Age Norms

A 6-year-old performs at the level of a 10-year-old
This applies to this material only, though!

Grade Norms

School year typically 10 months in the US (and Iceland?)
A 4th grader is performing at the level of a 5th grader in the third month
This applies to this material only, though!

National norms: based on a nationally representative sample

Anchor norms enable two tests to be compared
In the USA, students can take the SAT or the ACT for admission to college

Fixed Reference and Criterion-Related

Scores from a fixed reference group are used as the basis for calculating scores on future administrations of the test
The SAT does this by using anchor items and equating
Criterion-referenced: evaluate a score with reference to a set criterion or standard, NOT other test takers
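The contrast between norm-referenced and criterion-referenced interpretation can be sketched in R. The names, scores, and the cutoff of 70 below are all hypothetical.

```r
# Hypothetical test scores for five students
scores <- c(Anna = 55, Bjorn = 62, Dora = 71, Einar = 78, Freyja = 90)

# Norm-referenced: each score expressed relative to the group (z-scores)
z <- (scores - mean(scores)) / sd(scores)
round(z, 2)

# Criterion-referenced: each score compared to a fixed standard (assumed cutoff)
cutoff <- 70
passed <- scores >= cutoff
passed
```

Under the norm-referenced view a student's standing changes if the group changes; under the criterion-referenced view it depends only on the cutoff.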

Question

How would you score the grades in a classroom?
What do you think is the fairest way?

Next time

Please read the chapter on reliability (chapter 5)
Please watch the RStudio videos