E-411 PRMA

Lecture 16 - Intelligence Tests & Education

Christopher David Desjardins

Last time

Different models of intelligence

Spearman's g; Cattell & Horn's broad domains; Carroll's hierarchical view; CHC; processing view of intelligence

Does it really matter how we define intelligence?

Is it more academic than practical?

Test selection criteria

  • Theoretical model
  • Ease of administration and scoring
  • Ease of interpretation for a purpose
  • Appropriateness of the norms
  • Reliability and validity indices
  • Test utility (i.e. costs vs. benefits)

Stanford-Binet IQ test

  • Conceived to screen for developmental disabilities
  • Provided organized administration/scoring instructions
  • Originally, intelligence was calculated as the ratio of mental age to chronological age, multiplied by 100 (ratio IQ)
  • Deviation IQ: mean of 100 and standard deviation of 16 (now 15); compares an individual with others of the same age in the standardization sample
  • Used an age scale (items grouped by age) and a point scale (items grouped by category, each scored correct or incorrect)
  • Scores can be obtained as a test composite (e.g. IQ)
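The two scoring approaches above can be sketched in a few lines. This is a minimal illustration, not the publisher's scoring procedure: the function names, the raw score, and the norm-group mean and SD are hypothetical values chosen for the example.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ: mental age divided by chronological age, times 100."""
    return 100 * mental_age / chronological_age


def deviation_iq(raw_score, norm_mean, norm_sd, iq_mean=100, iq_sd=15):
    """Deviation IQ: z-score within the same-age norm group, rescaled."""
    z = (raw_score - norm_mean) / norm_sd
    return iq_mean + iq_sd * z


# A 10-year-old performing like a typical 12-year-old.
print(ratio_iq(12, 10))  # 120.0

# Hypothetical raw score one SD above the same-age norm mean.
print(deviation_iq(60, norm_mean=50, norm_sd=10))            # 115.0 (SD = 15)
print(deviation_iq(60, norm_mean=50, norm_sd=10, iq_sd=16))  # 116.0 (older SD = 16)
```

The same relative standing (one SD above the mean) maps to 116 on the older scale and 115 on the current one, which is why the SD in use matters when comparing scores across editions.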

Fifth edition

  • For 2 to 85+ year olds
  • Measures fluid reasoning, knowledge, quantitative reasoning, visual-spatial processing, and working memory (table 10-2).
  • 10 subtests, each about 5 minutes (scaled scores: M = 10; SD = 3)
  • Scores can be used to create various composites: factor index; nonverbal and verbal IQ; abbreviated battery IQ; and FSIQ

Uses of SB5

  • Diagnose a range of developmental disabilities and exceptionalities
  • Clinical and neuropsychological assessment
  • Early childhood assessment
  • Psychoeducational evaluations for special education placements
  • Adult social security and workers' compensation evaluations
  • Providing information for interventions such as IFSPs, IEPs, career assessment, industrial selection, and adult neuropsychological treatment
  • Forensic contexts
  • Research on abilities and aptitudes

Validity evidence for SB5

Normative sample: 4,800 individuals between 2 and 85+ years (to match the 2000 U.S. Census)

Bias reviews were conducted on all items for the following variables: gender, ethnicity, culture, religion, region, and socioeconomic status.

Co-normed with the Bender Visual-Motor Gestalt Test, Second Edition and the Test Observation Form

For the FSIQ, NVIQ, and VIQ, reliabilities range from .95 to .98.

Reliabilities for the Factor Indexes range from .90 to .92.

For the 10 subtests, reliabilities range from .84 to .89.
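One common way to read these reliabilities is through the standard error of measurement, SEM = SD × sqrt(1 − reliability). The sketch below assumes classical test theory and the score scales mentioned earlier (SD = 15 for IQ composites, SD = 3 for subtests). The resulting values are illustrative and are not taken from the SB5 technical manual.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """Classical test theory SEM: sd * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)


# IQ composites (assumed SD = 15) at the reported reliability bounds.
for r in (0.95, 0.98):
    print("composite", r, round(standard_error_of_measurement(15, r), 2))

# Subtests (assumed SD = 3) at the reported reliability bounds.
for r in (0.84, 0.89):
    print("subtest", r, round(standard_error_of_measurement(3, r), 2))
```

Higher reliability means a smaller SEM, so a composite score is a more precise estimate than any single subtest score on its own scale.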

Concurrent and criterion validity data were obtained using the SB-IV, SB-LM, WJ III, UNIT, Bender-Gestalt II, WPPSI-R, WAIS-III, WIAT®-II, and WISC-III.

WISC-IV

Provides IQ scores and critical clinical insights into a child's cognitive functioning.

Measures verbal comprehension; perceptual reasoning; working memory; and processing speed (table 10-5)

Norming: consisted of 2,200 children between the ages of 6 and 16:11 years. The sample was stratified on age, sex, parent education level, region, and race/ethnicity.

What kind of validity evidence is provided?

Measurement check

Will the reliability and validity statistics reported in a technical manual be applicable to you as a test administrator?

If they report coefficient alpha of 0.95, what will your coefficient alpha be?
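To make the question concrete, here is a minimal sketch of coefficient alpha computed on two groups. The item responses are hypothetical (four items scored 0/1), and the function name is illustrative. The point is only that alpha is a property of scores from a particular sample, not of the test itself.

```python
from statistics import variance


def coefficient_alpha(scores):
    """Cronbach's alpha; scores is a list of examinee rows of item scores."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical heterogeneous sample: total scores range from 0 to 4.
full_sample = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

# Hypothetical homogeneous subgroup: only the middle scorers.
middle_scorers = full_sample[1:4]

print(round(coefficient_alpha(full_sample), 2))
print(round(coefficient_alpha(middle_scorers), 2))
```

With these made-up data, alpha is roughly .83 in the full sample and roughly .44 among the middle scorers, even though the items are identical. A group that is more homogeneous than the norming sample will generally yield a lower alpha.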

If the correlation between the IQ score obtained from the Stanford-Binet and a GRE score is reported as .75, will you have that same correlation for your group of students?
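One reason the answer may be "no" is range restriction. The simulation below is a sketch under stated assumptions: scores are standardized and bivariate normal with a population correlation of .75, and "your group" is defined, hypothetically, as only those examinees above the mean on the first measure. The seed and sample size are arbitrary.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation computed from sums of squares and cross-products."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


random.seed(411)  # arbitrary seed so the run is repeatable
rho = 0.75        # assumed population correlation

pairs = []
for _ in range(5000):
    iq_z = random.gauss(0, 1)
    # Second score shares rho of its variation with the first.
    gre_z = rho * iq_z + math.sqrt(1 - rho ** 2) * random.gauss(0, 1)
    pairs.append((iq_z, gre_z))

full_r = pearson_r(*zip(*pairs))

# Hypothetical selected group: only examinees above the mean on the first score.
selected = [p for p in pairs if p[0] > 0]
restricted_r = pearson_r(*zip(*selected))

print(round(full_r, 2), round(restricted_r, 2))
```

The full-sample correlation lands near .75, while the correlation in the selected group is noticeably lower, because restricting the range of one variable attenuates the observed correlation.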

Comparison and other tests

  • Both purport to measure intelligence

  • Highly correlated, differ by amount of g

  • Both work within the CHC model, though Wechsler favors g

  • Both represent the gold standard

  • Kaufman tests focus on processing, not structure

Group tests

  • The U.S. Army developed tests for recruits in WWI

  • Alpha for those who could read; Beta for those who couldn't

  • Assigned duty and service based on performance

  • Tests were used after the war because they were much cheaper

  • Later, Army General Classification Test and Armed Services Vocational Aptitude Battery

  • Also used in the schools in the USA for placement (not as much now)

What is the purpose of school and education?


How do we use tests in education?

In the United States, historically, the purpose of education has evolved according to the needs of society. Education's primary purpose has ranged from instructing youth in religious doctrine, to preparing them to live in a democracy, to assimilating immigrants into mainstream society, to preparing workers for the industrialized 20th century workplace.
I think that my view, and most people's view, is that the purpose of education is to support children in developing the skills, the knowledge, and the dispositions that will allow them to be responsible, contributing members of their community—their democratically-informed community. Meaning, to be a good friend, to be a good mate, to be able to work, and to contribute to the well-being of the community.

What is the difference between an achievement and an aptitude test?

How do their uses differ?

Is it possible to write an item that measures achievement and not aptitude or vice versa?

Diagnostic tests

  • May consist of multiple subtests
  • Designed to identify the missing knowledge/skill
  • Typically, easier than evaluative tests
  • Don't answer why the knowledge/skill is missing
  • Often focus on reading and mathematics

What would an item on a reading diagnostic test look like?

What would an item on a mathematics diagnostic test look like?

Woodcock Reading Mastery Tests-Revised

measures: reading readiness, achievement, and difficulties

norm: 3,300 individuals, nationally representative of the USA.

target: 4.5 to 80 year olds

subtests include letter identification, word identification, word attack, word comprehension, passage comprehension, phonological awareness, listening comprehension, oral reading fluency

Stanford Diagnostic Reading Tests

Instead: GRADE

Stanford Diagnostic Mathematics Test

Instead: GMADE

KeyMath3

Psychoeducational batteries

  • Measure abilities related to success
  • Measure educational achievement
  • Used for normative comparison and to plan interventions

How would an item here differ from those for a diagnostic test?

Kaufman Assessment Battery for Children

  • Measures intelligence and achievement
  • The Kaufmans focus on the information-processing aspect of intelligence
    • Simultaneous - all at once
    • Sequential - processing in a series
  • Table 11-3
  • Unclear factor structure
  • Also can be used with the CHC model ... but how?

Woodcock-Johnson IV

Performance

  • Performance task - a work sample designed to elicit representative knowledge, skills, and values from a domain of study
  • Performance assessment - evaluation of these tasks
  • How might we use performance assessment in class? HR?

Portfolio

  • What is a portfolio and what are some examples of a portfolio?

  • A sample of your work

  • How might we use a portfolio in class? HR?

  • What are some ways you use a portfolio?

  • Major issue, potential subjectivity in scoring

Authentic Assessment

  • A form of performance assessment is authentic assessment
  • A task that evaluates your ability to transfer knowledge from the classroom to the real world

  • What have we done in class that is this type of an assessment?

  • Major issue, could be affected by what you already know

Peer appraisal and other measurements in education

peers assign a score or ranking to you

"Which student would you rather work on a class project with?"

"Which student is the most popular?"

these are often dynamic
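A peer nomination item like the ones above can be scored by counting how often each student is named. The sketch below uses hypothetical names and a single nomination per student. Real sociometric instruments may allow several nominations or ratings.

```python
from collections import Counter

# Hypothetical responses to "Which student would you rather work on a
# class project with?" Keys are the raters; values are the peers they named.
nominations = {
    "Ana": "Ben",
    "Ben": "Cruz",
    "Cruz": "Ben",
    "Dee": "Ben",
    "Eli": "Ana",
}

# Tally how many times each student was nominated.
counts = Counter(nominations.values())

# Rank students from most to least nominated.
for student, n in counts.most_common():
    print(student, n)
```

Because these counts depend on who is in the group at the time, repeating the same item later in the term can produce a different ranking.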

Other inventories measure study habits, interests, and attitudes