In this course we have only one seminar per week, on Friday afternoon. In addition there is a supporting computer lab session on Friday morning. There is no lab session in the first week.
The emphasis is on self-study, assignments, and "peer review".
The language of instruction is Dutch.
Assessment is based on the weekly assignments and reviews (70%) and the final assignment (30%).
The first partial grade is based on your assignments, reviews, and participation in class, all taken together. You therefore do not receive a separate grade for each week! This means you have to monitor your own progress, using other students' worked answers, the reviews, and the discussions during the seminar.
The final assignment determines 30% of your final grade. Final grades are not expected to be announced until after the end of block 2.
Let us assume that we have 2 observations for each of 5 persons. These observations are about the perceived body weight, as judged by two 'raters' or judges, x1 and x2.
The data are as follows:
person  x1  x2
1       60  62
2       70  68
3       70  71
4       65  65
5       65  63
Because we have only two measures (variables), there is only one pair of measures to compare in this example. Very often, however, there are more than two judges involved, and hence many more pairs.
First, let us calculate the correlation between these two variables x1 and x2. This can be done in SPSS with the Correlations command (Analyze > Correlate > Bivariate, check Pearson correlation coefficient).
This yields r=.904, and the average r (over 1 pair of judges) is the same.
If you need to compute r manually, one method is to first convert x1 and x2 to Z-values [(x-mean)/s], yielding z1 and z2. Then r = SUM(z1×z2) / (n-1).
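The manual method above can be sketched in a few lines of Python. This is only an illustration using the standard library; the function names are made up for this example.

```python
from statistics import mean, stdev

# Judgements of the two raters for the 5 persons (data from the table above)
x1 = [60, 70, 70, 65, 65]
x2 = [62, 68, 71, 65, 63]

def z_scores(values):
    """Convert raw scores to Z-values: (x - mean) / s, with s the sample SD."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def pearson_r(a, b):
    """r = SUM(z1 * z2) / (n - 1)."""
    za, zb = z_scores(a), z_scores(b)
    return sum(p * q for p, q in zip(za, zb)) / (len(a) - 1)

r = pearson_r(x1, x2)
print(round(r, 3))  # 0.904
```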
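This value of r corresponds to a standardized Cronbach's Alpha of (2×.904)/(1+.904) = .950 (with N=2 judges).
Cronbach's Alpha can be obtained in SPSS by choosing Analyze > Scale > Reliability Analysis. Select the "items" (or judges) x1 and x2, and select model Alpha.
The output states: Reliability Coefficients [over] 2 items, Alpha = .9459 [etc.] This Alpha is computed from the variances and covariances of the raw scores rather than from r. It is slightly lower than the standardized value of .950 because the two judges differ in variance.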
If the same average correlation r=.904 had been observed over 4 judges (i.e. over 4×3/2 = 6 pairs of judges), then that would have indicated an even higher inter-rater reliability, viz. alpha = (4×.904)/(1+3×.904) = .974.
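A minimal Python sketch of the two ways of arriving at Alpha that play a role here: the formula based on the average correlation between judges, and the variance-based coefficient that a reliability analysis on raw scores reports. The function names are illustrative and are not SPSS terminology.

```python
from statistics import variance

x1 = [60, 70, 70, 65, 65]
x2 = [62, 68, 71, 65, 63]

def standardized_alpha(mean_r, k):
    """Alpha from the average inter-judge correlation: k*r / (1 + (k-1)*r)."""
    return k * mean_r / (1 + (k - 1) * mean_r)

def raw_alpha(*items):
    """Alpha from raw scores: k/(k-1) * (1 - sum of item variances / variance of sum)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))

alpha_2 = standardized_alpha(0.904, 2)   # 2 judges
alpha_4 = standardized_alpha(0.904, 4)   # same average r, but 4 judges
alpha_raw = raw_alpha(x1, x2)            # variance-based, from the raw data

print(round(alpha_2, 3), round(alpha_4, 3), round(alpha_raw, 4))
```

With these data the correlation-based formula gives about .950 for 2 judges and .974 for 4 judges, while the variance-based coefficient is .9459. The two versions coincide only when all judges have equal variances.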
Exactly the same reasoning applies if the data are not provided by 2 raters judging the same 5 objects, but by 2 test items "judging" a property of the same 5 persons. Both approaches are common in language research. Although SPSS only mentions items and inter-item reliability, the analysis is equally applicable to raters or judges, and inter-rater reliability.
Note that both judges (items) may be inaccurate. A priori, we do not know how good each judge is, nor which judge is better. We know, however, that their reliability of judging the same thing (true body weight, we hope) increases with their mutual correlation.
Now, let us consider the same data in a different context. We have one measuring instrument for the abstract concept x that we try to measure. The same 5 objects are measured twice (test-retest), yielding the data given above. In this test-retest context there is always just one correlation, and the idea of inter-rater reliability does not apply. We find that $r_{xx} = .904$.
This reliability coefficient is $r_{xx} = s_T^2 / s_x^2$, the ratio of the true-score variance to the total observed variance. It provides us with an estimate of how much of the total variance is due to variance in the underlying, unknown, "true" scores. In this example, 90.4% of the total variance is estimated to be due to variance of the true scores. The complementary part, 9.6% of the total variance, is estimated to be due to measurement error. If there were no measurement error, then we would predict perfect correlation (r=1); if the measurements contained only error (and no true score component at all), then we would predict zero correlation (r=0) between x1 and x2.
In this example, we find that
$$ s_e = s_x \sqrt{1 - r_{xx}} = \sqrt{15.484} \times \sqrt{.096} = 1.219 $$
Check:
$$ s_x^2 = 15.484 = s_T^2 + s_e^2 = s_T^2 + (1.219)^2 , $$
so
$$ s_T^2 = 15.484 - 1.486 = 13.998 $$
and indeed
$$ r_{xx} = .904 = s_T^2 / s_x^2 = 13.998 / 15.484 . $$
Supposedly, x1 and x2 measure the same property x. To obtain s2x, the total observed variance of x (as needed above), we cannot use x1 exclusively nor x2 exclusively. The total variance is obtained here from the two standard deviations:
$$ s_x^2 = s_{x_1} \times s_{x_2} = 4.18330 \times 3.70135 = 15.484 $$
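The decomposition of the total variance into true-score variance and error variance can be retraced with a short Python sketch. It follows the calculation in the text, including the choice to take the total variance as the product of the two standard deviations; the variable names are only illustrative.

```python
from math import sqrt
from statistics import stdev

x1 = [60, 70, 70, 65, 65]
x2 = [62, 68, 71, 65, 63]
r_xx = 0.904                      # test-retest reliability, as reported in the text

s2_x = stdev(x1) * stdev(x2)      # total observed variance, as defined in the text
s_e = sqrt(s2_x) * sqrt(1 - r_xx) # standard error of measurement
s2_T = s2_x - s_e ** 2            # estimated true-score variance

print(round(s2_x, 3), round(s_e, 3), round(s2_T, 3))
print(round(s2_T / s2_x, 3))      # recovers r_xx
```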
In general, a reliability coefficient smaller than .5 is regarded as low, between .5 and .8 as moderate, and over .8 as high.
session 3 (continued)
Assignments:
Hand in your answers and worked solutions to these assignments as described above.
As always, write clearly, correctly, and concisely.
-
Answer the following questions:
Ferguson & Takane, Chapter 24: Exercises 1, 2.
-
We have constructed a test consisting of 4 items, with an average inter-item correlation of 0.4.
a. How many inter-item correlations are there, between 4 items? (Ignore the trivial correlation of an item with itself.)
b. Compute the Cronbach Alpha reliability coefficient of this test of k=4 items.
Now we add a new 5th item.
c. How many new inter-item correlations are added to the correlation matrix when a 5th item is added to the test?
Unfortunately the coding of this item happens to be incorrect, that is, the scale was reversed for this new item. The inter-item correlation of this 5th item with each of the 4 older items is -0.4 (note the negative sign).
d. What is the average inter-item correlation after adding this 5th test item?
e. Compute the Cronbach Alpha coefficient of the longer test of k=5 items.
f. Compare and discuss the reliability and usefulness of the shorter and of the longer test.
-
A student weighs an object 6 times. The object is known to weigh 10 kg. She obtains readings on the scale of 9, 12, 5, 12, 10, and 12 kg. Describe the systematic error and the random errors characterizing the scale's performance.
Adapted from: R.L. Rosnow & R. Rosenthal (2002). Beginning Behavioral Research: A conceptual primer (4th ed.). Upper Saddle River, NJ: Prentice Hall. Ch.6, Q.7, p.159.
session 5: 8 Jan 2009 (week 7)
ANOVA: principles, one-way and multi-way analyses, interaction, fixed vs random factors, error terms.
Reading:
RvH: Chapters 3, 4, 5, and 6.
Assignments:
-
In a study of cardiovascular risk factors, joggers who run at least 15 miles per week were compared with a control group described as "generally sedentary". Both men and women participated in this study. The design is a 2×2 between-subjects ANOVA, with Group and Sex as factors. There were 200 participants for each combination of factors. One of the dependent variables is the rate of heartbeat of a participant, after 6 minutes on a treadmill, expressed in beats per minute.
Data from this study are available here in SPSS format, or as plain text (the latter file contains variable names in the first line).
(a) What do you think of the construct validity? Please comment.
(b) Is it permissible to conduct an analysis of variance on these data? Justify your answer with relevant statistical considerations.
(c) Conduct a two-way ANOVA on these data.
(d) Write a summary of the results of this study, including the (partial) effect sizes η and η². Draw your conclusions clearly.
(e) From each cell (combination of factors), draw a random sample of n=20 individuals, out of the 200 in that cell. Explain how you have performed the random sampling. Repeat the two-way ANOVA on this smaller data set.
(f) Discuss the similarities and differences in results between the full-sample analysis in (c) and (d) and the subsample analysis in (e).
This exercise is adapted from:
Moore, D.S., & McCabe, G.P. (2003). Introduction to the Practice of Statistics (4th ed.). New York: Freeman. Example 13.8, pp.813-816.
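One possible way to draw the random sample per cell asked for in (e) is sketched below in Python. It is only an illustration: the column names (group, sex, id) and the stand-in rows are assumptions, not the layout of the actual data file, and a fixed seed is used so that the sampling can be reported and repeated.

```python
import random

def sample_per_cell(rows, n, seed):
    """Draw n rows at random, without replacement, from each Group x Sex cell."""
    rng = random.Random(seed)   # fixed seed makes the draw reproducible
    cells = {}
    for row in rows:
        cells.setdefault((row["group"], row["sex"]), []).append(row)
    return {cell: rng.sample(members, n) for cell, members in cells.items()}

# Stand-in rows with hypothetical column names: 200 participants per cell,
# identified only by an id (no measurements are invented here).
rows = [
    {"id": f"{group}-{sex}-{i}", "group": group, "sex": sex}
    for group in ("jogger", "control")
    for sex in ("female", "male")
    for i in range(200)
]

subsample = sample_per_cell(rows, n=20, seed=2010)
for cell, members in subsample.items():
    print(cell, len(members))
```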
-
In a fictitious study, the effect of a growing potion was investigated. The growing potion was administered in 5 different dosages (of 1, 3, 5, 10, and 20 units per day), to 10 men and to 10 women for each dosage, for 15 days.
The dependent variable is the increase in body length of a participant, after 15 days, in cm.
Data from this study are available here in SPSS format, or as plain text (the latter file contains variable names in the first line).
(a) Import these data into SPSS or a statistical package of your choice. Make a graph of the increase in body length, for each of the 10 conditions. (Hint: In SPSS use a "clustered boxplot".) Discuss what the graph shows.
(b) Conduct a two-way ANOVA on these data, with Sex and Dosage as two "fixed" factors. Include measures of effect size and of power in your report.
(c) What is the range of generalisation over dosages, in the ANOVA in (b)? Discuss the external validity of the dosage factor.
(d) Conduct a two-way ANOVA, but now with Dosage as "random" factor. (Hint: SPSS does not handle "mixed" models like this one very well. It's probably easiest to calculate the F-ratios by hand, using the ANOVA results obtained under (b) above.)
(e) What is now the range of generalisation over dosages, in the ANOVA in (d)? Again discuss the external validity of the dosage factor.
(f) Discuss the similarities and differences in results between the two ANOVAs in this assignment. Does the growing potion have a different effect on men and women?
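The hand calculation hinted at in (d) can be sketched as follows. The sketch assumes the common "restricted" mixed-model convention for a two-way design with Sex fixed and Dosage random: the fixed factor is tested against the interaction mean square, while the random factor and the interaction are tested against the within-cell mean square. Check your textbook for the convention it uses. The mean squares below are invented placeholders, not results from the data set, and the function name is illustrative.

```python
def mixed_model_f_ratios(ms, levels_fixed, levels_random, n_per_cell):
    """Return {effect: (F, df_numerator, df_denominator)} for a two-way
    design with one fixed and one random factor (restricted model)."""
    df_fixed = levels_fixed - 1
    df_random = levels_random - 1
    df_inter = df_fixed * df_random
    df_within = levels_fixed * levels_random * (n_per_cell - 1)
    return {
        # fixed factor: error term is the interaction
        "Sex": (ms["Sex"] / ms["Sex x Dosage"], df_fixed, df_inter),
        # random factor and interaction: error term is within-cell variance
        "Dosage": (ms["Dosage"] / ms["Within"], df_random, df_within),
        "Sex x Dosage": (ms["Sex x Dosage"] / ms["Within"], df_inter, df_within),
    }

# Placeholder mean squares, for illustration only (NOT from the real data)
placeholder_ms = {"Sex": 40.0, "Dosage": 120.0, "Sex x Dosage": 8.0, "Within": 2.0}

ratios = mixed_model_f_ratios(placeholder_ms, levels_fixed=2, levels_random=5, n_per_cell=10)
for effect, (f, df1, df2) in ratios.items():
    print(f"{effect}: F({df1}, {df2}) = {f:.2f}")
```

Note how the test of Sex now has only 4 denominator degrees of freedom instead of 90, which is the price of generalising over dosages.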
session 6: Fri 15 Jan (week 8)
ANOVA: Repeated Measures, minF'
Reading:
-
RvH: Chapter 8
-
my manual about ANOVA in SPSS, covering the same ground as RvH Chapter 8.
Links:
Also have a look at these pages from related courses on research methods at other universities:
Assignments:
-
Answer the following questions from the chapter you have read:
§8.15: Exercises 1, 2, 3, 4, 5.
session 7: 22 Jan (week 9)
Multiple regression, multivariate analyses.
Reading:
-
Moore, D.S., & McCabe, G.P. (2003). Introduction to the Practice of Statistics (4th ed.). New York: Freeman. [ISBN 0-7167-9657-0]. Chapter 11 "Multiple Regression", pp. 708-745.
[book companion website].
-
optional: Devore, Jay & Peck, Roxy (2001) Statistics: The Exploration and Analysis of Data (4th ed.). Pacific Grove, CA: Duxbury. [ISBN 0-534-35867-5.] Chapter 14 "Multiple Regression Analysis", pp. 553-610.
-
optional: chapter on Multiple Regression, from the excellent online statistics textbook at StatSoft, Inc.
Assignments:
There are no compulsory assignments and no peer review for this session.
You may, however, do the following assignments for yourself and hand them in as your final assignment!
-
Answer the following questions:
Moore & McCabe, Chapter 11: Exercises 2, 3, 16, 33.
Data for the last two questions are available here in plain text format (the first line of this file contains variable names).
Forward or Backward?
For questions 16 and 33 the FORWARD method is most appropriate. This means that you start with an empty model (only intercept b0) to which predictors are added step by step. After each addition of a predictor, you check whether the model performs significantly better than before (e.g. by checking whether R² increases). The questions are about the increment in R² by adding a predictor. The relevant information is easier to find in the SPSS output if you specify the FORWARD method.
As a bonus, you could check what happens if you exclude case #51 from the data set, e.g. by marking it as a missing value. This is quite easy if you keep the regression command in a Syntax window for repeated use.
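The idea of forward selection can be sketched in plain Python as below. This is a simplified illustration, not a reproduction of the SPSS procedure: SPSS decides on entry with a significance test, whereas this sketch simply stops when the gain in R² drops below an arbitrary threshold (min_gain). The toy numbers are made up for the demonstration and have nothing to do with the Moore & McCabe data.

```python
def r_squared(columns, y):
    """R2 of a least-squares fit of y on an intercept plus the given columns."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix
    aug = [
        [sum(row[r] * row[c] for row in rows) for c in range(p)]
        + [sum(row[r] * yi for row, yi in zip(rows, y))]
        for r in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    coefs = [aug[r][p] / aug[r][r] for r in range(p)]
    fitted = [sum(b * x for b, x in zip(coefs, row)) for row in rows]
    y_mean = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

def forward_selection(predictors, y, min_gain=0.01):
    """Start from the intercept-only model; repeatedly add the predictor
    that raises R2 most, until the gain falls below min_gain."""
    selected, r2 = [], 0.0
    while len(selected) < len(predictors):
        candidates = [name for name in predictors if name not in selected]
        scores = {
            name: r_squared([predictors[s] for s in selected + [name]], y)
            for name in candidates
        }
        best = max(scores, key=scores.get)
        if scores[best] - r2 < min_gain:
            break
        print(f"add {best}: R2 {r2:.3f} -> {scores[best]:.3f}")
        selected.append(best)
        r2 = scores[best]
    return selected, r2

# Made-up toy data: y depends almost entirely on predictor "a"
toy = {"a": [1, 2, 3, 4, 5, 6], "b": [2, 1, 2, 1, 2, 1]}
y = [4.1, 6.9, 10.2, 12.8, 16.1, 18.9]
selected, r2 = forward_selection(toy, y)
print(selected)
```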
HSS, SAT, GPA??
The chapter by Moore & McCabe draws heavily on typically American concepts. In the USA, your achievements are all that counts, in life as well as in study. The US grading system ranges from A+ (excellent) to F (fail).
For admission to a university, two things are taken into account:
(a) your average grades in the final years of high school in mathematics, science, and English (HSM, HSS, HSE), and
(b) your score on a national admissions exam, the Scholastic Aptitude Test (SAT), which is somewhat comparable to the Dutch CITO test.
Top-class universities, like Harvard, Yale, Stanford, etc., use both parameters in selection. You have to be the best in your class (but your classmates are strongly competing for this honor), plus you need a minimum score on your SAT.
During your academic study, all your grades and results contribute to your Grade Point Average (GPA), a weighted average grade. This GPA is generally used as an indication of academic achievement and success. The authors attempt to predict the GPA from the previously obtained indicators (a) and (b).
regression
Why is it "regression"? This has to do with heredity, the field of biology where regression was first developed by Francis Galton (cousin of Charles Darwin) in the late 19th century.
Take a sample of fathers, and note their body length (X). Wait for one full generation, and measure the body length of each father's oldest adult son (Y). Make a scattergram of X and Y. The best-fitting line through the observations has a slope of less than 1 (typically about .65). This is because the sons' length Y tends to "regress to the mean": outlier fathers tend to produce average sons, and average fathers also tend to produce average sons. Galton called this phenomenon "regression towards mediocrity". Thus the best-fitting line is a "regression" line because it shows the degree of regression to the mean, from one generation to the next. (Note that any slope larger than 0 suggests a hereditary component in the sons' body length, Y.)
Questions: Which variable has the larger variance, X or Y? Does the variation in body length increase or decrease (regress) over generations? Why?
partial correlation
The partial correlation between $X_1$ and $X_2$, with $X_3$ removed from both, is given by:
$$ r_{12.3} = \frac{r_{12} - r_{13}\, r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}} $$
- Ferguson, G. A., & Takane, Y. (1989). Statistical Analysis in Psychology and Education (6th ed.). New York: McGraw-Hill. p495.
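As a small illustration, the partial correlation formula translates directly into Python. The function name and the example correlations are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import sqrt

def partial_correlation(r12, r13, r23):
    """Correlation between X1 and X2 with X3 removed from both."""
    return (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

# Hypothetical example: all three pairwise correlations equal .5
r12_3 = partial_correlation(0.5, 0.5, 0.5)
print(round(r12_3, 3))  # 0.333
```

If X3 is uncorrelated with both X1 and X2, the partial correlation equals the ordinary correlation between X1 and X2.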
final assignment
There are two options for the final assignment.
First, you can hand in a revised version of an earlier assignment from this course. You may choose which assignment you want to revise.
The revised paper should as far as possible be a flowing narrative, not a collection of loose sentences and statistical output.
In the revised version you must incorporate your reviewer's comments, insofar as you agree with them. Also use the reading materials and external references where available.
You can discuss the reviewer's remarks in your own revised text. But you may find it easier to write a revised, coherent text plus a separate document in which you discuss the reviewer's remarks: which ones you have adopted, which ones you have not, and why not. (This is called a "cover letter".)
Second, you may hand in the assignments for session 7 as your final assignment.
Work your answers into a fluent and coherent argument, not a collection of loose sentences and statistical output.
The deadline is Friday 29 January 2010, 23:59 h.
-
Butler, Ch. (1985) Statistics in Linguistics. s.l.: Blackwell. [out of print, but see the web version].
-
Carver, R.H. & Nash, J.G. (2005) Doing Data Analysis with SPSS version 12.0. Belmont, CA: Brooks/Cole. ISBN 0-534-46551-x.
-
Gelman, A. & Hill, J. (2007) Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge: Cambridge University Press. ISBN 978-0-521-68689-1.
-
Maxwell, S.E. & Delaney, H.D. (2004) Designing Experiments and Analyzing Data: A model comparison perspective (2nd ed.). Mahwah, NJ: Lawrence Erlbaum Associates. ISBN 0-8058-3718-3. [very good, but not an easy book].
-
Kirkpatrick, L.A. & Feeney, B.C. (2005) A Simple Guide to SPSS for Windows/ for Version 12.0. Belmont, CA: Thomson Wadsworth. ISBN 0-534-61006-4.
-
Levin, I.P. (1998) Relating Statistics and Experimental Design: An introduction. Thousand Oaks, CA: Sage. Sage University Papers Series on Quantitative Applications in the Social Sciences; 07-125. ISBN 0-7619-1472-2.
-
Rosenthal, R., & Rosnow, R.L. (2008). Essentials of Behavioral Research: Methods and Data Analysis (3rd ed.). Boston: McGraw Hill. ISBN 0-07-353196-0.
-
Rosenthal, R., Rosnow, R.L., & Rubin, D.B. (2000) Contrasts and Effect Sizes in Behavioral Research: A correlational approach. Cambridge: Cambridge University Press. ISBN 0-521-65980-9.
-
StatSoft, Inc. (2004) Electronic Statistics Textbook. Tulsa, OK: StatSoft.
URL: http://www.statsoft.com/textbook/stathome.html [clear and concise chapters about most statistical topics].
-
Also check the hyperlinks listed under session 1.
-
Also check the webpage of my statistics course [in Dutch].
© 2003-2010
HQ
2010.01.07