The George Washington University Graduate School of Education and Human Development
Quantitative Research Methods: New Doctoral Student Self-Assessment
 
Module 13:
Critiquing Education and Social Science Research
This section assesses skill in judging the strengths and weaknesses of a research study and in drawing generalizations from its results in light of those strengths and weaknesses.

Consider again the experiment described in Module 12. It is restated in full as follows: You randomly sample 72 students from a high school of 1,200 and randomly assign 36 to the school's normal mathematics instruction and 36 to a new artificial intelligence computerized tutorial system. Each group receives the same hours of instruction. At the end of the school year both groups are assessed with a well-developed and widely used standardized test of mathematics and with several questions about students' interests in mathematics. The normal instruction group had a mean score of 60 and a variance of 25 on the math test; the computerized tutorial group had a mean score of 65 and a variance of 16. The students were asked to respond to several questions about their interests in mathematics, including: "Students who like math are mostly: smart, cool, nerds, bad-looking." In response to that question, the frequency distribution of students in the traditional program was: 8, 0, 18, 10, respectively. The distribution for the students using the computerized tutorial was: 12, 10, 8, 6, respectively.
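The figures reported above can be restated in a few lines of Python. This is a minimal sketch and not part of the original assessment: the variable names are illustrative assumptions, and the pooled standard deviation and standardized mean difference are conventional summaries chosen here for illustration. The sketch only re-expresses the reported numbers; it does not address any of the validity questions that follow.

```python
import math

# Summary statistics exactly as reported in the scenario.
n_control, mean_control, var_control = 36, 60.0, 25.0   # normal instruction
n_tutor, mean_tutor, var_tutor = 36, 65.0, 16.0         # computerized tutorial

# Pooled variance across the two groups (assumed summary, not from the source).
pooled_var = (
    (n_control - 1) * var_control + (n_tutor - 1) * var_tutor
) / (n_control + n_tutor - 2)
pooled_sd = math.sqrt(pooled_var)

# Raw and standardized difference in mean math scores.
mean_diff = mean_tutor - mean_control
standardized_diff = mean_diff / pooled_sd

# Responses to "Students who like math are mostly: ..." as reported.
categories = ["smart", "cool", "nerds", "bad-looking"]
control_counts = [8, 0, 18, 10]
tutor_counts = [12, 10, 8, 6]

def row_percentages(counts):
    """Convert a list of counts to percentages of the row total."""
    total = sum(counts)
    return [100.0 * c / total for c in counts]

control_pct = row_percentages(control_counts)
tutor_pct = row_percentages(tutor_counts)

print(f"Mean difference: {mean_diff:.1f} points")
print(f"Pooled SD: {pooled_sd:.2f}")
print(f"Standardized difference: {standardized_diff:.2f}")
for name, c, t in zip(categories, control_pct, tutor_pct):
    print(f"{name:12s} normal: {c:5.1f}%   tutorial: {t:5.1f}%")
```

Expressing the counts as row percentages makes the two groups easier to compare by eye, since both groups have 36 respondents.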

Question 1:  What aspects of the sampling in this study limit the generalizations that can be made validly from it about the superiority of the computerized tutorial?
A.  None; it had random sampling, which is the best.
B.  The sample was drawn from only one school, whose students and teachers may have differed from other schools in ways that affected the results.
C.  There were no provisions to assure the sample was representative of the school, and thus you can't generalize to other schools.
D.  There were no provisions to assure that the students in the two groups were initially comparable, and thus it is inappropriate to generalize the results.
E.  None of the above

Question 2:  What aspects of the assignment to the experimental and control conditions in this study limit the generalizations that can be made validly about the superiority of the computerized tutorial?
A.  None; it had random assignment to the treatment and control groups, which is the best.
B.  Students in the control group were assigned to just one teacher, who might have been an ineffective teacher.
C.  Students in the control group were assigned the normal math curriculum for that school, but that curriculum may be less effective than others widely used in other schools.
D.  There is no information on how the computerized tutorial was delivered, and that might have affected its apparent effectiveness, for instance if one of the best math teachers in the school supervised its use.
E.  (B), (C), and (D).

Question 3:  No pre-measures of students' math skills were undertaken; math skills were assessed only at the end of the school year. How does that limit the generalizations that can be made validly about the superiority of the computerized tutorial?
A.  It improves the validity of generalizations (in comparison to pre- and post- measures), by eliminating possible interactions between the pre-test and the treatment.
B.  It imposes no serious limitation. The statistical analysis of the post-measures takes into account the possibility of initial differences in the groups that happen by chance from random assignment.
C.  It moderately reduces the validity of generalizations because we can't tell how much progress the students in each group made.
D.  It seriously reduces the validity of generalizations.
E.  None of the above

Question 4:  What aspects of the standardized mathematics test limit generalizations that can be made validly about the superiority of the computerized tutorial?
A.  Standardized tests are invalid measures of mathematics skills
B.  That test may not have aligned equally well with the "normal" curriculum and the computerized tutorial.
C.  That test may not align well with the other standardized tests and mandated state tests that are being used throughout the country to assess math performance.
D.  Both (A) and (B)
E.  Both (B) and (C)

Question 5:  What aspects of the assessment of students' interest in mathematics limit generalizations that can be made validly about the superiority of the computerized tutorial?
A.  There are no limits; it is admirable that interests, as well as achievement, were assessed.
B.  Insufficient information is given to decide.
C.  One question about interests cannot be a valid basis for generalization.
D.  Interests cannot be generalized.
E.  None of the above

Question 6:  Subsequently, the computerized tutorial is tested in 75 diverse schools across the country, against various standard curriculums, assessed with several different standardized tests, and assessed longitudinally as students take four years of instruction. The results are essentially the same as in the original experiment, with only modest variation from school to school, and with the magnitude of the differences increasing over each successive year. The computerized tutor results in more learning of mathematics and more interest in mathematics. Would we have sufficient basis to conclude that schools throughout the country should adopt the computerized tutor? Why?
A.  Yes, it is proven substantially more effective in a wide range of conditions.
B.  Not without knowing how much it costs to deliver the math instruction with the computerized tutor. If the total costs are twice those of conventional instruction, moderate gains may not be worth the additional costs.
C.  Not without knowing if there are other text-based or computer-based curriculums that might be even more effective than the computerized tutorial that was tested.
D.  No, computers can never replace what teachers do.
E.  Both (B) and (C)