acm - an acm publication

The reliability of GRE scores in predicting graduate school success
a meta-analytic, cross-functional, regressive, unilateral, post-kantian, hyper-empirical, quadruple blind, verbiage-intensive and hemorrhoid-inducing study

July 2005 | BY John Orlando

Studies of GRE Reliability
There have been numerous studies of the reliability of GRE scores as predictors of graduate school success (See Bibliography). Unfortunately, results span the range from finding little if any predictive validity (Morrison, Monahan, Wesche), to finding a strong correlation between GRE scores and graduate school achievement (Kingston, Harvancik, Kuncel).

The situation is complicated by the common tendency to use the results to advance political agendas. The National Association of Scholars, a conservative group fighting political correctness on campus, cites a study finding a strong correlation to press universities to use test scores in admissions, while Fairness in Testing cites numerous studies finding little correlation to fight the use of test scores on the grounds that they are discriminatory.

The issue is further muddled by questions about the criteria used to measure academic success. Some studies use first-year graduate GPA. Others use final GPA, and still others use the percentage of students who complete a program. Even this last measure creates problems. One study that measured success by completion of the graduate program did not distinguish between people who were dismissed for academic failure and people who left on their own for reasons such as job commitments or family emergencies (Nelson and Nelson). In other words, it is likely that academically competent persons who left for non-academic reasons were lumped into the group used to gauge failure. We know that this is not a trivial group, since adult students in particular face a far greater range of outside commitments that can interfere with their education.

Mich Kabay notes that studies often compare GRE scores to the GPAs of those who complete graduate programs. This restricts the range of subjects being tested by ignoring those who were never admitted in the first place, as well as those who drop out. This is an important oversight. If the issue were just distinguishing between those who will be highly successful in a graduate program and those who will be merely successful, then the tested group would be sufficient. But because the test is being used to make admissions decisions in the first place, the lack of information about applicants who were not admitted means that we cannot tell whether the test is being used to deny admission to those who would have been successful. Because of this weakness, the Educational Testing Service itself flatly states that "a cutoff score based only on GRE scores should never be used as a sole criterion for denial of admission" (www.gre.org/scoreuse.html).
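One well-known consequence of this kind of range restriction can be illustrated with a small simulation. The sketch below is not drawn from any of the studies cited here: the "true" correlation of 0.5, the 30 percent admission rate, and the normally distributed scores are all assumptions chosen only for illustration. It generates a pool of hypothetical applicants, admits the top scorers, and compares the score-outcome correlation in the full pool with the correlation among the admitted students alone.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=20000, true_r=0.5, admit_fraction=0.3, seed=0):
    """Return (r among all applicants, r among admitted applicants only).

    All parameters are hypothetical; none come from the GRE literature.
    """
    rng = random.Random(seed)
    scores, outcomes = [], []
    for _ in range(n):
        score = rng.gauss(0, 1)  # standardized test score
        noise = rng.gauss(0, 1)
        # Standardized "graduate success" correlated with score at true_r.
        outcome = true_r * score + (1 - true_r ** 2) ** 0.5 * noise
        scores.append(score)
        outcomes.append(outcome)

    # Admit only applicants at or above the score cutoff.
    cutoff = sorted(scores)[int(n * (1 - admit_fraction))]
    admitted = [(s, o) for s, o in zip(scores, outcomes) if s >= cutoff]

    r_all = pearson(scores, outcomes)
    r_admitted = pearson([s for s, _ in admitted], [o for _, o in admitted])
    return r_all, r_admitted


r_all, r_admitted = simulate()
print(f"all applicants: r = {r_all:.2f}")
print(f"admitted only:  r = {r_admitted:.2f}")
```

Under these assumptions the correlation among admitted students comes out noticeably lower than the correlation in the full applicant pool, which is one reason a study confined to enrolled students says little about how rejected applicants would have fared.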

Results of Studies
Taken as a whole, the evidence suggests that there is some correlation between GRE scores and graduate achievement. But there is widespread disagreement about the degree of correlation. The Educational Testing Service, which funds a considerable amount of research into the validity of the GRE, asserts only that "GRE General Test scores tend to show moderate correlations with first-year [GPA] averages" (ETS 1990). It also admits that there are "critical skills associated with scholarly and professional competence that are not currently measured by graduate admissions tests" (ETS 1989).

The devil is in the details when it comes to GRE validity studies, as relevant correlations are often embedded within distinctions in the data. For instance, there is a considerable amount of variability in the predictive validity of GRE scores between disciplines (Braun and Jones). This can partly be explained by the fact that the GRE is actually three separate tests: analytical, verbal, and quantitative. Different disciplines demand these skills in different degrees. Thus, predictive validity tends to improve when a particular test is matched to a particular discipline. One study concluded that the verbal score is the best predictor of success in majors that are "descriptive in nature," while the quantitative score is the best predictor for "symbol-oriented disciplines" (Kaiser). See Appendix I for a study that separates results by test and discipline. There are also subject matter tests offered by the ETS that are rarely required by admissions committees, but tend to be better predictors of success than any of the regular tests (ETS 1990). Because of this variation, the Educational Testing Service advises against using a composite score to judge applicants (www.gre.org/scoreuse.html).

Another interesting variable is type of admission. One study looked at the GRE as a predictor of success for regular admissions versus probationary admissions, that is, students admitted with academic credentials below what is normally required for admission. Not surprisingly, those admitted on probation tended to do less well (with success defined, in this case, as completing the program) than regular admissions. But for probationary students, quantitative and analytical scores best predict academic success, while verbal scores best predict success for regular admissions (Nelson and Nelson).

This study also split subjects by discipline to get an even more fine-grained analysis. Even here anomalies arose. For instance, among regularly admitted students, there was not a significant difference in GRE scores between those who succeeded and those who did not. In fact, in many of the disciplines those who scored better on the GRE were more likely to fail than those who scored worse (Nelson and Nelson: p. 9; see Appendix II). These disciplines include Applied Sciences, Education, Humanities, Life Sciences, and Social Sciences. The authors never speculate as to the source of this odd result, but it raises the possibility that the aforementioned inclusion of students who left the programs voluntarily within the set of "failures" might be skewing the results. Perhaps students who do very well in certain areas of the GRE tend to have more demanding jobs that are more likely to interfere with their studies, or are more likely to change jobs in the middle of their studies.

Finally, there are few studies on the difference between older and younger students in the predictive validity of GRE scores. One might hypothesize that older students will do less well on GRE tests than younger students because they have been away from the academic world longer and have gotten out of the practice of taking tests. One study found that there was actually little difference in the predictive validity of test scores across age groups (Clark). But another found that there "was a significant underprediction of first-year grade average for older females in all graduate fields. Although it had been predicted that they would do less well than younger students and about as well as males, they in fact earned considerably higher grades than all other groups" (Swinton). It merits note that the Educational Testing Service specifically cautions against giving too much weight to the test for those students "who are returning to school after an extended absence" (www.gre.org/scoreuse.html).

Recommendations
The Educational Testing Service has a variety of guidelines about the use of the GRE. If the test is to be used, a school is advised to examine these guidelines carefully. ETS publishes a Guide to the Use of Scores that can be downloaded free at www.gre.edu. The most important guidelines inform the recommendations below.

Given the well-documented problems with the GRE, if it is to be used at all in graduate school admission decisions, it should only be used as part of an "all things considered" judgment. As mentioned above, the Educational Testing Service advises against using it as a floor for admissions. ETS instead says that:

    Regardless of the decision to be made, multiple sources of information should be used to ensure fairness and balance the limitations of any single measure of knowledge, skills, or abilities. These sources may include undergraduate grade point average, letters of recommendation, personal statement, samples of academic work, and professional experience related to proposed graduate study.

ETS also advises against using a composite GRE score. It instead suggests that universities choose the scores from the three separate tests (analytical, quantitative, and verbal) that best map to the discipline the student is interested in entering. In light of the variation observed in results, ETS even suggests that each department conduct its own validity studies on the use of the GRE, and it will provide advice on the design of these studies without charge.

ETS also states that "small differences in GRE scores (as defined by the standard error of measurement) should not be used to make distinctions among examinees." All tests have some standard error of measurement, and ETS breaks it out by test. Details can again be found in the Guide to the Use of Scores.
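As a rough illustration of this guideline, the sketch below treats two scores as distinguishable only if the gap between them exceeds the standard error of the difference between two scores, which is the standard error of measurement (SEM) times the square root of two when the two measurement errors are independent. The SEM of 35 points and the one-standard-error threshold are placeholders chosen for illustration, not figures from ETS; the actual values for each test are in the Guide to the Use of Scores.

```python
import math


def differ_meaningfully(score_a, score_b, sem, z=1.0):
    """Return True if the gap between two scores exceeds z standard errors
    of the difference.

    sem is the standard error of measurement for a single score. The
    threshold z is an assumption, not an ETS rule.
    """
    # Each score carries its own measurement error; for independent errors
    # the standard error of the difference is sem * sqrt(2).
    se_difference = sem * math.sqrt(2)
    return abs(score_a - score_b) > z * se_difference


HYPOTHETICAL_SEM = 35  # placeholder value, not an ETS figure

print(differ_meaningfully(520, 540, HYPOTHETICAL_SEM))  # False: 20 points is within the error band
print(differ_meaningfully(450, 600, HYPOTHETICAL_SEM))  # True: 150 points is well outside it
```

On this reading, an admissions committee comparing applicants at 520 and 540 would have no basis in the scores alone for preferring one over the other.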

It is clear that the GRE is not required as a measure of likely graduate school success, and ETS never says that it is. Many schools do not require the test. Some studies suggest that undergraduate GPA is a better predictor of graduate achievement than GRE (Monahan). Given the wide disagreement in studies of GRE predictive validity, the best advice that emerges from the literature is that if the GRE is used, it should only be used as one measure among many.

Statistical Data
Appendices I and II contain the data from Nelson and Nelson's study of the GRE as a predictor of graduation rates for regular and probationary students in different disciplines. The legend is as follows:

Graduates: Students who completed their program.
Non-Graduates: Students who did not complete their program.
UG-GPA: Undergraduate GPA (reported for regularly-admitted students only).
9-Hr GPA: GPA after the first 9 hours of graduate work.
Final GGPA: Final graduate grade point average.
GRE-V: GRE verbal score.
GRE-Q: GRE quantitative score.
GRE-A: GRE analytical score.

(It's not clear what "final graduate GPA" means for students who did not graduate, unless it's the student's GPA at the time of leaving the program.)

Note the wide variation in predictive power of GRE scores between disciplines, with some disciplines actually showing that students who score lower on the GRE do better than those who score higher.


Appendix I
Comparison of Graduates and Non-Graduates
Who Began as Probationary Students

Table 1
ALL PROBATIONARY STUDENTS
             Graduates (N=258)   Non-Graduates (N=130)
9-Hr GPA     3.58                3.35
GRE-V        449                 421
GRE-Q        473                 468
GRE-A        494                 469
Final GGPA   3.61                3.29

Table 2
APPLIED SCIENCES
             Graduates (N=58)    Non-Graduates (N=35)
9-Hr GPA     3.65                3.47
GRE-V        418                 394
GRE-Q        455                 450
GRE-A        456                 442
Final GGPA   3.63                3.41

Table 3
COMMUNICATION SCIENCES
             Graduates (N=51)    Non-Graduates (N=18)
9-Hr GPA     3.38                3.20
GRE-V        445                 410
GRE-Q        471                 422
GRE-A        519                 467
Final GGPA   3.46                3.21

Table 4
EDUCATION
             Graduates (N=59)    Non-Graduates (N=30)
9-Hr GPA     3.70                3.65
GRE-V        423                 398
GRE-Q        418                 456
GRE-A        443                 436
Final GGPA   3.71                3.55

Table 5
HUMANITIES AND ARTS
             Graduates (N=11)    Non-Graduates (N=8)
9-Hr GPA     3.52                3.56
GRE-V        503                 437
GRE-Q        481                 490
GRE-A        505                 425
Final GGPA   3.64                3.39

Table 6
LIFE SCIENCES
             Graduates (N=16)    Non-Graduates (N=10)
9-Hr GPA     3.60                2.93
GRE-V        463                 463
GRE-Q        490                 455
GRE-A        534                 505
Final GGPA   3.70                2.90

Table 9
SOCIAL SCIENCES
             Graduates (N=23)    Non-Graduates (N=8)
9-Hr GPA     3.55                3.19
GRE-V        460                 461
GRE-Q        466                 431
GRE-A        520                 476
Final GGPA   3.57                3.16


Appendix II
Comparison of Graduates and Non-Graduates
Who Began as Regularly-Admitted Students

Table 1
ALL REGULARLY-ADMITTED STUDENTS
             Graduates (N=896)   Non-Graduates (N=239)
UG-GPA       3.29                3.20
9-Hr GPA     3.71                3.46
GRE-V        486                 500
GRE-Q        515                 511
GRE-A        547                 549
Final GGPA   3.76                3.42
Final GGPA   3.61                3.63

Table 2
APPLIED SCIENCES
             Graduates (N=74)    Non-Graduates (N=28)
UG-GPA       3.18                3.16
9-Hr GPA     3.65                3.47
GRE-V        446                 513
GRE-Q        497                 525
GRE-A        519                 548
Final GGPA   3.69                3.39

Table 3
COMMUNICATION SCIENCES
             Graduates (N=143)   Non-Graduates (N=55)
UG-GPA       3.22                3.13
9-Hr GPA     3.53                3.45
GRE-V        487                 494
GRE-Q        482                 476
GRE-A        529                 543
Final GGPA   3.65                3.45

Table 4
EDUCATION
             Graduates (N=61)    Non-Graduates (N=22)
UG-GPA       3.15                3.22
9-Hr GPA     3.83                3.71
GRE-V        446                 490
GRE-Q        483                 527
GRE-A        497                 555
Final GGPA   3.83                3.73

Table 5
HUMANITIES AND ARTS
             Graduates (N=102)   Non-Graduates (N=40)
UG-GPA       3.42                3.34
9-Hr GPA     3.78                3.46
GRE-V        555                 549
GRE-Q        529                 534
GRE-A        574                 587
Final GGPA   3.83                3.41

Table 6
LIFE SCIENCES
             Graduates (N=152)   Non-Graduates (N=20)
UG-GPA       3.29                3.08
9-Hr GPA     3.67                3.20
GRE-V        448                 437
GRE-Q        469                 498
GRE-A        516                 522
Final GGPA   3.70                3.24

Table 7
PHYSICAL SCIENCES
             Graduates (N=61)    Non-Graduates (N=19)
UG-GPA       3.20                3.20
9-Hr GPA     3.68                3.63
GRE-V        489                 483
GRE-Q        665                 617
GRE-A        614                 587
Final GGPA   3.75                3.57

Table 8
PSYCHOLOGY
             Graduates (N=266)   Non-Graduates (N=46)
UG-GPA       3.39                3.22
9-Hr GPA     3.79                3.34
GRE-V        505                 492
GRE-Q        540                 502
GRE-A        570                 522
Final GGPA   3.83                3.33

Table 9
SOCIAL SCIENCES
             Graduates (N=37)    Non-Graduates (N=9)
UG-GPA       3.09                3.29
9-Hr GPA     3.69                3.61
GRE-V        464                 516
GRE-Q        461                 401
GRE-A        518                 530
Final GGPA   3.73                3.22
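
The reversals mentioned earlier are easier to see when the graduate and non-graduate means are subtracted. The sketch below is only a reading aid: the numbers are transcribed from Appendix II, Tables 2 through 9, while the dictionary layout and the function names are choices made here for illustration, not anything taken from Nelson and Nelson. A negative gap means that non-graduates scored higher on average than graduates.

```python
# (graduates mean, non-graduates mean), transcribed from Appendix II, Tables 2-9
regular_admits = {
    "Applied Sciences":       {"GRE-V": (446, 513), "GRE-Q": (497, 525), "GRE-A": (519, 548)},
    "Communication Sciences": {"GRE-V": (487, 494), "GRE-Q": (482, 476), "GRE-A": (529, 543)},
    "Education":              {"GRE-V": (446, 490), "GRE-Q": (483, 527), "GRE-A": (497, 555)},
    "Humanities and Arts":    {"GRE-V": (555, 549), "GRE-Q": (529, 534), "GRE-A": (574, 587)},
    "Life Sciences":          {"GRE-V": (448, 437), "GRE-Q": (469, 498), "GRE-A": (516, 522)},
    "Physical Sciences":      {"GRE-V": (489, 483), "GRE-Q": (665, 617), "GRE-A": (614, 587)},
    "Psychology":             {"GRE-V": (505, 492), "GRE-Q": (540, 502), "GRE-A": (570, 522)},
    "Social Sciences":        {"GRE-V": (464, 516), "GRE-Q": (461, 401), "GRE-A": (518, 530)},
}


def score_gaps(table):
    """Graduate mean minus non-graduate mean, per discipline and per test."""
    return {
        discipline: {test: grads - non_grads for test, (grads, non_grads) in tests.items()}
        for discipline, tests in table.items()
    }


def lower_on_every_test(table_gaps):
    """Disciplines where graduates averaged lower than non-graduates on all three tests."""
    return [
        discipline
        for discipline, tests in table_gaps.items()
        if all(gap < 0 for gap in tests.values())
    ]


gaps = score_gaps(regular_admits)
for discipline, tests in gaps.items():
    row = "  ".join(f"{test} {gap:+4d}" for test, gap in tests.items())
    print(f"{discipline:<24}{row}")

print(lower_on_every_test(gaps))  # ['Applied Sciences', 'Education']
```

Because the tables report only means and group sizes, with no standard deviations, these gaps cannot be converted into significance tests or effect sizes from the data given here.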



Bibliography

"ERIC" refers to the Education Resources Information Center at www.eric.gov

Boldt, Robert F. (1986). Generalization of GRE General Test Validity across Departments. ERIC Document No. 281 865.

Bornheimer, D.G. (1984). Predicting Success in Graduate School Using GRE and PAEG Aptitude Test Scores. College and University, v. 60 (no. 1) pp. 54-62.

Braun, Henry I. & Jones, Douglas H. (1985). Use of Empirical Bayes Methods in the Study of the Validity of Academic Predictors of Graduate School Performance. ERIC Document No. 255 545.

Clark, Mary Joe. (1986). Test Scores and the Graduate Admission of Older Students. ERIC Document No. 271 498.

Enright, M. K. & Gitomer, D. (1989). Toward a Description of Successful Graduate Students. Princeton, NJ: Educational Testing Service.

Educational Testing Service. (1998). GRE Guide to the Use of Scores, 1998-1999. Princeton, NJ.

Educational Testing Service. (1989). Toward a Description of Successful Graduate Students. Princeton, NJ.

Fairtest (2001). Examining the GRE: Myths, Misuses, and Alternatives. www.fairtest.org/facts/gre.htm

GRE Validity Study Service (1990). Validity of the GRE: 1988-1989 Summary Report. Educational Testing Service, www.gre.org/respredict.html.

Goldberg, Edith L. & Alliger, George M. (1992). Assessing the Validity of the GRE for Students in Psychology. Educational and Psychological Measurement. v52, n4, p1019-27, Win 1992.

Harvancik, Mark J. & Golsan, Gordon. (1986). Graduate Record Exam Scores and Grade Point Average: Is There a Relationship? ERIC Document No. 270 682.

Hartnett, R. & Payton, B.F. (1977). Minority Admissions and Performance in Graduate Study: Preliminary Study of Fellowship Programs of the Ford and Danforth Foundations. New York: Ford Foundation.

Hebert, David J. & Holmes, Alan F. (1979). Graduate Record Examinations Aptitude Test Scores as a Predictor of Graduate Grade Point Average. Educational and Psychological Measurement, v39, n2, p415-20, Sum 1979.

Jacobson, R. L. (1993). Critics Say Graduate Record Exam does not measure qualities needed for success and is often misused. The Chronicle of Higher Education, March, pp. 27-28.

Kaiser, Javaid. (1982). The Predictive Validity of GRE Aptitude Test. ERIC Document No. 227 174, abstract.

Kingston, Neal M. (1985). The Incremental Validity of the GRE Analytical Measure for Predicting Graduate First-Year Grade-Point Average. ERIC Document No. 226 021.

Kuncel, Nathan R.; Hezlett, Sarah A., & Ones, Deniz S. (2001). A Comprehensive Meta-Analysis of the Predictive Validity of the Graduate Record Examinations: Implications for Graduate Student Selection and Performance. Psychological Bulletin 127 (1), 162-181.

Milner, M., McNeil, J. & King, S.W. (1984). The GRE: A Question of Validity in Predicting Performance in Professional Schools of Social Work. Educational and Psychological Measurement, vol. 44, pp. 945-950.

Monahan, Thomas C. (1991). Using Graduate Record Examination Scores in the Graduate Admissions Process at Glassboro State College. ERIC Document No. 329 183.

Morrison, T. & Morrison, M. (1995). A Meta-Analytic Assessment of the Predictive Validity of the Quantitative and Verbal Components of the Graduate Record Examination with Graduate Grade Point Averages Representing the Criterion of Graduate Success. Educational and Psychological Measurement, v. 55 (no. 2) pp. 309-316.

National Association of Scholars, (2002). The Validity of GRE Subject Tests. www.nas.org/publications/sci_newslist/6_5/c_gre_predicts.htm

Nelson, Jacquelyn & Nelson, C. Van. (1995). Predictors of Success for Students Entering Graduate School on a Probationary Basis. ERIC Document No. 388 206.

Onasch, C. (1994). Undergraduate Grade Point Average and Graduate Record Exam Scores as Predictors of Length of Enrollment in Completing a Master of Science Degree. ERIC Document No. 375 739.

Oltman, P.K. & Hartnett, R.T. (1984). The Role of the GRE General and Subject Test Scores in Graduate Program Admission. Princeton, NJ: Educational Testing Service.

Pennock-Roman, M. (1994). Background Characteristics and Future Plans of High-Scoring GRE General Test Examinees. Research report ETS-RR9412 submitted to EXXON Education Foundation, Princeton, NJ: Educational Testing Service.

Scott, R.R. & Shaw, M.E. (1985). Black and White Performance in Graduate School and Policy Implications For Using GRE Scores in Admission. Journal of Negro Education, v. 54, no.1, pp.14-23.

Sternberg, R. & Williams, W. (1997). Does the Graduate Record Examination Predict Meaningful Success in the Graduate Training of Psychologists? American Psychologist, v. 52 (no. 6), pp. 630-641.

Swinton, Spencer S. (1987). The Predictive Ability of the Restructured GRE with Particular Attention to Older Students. ETS Research Report 87-22.

Thornell, John G. & McCoy, Anthony. (1985). The Predictive Validity of the Graduate Record Examinations for Subgroups of Students in Different Academic Disciplines. Educational and Psychological Measurement, v45, n2, p415-19, Sum 1985.

Wesche, Lilburn E., et al. (1984). A Study of the MAT and GRE as Predictors of Success in M.Ed. Programs. ERIC Document No. 310 150.

Wilson, Kenneth M. (1986). The Relationship of GRE General Test Scores to First-Year Grades for Foreign Graduate Students: Report of a Cooperative Study. ERIC Document No. 281 862.

Wilson, Kenneth M. (1982). A Study of the Validity of the Restructured GRE Aptitude Test for Predicting First-Year Performance in Graduate Study. ERIC Document No. 240 122.

COMMENTS

Some reference to statistical significance and effect size would be very helpful for those of us prepared to understand them. Thanks for the article. Happy holidays.

— mfa, Wed, 28 Dec 2011 22:39:46 UTC
