Assessment Notes


September 23, 2009

Cogitating Aggravating Disaggregating

by Paul Sotherland

Disaggregating a summative data point, such as an institutional score on the Collegiate Learning Assessment (CLA), can reveal patterns in component data that some folks find affirming and others find aggravating. I will use experiences at Kalamazoo College to illustrate phenomena undoubtedly taking place at most institutions of higher education that are assessing whether students are attaining desired learning outcomes and that are using insights gained to improve undergraduate education.

When colleagues and I disaggregated Kalamazoo College’s CLA score and attempted to interpret patterns in data comprising that score in light of insights gained from the National Survey of Student Engagement (NSSE) (Sotherland, Dueweke, Cunningham, and Grossman, 2007), colleagues in the languages were pleased to see “their” work with students having a very positive effect on learning outcomes, whereas several colleagues in the natural sciences were less than pleased with the outcome revealed by our disaggregation and interpretation. Curiously, our science colleagues seemed to “take a surface approach . . . by dismissing this information as a special case and simply wrapping it around their current paradigm” (Bain & Zimmerman, 2009, ¶ 16). They critiqued the instruments (the CLA and the NSSE), they noted small sample sizes, and they questioned the validity of interpretations based on data lacking replication. We had hoped that all of our colleagues would take a “deep approach” (Bain & Zimmerman, ¶ 16) to what the CLA and NSSE revealed “by grappling with how this new information (could) irrevocably change their mental model, ultimately creating a new and deeper conceptual understanding” (Bain & Zimmerman, ¶ 16). How can taking a deep approach be fostered? Bain and Zimmerman suggest that providing opportunities to “grapple with the dissonance they encounter—to try, fail, receive feedback, and try again—before anyone makes a judgment of their efforts” can lead to deep learning and meaningful improvement (¶ 16).

We began providing opportunities to “grapple” by administering the CLA to a second cohort of seniors, comparing their disaggregated scores with those of first-year students (who had, by then, declared majors, allowing us to sort their scores by academic division), and then iteratively discussing our discoveries with colleagues. Administering the CLA to 2007 seniors yielded results that were indistinguishable from those of 2006 seniors. At the institution level, mean CLA scores of the 2006 and 2007 cohorts differed by less than 0.1%; and at the level of academic division, differences in “adjusted” CLA scores (Sotherland et al., 2007) were similar for both cohorts. These results helped address the sample size and replication critiques, but our science colleagues persisted by suggesting that students who were better at CLA-like tasks gravitated toward majoring in the languages, whereas students who were less able to do well on the same tasks ended up majoring in the sciences. CLA scores of first-year students, grouped by the academic division of their intended major, have begun to refute that claim as well.

After receiving the 2007 CLA data for Kalamazoo College seniors and finding patterns that were similar to those in CLA data for 2006 seniors, we reexamined CLA data for 2005 first-year students to see if, for example, students with higher CLA scores tended to gravitate to certain disciplines. Because the majors of the 2005 first-years (currently seniors) were now known, categorizing their incoming CLA scores by the academic division of their declared major and comparing those scores with CLA scores from seniors in the same academic division became possible. As shown in the graph below, we used students’ “adjusted” CLA scores (i.e., given their SAT scores, how far above or below expectation students performed on the CLA) to explore how academic trajectories over four years might account for variation in CLA performance. We found essentially no difference in first-year adjusted CLA scores among academic divisions, and on average, first-year students scored slightly below what was expected of them on the CLA. In contrast, there were marked and significant differences (revealed by one-way ANOVA) among academic divisions in how well seniors performed on the CLA. Moreover, there was a sizeable difference between adjusted CLA scores of first-years and those of seniors majoring in foreign languages, but only a small (and nonsignificant, as revealed by an independent-samples t-test) difference between adjusted CLA scores of first-years and seniors majoring in natural sciences. (NB: At Kalamazoo College’s mean SAT score (1250) for these students, the mean “expected” difference in nonadjusted CLA scores is about 110. Thus, for example, seniors majoring in the natural sciences scored about 110 points higher on the CLA than did first-year students with similar SAT scores, even though their adjusted CLA scores were not significantly different from those of first-year students. On the other hand, Kalamazoo College seniors majoring in foreign languages scored 110 plus about 200 points higher on the CLA than did first-years with similar SAT scores.) Therefore, the suggestion that students who are “good at the CLA” tend to gravitate toward the languages, made by colleagues in the sciences as a means of “explaining away” the disparity in CLA scores among academic divisions, seems unfounded.
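The arithmetic in the NB can be made explicit with a minimal sketch. It assumes that a nonadjusted senior-versus-first-year difference is simply the expected difference at a given SAT score plus the gap in adjusted scores. The function and variable names are hypothetical, the figures are the approximate ones quoted in the text, and treating the natural sciences gap as zero is an assumption based on that difference being nonsignificant.

```python
# Illustrative sketch only: names are hypothetical, and the numbers are the
# approximate values quoted in the text, not a re-analysis of the CLA data.

# Approximate "expected" first-year-to-senior difference in nonadjusted CLA
# scores at the mean SAT score (1250) of these students.
EXPECTED_DIFFERENCE = 110


def nonadjusted_difference(adjusted_gap, expected_difference=EXPECTED_DIFFERENCE):
    """Return the nonadjusted CLA-point difference between seniors and
    first-years with similar SAT scores.

    adjusted_gap is the seniors' adjusted score minus the first-years'
    adjusted score, in CLA points.
    """
    return expected_difference + adjusted_gap


# Natural sciences: the adjusted gap was small and nonsignificant,
# so it is treated as 0 here (an assumption).
natural_sciences = nonadjusted_difference(0)

# Foreign languages: the adjusted gap was about 200 points.
foreign_languages = nonadjusted_difference(200)

print(natural_sciences)   # 110
print(foreign_languages)  # 310
```

Read this way, natural sciences seniors still gained roughly the expected 110 points, while foreign languages seniors gained roughly 310.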

What now? Acknowledging what seems to be working well (e.g., performance of language majors on the CLA), figuring out what might catalyze growth in those students over four years (especially in light of the rubrics used to score the CLA; Benjamin et al., 2009), making improvements, and applying what is learned to the education of all students certainly seem prudent. Equally prudent, though initially more difficult, would be to “grapple with the dissonance” (in a supportive and nonjudgmental environment), come to grips with problems and potential solutions, try the solutions, assess outcomes, and grapple again. Doing this iteratively, while viewing outcomes from multiple perspectives (Shulman, 2007), will help us attend to the initial aggravation wrought by disaggregation and foster productive cogitation on what yields effective education.


I am grateful to Charlie Blaich for the nudge to write, Anne Dueweke for constructive criticism of the writing, and The Teagle Foundation for support to focus our attention on what’s important.

Literature Cited

Bain, K., & Zimmerman, J. (2009). Understanding great teaching. Peer Review, 11(2), 9–12. Available from the Association of American Colleges and Universities Web site.

Benjamin, R., Chun, M., Hardison, C., Hong, E., Jackson, C., Kugelmass, H., et al. (2009). Returning to learning in an age of assessment: Introducing the rationale of the Collegiate Learning Assessment. Available from the Collegiate Learning Assessment Web site.

Shulman, L. S. (2007). Counting and recounting: Assessment and the quest for accountability. Change, 39(1), 20–25. Available from the Carnegie Foundation for the Advancement of Teaching Web site.

Sotherland, P., Dueweke, A., Cunningham, K., & Grossman, R. (2007). Multiple drafts of a college’s narrative. Peer Review, 9(2), 20–23. Available from the Association of American Colleges and Universities Web site.


Paul Sotherland is a professor of biology at Kalamazoo College.