"You can teach students one lesson a day; but if you can teach them to learn by creating curiosity, they will continue the learning process for as long as they live." ~ adapted from Clay P. Bedford

Friday, March 18, 2011

What Are We Measuring and What's Worth Measuring?

ED-D 337B: Mini Inquiry Project - I apologize in advance for the length of this post; when I get started, sometimes I just can't stop!
In EDCI 431: Philosophical Foundations of Education, I argued that the educational practices set down by ancient Chinese philosopher Confucius and ancient Greek philosopher Plato were still instrumental to the current education system. For both Plato and Confucius, education was central to maintaining a moral society (Spring, 2008, p. 4). Although they had very different and conflicting views about morality, their goal of rejecting “any form of democracy involving mass participation in governance” (Spring, 2008, p. 4) was the same. They both created foundations for contemporary assessment, for the tracking concept on which the United States education system is based, and for the Socratic dialogue, which is the primary mode of communication and learning between student and teacher. While I’m grateful for most of their contributions to education, I question why their assessment philosophies are still relevant today. As society evolves and advances, shouldn’t education and assessment evolve as well? Why are standardized tests still used when we know that they are not an accurate representation of a student’s ability? Why do we continue to use assessment methods as sorting mechanisms instead of educative aids?
When I came across Zachary Stein, Theo Dawson and Kurt W. Fischer’s article, “Redesigning Testing: Operationalizing the new Science of Learning,” I thought it was too good to be true. It seems that there are contemporary researchers drawing on contemporary theories and studies to create contemporary, more appropriate ways of assessing a student’s ability. Their article outlines a possible alternative to both standardized tests and summative assessment methods; they propose that we redesign our current testing infrastructure and adopt a more comprehensive one that works for today’s students. Assessment is a reality that no teacher enjoys; however, with so many tests, and so much emphasis on accountability, we have to ask ourselves: what are we measuring, and what is worth measuring?
Confucius’ educational ideas were “central to imperial rule in China” (Spring, 2008, p. 4). During the 12th century, his ideas and texts were used as the core curriculum for the “imperial civil service examination system” (Spring, 2008, p. 4). These examinations served as a stepping stone for today’s standardized tests. The imperial civil service exams were originally a “method for recruiting talented commoners to government” services (Spring, 2008, p. 4). It is for this reason that exams were applied to the education system, thus making examinations an “avenue to wealth and prestige” (Spring, 2008, p. 4). With this in mind, fast forward to today’s educational system, where standardized tests and entrance exams are a common form of assessment. Tests, quizzes, assignments, and exams are used not only to measure a student’s knowledge, skill, and/or aptitude in a particular subject, but also to rank and compare one student’s ability against another’s.
This brings us to Confucius’ “Three Halls method,” which tied “education more closely to the government civil service examination and was intended to provide for greater social mobility” (Spring, 2008, p. 9). The Three Halls method ranked individuals based on performance. There were three halls: the “Outer Hall,” the “Inner Hall,” and the “Upper Hall.” Only the best students were sent to the “Upper Hall” or “the Imperial University where they would be prepared for the civil service examination and entrance into government service” (Spring, 2008, p. 9), while the lower-achieving students remained in their respective halls. Like Confucius’ system, Plato’s educational system is based on a hierarchy. The Philosopher-King is at the top and everyone else is ranked based on the “myths of metals.” This myth attempts to convince individuals that “they are born unequal in their abilities and that they should accept their social positions as determined by the education system” (Spring, 2008, p. 15). There are four categories: gold, silver, iron and brass; each of these metals dictates the role that an individual will play in society.
These methods bear a remarkable resemblance to the United States’ tracking system, which places students on certain tracks based on their performance on a standardized test. In the US, a high school student will sit down “to take a standardized test that will ultimately determine both her chances of graduation and the standing of her school” (Stein et al., in press, p. 2). Students who score higher on the standardized test will study higher mathematics, more foreign languages, and literature, while students on less academic tracks acquire vocational skills such as welding, typing, or cosmetology. One of the most unfortunate characteristics of the tracking system is that students are usually not offered the opportunity to take classes deemed more appropriate for another track, even if the student has demonstrated interest and/or ability in the subject. Is this fair? No. Is this reality? Yes.
Why is this reality? It seems that students in the United States who do poorly on standardized tests are punished because of their performance. Standardized tests are sorting mechanisms; they have been ever since the nineteenth century. It is as if the United States’ education system is clinging to the traditional foundations that have dictated its structure for years and is afraid of letting go. Thankfully, in Canada, standardized tests are not used nearly as often as they are in the US; however, there are still standardized tests, provincial exams, and large-scale tests used to rank and order students based on their performance.
All of these tests, of course, force teachers to teach to the exam, teach their students how to write exams, and teach students how to memorize for the exam. Students ultimately learn the exam instead of learning for learning’s sake; teaching to the test prepares “students for life as if it were a set of multiple-choice questions” (Stein et al., in press, p. 7). As well, the exam is designed to cover content as opposed to learning. At the moment, I would argue that the curriculum is preoccupied with content rather than learning. Facts, dates, plants and Shakespeare (among other things) are included in the high school curriculum, but to what end? Anyone can memorize various facts, but what good are facts – especially when the Google search engine can provide an answer to any question in less than two seconds? I think that critical thinking skills and real-life application are what’s important. What good are facts when a student can’t do anything with them? What good is learning about all the facts of WWII if a student can’t think critically, or even relate to the information to make meaning out of it? Is there room for critical thinking and real-life situations in the curriculum? Is there a compromise? Is there an answer?
Stein, Dawson and Fischer (in press) discuss an extremely intriguing compromise that combines a standardized test with formative assessment practices in the paper, “Redesigning Testing: Operationalizing the new Science of Learning.” The paper proposes that we redesign and build tests that are both standardized and formative; these tests should be grounded in research about learning, not based on content. Could this be the answer? Could this new assessment infrastructure work? Today’s education system is shaped by a complex standardized testing infrastructure; this infrastructure sorts students instead of aiding them in their education. The paper focuses on the dynamic learning process and the developmental patterns that characterize how individuals learn, thus creating the foundation for new assessment techniques. Fischer’s Dynamic Skill Theory and Dawson’s Lectical Assessment System are combined to create the Disco Test Initiative, which intends to shift the purpose of testing from sorting students to aiding their education.
Because our entire education system is predicated on a complex standardized testing infrastructure set down by ancient philosophers, it does not address the current needs of the students. The current system was put in place to ensure that the entire United States would be united by a “common language, culture, and ideals” (Gronlund & Cameron, 2004, p. 2). Schools were therefore “tasked with ensuring that this homogeneity was achieved” (Gronlund & Cameron, 2004, p. 2). As well, with technological advancement, “not all jobs required the same level of education” (Gronlund & Cameron, 2004, p. 2), which gave rise to the tracking system. Assessment and testing infrastructures should benefit the students and aid them in their goals, not discourage or sort them into categories that no longer exist. A testing infrastructure “based on research into the nature of learning will be better able to meet the challenges facing educational system in the 21st century” (Stein et al., in press, p. 3). Many aspects of the education system are changing so rapidly, and because of these changes,
the values that shape test reform efforts should transcend outdated dichotomies about the function of testing and the purposes of education – moving beyond unproductive either/or commitments: either tests as sorting mechanisms or tests as educative aids; either tests of competencies or tests of content; either tests to train the work force or tests to foster reflective citizens. (Stein et al., in press, p. 3)
I pose the questions again: what are we measuring and what is worth measuring? I completely agree with Stein et al. (in press) when they assert that “tests should be based on research about how students learn and guided by explicit commitments to re-shaping schools in positive new directions” (p. 3). Because the education system is currently grounded in standardized tests, reform and change may be difficult; however, according to Stein et al., there is a way to create standardized tests that serve as educative aids. By combining the advances made in psychometrics and cognitive developmental psychology, tests can be redesigned to be “broad and flexible” as well as standardized (Stein et al., in press, p. 3). The Disco Test Initiative combines the “approach to researching and measuring learning – wherein diverse learning sequences can be understood in terms of a common scale – with advances in computer-based tools” (Stein et al., in press, p. 4). The result is a brand new formative test “with the kind of objectivity and validity that are desirable in standardized tests” (Stein et al., in press, p. 3).
The Disco Test Initiative is a combination of Fischer’s Dynamic Skill Theory (General Skill Scale) and Dawson’s Lectical Assessment System (LAS). Both are discussed in detail in the article, and they both “represent fundamental advances both [in] our understanding of learning and our methods for studying and measuring it” (Stein et al., in press, p. 13). The ultimate goal of the Disco Test Initiative is to “build standardized tests that can be customized to different curricula and built around empirical research into how students learn, providing both educative feedback and psychometrically reliable scores” (Stein et al., in press, p. 14). It requires that students learn the big idea rather than content. Their learning is recorded and assessed based on the evolution of their thought process, not the facts they have memorized. This kind of assessment will determine how an individual student learns and place them on an individual learning trajectory; it will also provide useful feedback that will help the student develop.
To meet the many challenges that assessment faces, the tests in the Disco Test Initiative must do the following things (Stein et al., in press, pp. 15-16):
1. Be grounded in solid empirical evidence about the ways in which students learn specific concepts and skills.
2. Be composed of intriguing items that allow students to show how they think about what they have learned, rather than simply demonstrating that they can get a “right” answer.
3. Not waste students’ time and be a useful learning experience.
4. Provide students, teachers, and parents with a record of learning in which each milestone is meaningfully connected to specific knowledge and skills.
5. Have a long shelf-life, which implies that (1) they are of enduring importance, (2) it should be very difficult to cheat on them, and (3) they should be used in ways that make it seem pointless to cheat on them.
6. Provide data that researchers can use to continually refine our understanding of learning.
The Disco Test Initiative provides students with an opportunity to showcase what they know and “engage in meaningful action;” it also allows teachers to provide students with meaningful feedback (Stein et al., in press, p. 16). The tests are composed of completely open-ended questions that “require short essay responses consisting of judgments and justifications that not only show (1) what the student knows, but also (2) how they understand what they know and (3) how they can use their knowledge to deal with similar tasks and situations” (Stein et al., in press, p. 18). Students respond to a given question. Once they have responded, they “check” their answers by using a low-inference rubric (also known as a “coding menu”). This rubric is made up of a number of possible answers. Students choose the response that most closely matches their own. Coding is a very important part of the learning process; it “allows [students] to reflect upon their own performance in light of a range of response options” (Stein et al., in press, p. 19).
Once they have decided which answer resembles their own, students are directed to a report that provides them with detailed information. It shows them their score on the General Skill Scale, describes their current level of understanding (based on the performance), and offers suggestions to help the student progress. These reports allow students to track their own development! They give students responsibility for their own learning. These reports are also available for parents and teachers.
The Disco Test Initiative can be used by the entire class, in small groups, or individually. It is so versatile that a student can answer the same question repeatedly without “exhausting its potential to help them gain an increasingly sophisticated understanding of targeted concepts” (Stein et al., in press, p. 22). Because the tests are based on the “big idea” and larger concepts within the curriculum, students are rewarded for good thinking rather than right answers. As well, the test can be used by “entire schools or districts, [they] can follow the development of individual students over time, providing a high quality method of tracking student progress and evaluating curricula” (Stein et al., in press, p. 21). It’s a win-win situation!

Each student is different. Each student learns differently, tests differently, and should be evaluated differently. We no longer live in a society that needs to place students on tracks and force them into industries that they do not want to participate in. We no longer live in a society where education is used as a source of social control. With advancing technologies and the evolution of society, we face unique conditions that “render traditional ideas about the nature of socialization and adult life obsolete” (Stein et al., in press, p. 24). The Disco Test Initiative overcomes the dichotomy that lies between testing to prepare the workforce and testing to foster critically minded citizens. It provides students with the opportunity to “apply their knowledge to the kinds of problems they will face in the real world – messy, open ended problems without simple answers” (Stein et al., in press, p. 25). It focuses on learning, not memorizing facts. Students are encouraged to learn for learning’s sake – they are not simply memorizing facts, dates, and biological systems so they can do well on a test. I think that this initiative and this new testing infrastructure could be the future of assessment; it addresses the issues we are having with assessment today and offers a solution. What do you think? What are we measuring? What is worth measuring?


References

Gronlund, N. E., & Cameron, I. J. (2004). Assessment of student achievement. Toronto: Pearson.

Spring, J. (2008). Wheels in the head: Educational philosophies of authority, freedom, and culture from Confucianism to human rights. New York: Lawrence Erlbaum Associates.

Stein, Z., Dawson, T., & Fischer, K.W. (in press). Redesigning testing: Operationalizing the new science of learning. In M.S. Khine & I.M. Saleh (Eds.), New science of learning: Cognition, computers, and collaboration in education. New York: Springer.