Thursday, May 10, 2012

Games Even a Bureaucracy Could Love: The Future of Testing and Data-Driven Learning

With testing and Common Core Standards gaining steam in school reform circles, some educators are asking how to ensure that digital tools like computer-based “stealth assessments” will change classrooms for the better.

---

Filed by Christine Cupaiuolo

The current issue of Washington Monthly features a special report on the standards-and-testing model of school reform, which, despite protests from both the left and the right, continues to drive student learning and is poised to play an even bigger role in coming years. Through a series of stories, including one on computer-based “stealth assessments,” the overriding question remains: Will it change classrooms for the better?

Given all the controversy surrounding No Child Left Behind, you’d think national reform efforts would be front-page news. But as Paul Glastris writes in the introduction, much of what’s currently being debated and proposed in education has yet to appear on the public’s radar.

“Unlike previous waves of school reform, which were debated in Congress and covered in depth by the press, this next one is the product of compacts among states and a quiet injection of federal money—and has therefore garnered almost no national attention,” writes Glastris. “Consequently, few Americans have any idea about the profound changes that are about to hit their children’s schools.”

The first stage—already under way—involves the Common Core Standards states are implementing to provide a framework for the knowledge and skills all students are expected to attain in each grade level.

These standards, writes Robert Rothman in “Transcontinental Education,” “are making possible new assessments that could radically transform instruction and learning.” Rothman, the author of “Something in Common: The Common Core Standards and the Next Chapter in American Education,” predicts a period of innovation—for educational products as well as teaching techniques.

Next comes a new set of high-stakes tests, scheduled to roll out beginning in 2014, based on the standards. In “A Test Worth Teaching To,” Susan Headden, a senior writer/editor for the think tank Education Sector, surveys the history of testing, problems with the current system, and future assessment models that are supposed to encourage deeper learning. She notes the multiple obstacles to creating a perfect testing system, starting with costs and technological capacity:

It is a given that the new assessments will be administered on computers. This assumes two things: that students are comfortable working digitally, and that school districts have the necessary technological capacity. The first is probably a safe assumption; the second less so. Ask any state assessment director what he worries about most, and the answer is almost always some variation on “bandwidth.” In an informal survey taken by the common core R&D teams, more than half the states are already reporting significant concerns about capacity, including the number of computers available, their configurations, and their power and speed. This poses a dilemma: requiring too much technology may present insurmountable challenges for states, while requiring too little may limit innovation.

Another concern is scoring. While advances in artificial intelligence have made computer scoring of essay writing seem less like a sci-fi novelty, there are concerns that more machine scoring will lead to less complex learning:

By way of making assurances, the ETS says that machines can identify “unique” and “more creative” writing and then refer those essays to humans. Still, the new tests will be assessing writing in the context of science, history, and other substantive subjects, so machines must somehow figure out how to score them for both writing and content. Likewise, machines struggle to score items that call for short constructed responses—for instance, an item that asks the student to identify the contrasting goals of the antagonist and the protagonist in a reading passage. A machine can handle this challenge, but only when the answer is fairly circumscribed. The more ways a concept can be described, the harder it is for the machine to judge whether the answer is right. [...]

The risk of all this, of course, is that in pursuit of a cheaper, more efficient means of scoring, the test makers will assign essays that are inherently easier to score, thus undermining one of the common core’s central goals, which is to encourage the sort of synthesizing, analyzing, and conceptualizing that only the human brain can assess.

Those concerns have been raised by others, including Les Perelman, a director of writing at the Massachusetts Institute of Technology. Perelman has studied algorithms developed by Educational Testing Service, creator of the e-Rater, which can grade 16,000 essays in a mere 20 seconds. As he recently explained to The New York Times, the e-Rater has a rather large problem: It can’t identify truth. Accuracy simply has no relevance to test scores. Nor does meaning—a jumble of long nonsensical sentences that use big words can result in a higher score than a concise, well-written response.
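The failure mode Perelman describes is easy to demonstrate. Here is a minimal sketch—hypothetical, and not ETS’s actual algorithm—of a scorer that relies only on surface features like word count and word length. Because nothing in it checks meaning, verbose nonsense outscores a concise, true sentence:

```python
def naive_essay_score(essay: str) -> float:
    """Score an essay on surface features only: word count, average
    word length, and average sentence length. Meaning is never checked."""
    words = essay.split()
    if not words:
        return 0.0
    sentences = [s for s in essay.replace("!", ".").replace("?", ".").split(".")
                 if s.strip()]
    avg_word_len = sum(len(w) for w in words) / len(words)
    avg_sent_len = len(words) / max(len(sentences), 1)
    # Longer words and longer sentences inflate the score.
    return len(words) * 0.1 + avg_word_len * 2 + avg_sent_len * 0.5

concise = "Tests should measure what students actually know."
verbose = ("Multitudinous pedagogical assessments necessitate "
           "extraordinarily comprehensive psychometric instrumentation "
           "notwithstanding fundamentally incoherent epistemological "
           "considerations regarding quantificational methodologies.")

# The wordy, nonsensical passage outscores the concise, sensible one.
assert naive_essay_score(verbose) > naive_essay_score(concise)
```

Real systems are more sophisticated than this caricature, but the underlying critique is the same: any scorer trained on proxies for quality can be gamed by optimizing the proxies.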

[Screenshot caption] The mechanics of Refraction cover many important fraction concepts, including equal partitioning, addition, multiplication, mixed numbers, improper fractions, and common denominators.

The third stage embraces computers even more, in the form of new learning software that can assess students’ skills in the background as they learn and play. It addresses what is perhaps the most important question in this package of stories: Could real-time assessment and response lead to the end of testing as we know it?

In “Grand Test Auto,” Bill Tucker, managing director of Education Sector (and soon-to-be deputy director for policy development, U.S. Program, at the Bill & Melinda Gates Foundation), describes research underway on several different learning systems, including Refraction, an online puzzle game for teaching fractions.

The prototype was designed by Zoran Popovic, a computer scientist and the director of the Center for Game Science at the University of Washington in Seattle. More than 100,000 people have played it so far—give it a try yourself.

Refraction appears relatively simple but packs a complex back-end powerhouse that “records hundreds of data points, capturing information each time a player adjusts, redirects, or splits a laser.” The game, in other words, reveals not just whether a student solved the puzzle, but how. Tucker writes:
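This kind of back end is essentially an event log. The sketch below—hypothetical, not the Center for Game Science’s actual telemetry, with invented names like `TelemetrySession`—shows how a puzzle game might record a data point for every player action and then aggregate those events into assessment-relevant signals:

```python
import json
import time
from dataclasses import dataclass, field

@dataclass
class TelemetrySession:
    """Collects one data point per player action during a puzzle."""
    player_id: str
    puzzle_id: str
    events: list = field(default_factory=list)

    def log(self, action: str, **details):
        """Record one action (e.g. adjust, redirect, split a laser)
        plus its details and a timestamp."""
        self.events.append({"t": time.time(), "action": action, **details})

    def summary(self) -> dict:
        """Aggregate the raw event stream into signals a teacher or
        researcher could inspect: how many moves, of which kinds."""
        counts = {}
        for e in self.events:
            counts[e["action"]] = counts.get(e["action"], 0) + 1
        return {"player": self.player_id,
                "puzzle": self.puzzle_id,
                "total_moves": len(self.events),
                "moves_by_type": counts}

session = TelemetrySession("student-42", "fractions-07")
session.log("split", into=[1/2, 1/2])
session.log("redirect", direction="east")
session.log("split", into=[1/4, 1/4])
print(json.dumps(session.summary()))
```

The point is that the raw stream preserves the solution path—two splits and a redirect here—so a student who flails through twenty moves and one who solves it in three look different even when both finish the puzzle.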

Popovic’s game is one of dozens of experiments and research projects being conducted in universities and company labs around the country by scientists and educators all thinking in roughly the same vein. Their aim is to transform assessment from dull misery to an enjoyable process of mastery. They call it “stealth assessment.”

At this point, all this work is still preliminary—the stuff of whiteboards and prototypes. Little if any of it will be included in two new national tests now being designed with federal funds by two consortia of states and universities and scheduled to be rolled out in classrooms around the country beginning in 2014. Still, researchers have a reasonably clear grasp of what they someday—five, ten, or fifteen years from now—hope to achieve: assessments that do not hit “pause” on the learning process but are embedded directly into learning experiences and enable a deeper level of learning at the same time.

On the one hand, this sounds incredibly cool. One might make the analogy that gaming is to traditional classroom learning as playing a sport is to exercising repetitively and in isolation. The key point is that gaming is not only learning made fun—it is learning made relevant and meaningful. Once digital spaces enable students to engage in real-world applications of knowledge (even if it’s a virtual real world), and data tracking allows teachers to assess them as they do so, why would we ever go back?

Jane McGonigal points to the joy of the “epic win” that motivates gamers to learn habits and correct their missteps to save worlds from doom. These new game-tests matter not just because they can give students that “epic” feeling of accomplishment, but because the students can have that cake—and the bureaucratic bean-counters can eat it, too.

Making over classrooms into online game worlds for each subject area is, of course, not the full answer. Tucker discusses other projects underway that allow for more open-ended creativity and room to develop reasoning and conceptual understanding. And he brings up these concerns:

The coming revolution in stealth assessment is not without potential dangers, pitfalls, and unintended consequences. If students perceive that the constant monitoring is meant primarily to judge them, rather than help them improve, then they may be less likely to experiment or take risks with their learning. Worse still, it’s conceivable that teachers would just find new ways to teach to the test, focusing their instruction on how to beat a computerized assessment algorithm rather than how to solve a challenging physics problem.

Eric Klopfer, director of MIT’s Education Arcade and a proponent of stealth assessment, warns against a superficial “gamification” of learning. Just as in traditional classrooms, where the use of gold stars and special awards is only as sound as the underlying relationships among students and teachers, adding game-like rewards to educational lessons only works if the game itself is rewarding. If you give students a reward for things they don’t want to do, Klopfer says, then students stop doing those things as soon as the reward stops. It takes good instruction to challenge and engage learners. The best intrinsic motivation isn’t a flashy game, Klopfer says, but “success through meaningful accomplishments.”

Indeed, for many of us, what ultimately makes learning—and life—meaningful are the relationships we build. Any test or game that doesn’t incorporate that sense of humanity will be the wrong answer.

Related: Opinions from students aren’t included in this package of stories, but that doesn’t mean they don’t have some bold ideas of their own. Tea’a Taylor, a senior at Freedom High School in Orange County, Fla., created this video through her school’s Patriot Productions and with the help of digital educator Cody Stanley. It looks at the effect the high-stakes Florida Comprehensive Assessment Test has had on students—especially students who flunked the test on their first try, despite records of high academic achievement. (via The Answer Sheet).



from Spotlight on Digital Media and Learning http://feedproxy.google.com/~r/macfound/iQaL/~3/M9_xl12JGoU/
