It was reported today in the NYT that several schools are cutting back on other classes to focus on math and reading. This news comes alongside protests in California over its exit exam (which we have previously discussed on this blog) and reports of scoring errors on the SAT as well as several state exams.
I wanted to devote a little space to this because I happen to know a lot about it and I have grave concerns. There are several things that are being left out of the standardized testing story and that policymakers may or may not get.
One: The Testing Movement
The tendency is to see testing either as a panacea for all things wrong with education or as the cause of them. The testing movement really gained steam under Bush Sr. He was pushing what became known as Goals 2000. The idea was for each state to have solid state standards of what students should be able to do and then to periodically administer tests that were based on those standards. Teachers’ unions were 100% behind the idea. All 50 states had educational standards; however, most of those were weak, minimum-competency requirements. And the tests that measured “progress” were one-size-fits-all, norm-referenced tests like the Iowa Test of Basic Skills or the Stanford 9, which were purchased by states along with textbooks. The idea of Goals 2000 was to raise the bar all around. It was voluntary; however, a certain amount of federal funding was tied to implementation. Back in 1994, when I was working on educational issues, the firm I worked for was trying to evaluate the strength of state standards under development. What we learned is that something like 48 states were participating in the program. All were planning to build tests around the new standards by 2000. Overall, the quality of the standards ranged from poor to good, but not great.
By 1996, states were starting to contract with the big test companies to build custom tests for their state. These tests took 2-5 years to develop and cost millions and millions of dollars in development, administration, and scoring. Testing was usually done in 3rd or 4th, 6th, and 8th grade, plus one high school year, usually 10th or 12th, and the tests covered math, science, English Language Arts (ELA), and Social Studies. Goals 2000 was a good, middle-of-the-road type of program.
By 1999, a disturbing trend was starting up. Teacher pay was being linked to student performance on these tests. Thus was born the “high stakes” test. Testing specialists advised against this. For starters, standardized tests, like polls and surveys, are not a science in the true sense. They can be “scientized” (my term) by analyzing trends in field test results, but like all statistics, they are ultimately based on what is often problematic data. Like judges, statisticians can only analyze and make judgments based on the data they receive. Furthermore, psychometricians can design algorithms to do just about anything to smooth out undesirable results. I mention this because testing specialists do NOT design tests to be used in carrot-and-stick scenarios. They know the limits of standardized testing.
Two: The Industry
With the arrival of NCLB I was expecting a gravy train for testing companies like McGraw-Hill, Harcourt, ETS, etc. It has turned out to be only partially true. Companies as well as states are facing shortages of qualified people, cost overruns due to inefficient operation, and constantly shifting demands from state policymakers. These companies have a terribly fickle clientele. The states often change requirements halfway through the development process, tighten drop-dead dates, etc. And the companies accept this without much complaint because they have to make the customer happy.
NCLB has been like a tsunami for the testing companies. They can’t handle the demand. These were companies that mainly developed shelf products; custom contracts were very limited and largely for license exams in various professions. In the past, a company could build a shelf test for $5 mil and then sell it for several years, earning $15 mil. About 70% of their business was shelf product. Now 70% is custom contract work and they haven’t yet adjusted. Custom tests can cost 4-5 times as much to develop as shelf products. Profit margins are a lot lower, and competition is pushing companies to bid lower and lower on contracts and make bigger and bigger promises. States are increasing penalties, sometimes to $100K a day for a late deliverable. ETS lost $18 mil on a 3-year $175 mil California contract. It has since won a bid for a second 3-year term, and let's hope it has learned something.
There is a growing side industry of small start-ups. Now there are firms that will deliver the test; others will administer it; others will score it; some will provide innovative score reports, etc. So now big companies can co-bid on state contracts. McGraw-Hill will take the lead on the project and run the psychometrics and scoring and, say, Riverside will write the content, or vice versa. The idea of "one-stop shopping" is coming to an end because big companies are losing too much money.
The actual company employees are doing double shifts under big-time pressure. Many are overworked and burnt out. They all have to criss-cross the country visiting clients, and the squeeze to save on travel budgets is on; thus the accommodations, which weren't great in the past, are even worse now. All that travel takes a toll on employee health and morale.
The state departments of education and their assessment and accountability offices aren’t in any better shape. They are severely understaffed thanks to budget cuts. Qualified personnel are hard to keep. Private industry pays more and the politics are a lot less messy. But even private industry is having a hard time paying big salaries when the bottom line is getting thinner.
Three: The Problems and the Hypocrisy: What No One Wants You to Know
A. Test developers are told to design tests that measure student performance on state standards. So you have to have clear, comprehensive, and cogent state standards to begin with. That is a key element to your blueprint. Absent that, test developers, like architects, are left to interpret and design things to the best of their abilities. Many states still do not have proper standards.
B. State testing programs are run by policymakers who may have once been educators but who are now politicians. Politicians earn votes by promising to “fix” education. Thus, they have a tendency to be either overconfident about the skills and abilities of their students and teachers or overambitious in their goals. So they come to one of the 5 big testing companies wanting a tough test. Then they start building policies in order to incentivize teachers and punish “low performing” schools. There are no stakes for students. Telling students to do well so that a teacher earns more money is as stupid as it is useless.
Once the tests come out, teachers, parents, unions, and school administrators are all yelling. (The teachers were part of the test development process. In some states, they actually write the test questions, and the testing companies have to review and edit those, a miserable task, let me tell you.) The politicians are now in a tough spot. So they come back to the company and ask the psychometricians to mess with the statistical data, develop algorithms to smooth over edges, etc., while the company faces huge penalties for late product delivery. And politicians either start developing new policies or altering older ones in order to readjust to reality. They fiddle. They want the tests dumbed down. They extend implementation dates; they develop new programs to prepare students or train teachers. They empanel committees of "neutral specialists" to "advise" them. And then in the end, when the statistics show improvement, they stand up and claim that NCLB works!
C. Despite the importance of these tests, states spend less than one quarter of 1% of their education budgets on testing, according to one Harvard study. Industry insiders say that states spent on average $10-$30 per student for the tests. Eduventures says they spent twice that amount on test-prep materials.
NCLB was a bold attempt to use testing to push overall educational improvement. It seemed like a good idea on paper. In practice, however, it is another story.
The Losers: Everyone
Federal funding and teacher salaries are contingent on student performance on these tests regardless of other challenges such as ESL, special needs students, or other systemic problems. And in the end everyone loses. States can always find the money to pay testing companies, but they can’t fund PE classes, art, music, vocational or business classes. Everything gets directed at the lowest performers at the expense of the middle and upper range performers. Yet the lowest performers don’t get the type of education they need, or the types of opportunities that the better off get, to explore what fires their curiosity. It is an attempt to standardize not just testing, but the student as well.
It is a prime example of bringing a business model to bear on a social program. It doesn’t work because it doesn’t attack the root of the problem. The root of the problem, in my view, is a culture that does not value education. It is overworked parents who don’t sit with their children at night and help with homework. It is a culture that thinks the only things that count are math, science, and reading and fails to see that if a kid loves music, the math becomes more relevant to his world. Or if he loves to paint, then suddenly reading or maybe learning a foreign language isn’t so hard. It is a culture that fails to place any responsibility for success squarely on the shoulders of the students.
Regardless of what we do, not everyone will succeed. We should fail people who don't keep up and accept that as a "cost of doing business". That said, however, everyone should have the opportunity to succeed. And that just isn't the case. Separate but equal? Tell that to an inner-city youth whose jazz band just lost its funding.