The call for higher academic standards to make our schools stronger and more equal is being heard across the country today. The standards movement is often portrayed as being concerned with determining what knowledge and skills children should acquire. But standards are supposed to increase the accountability of our schools for achievement as well. PRESS supports the production of high quality, easily understood, grade-by-grade content standards for Wisconsin's schools. However, we also believe local school districts need maximum flexibility in using such standards in planning their schools' curricula and classroom practices. At the state level, we find policy makers claiming that standards will not be useful in improving student achievement unless more "innovative" high-stakes performance tests substantially replace the standardized tests and writing samples being used today for state-wide student assessment. This assertion seems to rest upon two premises: (1) the idea that our schools have already dealt successfully with teaching the fundamentals of reading, mathematics, and English; and (2) the idea that standardized tests cannot adequately assess the so-called "higher-order thinking skills" which many educationists believe standards will specify. Many parents disagree with educationists about both of these premises. We will be discussing the disagreements about assessments at another site (The Assessment Debate), but in this discussion let's focus on disagreements about how our schools are doing at teaching the basics. Multiple public opinion polls have shown that educationists and the public do not see eye to eye on this issue. At the Governors' Summit on Education held this spring, Public Agenda Foundation spokesmen told the assembled governors and CEOs:
[E]ven though large numbers of teachers voice support for higher standards, they do not generally see low standards--or youngsters finishing school without the basics--as widespread or urgent problems. Perhaps since teachers are generally satisfied with public schools' performance in teaching academic skills, their support for standards is less vigorous than the public's.
While a majority of the public...believes the schools are not placing enough emphasis on the basics, most teachers...believe they do. Almost half...of the public, and 65% of community leaders, believe that "a high school diploma is no guarantee that the typical student has learned the basics." Only three in ten teachers...share that belief. A majority of the public...thinks kids are not taught enough math, science and computers; most teachers...think they are...
Probably more worrisome to those backing high-level, rigorous standards is teachers' tepid response to the value of advanced learning and study....[O]nly 21 percent of teachers think that an excellent academic education is the most important determinant of career success....
Parents want educationists to be sure that effective instructional practices are used in their children's schools, particularly in the crucial early elementary grades when reading and fundamental mathematics skills are learned. Success in obtaining basic skills during the first three years of school is important for all students, especially those from disadvantaged backgrounds. Comparative research has shown that use of effective instructional practices in reading, mathematics, and English during grades K-3 will greatly improve the chances that a child will graduate from high school or be accepted into a college (see Meyer, 1984, The Elementary School Journal 84:380-394; Gersten and Keating, 1987, Educational Leadership 45:28-31). Poor reading skill is strongly correlated with antisocial behavior and imprisonment. Thus, although many policy makers and educationists want to focus only on outcomes (after-the-fact accountability) by "souping up" assessment systems, we believe that classroom inputs (before-the-fact accountability) are just as important.
Perhaps the reason attention to inputs has not worked well in the past is that the culture of our educational establishment, including education schools, state education departments, and district administrators, has not been one that seeks to critically examine research data with the goal of encouraging use of instructional programs that have been shown to work in the classroom. Such critical examination is the fundamental idea supporting the notion of before-the-fact accountability. It seems that whenever "new" programs are being discussed, there is a strong group lobbying on their behalf. Those arguing for the programs most forcefully are nearly always those who will benefit directly or indirectly, either economically or in career enhancement, by implementation of the programs. For example, consider what happens when states begin to talk about high-stakes assessments to evaluate how well students are achieving academic knowledge and skills. The creation and administration of high-stakes assessments is a real growth industry. Between 1960 and 1989 the amount spent nationally on testing increased 160% in inflation-adjusted dollars while student enrollment increased 15% (Report by Congressional Office of Technology Assessment, 1992). Now the assessment industry is proposing that states use performance assessments in evaluating their schools. These assessments will cost 20-30 times what traditional achievement tests cost. The traditional tests are mainly standardized multiple-choice tests and writing samples. The multiple-choice tests can be machine graded. The performance assessments require more subjective scoring. One has to hire trained scorers who spend large amounts of time rating the students' work.
In fact, one has the feeling that if machine-graded multiple-choice examinations had not been developed and states were using only the performance assessments now being promoted, there would be an outcry about our not using computer technology to produce a more reliable and economical testing system. We certainly support the idea of having American students held to high standards, and believe our educational leaders should investigate tests given in Europe, such as the German Abitur and Realschule and French baccalauréat, as well as the American General Educational Development (GED) test and National Assessment of Educational Progress (NAEP) test. The European tests do feature some performance assessment items. However, the items are much simpler than the ones being proposed by most American policy makers, they clearly require application of facts, and they have specific correct answers. The application of knowledge to be assessed by many of these European performance assessments could very likely be measured by well-constructed multiple-choice tests. What strikes us about the European tests is the rigor of knowledge in the academic core disciplines which they require.
It seems to us that the intense focus on funding for more expansive assessment tools, which educationists are putting up as a major public agenda item, diverts our attention from a much more fundamental problem in K-12 education. Attention to testing formats will not solve what we believe is the fundamental cultural problem in education: the inability of the educational establishment to satisfactorily deal with before-the-fact accountability. It seems to us that it is time for educationists to have their pet philosophies and proposals analyzed not simply by other educationists and the politicians they tap for public funding to support them, but by a mechanism which enlists both professionals skilled in scrutinizing data for its validity and pragmatic citizens not connected to the educational establishment. Neither the professionals nor the citizens involved in such analyses ought to have an economic or career advancement interest in a particular K-12 education program, whether it be an "assessment innovation" or a particular approach to classroom teaching. An educational institute, which would be overseen by professionals from medicine, engineering, statistics, and perhaps other fields accustomed to data validation, as well as citizens with backgrounds in business and technology, might help change the present culture in education. Individual states could use institutes to consider problems unique to their states. Such an institute would not mandate particular practices. Rather, it would produce what we hope would be a more balanced and honest appraisal of educational program proposals than we are presently able to obtain. The analyses of such an institute could help policy makers and parents have a more realistic interaction with educationists.
How does the development of recommendations regarding program implementation in education occur today at the state level? Basically an elected official appoints a panel. The panel will almost certainly be dominated by educationists or representatives of organizations which have close ties with the educational establishment. Some business representatives may be added. Perhaps a representative from the PTA (or even PRESS, heaven forbid!) is invited to participate "to represent parents." Presentations on possible programs are made to the panel by a few educationists who have a vested interest in seeing that a particular program is funded. After a few meetings, an educationist on the panel writes a report which rather predictably supports what has been presented; the representation made to the public is that a large amount of effort and analysis led to a "consensus" by the panel. Recommendations for spending millions of dollars are hatched by this kind of process. This process contrasts starkly with the procedure followed when a scientist must apply to the National Institutes of Health or the National Science Foundation for funding to conduct research. Such research projects can cost millions of dollars, and so mechanisms to evaluate programs for their merit have been carefully designed. Basically, when the scientist submits his grant application, copies are sent to several peer reviewers (other scientists), who write critiques of the proposal. Some of the reviewers may even be direct professional competitors of the person submitting the grant. A committee of scientists familiar with the general area within which the proposed research falls also reads the grant application. This committee meets with staff people who can analyze the budget and considers both their own written evaluations and the evaluations made by the peer reviewers. The grant application is scored by a vote of the committee and its likelihood of funding depends upon how high this score is.
The idea of having a scientist applying for a million-dollar grant come to a committee composed of individuals with only limited knowledge about the nuances of what is being proposed, give an oral presentation supported by perhaps a few brief handouts, and then having this committee decide in favor of funding would be absolutely ludicrous! Yet this is what goes on repeatedly in education! Of course, in the review of scientific grant applications we have scientists evaluating other scientists' proposals. So you might ask, why not simply have a committee of educationists evaluate proposals made by other educationists and then accept what they have to say? There are two problems with this. First, in the review of scientific research grant applications, the scientists comprising the review committee know from the outset that they have only so much money available to spend on research projects and they must determine who is going to get to use that money for research. The idea of giving educationists a "blank check" to be made out to the presently favored educational fad is not a good one. Second, the scientists are clearly scientists. Any arguments made for or against a proposal will swing upon the strength of the data presented, the past productivity of the applicant, and the likelihood of the proposal's success based upon the best objective evidence available. The prevailing culture among scientists is one of skepticism and caution with respect to claims made and data required to support claims. Unfortunately, we do not believe that this is the prevalent culture among many educationists. Their culture seems to be one in which fads are implemented before testing is performed. When failure occurs, an attempt is then made to explain it away or to simply rename the failed fad and work to have it adopted once more in a somewhat different form.
Recent problems in California exemplify the basic problems in the educational culture which we outlined above. Although we are picking California as an example, we believe these fundamental problems in educational culture are present throughout the United States. In 1991, California funded a new version of the California Learning Assessment System (CLAS), which was to use performance assessment test items extensively. The new assessments were to "require a lot more than fill-in-the bubble answers to multiple-choice questions." The tests would demand "analysis, problem solving, and comprehension based on global criteria" (Schrag, The American Prospect, Winter 1995, pp. 53-62). After more than $50 million had been spent on the project, California's Governor Pete Wilson vetoed legislation that would have spent $26 million more on CLAS. Citizens were complaining that the answers to many of the questions were self-referential. In the reading test students were asked not for an analysis to show how well they had read and understood the text but for "your thoughts and feelings" (in the fifth grade) and "your first response" (in the tenth grade). The Los Angeles Daily News castigated the science test because test items appeared to be written as propaganda recommending environmental activism. The tests were so time consuming and expensive to score that the state DPI had scored the work of only a few of the thousands of students who completed the assessment. The most damning evidence against CLAS came from an independent statistical validation of CLAS ordered by the Governor. A group of psychometricians (experts in test development) concluded that there were grave doubts that the test could ever be made reliable for the individual student scores that the test was supposed to provide.
CLAS was to be a supreme example of improvement in education produced by after-the-fact accountability. Supposedly the administration of this test and its scores were to stimulate improved teaching in the classroom. This assertion is based upon the expectation that educationists will truly be stimulated by the assessment and its data to improve schooling. Let's think about this. What was the response of educationists to the plummeting ACT and SAT scores that occurred between 1964 and 1976? Has this stimulated more rigor in the classroom? Has this stimulated the use of more effective instructional practices? We think there is minimal evidence for this. Certainly policy makers increased the requirements for years of English, mathematics, science, and social studies at the high school level between the early 1980s and 1990. But ACT and SAT scores have improved only modestly in real terms.
Clearly the claim that improved assessments will improve classroom rigor rests on the belief that the culture of educationists is one which will respond to assessment data by undertaking educational practices which have been shown to work best for enhancing student achievement. What was going on in California at the same time CLAS was being developed shows that this is certainly not the culture of educationists. Unfortunately, educationists have swung from one fad to another with a cycle length of 5-20 years. Thus, just as CLAS was being developed, California had begun to put frameworks in place to specify what curricular materials were acceptable for use in its schools and what teaching methods were to be used in the key areas of reading and mathematics. The educationists adopted a purely whole language approach to reading instruction "hook, line, and sinker." They sought actively to expunge anything that had the flavor of phonics or Direct Instruction from California's elementary schools. They ignored plummeting reading scores for 4th graders in 1992. The same year, 90% of surveyed California teachers responded that they were using whole language reading instruction. The state intensified its whole language efforts. By March 1995, its own tests showed the majority of its 4th, 8th, and 10th graders failed to reach even minimally acceptable performance levels in reading and writing. Then, in April 1995, results of the nationally administered NAEP reading test showed that California 4th graders ranked last among the 39 states that gave the test. Only the territory of Guam, where students also took the test, had lower scores. Not surprisingly, disadvantaged students suffered the most under California's whole language program. But poor performance was not limited just to high-risk students. Analyses have shown that both white and black students from California did much more poorly than their counterparts in other states on the NAEP test.
California's white students also ranked last in reading compared to white students in other states who took the NAEP test.
So here we had a spectacle in which educationists were touting a testing innovation, and performance assessments in particular, as the "key to making California's schools world class in the 21st century" at the same time they were pushing a destructive educational practice in their state's schools. They had completely ignored comparative research on which methods work best to teach reading to students from diverse backgrounds, even as they were stating that whole language had to be used to deal with the state's "diverse learners." They had also ignored the outcome measures from their existing state-wide tests of student reading skills, tests with proven reliability and validity.
We invite you to think about how an automobile assembly line is similar to the educational process. The goal of the assembly line is production of a high quality automobile. Let's say there is an inspector who uses a caliper (an assessment tool) to examine the doors of the cars coming off the line. If he finds the doors don't match expected standards, he reports it to the line foreman and his technical staff. This group, like the educationists confronted with adverse outcome data on student achievement, ought to take corrective action. If the company is to be successful, its stockholders need to be sure that the foreman and the technical staff take actions that have been shown objectively to work in fixing the door problem, not just actions which are appealing to them. The public is very much like the stockholders of the company in this example, and presently it is not clear that the public can rely upon the educationists to propose programs that will work for our children. Safeguards are needed to prevent the expenditure of huge amounts of money and the promotion of fads which do not have adequate research supporting their widespread use. It is not surprising that the California legislature, which is now considering spending hundreds of millions of dollars to fix its state reading and mathematics instruction, is looking at setting up an independent institute to evaluate proposed educational programs.
We think the following ideas about accountability, standards, and assessment deserve serious consideration: