Toch - 2006 Margins of Error

 


Description

This is the document from which the segment entitled "Harder than the Dickens" was taken.



p. 1

Education Sector Reports
Margins of Error: The Education Testing Industry in the No Child Left Behind Era
By Thomas Toch



p. 3

Table of Contents

Margins of Error ........ 5
Recommendations ........ 19
Endnotes ........ 22
About the Author ........ 23
About Education Sector ........ 23
Acknowledgements ........ 23

© Copyright 2006 Education Sector. All rights reserved.
1101 Pennsylvania Ave. N.W., Fifth Floor, Washington, DC 20004 · 202.756.4944 · www.educationsector.org



p. 5

State standards and standardized tests have become dominant forces in American public schooling. For most of its history, public education in the U.S. was a local matter, with local schools and school systems setting their own educational priorities. But in the wake of mounting evidence that the preparation most students received from public schools wouldn't suffice in a postindustrial economy, and with the conscience of the nation having been transformed by the civil rights movement, policymakers began to pursue a new paradigm, one that sought to establish statewide public school standards and hold local educators accountable if their students fell short of these standards. Standardized tests, used to measure student performance against the new state expectations, are the linchpin of this strategy of standards-based reform.

The No Child Left Behind Act of 2001 (NCLB) solidified standards-based reform as a national priority. Part of a bold attempt by federal policymakers to force state and local educators to improve the education of minorities and other students that public schools traditionally hadn't served very well, the legislation required that by the spring of 2006 states test nearly every public school student in grades three through eight, and in one high school grade, to gauge whether students have met standards in reading and math, a task requiring some 45 million standardized tests annually. To comply with NCLB, 23 states that have not yet fully implemented the law's testing requirements will administer some 11.4 million new tests during the 2005-06 school year alone, half in reading, half in math. Within two years, states must begin testing students in a minimum of one elementary, middle, and high school grade in science under NCLB, requiring at least another 11 million tests.1

Standardized test scores form the basis of NCLB's accountability mechanisms: school report cards, tutoring and school-choice options for students, and serious consequences for low-performing schools.
Increasingly, as a result, the content of statewide tests has become the focus of teaching and learning in public school classrooms throughout the nation, to the point where many schools have begun to do much more testing than is required by NCLB in an effort to prepare their students for the high-profile NCLB-mandated exams.

But this surge in testing has created immense challenges for both the industry that writes, scores, and reports the vast majority of the new statewide tests and the state agencies charged with carrying out NCLB's requirements. NCLB's test-based accountability system has given local educators powerful incentives to help students whom public education has long neglected. But the scale of the NCLB testing requirements, competitive pressures in the testing industry, a shortage of testing experts, insufficient state resources, tight regulatory deadlines, and a lack of meaningful oversight of the sprawling NCLB testing enterprise are undermining NCLB's pursuit of higher academic standards.

Symptoms of the turmoil in the testing industry aren't difficult to find. Newspapers carry accounts of testing companies giving students college scholarships to atone for the fact that scoring errors deprived them of their high school diplomas; of scoring errors sending thousands of students to summer school when they had in fact passed their tests; of months-long scoring delays; of administrators losing their jobs for low scores on tests that, had they been scored correctly, would have shown improvements in student achievement.2


p. 6

These problems have damaged the credibility of standards-based reform in the eyes of many educators and parents, and they have attracted the attention of the Office of Inspector General at the U.S. Department of Education, which has announced plans to examine the extent of test-scoring and reporting mistakes under NCLB.3

But there are deeper, more structural problems stemming from the tremendous expansion of statewide standardized testing that haven't made headlines. In the buildup to the full implementation of NCLB's testing requirements in spring 2006, many states are constructing tests that don't fully measure student and school performance against state standards, and they are using tests that measure mostly low-level skills, a move that encourages teachers to make the same low-level skills the priority in their classrooms at the expense of the higher standards that NCLB has sought to promote.

The testing infrastructure that undergirds NCLB's accountability system must be improved, as this report makes clear. And if steps aren't taken to do so, teachers and principals will lose valuable tools to improve instruction, and both NCLB's work on behalf of public education's neediest students and standards-based reform itself will be increasingly at risk. Statewide testing, envisioned under NCLB as a key part of the solution to what ails public schools, is fast becoming part of the problem in public education.

The Industry

The testing industry is surprisingly small, given its outsize role in public education today. Eduventures Inc., a Boston-based research firm, estimates that the value of tests, testing services, and test-prep materials purchased in 2006 will be $2.3 billion. But that includes purchases by school systems and schools, state-level testing, and college-admissions testing and test prep. Total expenditures for developing, publishing, administering, grading, and reporting NCLB-required statewide tests, Eduventures estimates, will be $517 million in the 2005-06 school year.4 Some testing
company executives peg the number somewhat higher, at $700 million to $750 million, still a small portion of the approximately $500 billion the United States spends on public elementary and secondary education annually.5

A handful of companies capture some 90 percent of the statewide testing revenue, Eduventures estimates. They include Pearson Educational Measurement, a subsidiary of London-based publisher Pearson plc; CTB/McGraw-Hill, a division of the New York-based publishing and information conglomerate McGraw-Hill Cos.; Harcourt Assessment Inc., owned by Anglo-Dutch publishing giant Reed Elsevier; Riverside Publishing, a division of the privately owned publisher Houghton Mifflin Co.; and the nonprofit, Princeton-based Educational Testing Service (ETS), best known as maker of the SAT college-admissions test. They are full-service companies that create tests, align them with state standards, ensure they are technically sound, publish, distribute, and score them, and analyze results.

Other, smaller full-service companies have entered the statewide testing business more recently, including Measurement Inc., Questar Educational Systems, Data Recognition Corp., and nonprofits Measured Progress, Northwest Evaluation Association, and American Institutes of Research. And there is a growing number of niche companies that focus on aspects of the testing enterprise, such as test-question writing or test scoring.

States Adding Reading and Math Tests in 2005-06

State              Total tests
Connecticut            311,286
Illinois               974,160
Kansas                 315,138
Kentucky               443,828
Maine                  126,474
Massachusetts          456,750
Michigan             1,062,907
Minnesota              383,214
Missouri               566,802
Montana                 90,602
Nevada                 188,358
New Hampshire          198,032
New Jersey             619,588
New York             1,704,592
Ohio                   850,850
Oklahoma               190,066
Pennsylvania           864,686
Rhode Island           150,796
Vermont                 89,104
Virginia               565,096
Washington             629,156
Wisconsin              512,240
Wyoming                 59,147
Total               11,352,872

Source: Editorial Projects in Education Research Center; National Center for Education Statistics; Education Sector calculations
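The state counts in the table above can be tallied to confirm the reported total and to see which states account for the most new tests. The following is a minimal sketch, not part of the report: the figures are copied from the table, while the names `NEW_TESTS_BY_STATE`, `total_new_tests`, and `largest_states` are hypothetical and introduced only for illustration.

```python
# Test counts copied from the table of states adding reading and math tests.
# The variable and function names are illustrative assumptions, not from the report.
NEW_TESTS_BY_STATE = {
    "Connecticut": 311_286, "Illinois": 974_160, "Kansas": 315_138,
    "Kentucky": 443_828, "Maine": 126_474, "Massachusetts": 456_750,
    "Michigan": 1_062_907, "Minnesota": 383_214, "Missouri": 566_802,
    "Montana": 90_602, "Nevada": 188_358, "New Hampshire": 198_032,
    "New Jersey": 619_588, "New York": 1_704_592, "Ohio": 850_850,
    "Oklahoma": 190_066, "Pennsylvania": 864_686, "Rhode Island": 150_796,
    "Vermont": 89_104, "Virginia": 565_096, "Washington": 629_156,
    "Wisconsin": 512_240, "Wyoming": 59_147,
}

# Total printed in the table.
REPORTED_TOTAL = 11_352_872


def total_new_tests(counts):
    """Sum the per-state test counts."""
    return sum(counts.values())


def largest_states(counts, n=3):
    """Return the n states with the most new tests, largest first."""
    return sorted(counts, key=counts.get, reverse=True)[:n]


if __name__ == "__main__":
    total = total_new_tests(NEW_TESTS_BY_STATE)
    print(f"{len(NEW_TESTS_BY_STATE)} states, {total:,} new tests")
    print("Matches reported total:", total == REPORTED_TOTAL)
    print("Largest:", ", ".join(largest_states(NEW_TESTS_BY_STATE)))
```

The 23 rows sum to the table's stated total, which rounds to the "some 11.4 million new tests" cited earlier in the report.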


p. 7

The Major Players

CTB/McGraw-Hill
  Major test: TerraNova
  Activities: development, administration, scoring, reporting
  State testing contracts: 23
  K-12 tests administered, 2005: 16.5 million
  Test items written, 2005: 167,000
  Percent of nation's students taking a CTB test: 35
  K-12 testing revenue: n/a
  Corporate parent: The McGraw-Hill Companies

Educational Testing Service
  Major test: none (creates custom-designed statewide exams)
  Activities: test development, reporting
  State testing contracts: 15
  K-12 tests administered, 2005-06: 7 million
  K-12 testing revenue: $150 million
  Key fact: best known as creator of the SAT; new player in K-12 testing
  Corporate parent: non-profit

Harcourt Assessment
  Major test: Stanford 10
  Testing activities: development, administration, scoring, reporting
  State testing contracts: 22
  K-12 tests administered, 2005-06: 9.5 million
  Test items written, 2004: nearly 85,000
  Open-ended questions scored, 2005-06: 40 million
  K-12 testing revenue: n/a
  Corporate parent: Reed Elsevier

Pearson Educational Measurement
  Major test: none (creates custom-designed statewide exams)
  Activities: development, administration, scoring, reporting
  State testing contracts: 20
  K-12 tests administered, 2005-06: 40 million
  K-12 testing revenue: n/a
  Key fact: nation's largest test scorer
  Corporate parent: Pearson plc

Riverside Publishing
  Major tests: Iowa Test of Basic Skills
  Activities: design, development, scoring, assessment management
  State testing contracts: 4
  K-12 tests administered, 2005-06: n/a
  K-12 testing revenue: n/a
  Key fact: top player in formative-assessment market
  Corporate parent: Houghton Mifflin Company

Source: testing companies

The Path to NCLB

The major players have been around for a long time. Harcourt's Stanford 10 test and CTB/McGraw-Hill's TerraNova tests date to the 1920s, Riverside Publishing's Iowa Test of Basic Skills to the 1930s. But in keeping with the tradition of local control in public education, publishers for decades sold their elementary- and secondary-school achievement tests only to schools and school systems, where they were used to compare local student performance with that of representative national samples of students, so-called norm groups. The publishers' local sales staff sold the tests at the same time they sold textbooks, because the major publishers were in both businesses.

There was thus a patchwork of different tests in place in every state, rather than a single statewide testing system. There were typically no consequences for local educators for their students' performance on the tests. And there wasn't any attempt to measure student performance against state standards because, with local school systems establishing their own educational agendas, there weren't any statewide standards.

All that began to change in the late 1960s. The Elementary and Secondary Education Act of 1965 (ESEA), of which NCLB is the latest reauthorization, called for evaluation of federal programs for disadvantaged students and set aside funding for the task. A nascent accountability movement also took shape in the 1970s, as state lawmakers, in the face of reports that many students weren't learning and demands that state officials address the problem, started to require statewide testing programs as a way of ensuring that students had minimum competencies in core subjects. They wanted to know, for example, that sixth-graders were performing at least as well as typical fourth- or fifth-graders.

Michigan created the first statewide standardized testing program in 1969, and Florida created the second in 1971. The minimum-competency movement expanded to other states during the 1970s, with lawmakers typically requiring testing at two or three grade levels each year. And the movement spread more rapidly in the early 1980s, with the publication of A Nation at Risk and several other national studies that laid bare the troubled state of the nation's public schools. New funding for school reforms began to flow from state coffers in the wake of the reports, and


p. 8

lawmakers wanted evidence that their investments were paying dividends, so they mandated more testing.

By the end of the 1980s, frustration with the pace of local reforms had led President George H.W. Bush to convene a national summit with the nation's governors in Charlottesville, Va., to explore ways to promote reform on a larger scale. Bush and the governors, led by then-Arkansas Gov. Bill Clinton, helped win bipartisan support for standards-based reform by establishing as a national goal that students demonstrate competency in core subjects such as English and math in grades four, eight, and 12.

By the time Congress reauthorized the Elementary and Secondary Education Act in 1994, Clinton was in the White House and standards and accountability were the watchwords of reform. Known as the Improving America's Schools Act (IASA), the Clinton administration's ESEA-reauthorization legislation required every state to put in place both standards and tests in reading and math at three grade levels, and about two-thirds of the states had done so by the end of the second Clinton presidency.6 Most of the tests were designed to measure whether students mastered states' standards rather than how they compared with students nationally. They were criterion-referenced tests rather than the norm-referenced tests that most states had introduced previously.

NCLB built on the Clinton-era accountability measures. It more than doubled the amount of testing required of the states, from three grade levels to seven. It established much tighter deadlines for introducing new tests. It required that results be broken down by a range of subgroups of students in every school. And, most significant, it linked serious consequences for schools to student test scores. Today, under NCLB, more students are tested more often than at any time in the nation's history, and the stakes are far higher.

"Harder than the Dickens"

Creating high-quality tests is difficult and labor intensive. The process involves determining the length and content of a test, hiring curriculum experts to write questions, and ensuring that the questions align with state standards, so the questions test what students are supposed to know. Then questions are field-tested on thousands of students to ensure that they don't discriminate against groups of students but do discriminate between strong and weak students, a complex mathematical task that requires comparing how students do on other questions with how they perform on the questions being trial-tested. Test-makers also have to ensure that every multiple-choice question has one and only one correct, or clearly best, answer and that the questions on a test reflect an appropriate range of difficulty. Another complex statistical computation has to be performed to ensure that the same scores on different tests represent the same level of performance. Then tests have to be edited, printed, and distributed to every public school in the country.

It's a demanding process under the best of circumstances. "Hundreds of people have to touch every item," says Gary Cook, a research scientist at the University of Wisconsin's Center for Education Research, who served as Wisconsin's testing director and as vice president of state accounts at Harcourt Assessment. The difficulty of writing questions that can clear testing's many hurdles has typically resulted in a majority of them being dropped, even those drafted by the most experienced item writers, says H.D. Hoover, who was the principal author of the Iowa Test of Basic Skills (ITBS) for nearly two decades. "Building a good test looks easy, but it's harder than the dickens," says Hoover. Ohio now administers 400 different forms of its statewide tests in order to field-test enough questions to keep its bank of test items stocked.

[Figure: State Spending on Standardized Testing. Millions spent per academic year on the development, publishing, administration, grading, and organizing of state exams, by academic year from 2004-05. Source: The State of the K-12 State Assessment Market, Eduventures Inc., 2005.]

As a result, it costs anywhere from $300 to $1,000 to develop a simple multiple-choice question, the least


p. 9

expensive type of test item. State tests typically have 50 to 100 questions per subject, per grade.

This complex test-making infrastructure is buckling under the weight of NCLB's testing demands. "There's way too much demand and not enough supply," says Hoover.

When tests were purchased by schools and school systems, and in the early years of the standards movement when most states used the Stanford and other major norm-referenced tests as their statewide exams, test publishers produced a new battery of tests only every six to eight years, because they didn't release test items and were thus able to recycle their tests. The result was a manageable demand for test items and ample time to vet them.

NCLB has changed that. The law's requirement that states align their tests to challenging state standards is an important step toward clarifying classroom expectations. But it is forcing the testing industry to custom-build the majority of the tests that must be in place at seven grade levels in every state this spring. And because a growing number of states release at least portions of their tests once they have been administered each year in order to give educators and parents more-detailed reports of student performance, testing companies have to generate vastly larger pools of credible test questions and do so within far shorter timelines. In the view of many in the industry, they can't find enough qualified people to do the work.

"Testing companies are desperately looking for people to write test items," says Hoover. "It's hard to do well, and it's hard to recruit people to do it," adds Daniel Koretz, a testing expert at the Harvard Graduate School of Education. "Testing company executives tell me, 'We're having a hell of a time finding the caliber of people we need.'"

The surge in state testing under NCLB has created a severe shortage of the specialists who do the analyses of how test items perform in field trials and the other heavy statistical lifting in test-making. Though the work of these experts, who are trained in measurement theory and statistics and are known as psychometricians, is crucial to creating high-quality tests, only a handful of them enter the workforce each year from the University of Iowa, Michigan State, the University of Massachusetts, and the dozen or so other campuses that train them: under a dozen a year, reports a survey of doctoral degrees earned nationwide between 1995 and 2003 by the National Opinion Research Corp. An additional 35 Ph.D.s were awarded annually in the related field of statistics, testing, and education measurement. Cook, the University of Wisconsin testing expert, refers to NCLB as the "No Psychometrician Left Unemployed Act."7

The dearth of testing experts isn't hard to explain. Psychometrics is a highly technical, mathematics-based discipline that doesn't pay particularly well by private-sector standards (about $120,000 a year in top industry slots and much less in state testing agencies). And many potential recruits, undergraduates studying educational and quantitative psychology at colleges of education, are discouraged from entering the field by education school professors, many of whom are opposed to the rise of standardized testing in public education, says Hoover. "They say we're the bad guys."

To make matters worse, a growing number of psychometricians are working on non-educational tests, says James Impara, president of the National Council on Measurement in Education, a professional organization for education measurement experts. Over 1,000 occupations, from accounting to firefighting, now require licensure or certification, Impara says, and many of them give tests to applicants.

[Figure: The Rise of Standardized Testing. Test sales by year from 1955, in millions of constant dollars. Source: The Bowker Annual of Library and Book Trade Information, 1970-98; Association of American Publishers, 1970-1998.]

Testing companies also face immense pressures at the back end of the testing cycle. In the pre-NCLB era, states and school systems gave testing companies months-


p. 10

long windows in which to score standardized tests because the results rarely had immediate consequences. Now, completed answer sheets are routed from schools to testing company scoring centers, where results are tabulated and then uploaded directly to state education department computers or, as in Michigan, back to school systems and from there to the state agencies. States' testing staffs calculate the percentages of students meeting state standards in reading and math. Once they do this for every NCLB student subgroup (students are grouped by race/ethnicity, family income, disability, and language proficiency) in every tested grade in every public school and school system, they grade schools and school systems on the basis of whether sufficient percentages of their students as a whole and in every subgroup have met state standards on the tests, what NCLB calls "adequate yearly progress." Then the state agencies package the ratings in reports that NCLB requires them to supply to school systems. School systems, in turn, must route the state ratings to schools and parents in time for parents to place their children in tutoring or in different public schools prior to the start of the next school year, an opportunity that NCLB grants students in schools that fail to make adequate yearly progress. With many schools starting up in August, that means that the entire testing and state rating process must be completed within six weeks from the end of the typical public school year.

This would be difficult enough to do successfully with long timelines. But many state policymakers, under pressure from public educators to give students as much time as possible to prepare for NCLB's high-stakes tests, are demanding that tests be administered late in the school year and that testing companies nonetheless complete their scoring and reporting in time to place underachieving students in summer school and in time for states to do their adequate yearly progress calculations ahead of the midsummer deadline for public reporting, says Jeff Galt, chief executive officer of Harcourt Assessment.

Lobbying by local educators led the Ohio legislature in 2005 to move the state's two-week testing window from March to May beginning in 2007. The legislature also mandated that Ohio's testing contractors, Washington, D.C.-based American Institutes of Research and North Carolina-based Measurement Inc., both small companies, report scores on the tests by June 15, two weeks faster than in the past. And some states want even quicker turnarounds. Michigan ended its contract with Measurement Inc. in 2005 in the wake of months-long delays in scoring the state's tests. Pearson, the state's new contractor, has to get test results to local school systems within 30 days.

Despite the fact that testing companies have sought to upgrade their test-processing infrastructure in recent years, the pressure put on them by the volume of testing and the new scoring deadlines is immense, says Scott Marion, vice president of the New Hampshire-based Center for Assessment, a nonprofit test-consulting firm that advises 15 state testing agencies. And it is intensified, says Stuart Kahl, president and CEO of Measured Progress, a New Hampshire-based testing company, by the fact that companies often must spend weeks after students take tests tracking down test booklets that school systems have failed to forward, resolving discrepancies between enrollment figures and the number of students' tests, and cleaning up basic student biographical information required to report test results under NCLB.

The Making of a Reading Test Question

1. Test developers select reading passages and send them to bias committee for screening.
2. Question developers cull remaining passages and draft test questions for review by content committee.
3. Surviving material developed into pilot tests, sometimes after oversight by state testing agency.
4. Pilot test administered and results analyzed.
5. Valid items applied to new tests or future test item-bank.
6. New tests given final review by content and bias committees, then referred to state department of education.
7. Approved tests released for administration.

Source: Scott Marion, Center for Assessment

Headlines about scoring blunders are one measure of the overwhelming demands of the scale and speed of test processing required by NCLB. Another is that over half of the school systems in a 2005 national survey by the


p. 11

Center on Education Policy said that late reports by state departments of education created serious or moderate problems in meeting NCLB's start-of-the-school-year deadline for informing parents of their children's eligibility to attend higher-performing public schools.8

Glossary of Terms

Validity: The extent to which tests accurately measure the knowledge or skills that the tests are intended to measure.

Constructed response: A test item that requires students to provide the answer to a question, as opposed to a multiple-choice question, where students choose among possible answers that the test creator provides. Constructed-response questions can be as simple as fill in the blank (e.g., 9 x 9) or require more complex answers, such as a written essay.

Rubric: A tool used to score answers to a test question, most commonly used for scoring answers to more complicated constructed-response questions. Rubrics help ensure consistent grading from different graders by describing the specific elements of the answer needed for students to receive various score levels.

Criterion-referenced test: An assessment that measures the extent to which students have mastered a specific body of knowledge and skills, such as standards-based tests that determine if students have reached certain predefined levels of proficiency in a subject.

Norm-referenced test: An assessment that measures a student's performance relative to that of a representative national sample of students, called a norming sample. Results on norm-referenced tests are often expressed in percentiles. A student scoring at the 60th percentile on a norm-referenced test, for example, scored better on the test than 60 percent of the students in the norming sample.

Scaling: A process of converting raw test results, such as the number of correct answers, into a score that can be used to compare results from different students or different versions of a test. Student scores on the SAT, for example, are converted to a scale where the minimum score for each section is 200 and the maximum score is 800.

Equating: The process of placing scores from different versions of the same test on a common scale so that student results on those tests can be compared on a fair, "apples to apples" basis. Equating, for example, ensures that a score of 700 on the SAT in 2005 is comparable to a score of 700 in 2006.

Reliability: A measure of the consistency and dependability of a test score's representation of a student's knowledge and skills. There are various dimensions of reliability, such as the consistency of test results when comparing the administration of the same test at different points in time, or comparing the results of different questions that measure the same skill, or comparing the scores given by different graders to the same answers.

Source: Scott Marion, Center for Assessment; Pearson Educational Measurement

Market Pressures

The testing industry is facing these challenges in a time of tight budgets and thin margins. A study by Harvard economist Caroline Hoxby revealed that states typically spend less than one-quarter of 1 percent of public school revenues on their statewide testing programs.9 In 2005-06, combined federal, state, and local per-student spending in public education averaged over $8,000. Despite testing's tremendous influence on what students are taught and how teachers teach in the nation's public schools, and despite the importance of testing to school reform under NCLB, states spend between $10 and $30 per student on their testing programs, say Harcourt's Galt and other industry experts. Eduventures estimates that schools and school systems spend twice that amount on test-prep materials.10

The major testing companies weren't fazed by state testing budgets when they were selling large quantities of the ITBS, Stanford, and other national norm-referenced tests to schools and school systems. These "catalogue sales," as they are known in the testing industry, were lucrative, says Cook, who coordinated bids for state testing contracts at Harcourt. Publishers would invest $3 million to $6 million in a testing series and earn $15 million to $20 million over the five- to eight-year life of the tests, he says, because they were able to keep the development cost of every test booklet they sold low by using the same tests for a number of years.

But schools and school systems are buying far fewer of the major norm-referenced tests in the NCLB era of statewide testing. Sales of such tests are down 30 percent to 70 percent, says Cook. In the new NCLB marketplace, the publishers must make customized, criterion-referenced tests that measure students' grasp of each state's unique academic expectations. Such tests, Galt says, can be five times as expensive to construct as the ITBS and other norm-referenced tests. And the fact that many states own the copyrights to their tests and release them to the public after the tests have been administered has wiped out the economies that publishers enjoyed when they were able to use their norm-referenced tests for several years


p. 12

running. The result, predictably, has been much lower profit margins. "When you are building a new test every year," says Hoover, the ITBS author, "it's difficult to get your money back."

To compensate, the major testing companies are vying aggressively, both with one another and with Measured Progress, Data Recognition Corp., and the other new players, for as many state NCLB testing contracts as possible in an attempt to achieve efficiencies through scale. But this intense competition has led to more pressure on profit margins. Princeton-based ETS lost $18 million on its first NCLB testing deal, a three-year, $175 million contract with California, the nation's largest market, says Anthony Carnevale, a former ETS vice president. Nonetheless, California tentatively approved a new three-year deal with ETS in late 2005 for even less money. John Oswald, an ETS senior vice president and general manager of the company's elementary- and secondary-education division, says ETS lost money on its first California contract, made a slight profit during a one-year extension to the deal, and expects the new contract to be profitable.

"With three or four companies bidding, one always has a reason to low-ball to get into the state," says Kahl of Measured Progress. Pearson won a three-way competition in 2005 for Michigan's testing business with a $48 million bid on a three-year contract. The second company, Data Recognition, bid $84 million and the third, Measurement Inc., $114 million, says Jon Twing, senior vice president for test and measurement services at Pearson, where he heads the company's 200-person division that develops state tests. "We are reducing our rates in bids because our competitors are doing the same thing. It cuts profit margins."

"There's a tremendous amount of competition to get volume to cover fixed costs," says Galt, Harcourt's president. In response to higher testing volume and tighter scoring deadlines, Harcourt is spending $50 million over three years on printers, scanners, software, and other test-processing infrastructure, most of which, Galt says, remain idle for 10 or 11 months of the year. Doug Kubach, president and CEO of Pearson Educational Measurement, says there's "hyper-competition" in the industry.

Penalty clauses in state testing contracts that have become more common and more prescriptive since NCLB became law are also squeezing testing company profit margins. Pearson's 2005 deal with Michigan, for instance, stipulates that the company must pay the state four cents per student per day for every NCLB test it fails to score and return to school systems within four weeks, a potential maximum fine of $100,000 a day. Pearson's penalty clock began ticking in early December of that year, and the company quickly racked up two weeks' worth of fines in several parts of the state, says Roeber, adding that "we'll probably save some money on our testing program this year."

[Figure: Talent Search. Two charts showing the number of Ph.D.s granted per year, 1994 to 2004: Ph.D.s granted in psychometrics, and Ph.D.s granted in educational assessments/testing measurement. Source: National Opinion Research Corporation, 2004.]

Meanwhile, NCLB has spawned a secondary testing market that is further taxing the testing industry's capacity. The tremendous pressure on schools and


p. 13

school systems to have their students do well on NCLB tests, and the advent of technology that permits companies to very rapidly give school superintendents, principals, and teachers detailed breakdowns of student test results, have produced a burgeoning market among school systems for so-called formative tests, short tests that are administered throughout the school year to help educators respond quickly to student weaknesses. The rise of formative testing has increased demand for banks of test questions dramatically, industry experts say. Sales increased 50 percent between 2003 and 2006, to $323 million, Eduventures estimates.

Uneducated Consumers

State departments of education are ultimately responsible for carrying out NCLB's testing mandates, and they are, if anything, in a weaker position than the testing industry to respond to the surge in testing under NCLB. Many state testing offices suffer from heavy turnover and shortages of skilled staff as a result of underfunding and hiring freezes introduced during the recession of the late 1990s. "The capacity of assessment and accountability offices in state departments of education is very low," says Marion of the Center for Assessment, who was testing director in Wyoming. "And there has been a significant increase in workload since the passage of NCLB without an increase in qualified staff."

Education Sector found, in a survey of state testing directors that it conducted for this report, that over half the states have problems recruiting and retaining the testing staff they need to respond to NCLB adequately. Testing companies and large school systems, many say, are luring away scarce psychometricians and other key staff with higher salaries. Ohio's new testing director, Judy Feil, for example, has lost seven of 20 staffers in the past year to burnout and higher-paying jobs in school systems and testing companies. She has been forced to hire replacements without testing backgrounds, she says, because skilled people are difficult to find. Many state testing offices operate with skeleton staffs. Indiana, a relatively large state, has five professional testing employees.

Testing directors themselves turn over at an alarming rate. Matt Gandal, executive vice president of Achieve Inc., a Washington, D.C.-based organization that promotes high education standards, says the testing directors of Ohio, Florida, Texas, Maryland, and Rhode Island have left in the past few years for the private sector and an opportunity to earn higher salaries. Marion describes standing with six or seven state testing directors at a national conference, only half of whom had been in their jobs for more than eight months.

The result of the understaffing and lack of expertise in many state testing offices, says Marion, is that for the testing companies, "it's like being auditors of their own work." Many state testing agencies simply don't have the capacity to scrutinize the work of their testing contractors closely.

[Table: Industry Leaders. Nine companies capture more than 95 percent of the expenditures from states for tests and testing services.]

Company: Primary state contracts
Educational Testing Service: California, New Jersey
Harcourt Assessment: Arizona, Delaware, Hawaii, Idaho, Illinois, Mississippi, New Mexico, Rhode Island, South Dakota, Virginia, Wyoming
Riverside Publishing: Arkansas, Iowa, Louisiana
CTB/McGraw-Hill: Alabama, Alaska, California, Colorado, Connecticut, District of Columbia, Florida, Indiana, Kentucky, Mississippi, Missouri, New Mexico, New York, North Dakota, South Dakota, Tennessee, West Virginia, Wisconsin
Measured Progress: Maine, Massachusetts, Montana, Nevada, New Hampshire, Utah, Vermont
Questar Education System: Arkansas, Minnesota
Pearson Educational Measurement: Minnesota, New Jersey, Texas, Washington
Northwest Evaluation Association: Idaho
Data Recognition Corp.: Alaska, Pennsylvania

Source: The State of the K-12 State Assessment Market, April 2005, Eduventures Inc.


p. 14

Troubling Consequences

The mounting scoring errors and reporting delays that have resulted from the many challenges confronting the testing industry and state testing agencies as they struggle to respond to NCLB's testing mandates have tarnished NCLB's testing-based system of school accountability. They have created a public-relations problem. But the lack of state oversight of testing contractors, the industry-wide shortage of testing experts, and the many other problems that have plagued the spread of statewide testing under NCLB are also damaging the cause of standards-based reform in ways that don't make many headlines but are arguably more fundamental.

Testing experts say many statewide tests are not getting sufficient psychometric scrutiny to ensure that they accurately measure student and school performance under NCLB. "States and contractors should be doing a lot more validity studies to be sure that what the tests are saying about student achievement is accurate," says Marion of the Center for Assessment, who has taught test-making at the University of Maine. "But they aren't doing it." In many cases, says Cook, "they are putting [test items] on the street they shouldn't."

That's particularly true of test questions that require students to write a response rather than fill in a bubble on an answer sheet. The reason is that so-called open-response questions are more costly to field-test, because they must be scored by people rather than machines. "You are paying a fortune on an individual item just to try it out," says Hoover, the onetime ITBS author. "So frequently companies never try them out, and they are bad items." University of Iowa psychometrician Stephen Dunbar, Hoover's successor at the ITBS, refers to NCLB as "No Item Left Behind" because the law has led to such a shortage of quality test questions.

In another example of the consequences of psychometric failings, the Ohio Department of Education announced in the fall of 2005 that Measurement Inc. had failed to correctly translate raw scores on the state's high school test into scores on a publicly reportable scale. The scaling mishap resulted in new scores for 5,000 of the 5,400 students who had taken the test the previous summer, including 900 students who had been told they could not graduate because they had failed the test, when they hadn't.11

And while NCLB's strength as a source of standards-based school reform rests on its requirement that states measure students' grasp of statewide standards and then take steps to improve schools and school systems where students don't measure up, lack of time, money, and skilled staff have led a substantial number of states to introduce tests that many testing experts say are not fully aligned with state standards, tests that don't test what states expect their students to know.

This is happening in part, experts say, because rather than building tests from scratch, states are hiring testing companies to augment the Stanford and other national norm-referenced tests with questions that cover topics in state standards. But the tests aren't always what they should be. "When you ask publishers if they align the tests with state standards, you'll rarely get an answer of less than '85 percent,'" says Marion. "But our studies show it's a lot lower, 50 percent." As a result, teachers are teaching stuff that they can't be sure is on the tests, because the tests don't necessarily measure the skills that states say teachers should teach.

[Figure: The Work of the Testing Industry. Content creation; standards alignment; psychometric evaluation; test publishing; test distribution; test delivery; test scoring; score analysis; reporting. Source: The State of the K-12 State Assessment Market, Eduventures Inc., 2005.]

Nor is the quality of the many practice tests that students are taking in increasing numbers to prepare for NCLB testing what it should be. Formative testing has the potential to help students by giving teachers frequent and, in theory, useful information on student performance. But so far, say industry analysts, schools and school systems have been unwilling to pay for high-quality test items for these new tests, leading testing companies to focus resources on supplying the new market with banks of test questions that are less fully field-tested, and thus less expensive, but are also less accurate measures


p. 15

of student performance. "Certain firms claim to offer tens of thousands of exam items," Eduventures writes in an industry report, Testing in Flux. But many of the items, says Eduventures, "have not undergone rigorous psychometric evaluation."12 Says Marion: "The items that end up on most of these formative tests are ones that get rejected from state tests." As a result, many formative test questions don't accurately measure what students know.

But the use of such test items is increasingly widespread. In 2003, in one of a number of moves by major testing companies to tap into the formative-testing market, Houghton Mifflin bought Edusoft, a then-three-year-old company that permits teachers to give tests, scan student answer sheets, upload them to Edusoft servers, and receive detailed score reports. School systems began asking Houghton to supply banks of practice test items. But Houghton couldn't deliver fully field-tested items for what many school systems were willing to pay for them. "School districts don't appreciate or can't afford high-quality items and tests," says Alvaro Fernandez, a former Edusoft executive. "They have an insatiable hunger for inexpensive item banks that their teachers can use to help them do better on the NCLB tests." So Houghton struck a deal with FS Creations, an Ohio-based company, to market low-end questions alongside its higher-quality Riverside products, says Fernandez.

[Sidebar: Niche Players. Companies: Align to Achieve; American Institute of Research; Applied Measurement Professionals; Brown Publishing Network; CRESST; Knowledge Analysis Technologies; Mazer; McREL; Measurement Inc.; Pacific Metrics; Plato Learning; Publishers Resource Group; Questar; SmartPro3; Stanford Research Institute; The Grow Network; Thomson Prometric; Vantage Learning; Victory Productions; WestEd; Westat; Wireless Generation; Words & Numbers. Key: content creation; standards alignment; psychometric evaluation; test delivery; test scoring; score analysis; prescriptive remediation. Source: Staying Ahead of the Curve, Eduventures Inc., 2004.]

Simple Questions

Perhaps the most troubling classroom consequence of the tumult in the testing industry is the strong incentive the problems have created for states and their testing contractors to build tests that measure primarily low-level skills. NCLB has sought to lift the level of teaching in the nation's classrooms by requiring states to set challenging standards for what students should know and be able to do. But testing experts say that many of the tests that states are introducing under NCLB contain many questions that require students to merely recall and restate facts rather than do more demanding tasks like applying or evaluating information, largely because it's easier and cheaper to test the simpler tasks.

Such test questions do have a role; it's important that students grasp the most basic skills. But because teachers have so much riding on their students' results, tests that stress such skills encourage teachers to emphasize them in their classrooms at the expense of the high standards that NCLB has sought to promote. They strip teachers of the incentive to teach higher-level skills.

"Tests are focusing more and more on rote skills because it's difficult, given the demand that they be constructed quickly and cheaply, for anything else to happen," says Hoover. "Writing items that tap higher levels of comprehension is really difficult." The problem is that tests of rote skills encourage rote teaching. It's not a good model for instruction. As Marion puts it, "The further away we get from testing the types of things we want kids to do in school, the less likely we are to improve education."

Such tests also give a skewed sense of student achievement. Scores on reading tests that measure mainly literal comprehension are going to be higher than those on tests with a lot of questions that require students to

