State Testing Program- English Language Arts


New York State Testing Program 2009: English Language Arts, Grades 3–8
Technical Report
Submitted November 2009
CTB/McGraw-Hill, Monterey, California 93940
Copyright © 2009 by the New York State Education Department


Copyright

Developed and published under contract with the New York State Education Department by CTB/McGraw-Hill LLC, a subsidiary of The McGraw-Hill Companies, Inc., 20 Ryan Ranch Road, Monterey, California 93940-5703. Copyright © 2009 by the New York State Education Department. Permission is hereby granted for New York State school administrators and educators to reproduce these materials, located online at http://www.emsc.nysed.gov/ciai/testing/pubs.html, in the quantities necessary for their school's use, but not for sale, provided copyright notices are retained as they appear in these publications. This permission does not apply to distribution of these materials, electronically or by other means, other than for school use.


Table of Contents

Section I: Introduction and Overview
  Introduction
  Test Purpose
  Target Population
  Test Use and Decisions Based on Assessment
  Scale Scores
  Proficiency Level Cut Scores and Classification
  Standard Performance Index Scores
  Testing Accommodations
  Test Transcriptions
  Test Translations

Section II: Test Design and Development
  Test Description
  Test Configuration
  Test Blueprint
  2009 Item Mapping by New York State Standards
  New York State Educators' Involvement in Test Development
  Content Rationale
  Item Development
  Item Review
  Materials Development
  Item Selection and Test Creation (Criteria and Process)
  Proficiency and Performance Standards

Section III: Validity
  Content Validity
  Construct (Internal Structure) Validity
  Internal Consistency
  Unidimensionality
  Minimization of Bias

Section IV: Test Administration and Scoring
  Test Administration
  Scoring Procedures of Operational Tests
  Scoring Models
  Scoring of Constructed-Response Items
  Scorer Qualifications and Training
  Quality Control Process

Section V: Operational Test Data Collection and Classical Analysis
  Data Collection
  Data Processing
  Classical Analysis and Calibration Sample Characteristics
  Classical Data Analysis
  Item Difficulty and Response Distribution
  Point-Biserial Correlation Coefficients
  Distractor Analysis
  Test Statistics and Reliability Coefficients
  Speededness
  Differential Item Functioning

Section VI: IRT Scaling and Equating
  IRT Models and Rationale for Use
  Calibration Sample
  Calibration Process
  Item-Model Fit
  Local Independence
  Scaling and Equating
  Anchor Item Security
  Anchor Item Evaluation
  Item Parameters
  Test Characteristic Curves
  Scoring Procedure
  Weighting Constructed-Response Items in Grades 4 and 8
  Raw Score-to-Scale Score and SEM Conversion Tables
  Standard Performance Index
  IRT DIF Statistics

Section VII: Reliability and Standard Error of Measurement
  Test Reliability
  Reliability for Total Test
  Reliability of MC Items
  Reliability of CR Items
  Test Reliability for NCLB Reporting Categories
  Standard Error of Measurement
  Performance Level Classification Consistency and Accuracy
  Consistency
  Accuracy

Section VIII: Summary of Operational Test Results
  Scale Score Distribution Summary (Grades 3, 4, 5, 6, 7, and 8)
  Performance Level Distribution Summary (Grades 3, 4, 5, 6, 7, and 8)

Section IX: Longitudinal Comparison of Results

Appendix A: ELA Passage Specifications
Appendix B: Criteria for Item Acceptability
Appendix C: Psychometric Guidelines for Operational Item Selection
Appendix D: Factor Analysis Results
Appendix E: Items Flagged for DIF
Appendix F: Item-Model Fit Statistics
Appendix G: Derivation of the Generalized SPI Procedure
  Estimation of the Prior Distribution of T_j
  Check on Consistency and Adjustment of Weight Given to Prior Estimate
  Possible Violations of the Assumptions
Appendix H: Derivation of Classification Consistency and Accuracy
  Classification Consistency
  Classification Accuracy
Appendix I: Scale Score Frequency Distributions
References


List of Tables

Table 1. NYSTP ELA 2009 Test Configuration
Table 2. NYSTP ELA 2009 Cluster Items
Table 3. NYSTP ELA 2009 Test Blueprint
Table 4a. NYSTP ELA 2009 Operational Test Map, Grade 3
Table 4b. NYSTP ELA 2009 Operational Test Map, Grade 4
Table 4c. NYSTP ELA 2009 Operational Test Map, Grade 5
Table 4d. NYSTP ELA 2009 Operational Test Map, Grade 6
Table 4e. NYSTP ELA 2009 Operational Test Map, Grade 7
Table 4f. NYSTP ELA 2009 Operational Test Map, Grade 8
Table 5. NYSTP ELA 2009 Standard Coverage
Table 6. Factor Analysis Results for ELA Tests (Total Population)
Table 7a. NYSTP ELA Grade 3 Data Cleaning
Table 7b. NYSTP ELA Grade 4 Data Cleaning
Table 7c. NYSTP ELA Grade 5 Data Cleaning
Table 7d. NYSTP ELA Grade 6 Data Cleaning
Table 7e. NYSTP ELA Grade 7 Data Cleaning
Table 7f. NYSTP ELA Grade 8 Data Cleaning
Table 8a. Grade 3 Sample Characteristics (N = 194,543)
Table 8b. Grade 4 Sample Characteristics (N = 192,275)
Table 8c. Grade 5 Sample Characteristics (N = 193,173)
Table 8d. Grade 6 Sample Characteristics (N = 197,010)
Table 8e. Grade 7 Sample Characteristics (N = 201,481)
Table 8f. Grade 8 Sample Characteristics (N = 205,928)
Table 9a. P-values, Scored Response Distributions, and Point Biserials, Grade 3
Table 9b. P-values, Scored Response Distributions, and Point Biserials, Grade 4
Table 9c. P-values, Scored Response Distributions, and Point Biserials, Grade 5
Table 9d. P-values, Scored Response Distributions, and Point Biserials, Grade 6
Table 9e. P-values, Scored Response Distributions, and Point Biserials, Grade 7
Table 9f. P-values, Scored Response Distributions, and Point Biserials, Grade 8
Table 10. NYSTP ELA 2009 Test Form Statistics and Reliability
Table 11. NYSTP ELA 2009 Classical DIF Sample N-Counts
Table 12. Number of Items Flagged by SMD and Mantel-Haenszel DIF Methods
Table 13. Grades 3 and 4 Demographic Statistics
Table 14. Grades 5 and 6 Demographic Statistics
Table 15. Grades 7 and 8 Demographic Statistics
Table 16. NYSTP ELA 2009 Calibration Results
Table 17. ELA Grade 3 Item Fit Statistics
Table 18. ELA Grade 4 Item Fit Statistics
Table 19. ELA Grade 5 Item Fit Statistics
Table 20. ELA Grade 6 Item Fit Statistics
Table 21. ELA Grade 7 Item Fit Statistics
Table 22. ELA Grade 8 Item Fit Statistics
Table 23. NYSTP ELA 2009 Final Transformation Constants
Table 24. ELA Anchor Evaluation Summary
Table 25. 2009 Operational Item Parameter Estimates, Grade 3
Table 26. 2009 Operational Item Parameter Estimates, Grade 4
Table 27. 2009 Operational Item Parameter Estimates, Grade 5
Table 28. 2009 Operational Item Parameter Estimates, Grade 6
Table 29. 2009 Operational Item Parameter Estimates, Grade 7
Table 30. 2009 Operational Item Parameter Estimates, Grade 8
Table 31. Grade 3 Raw Score-to-Scale Score (with Standard Error)
Table 32. Grade 4 Raw Score-to-Scale Score (with Standard Error)
Table 33. Grade 5 Raw Score-to-Scale Score (with Standard Error)
Table 34. Grade 6 Raw Score-to-Scale Score (with Standard Error)
Table 35. Grade 7 Raw Score-to-Scale Score (with Standard Error)
Table 36. Grade 8 Raw Score-to-Scale Score (with Standard Error)
Table 37. SPI Target Ranges
Table 38. Number of Items Flagged for DIF by the Linn-Harnisch Method
Table 39. ELA 3–8 Tests Reliability and Standard Error of Measurement
Table 40. Reliability and Standard Error of Measurement, MC Items Only
Table 41. Reliability and Standard Error of Measurement, CR Items Only
Table 42a. Grade 3 Test Reliability by Subgroup
Table 42b. Grade 4 Test Reliability by Subgroup
Table 42c. Grade 5 Test Reliability by Subgroup
Table 42d. Grade 6 Test Reliability by Subgroup
Table 42e. Grade 7 Test Reliability by Subgroup
Table 42f. Grade 8 Test Reliability by Subgroup
Table 43. Decision Consistency (All Cuts)
Table 44. Decision Consistency (Level III Cut)
Table 45. Decision Agreement (Accuracy)
Table 46. ELA Grades 3–8 Scale Score Distribution Summary
Table 47. Scale Score Distribution Summary, by Subgroup, Grade 3
Table 48. Scale Score Distribution Summary, by Subgroup, Grade 4
Table 49. Scale Score Distribution Summary, by Subgroup, Grade 5
Table 50. Scale Score Distribution Summary, by Subgroup, Grade 6
Table 51. Scale Score Distribution Summary, by Subgroup, Grade 7
Table 52. Scale Score Distribution Summary, by Subgroup, Grade 8
Table 53. ELA Grades 3–8 Performance Level Cut Scores
Table 54. ELA Grades 3–8 Test Performance Level Distributions
Table 55. Performance Level Distribution Summary, by Subgroup, Grade 3
Table 56. Performance Level Distribution Summary, by Subgroup, Grade 4
Table 57. Performance Level Distribution Summary, by Subgroup, Grade 5
Table 58. Performance Level Distribution Summary, by Subgroup, Grade 6
Table 59. Performance Level Distribution Summary, by Subgroup, Grade 7
Table 60. Performance Level Distribution Summary, by Subgroup, Grade 8
Table 61. ELA Grades 3–8 Test Longitudinal Results
Table A1. Readability Summary Information for 2009 Operational Test Passages
Table A2. Number, Type, and Length of Passages
Table D1. Factor Analysis Results for ELA Tests (Selected Subpopulations)
Table E1. NYSTP ELA 2009 Classical DIF Item Flags
Table E2. Items Flagged for DIF by the Linn-Harnisch Method
Table F1. ELA Item Fit Statistics, Grade 3
Table F2. ELA Item Fit Statistics, Grade 4
Table F3. ELA Item Fit Statistics, Grade 5
Table F4. ELA Item Fit Statistics, Grade 6
Table F5. ELA Item Fit Statistics, Grade 7
Table F6. ELA Item Fit Statistics, Grade 8
Table I1. Grade 3 ELA 2009 SS Frequency Distribution, State
Table I2. Grade 4 ELA 2009 SS Frequency Distribution, State
Table I3. Grade 5 ELA 2009 SS Frequency Distribution, State
Table I4. Grade 6 ELA 2009 SS Frequency Distribution, State
Table I5. Grade 7 ELA 2009 SS Frequency Distribution, State
Table I6. Grade 8 ELA 2009 SS Frequency Distribution, State


Section I: Introduction and Overview

Introduction

An overview of the New York State Testing Program (NYSTP) Grades 3–8 English Language Arts (ELA) 2009 Operational (OP) Tests is provided in this report. The report contains information about OP test development and content, item and test statistics, validity and reliability, differential item functioning studies, test administration and scoring, scaling, and student performance.

Test Purpose

The NYSTP is an assessment system designed to measure concepts, processes, and skills taught in schools in New York. The ELA tests target student progress toward three of the four content standards, as described in Section II, "Test Design and Development," subsection "Content Rationale." The Grades 3–8 ELA Tests are written for all students to have the opportunity to demonstrate their knowledge and skills in these standards. The established cut scores classify student proficiency into one of four levels based on test performance.

Target Population

Students in New York State public school Grades 3, 4, 5, 6, 7, and 8 (and ungraded students of equivalent age) are the target population for the Grades 3–8 testing program. Nonpublic schools may participate in the testing program, but participation is not mandatory for them. In 2009, nonpublic schools participated in all grade tests but were not well represented in the testing program. The New York State Education Department (NYSED) made a decision to exclude these schools from the data analyses. In 2009, public school students were required to take all State assessments administered at their grade level, except for a very small percentage of students with disabilities who took the New York State Alternate Assessment (NYSAA) for students with severe disabilities. For more detail on this exemption, please refer to the School Administrator's Manual for Public and Nonpublic Schools (SAM), available online at http://www.emsc.nysed.gov/osa/sam/gr3-8ela-08.pdf

Test Use and Decisions Based on Assessment

The Grades 3–8 ELA Tests are used to measure the extent to which individual students achieve the New York State Learning Standards in ELA and to determine whether schools, districts, and the State meet the required progress targets specified in the New York State accountability system. There are several types of scores available from the Grades 3–8 ELA Tests, and these are discussed in this section.

Scale Scores

The scale score is a quantification of the ability measured by the Grades 3–8 ELA Tests at each grade level. The scale scores are comparable within each grade level but not across grades, because the Grades 3–8 ELA Tests are not on a vertical scale. The test scores are reported at the individual level and can also be aggregated. Detailed information on the derivation and properties of scale scores is provided in Section VI, "IRT Scaling and Equating." The Grades 3–8 ELA Tests scores are used to determine student progress within schools and districts, support registration of schools and districts, determine eligibility of students for additional instruction time, and provide teachers with indicators of a student's need, or lack of need, for remediation in specific content-area knowledge.

Proficiency Level Cut Scores and Classification

Students are classified as Level I (Not Meeting Learning Standards), Level II (Partially Meeting Learning Standards), Level III (Meeting Learning Standards), or Level IV (Meeting Learning Standards with Distinction). The proficiency cut scores used to distinguish among Levels I, II, III, and IV were established during the process of standard setting. There is reason to believe, and evidence to support, the claim that New York State ELA proficiency cut scores reflect the abilities intended by the New York State Education Department. Performance of students on the Grades 3–8 ELA Tests in relation to proficiency level cut scores is reported in the form of a performance level classification. The performances of schools, districts, and the State are reported as percentages of students in each performance level. Detailed information on the process of establishing performance cut scores and their association with test content is provided in the Bookmark Standard Setting Technical Report 2006 for Grades 3, 4, 5, 6, 7, and 8 English Language Arts and the New York State ELA Measurement Review Technical Report 2006 for English Language Arts.

Standard Performance Index Scores

Standard performance index (SPI) scores are obtained from the Grades 3–8 ELA Tests. The SPI score is an indicator of student ability, knowledge, and skills in specific learning standards, and it is used primarily for diagnostic purposes to help teachers evaluate academic strengths and weaknesses of their students. These scores can be effectively used by teachers at the classroom level to modify their instructional content and format to best serve their students' specific needs. Detailed information on the properties and use of SPI scores is provided in Section VI, "IRT Scaling and Equating."

Testing Accommodations

In accordance with federal law under the Americans with Disabilities Act and fairness in testing as outlined by the Standards for Educational and Psychological Testing (American Educational Research Association, American Psychological Association, and National Council on Measurement in Education, 1999), accommodations that do not alter the measurement of any construct being tested are allowed for test takers. The allowance is in accordance with a student's Individual Education Program (IEP) or Section 504 Accommodation Plan (504 Plan). School principals are responsible for ensuring that proper accommodations are provided when necessary and that staff providing accommodations are properly trained. Details on testing accommodations can be found in the School Administrator's Manual.

Test Transcriptions

For visually impaired students, large type and braille editions of the test books are provided. The students dictate and/or record their responses. Teachers transcribe student responses to multiple-choice questions onto scannable answer sheets and transcribe the responses to constructed-response questions onto the regular test books. The large type editions are created by CTB/McGraw-Hill and printed by NYSED, and the braille editions are produced by Braille Publishers, Inc. The lead transcribers are members of the National Braille Association, California Transcribers and Educators of the Visually Handicapped, and the Contra Costa Braille Transcribers, and they have Library of Congress and Nemeth Code braille certifications. Braille Publishers, Inc. produced the braille editions for the previous Grades 4 and 8 tests. Camera-copy versions of the regular test books are provided to the braille vendor, who then produces the braille editions. Proofs of the braille editions are submitted to NYSED for review and approval prior to production.

Test Translations

Since these are assessments of student proficiency in English language arts, the Grades 3–8 ELA Tests are not translated into any other language.
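The four-level classification described under "Proficiency Level Cut Scores and Classification" can be illustrated with a minimal Python sketch. It is not the report's scoring procedure. The cut scores below are hypothetical placeholders (the operational values are those listed in the report's Table 53, which is not reproduced here), the function names are invented for illustration, and the convention that a scale score equal to a cut score falls in the higher level is an assumption.

```python
from bisect import bisect_right

LEVELS = ("Level I", "Level II", "Level III", "Level IV")

# HYPOTHETICAL cut scores for one grade: the minimum scale scores assumed
# for Levels II, III, and IV. These are placeholders, not reported values.
HYPOTHETICAL_CUTS = (620, 650, 720)


def classify(scale_score, cuts):
    """Return the performance level for one scale score.

    `cuts` holds three ascending cut scores. Assumption: a score equal
    to a cut score is placed in the higher level.
    """
    if len(cuts) != 3 or list(cuts) != sorted(cuts):
        raise ValueError("cuts must be three ascending scale scores")
    # bisect_right counts how many cut scores are <= scale_score.
    return LEVELS[bisect_right(cuts, scale_score)]


def level_percentages(scale_scores, cuts):
    """Aggregate individual classifications into percentages per level,
    mirroring how school, district, and State results are reported."""
    if not scale_scores:
        raise ValueError("at least one scale score is required")
    counts = {level: 0 for level in LEVELS}
    for score in scale_scores:
        counts[classify(score, cuts)] += 1
    total = len(scale_scores)
    return {level: 100.0 * n / total for level, n in counts.items()}


if __name__ == "__main__":
    scores = [600, 630, 650, 700, 730]  # made-up scale scores
    for score in scores:
        print(score, classify(score, HYPOTHETICAL_CUTS))
    print(level_percentages(scores, HYPOTHETICAL_CUTS))
```

Because the scale scores are not on a vertical scale, a sketch like this would need a separate set of cut scores for each grade rather than one shared set.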


Section II: Test Design and Development

Test Description

The Grades 3–8 ELA Tests are New York State Learning Standards-based, criterion-referenced tests composed of multiple-choice (MC) and constructed-response (CR) items. The tests were administered in New York classrooms during January 2009 over a two-day (Grades 3, 5, 7, and 8) or three-day (Grades 4 and 6) period. The tests were printed in black and white and incorporated the concepts of universal design. Copies of the OP tests are available online at http://www.nysedregents.org/testing/elaei/09exams/home.htm
Details on the administration and scoring of these tests can be found in Section IV, "Test Administration and Scoring."

Test Configuration

The OP test books were administered, in order, on two to three consecutive days, depending on the grade. Table 1 provides information on the number and type of items in each book, as well as testing times. Students were administered a Reading section (Book 1, all grades; Book 3, Grades 4, 6, and 8) and a Listening section (Book 2). Students in Grades 3, 5, and 7 also completed an editing paragraph in Book 2. The 2009 Teacher's Directions, available online (http://www.emsc.nysed.gov/osa/elaei/ela-td-35.pdf and http://www.emsc.nysed.gov/osa/elaei/ela-td-6-8.pdf), as well as the 2009 School Administrator's Manual (http://www.emsc.nysed.gov/osa/sam/ela-sam-09.pdf), provide details on security, scheduling, classroom organization and preparation, test materials, and administration.

Table 1. NYSTP ELA 2009 Test Configuration

                      Number of Items     Allotted Time (minutes)
Grade  Day     Book   MC    CR    Total   Testing   Prep
3      1       1      20    1     21      40        10
3      2       2      4     3     7       35        15
3      Totals         24    4     28      75        25
4      1       1      28    0     28      45        10
4      2       2      0     3     3       45        15
4      3       3      0     4     4       60        10
4      Totals         28    7     35      150       35
5      1       1      20    1     21      45        10
5      2       2      4     2     6       30        15
5      Totals         24    3     27      75        25
6      1       1      26    0     26      55        10
6      2       2      0     4     4       45        15
6      3       3      0     4     4       60        10
6      Totals         26    8     34      160       35
7      1       1      26    2     28      55        10
7      2       2      4     3     7       30        15
7      Totals         30    5     35      85        25
8      1       1      26    0     26      55        10
8      1       2      0     4     4       45        15
8      2       3      0     4     4       60        10
8      Totals         26    8     34      160       35

Note: Does not reflect cluster-scoring; reflects actual items in the test books.

In most cases, the test book item number is also the item number for the purposes of data analysis. The exception is that constructed-response items from Grades 4, 6, and 8 are cluster-scored. Table 2 lists the test book item numbers and the item numbers as scored. Because analyses are based on scored data, the latter item numbers will be referred to in this technical report.

Table 2. NYSTP ELA 2009 Cluster Items

Grade  Cluster Type        Contributing Book Items   Item Number for Data Analysis
4      Listening           29, 30, 31                29
4      Reading             32, 33, 34, 35            30
4      Writing Mechanics   31, 35                    31
6      Listening           27, 28, 29, 30            27
6      Reading             31, 32, 33, 34            28
6      Writing Mechanics   30, 34                    29
8      Listening           27, 28, 29, 30            27
8      Reading             31, 32, 33, 34            28
8      Writing Mechanics   30, 34                    29

Test Blueprint

The NYSTP Grades 3–8 ELA Tests assess students on three learning standards: (S1) Information and Understanding, (S2) Literary Response and Expression, and (S3) Critical Analysis and Evaluation. The test items are indicators used to assess a variety of reading, writing, and listening skills against each of the three learning standards. Standard 1 is assessed primarily by use of test items associated with informational passages; Standard 2 is assessed primarily by use of test items associated with literary passages; and Standard 3 is assessed by use of test items associated with a combination of genres. In addition, students are tested on writing mechanics, which is assessed independent of alignment to the learning standards, since writing mechanics is associated with all three learning standards. The distribution of score points across the learning standards was determined during blueprint specifications meetings held with panels of New York State educators at the start of the testing program, prior to item development. The distribution in each grade reflects the number of assessable
