Responding to complaints about the readability of STAAR tests, Texas lawmakers included provisions in HB 3 requiring the commissioner of education to study the exams used in grades 3-8. The study was to examine whether the assessments are written at an appropriate reading level, whether they include only content aligned with the TEKS for that grade or earlier grades, and whether they include only passages written at or below the reading level for the grade taking the assessment.

TEA contracted with the University of Texas at Austin to conduct the study. UT released its report on 2018-19 STAAR tests in early December. It found that the vast majority of passages in that year’s reading and writing exams were within or below the test’s grade level, and that most questions aligned with what the state expects students to learn in each subject. But the researchers struggled to determine whether the test questions were too challenging for students. They concluded that analyzing the complexity of the test questions "in a reliable manner for this report is not possible."

UT will release its report on 2019-20 STAAR tests by Feb. 1.  

Study details

UT researchers examined the grades 3-8 STAAR reading, math, writing, science and social studies exams. Researchers had three objectives:

  • Task 1: A content alignment study of 17 tests
  • Task 2: A readability study on questions and answers for 17 tests
  • Task 3: A readability study on passages for six reading and two writing tests

Content alignment findings

To evaluate item alignment with precoded content standards, The Meadows Center for Preventing Educational Risk convened a panel of staff members and affiliated faculty members with content expertise and research and evaluation experience. Two panelists independently coded each item as either aligned or not aligned. When panelists disagreed, a third panelist independently reviewed the item in question and made a final determination. When a rating of not aligned was assigned, reviewers indicated the reason(s) for the rating and provided an alternative student expectation that more closely aligned with the knowledge and skills addressed in the item, if one existed.
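For readers who want the review procedure spelled out, the two-rater process with a tie-breaking third reviewer can be sketched in a few lines of Python. This is only an illustration of the rule described above, not the researchers' actual tooling; the function name and the way the third reviewer is represented are assumptions made for the example.

```python
from typing import Callable, Tuple


def final_rating(
    first_aligned: bool,
    second_aligned: bool,
    third_review: Callable[[], bool],
) -> Tuple[bool, bool]:
    """Return (aligned, needed_third_reviewer) for one test item.

    Hypothetical helper: the two initial ratings stand when they agree;
    a third reviewer is consulted only when they disagree.
    """
    if first_aligned == second_aligned:
        # Both panelists agree, so their shared rating is final.
        return first_aligned, False
    # Panelists disagree: the third reviewer makes the final determination.
    return third_review(), True


# Example: the panelists disagree and the third reviewer rates the item aligned.
aligned, escalated = final_rating(True, False, lambda: True)
print(aligned, escalated)
```

Tracking the second value across all items is what would yield a figure such as the share of items that required a third reviewer.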

Across all grades and subject areas, the overwhelming majority of items were rated as aligned to the precoded content standards. The percentage of items requiring a third reviewer ranged from 1.4% (mathematics) to 10.7% (writing). Across grades and content areas, a total of eight items were rated as not aligned by a third reviewer. Within each subject area, the final percentage of items rated as aligned to the precoded content standards ranged from 93% (social studies) to 100% (reading).

As a result, the data indicate that across grade levels and subjects, all tests included in this study were aligned with the TEKS content standards for the grade level tested.

Item readability findings

Existing research on readability pertains primarily to passages of text. There is little guidance and even less research on evaluating the readability of test items, other than a widespread recognition of the measurement challenges. Researchers said, "Because of the lack of research to guide our approach to item-level readability, we compared several methodologies to determine whether we could produce reliable results.

"Because we do not have confidence in these results, we were forced to conclude that analyzing item readability in a reliable manner for this report is not possible. Unless and until additional research provides clear guidance and evidence of a reliable way to evaluate item readability, we cannot recommend conducting analyses of the grade-level readability of test items. It is important to note that we were asked to analyze item readability, not item difficulty. An analysis of item difficulty requires a different methodology than an analysis of readability."

Passage readability findings

Overall, two of the three indices fell within or below the English/Language Arts grade band for the test’s grade level for 30 of the 35 passages analyzed. In other words, 86% of passages met the criterion for readability as defined in this study when the ELA norms were used.
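The criterion and the 86% figure can be checked with a short Python sketch. It assumes that each readability index yields a grade-level estimate, that "within or below" means not exceeding the top of the grade band, and that a passage passes when at least two of the three indices do so. The function name and the sample values are hypothetical; only the counts of 30 and 35 come from the report.

```python
from typing import Sequence


def meets_criterion(index_grades: Sequence[float], band_upper: float) -> bool:
    """Hypothetical check: at least two of three indices at or below the band's top."""
    within_or_below = sum(1 for grade in index_grades if grade <= band_upper)
    return within_or_below >= 2


# Illustrative values only (not from the report): three index estimates
# for one passage, compared against an assumed upper band limit.
print(meets_criterion([4.2, 5.1, 7.0], band_upper=5.5))

# Share of passages meeting the criterion, using the report's counts.
passages_meeting = 30
passages_total = 35
print(f"{passages_meeting / passages_total:.0%}")
```

The second print statement reproduces the rounding behind the reported share: 30 divided by 35 is about 0.857, which rounds to 86%.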

Click here to read the full report from the University of Texas.