The problem with using tests to rate music, art and gym teachers

For the first time this year, all Georgia teachers will be rated in part on student test results.

That’s straightforward enough for teachers whose students take state standardized tests. But the majority of teachers – in subjects like art, music and gym – teach subjects and grades that aren’t covered by such high-stakes tests.

For them, many school districts have come up with their own exams. Educators and research suggest that system isn’t good enough for evaluations that could make or break careers.

The new system for rating these teachers is open to cheating, educators say, because in some cases teachers administer and grade the very tests used to evaluate them. The quality of tests varies by district, meaning a Spanish teacher in Gwinnett could be graded differently than one in Atlanta. And there are concerns about fairness, because research shows teachers of non-state tested subjects tend to score lower than those who teach courses where state standardized tests are given.

Georgia Department of Education deputy superintendent Avis King said the department is aware of the concerns and taking steps to address them.

“That’s why we are being very careful and cautious as we move forward,” she said.

The state’s plan is part of a new educator evaluation system which bases about half of teachers’ job ratings on an administrator watching them teach and about half on their students’ academic growth.

For teachers of grades and subjects covered by state tests, including math, English, social studies and science, students’ growth is measured by state tests.

For about 70 percent of teachers, whose areas are not covered by state tests, it’s often measured by tests their own districts design.

The new system could change. Georgia’s incoming state school superintendent, Richard Woods, has said test scores should play a smaller role in teacher evaluations. And Georgia has asked the U.S. Department of Education for a delay in using the new overall ratings for decisions about hiring, firing and pay. But they’ll still be used this year to determine which educators in 26 districts receiving federal Race to the Top money get millions of dollars in bonuses.

Educators have told the Georgia Department of Education there are problems with how teachers of non-state tested subjects are evaluated, state reports on districts already using the new system show.

The tests and the cut-off scores that place teachers at different rating levels vary from district to district. Some districts — like Atlanta Public Schools — use multiple-choice tests to evaluate all teachers. Other districts combine multiple choice tests with other kinds of tests, like essays or how well music students, for example, play a C-major scale.

The state has sample materials to help guide districts in setting goals for student growth in different classes.

Carrie Staines, a teacher at Druid Hills High School in DeKalb County, said the quality of test questions in her district is poor. She should know: She was among the DeKalb teachers who volunteered to help write them. The Advanced Placement psychology test she wrote with two other teachers is far too short, at 20 questions, and reflects only “random” tidbits of knowledge that aren’t necessarily crucial, she said.

State officials say the new system isn’t supposed to be used to compare teachers in different districts. The idea is to measure how much students “grow” in every classroom, said Michele Purvis, an evaluation system specialist with the Georgia Department of Education.

“They’re not designed to compare this British lit class in this district to a British lit class in another district,” she said.

Another issue educators are concerned about: Student growth ratings for teachers of areas not covered by state tests tend to be lower than those for teachers of state-tested subjects, according to a 2014 University of Georgia research report.

In some cases, the lower scores could be due to initial miscalculations in districts’ expectations, said King, the state department of education official. “There’s a learning curve involved” with the new tests, she said.

And in some districts, teachers administer and grade the tests that are used to evaluate them. The state monitors its standardized tests in math, reading and other areas for cheating, but security for these new, local tests is left up to individual districts. So far, the number of potential test-security problems reported has been “relatively low,” King said.

But Melissa King Rogers, an English teacher at Druid Hills High School in DeKalb County, said “I think it’s just wide open to the sorts of scandals we’ve seen in APS.”

She was referring to the test-cheating scandal that resulted in the indictment of 35 former Atlanta Public Schools employees and allegations of secret answer-erasure parties and other subterfuge.

The Atlanta Journal-Constitution asked Bill Slotnik, executive director of the nonprofit Community Training and Assistance Center, which has helped dozens of states develop ways of evaluating teachers, if Georgia’s method for teachers of areas not covered by state standardized tests is fair and likely to be effective.

“Fairness, like beauty, tends to be in the eye of the beholder,” he said.

But Georgia’s system appears to be running into challenges, he said. Georgia would do better to show educators how a new evaluation system could improve instruction, he said, and involve teachers directly in finding better ways to teach students and reach the goals set under the new system.

“The more these kinds of things don’t happen, the more” the evaluation process “or any other reform just becomes a compliance activity,” he said.

Staff writer Jeff Ernsthausen contributed to this article.

Other states do it

Georgia is one of about 20 states using student academic growth as a major factor in rating teachers. In Georgia and other states, new teacher evaluation systems were part of applications for federal Race to the Top grants.

How good is your 8th-grade band teacher?

Georgia’s new teacher rating system bases about half of teachers’ job ratings on an administrator watching them teach and half on their students’ academic growth.

For example, to measure “student growth” in an 8th-grade band class, students might take a three-part test at the start of school and again in April. The test could include playing two major scales, a sight-reading exercise and a multiple-choice test.

About half of the teacher’s overall job rating depends on how much test scores from students in their classes improve from the start of school to the end.

Source: Georgia Department of Education sample student learning objective statement

By Molly Bloom and Ty Tagami
The Atlanta Journal-Constitution




6 responses to “The problem with using tests to rate music, art and gym teachers”

  1. Your comment:
    “For the first time this year, all Georgia teachers will be rated in part on student test results.
    That’s straightforward enough for teachers whose students take state standardized tests.”
    As journalists you need to do your homework!
    Here are just a few of the glitches in the “you can measure academic growth by comparing this year’s class to last year’s class” idea.
    How can you measure academic growth from Fifth Grade U.S. History when compared to Sixth Grade World History of Latin America, Canada, Europe, and Australia?
    How can you measure academic growth from Sixth Grade World History of Latin America, Canada, Europe, and Australia when compared to Seventh Grade World History of Africa, Southwest Asia (Middle East), Southern and Eastern Asia?
    How can you measure academic growth from Seventh Grade World History of Africa, Southwest Asia (Middle East), Southern and Eastern Asia to Eighth Grade Georgia History?
    And this is only in the social sciences. Check out Science.
    This misleading this-year-compared-to-last-year mess is nuts.
    However, it is not quite as disjointed for elementary Math or English Language Arts.
    Take the time to check out the Standards, for yourself.

  2. bkendall527,
    You asked how you measure growth in an independent subject that doesn’t build on itself from year to year. “Growth Model” is a misleading phrase when talking about the independent subjects. The point of comparing student test scores with their academic peers makes sure that you are comparing students that start in the same place. When you have an independent subject like the ones you mention, everybody starts in the same place … the beginning.

    We will be able to see how well teachers teach the various academic peer groups compared to the other teachers. An ‘A’ student in World History knows just as much about Georgia History as a ‘C’ student in World History. How well will the Georgia History teacher do with the various peer groups? That is what the growth models will tell us in the cases you mentioned.

  3. bkendall527

    GADoE has written about profiling students, is this what you are referring to?
    Because if it is, it will not work.
    You appear to be making an assumption that every student starts at the beginning, and that is a generalization not based on data. Every Student shows up the first day with different levels of knowledge, and skill sets.
    It is easy to forget that schools push students to use short term memory to get them through standardized testing.
    How much short term memory lasts the summer?
    Which is one reason why we need a method to assess where each student stands academically on the first day of school, every year, so you can attempt to accurately measure academic growth for the current school year, or estimate the difference a particular Teacher made on a student.
    GADoE required SLO (Student Learning Objectives) tests to be written by the districts that chose the option to field Race to the Top. My School District was one of them. Then GADoE chose to go to Georgia Milestones instead of using the only assessment that assesses academic growth each school year.
    I suspect that GADoE is planning to continue to hide the truth about how poorly the average Georgia student scored during the last fourteen years on standardized assessments.
    I know because GADoE has been kind enough to publish the “Mean Scale Score” for the Standardized assessments. Possibly with the thought that nobody could do the math to determine just what the value of the Mean Scaled Score was.
    Once you realize that a scaled score is simply a percent of a perfect scaled score, the computation is easier. I have completed the computations for the CRCT and EOCT for the last eight years. The best I can say: “It’s depressing.”

  4. I wouldn’t call it profiling. Like you said, students have varying levels of knowledge. I say those varying levels of knowledge are the academic peer groups and students are only compared to other students in that group.

    Personally, I’m not a big fan of all these tests. It’s the centralization of education. I don’t like the short term regurgitation you referred to, teaching to the tests, the pressure on the kids, the time away from learning to take the tests, the state and federal government getting involved with a heavy hand, etc …

    The GaDOE mentioned “the low threshold for meeting the state’s standard — among the lowest in the nation — wasn’t doing Georgia students any favors.”

    I would like to see your computations for the CRCT and EOCT.

  5. bkendall527

    I was interviewed by Wayne Washington. I suspect he thinks I am a nut.

    Although, I agree that the new assessment will be harder, if for no other reason than higher levels of Literacy Skills required for the written portions.

    Why may Mr. Washington think I am a nut?

    I shared:

    In February 2014, it was announced at the “2014 CRCT Pre-Administration Workshop” that, “Cut scores are the same as the past several administrations.” Found on page ten.

    Which is not true, because if you request the GADoE documents roughly labeled “An Accountability & Assessment Brief – Scale Scores and Cut Scores for the Criterion-Referenced Competency Test (CRCT)” for each of the last eight years, you will find that the assessment standards have been changed every year, including 2014.

    Last Spring forty-four of the sixty CRCT standards were dropped. And my best guess, it was done in May.

    Then in a June fourth Press release.

    “The increased expectations for student learning reflected in Georgia Milestones may mean initially lower scores than the previous years’ CRCT or EOCT scores. That is to be expected and should bring Georgia’s tests in line with other indicators of how our students are performing,” State School Superintendent Dr. John Barge said.

    Which makes me wonder.

    Are the assessment standards really being raised, or are we just asking a few tougher open ended questions?

    Once it becomes History, we will know what was done.

  6. bkendall527

    I don’t compare Apples to Pineapples like the GADoE publishes to the public annually. I track Graduating Classes. If you’re still interested in my computations for the CRCT and EOCT, I have a chart for the Class of 2014 which includes ACT and SAT.

    My email