Teacher Effectiveness

Tuesday, October 15th, 2013

Let’s say you were trying to measure teacher effectiveness. Handle suggests:

You could theorize that any student’s test scores are 100% derived from the teacher’s pedagogic style.  You would then test all the students, take a class average, compare it to all the other class averages, and grade the teacher on where the result fits in the broader distribution.

Well, perhaps you find that theory absurd.  Just ask any teacher; they’ll confirm it’s absurd.  Now what?  Well, maybe you alter your theory and say that the teacher is only responsible for the improvement in the class average from the previous year.  You do the same as above, and the result is less unrealistic but still pretty absurd.  The teachers will still resist (in one way or another) if you try to evaluate them that way.

But let’s say you test each student for their IQ and average test scores.  You could even measure all their social statistics.  You then put them into tracked, leveled classes according to both cognitive ability and prior knowledge, so that the teacher can teach in one way and use time more efficiently than if she had to deal with a large variation in ability and preparedness.  Then you come up with an ‘average expected value added’ tailored to each student given similar profiles around the country.

And then, finally, you grade the teacher relative to her peers on the basis of how much value she actually added to the students based on what we expected her to be able to achieve.  Now the teachers might relax the grip on their pitchforks and actually get on board your bandwagon.  That’s because you are now measuring something that they know aligns with the notion of ‘teacher effectiveness’ and accords with reality, and not concocted utopian fantasy.
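To make the arithmetic of that last step concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the teacher names, the numbers, and the idea that each student's ‘average expected value added’ has already been computed somewhere else from similar profiles. It only shows the final comparison, which is averaging each teacher's (actual gain minus expected gain) and ranking teachers against their peers.

```python
from statistics import mean

# Hypothetical records, one per student: (teacher, expected_gain, actual_gain).
# expected_gain stands in for the 'average expected value added' for students
# with a similar profile; how it is estimated is outside this sketch.
records = [
    ("Adams", 5.0, 7.0),
    ("Adams", 3.0, 4.0),
    ("Baker", 5.0, 4.0),
    ("Baker", 4.0, 4.0),
    ("Clark", 6.0, 3.0),
    ("Clark", 2.0, 1.0),
]


def value_added_by_teacher(records):
    """Average (actual - expected) gain across each teacher's students."""
    residuals = {}
    for teacher, expected, actual in records:
        residuals.setdefault(teacher, []).append(actual - expected)
    return {teacher: mean(values) for teacher, values in residuals.items()}


def rank_teachers(scores):
    """Order teachers from most to least value added relative to expectation."""
    return sorted(scores, key=scores.get, reverse=True)


scores = value_added_by_teacher(records)
ranking = rank_teachers(scores)
print(scores)
print(ranking)
```

The hard part, of course, is not this averaging but producing a believable expected gain for each student in the first place.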

In other words, your latest social theory is now tempered with a lot more common sense reality than when we started.  But, you know, it’s funny, we aren’t actually measuring teacher effectiveness or school quality in this realistic, common-sense way.  Why not?  That ‘structure of taboos’ thing, that’s why not.

Comments

  1. Boonton says:

    Hmm,

    But, you know, it’s funny, we aren’t actually measuring teacher effectiveness or school quality in this realistic, common-sense way. Why not?

    What is this easy ‘common-sense’ way of measuring teacher effectiveness?

    But let’s say you test each student for their IQ and average test scores. You could even measure all their social statistics. You then put them into tracked, leveled classes according to both cognitive ability and prior knowledge, so that the teacher can teach in one way and use time more efficiently than if she had to deal with a large variation in ability and preparedness. Then you come up with an ‘average expected value added’ tailored to each student given similar profiles around the country.

    Ahhh that’s it. Just measure ‘all their social statistics’ (and how many of those are there? 10? 50? 1000?) and produce an ‘average expected value added’…

    This all brings to mind Nate Silver’s book, The Signal and the Noise, which had a very interesting chapter about loading all the stats he could find into a model that could predict which players in the minor leagues would likely make excellent major league baseball players. Baseball is a good sport to try this with since it has a huge amount of statistical data to pull from. Nonetheless he still was only able to get about even with the ability of the best talent scouts who operate with their experience and ‘hunches’. Of course he isn’t the only person running such models, there are a few of them (and now probably even more) and all of them are quite complicated and quite different from each other. This indicates to me that such an effort to produce such a ‘common sense’ model would be very exposed to gaming the system.

    Of course baseball also benefits from the fact that good output is well defined. While people can argue forever about who the best baseball player is, it’s pretty easy to see what a good player would be. Can he catch? hit? run? That’s a pretty limited number of ways to measure the quality of a player and it seems like something we can measure in an objective manner.

    What makes a good output of a school? Is it how many go on to college and become English PhDs? What about the student who absorbs everything in HS but opts to go into a mechanical trade? Or becomes an artist? Was his education bad because he didn’t rack up more academic kudos?

    The implied output here is test scores. As in “Little Johnny got 110 last year on the Massive High Stakes Super Test. The complicated model we built says that, given his IQ/Social Status/Parents being divorced/Income Status/Religion/Birth Order, he should get 115 this year. If he gets higher than that we can say the teacher did a great job; if he gets less, a bad job.”

    But is that the right output? Does getting 120 on the test indicate Johnny is being better educated than if he only gets 105? Let’s say with a lot of digging through Big Data we figure out how to double test scores. What happens then? Will that make everyone better? Smarter? Will productivity double? If it doesn’t, will the test makers issue us refunds for all the time and money spent on this project?
