You would think that by now every organization on earth would have heard of Goodhart’s law. The law can be briefly stated as, “When a measure becomes a target, it ceases to be a good measure.” We see the law in action all the time.
My favorite example is tying student performance on standardized tests to teacher pay and promotion. It sounds like such a good idea when you first hear it. Of course, what happened is that teachers, often at the insistence of their superiors, began “teaching to the test.”
Amazon, apparently, didn’t get the message and started monitoring employee use of AI tools. Management pinky swore that the results would not be used for performance evaluations, but the employees were a bit more intelligent than management supposed. They noticed that supervisors were monitoring their token usage and reacted exactly as Goodhart predicted: they started using AI to automate non-essential tasks to increase their token usage.
To be fair to Amazon, they never explicitly made token usage a target, but what’s the point of monitoring it otherwise? Any employee with an ounce of brains knows that when management starts collecting information about their activities, it’s not idle curiosity, and they’re apt to see the results in their next evaluation.