The History of UTF-8

Via the Programming subreddit I was led to this History of UTF-8 by Rob Pike. I already knew some of that history (such as Ken Thompson and Rob Pike designing UTF-8 on a paper placemat at a diner), but much of the story was new to me. Pike felt compelled to set the record straight because of a persistent story that UTF-8 was designed by IBM and then first implemented in Plan 9.

In fact, the IBM folks had called Pike to vet a design of theirs, but Pike didn’t like it because their scheme lacked the ability to resynchronize a byte stream picked up mid-run with less than one character being consumed. So Pike and Thompson went to dinner, where Thompson sketched out the encoding on the famous placemat. They returned from dinner and, while Thompson banged out the packing and unpacking code, Pike started working on the graphics library. The code was done the next day and in another day they had it running in Plan 9.
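
To make the synchronization point concrete, here is a minimal sketch in Python. It is my own illustration, not Thompson's code, and the function name resync is hypothetical. It relies on one property of UTF-8 as it exists today: every continuation byte has the bit pattern 10xxxxxx, so a reader dropped into the middle of a stream only has to skip those bytes to land on the next character boundary, losing less than one full character.

```python
def resync(data: bytes, pos: int) -> int:
    """Return the index of the first character boundary at or after pos.

    Hypothetical helper for illustration only. It assumes data is
    well-formed UTF-8 and does not validate the bytes it skips.
    """
    # Continuation bytes look like 10xxxxxx; lead bytes and ASCII never do.
    while pos < len(data) and (data[pos] & 0xC0) == 0x80:
        pos += 1
    return pos


text = "naïve café".encode("utf-8")

# Index 3 falls in the middle of the two-byte encoding of "ï".
start = resync(text, 3)
tail = text[start:].decode("utf-8")
```

Here the reader skips a single stray continuation byte and the rest of the stream decodes cleanly from there.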

This is a great story and if you have any interest in Unix (or Plan 9) history, it’s well worth a read. The story comes complete with the code that Thompson wrote that first night as well as some subsequent code, which is, apparently, pretty much what is running in Plan 9 today.
