Unicode in AWK

A few days ago I wrote about the excellent video of David Brailsford and Brian Kernighan discussing AWK and its history. In the video, Kernighan mentions that he’s been working on enabling Unicode in the One True AWK. Here’s a pull request from Kernighan showing that he’s mostly accomplished that goal.

At one level, it’s easy to believe that it’s basically a trivial change but as AWK demonstrates it’s not always so easy. Probably the hardest thing is fixing AWK’s regex parser to accept and deal with Unicode. But even simple things like calculating the length of strings can be a problem.
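To see why string length gets tricky, here is a minimal sketch in Python (not AWK, and not how Kernighan's change works). It compares the number of characters in a string with the number of bytes in its UTF-8 encoding. The helper name `lengths` and the sample strings are made up for illustration.

```python
# Illustration only: a tool that counts bytes and a tool that counts
# characters agree on ASCII text and disagree on everything else.

def lengths(text: str) -> tuple[int, int]:
    """Return (number of code points, number of UTF-8 bytes) for text."""
    return len(text), len(text.encode("utf-8"))

# Samples are written with escapes so the source file itself stays ASCII.
samples = ["awk", "na\u00efve", "\u65e5\u672c\u8a9e"]

for sample in samples:
    chars, nbytes = lengths(sample)
    print(f"{sample!r}: {chars} code points, {nbytes} UTF-8 bytes")
```

For plain ASCII the two counts match, which is why a byte-oriented length function looked correct for so long.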

When AWK was first developed, and for long afterwards, ASCII was sufficient. These days, it’s a real imposition to deal with an app that doesn’t support Unicode. Kernighan’s work on Unicode support will ensure that AWK continues to be a useful tool not only for English-speaking people but for those who speak languages with non-ASCII characters as well.

If you’re a young engineer, the idea of open source and having access to the source code to your tools seems unexceptional. That’s just the way it is. But AWK comes from a time when that wasn’t true. It’s great to see the original AWK still available and still under development. AWK and its developers are a national treasure that we should all be thankful for.

Posted in General | Tagged | Leave a comment

Shouting at Disks

Recently, I wrote about a Janet Jackson song that could cause laptops to crash. That turned out to involve frequencies from the song that resonated with a critical frequency in the disk subsystem and was solved simply by installing a filter to damp out the offending frequency.

Right after I published that story, I stumbled upon a video from 14 years ago of someone shouting at a disk array. It didn’t cause a crash but monitoring software clearly showed that disk latency went up when the disk was shouted at.

The engineers doing the demonstration explained this as having to do with vibration so perhaps it’s different from the Janet Jackson menace but it’s surprising how unexpected things can affect disk performance.


Emacs and the Unix Philosophy

Ramin Honary has a six-part series of posts that presses the claim that Emacs does, in fact, adhere to the Unix Philosophy that a program should do one thing and do it well. Almost everyone else’s opinion is that this makes no sense at all. Emacs, after all, is famous (or infamous, depending on your sensibilities) for its Borg-like assimilation of any computer task that wanders into its event horizon.

But Honary makes the case that Emacs is not (merely) an editor but should be thought of as an Elisp interpreter. In that sense, the one thing it does well is to run Elisp functions. He goes further and claims that Unix and the Bourne shell are really a sort of proto-functional programming.

It’s an interesting post although Honary gets a few historical facts wrong. Philip Kaludercic has a post, Notes on “Emacs fulfills the UNIX Philosophy” that helps fill in the blanks.

None of this really matters, of course. If you use Emacs, you don’t care if it adheres to the Unix Philosophy or not. If you don’t use Emacs, you don’t care if it adheres to the Unix Philosophy or not. Still, it’s an interesting idea and worth discussing for its own sake.

One thing Honary says that I absolutely agree with is that Emacs doesn’t have extensions; it has apps. I’ve objected before to referring to Emacs packages as extensions and saner heads told me to relax. Honary provides a logical reason for my more emotional response.


Janet Jackson and Crashing Laptops

Raymond Chen occasionally posts interesting stories from his (long) time at Microsoft. His latest offering tells the story of how Janet Jackson used to have the power to crash laptops. It turned out that playing Jackson’s Rhythm Nation on certain laptops would cause them to crash. A little experimentation showed that playing the music on one laptop could even cause another nearby laptop that wasn’t playing the music to crash.

I’ll let you read Chen’s post to see what was happening and how they fixed it but the interesting thing is that Chen speculates that the fix may still be in place even though the hardware involved is no longer used. It was one of those things where the fix was installed with instructions that it should not be removed and years later no one knew why it was there but everyone was afraid to remove it.

It’s a real problem. Sometimes, like in this case, the fix is no longer doing anything useful but sometimes removing it without a thorough understanding of what it was doing could lead to disaster. Things like this are what make our industry so endlessly engaging.


Brailsford & Kernighan on AWK

Computerphile has another wonderful discussion between David Brailsford and Brian Kernighan. We are quickly reaching the time when all the original Unix people will be gone (Kernighan is 79 or 80) so these chats are our last chance to get an oral history of what it was like in the beginning.

This particular chat is about AWK. I thought that by now everyone knew that the K in AWK stands for Kernighan but judging from the comments, apparently not. AWK dates back to the 1970s and is still maintained—even the original AWK—as well as the GNU version GAWK. It’s my favorite scripting language and tremendously powerful for problems in its domain.

One of the things Kernighan revealed in the video is that he’s recently spent some time making (the original) AWK work with Unicode and that his summer vacation project is to update the AWK Book, which, if you follow the video link, you’ll learn is from 1988. It’s still available but at an outrageous price so a new version would be very welcome, especially to younger engineers who may not have access to the original.

I always enjoy these Brailsford/Kernighan chats and inevitably come away from them knowing something I didn’t know before. In an age where many people in the field don’t know that Kernighan is the K in AWK or even that he’s the K in K&R, these videos become more important than ever.


Keeping Data and Code in the Same File

John D. Cook has another post in his series on coding in Org-mode. The latest emphasizes how you can keep data, code, and documentation in a single (Org) file.

There’s nothing new in that idea for most Irreal readers, of course, but there is one new thing I didn’t know. When you use a table as input data for a code block, the header is not part of the data by default. You can get Org-mode to pass the header too by specifying the unintuitive parameter :colnames no on the source block line. Cook also gives some Python code that shows how to print the table along with the header and also do some calculations on the data.
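As a rough illustration of what the code block receives, here is a hypothetical Python sketch. It is not Cook's code. It assumes that, with :colnames no, the table arrives as a list of rows with the header as the first row. The table contents and the function names `split_header` and `column_total` are invented for the example.

```python
# Assumption: with ":colnames no" the table is passed as a list of rows
# and the header is simply the first row. The data below is made up.
table = [
    ["item", "count"],
    ["apples", 3],
    ["pears", 5],
]

def split_header(rows):
    """Separate the first (header) row from the remaining data rows."""
    header, *data = rows
    return header, data

def column_total(rows, name):
    """Sum the column called name, looking it up by its header."""
    header, data = split_header(rows)
    index = header.index(name)
    return sum(row[index] for row in data)

# Print the table with its header, then a calculation on the data.
header, data = split_header(table)
print(" | ".join(header))
for row in data:
    print(" | ".join(str(cell) for cell in row))
print("total count:", column_total(table, "count"))
```

Keeping the header in the data lets the code look up columns by name instead of by position, at the cost of having to skip the first row in every calculation.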

Cook is a consulting mathematician and I view this series of posts as him documenting his evolving use of Org-mode in his work. The whole series is worth a look.


More dwim-shell-command

Álvaro Ramírez continues his roll with yet another function for his dwim-shell-command framework. This time, it’s a function to combine several .png files into a single .pdf file. As with the other functions, the point is not to enable new functionality but to make complicated invocations of existing programs easy to remember and use.

As Ramírez says, while it can be hard to remember the command to use for some action, dwim-shell-command lets you name the task, which is easier to remember, and capture complicated parameters for the process.

As I’ve written before, the dwim-shell-command package is now available on Melpa and Ramírez has broken out the framework code from the individual shell commands that he’s written. That makes it a bit easier if you’re not interested in his functions but want to write your own.

If you frequently invoke commands from the shell with hard to remember names and complex calling sequences, you should definitely take a look at this package.


Tenacity!

Apropos of nothing, this story really appealed to me. I admire cranky guys like Chaturvedi who just resist being pushed around no matter how small the stakes are. The story doesn’t make clear his motivation but I’d guess it’s less principle than a desire not to suffer what he feels was an injustice.

For Westerners like me who don’t have such facts at their fingertips, 20 rupees is approximately 25 cents (US). That means he worked 22 years at a yearly rate of about a penny just to prove he was right.

The whole story is weird but the weirdest part is why Indian Railways didn’t pay Chaturvedi his quarter and make the whole thing go away years ago.

Update [2022-08-14 Sun 15:25]: principal → principle


Fundamental Laws

For some reason there was a recent pointer to this six-year-old post by Matthew Jones on some of the fundamental laws of software development. Most of them will be familiar to Irreal readers but it’s nice to see them listed along with their explanations in one place.

Jones lists 15 laws or principles. They are:

  1. Occam’s Razor
  2. Hanlon’s Razor
  3. The Pareto Principle
  4. Dunning-Kruger Effect
  5. Linus’s Law
  6. Robustness Principle
  7. Eagleson’s Law
  8. Peter Principle
  9. Dilbert Principle
  10. Hofstadter’s Law
  11. The 90-90 Rule
  12. Parkinson’s Law
  13. Sayre’s Law
  14. Parkinson’s Law of Triviality
  15. Law of Argumentative Comprehension

Some of these, like Hanlon’s Razor and the Dilbert Principle, are tongue-in-cheek while others, like the Pareto Principle and the Dunning-Kruger Effect, are serious, a précis of actual research.

Oddly, the most famous law of all, Murphy’s Law, doesn’t make an appearance. As every developer knows, it is always with us and operative. Regardless, the list is amusing and worth taking a look at if you’re searching for a momentary diversion.


Red Meat Friday: Emacs Sucks

As you can tell from the title, this is the rawest of red meat. The title comes from a post on reddit by BlackberryPerfect938 entitled Why Emacs Sucks. On the one hand, what else is new? Plenty of people try Emacs and decide they don’t like it—nothing wrong with that. There’s also the fact that after a while, the editor wars become boring. Still, it’s worth taking a look at BlackberryPerfect938’s arguments.

His major complaint, as I understand it, is that Emacs “feels old”. The reasons he thinks it feels old are:

  1. There are a lot of external packages that implement modes with overlapping and sometimes conflicting functionality with built-in modes.
  2. Many times these external packages implement capabilities that—BlackberryPerfect938 feels—should be built in. He gives LSP as an example that he finds particularly annoying.
  3. Emacs is “distracting”. He gives the admittedly enjoyable desire to tinker with your configuration as one example, and the existence of games as another.
  4. Legacy keybindings.
  5. The Emacs community consists mostly of “old folk” such as technical people, scientists, and professors.

You probably don’t need Irreal to call BS on those complaints but here at Irreal we live to serve so we will anyway.

  1. This is just an example of how Emacs can be configured or extended by anyone to meet their specific needs. Often, of course, others find those customizations/extensions useful so they’re made available to all through one of the repositories.
  2. This is Emacs evolution in action. Someone will write a useful package that gets used by more and more Emacers. Eventually, when the usefulness is confirmed, the package may get absorbed into Emacs core. Org-mode is an example of this.
  3. No one is forced to use any of the games. I don’t but they’re there if you want them. How is this a problem? The constant tweaking of your Emacs configuration just means that users adjust the editor as their needs change.
  4. No complaint about Emacs would be complete without whining about the editor not following the CUA bindings that came years after Emacs was introduced. And, of course, the complaints always forget to mention that a single line of configuration will, in fact, enable those CUA bindings.
  5. This seems to me to be the most bizarre of the complaints. It boils down to “People who are experienced and knowledgeable tend to use Emacs. Those who are younger and lack that experience do not.” Therefore…. It just doesn’t make sense.

The commenters were not kind to BlackberryPerfect938 as you can see by following the link. As I’ve said many times, there are plenty of reasons not to use Emacs but BlackberryPerfect938’s post doesn’t give any of them.
