New Federal Open Access Policy

Finally some sense from the US government. According to this White House news release, research funded by the government must be available to the public without cost or delay. In particular, the current policy of a one year embargo on published research will be discontinued. Of course, being the government, the new policy will not be fully in force until 2026.

As I’ve written many times, it’s immoral to ask the American public to pay for research and then lock up the results of that research behind a paywall. Worse, subscriptions to a single journal can run into the hundreds or even thousands of dollars per year.

The “normal” citizen will have no interest in access to these journals, of course, but there are plenty of people with the ability and interest to read and use the research who don’t have access to a university library or other source of the journals. And, of course, researchers from third world countries who are fully capable of using and extending the research have no access at all.

The completely normalized, but illegal, solution is sites like Sci-Hub that curate the papers and make them available for free. This infuriates the publishers whose copyrights are being violated but most of the scientific community—fed up with the publishers’ greed and refusal to remedy the situation—seems fine with Sci-Hub and other pirate sites.

Added before publication

Here are a couple more articles on the change: one from Science and the other from Ars Technica. These articles look briefly at some of the consequences of the new rule.

Posted in General | Tagged | Leave a comment

Refiling Org Headline Nodes

Mario Jason Braganza has a useful post that considers moving a headline node from one org file to another. His use case is moving items from his TODO file to his current task file as he acts on TODO items.

Org mode, of course, has an easy way of doing that. You can move nodes to another location from within the current file or to another file altogether using the org-refile command. It’s bound to Ctrl+c Ctrl+w so it’s easy to invoke.

The problem is that it can be a bit fiddly to set up. The reason for that is that you have to specify potential targets for the refiling. There are two aspects to that:

  1. Possible files to contain the refiled node
  2. Headings within the target file to contain the refiled node

Braganza explains how to set all this up. Oddly, he arrives at the exact configuration I have except that I consider more subheadings in the target file than he does. I’ve had it set up for so long I no longer remember configuring it. Lately, I’ve been using it more and more instead of just cutting and pasting nodes.
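For reference, a typical configuration looks something like this. It’s a sketch, not Braganza’s or my exact setup, and the file names are placeholders:

```elisp
;; Offer headlines from the current buffer and from other Org files
;; as refile targets, up to three levels deep. Substitute your own
;; file names for the placeholders below.
(setq org-refile-targets
      '((nil :maxlevel . 3)                                  ; current buffer
        (("~/org/todo.org" "~/org/tasks.org") :maxlevel . 3)))

;; Present targets as full outline paths like "tasks.org/Project/Subtask"
;; so completion can narrow on any part of the path.
(setq org-refile-use-outline-path 'file)
(setq org-outline-path-complete-in-steps nil)
```

With that in place, Ctrl+c Ctrl+w prompts for a target heading in any of the listed files and moves the node there.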

If you sometimes move nodes between your Org files and want something more sensible than cutting and pasting, take a look at Braganza’s post to see how to set up org-refile.

Red Meat Friday: Knuth versus McIlroy

One of our cherished stories, or at least one of my cherished stories, is the account of Don Knuth and Doug McIlroy solving the same problem. The problem was

Given a text file and integer k, print the k most common words in the file (and the number of their occurrences) in decreasing frequency.

The TL;DR is that Knuth wrote a 10-page WEB program to solve the problem while McIlroy solved the same problem with a 6-line Unix shell script.
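McIlroy’s pipeline is widely reproduced. From memory (the exact flags may differ slightly from the original) it was essentially this:

```shell
#!/bin/sh
# Print the $1 most common words on stdin, most frequent first.
tr -cs A-Za-z '\n' |   # squeeze non-letters into newlines: one word per line
tr A-Z a-z |           # fold everything to lower case
sort |                 # bring identical words together
uniq -c |              # count each run of identical words
sort -rn |             # order by count, descending
sed "${1}q"            # quit after the first $1 lines
```

Each stage does one small job and hands its output to the next, which is of course the point of the whole story.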

You can imagine the lessons that were drawn but they all omit a crucial bit of context: Knuth was asked to demonstrate literate programming by solving the problem so McIlroy’s solution wasn’t really on-point.

These days whenever the story comes up the usual reaction is to cry foul because of that missing context—here’s a representative example—but I draw a different conclusion and always have. The two solutions represent two ways of approaching a problem: (1) write a program de novo to arrive at a solution or (2) leverage existing software to solve it.

Many of us—me included—tend to reach for Knuth’s solution first and even think of the quick and easy shell solution as cheating. That’s silly, of course. The point is to solve the problem not to write code. Yes, in the particular case of the story, Knuth’s answer was the best one but usually, absent other special circumstances, McIlroy’s is clearly superior. It’s worth remembering that we have tools other than the hammer of writing code and that not every problem is a nail requiring that hammer.

An Afterword to Yesterday’s Post on Google’s False Accusation

After I wrote yesterday’s post on Google falsely accusing an innocent man of child molestation, I saw this post on the story by John Gruber. My first thought was that I would fold whatever Gruber had to say into my post.

I noticed when reading the post, however, that there were a couple of problems with it, so I decided on a separate Irreal post. The first problem was annoying. Despite the fact that the user lost a decade’s worth of email, photos, his cellular plan, his email address, and other valuable assets, Gruber fails to draw the obvious conclusion: Don’t commit anything valuable to Google’s care. In fact, don’t use Google’s services at all if you don’t want to lose your data and run the risk of Google informing on you to the police. Your best bet is to tell Google to “lose your number”.

The second problem was infuriating. Gruber, while acknowledging that it was a terrible story and an injustice, excuses Google on the grounds of “good intentions”. You see that a lot whenever child pornography—or other viscerally disturbing subjects, or even more mundane reasons for spying on users—is discussed. “Yes, it’s terrible that innocent people got caught up in Google’s (or whomever’s) surveillance net but it’s okay because Google had good intentions.” The idea is that it’s okay to spy on users in the service of combating the scourge of child pornography.

It’s not. It’s saying, “We don’t have any reason to think you’ve done anything wrong but we’re going to spy on you just to make sure.” In an earlier time we used to call those people nosey parkers and shunned them. Or perhaps bloodied their noses or had them arrested. That’s extreme but there was a lot less gratuitous snooping into other people’s business.

Google Falsely Accuses a Father of Molesting His Son

What!? You’re still using Google services? Haven’t you been listening to anything I’ve told you? I’m sorry but I’m over being polite. If you’re still using Gmail, Google Docs, or Google Photos, then you’re being naively foolish and deserve whatever bad thing happens to you as a result.

Google, apparently not content with simply arbitrarily closing users’ accounts and seizing their data when one of their automated scanning tools finds something they disapprove of, have started reporting them to the police. Read this post from Cory Doctorow on how Google closed a user’s account and reported him to the police for child abuse and, even after he was cleared of wrongdoing, refused to restore his data.

The TL;DR is that the user’s son had swollen genitals and when the parents consulted his doctor, the doctor asked them to take a picture of the boy’s genitals and send it to him because he wasn’t seeing patients in person due to COVID. Since the photo was automatically synced to Google Photos, it was scanned and Google, clutching their pearls, was sure a crime had been committed and notified the police, giving them access to all of the user’s files.

The police soon realized that there was no crime or dubious behavior but Google is still refusing to admit they were wrong and won’t restore the data. Other than the loss of data—and other problems described in the post—you could, I guess, say the story had a happy ending but it’s not at all hard to imagine that it might not have.

If you deal with Google you’re putting yourself at risk of not just losing your data but of serious legal difficulties. Stop being stupid. There are plenty of good alternatives.

Router Security

Apple has a useful page on how to set up your routers securely. They don’t currently have a router product so this isn’t about how to configure an Apple product. The advice applies to any router. The page is advertised as a way to configure the “settings for Wi-Fi routers, base stations, or access points used with Apple products” but the advice is good regardless of what devices you’re using.

The TL;DR is

  • Use WPA3 Personal for the router’s security setting
  • Set the SSID to a unique name
  • Disable the hidden network setting
  • Disable MAC filtering
  • Enable automatic firmware updates
  • Configure Radio Mode to All or Wi-Fi 2 through Wi-Fi 6
  • Enable all bands supported by the router
  • Set Channel selection to Auto
  • Set channel width to 20 MHz for the 2.4 GHz band and Auto for the 5 GHz band
  • Enable DHCP unless some other device on the network is providing this service
  • Set the DHCP lease time to 8 hours for home & office networks
  • Enable NAT unless some other device is providing the NAT service
  • Enable WMM

The above is just a précis of the advice on the page. Read the article for the details on the advice and what the various settings mean. Again, even though this is an Apple page, the advice is applicable even if there are no Apple products on your network. There is a bit of advice and corresponding settings for a Mac, iPhone, or iPad but that’s in a separate section and can be ignored if you don’t have one of those devices.

Update [2022-08-23 Tue 15:37]: Added link to Apple article.

More On the Brailsford-Kernighan Video

The video chat between David Brailsford and Brian Kernighan has sparked a lot of interest and commentary among the Unix faithful. Doug McIlroy contributed this story concerning egrep to the conversation. The egrep connection is that it was egrep’s regex technology that powered AWK.

McIlroy explains that for years he thought he was responsible for getting Ken Thompson to cut the regex code from ed and turn it into grep. He learned much later—through the TUHS mailing list—that, in fact, Thompson had already done that to make a tool for his own use and all that really happened was that McIlroy’s urging got him to make it publicly available.

McIlroy goes on to say that he used egrep as an integral part of his calendar program but that it was painfully slow. Al Aho, the developer of egrep, was mortified and introduced lazy state building to egrep to make it faster. That worked but now McIlroy wonders whether he really inspired Aho to make the change or whether it was something he had already planned.

A bit later Mohamed Akram sent a message saying that he had written a blog post about calendar that explains what it does and how it does it. The TL;DR is that it’s a simple script that calls a C program to build a regex and then uses that regex in an egrep call to extract the desired information from the calendar file. Read Akram’s post for the details and the code.

McIlroy is a master at this sort of thing and calendar is a nice example of the way he leveraged existing tools tied together with a shell script to get things done.
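The shape of the technique is easy to sketch. Something like the following toy script (an illustration of the idea, not McIlroy’s actual code) builds a pattern for today’s date and hands it to egrep to pull matching reminders from a calendar file:

```shell
#!/bin/sh
# Toy version of the calendar idea: compute a pattern for today's
# date and use egrep to extract matching lines from a calendar file.
# Assumed line format in the file: "Aug 24<TAB>reminder text".
today=$(date '+%b %e' | tr -s ' ')   # e.g. "Aug 24"; tr squeezes %e's padding
egrep "^$today" "${1:-$HOME/calendar}"
```

The real program is more elaborate (it handles tomorrow, weekends, and several date formats, which is where the generated regex earns its keep) but the division of labor is the same: a small program computes the pattern and egrep does the searching.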

Unicode in AWK

A few days ago I wrote about the excellent video of David Brailsford and Brian Kernighan discussing AWK and its history. In the video, Kernighan mentions that he’s been working on enabling Unicode in the One True AWK. Here’s a pull request from Kernighan showing that he’s mostly accomplished that goal.

At one level, it’s easy to believe that it’s basically a trivial change but, as AWK demonstrates, it’s not always so easy. Probably the hardest thing is fixing AWK’s regex parser to accept and deal with Unicode. But even simple things like calculating the length of a string can be a problem.
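The string-length problem is easy to demonstrate. In a UTF-8 locale, a Unicode-aware awk counts characters while an older, byte-oriented one counts bytes, so the same one-liner can give different answers depending on which awk you have:

```shell
# "héllo" is 5 characters but 6 bytes in UTF-8 (é encodes as two bytes).
# A Unicode-aware awk reports 5; a byte-oriented one reports 6.
printf 'héllo' | awk '{ print length($0) }'
```

Substring operations, field splitting, and regex character classes all have the same character-versus-byte ambiguity, which is why the change is less trivial than it looks.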

When AWK was first developed—and long afterwards—ASCII was sufficient. These days, it’s a real imposition to deal with an app that doesn’t support Unicode. Kernighan’s porting AWK to support it will ensure that AWK will continue to be a useful tool not only for English-speaking people but for those who speak languages with non-ASCII characters as well.

If you’re a young engineer, the idea of open source and having access to the source code to your tools seems unexceptional. That’s just the way it is. But AWK comes from a time when that wasn’t true. It’s great to see the original AWK still available and still under development. AWK and its developers are a national treasure that we should all be thankful for.

Shouting at Disks

Recently, I wrote about a Janet Jackson song that could cause laptops to crash. That turned out to involve frequencies from the song that resonated with a critical frequency in the disk subsystem and was solved simply by installing a filter to damp out the offending frequency.

Right after I published that story, I stumbled upon a video from 14 years ago of someone shouting at a disk array. It didn’t cause a crash but monitoring software clearly showed that disk latency went up when the disk was shouted at.

The engineers doing the demonstration explained this as having to do with vibration so perhaps it’s different from the Janet Jackson menace but it’s surprising how unexpected things can affect disk performance.

Emacs and the Unix Philosophy

Ramin Honary has a six part series of posts that presses the claim that Emacs does, in fact, adhere to the Unix Philosophy that a program should do one thing and do it well. Almost everyone else’s opinion is that that makes no sense at all. Emacs, after all, is famous—or infamous, depending on your sensibilities—for its Borg-like assimilation of any computer task that wanders into its event horizon.

But Honary makes the case that Emacs is not (merely) an editor but should be thought of as an Elisp interpreter. In that sense, the one thing it does well is to run Elisp functions. He goes further and claims that Unix and the Bourne shell are really a sort of proto-functional programming.
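The interpreter framing is easy to illustrate: run in batch mode, Emacs behaves like any other script interpreter. Assuming emacs is on your PATH:

```shell
# Use Emacs purely as an Elisp interpreter: no editor, no UI,
# just evaluate an expression and print the result to stdout.
emacs --batch --eval '(princ (format "%d\n" (* 6 7)))'   # prints 42
```

Seen that way, the editor is just the app that happens to ship with the interpreter.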

It’s an interesting series, although Honary gets a few historical facts wrong. Philip Kaludercic has a post, Notes on “Emacs fulfills the UNIX Philosophy”, that helps fill in the blanks.

None of this really matters, of course. If you use Emacs, you don’t care if it adheres to the Unix Philosophy or not. If you don’t use Emacs, you don’t care if it adheres to the Unix Philosophy or not. Still, it’s an interesting idea and worth discussing for its own sake.

One thing Honary says that I absolutely agree with is that Emacs doesn’t have extensions; it has apps. I’ve objected before to referring to Emacs packages as extensions and saner heads told me to relax. Honary provides a logical reason for my more emotional response.
