A Graduate Student Research Workflow

Koustuv Sinha is a PhD student in machine learning and natural language processing. Because much of his time is devoted to reading research papers in his field, he’s put significant effort into optimizing his workflow.

The TL;DR is that he’s used Emacs and Org-mode to develop an efficient method of discovering and curating interesting papers. It starts with discovery. For this he uses Elfeed to subscribe to various arXiv feeds in his areas of interest. He uses the elfeed-score package to rank these papers in the approximate order of his interest in them.
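
To make the ranking idea concrete, here is a minimal Python sketch of keyword-based scoring. It is not elfeed-score’s actual code or rule format. The rule table, the weights, and the function names are all hypothetical, chosen only to illustrate summing per-keyword weights and sorting by the total.

```python
# Hypothetical rules: keyword -> weight. This is NOT elfeed-score's rule format.
RULES = {"language model": 10, "reasoning": 5, "survey": -3}


def score(title: str, rules: dict[str, int] = RULES) -> int:
    """Sum the weights of every keyword that appears in the title."""
    lowered = title.lower()
    return sum(weight for keyword, weight in rules.items() if keyword in lowered)


def rank(titles: list[str]) -> list[str]:
    """Return titles ordered from highest to lowest score."""
    return sorted(titles, key=score, reverse=True)
```

A title that matches several positive keywords floats to the top, while a negative weight pushes less interesting entries down the list.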

When he reads an abstract in the Elfeed results that seems interesting, he fires off a process that captures the paper’s metadata to his bibliography file; grabs a copy of the paper, renames it to the BibTeX key, and stores it in a central repository; and adds it to an Org file listing the papers he wants to read.

A lot of this is accomplished by leveraging John Kitchin’s org-ref. He calls org-ref functions directly to get and store the paper’s data. It’s a nice example of reusing someone else’s codebase in your own.

Sinha provides a huge number of details in his post so be sure to take a look. If you have similar needs, this is an excellent starting point for your own workflow or even something worth stealing wholesale.

Posted in General | Tagged , | Leave a comment

Remote Work At The Doctor’s Office

If you’ve been reading Irreal for more than the last 5 posts you know that I’m a big supporter of remote work. There are many, many jobs that can be done remotely and there’s no reason for many employees to be on-site. Still, there are some jobs that intrinsically require an on-site presence. It’s hard, but not impossible, to imagine a store clerk working from home. Some jobs, though, just seem to require an in-person presence.

One such job is medical provider. To be sure, COVID-19 has seen the rise of virtual appointments but sometimes doctors really do need to be able to put their hands on you to do their job.

The other day, I went to one of my doctors for a yearly checkup and when the doctor came into the examining room, he had an iPad with an active video session. He told me that the woman at the other end was a medical transcriptionist who was based on the other coast of Florida. During the exam, the transcriptionist reminded the doctor of past findings from tests and recorded the current findings.

If the transcriptionist had been in the exam room, I wouldn’t have thought twice about it but the fact that she was far away made the process seem unusual and strange. But that’s just silly. She could see and hear everything that was going on (we even waved goodbye when the exam was over) so she was able to capture all the relevant data just as well as someone who was in situ.

There isn’t anything really surprising about all this. It’s just that we don’t often think about medicine as a field ripe for disruption by remote work. It is, however, already happening.


Org-mode Versus Jupyter Notebook

John D. Cook, a consulting mathematician who runs the TeX and Typography Twitter feed as well as several similar (mostly mathematically focused) feeds, has two posts that consider using Org-mode instead of Jupyter Notebook. It’s interesting because it comes from someone who is neither a developer nor a dedicated Org user.

The first post considers Org-mode as a lightweight Jupyter Notebook. It stresses how easy it is to mix text, LaTeX markup, code, and the results of running that code in a single file. That’s a real win if you’re trying to do reproducible research or simply trying to simplify your workflow. Since everything is text, it’s easy to integrate it into your version control system.

The second post is reminiscent of Mike Hamrick’s video of using Org and Org Babel to create documents that automatically maintain their consistency as parameters change. Cook’s post covers exactly that: how to specify parameters separately from their use in order to maintain consistency as things change.

If you’re a hardcore Emacs/Org user or even a longtime Irreal reader, none of this will be new to you but it’s a really excellent introduction to one tiny aspect of Org’s power. It’s definitely worth your while if you’re new to this aspect of Org-mode.


Red Meat Friday: PHP Is The Right Choice

One thing you have to say for Daniel Abernathy is that he’s not afraid of the heavy lift. He’s got a post that presses the claim that PHP Is the Right Choice in 2022 and Beyond. It’s hard to find more people than you can count on one hand who will admit to liking PHP but, of course, its popularity gainsays that popular wisdom. Still, it’s fair to say that PHP is the Rodney Dangerfield of programming languages.

Even Abernathy admits the post’s title may be a little overstated but he does make the case that there’s a lot to like about the language and ecosystem and that it’s not like it used to be.

Manuel Odendahl seems to agree but some Reddit commenters are less obliging. One comments that “As much as I despise java, at least a group of allegedly competent engineers took the time to actually design the language and its type system, as opposed to hacking together a bunch of stupid shit workarounds on top of an already hacked together brain-damaged non-designed crap, which is the case of php.”

I’m completely agnostic on the matter because I don’t know the language at all. I’ve written exactly one line of PHP and that was because of exigent circumstances. I only got away with it because it’s sufficiently C-like that I could fall back on my mental muscle memory.

Regardless, hating on PHP is well entrenched and nothing Abernathy or Odendahl can say will do much to change that. That’s why Abernathy’s fearless, if ultimately futile, defense of the language has earned him a coveted spot on Irreal’s Red Meat Friday.


Mickey on Evaluating Elisp

Mickey from Mastering Emacs has an excellent post on the various ways of evaluating Elisp in Emacs. As Mickey says, there are several ways of doing it depending on the context and it pays to be familiar with them all.

The most familiar way is probably eval-last-sexp (Ctrl+x Ctrl+e). It’s really useful because it will evaluate almost anything: s-expressions (of course) but also numbers, strings, and most special forms. The situation for special forms has improved a bit in Emacs 28 so be sure to take a look at Mickey’s post to get the details.

There’s also eval-buffer and eval-region, which do as their names suggest. These commands generally don’t reset variables that have already been defined by forms such as defvar, defface, and defcustom. That’s generally what you want so it’s a feature instead of a bug. Again, see the post for the details.

The method that I always tend to forget about is eval-defun, bound to Ctrl+Meta+x. It’s especially handy for evaluating functions because, unlike eval-last-sexp, you can call it from anywhere within the function instead of needing to be at the end. If you call it with the universal argument, it will turn on debugging for the function. It’s worth reading Mickey’s article just for the section on this command.

Finally, there’s Eshell and IELM. Most Eshell users know you can evaluate many Elisp expressions there but when you want a real Elisp REPL, IELM is what you want. It’s perfect for experimenting with code that’s longer than a single expression. I use it fairly often and love it.

Like all of Mickey’s posts, this one is definitely worth your time and effort.


Thoughts On Thoughts On RSS

Matt Rickard has a—at least to me—provocative post on RSS. As I’ve said many times, I’m a big believer in and user of RSS. Google did its best to kill it off but it turned out to be too useful to discard. Along with the excellent Elfeed it’s my main way of discovering and curating interesting blog posts.

That’s why I disagree with several of Rickard’s points. Rickard appears to take the point of view of a content creator interested in monetizing content. That’s a valid viewpoint, of course, but I look at it from a user’s point of view and very much like the way it works.

Rickard notes that the typical RSS entry is much like an email. It doesn’t render HTML very well and certainly doesn’t support JavaScript. Rickard says that’s okay for email but not for general blog content. Perhaps, but I use RSS to point me to interesting blog posts (which I read with my browser), not as the primary way to consume a post. Indeed, many of the RSS entries don’t have the whole post and some have only the title. I like the primarily text-based entry that renders quickly and helps me decide if a post is interesting enough to read.

At the other end of the spectrum, Rickard says “discovering a feed and seeing raw XML was too technical for the average user.” Well yeah but who reads the raw XML? I’d guess virtually no one. There’s nothing hard about subscribing to a feed either. Usually it’s just clicking on a link. It’s true you have to already know about a site to subscribe but that’s true no matter how you consume it.

Rickard doesn’t seem to be against RSS. He just notes that it’s not ideal for commercial content creators and doesn’t look as nice as a blog post rendered in a browser.

Although those who want to hoover up your Web activity or sell you things have done their best to put a full stop to RSS, users love it and keep it going. As the name suggests it’s a simple protocol and doesn’t require much maintenance. I, for one, hope it’s with us for a long time.


Dired-rsync

Yi Tang has an interesting post on the dired-rsync package. It’s been around for a while, apparently, but I hadn’t heard of it before Tang’s post. The TL;DR is it allows you to use rsync in dired in the same contexts that you would otherwise use Copy.

Tang lists all sorts of reasons why he believes rsync is superior to cp and scp but, oddly, doesn’t mention the major one: rsync sends only the parts of the file that differ from the target. It is, in short, a tool optimized for copying an updated file.
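
The idea of sending only what differs can be sketched in a few lines of Python. This is a deliberately simplified illustration, not rsync’s actual algorithm: it compares fixed-size blocks at the same offsets, and the function names and the tiny block size are assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4  # tiny block size so the example is easy to follow


def block_digests(data: bytes, size: int = BLOCK_SIZE) -> list[str]:
    """Split data into fixed-size blocks and hash each one."""
    return [
        hashlib.sha256(data[i:i + size]).hexdigest()
        for i in range(0, len(data), size)
    ]


def changed_blocks(source: bytes, target: bytes, size: int = BLOCK_SIZE) -> list[int]:
    """Return indexes of source blocks the target does not already have."""
    src = block_digests(source, size)
    dst = block_digests(target, size)
    return [
        i for i, digest in enumerate(src)
        if i >= len(dst) or dst[i] != digest
    ]
```

Only the blocks whose indexes are returned would need to cross the network. The real rsync goes further and uses a rolling checksum so that it can still match blocks after an insertion has shifted the rest of the file.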

Much of the post is devoted to explaining how Tang has integrated the package into his workflow. It’s perfect for downloading large data files from a server to his local machine where he can manipulate and analyze them. He explains how he set everything up in case you have a similar use case and want to recreate his workflow.

It’s a nice post that also explains some of the gotchas if you want to use dired-rsync yourself. It’s on MELPA and setting it up is simple. You can simply copy Tang’s use-package configuration for an excellent starting (or permanent) setup.


They Never Give Up

The Simple Analytics Blog has a disturbing post about Vodafone and their reintroduction of persistent tracking. Vodafone & Deutsche Telekom are network providers whose job it is to send our data across the Internet and nothing more. They’re supposed to be a simple pipe that passes the data along without interference.

But there’s money to be made so of course they’re abandoning that role. They want to add a unique ID to each transaction so that Websites can query, and pay, them to see what other sites a user has accessed. Vodafone, of course, is claiming that this is actually a privacy friendly policy but only the most naive will be deceived.

As the article points out, Apple is trying to circumvent this sort of move with their iCloud Private Relay service that encrypts your Web transactions so that Internet providers can’t spy on them. But you don’t need to rely on Apple. Just run a VPN and all your provider can see is that you are connecting to your VPN provider.

On a recent beach vacation with my family, I routed everything through my VPN provider, ExpressVPN, and was delighted at how transparent it was. Once I turned it on, it automatically reconnected each time I woke up one of my devices. There’s really no reason that you couldn’t just leave it running all the time. Indeed, I forgot to turn it off and after I got home I didn’t realize it was still running until a day or two later. As far as I can tell, there was no delay so there’s really no reason not to keep it running all the time. An added bonus is Vodafone and their ilk will hate that.

Update [2022-08-06 Sat 12:55]: are → our


Changing How Emacs Works

Karthik has a nifty video on how to change the way Emacs works. We’re all fond of saying that Emacs is infinitely extensible and customizable but then we usually go out for a beer without saying how. Karthik remedies that by showing us how to change Emacs’s behavior even when we don’t know what we’re doing.

Karthik uses Notmuch to read his email from within Emacs but he’s got a problem. He’s a developer so a lot of his email includes a patch or a diff as a MIME attachment. That’s fine but most of the time he doesn’t want to see large patches or diffs. What he’d like is for those two MIME types to be folded by default so that they don’t clutter up his emails but so that he can unfold them when he does want to see their content.

He begins by saying he has no idea how Notmuch works but he doesn’t let that stop him from resolving the issue. What follows is his step-by-step discovery of how to solve his problem. He begins with the usual checking of the documentation and the customize subsystem but, sadly, that was of no avail. Instead, he had to turn to the source code.

Notmuch has a lot of code, none of which he’s familiar with so it seems like an impossible task but Karthik shows definitively that that’s not the case. He doesn’t use a debugger or any fancy tools; he just burrows around in the code until he zeroes in on the solution. His method resonates with me because it’s pretty much what I do. Most of the time I don’t really know exactly where I’m heading but just follow the clues until I arrive at the solution. That’s exactly what Karthik does.

The video also demonstrates some things that I already knew but didn’t appreciate enough until I saw them in action. The first of those is using xref-find-definitions to navigate through the code. The second is setting what Karthik calls “pins” to remember locations so that you can return to them at will. He does that with point-to-register to set the pin and jump-to-register to return to it.

This is a really good video and I recommend it to all Emacs users. It’s 29 minutes, 44 seconds so plan accordingly but do try to find the time.

Update [2022-08-01 Mon 13:10]: Work → Works


Improvements to dwim-shell-command

Álvaro Ramírez has been busy making improvements to his excellent dwim-shell-command package. I’ve written a lot about this package recently but that’s okay because it’s something most Irreal readers would want to know about. The TL;DR is that the package provides a DWIM interface between Emacs and the shell making it easy to invoke various utilities from Emacs that would normally be started from the shell.

The new version allows dwim-shell-command to operate on a set of files in a region rather than requiring them to be marked in dired. There’s also a marker to insert the contents of the clipboard into a command. That’s perfect for inserting a URL that you’ve clipped from, say, the browser and using it in a shell command.

Finally, Ramírez has added numeric and alphabetic counters that allow for names that are the same except for the counter value. That works just as you’d think it would. The package is, after all, meant to provide “do what I mean” actions.

As far as I can see, this package started out as a quick hack that allowed Ramírez to create an easy way of invoking frequent but complex shell commands. Once he’d laid down the framework, new applications kept suggesting themselves to him and the project has grown.

As I wrote the other day, the package is now available on MELPA so it’s easy to try it out if you’re interested.
