John Kitchin, whose work I’ve mentioned many times, states the obvious.
What would the Emacs honor society be called? λλλ of course. #emacs
— John Kitchin (@johnkitchin) December 20, 2014
I’ve written a couple of times about the New York City Taxi Commission’s metadata and how easily it can be abused even though it was anonymized by removing the personally identifiable information. The anonymization notwithstanding, data analysts were able to recover a plethora of personal information attributable to specific individuals. You can read the two posts above to see how easy this was.
Now the New York Times is reporting on the problems of anonymous metadata. They report on a study published in Science showing that anonymous credit card metadata could be deanonymized over 90% of the time if the analyst had 4 pieces of outside information about a person in the anonymized metadata. Even social scientists, who relish this type of data and use it in their research, are concerned about the privacy issues that it represents.
Much of this data is collected as a side effect of ordinary business practices. For example, credit card data has to be collected so that the credit card user can be billed and the merchant reimbursed. There is, however, no reason to release this data in any form, no matter how much social scientists might wish to study it or advertisers wish to leverage it for targeted advertising. The data belongs to the customers and no one else. Nothing compels companies to make this sort of data available and it should, in fact, be illegal to do so.
The situation with data collected by the government is more complex. It is almost always subject to FOIA requests and therefore subject to release no matter what the holders or subjects of the data might prefer. That’s what happened with the NYC Taxi Commission data: someone filed a FOIA request for the data and then promptly deanonymized it. It’s certainly the case that the government must collect some data. Income tax returns, for example, are full of sensitive personal information. That’s why they’re specifically exempted from FOIA and why it’s illegal to release the data to anyone.
Sadly, though, the government collects lots of data that (a) is not protected from FOIA and (b) probably doesn’t need to be collected. Much of the data is collected simply because technology makes it cheap and easy to do so and because it might, someday, be useful. As the Science report makes clear, there are no effective standards in place to guarantee the safety of data. In the absence of a compelling reason to collect it, it shouldn’t be collected in the first place.
Of course, lots of people want to get their hands on that data and almost never for the benefit of the people it was collected from. For that reason it almost certainly will continue to be collected. The people who want it have deep pockets and will ensure the flow isn’t shut off. That’s too bad for the rest of us.
My New Year’s post was about how John Kitchin used Emacs and Org-mode to run a graduate course in Chemical Engineering. If you want to see it in action, he’s posted a short video that shows the system from both the students’ and instructor’s point of view.
I love seeing how people leverage the Lisp Machine aspect of Emacs to solve problems having little to do with editing text. If you like it too, be sure to check out the video. It’s only six and a half minutes so you won’t have to schedule time.
#Emacs of the day: Daily status reports? Notes in org-mode? Grab the block with "M-h" then "C-c C-e h o" export as HTML & open in browser.
— Robin Green (@fatlimey) January 19, 2015
When I get tired of blogging I’m going to write some Elisp that every day will make a post that says, “Abo-abo has a great post today. Go read it.” Really, if you don’t already have (or emacs in your feed, you should add it immediately. I learn something new from it almost every day.
One of his latest posts is about his refactoring workflow. The problem is to change a function name in every file in a directory. There are multiple subdirectories. How would you do this? I can think of several ways, mostly involving dired and perhaps keyboard macros. Abo-abo has another approach. His steps are
1. Run rgrep to get a list of every occurrence of the function name.
2. Start wgrep so that you can edit the rgrep output and have the results reflected in the original files.
3. Use iedit to change every instance of the function name at once.
4. Exit iedit.
5. Finish wgrep, writing the changes back to the files.
See abo-abo’s post for the details.
This is a really outstanding post and I encourage everyone to take a look. I hadn’t been using iedit or wgrep but installed them so that I could take advantage of abo-abo’s technique.
I’ve long been a multiple-cursors user so I was interested in abo-abo’s use cases for iedit versus multiple-cursors and queried him on it. You can read his answer here. One thing that struck me is that he described iedit as “a drop-in `occur` and `query-and-replace`.” You can see how that works with his refactoring process. Most excellent.
I use Magit all the time and really like it but I don’t know how to do much more than stage and commit changes. Sometimes I can even resolve a merge conflict but I always have to stumble through it. As a result, I’m always on the lookout for Magit tutorials that help me get better at using it.
My latest find is a post by Shingo Fukuyama on using Magit to rewrite git commit messages. Fukuyama has lots of screen shots to show you what you’ll see as you follow the steps he lays out.
In the same vein, Howard Abrams has a similar tutorial on using Magit to squash commits together. The process is very much like the one that Fukuyama describes for rewriting commit messages. I really like articles like these; they help me extend my Magit knowledge in a relatively painless way.
One of the nicest techniques from Scheme is the idea of streams. Streams[1] let you create a virtually infinite list. For example, we can compute the square roots of the first 5 Fibonacci numbers with
(mapcar #'sqrt '(0 1 1 2 3))
(0.0 1.0 1.0 1.4142135 1.7320508)
But suppose we want to print the square roots of an arbitrary number of Fibonacci numbers. We’d like something like
(setq list-of-fibonacci-numbers '(0 1 1 2 3 5 8 13 21 34))

(defun sq-roots (n l)
  (when (> n 0)
    (print (sqrt (car l)))
    (sq-roots (1- n) (cdr l))))

(sq-roots 5 list-of-fibonacci-numbers)
0.0 1.0 1.0 1.4142135 1.7320508
where list-of-fibonacci-numbers has at least \(n\) members. Of course we don’t know what \(n\) will be so we really need an infinite list of Fibonacci numbers. That’s what streams do. They simulate an infinite list by calculating the members of the list on-the-fly as needed.
Atabey Kaygun has a nice post that considers how to implement Scheme streams in Common Lisp. Rather than simply duplicate the Scheme implementation, which, as I show below, is relatively easy, Atabey produces two different implementations. One, that he describes as “stateful,” uses a closure to remember the state of the stream and calculate the next value.
The other method is more functional and builds an actual list as the calculation proceeds. He uses this method to build a stream, fibonacci, that returns the Fibonacci numbers. Using that we can solve our problem as
(defun sq-roots (n)
  (labels ((roots (i l)
             (when (> i 0)
               (print (sqrt (car l)))
               (roots (1- i) (cdr l)))))
    (roots n (f-take n fibonacci))))
We can’t express the Fibonacci numbers with Atabey’s stateful implementation but it could be trivially modified to permit it. The trouble with the stateful solution, as Atabey tells us, is that it’s use-once. Even if we hold its head, any use of the stream modifies its internal state so that further uses reflect what has already happened.
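To make the use-once behavior concrete, here is a minimal sketch of a closure-based generator in Common Lisp. It is not Atabey’s code: make-fib-generator is a hypothetical name and the design is only an assumption about what a stateful stream of Fibonacci numbers might look like.

```lisp
;; Hypothetical stateful "stream": a closure that remembers the last two
;; Fibonacci numbers and returns the next one each time it is called.
(defun make-fib-generator ()
  (let ((a 0) (b 1))
    (lambda ()
      (prog1 a                  ; return the current value ...
        (psetq a b              ; ... then advance the hidden state
               b (+ a b))))))

(defparameter *gen* (make-fib-generator))

;; The first pass consumes the first five values.
(loop repeat 5 collect (funcall *gen*))   ; => (0 1 1 2 3)
;; A second pass over the same generator continues from where we left off.
(loop repeat 5 collect (funcall *gen*))   ; => (5 8 13 21 34)
```

Every call mutates the state held in the closure, so there is no way to go back to the beginning short of making a new generator.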
These examples are instructive but notice that we must build the list of \(n\) Fibonacci numbers before we can start the calculations. What if \(n=10^{10}\)? That’s a pretty big list and will almost surely exceed the memory of most computers. What we’d like is a “lazy list” that calculates its elements on-the-fly. That’s what streams do.
Here’s a Common Lisp implementation based on the Scheme from Section 3.5 of SICP. The real implementation builds memoization into delay but we ignore that for simplicity. See SICP’s implementation or Atabey’s memoization post for details. The basic building blocks are given below:
(defmacro delay (expr) `(lambda () ,expr))
(defun force (delayed-object) (funcall delayed-object))
(defmacro cons-stream (x y) `(cons ,x (delay ,y)))
(defun stream-car (s) (car s))
(defun stream-cdr (s) (force (cdr s)))
The delay macro delays the evaluation of expr by wrapping it in a function. The force function evaluates the delayed expression by calling the function that it got wrapped in. The cons-stream macro builds a cons from \(x\) and \(y\) but arranges to delay the evaluation of \(y\). The last two functions are direct analogues of their list counterparts but operate on streams instead. Notice that we can do without stream-car since it merely calls car. See SICP for further explanation.
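For readers curious about the memoization that the simple version leaves out, here is one way it might look in Common Lisp. This is only a sketch, not SICP’s or Atabey’s code: the name delay-memo is hypothetical, chosen so that it does not clash with the delay macro in the post. The delayed expression is evaluated on the first call and the cached value is returned after that.

```lisp
;; Hypothetical memoizing variant of DELAY: the wrapped expression is
;; evaluated at most once; later calls return the cached value.
(defmacro delay-memo (expr)
  (let ((forced (gensym "FORCED"))
        (value  (gensym "VALUE")))
    `(let ((,forced nil) (,value nil))
       (lambda ()
         (unless ,forced
           (setf ,value ,expr
                 ,forced t))
         ,value))))

;; Usage: count how many times the delayed expression really runs.
(defparameter *calls* 0)
(defparameter *promise* (delay-memo (progn (incf *calls*) (* 6 7))))

(funcall *promise*)   ; => 42, evaluates the expression
(funcall *promise*)   ; => 42, served from the cache
*calls*               ; => 1
```

The result is still an ordinary function of no arguments, so a force that simply calls funcall on it works unchanged.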
Now, we can build some infinite lists (streams). Suppose we want an infinite list of integers. Here it is:
(defun integers (&optional (n 1))
  (cons-stream n (integers (1+ n))))
The first time it’s called, integers returns
(1 . (lambda () (integers (1+ n))))
where \(n=1\) and \(n\) is held in the closure that integers creates. When stream-cdr is called on this, the call to integers will be evaluated and return
(2 . (lambda () (integers (1+ n))))
with \(n=2\). Thus, the physical stream is just a single cons but it acts as an infinite list of integers.
Let’s take the square root of the first \(n\) integers. First we define a function to take the square root of the first \(n\) elements of a stream:
(defun sq-roots-of-stream (n s)
  (when (> n 0)
    (print (sqrt (stream-car s)))
    (sq-roots-of-stream (1- n) (stream-cdr s))))
and then pass it a stream of integers:
(sq-roots-of-stream 10 (integers))
1.0 1.4142135 1.7320508 2.0 2.236068 2.4494898 2.6457512 2.828427 3.0 3.1622777
Notice that the list of integers is not calculated in advance.
Here’s the solution to our original problem. First, a stream of Fibonacci numbers:
(defun fibs (&optional (a 0) (b 1))
  (cons-stream a (fibs b (+ a b))))
and then we reuse sq-roots-of-stream to calculate the results:
(sq-roots-of-stream 10 (fibs))
0.0 1.0 1.0 1.4142135 1.7320508 2.236068 2.828427 3.6055512 4.582576 5.8309517
Once you get the hang of it, it’s really easy to work with streams and it avoids having to precalculate large lists of intermediate results.
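As an illustration of how little extra machinery this takes, here is a sketch of two helpers that are not part of the post: stream-map and stream-take are hypothetical names (SICP defines similar operations in Scheme). The building blocks and fibs are repeated from the post so that the block runs on its own.

```lisp
;; Building blocks repeated from the post so this block is self-contained.
(defmacro delay (expr) `(lambda () ,expr))
(defun force (delayed-object) (funcall delayed-object))
(defmacro cons-stream (x y) `(cons ,x (delay ,y)))
(defun stream-car (s) (car s))
(defun stream-cdr (s) (force (cdr s)))

(defun fibs (&optional (a 0) (b 1))
  (cons-stream a (fibs b (+ a b))))

;; Hypothetical helper: lazily apply FN to every element of stream S.
;; Only the first element is computed now; the rest is delayed.
(defun stream-map (fn s)
  (cons-stream (funcall fn (stream-car s))
               (stream-map fn (stream-cdr s))))

;; Hypothetical helper: collect the first N elements of S into a list.
;; The stream is advanced only N-1 times, never further.
(defun stream-take (n s)
  (loop for i from 0 below n
        for rest = s then (stream-cdr rest)
        collect (stream-car rest)))

(stream-take 10 (fibs))                    ; => (0 1 1 2 3 5 8 13 21 34)
(stream-take 5 (stream-map #'1+ (fibs)))   ; => (1 2 2 3 4)
```

With these, the square roots of the first five Fibonacci numbers come back as a list from (stream-take 5 (stream-map #'sqrt (fibs))), and only the elements actually requested are ever computed.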
[1] In Common Lisp, streams refer to input/output channels such as STDIN and STDOUT. I’m using the term in the Scheme sense.
Artur Malabarba over at Endless Parentheses has a handy optimization for commenting out a single line or, perhaps, \(n\) lines at once. The usual procedure is to select the lines you want to comment and type 【Meta+;】 to call comment-dwim. I use that all the time but what if you want to comment (or uncomment) just one or perhaps a few lines?
Malabarba’s code comments/uncomments the current line or \(n\) lines if a numerical prefix is given. He binds it to 【Ctrl+;】 which gives it a nice symmetry with 【Meta+;】.
In the comments, Scott Turner suggests a version that calls comment-dwim if there is a region set and Malabarba’s code otherwise. You could then just rebind 【Meta+;】 to the new version and have the best of both worlds. Either way, if you find yourself commenting and uncommenting lines a lot, you may find this bit of Elisp handy.
The two most important days in your life are the day you are born and the day you find Emacs. — Mark Twain /topic on #emacs
— punchagan (@punchagan) December 11, 2014
If you liked Sacha Chua’s post on micro-habits that I wrote about yesterday, be sure to check out her video on the same subject. In it, she discusses various packages for switching windows and for jumping around in a buffer or even multiple buffers.
She also discusses how she leverages the key-chord package to invoke her window switching and buffer navigation. You don’t have to worry about trying to remember her configuration because it’s available online in an excellent literate programming Org file. Watch the video and get any specifics you need from her configuration file.
The video is just over eight and a half minutes so it should be easy to fit a viewing into your schedule. As always with Chua, your time will be well spent.