A Parable About Git

A couple of days ago I wrote about Tom Preston-Werner’s talk on Git and how he approached the subject matter in a particularly effective way. Today, I was trawling through his Web site and found a longish post entitled The Git Parable that serves as a nice companion piece to the talk.

In the post he helps us understand the concepts behind Git by asking us to imagine that we are developers with no VCS of any kind on our systems. Because we are conscientious developers, we realize that we need to implement some method of keeping track of the versions of a large program that we are developing. The rest of the post is a parable about how we build a Git-like system from the ground up.

We start by realizing that what we want is to be able to take snapshots of our code as we implement each feature and save these snapshots so that we can recreate the software as it was at the time of any snapshot. Our first solution is to copy the working directory into another directory and name it snapshot-0. We do this for each snapshot but change the name so that we have a series of directories named snapshot-0, snapshot-1, snapshot-2 and so on. With each snapshot we add an extra file that contains a summary of the changes in the snapshot. We call it the description file.

Everything is fine until the first release. As we start on the next release we suddenly start getting bug reports so we fix those starting from the snapshot of the first release. But now we lose the implicit ordering of the snapshots because the bug fix is the child of the release snapshot, not the latest snapshot that we are currently working on. This introduces the idea of branches and causes us to explicitly record the parent of each snapshot in the description file.

The parable goes on to describe how new situations cause us to add new features to our system. For example, we take on a co-developer and now we suddenly have two different snapshots with the same name. We solve that by replacing snapshot-n with the SHA1 of the snapshot’s contents. By the end of the parable, we have built a system that looks pretty much like Git.
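To make the final design concrete, here is a minimal Emacs Lisp sketch of the snapshot scheme described above: copy the working directory, record the parent and a message in a description file, and name the result with a SHA1. All of the names (parable-file-sha1, parable-snapshot, the layout of the description file) are hypothetical, and hashing the description together with a listing of per-file hashes is an assumption made for illustration. It is not taken from the post or from Git's real object format.

```elisp
;; Hypothetical sketch of the parable's snapshot store; none of these
;; names come from the post or from Git itself.

(defun parable-file-sha1 (file)
  "Return the SHA1 of FILE's raw bytes as a hex string."
  (with-temp-buffer
    (set-buffer-multibyte nil)
    (insert-file-contents-literally file)
    (secure-hash 'sha1 (current-buffer))))

(defun parable-snapshot (work-dir store-dir message &optional parent)
  "Copy WORK-DIR into STORE-DIR under a name derived from its contents.
MESSAGE and PARENT (the name of the parent snapshot, or nil) are
written to a `description' file inside the new snapshot.  Return the
snapshot's name."
  (let* ((files (sort (directory-files-recursively work-dir "") #'string<))
         ;; One line per file: its hash and its path relative to WORK-DIR.
         (listing (mapconcat
                   (lambda (file)
                     (format "%s %s" (parable-file-sha1 file)
                             (file-relative-name file work-dir)))
                   files "\n"))
         (description (format "parent: %s\nmessage: %s\n"
                              (or parent "none") message))
         ;; Assumption: the name covers the description and the listing.
         (name (secure-hash 'sha1 (concat description listing)))
         (dest (expand-file-name name store-dir)))
    (make-directory store-dir t)
    (copy-directory work-dir dest nil t t)
    (with-temp-file (expand-file-name "description" dest)
      (insert description))
    name))
```

Calling (parable-snapshot "~/src/app" "~/snapshots" "First release") returns the new snapshot's name, which can be passed as PARENT for the next snapshot or for a bug-fix branch. The sketch assumes that the store lives outside the working directory and that the working directory has no file of its own called description.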

This is an excellent way of introducing Git because

  • It shows the simplicity of the ideas behind Git.
  • We can see how each feature is there to solve a real-world problem.
  • Understanding the concepts behind Git’s internals makes it much easier to learn the system and feel confident in using it.

Unless you are a Git expert, I urge you to read this post. If you have colleagues who are new to Git, this parable will be a real help to them in understanding and learning the system.

Posted in General | Tagged | Leave a comment

The Security Hall of Shame

Two more inductees into the Security Hall of Shame. Honestly, I could devote a whole blog to this sort of thing. Perhaps we should start a Security Hall of Shame blog similar to Steve Friedl’s No Dashes or Spaces Hall of Shame.

The most annoying part of this sorry spectacle is that when these sites are inevitably compromised they will whine about the evil, but brilliant, “hackers” who somehow overcame their defenses when the truth is that they got owned by a bunch of script kiddies. Security is devilishly hard to get right but that’s no excuse for just being stupid about it.


An Interesting Talk On Git

Over at Ontwik, Tom Preston-Werner has a nice (video) talk on Git. I like it because instead of focusing on the mechanics of using Git, Preston-Werner shows you what Git is doing internally at each stage of the process. I wish that I’d seen this when I was learning Git; it would have made the process of coming up to speed easier.

There are a couple of problems with the presentation, neither of which is Preston-Werner’s fault. It was often hard to see what he was typing because the camera was set a little low on the screen and the top line was half cut off. That’s not too big an issue because his narrative was clear and it was fairly easy to fill in the chopped off text.

The bigger problem was that there were a lot of questions from the audience that were not really that important and caused Preston-Werner to run out of time. That’s a shame because the talk was very interesting and informative. He did promise to clean things up and post the whole talk a little later so we can hope to see it all shortly.

Problems aside, this is a great talk and worth watching even if you are pretty familiar with Git already.

Update: your → you


Using Org Mode’s Date Routines In Any Buffer

I really like Org Mode’s method for inserting time/date stamps into an Org file. Org Mode has two types of dates, active and inactive, that differ in how Org Mode treats them but we needn’t worry about that here. In either case when you ask to insert a date, Org Mode presents you with a three month window of calendars centered at the current month and a default of the current day’s date. You can just press return to accept that, add a time or change the date to something else. You can enter an absolute date or you can say +5 to indicate 5 days from today or +1w for a week from today and so on. You can also click on the calendars to pick a date and, of course, you can scroll the calendar window. There are lots of options for entering the dates and times as described here in the Org Mode Manual.
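The input forms above can also be tried from Lisp. As a rough sketch, org-read-date accepts the text to parse as its third argument (FROM-STRING) and then returns the date as a string instead of prompting. The helper name my/org-date-examples is made up for this illustration.

```elisp
(require 'org)

;; Hypothetical helper: feed a few of the input forms described above
;; to `org-read-date' without going through the calendar prompt.
(defun my/org-date-examples ()
  "Return an alist of (INPUT . DATE) for some sample date inputs."
  (mapcar (lambda (input)
            ;; Arguments are WITH-TIME, TO-TIME and FROM-STRING.
            (cons input (org-read-date nil nil input)))
          '("+5"            ; five days from today
            "+1w"           ; one week from today
            "2011-06-25"))) ; an absolute date
```

Each result is a string in YYYY-MM-DD form. The relative entries depend on the day the code is run, so only the absolute date always comes back unchanged.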

Unless I’m writing code, I’m usually in Org Mode so the date routines are available most of the time. Sometimes I want to enter a date in a comment in a program or I am in some other sort of buffer where the Org Mode key bindings aren’t available. Therefore, I wrote a quick piece of Emacs Lisp to call the Org function directly and bound it to 【F5】:

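;; Load Org on first use in case it has not been loaded yet.
(autoload 'org-time-stamp "org" nil t)

(defun jcs-insert-date (and-time)
  "Prompt for a date with the Org date routines and insert it at point.
With a prefix argument AND-TIME, include the time as well."
  (interactive "P")
  ;; The second argument asks for an inactive time stamp.
  (org-time-stamp and-time t))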

(global-set-key (kbd "<f5>") 'jcs-insert-date)

Now when I want to insert a date stamp I just type 【F5】 or 【Ctrl+u F5】 if I want to add the current time.


Lessons From Dropbox

I’ve written before about Dropbox and their supposed scandal regarding the perfectly obvious fact that they could, in fact, read users’ files stored on the site. Despite the lamentations of the aggrieved and even the filing of a complaint with the FTC, I continue to think that those complaining are just being silly or clueless.

Now, sadly, there is a real problem at Dropbox. Earlier this week, Dropbox pushed an update that inadvertently allowed access to any Dropbox account for which the user’s email address was known. This was discovered very rapidly and was fixed within four hours. Nonetheless, the Dropbox logs showed that there was account activity on a small number of the accounts. Yesterday, Dropbox announced that although fewer than 100 accounts were affected, someone had logged into some accounts and was able to view file and folder names, but that no files or account settings were modified and the files did not appear to have been downloaded or viewed.

Obviously Dropbox has egg on their face and, unlike the previous brouhaha, this was a serious failure on their part. There are a couple of obvious lessons that we can take away from this. First, it really is unacceptable for Dropbox to have pushed an insufficiently tested patch to the operational system. To their credit, Dropbox admits this and is not making any excuses.

The second takeaway is, to my mind, more important. Things like this happen even to the most careful people and users should be asking themselves, “What would it mean to me if it did?” In this case, users should have asked themselves, “What confidential data would I lose if the account were compromised? How devastating would the loss be?” If you’re using Dropbox to sync your college term papers between a laptop and a desktop, then you might be annoyed but you wouldn’t really care. If you’re syncing confidential company plans or sales figures between a laptop and desktop then you might care a lot. Perhaps your company stands to lose substantial amounts of money. Perhaps you’ll get fired.

The point is, each user should make a rudimentary calculation about what a compromise would mean to them, and if the answer is other than “Meh” they had better take steps to protect themselves. No one, and I mean no one, should feel sorry for users who whine that they thought Dropbox (or whoever) was going to protect them. If it’s important and you’re going to store it in the cloud, you had better encrypt it yourself.

Fortunately, in the case of Dropbox this is particularly easy to do so there’s no excuse for anyone to have suffered any real harm. But, of course, many people have not protected themselves and they will be mad at Dropbox. Many of them will likely sue; they had better hope that I’m not on the jury.

Update: That didn’t take long.


Hackers Redux

Yesterday I wrote about Steven Levy’s Hackers. Today, serendipitously, I happened across a Wired article from last year that I’d filed away intending to read and then forgotten. In it Levy revisits Hackers and catches us up on what his heroes are doing today.

Some, like Bill Gates and Paul Graham, have achieved great financial and commercial success. Others, like Richard Greenblatt and Richard Stallman, have eschewed all that and remained true to the hacker ideals that they grew up with. Levy also takes a look at the newcomers, like Mark Zuckerberg, who are carrying on the hacker tradition, at least as they see it.

If you’ve read Hackers and enjoyed it, you’re sure to enjoy this article.


The Last Hacker

A few weeks ago someone posted a link to Richard Stallman’s Speech at the 2002 International Lisp Conference. It’s an interesting read that, among other things, recounts the story told in the Epilogue to Steven Levy’s Hackers about RMS single-handedly (and independently) duplicating every improvement and bug fix made by Symbolics and then giving them to LMI. This was to avenge what he felt was Symbolics’ and Russ Noftsker’s betrayal of the MIT AI Lab and Hacker Culture. It was fun, after almost 25 years, to read that story again.

Also interesting is Stallman’s history of early Emacs. Most of us have probably heard a lot of that story before but it’s nice to hear a cohesive history from someone who was there. The original Emacs didn’t have a Lisp interpreter in it, but rather was built on top of the TECO command language. It was Dan Weinreb who first implemented Emacs on top of Lisp. Later, James Gosling wrote an Emacs that had low level functionality written in C and the higher level routines written in a Lisp-like language called mocklisp. Eventually RMS wrote the first version of the modern GNU Emacs.

If, like me, you’re fascinated by the history of our (hacker) culture and you haven’t read this story before, it’s definitely worth a few minutes of your time. And, of course, if you haven’t read Hackers you should immediately do so. It’s a wonderful story of our history starting from the time of the Tech Model Railroad Club at MIT and ending in 1983 with the story that RMS relates in his talk at the 2002 ILC.


Scrolling Emacs by Line

The other day, I stumbled across this post over at Anything goes about scrolling up and down by line in Emacs. The poster gives two simple functions and some key bindings to scroll a window up or down a line. In the comments, Jürgen notes that 【Ctrl+v】 and 【Meta+v】 are already bound to scrolling the window and that you can scroll by line instead of page by giving a prefix argument. Thus, to scroll down one line you would type 【Ctrl+u 1 Ctrl+v】 or, even easier, 【Ctrl+1 Ctrl+v】[1].

This is often convenient when you’re editing and want to see a few lines above the top of the window or below the bottom of the window. I used to do that by using 【Meta+r】 to move the point to the top/center/bottom of the window and then move the point up or down to scroll a line or two. That worked OK, I suppose, until I started using paredit, which rebinds 【Meta+r】 to paredit-raise-sexp. That made using 【Meta+r】 to scroll useless when editing Lisp or Scheme files so I would generally scroll up or down a page to see those hidden lines.

Adding a prefix to 【Ctrl+v】 and 【Meta+v】 to scroll by line is one of those things I’m sure I learned in the past but had forgotten. I’m glad to be reminded because it’s surprising how often the need to do so comes up.
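For comparison, the function-based approach might look something like this in Emacs Lisp. This is only a sketch: the function names and the M-n/M-p bindings are assumptions chosen for illustration, not the ones from the post mentioned above.

```elisp
;; Hypothetical commands in the spirit of the post; the names are made up.
(defun my/scroll-up-line ()
  "Scroll the text in the selected window up by one line."
  (interactive)
  (scroll-up 1))

(defun my/scroll-down-line ()
  "Scroll the text in the selected window down by one line."
  (interactive)
  (scroll-down 1))

;; Arbitrary bindings chosen for illustration.
(global-set-key (kbd "M-n") #'my/scroll-up-line)
(global-set-key (kbd "M-p") #'my/scroll-down-line)
```

The prefix-argument approach needs no code at all, because the numeric prefix is simply passed to the scrolling command as the number of lines to scroll.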

Footnotes:

1 If you’re on a Mac and are using Spaces, you may have to change the Spaces preferences to use 【⌘ Cmd+num】 instead of 【Ctrl+num】 to switch to a specific space. That’s because OS X will intercept the 【Ctrl+num】 and Emacs won’t see it.


The Common Lisp Loop Macro

I don’t like the CL Loop macro. I’m not alone in that; Paul Graham isn’t a fan either. On the other hand, Peter Seibel has a more positive view of it. That two accomplished and intelligent Lispers can disagree on the matter indicates that neither view is right or wrong; it is merely a matter of taste.

My two main objections to it are

  1. It’s not Lisp
    If I wanted to write in TCL or AppleScript or something, then I’d write in that language instead of Lisp. I really like Lisp s-expressions and see no reason to go out of my way to avoid them.
  2. There’s no real specification
    The loop facility is almost always documented by examples. As Graham says, its specification, to the extent that it has one, is the implementation. All this means that it is hard to understand and hard to apply to situations that aren’t one of the patterns covered by the examples.

Loop partisans often claim that writing iteration is more concise with the loop macro and this is sometimes true (although I would argue that the code is less clear) but not always. Here’s a case in point. We’re offered some loop macro code that checks if any Emacs buffer file name contains the word projects. This was intended to replace a long and complicated function that the poster had seen on another blog. When I first saw this, lots of different ways to do it in normal Lisp popped into my head and this post was originally going to be about how we could write a function to make this check in regular Lisp and still be just as concise.

But then I followed the link back to the original post and saw this elegant solution in the comments:

(some (apply-partially 'string-match "projects") (mapcar 'buffer-name (buffer-list)))

That truly is a thing of beauty[1]. The iteration is implicit in the some function and the apply-partially makes a lambda construct unnecessary.
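In Emacs Lisp, some comes from the old cl package; cl-lib provides the same function as cl-some. As a sketch, the one-liner can be wrapped in a function, along with a variant that looks at visited file names instead of buffer names. Both function names are made up for this illustration, and the second variant is just one possible way to do it, not the version from the comments.

```elisp
(require 'cl-lib)

;; Hypothetical wrappers around the one-liner above.
(defun my/any-buffer-name-matches-p (regexp)
  "Return non-nil if the name of any live buffer matches REGEXP."
  (cl-some (apply-partially #'string-match-p regexp)
           (mapcar #'buffer-name (buffer-list))))

(defun my/any-buffer-file-matches-p (regexp)
  "Return non-nil if the file visited by any buffer matches REGEXP."
  (cl-some (apply-partially #'string-match-p regexp)
           ;; Buffers that are not visiting a file yield nil; drop them.
           (delq nil (mapcar #'buffer-file-name (buffer-list)))))
```

A call such as (my/any-buffer-file-matches-p "projects") returns the position of the first match, which counts as true in Emacs Lisp, or nil when no buffer qualifies.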

The some function is standard in CL but not in Scheme. Because it is handy and often just what you need, I thought I’d implement it in Scheme and add it to my standard library.

;; Return #t if pred? is true for at least one
;; member of seq, #f otherwise.
(define some
  (lambda (pred? seq)
    (cond
     ((null? seq) #f)
     ((pred? (car seq)) #t)
     (else (some pred? (cdr seq))))))

As you can see, it’s trivial to implement.

The apply-partially function is more interesting. It’s like

(f a1 a2 ... an)

except the first few arguments are fixed. What happens is that

(apply-partially f a1 a2 ... ak)

returns a new function that accepts the remaining (non-fixed) arguments and then applies f to all the arguments. Thus,

(apply-partially 'string-match "projects")

returns a function that checks if its argument contains the word projects.

Strangely enough, it’s harder to explain than to implement:

;; Make a new function that applies f to
;; the x arguments and its input
(define apply-partially
  (lambda (f . x)
    (lambda y
      (apply f (append x y)))))
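
Emacs Lisp's built-in apply-partially behaves the same way, which makes it easy to try out. The following is a small sketch; the function name my/apply-partially-demo and the sample strings are made up for illustration.

```elisp
;; Hypothetical demo of Emacs Lisp's built-in `apply-partially'.
(defun my/apply-partially-demo ()
  "Return sample results from two partially applied functions."
  (let ((add-three (apply-partially #'+ 1 2))
        (has-projects-p (apply-partially #'string-match-p "projects")))
    (list (funcall add-three 3 4)                 ; same as (+ 1 2 3 4)
          (funcall has-projects-p "~/projects/x") ; index of the match
          (funcall has-projects-p "~/notes"))))   ; no match

(my/apply-partially-demo) ; → (10 2 nil)
```

The fixed arguments always come first, so the string passed to has-projects-p lands in the second argument of string-match-p, the string to search.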

Footnotes:

1 The original poster points out that the above code checks the buffer’s name not the name of the file associated with the buffer. The commenter responds with another version, which is a bit more complicated but still elegant and much nicer than the loop solution.

Posted in Programming | Tagged , | Leave a comment

Still More Password Analysis

Three more bloggers have weighed in with an analysis of the 62,000 passwords that LulzSec released recently. These three analyses take a look at the structure of the passwords and have some interesting details that I hadn’t seen before.

Aviv Ben-Yosef and Rafe Kettler take a look at the complexity of the passwords. As you might expect, the results are not encouraging, although the average length is 7.63, which is higher than I would have thought. Here are some startling results from Kettler:

  • 43.108% of the passwords were all lower case
  • 19.536% of the passwords were all numeric
  • 36.914% of the passwords had some mixture of lower case, uppercase, numbers, and symbols (although not necessarily all of those types)
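
A breakdown along these lines is easy to compute. The following Emacs Lisp sketch is not Kettler's code: the function names are made up, and lumping everything that is neither all lower case nor all numeric into a single other class is a simplifying assumption.

```elisp
(require 'cl-lib)

;; Hypothetical classifier; the class names are assumptions.
(defun my/password-class (password)
  "Classify PASSWORD as `lower', `numeric' or `other'."
  ;; Without this binding, [a-z] would also match upper-case letters.
  (let ((case-fold-search nil))
    (cond ((string-match-p "\\`[a-z]+\\'" password) 'lower)
          ((string-match-p "\\`[0-9]+\\'" password) 'numeric)
          (t 'other))))

(defun my/password-class-percentages (passwords)
  "Return an alist of (CLASS . PERCENT) for the list PASSWORDS."
  (let ((counts (list (cons 'lower 0) (cons 'numeric 0) (cons 'other 0)))
        (total (float (length passwords))))
    (dolist (password passwords)
      (cl-incf (cdr (assq (my/password-class password) counts))))
    (mapcar (lambda (cell)
              (cons (car cell) (* 100 (/ (cdr cell) total))))
            counts)))

(my/password-class-percentages '("password" "123456" "Passw0rd" "qwerty"))
;; → ((lower . 50.0) (numeric . 25.0) (other . 25.0))
```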

Over at R-bloggers, Colin Gillespie takes a slightly deeper look. He considers those passwords that would not fall to a simple dictionary attack and investigates their structure. It’s fairly intuitive that some characters will be used more than others and he drills down on that. Among other things, he discovered that

  • 20 characters (out of 78) cover 25% of the passwords
  • 27 characters cover 50% of the passwords
  • 31 characters cover 80% of the passwords
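
One way to arrive at figures of this kind is to count how often each character occurs and then see how many of the most frequent characters are needed to reach a given share of all occurrences. The sketch below is an assumption about the method, not Gillespie's actual analysis, and the function name is made up.

```elisp
;; Hypothetical sketch; not Gillespie's code.
(defun my/chars-to-cover (passwords fraction)
  "Return how many of the most frequent characters cover FRACTION.
FRACTION is a share (between 0 and 1) of all character occurrences
in the list PASSWORDS."
  (let ((table (make-hash-table))
        (total 0)
        counts)
    ;; Count the occurrences of every character.
    (dolist (password passwords)
      (dolist (char (string-to-list password))
        (puthash char (1+ (gethash char table 0)) table)
        (setq total (1+ total))))
    (maphash (lambda (_char n) (push n counts)) table)
    (setq counts (sort counts #'>))
    ;; Take characters from most to least frequent until FRACTION is reached.
    (let ((covered 0)
          (used 0))
      (while (and counts (< covered (* fraction total)))
        (setq covered (+ covered (pop counts))
              used (1+ used)))
      used)))

(my/chars-to-cover '("aaaa" "ab" "c") 0.5) ; → 1, since a is 5 of the 7 characters
```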

As Gillespie dryly remarks, if you’re trying to crack passwords, it’s clear that brute force is not the way to go. We users can also take away a lesson from this. If you want passwords that are hard to crack, it might be worthwhile using the less popular characters.

There are lots of other interesting results in these posts so if you’re interested in this sort of thing you should take a look.
