Git Merge and Rebase Explained

Git has become the de facto standard version control system. Many people find it confusing or even impossible to understand. In actuality, Git is based on some simple ideas that, once understood, make Git almost transparent. An example of this is Tom Preston-Werner’s explanation of Git as taking a series of snapshot of your file system. Once you understand the idea of taking snapshots, a lot of the mystery of what Git’s doing dissolves.

Still, there are Git functions who details are not obvious. A couple of these are merge and rebase—especially rebase. Alex Ford has an excellent three part post on how merge and rebase work and when you should use each. Ford uses screen shots from SourceTree to show the state of the commit graph as a branch is created and then merged to make the merge process clear. Then he repeats the process with rebase so that you can see how the two commands differ.

This is the clearest explanation of the difference between merge and rebase that I’ve seen. If you’re at all unclear about them, Ford’s posts can help you achieve enlightenment.

Posted in General | Tagged | Leave a comment

One Side of the Argument Has Class

That distant relative agrees to take back the sweater.

Posted in General | Tagged | Leave a comment

Three Groups of Editors

I came across this tweet

that reminded me of my slightly less controversial post on the same subject. That post was from 2 and a half years ago so it’s worth considering how things have changed.

The most obvious change is that TextMate is no longer a player. It’s apparently still available and the source code of the new version development has moved to GitHub but one hardly hears about it anymore. In terms of my original post, that leaves Emacs and Vim as the editors of choice for experienced developers. The battle between them for the top spot proceeds apace, of course, with partisans on both sides as adamant as ever.

There are a couple of new editors that are getting a lot of buzz: Sublime Text and Atom. Emacs users haven’t been much impressed and I assume the same is true of Vim users as well. Emacs and Vim are like black holes. Once you start using them you never leave1. It may be that in the long term these editors will gain traction but in the mean time, I think it’s still true that the most experienced engineers are using Emacs or Vim.

There are lots of decent editors, each with their own champions, and plenty of experienced and talented developers who use them. Still, as a general statement I think it’s still true that the most experienced and talented engineers tend to prefer Emacs or Vim, and that to a first order approximation their use signals someone who is experienced and cares about optimizing their work flow.

On the other hand, as far as uptake goes, we have this.

Footnotes:

1

Although there is movement between the two editors. It’s just that there’s no escape from the twin system of Emacs and Vim.

Posted in General | Tagged | 2 Comments

The Real Purpose of Operating Systems

An amusing anecdote from Perry Metzger’s The Editor of a Lifetime talk.

Posted in General | Tagged | Leave a comment

A Quick Introduction to Generic Functions & CLOS

Zach Beane points to a nice post by Nicolas Hafner on Generic Functions and CLOS. CLOS is often considered a hideously complicated system but the basics are easy to understand and use and most people will never need to explore the complex parts. Hafner’s post reminds me of Joe Marshall’s excellent Warp Speed Introduction to CLOS.

Together, these two posts serve as a sort of orientation to the Common Lisp object system. Both approach CLOS through generic functions, which are useful whether or not you ever define or instantiate an “object.” Both are short and definitely worth a read.

Posted in Programming | Tagged , | 1 Comment

And By the Way…

…Happy Programmers’ Day.

Posted in General | Tagged | Leave a comment

Things My Mom Taught Me

Suppose a distant relative sends you a sweater. It’s a perfectly good sweater but not a style you like or would ever wear. Instead of rolling your eyes and sending a pro forma thank you note, you take to twitter and your blog to complain bitterly that you were given a sweater you didn’t like. What would your mother say?

I don’t know about you but if I did that my mom would deliver me a scolding and a lecture about manners. But that’s just me. The Internet appears to be ablaze with weeping and gnashing of teeth over Apple’s adding the latest U2 album to everyone’s iTunes collection. Here’s TechCrunch waxing apoplectic about the gift. And I can’t tell you how many Twitter posts I’ve seen complaining about it. Even people who aren’t Apple users and don’t have an iPhone or iPad are complaining. It’s apparently difficult to delete music from your iTunes account (although it can be hidden) so these folks are feeling especially aggrieved.

Talk about a first-world problem. It’s not as if there aren’t serious things to complain about. How about NSA snooping, for example? Seriously, people, put the damn sweater in a drawer and get on with your lives. And anyone from Apple reading this can take it as my thank you note.

Update: Here’s a selection of Tweets bemoaning the U2 download. I know I must be old because I do know who U2 is. So old, I actually have several of their albums.

Posted in General | Tagged | 2 Comments

Mathematics in a Blog Post

As some of you know, I was trained as a mathematician. It’s odd, then, that my posts don’t have more mathematics in them. The reason for that is that this blog is mostly about computers and development so there’s not a lot of high level mathematics to discuss. What mathematics I do include are usually things like polynomials that are easily handled with sub- and superscripts.

Still, from time to time I have wanted to include something a bit more complicated but could never figure out how to integrate the math in a visually pleasing way. My two posts on Shamir’s Secret Sharing finally motivated me to figure out how to offer typeset mathematics on the blog page.

I’m abashed that it took me so long. The process is so easy that I really have no excuse for not doing it years ago. Here, for others wishing to have mathematics in their blogs, is what to do. This post is mainly aimed at the Org mode → org2blog → WordPress workflow that I use (see here for details) but I’ve tried to indicate how other workflows can also integrate mathematics into a blog post.

There are two parts to the problem:

  1. Getting the mathematics to display on your blog
  2. Getting the mathematics to display locally so you can easily preview the result

The easiest way to address both problems is with MathJax. It’s a bit of JavaScript that gets downloaded to the client’s computer and takes care of typesetting the mathematics. All this happens transparently so your blog readers don’t have to do anything. You don’t have to do much of anything either. The folks at the MathJax project maintain the MathJax CDN, a network of computers that will serve the JavaScript for you. Thus there is no need to have your own local copy (although that’s possible if you like).

The easiest way to get this working on WordPress is to install a plugin (although even this isn’t strictly necessary). I’m using LaTeX for WordPress but there are several others. The MathJax site has instructions for using MathJax with several common blogging platforms or how to add it to your theme file if you’d rather do that.

In my case, I installed the plugin and everything just worked1. To get mathematics you simply mark up your LaTeX with \(...!\) for in-line expressions and \[...!\] for displayed expressions. For example,

\[f(a)= \frac{1}{2\pi i}\oint_{\Gamma} \frac{f(z)}{z-a}dz!\]

yields the beautiful Cauchy Integral Formula \[f(a)= \frac{1}{2\pi i}\oint_{\Gamma} \frac{f(z)}{z-a}dz\]

To get the typesetting locally, you have to arrange for the MathJax JavaScript to be loaded. With Org-mode, I simply added

#+HTML_HEAD: <script type="text/javascript"
#+HTML_HEAD:  src="http://cdn.mathjax.org/mathjax/latest/MathJax.js">
#+HTML_HEAD: </script>

near the top of the source file. If you’re using a different system (why, for goodness sake?) you have to arrange to get the <script> markup into your generated HTML. Once you’ve done that the mathematics should appear locally when you preview your post.

Very simple, no? As I say, I’m abashed that it took me so long.

Footnotes:

1

If you’re using org2blog to publish your Org mode blog posts, you need to turn off org2blog‘s automatic translation of the mathematics markup with

(setq org2blog/wp-use-wp-latex nil)

Posted in Blogging | Tagged , | 14 Comments

Shamir’s Secret Sharing Implementation

Yesterday, I discussed Shamir’s secret sharing and how it works. The TL;DR was a method to give \(n\) people a secret number so that any \(k\) of the recipients \((k\lt n)\) can reveal the secret. The method works by defining a \((k-1)\)-degree polynomial with the secret number as the constant term and then distributing \(n\) points on the polynomial. Once we have \(k\) points, we can recover the polynomial—and therefore the secret constant term—by interpolation.

We begin with a couple of constants and a function. The randp function uses strong-random from the ironclad library to generate cryptographically secure random integers between 0 and \(p-1\). We use it to generate the coefficients for the polynomial and the \(x\)-coordinates of the \(n\) points. The *prng* parameter holds the state for randp.

*P* is a 256 bit prime number. It’s the default size for the number of elements in our finite field. Because \(\mathbb{Z}_{p}\) is a finite field, you might think that we could simply try each element of the field to recover an encryption key based on the constant term of the polynomial. That would work for a low order field but using *P* gives us a field with \(2^{256}\) members rendering a brute force attack infeasible.

(defparameter *prng* (ironclad:make-prng :fortuna) "Holds the RNG state.")
(defparameter *P* 108838049659940303356103757286832107246140775791152305372594007835539736018383 "256 bit prime")

(defun randp (p)
  "Generate a random number for the polynomial coefficents and sample points."
  (ironclad:strong-random p *prng*))

Because we’re working in \(\mathbb{Z}_{p}\), “division” is a bit more complicated. The modinv function uses the Euclidean algorithm to compute the (mod \(m\)) inverse of a number. When we need to divide by some number, \(x\), we use modinv to compute its inverse and then multiply by that inverse to “divide.”

(defun modinv (a p)
  "Find the inverse of a mod p, p a prime. Uses the Euclidian algorithm.
This is Algorithm X of Section 4.5.2 of TAOCP."
  (labels ((euclid (u v)
             (if (= (caddr v) 0)
                 (let ((u0 (car u)))
                   (if (minusp u0)
                       (+ u0 p)
                       u0))
                 (let ((q (floor (/ (caddr u) (caddr v)))))
                   (euclid v (mapcar (lambda (x y) (- x (* q y))) u v))))))
    (euclid (list 1 0 a) (list 0 1 p))))

We’ll generate the \(n\) points by choosing \(n\) random, non-zero \(x\) values and evaluating the polynomial at those points. The efficient way to evaluate polynomials is with Horner’s methods. I wrote about Horner’s method previously.

(defun horner (a x)
  "Evaluate the polynomial with coefficients in the array a at x."
  (if (= (length a) 1)
      (aref a 0)
      (do* ((k (1- (length a)) (1- k))
            (val (* (aref a k) x) (* (+ val (aref a k)) x)))
           ((<= k 1) (+ val (aref a 0))))))

The make-keys function uses randp to choose the polynomial’s coefficients (and thus the secret) and the \(x\)-values of the \(n\) points. Then it uses horner to generate the \(n\) points.

The xvals list is initialized to (0) so that 0 will not be chosen as an \(x\)-value. It’s discarded before xvals is input to the mapcar by the butlast.

(defun make-keys (n k &optional (p *P*))
  "Returns a random secret and n distinct keys.
K keys are sufficient to recover the secret. Each key is a pair, a point
on a random polynomial over the field Z/p."
  (let ((a (make-array k))
        (xvals '(0)))                   ;ensure x=0 not chosen as a point
    (dotimes (i k)
      (setf (aref a i) (randp p)))
    (while (< (length xvals) (1+ n))
      (setq xvals (adjoin (randp p) xvals)))
    (cons (aref a 0)
          (mapcar (lambda (x) (list x (mod (horner a x) p))) (butlast xvals)))))

Given the \(k\) points \((x_{0}, y_{0}) \ldots (x_{k-1}, y_{k-1})\), we retrieve the secret by evaluating the polynomial resulting from the Lagrangian interpolation of the \(k\) points at 0. We can do this all at once by computing \[\sum_{i=0}^{k-1} y_{i} \prod_{\substack{0\leq{}j\leq{}k-1 \\ j\neq{}i}} \frac{-x_{j}}{x_{i}-x_{j}}\] as explained in yesterday’s post. Notice that retrieve-secret is an almost literal translation of the above.

(defun retrieve-secret (v &optional (p *P*))
  "Recover the secret given k points and the prime."
  (let ((k (length v))
        (secret 0)
        prod)
    (dotimes (i k (mod secret p))
      (setq prod (cadr (aref v i)))
      (setq secret (+ secret 
                      (dotimes (j k prod)
                        (unless (= j i)
                          (setq prod (* prod
                                        (car (aref v j))
                                        (modinv
                                         (- (car (aref v j))
                                            (car (aref v i)))
                                         p))))))))))

As you can see, Shamir’s method is easy to understand and implement. None-the-less, it’s a powerful way of having a distributed secret.

Posted in Programming | Tagged , | 2 Comments

Shamir’s Secret Sharing

References to Jeremy Kun’s The Mathematics of Secret Sharing keep popping up in my feed so it appears there’s some interest in it. The problem is to entrust shares in a secret to \(n\) people in such a way that the secret can be recovered if \(k\) shareholders \((k < n)\) agree to reveal the secret. If less than \(k\) of the shareholders agree, the secret remains hidden. For each shared secret, the number \(k\) is fixed but any number of shares can be distributed. The secret must be a number so in practice the secret will usually be an encryption key.

It might seem like this would be a difficult problem requiring some esoteric mathematics but it turns out to be simple and requires no more than high school mathematics. The method uses the fact that it takes \(m + 1\) points to recover an \(m^{\textrm{th}}\) degree polynomial (because, after all, there are \(m + 1\) coefficients to determine). For example two points determine a line (1st degree polynomial), three points determine a parabola (2nd degree polynomial) and so on.

As an example, suppose our secret is 7, that we want 6 shareholders, and that we require at least 4 shares to reveal it. We define a third degree polynomial with a constant term of 7, say \(x^{3} + 3x^{2} – x + 7\), and then pick 6 points on the polynomial (avoiding the \(x=0\) point) and give each of the shareholders a point. This is illustrated in Figure 1 where the green points are given to the shareholders and the blue point corresponds to the secret.

polynomial.png

Figure 1: \(x^{3} + 3x^{2} – x + 7\)

Now if four of the shareholders agree to reveal their points, it is a simple matter to recover the polynomial with Lagrangian interpolation1, and retrieve the secret as the constant term.

This method has the advantage of being theoretically unbreakable. If an attacker has \(j < k\) points, a polynomial having any desired constant term can easily be constructed that passes through the \(j\) points. That means that the \(j\) points could represent any secret. Only by having \(k\) points can an attacker recover the secret.

For various technical reasons the calculations should be carried out over a finite field. That doesn’t really change anything except that “division” calculations are a bit more difficult. Tomorrow, I’ll present Lisp code that implements Shamir’s Secret Sharing over the finite field \(\mathbb{Z}_{p}\) where \(p\) is a 256 bit prime.

Footnotes:

1

Rather than computing the actual polynomial, we can recover the constant term directly with \[\sum_{j=0}^{k-1} y_{j} \prod_{\substack{0\leq{}m\leq{}k-1 \\ m\neq{}j}} \frac{-x_{m}}{x_{j}-x_{m}}\] where \((x_{0}, y_{0}) \ldots (x_{k-1}, y_{k-1})\) are the \(k\) points required for a \((k-1)^{\textrm{st}}\) degree polynomial. All that’s going on here is that we’re evaluating the Lagrange polynomial at 0.

We could also solve the four simultaneous equations by hand or use the Vandermonde matrix and Guassian elimination.

Posted in Programming | Tagged , | 1 Comment