Calling Babel From A Table

While I was reading the Lisp Idioms article that I blogged about yesterday, it occurred to me that the first example on accumulating a list presents an excellent opportunity to talk some more about Babel and how to use it with Org-mode tables. It will also give us a nice, if trivial, example of Reproducible Research.

Most Org-mode users are aware of the spreadsheet capabilities of Org-mode tables. You can, for example, sum all the values in a row or column or perform other familiar operations on the data. With Babel, you can also have a cell make a call to a Babel code block to get its value. Whereas in our previous examples we’ve used code blocks to build new tables, in this example we will first build the table and then have the code blocks populate (most of) the cells.

To review, in the Lisp Idioms article one of the examples was of ways to accumulate a list from a stream of values. This is, of course, a common operation in Lisp and Scheme and there are some mildly diverse ways of doing it. Just for variety, we will write our functions in Scheme rather than Common Lisp.

Cons the items onto a list and reverse the results at the end

The most common way of doing this is shown in the first code block. The cntr definition just provides a stream of integers—a new one each time it’s called. The E1 function does the actual work of assembling the list and the let at the end just calls E1 and calculates the time it took to do its work. All this should be very familiar to even beginning Scheme programmers.

(define cntr ((lambda () (let ((k 0)) (lambda () (set! k (1+ k)) k)))))
(define E1
  (lambda (n)
    (let loop ((k n) (lst '()))
      (if (zero? k)
          (reverse lst)
          (loop (1- k) (cons (cntr) lst))))))
(let ((start (get-internal-run-time)))
  (E1 n)
  (/ (- (get-internal-run-time) start) internal-time-units-per-second))

The next code block is almost the same as the first except that it uses reverse! instead of reverse to reverse the list. Theoretically, reverse! should be faster because it reuses the cons cells from the first list rather than making a separate list from scratch.

(define cntr ((lambda () (let ((k 0)) (lambda () (set! k (1+ k)) k)))))
(define E2
  (lambda (n)
    (let loop ((k n) (lst '()))
      (if (zero? k)
          (reverse! lst)
          (loop (1- k) (cons (cntr) lst))))))
(let ((start (get-internal-run-time)))
  (E2 n)
  (/ (- (get-internal-run-time) start) internal-time-units-per-second))

Append the items onto the end of the list

In some sense this is the most intuitive, if naive, method if you are not a Lisp or Scheme programmer. The problem is that append must search down the list each time to find the list’s end and is therefore very inefficient. No serious Lisp or Scheme programmer would ever do this.

(define cntr ((lambda () (let ((k 0)) (lambda () (set! k (1+ k)) k)))))
(define E3
  (lambda (n)
    (let loop ((k n) (lst '()))
      (if (zero? k)
          lst
          (loop (1- k) (append  lst (list (cntr))))))))
(let ((start (get-internal-run-time)))
  (E3 n)
  (/ (- (get-internal-run-time) start) internal-time-units-per-second))

Place the items in a vector in the order in which you retrieve them

This is certainly the fastest method and the one that, say, C programmers would use. Its problem is that you have to know the largest number of items that you can get so that you can preallocate the vector. Common Lisp has a way of extending vectors and the Lisp Idioms article has an experiment for that method but Scheme doesn’t have resizable vectors so we don’t have that case.

(define cntr ((lambda () (let ((k 0)) (lambda () (set! k (1+ k)) k)))))
(define E4
  (lambda (n)
    (let ((vec (make-vector n)))
      (let loop ((k 0))
        (if (= k n)
            vec
            (begin
              (vector-set! vec k (cntr))
              (loop (1+ k))))))))
(let ((start (get-internal-run-time)))
  (E4 n)
  (/ (- (get-internal-run-time) start) internal-time-units-per-second))

Running the experiment

We start with a table like this:

|       n | Reverse | Reverse! | Append   | Vector |
|---------+---------+----------+----------+--------|
|       1 |         |          |          |        |
|       2 |         |          |          |        |
|    1024 |         |          |          |        |
|    4096 |         |          |          |        |
|   16384 |         |          |          |        |
|  262144 |         |          | Too Long |        |
| 1048576 |         |          | Too Long |        |

The code blocks have the same names as in the header. For example, the first code block has a

#+SRCNAME Reverse(n)

just above it. The last two rows of the Append column are set to “Too Long” because the Append method takes too long to run for lists of those sizes. All the other cells in columns 2–5 are empty.

Next, we set the code for each of the empty cells to ‘(sbe @1 (n $1)). This is easily done with the table editor and results in the TBLFM line

#+TBLFM: $2='(sbe @1 (n $1))::$3='(sbe @1 (n $1))::$5='(sbe @1 (n $1))::@2$4..@7$4='(sbe @1 (n $1))

sbe is a macro that calls the code block with the name of its first argument and passes the code block the rest of its arguments. The @1 means row one of the current column and thus refers to the name of the code block from the table header. The $1 in the second argument means column 1 of the current row and thus refers to the n-value for this cell.

Now all we need do is put the point anywhere in the table and type C-u C-c C-c and the table will be filled in with the run times.

n Reverse Reverse! Append Vector
1 0 0 0 0
2 0 0 0 0
1024 0 0 1/100 0
4096 0 0 6/25 0
16384 1/100 0 387/100 0
65536 1/100 1/100 1321/20 1/100
262144 1/25 3/100 Too Long 3/100
1048576 19/100 13/100 Too Long 1/10

get-internal-run-time returns number of 100 Hz ticks used by the Guile interpreter (thus internal-time-units-per-second = 100). As expected, the Vector method is fastest but not by as much as we might expect. The other interesting result is that except for really long lists, reverse! does not outperform reverse to any significant extent. The Append method, of course, is out of the running.

This post is a simple example of Reproducible Research because everything needed to reproduce the post is contained in a single the Org-mode file. If this were a serious research paper, any scientist could request a copy of the file and run the experiments for his or herself.

Update: Corrected the definition of get-internal-run-time.

This entry was posted in Programming and tagged . Bookmark the permalink.