Org Version 7.8.02 Is OUT

There’s a new version of Org mode available. You can download it from here. The download page also has instructions for cloning the Git repository so that you can stay as up-to-date as you like.

The main changes appear to be with Babel processing. Some of these are incompatible so if you’re a Babel user be sure to take a look. The changes page has an Elisp function to automate updating your old Babel buffers to the new syntax.

Concurrent with the new release, the Org team has also updated the site with a completely new look.

Posted in General | Tagged , | Leave a comment

Converting S-Expressions To XML In Emacs

My last two posts mentioned the ease with which s-expressions can be converted to XML and vice versa. Out of curiosity, I decided to write an sexpr to XML function in Elisp.

The implementation consists of two functions. The first, sexpr->xml, does the actual conversion, while the second, convert-to-xml, takes care of Emacs bookkeeping details and acts as a driver. For convenience, convert-to-xml takes an sexpr as its input but it would be easy to have it operate on a region or even a whole buffer.

We’ll use the example sexpr from Yegge’s The Emacs Problem post. Note that, again for simplicity, I am dealing with only a single record but everything would work exactly the same if there were multiple records wrapped in a '(log …) sexpr.

'(record
  (date "2005-02-21T18:57:39")
  (millis 1109041059800)
  (sequence 1)
  (logger nil)
  (level 'SEVERE)
  (class "java.util.logging.LogManager$RootLogger")
  (method 'log)
  (thread 10)
  (message "A very very bad thing has happened!")
  (exception
    (message "java.lang.Exception")
    (frame
      (class "logtest")
      (method 'main)
      (line 30))))

The converter is very simple.

1:  (defun sexpr->xml (sexpr)
2:    (let ((tag (car sexpr)))
3:      (princ (format "<%s>" tag))
4:      (dolist (o (cdr sexpr))
5:        (if (atom o)
6:            (princ (format "%s " o))
7:          (sexpr->xml o)))
8:      (princ (format "</%s>" tag))))

It’s passed an sexpr whose first symbol is the XML tag. That gets saved away in the let on line 2 and printed as the opening tag on line 3. The dolist loop in lines 4–7 looks at each of the other objects in the sexpr. If an object is not another sexpr, it is printed with a trailing space. If it is another sexpr, sexpr->xml is called recursively on line 7 to process it. When all the objects in the sexpr have been processed, the end tag is printed on line 8.

The output of sexpr->xml is a single line of XML with no formatting at all. Also, any nils will appear explicitly in the XML instead of the tag pair being empty and all the quoted symbols will be wrapped in <quote></quote> tags because the lisp reader turns 'symbol into (quote symbol).

Now let’s look at the driver function:

1:  (defun convert-to-xml (sexpr)
2:    (with-output-to-temp-buffer "*XML*"
3:      (sexpr->xml sexpr)
4:      (set-buffer "*XML*")
5:      (xml-mode)
6:      (replace-regexp "\\bnil\\b\\|<quote>\\|</quote>" "" nil (point-min) (point-max))
7:      (sgml-pretty-print (point-min) (point-max))))

Lines 2 and 3 call sexpr->xml and arrange for its output to go into a buffer named *XML*. When sexpr->xml returns, the *XML* buffer is selected and set to xml-mode (really nxml-mode). The replace-regexp on line 6 deletes any occurrences of nil and gets rid of <quote> and </quote> tags. Finally, sgml-pretty-print is called to format the XML nicely.

That’s a lot of work for not very much code. We could, of course, take care of the formatting and fix up the nil and <quote> problems right in sexpr->xml but I wanted to show how simple the conversion can be without a lot of busy details. Besides, we have the power of Emacs so it would be foolish not to use it.
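
For comparison, here is a minimal sketch of what handling nil and quoted symbols inside the converter might look like. The name sexpr->xml-string is hypothetical and not part of the code above; it returns a string instead of printing, and it joins multiple atoms without the trailing space that sexpr->xml emits.

```elisp
;; Hypothetical variant of sexpr->xml (the name is an assumption, not
;; from the post).  It returns the XML as a string rather than printing.
(defun sexpr->xml-string (sexpr)
  "Return SEXPR as an unformatted XML string.
A nil child yields an empty element and (quote SYM) is unwrapped."
  (let ((tag (car sexpr)))
    (concat
     (format "<%s>" tag)
     (mapconcat
      (lambda (o)
        (cond ((null o) "")                  ; empty element, not "nil"
              ((eq (car-safe o) 'quote)      ; 'SEVERE reads as (quote SEVERE)
               (format "%s" (cadr o)))
              ((atom o) (format "%s" o))     ; strings, numbers, symbols
              (t (sexpr->xml-string o))))    ; nested element
      (cdr sexpr)
      "")
     (format "</%s>" tag))))
```

Because nil is dropped before any text is produced, a message that happens to contain the word nil would be left alone, which the regexp cleanup cannot guarantee.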

The final result of running convert-to-xml on our sample sexpr is

<record>
  <date>2005-02-21T18:57:39
  </date>
  <millis>1109041059800
  </millis>
  <sequence>1
  </sequence>
  <logger>
  </logger>
  <level>SEVERE
  </level>
  <class>java.util.logging.LogManager$RootLogger
  </class>
  <method>log
  </method>
  <thread>10
  </thread>
  <message>A very very bad thing has happened!
  </message>
  <exception>
    <message>java.lang.Exception
    </message>
    <frame>
      <class>logtest
      </class>
      <method>main
      </method>
      <line>30
      </line>
    </frame>
  </exception>
</record>

Any Elisp coders out there might want to consider what a translator in the other direction (XML to s-expressions) would look like. Not having s-expressions to leverage makes the processing a little more difficult but not by much.
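
As one possible starting point, the sketch below assumes the XML has already been parsed by xml.el, whose nodes have the form (TAG ATTRIBUTES CHILD...). The function name my-xml->sexpr is hypothetical. Attributes are ignored, whitespace-only text is dropped, and all text stays as strings, so numbers and symbols are not recovered.

```elisp
(require 'xml)
(require 'subr-x)                       ; string-trim, string-empty-p

;; Hypothetical helper; the name is an assumption for illustration.
(defun my-xml->sexpr (node)
  "Convert NODE, an xml.el node (TAG ATTRS CHILD...), to a plain sexpr.
Attributes are discarded and whitespace-only text children are dropped."
  (cons (xml-node-name node)
        (delq nil
              (mapcar (lambda (child)
                        (if (stringp child)
                            ;; Keep text only if something is left after trimming.
                            (let ((text (string-trim child)))
                              (unless (string-empty-p text) text))
                          (my-xml->sexpr child)))
                      (xml-node-children node)))))
```

A node for a buffer could be obtained with something like (car (xml-parse-region (point-min) (point-max))) and passed to the function.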


Ruby vs. Lisp

Yesterday, I wrote about Steve Yegge’s The Emacs Problem post in which he examined a claim by a colleague that, among other things, Emacs should be rewritten to use Ruby as the interpreter. Today, serendipitously, I came across Learning Ruby, and Ruby vs. Lisp by Hans Hübner on his Netzhansa blog. Hübner is a Lisper who works in an environment where there is a lot of Ruby development going on and he decided to learn a bit about it. After reading The Ruby Programming Language by David Flanagan and Yukihiro Matsumoto and doing a bit of programming, he could see the appeal of Ruby but felt that it didn’t offer him anything that Common Lisp didn’t have and that Ruby had some problems:

  • Ruby strives for succinctness but often at the cost of complex syntax and special case rules to resolve ambiguities.
  • Ruby’s functional programming aspects appear to have been bolted onto its original object oriented model after the fact. This makes using Ruby in a way that is not pure object oriented messy and hard to understand.
  • Ruby runs about 10 times slower than the same program in CL.

Yegge likes Ruby too but he believes it falls into the category of “scripting languages” by which he means “a bunch of miserable hacks: Perl, Python, Ruby, Groovy, Tcl, Rexx… you name it. They all start life with no formal grammar or parser, no formal bytecode or native-code generation on the backend, no lexical scoping, no formal semantics, no type system, nothing. And that’s where most of them wind up.1” That’s harsh, of course, but it does help explain all the warts in the language.

Incidentally, despite these complaints, Yegge is very complimentary about Ruby in his Tour de Babel post about programming languages. He considers it Perl done right and believes that it will largely displace Perl. Yegge didn’t seem to be fazed by the complex syntax that bothered Hübner and said that it usually works the way you expect it to.

Ruby has a lot of devoted fans and I’ve often been tempted to learn it but it seems to me that it solves pretty much the same problems that Lisp and Scheme do and I’m very happy with those languages so I’m still not ready to take the plunge. Maybe next week.

Update: phased → fazed

Footnotes:

1 The last comment in his The Emacs Problem post.


The Emacs Problem

Almost all Emacs users are familiar with Steve Yegge’s Effective Emacs post. If you’re an Emacs user and haven’t read it, stop what you’re doing and go read it right now. As it happens, Yegge wrote several posts about Emacs and one of my favorites, which I’ve just reread, is The Emacs Problem.

That post begins with a claim by one of Yegge’s colleagues that Lisp is not a good language for text processing, that Emacs goes to show that, and that Emacs should be rewritten with Ruby as the interpreter. I know, I know; me too. But Yegge examines that claim seriously and in the process generates a wonderfully readable case that nothing but Lisp is really any good at text processing.

He begins by noting that when we think of text processing we almost always think of regular expressions as the main tool. He notes that Lispers are generally skeptical of regexps and say things like, “Regexps aren’t useful for tree structured data and why are you storing your data as text in the first place instead of s-expressions (that is, as Lisp)?” Then he goes into a comedy routine about log files and how they’re usually just a line or two of text that is most easily processed with regexps and that those Lisp losers appear not to know that.

But then he says that he just noticed that his java.util.logging output had suddenly changed to being XML (this was with Java 1.5 in 2005). He gives an example of a short text-based log entry and the corresponding entry in XML and concedes that sometimes the extra metadata in the XML can be useful and that tools like XPath provide a powerful way of processing and querying XML data.

Next, he gives the same data as an sexpr. It’s shorter, clearer, and much easier to read than the XML and can be operated on by XPath-like tools available for Lisp. In fact, you can think of the data as executable and write functions or macros that can make it transform or process itself. He then identifies the Text-Processing Conundrum:

  1. You want to be able to store and process text data.
  2. Doing this effectively requires the data be tree structured in all but trivial cases.
  3. The only good, general tool for doing this is XML.
  4. XML processing is supposed to be easy but rapidly becomes complex when you start using tools like XSLT and XQuery or worse yet write your own transformations using a SAX or DOM processor in the language of your choice.
  5. But those are your only options.

Unless you’re using Lisp. With Lisp, data is code and so you can store your data as a Lisp program. Querying and transforming it are almost trivial.
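
To make the querying claim concrete, here is a small sketch in Elisp. The variable my-log-record is a shortened, unquoted version of Yegge’s example record, and my-record-get is a hypothetical helper written for this illustration; neither comes from his post.

```elisp
;; Shortened form of the example log record, stored as plain Lisp data.
(defvar my-log-record
  '(record
    (date "2005-02-21T18:57:39")
    (level SEVERE)
    (thread 10)
    (exception
     (message "java.lang.Exception")
     (frame (class "logtest") (method main) (line 30))))
  "Sample log record used only for illustration.")

;; Hypothetical helper: walk down the tree one tag at a time.
(defun my-record-get (record &rest path)
  "Follow PATH, a sequence of tag symbols, into RECORD.
Return the contents of the element reached, or nil if PATH does not match."
  (let ((node record))
    (dolist (tag path)
      ;; Each element is (TAG . CONTENTS), so its children form an alist.
      (setq node (assq tag (cdr node))))
    (cdr node)))
```

With these definitions, (my-record-get my-log-record 'exception 'frame 'line) evaluates to (30) and (my-record-get my-log-record 'level) to (SEVERE), with no parser involved.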

He goes on to consider other text processing such as configuration files and finds that the same principles apply only more so. By writing your configuration file in Lisp, it ceases being a configuration file and becomes part of your program.

Finally, he considers the question of rewriting Emacs in some other language. He goes through all the usual reasons why that’s not practical (at least until guile becomes the interpreter for Emacs if it ever does) and recites the usual litany of political problems surrounding Emacs and RMS.

To my mind, though, he misses the main point. Why would you want to write it in something besides a Lisp-like language? As he amply demonstrates in the first part of the post, Lisp is a great language for text processing and the only one without serious problems. Sure, a lot of the text that Emacs deals with isn’t tree structured, but a lot of it is and Emacs has a powerful regexp system for the text that isn’t. Just read through Xah Lee’s excellent series of articles on text processing with Emacs, for instance. A recurring theme in Lee’s posts is text-soup automation in which he uses Elisp to transform arbitrary text.

Yegge’s post is excellent and if you haven’t read it, you’re missing out on a treat. Also be sure to read the comments. He develops some of the ideas in the post further in responses to the commenters.


Copying Or Writing An Emacs Region

Emacs provides a nice set of commands to copy a region in a buffer to another buffer or to write it to a file. In each case, these functions will prompt for the name of the file or buffer when called interactively. The functions are:

  • append-to-buffer
    Despite its name, this function does not, strictly speaking, append the region to a new buffer. Rather it inserts it in the new buffer at point.
  • copy-to-buffer
    This command replaces the text in the new buffer with the region from the current buffer.
  • append-to-file
    Append the region onto the end of the file.
  • write-region
    Write the region to the named file. When called from Elisp, this function has 4 optional arguments that control whether or not to append to the file, whether the file can already exist, and alternate naming.

These aren’t commands that you’ll use a lot but they are convenient for moving part of a buffer to another buffer or file.
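
Called from Elisp, the same functions take the region boundaries explicitly. The sketch below uses write-region twice, the second time with a non-nil APPEND argument; the function name my-save-region-demo and the sample text are made up for the illustration.

```elisp
;; Hypothetical demo function; the name and sample text are assumptions.
(defun my-save-region-demo (file)
  "Write the first line of a scratch buffer to FILE, then append the second."
  (with-temp-buffer
    (insert "first line\nsecond line\n")
    (goto-char (point-min))
    (let ((second (line-beginning-position 2))) ; start of the second line
      ;; (write-region START END FILENAME &optional APPEND VISIT LOCKNAME MUSTBENEW)
      (write-region (point-min) second file)
      ;; A non-nil APPEND adds to the end of FILE instead of replacing it.
      (write-region second (point-max) file t))))
```

After calling it on a scratch file, that file should hold both lines in order.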


Running Emacs Lisp On Several Files

Xah Lee has an instructive post on converting and scaling images with Emacs. Unlike him, I don’t often need to scale or convert even a single image, let alone many, but he does show an interesting technique to pass several files to an Elisp function. The idea is to use dired to mark the files you’re interested in and then use dired-get-marked-files to pass them to the function.

The template for performing some action on each of the marked files is

(defun process-each-file (files)
  "Perform an action on each file in list FILES."
  (interactive (list (dired-get-marked-files)))
  (mapc (lambda (f) (perform-action-on-file f)) files))

For example, if you want to perform perform-action-on-file on each JPG file you can use 【* . jpg】 in dired to mark all the JPG files and then call process-each-file.

The trick that makes this work is passing a list to the interactive declaration. I’ve written about that before. This is a handy technique that can often be used to advantage.
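
As a concrete instance of the template, the sketch below adds up the sizes of the marked files. The command name my-total-size-of-files is hypothetical; only dired-get-marked-files and the interactive list form come from the technique described above.

```elisp
(require 'dired)

;; Hypothetical command following the template; the name is an assumption.
(defun my-total-size-of-files (files)
  "Return and report the combined size in bytes of FILES.
Interactively, FILES are the files marked in the current dired buffer."
  ;; The interactive form evaluates to a list of arguments, here a
  ;; one-element list whose single element is the list of marked files.
  (interactive (list (dired-get-marked-files)))
  (let ((total 0))
    (dolist (f files)
      ;; Element 7 of the attribute list is the size; nil for a missing file.
      (setq total (+ total (or (nth 7 (file-attributes f)) 0))))
    (message "%d bytes in %d files" total (length files))
    total))
```

Because the file list is an ordinary argument, the same function can be called from other Lisp code with any list of file names, not just from dired.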


Drawing Key Sequences In Emacs (Updated)

This is a sad story of my HTML ineptitude but one in which I eventually find the right solution. I’ve written several times (here, here, and here) about how I implemented Xah Lee’s trick of representing key sequences in a visual manner like this: 【⌘ Cmd+Tab】

When I was first experimenting with the markup I used <key>...</key> tags to delimit the keys and added the appropriate bit of CSS with a key tag. That worked perfectly in my local testing but when I tried it out with the live blog it turned out that WordPress disappeared the <key> tags, presumably because it didn’t recognize them. After a bit of thinking, I came up with the <span class="key"> that I’ve been using ever since. That works well and the only complaint I have is that the markup for a complicated key sequence is pretty long and I tend to get problems when Emacs tries to split lines.

Then Aankhen asked why I wasn’t using <kbd>...</kbd> instead of the longer <span...> method. I replied that I’d originally used <key> but that WordPress eliminated them from the text. He explained to me (probably more politely than I deserved) that <key> isn’t an HTML tag but that <kbd> is. By this time I already knew that Lee used <kbd> but I had just assumed that he had made it up—just as I had made up <key>—and that it worked for him because he wasn’t using WordPress. So, I thought, I’ll just add a kbd tag to the CSS and both methods will work. Tested it locally and everything was fine. Added the kbd to the site CSS, ran a test and it didn’t work. I concluded that this was just more WordPress weirdness and gave up. Several weeks later it popped into my head that the reason it hadn’t worked was that my browser hadn’t reloaded the site’s style sheet when I changed it. Fortunately, I had left the change in so when I retried it, everything worked as expected.

I like the shorter <kbd> tags because they avoid the overly long markup sequences that were causing me a slight irritation. All that was left to do was to change prettify-key-sequence to use the new method. Here, for the record, is the new code

(defmacro key (k)
  "Convenience macro to generate a key sequence map entry
for \\[prettify-key-sequence]."
  `'(,k . ,(concat "@<kbd>" k "@</kbd>")))

(defun prettify-key-sequence (&optional omit-brackets)
  "Markup a key sequence for pretty display in HTML.
If OMIT-BRACKETS is non-nil then don't include the key sequence brackets."
  (interactive "P")
  (let* ((seq (region-or-thing 'symbol))
         (key-seq (elt seq 0))
         (beg (elt seq 1))
         (end (elt seq 2))
         (key-seq-map (list (key "Ctrl") (key "Meta") (key "Shift") (key "Tab")
                            (key "Alt") (key "Esc") (key "Backspace")
                            (key "Enter") (key "Return") (key "Space")
                            (key "Delete") (key "F10") (key "F11")
                            (key "F12") (key "F2") (key "F3")
                            (key "F4") (key "F5") (key "F6") (key "F7")
                            (key "F8") (key "F9")
                            ;; Disambiguate F1
                            '("\\`F1" . "@<kbd>F1@</kbd>")
                            '("\\([^>]\\)F1" .
                              "\\1@<kbd>F1@</kbd>")
                            ;; Symbol on key
                            '("Opt" . "@<kbd>⌥ Opt@</kbd>")
                            '("Cmd" . "@<kbd>⌘ Cmd@</kbd>")
                            ;; Combining rules
                            '("\+\\(.\\) \\(.\\)\\'" .
                              "+@<kbd>\\1@</kbd> @<kbd>\\2@</kbd>")
                            '("\+\\(.\\) \\(.\\) " .
                              "+@<kbd>\\1@</kbd> @<kbd>\\2@</kbd> ")
                            '("\+\\(.\\) " .
                              "+@<kbd>\\1@</kbd> ")
                            '("\+\\(.\\)\\'" .
                              "+@<kbd>\\1@</kbd>"))))
    (mapc (lambda (m) (setq key-seq (replace-regexp-in-string
                                     (car m) (cdr m) key-seq t)))
          key-seq-map)
    ;; Single key
    (if (= (length key-seq) 1)
        (setq key-seq (concat "@<kbd>" key-seq "@</kbd>")))
    (delete-region beg end)
    (if omit-brackets
        (insert key-seq)
      (insert (concat "【" key-seq "】")))))
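
The heart of the function is the loop that applies each (REGEXP . REPLACEMENT) pair in turn. Here is a stripped-down sketch of just that step, working on a string instead of the region. The name my-markup-keys and the three rules are chosen for illustration, and the sketch drops the @ prefix that the real code puts before each tag.

```elisp
;; Hypothetical, reduced version of the rule-application loop above.
(defun my-markup-keys (key-seq)
  "Return KEY-SEQ with a few key names wrapped in <kbd> tags."
  (let ((rules '(("Ctrl" . "<kbd>Ctrl</kbd>")
                 ("Tab"  . "<kbd>Tab</kbd>")
                 ;; A single character after + at the end is its own key.
                 ("\\+\\(.\\)\\'" . "+<kbd>\\1</kbd>"))))
    ;; Apply the rules in order; the third argument of dolist is the result.
    (dolist (rule rules key-seq)
      (setq key-seq (replace-regexp-in-string
                     (car rule) (cdr rule) key-seq t)))))
```

The order of the rules matters: named keys are wrapped first so that the single-character rule only sees what is left over.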

Specifying Any Modifier Key In Emacs

A couple of weeks ago, I wrote about a Xah Lee post that, among other things, mentioned how to set the hyper and super keys for use with Emacs. Lee uses them a lot because he has a bunch of private functions that he wants to bind to key sequences and most of the convenient sequences that don’t involve hyper or super are already used. I don’t have that problem because I don’t have as many private functions and because, not being as fanatical about efficiency as Lee, I’m content to let smex find the private functions that I do use.

Nonetheless, it is sometimes useful to have a hyper or super key even if you haven’t dedicated some spare keys on your keyboard to them. It turns out that there is a way of causing a keystroke to be modified by the hyper or super key: just type 【Ctrl+x @ h】 and the next key will be modified by “hyper”. Similarly, typing 【Ctrl+x @ s】 will cause the next key to be modified by “super”. If you wanted to modify by both hyper and super you can type 【Ctrl+x @ h Ctrl+x @ s】.

Actually, the same trick works for any of the modifier keys. You can get a complete list by typing 【Ctrl+x @ Ctrl+h】 but here is a quick summary.

Summary

Key Sequence    Modifier
Ctrl+x @ S      Shift
Ctrl+x @ a      Alt
Ctrl+x @ c      Control
Ctrl+x @ h      Hyper
Ctrl+x @ m      Meta
Ctrl+x @ s      Super
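
These sequences only help if something is bound to a hyper or super key in the first place. A minimal sketch of such bindings follows, using a throwaway keymap; my-demo-map and my-say-hello are hypothetical names, and in practice the bindings would more likely go in the global map or a mode map.

```elisp
;; Hypothetical keymap and command, used only for this illustration.
(defvar my-demo-map (make-sparse-keymap)
  "Keymap holding example hyper and super bindings.")

(defun my-say-hello ()
  "Trivial command to bind to the example keys."
  (interactive)
  (message "hello from a modified key"))

;; In `kbd' notation H- is the hyper modifier and s- is super.
(define-key my-demo-map (kbd "H-g") #'my-say-hello)
(define-key my-demo-map (kbd "s-g") #'my-say-hello)
```

With such a binding in an active keymap, typing 【Ctrl+x @ h g】 should run the command even on a keyboard with no hyper key.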

A History Of Unix

Warren Toomey has a great article on The Strange Birth and Long Life of Unix in IEEE Spectrum. Toomey runs the Unix Heritage Society, which maintains a mailing list for those interested in Unix history and, most importantly, maintains a collection of all the publicly available Unix source code.

The outline of the story is well known. When AT&T withdrew from the Multics project, Ken Thompson, Dennis Ritchie, and some of the other Bell Labs researchers were left without the convenient, interactive, time sharing system that Multics had provided and set about to create their own. What is new to me is how much of a skunk works project this was. AT&T management felt that they had been burned with Multics and were dead set against any further OS research. Thompson experimented with some of his file system and OS ideas on the GE-645 that they’d used for their Multics work but realized that the 645 would soon go away so he abandoned his work. Then, of course, he found that storied PDP-7, ported his Space Travel game to it and opened the door to what would eventually result in Unix.

Toomey relates how early Unix was in many ways like the open source movement today. AT&T couldn’t sell Unix and their lawyers were afraid that offering any support could be interpreted as violating their consent decree so fixing bugs was up to the users. Eventually those were contributed back to AT&T and incorporated into new editions. Thompson apparently even smuggled bug fixes to the users through Usenix.

Toomey tells a great story of the early days of Unix and if, like me, you’re a Unix aficionado, you’ll want to give it a read. If you’re an open source enthusiast, you’ll want to read it to see how hard people had to work to share code and knowledge in the early days. All in all, a great read for anyone interested in computers.


Emacs follow-mode

Xah Lee has a nice post on follow-mode. He describes how to take a wide frame, break it up into two or more horizontal windows, and then have the windows act as one large window into the buffer. That is, the end of the first window flows into the start of the second and so on. If you move the cursor off the end of a window it will move to the next window.

This isn’t something I want to do too often but it can sometimes be useful. I’ve seen people asking how to do this several times so Lee’s post is a valuable one. Fortunately, it’s easy to set up and once you do it, you will almost certainly remember how to do it in the future.
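
For reference, the setup can be wrapped in a small command. This is a sketch rather than Lee’s own recipe: the name my-follow-side-by-side is hypothetical, and it assumes a frame wide enough to split in two.

```elisp
(require 'follow)

;; Hypothetical convenience command; the name is an assumption.
(defun my-follow-side-by-side ()
  "Show the current buffer in two side-by-side windows that scroll as one."
  (interactive)
  (delete-other-windows)
  (split-window-right)   ; second window on the same buffer, to the right
  (follow-mode 1))       ; make the windows act as one tall window
```

Running M-x follow-mode again in that buffer turns the behavior off.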
