Org Mode and Broken Links

As a couple of you have pointed out to me recently, HTML links in my posts are sometimes broken. That happens when the link has a parameter specified with a question mark and an equal sign. For example, here's a link to one of my blog posts that is affected:

http://irreal.org/blog/?p=2168

For some reason Org Mode has started escaping equal signs in links so that the above gets turned into

http://irreal.org/blog/?p%3D2168

which is, of course, incorrect. This happens in org-link-escape, which is called by org-make-link-string when I insert a link with 【Ctrl+c Ctrl+l】. No problem, I thought, they must have added the equal sign to the list of characters to escape; I'll just fix it up with a buffer-local variable or something. Unfortunately, when I checked I discovered that the equal sign has always been there and that neither org-link-escape nor org-make-link-string has been changed in over a year.
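To make the encoding itself concrete, here is a minimal sketch of the = to %3D round trip. It uses Emacs' generic url-util helpers rather than Org's own org-link-escape and org-link-unescape, so it only illustrates the percent-encoding, not how Org decides which characters to escape.

```elisp
;; Illustration only: generic URL helpers from url-util,
;; not Org's own org-link-escape / org-link-unescape.
(require 'url-util)

;; "=" is not an unreserved URL character, so it gets percent-encoded.
(url-hexify-string "=")                               ; => "%3D"

;; Decoding restores the original query string.
(url-unhex-string "http://irreal.org/blog/?p%3D2168") ; => "http://irreal.org/blog/?p=2168"
```

The escaped form is harmless as long as something decodes it again before the link is used; the broken links appear when that decoding step is skipped.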

I put in a little bit of time trying to track down what changed (it didn't do this in Org 7 and maybe not even in the early Org 8 versions) but couldn't find anything. I'll keep looking, but in the meantime, here's a little bit of Elisp that I threw together to fix things up:

(defun jcs-clean-link ()
  "Clean up munged = in an Org link."
  (interactive)
  (save-excursion
    (goto-char (point-min))
    ;; Match [[...%3D...] and replace only the %3D (group 1) with =.
    (while (search-forward-regexp "\\[\\[[^]%]*\\(%3D\\)[^]]*\\]" nil t)
      (replace-match "=" nil nil nil 1))))

I can call that before exporting the Org file to HTML and everything will be fine. I may advise the exporting function to call jcs-clean-link for me but for now I'm doing it manually.
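As an alternative to advice, a hook might do the job. The following is only a sketch: it assumes jcs-clean-link is already loaded and that the Org 8 exporter's org-export-before-processing-hook is available, which runs on a copy of the buffer and passes the back-end as its single argument. The function name my/org-clean-links-before-export is hypothetical.

```elisp
;; Assumes `jcs-clean-link' from this post is already defined.
;; `my/org-clean-links-before-export' is a made-up name for illustration.
(defun my/org-clean-links-before-export (_backend)
  "Run `jcs-clean-link' on the buffer copy Org is about to export."
  (when (fboundp 'jcs-clean-link)
    (jcs-clean-link)))

;; The hook runs on a copy of the buffer, so the original file is untouched.
(add-hook 'org-export-before-processing-hook
          #'my/org-clean-links-before-export)
```

Because the hook works on the export copy, the %3D stays in the Org file itself and only the exported output sees the cleaned link.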

The whole thing has not been a total waste of time, though, because I learned two new things. First, calling 【Meta+x】 visible-mode shows the links as they really are rather than just the description part. I used to switch to text-mode to do this, which is sort of a pain, so this is a good find for me.

Second, although it doesn't help me with the current problem, I discovered that you can set Org variables affecting export—even if they aren't one of those supported by an option—with the BIND directive. To set variable to value, you just add the line

#+BIND: variable value

to your Org file. You also have to set org-export-allow-bind-keywords to t for this to work. Back when I thought my links were getting munged during export, I speculated that maybe XHTML Strict required the escaping and I was able to disprove that by using BIND to cause the file to be exported as HTML4.
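Put together, the setup might look like the sketch below. The post doesn't say which variable was bound, so org-html-doctype and the value "html4-strict" are assumptions here; check that your Org version provides them.

```elisp
;; In the Emacs init file: allow #+BIND lines to take effect during export.
(setq org-export-allow-bind-keywords t)

;; Then, in the Org file itself (shown as a comment because it is Org
;; syntax, not Elisp). The variable and value are assumptions:
;;
;;   #+BIND: org-html-doctype "html4-strict"
```

Since BIND lets a file set arbitrary variables at export time, it is probably wise to leave org-export-allow-bind-keywords at nil except for files you trust.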

If any of you know what's going on with the links, please leave a comment.

  • I wrote an alternative to org2blog, and I found that there was a change to the export code after 8.0.3 that broke link handling---my unit tests suddenly started showing failures.

    https://github.com/mdorman/org-blog/commit/e1d23f366afb31054946db2bbc123cbc12400a7c is the commit that fixed link handling for me. It is, as the comments indicate, not ideal, but might be helpful.

  • David Maus

    Org escapes links when inserting via `org-insert-link' in order to protect characters that would otherwise conflict with Org's own syntax (i.e. square brackets). The link is unescaped when read by `org-open-link'. Thus if you read a link from an Org buffer you should apply `org-link-unescape' yourself.

    Org's "auto-escaping" is not ideal and was recently discussed here:

    http://thread.gmane.org/gmane.emacs.orgmode/74983/focus=75024

    • jcs

      Hi David, thanks for stopping by and helping me to understand things a little better. I've already seen the thread on gmane but your remarks helped me put it into context. I was thinking that before Org 8 there was some code that prevented the = from being escaped. After reading your comments, I now believe that the problem is at the other end: the = is not being unescaped before exporting it. In view of the new exporter framework introduced in version 8 that makes sense.

      Again, thanks for sharing your wisdom on the matter.

  • rick

    Hi-

    I am the (new :) maintainer of ox-html. I have pushed a fix for this problem (commit 6f5180bd) to org master.

    A couple of other things:

    * the command `org-toggle-link-display' will toggle the display of org links without affecting the visibility of other text in the buffer (i bind it to "C-c L").
    * the new exporter has built-in support for export filters, no need to defadvise anything.
    * This would have been fixed quicker if you had posted the issue to the orgmode mailing list ;)

    • jcs

      Rick,

      Thanks so much for the info and, of course, the fix. I was getting ready to dig into ox-html but I don't know the code base so I wasn't looking forward to it. Thanks for saving me. Also, thanks for the pointer to org-toggle-link-display.

      I know how annoying it is when people post known problems to a mailing list so I scanned Gmane looking for the problem. I found the thread that David Maus referenced (above) but nothing else so I felt like it was a known problem and didn't repost it.

      As a larger issue, this whole thing shows how great the Org community is. I whine about a problem on an obscure blog and two org maintainers drop by to help. It doesn't get much better than that.

      I'll post an update on the blog in case other people go Googling about the problem. Again, thanks for the help.