Xah Lee has another Emacs challenge. This time, it’s to replace entries of the form
<tr><td>pound</td><td>&#163;</td><td>pound sign, U+00A3</td></tr>
with entries like this:
<tr><td>pound</td>£<td>pound sign, U+00A3</td></tr>
This is, presumably, a small tweak to Lee’s very useful page on HTML/XML Entities, which I’ve written about before.
It would be reasonably easy to do this by writing some Emacs Lisp, but I wanted to see if I could do it interactively. It turns out to be pretty simple using query-replace-regexp. The secret is to use \,expr as the replacement string, where expr is a Lisp expression. With that replacement string, each time the regular expression matches, expr is evaluated and its value is used as the replacement. I used
&#\([0-9]+\);
as the regular expression and
\,(format "%c" \#1)
as the replacement string. The \#1 takes the text matched by the first subgroup and converts it to a number. The format call then turns that number into the corresponding Unicode character, which becomes the replacement text.
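For anyone who would rather script this than run it interactively, here is a minimal sketch of roughly the same replacement in Emacs Lisp. The function name is my own invention, and the sketch assumes the entities are all decimal (it ignores hex forms like &#xA3;).

```elisp
;; A minimal sketch of a non-interactive equivalent.
;; `my-decode-numeric-entities' is a hypothetical name chosen for illustration.
(defun my-decode-numeric-entities (start end)
  "Replace decimal entities such as &#163; between START and END with their characters."
  (interactive "r")
  (save-excursion
    (save-restriction
      ;; Narrowing keeps the search inside the region even as the text shrinks.
      (narrow-to-region start end)
      (goto-char (point-min))
      (while (re-search-forward "&#\\([0-9]+\\);" nil t)
        ;; `string-to-number' on group 1 plays the role of \#1;
        ;; (format "%c" ...) is the same call used in the replacement string.
        (replace-match (format "%c" (string-to-number (match-string 1)))
                       t t)))))
```

Select the table rows and run M-x my-decode-numeric-entities. Unlike query-replace-regexp, this version does not ask for confirmation at each match.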
The careful reader will note that this doesn’t give quite what Lee asked for because the <td>...</td> tags are still there. I’m guessing that Lee really wanted them there, but if not, it’s easy to get rid of them by using
<td>&#\([0-9]+\);</td>
as the regular expression.
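One way to try that variant outside a buffer is replace-regexp-in-string, which accepts a function as the replacement. This is only a sketch: the helper name is hypothetical, and the sample row in the usage note is a pound entry with the entity written out as &#163;.

```elisp
;; Sketch: apply the <td>-stripping variant to a string.
;; `my-strip-entity-cell' is a hypothetical helper name.
(defun my-strip-entity-cell (row)
  "Return ROW with each <td>&#NNN;</td> cell replaced by the bare character."
  (replace-regexp-in-string
   "<td>&#\\([0-9]+\\);</td>"
   (lambda (match)
     ;; Group 1 holds the digits; the match data refer to MATCH here.
     (format "%c" (string-to-number (match-string 1 match))))
   row
   t t))
```

Evaluating (my-strip-entity-cell "<tr><td>pound</td><td>&#163;</td><td>pound sign, U+00A3</td></tr>") should give back the row with a bare £ between the first and last cells, which is the form Lee asked for.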