Extracting Text From An Emacs Buffer

Rajasegar Chandran has an interesting and useful post on how to extract text from an Emacs buffer. You wouldn’t think there’s a lot to know, but there are at least two wrinkles. First, you might not want the whole buffer, and second, you (probably) don’t want to include the text properties.

Text properties are things like font faces that allow Emacs to display text as “rich text”. They’re really useful, but if you want to work on just the underlying text, they can get in the way. Emacs, of course, gives you a way to omit them.

To deal with all this there are four functions that cover most of what you’d want to do.

buffer-substring
Extract a substring including its text properties from the buffer.
buffer-substring-no-properties
Like buffer-substring but strips the text properties.
buffer-string
Extract the entire buffer as a string. The text properties are included.
thing-at-point
Return the text of the object that point is on, as a string.
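
Here is a minimal sketch of the first three functions in action. The function name my/extract-demo and the sample text are made up for illustration; the positions 8 and 13 simply bracket the word “Emacs” in that sample text.

```elisp
;; Hypothetical demo; the function name and sample text are assumptions.
(defun my/extract-demo ()
  "Return three strings extracted from a temporary buffer."
  (with-temp-buffer
    (insert "Hello, ")
    ;; Insert a word that carries a `face' text property.
    (insert (propertize "Emacs" 'face 'bold))
    (insert " world")
    (list
     ;; Positions 8 up to (but not including) 13 cover "Emacs".
     ;; This copy keeps the `face' property.
     (buffer-substring 8 13)
     ;; Same text, but with the properties stripped.
     (buffer-substring-no-properties 8 13)
     ;; The whole buffer, properties included.
     (buffer-string))))
```

Evaluating (my/extract-demo) returns three strings. The first and second are both “Emacs”, but only the first still answers bold to (get-text-property 0 'face ...). The third is the full “Hello, Emacs world”.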

Most of these are self-explanatory, but thing-at-point is a bit richer. You can specify the type of object you want to extract, such as a symbol, list, URL, filename, email address, or many other object types. Its sister function, bounds-of-thing-at-point, returns the start and end positions of the object as a cons cell. Both are really useful if you want to extract, say, a URL from the buffer.
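
As a rough sketch of how the two fit together, the snippet below grabs a word and a URL from temporary buffers. The function names and sample strings are hypothetical, chosen only for illustration. The second argument to thing-at-point asks for the text without its properties.

```elisp
(require 'thingatpt)

;; Hypothetical demo functions; names and sample text are assumptions.
(defun my/word-at-point-demo ()
  "Return the word at position 2 of a sample buffer and its bounds."
  (with-temp-buffer
    (insert "hello world")
    (goto-char 2)                        ; point is inside "hello"
    (list (thing-at-point 'word t)       ; "hello"
          (bounds-of-thing-at-point 'word)))) ; (1 . 6)

(defun my/url-at-point-demo ()
  "Return the URL under point in a sample buffer and its bounds."
  (with-temp-buffer
    (insert "See https://www.gnu.org/software/emacs for details")
    (goto-char 15)                       ; point is inside the URL
    (list (thing-at-point 'url t)
          (bounds-of-thing-at-point 'url))))
```

Both functions return nil for the thing and its bounds when point is not on an object of the requested type, so real code should check for that before using the result.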

These functions are more useful than you might think. My init.el has several instances of buffer-substring-no-properties and thing-at-point just for my personal configuration. If you’re not familiar with them, take a look at Chandran’s post.
