More Experiments with Indexing Org Files

A couple of weeks ago I wrote about John Kitchin’s use of SQLite to index his Org files. Kitchin uses Org mode as a centerpiece of his workflow and has about five years of files spread across his local file system, Dropbox, Google Drive, and other places. His idea was to index the files on headline, tags, citations, and a bunch of other fields that you can see by following the above link.
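
To make the idea concrete, here is a minimal sketch of indexing Org headlines and tags in SQLite. It is not Kitchin’s code (his work is done from within Emacs); it is a Python illustration, and the table names, the function names, and the simplified headline regex are all my own assumptions. It ignores citations, properties, and everything else beyond the headline line itself.

```python
import re
import sqlite3

# Simplified Org headline pattern (an assumption, not Org's full grammar):
# leading stars, a title, then optional trailing tags such as :work:emacs:
HEADLINE_RE = re.compile(r"^(\*+)\s+(.*?)(?:\s+(:[\w@#%:]+:))?\s*$")


def parse_headlines(text):
    """Yield (level, title, tags) for each headline in Org-formatted text."""
    for line in text.splitlines():
        m = HEADLINE_RE.match(line)
        if m:
            stars, title, tagstr = m.groups()
            tags = [t for t in tagstr.split(":") if t] if tagstr else []
            yield len(stars), title, tags


def build_index(conn, files):
    """Index a mapping of filename -> Org text. The schema is hypothetical."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS headlines "
        "(id INTEGER PRIMARY KEY, filename TEXT, level INTEGER, title TEXT)"
    )
    conn.execute("CREATE TABLE IF NOT EXISTS tags (headline_id INTEGER, tag TEXT)")
    conn.execute("CREATE INDEX IF NOT EXISTS tags_tag ON tags (tag)")
    for filename, text in files.items():
        for level, title, tags in parse_headlines(text):
            cur = conn.execute(
                "INSERT INTO headlines (filename, level, title) VALUES (?, ?, ?)",
                (filename, level, title),
            )
            conn.executemany(
                "INSERT INTO tags VALUES (?, ?)",
                [(cur.lastrowid, tag) for tag in tags],
            )
    conn.commit()


def find_by_tag(conn, tag):
    """Return (filename, title) pairs for every headline carrying the tag."""
    rows = conn.execute(
        "SELECT h.filename, h.title FROM headlines h "
        "JOIN tags t ON t.headline_id = h.id "
        "WHERE t.tag = ? ORDER BY h.id",
        (tag,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    # Made-up sample content; real use would read the files from disk.
    sample = {
        "journal.org": "* Meeting notes :work:emacs:\nSome body text\n** Plain heading\n",
        "papers.org": "* Paper draft :work:\n",
    }
    conn = sqlite3.connect(":memory:")
    build_index(conn, sample)
    print(find_by_tag(conn, "work"))
```

The point of the sketch is only that, once the headlines are in a database, a tag lookup becomes a single indexed query that spans every file, whether or not the file is on the agenda list.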

Now Kitchin is continuing his experiment by using the NoSQL database MongoDB. It turns out that MongoDB has some advantages but also some disadvantages. Follow the link to see how easy it is to build the database and some strategies for getting the information out again. He continues the experiment in a subsequent post on implementing CRUD operations in MongoDB, which you should also read.

Kitchin’s work on indexing Org files, along with Karl Voit’s approach to the same problem, is really interesting. Those of us who put more and more of our lives and work product in Org files are, sooner or later, going to need something like this. Currently, I rely on tags or, if all else fails, a text search to find the entry I’m looking for, but that works only for agenda files. So unless I’m careful to put a link to my work product in my journal (or one of the other agenda files), searching for tags won’t find it. There is also, I suppose, a scaling problem: once I get enough data, searching for tags or text is going to be too slow.

If you find yourself using Org for a significant part of your workflow, you should read these posts to help you get ready for the day that your own files get too large for easy searching.
