Russ Cox on Globbing

If you’re like me, you probably don’t think about globbing very much. It’s just a way of fuzzy matching file names in a manner reminiscent of regular expression matching. We all use it everyday but have you ever thought about its implementation or efficiency?

Russ Cox, of course, has thought about these issues and has posted a really interesting article with his results. He starts with a simple benchmark that measures the time to complete ls (a*)nb in a directory containing the single file a100 where the exponents represent string repetition. He discovered that this resulted in an exponential run time whereas ls | grep (a*)nb ran in O(n) time.

Of course, globbing is done by the shell so he tried the (non-grep) experiment with different shells. Most shells showed exponential time but csh was linear in n. Cox also discovered that some FTP servers exhibit the same behaviors.

The results are odd because you’d think globbing is a much simpler problem than regex matching. It can be, of course, and Cox gives an example implementation that is both simple and fast.

Be sure to read the article to see all Cox’s results and analysis. As usual with Cox’s posts, it’s really interesting and well worth your time.

UPDATE [2017-05-22 Mon 08:21]: Added link to article.

This entry was posted in General and tagged . Bookmark the permalink.
  • When I read the Bash manual (http://www.tldp.org/LDP/abs/html/globbingref.html) is the first time I saw globbing. It seems crude and incomplete. Wrong. Wrong of course. Globbing is really sweet. Since Bash is /the/ official shell for Linux, it is worth reading the whole thing so it was all fun to read.

  • Kyle946

    Were you planning ever to actually link to the article you mention? Obviously in these Google-ish days, it shouldn't be hard to find, and I suspect it's this: https://research.swtch.com/glob . That said, you mentioned the article was written by Russ Cox, whereas I'm pretty sure that one was written by Matt Damon (https://swtch.com/~rsc/). :-)

    • jcs

      Oops. Sorry. Added the link.