crowdflow.net

In an interesting coda to the Apple consolidated.db story that I blogged about here and here, some German hackers have started crowdflow.net, a Web site dedicated to generating an open source database of WiFi and cell sites similar to those being built by Apple, Google, SkyHook Wireless, and others. They are doing this by asking people to send them the log files from their iPhones. They have a Java applet that you can use to extract the files from your iPhone backup and a file upload widget on their site to send them the extracted files anonymously. Currently, they are accepting only iPhone logs but they plan to include Android devices as well.

So far most of their data is from the EU, particularly Germany. They have some nice visualizations of the data on the crowdflow site. There’s also a blog and a link to download the database they have to date.

I’m not sure how successful this will be long term, especially since Apple will shortly stop including consolidated.db in the iPhone backups, but it’s a nice hack and could prove useful.

Posted in General | Leave a comment

FIFO Queues in Scheme and Lisp

FIFO queues are a standard data structure with many uses, but they can be surprisingly difficult to get right. The standard implementation is an array with two pointers called front and rear.

http://irreal.org/blog/wp-content/uploads/2011/05/wpid-queue.png

It’s the details of maintaining the pointers that cause trouble. An object is added to the queue by advancing the rear pointer—in a circular fashion—and placing the new object where it points. An object is taken off the queue by removing the object pointed to by the front pointer and then advancing the pointer, again in a circular fashion. You also have to worry about what it means when both pointers point to the same place.
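To make the pointer bookkeeping concrete, here is a minimal sketch of a fixed-capacity circular queue in Scheme. It is not from any particular library, and the names (make-ring-queue, ring-push!, ring-pop!, and so on) are made up for illustration. Note one deliberate swap: instead of a separate rear pointer it keeps a front index and a count and derives the rear position from them, which is one common way of sidestepping the "both pointers point to the same place" ambiguity between an empty and a full queue.

```scheme
;; Illustrative sketch only; all names below are hypothetical.
;; The queue is a 3-slot vector: #(buffer front-index element-count).
(define (make-ring-queue capacity)
  (vector (make-vector capacity #f) 0 0))

(define (ring-buffer q) (vector-ref q 0))
(define (ring-front q)  (vector-ref q 1))
(define (ring-count q)  (vector-ref q 2))

(define (ring-empty? q) (= (ring-count q) 0))
(define (ring-full? q)
  (= (ring-count q) (vector-length (ring-buffer q))))

;; Add obj at the rear. The rear slot is (front + count) mod capacity,
;; so it wraps around to the start of the buffer when it runs off the end.
(define (ring-push! q obj)
  (if (ring-full? q)
      (error "ring-push!: queue is full")
      (let* ((buf (ring-buffer q))
             (rear (modulo (+ (ring-front q) (ring-count q))
                           (vector-length buf))))
        (vector-set! buf rear obj)
        (vector-set! q 2 (+ (ring-count q) 1)))))

;; Remove and return the object at the front, then advance the front
;; index in the same circular fashion.
(define (ring-pop! q)
  (if (ring-empty? q)
      (error "ring-pop!: queue is empty")
      (let* ((buf (ring-buffer q))
             (obj (vector-ref buf (ring-front q))))
        (vector-set! q 1 (modulo (+ (ring-front q) 1) (vector-length buf)))
        (vector-set! q 2 (- (ring-count q) 1))
        obj)))
```

With a capacity of 2, pushing a and b, popping once, and then pushing c puts c in slot 0, which is the wrap-around case that tends to cause the trouble.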

In Scheme and Lisp, FIFOs are often implemented with lists by manipulating the cdr of the last element to push a new object onto the queue. This is fast and direct but you still have to maintain the rear pointer. There’s a nice method of implementing FIFO queues that I learned from Programming Praxis. The idea is that the queue is made up of two lists, front and back. You add objects to the queue by consing them onto the front of the back list, and you remove objects from the queue by popping them off the front of the front list. If the front list goes empty, you set it to the reverse of the back list and set the back list to the empty list.

Here’s an implementation of that idea in Scheme. A Lisp implementation is very similar.

 1:  ;;;
 2:  ;;; Queue object
 3:  ;;;
 4:  
 5:  ;; 'push x: push x onto the rear of the queue
 6:  ;; 'pop: remove the head of the queue and return it
 7:  ;; 'peek: return the head of the queue
 8:  ;; 'show: show the queue's contents
 9:  ;; 'fb: show the front and back parts of the queue (for debugging)
10:  (define make-queue
11:    (lambda ()
12:      (let ((front '()) (back '()))
13:        (lambda (cmd . data)
14:          (define exchange
15:            (lambda ()
16:              (set! front (reverse back))
17:              (set! back '())))
18:          (case cmd
19:            ((push) (push (car data) back))
20:            ((pop) (or (pop front)
21:                       (begin
22:                         (exchange)
23:                         (pop front))))
24:            ((peek) (unless (pair? front)
25:                      (exchange))
26:                    (car front))
27:            ((show) (format #t "~s\n" (append front (reverse back))))
28:            ((fb) (format #t "front: ~s, back: ~s\n" front back))
29:            (else (error "Illegal cmd to queue object" cmd)))))))

Calling make-queue returns a closure that implements the queue. You can push and pop objects onto and off of the queue by sending it the push or pop command.

(define q (make-queue))
(q 'push "Hello, World!")
(q 'pop) → "Hello, World!"

As you can see on line 19, the push command merely pushes the object onto the front of the back list. The pop command on lines 20–23 just gets the first object on the front list. If the front list is empty, exchange (line 14) is called to set the front list to the reverse of the back list.

There are a few other commands mostly for debugging purposes. This is a tremendously helpful trick that I use all the time—the make-queue function is part of my standard library.
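For readers whose Scheme lacks define-macro, the same two-list idea can be sketched without any macros. This is a variant, not the code above: the names make-simple-queue and the empty? command are invented for the example, and it deliberately signals an error when popping an empty queue instead of returning #f. That choice is an assumption on my part; it lets #f itself be stored in the queue, which the or-based pop cannot distinguish from "nothing there."

```scheme
;; Sketch of a macro-free two-list queue; names are hypothetical.
(define (make-simple-queue)
  (let ((front '()) (back '()))
    ;; When the front list runs dry, replace it with the reversed back list.
    (define (refill!)
      (if (null? front)
          (begin
            (set! front (reverse back))
            (set! back '()))))
    (lambda (cmd . data)
      (case cmd
        ;; New objects are consed onto the back list.
        ((push) (set! back (cons (car data) back)))
        ((empty?) (and (null? front) (null? back)))
        ;; Objects come off the head of the front list, oldest first.
        ((pop)
         (refill!)
         (if (null? front)
             (error "pop: queue is empty")
             (let ((obj (car front)))
               (set! front (cdr front))
               obj)))
        (else (error "Illegal cmd to queue object" cmd))))))

;; Usage: objects come back in the order they went in, even when
;; pushes and pops are interleaved.
(define sq (make-simple-queue))
(sq 'push 1)
(sq 'push 2)
(sq 'pop)     ; 1
(sq 'push 3)
(sq 'push #f)
(sq 'pop)     ; 2
(sq 'pop)     ; 3
(sq 'pop)     ; #f, a stored value rather than an "empty" signal
(sq 'empty?)  ; #t
```

Each object is consed once and reversed at most once, so the cost per operation stays constant on average even though an individual pop occasionally pays for a full reverse.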

Finally, just for completeness, here are the unless, push, and pop macros that the above code uses. These are pretty standard and may even come with your Scheme implementation.

(define-macro (unless pred . body)
  `(if (not ,pred)
       (begin
         ,@body)))

;; Push an object onto a list
(defmacro push (obj lst)
  `(set! ,lst (cons ,obj ,lst)))

;; Pop an object off a list and return the object
(defmacro pop (lst)
  (let ((t (gensym)))
    `(if (null? ,lst)
         #f
         (let ((,t (car ,lst)))
           (set! ,lst (cdr ,lst))
           ,t))))
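If your Scheme has syntax-rules but not define-macro or defmacro, something like the following should behave the same way. This is a sketch under that assumption, and I've named the macros push! and pop! (hypothetical names) so they don't collide with anything an implementation already provides. Because syntax-rules is hygienic, no gensym is needed for the temporary.

```scheme
;; Hygienic equivalents of the push and pop macros (hypothetical names).

;; Push an object onto the list held in the variable lst.
(define-syntax push!
  (syntax-rules ()
    ((_ obj lst) (set! lst (cons obj lst)))))

;; Pop an object off the list held in lst and return it,
;; or return #f if the list is empty.
(define-syntax pop!
  (syntax-rules ()
    ((_ lst)
     (if (null? lst)
         #f
         (let ((head (car lst)))
           (set! lst (cdr lst))
           head)))))
```

As with the originals, both macros expect a variable name for lst, since they expand into set!.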

Posted in Programming | Tagged | 3 Comments

Just. Kill. Me. Now.

I’m currently serving as Technical Editor for a book by one of my publisher’s sister imprints. It’s interesting work and I enjoy it. What I don’t enjoy is working with Word and its clones.

This imprint insists on .doc files from their authors and then uses those files for the entire production cycle. This makes a lot of sense for them. The book I’m working on is one of a series of similarly themed books and using Word allows them to use a style sheet to enforce the style of the series. It helps in production too. The copy editors, technical editors, and others make their changes directly to the file with tracking on. When it gets back to the author he can accept or reject each change just by clicking a button. When he’s finished it goes to production for printing and they can simply export the file to PDF and send it to one of their vendors for printing.

As I’ve written elsewhere, I find it almost physically painful to use Word or any word processor. I keep thinking about how much I hate using the word processor instead of thinking about what I’m writing. Of course, that doesn’t apply so much here because I’m just correcting technical errors and putting in notes to the author. Still, NeoOffice[1], my Word clone of choice, insists on nagging me about things and fixing up what it thinks are my mistakes. Want to start a sentence with “IN is a …”? NeoOffice will decide you meant “In” and fix it for you. It tries to anticipate what word you’re typing and helpfully puts its tentative completed word up for you to accept. That’s great but then I have a hard time seeing what I’ve typed so far when it guesses wrong. Sometimes when I type -- it turns it into an em-dash; sometimes it doesn’t and I can’t figure out the rule. It appears to me that I’m typing something identical to the time it gave me an em-dash but I get two short dashes instead. And the worst, absolute worst, is dealing with lists. Word and its evil siblings just refuse to do what you want without major effort. It makes me want to stick a pencil in my eye.

I know, I know. You can turn this stuff off but it’s not easy to figure out how. As far as I’m concerned, if there’s not a button to do it, it’s too hard to do. Life is way too short to spend any of it hunting through help files trying to figure out how to get your word processor to mind its own business and just do what you tell it. That’s why I do my writing with Emacs and typeset it with Groff. It’s all text and Emacs and Groff do exactly what I tell them to, not what they think is good for me.

My publisher probably wishes that everyone would use Word as it would make their production cycle much more efficient but happily they continue to indulge me and a few other curmudgeons by allowing us to send them PDF files. It means the copy editors have to break out their red pencils and mark up a paper copy but it makes me so much happier.


Footnotes:

[1] I’m not picking on NeoOffice here; all Word-like word processors exhibit the same sorts of pathological behavior.

Posted in General | Tagged | Leave a comment

A Handy Interactive Unicode Reference

Unicode is a wonderful thing. It gives us every conceivable alphabet and a large set of non-alphabetic symbols as well. The problem for me is that I’m lost as soon as I step outside the familiar ASCII subset. There are just too many symbols to know them all—over 100,000 at this point. The book documenting the last version, Unicode 5, was 1,472 pages long. Mostly, of course, I’m interested in Latin-based alphabets and symbols so that cuts the list down a lot but it’s still large and I don’t use most of them often enough to learn them. That means that I need a reference. Since most of the things I write that use non-ASCII characters end up on the Web, I bookmarked this handy list from Xah Lee.

Today, though, I found a link to this interactive chart of the entire Unicode space. You can click on a category such as ASCII, SYMBOLS, NERD (A and B), CIRCLED, LINES, SHAPES, DINGS, or ASTRO to see some of the common things you might want to use or you can use the three sliders to explore the whole space. I’m not sure that this thing would be practical for day to day use as a reference but it’s fun to play with and if you do need an out of the ordinary glyph, this is a good way of finding it.

So, while I’ll stick to Lee’s chart for most of my work, I’m glad to have found the interactive chart. It sure beats carrying around a 1,400 page book.

Posted in General | Tagged | Leave a comment

Apple Responds

Just as I predicted in my Two Tales post, Apple has responded to the consolidated.db brouhaha. And just as predicted by the serious commenters on the issue, Apple was not tracking users but merely caching data that helps with location services. In fact, the data in consolidated.db is not even cell sites and WiFi hotspots that that particular phone has seen; it’s part of a larger crowd-sourced database that is too large to fit on the phone, so Apple downloads a subset to the phone based on its current location.

Apple does retrieve location data from the phone (that’s how it builds the large crowd-sourced database) but that data is encrypted and anonymized before it’s sent to Apple. Apple says that they cannot tie the data back to the originating phone. Apple says that it is also collecting anonymous traffic data with the goal of building another crowd-sourced database to improve traffic service in the future.

To help assuage customer fears (and doubtless to mollify the politicos trying to make hay with the issue), Apple is promising an iOS update in the next few weeks that will:

  • Reduce the size of the cell site and WiFi hotspot data cached on the phone.
  • Cease backing the cache up.
  • Delete the cache entirely when Location Services is turned off.

Read the whole statement at the above link. Doubtless there will be a few who mumble paranoiacally about cover-ups and lies, but reasonable people will see Apple’s explanation as a straightforward statement of the facts and pretty much what they always expected.

Posted in General | Tagged | Leave a comment

org2blog

Now that I’ve moved my blog to WordPress, I’m simplifying my work flow by using org2blog. It’s still early days—I’ve only published three posts and the about page with it—but so far I like it a lot. I write my posts in Emacs using Org-mode just as I always have but instead of having to export them to HTML, call up the blog, and paste the HTML into the Blogger editor, I just type

C-u M-x org2blog/wp-post-buffer

and org2blog takes care of everything else.

Installation is a snap. I had previously bookmarked this post by Gabriel Saldana and this one by Da Zhang on how to set things up but I didn’t really need them; the README had all the information. Here’s how to get going in 4 easy steps:

  1. Download org2blog from GitHub:
    git clone http://github.com/punchagan/org2blog.git
  2. If you don’t already have it, get xml-rpc from Launchpad.
  3. Set your load path to point at these.
  4. Follow the instructions in the README.org that comes with org2blog to configure your .emacs or init.el.

That’s all there is to it.

Posted in Blogging | Tagged | Leave a comment

A Tale of Two Security Scandals

Recently there’s been a lot of buzz about two events on the security/privacy frontier. In the first, researchers Pete Warden and Alasdair Allan discovered that the iPhone maintains a database, consolidated.db, that contains a table of cell sites and WiFi hotspots that the phone has seen. There was a large outcry and the press thought they had a big scandal on their hands: “Apple is secretly tracking their customers,” they shouted. Politicians, of course, couldn’t resist a chance to pander and spoke darkly of “inquiries into the matter.”

The thing is, none of this was news. I remember reading about it a year ago and Apple produced an explanation at the time. Fortunately, the adults quickly asserted themselves, largely calming everyone down. The aforementioned panderers are still demanding (yet another) explanation from Apple so doubtless we’ll hear more shortly, but as things stand now, the whole kerfuffle is a non-event.

The second non-event didn’t get nearly as much press and the press it did get was mainly confined to the geeky corners of the Internet. Dropbox announced that, their privacy standards notwithstanding, they would provide a decrypted copy of your data to law enforcement authorities upon being served a warrant. This certainly isn’t a surprise; after all, one could hardly expect them to do otherwise. Like everyone else they have to obey the law. And really, no one thought they should do anything else. What caused all the fuss was the supposed admission that Dropbox could read your data.

You can sort of understand the consternation because Dropbox has always insisted that your data was encrypted and secure and that not even their employees could read it. But, really, if you’re not my Aunt Millie, what do you make of that statement? Even a moment’s thought tells you that it can’t be literally true. After all, they have to decrypt it to send it to you and the fact that you can read the data from their Web site tells you that they do decrypt it before they send it to you. So, “not even our employees can read it” can only mean that they have procedures in place to prevent unauthorized access to your data. In other words, their security guarantee is something along the lines of, “We encrypt your data so that if someone breaks into our servers they won’t be able to read it, and we have procedures in place to prevent our employees from accessing it.”

For most of us, most of the time, that’s probably good enough. Dropbox is a way of synchronizing computers, after all, and while no one wants their files being read by the world, the majority of what runs through Dropbox probably isn’t sensitive—more likely it’s boring and no one would bother to read it even if they could. Still, there are occasional sensitive documents with information in them that we definitely should protect: Social Security Numbers, bank account information, passwords, proprietary company data and the like. What should we do to protect that data if we’re uncomfortable with the default security from Dropbox?

One nice solution proposed by Russell Ballestrini is to turn your Dropbox into a TrueCrypt volume. Then your data is automatically encrypted and decrypted on your machines so it’s safe in the cloud no matter what Dropbox does. Unfortunately, as the comments to Ballestrini’s post make clear, there are problems with this approach and it probably won’t work for most people. In any event, it’s overkill because, as I said above, we usually aren’t passing around data that anyone but us cares about.

That leaves the occasional sensitive document and for those the answer is easy: simply encrypt them. If you’re using Emacs and deal primarily in text documents, as I do, you can make this basically transparent by using epa (EasyPG) as described here and here. If you use public key encryption and don’t encrypt the key on your computer, the process is absolutely transparent. Of course, not encrypting the key opens up a vulnerability and you would probably want to keep it encrypted at least on any portable computers.

If you regularly deal with non-text documents and the application associated with them doesn’t offer encryption (OpenOffice, Word, and Pages, for example, do offer it) then you can use GnuPG, PGP, or a similar utility. That’s not as convenient, of course, but it probably won’t be an everyday occurrence either.

The bottom line is that if you are going to put sensitive documents in the cloud, then it’s up to you to encrypt them. Depending on a third party to do it for you and then raising a fuss when it turns out they aren’t really secure just doesn’t cut it.

Posted in General | Tagged | Leave a comment

Open for Business at Our New Location

This is the Irreal blog formerly hosted by Blogger at http://irrealblog.blogspot.com/. I was going to move all the old posts over before I started posting here, but I’ve decided to just get things going at Irreal’s new home and move the old posts as I get time. They were all generated out of Org-mode files so I can always just repost them if the import doesn’t work for some reason. In the meantime, the old posts are still at Blogger and I will leave that blog up for the time being at least.

OK, then. Let’s get going…

Posted in Blogging | 2 Comments