The Technical Problems with Apple’s CSAM Scanning

I’ve written several posts (1, 2, 3, 4, 5, 6, 7) about Apple’s planned CSAM scanning project, but those posts have always focused on how Apple was betraying its long-held promise to protect our privacy and not allow random snooping through our iPhones. To me, that’s the most important aspect of the controversy: feeling secure that your smartphone is not collecting information about you and reporting it to the government or anyone else.

This post is a quick look at the other reasons to be against Apple’s plan. I’ve collected them under the rubric of “Technical problems” but, strictly speaking, they aren’t all “technical.”

The first issue is that the matching is based on hashing known CSAM images and comparing those hashes against the hashes of images on the phone. The problem is collisions. Hashing is a way of reducing any sequence of bits, say \(n\) bits long, to a sequence of \(m ≪ n\) bits. By the pigeonhole principle, that necessarily means that more than one sequence of bits will reduce to the same \(m\)-bit value. In terms of Apple’s system, that means that more than one image will have the same hash value.
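To see how unavoidable this is, here’s a toy demonstration. The `toy_hash` function, its 16-bit output, and the random byte strings standing in for images are all my own illustration, not anything Apple ships; Apple’s NeuralHash is a perceptual hash over image content with a much larger output space, but the pigeonhole argument is the same.

```python
import hashlib
import os

def toy_hash(data: bytes, m_bits: int = 16) -> int:
    """Reduce an arbitrary byte string to an m-bit value.

    A stand-in for illustration only; a perceptual hash like NeuralHash
    works on image features, but it still maps a huge input space onto a
    much smaller output space.
    """
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") % (1 << m_bits)

# Hash many random "images" into a 16-bit space and count collisions.
seen = set()
collisions = 0
for _ in range(10_000):
    data = os.urandom(64)        # pretend this is an image
    h = toy_hash(data)
    if h in seen:
        collisions += 1          # two different inputs, same hash value
    seen.add(h)

print(f"collisions among 10,000 inputs: {collisions}")
```

With only \(2^{16}\) possible hash values, hundreds of collisions show up among just ten thousand inputs. Real hashes use far more bits, which makes collisions rare, but it can never make them impossible.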

Hashing algorithms generally deal with these collisions by comparing the two original values to make sure they really match, but that’s not possible here because it’s illegal to have a copy of the actual CSAM image, let alone to download it to billions of iPhones. To a first approximation, Apple treats identical hash values as a match. It protects against ambiguity in two ways:

  1. The number of matches has to exceed a threshold (\(≈30\)) before Apple even looks at the results (see the sketch after this list).
  2. Even when the threshold is exceeded, a human being looks at the potentially matching images to make sure they really are child porn.
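To give a feel for what the threshold in point 1 buys, here’s a rough back-of-the-envelope sketch. The 20,000-photo library, the one-in-a-million per-photo false-match rate, and the independence assumption behind the Poisson approximation are all my own assumptions, not Apple’s published figures; the point is only the shape of the numbers.

```python
from math import exp

def poisson_tail(lam: float, threshold: int, extra_terms: int = 100) -> float:
    """P(X >= threshold) for X ~ Poisson(lam), built term by term so nothing
    overflows; the remainder past extra_terms is negligible for small lam."""
    term = exp(-lam)                 # P(X = 0)
    total = 0.0
    for k in range(1, threshold + extra_terms):
        term *= lam / k              # term is now P(X = k)
        if k >= threshold:
            total += term
    return total

# Hypothetical numbers: 20,000 photos and a one-in-a-million false-match
# rate per photo (assumptions for illustration, not Apple's figures).
n_photos, p_false, threshold = 20_000, 1e-6, 30
expected_false_matches = n_photos * p_false          # 0.02

p_at_least_one = 1 - (1 - p_false) ** n_photos       # roughly 2%
p_exceeds_threshold = poisson_tail(expected_false_matches, threshold)

print(f"P(at least one false match):  {p_at_least_one:.3g}")
print(f"P(30 or more false matches):  {p_exceeds_threshold:.3g}")
```

Under these assumed numbers, a single stray false match somewhere in a large library is plausible (a couple of percent), but thirty of them is vanishingly unlikely. That gap is the work the threshold is doing.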

Apple’s protocol is more robust than I’ve recounted here (see this technical summary and this paper detailing the theory behind it). Among other things, it’s resistant to hackers generating false matches and to pedophiles discovering the offending hashes so they can evade them. In addition, Apple is prevented by technical means from seeing any of the potential matches until the threshold is exceeded. Despite some folks generating images that hash to the same value (1, 2), Apple’s protocol seems to me to be pretty robust against such attacks.
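The “prevented by technical means” part rests on a threshold secret-sharing scheme described in the technical summary: each potential match contributes one share of a key, and the key (and hence the matched images) can only be recovered once enough shares accumulate. The sketch below is a generic Shamir-style illustration of that idea, with names, parameters, and field size of my own choosing, not Apple’s actual construction.

```python
import random

PRIME = 2**127 - 1   # a Mersenne prime; fine for a toy finite field

def make_shares(secret: int, threshold: int, n_shares: int):
    """Split secret so that any `threshold` shares reconstruct it and
    fewer reveal nothing (Shamir's scheme over a prime field)."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(threshold - 1)]
    def f(x: int) -> int:
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, n_shares + 1)]

def reconstruct(shares) -> int:
    """Lagrange interpolation at x = 0 recovers the constant term (the secret)."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

# Toy run: an account-level decryption key is split with threshold 30, and
# each claimed match uploads one share.  With 29 shares the server learns
# nothing; with 30 it can reconstruct the key and decrypt the matches.
key = random.randrange(PRIME)
shares = make_shares(key, threshold=30, n_shares=40)
assert reconstruct(shares[:30]) == key
assert reconstruct(shares[:29]) != key   # 29 shares yield an unrelated value
print("30 shares recover the key; 29 do not")
```

The important property is that 29 shares are information-theoretically useless: they don’t just make reconstruction hard, they leave the key completely undetermined.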

That’s bad news for Apple’s hope of refining and reintroducing their plan. It’s hard to see how anything like the current plan could be made more secure, so any changes are likely to be eyewash.

But the real danger with Apple’s plan is something out of their control: that the machinery for CSAM scanning will be repurposed for other, less savory purposes. Apple, of course, says that they will refuse to let that happen, but the claim is empty. When presented with a court order or a law demanding that they scan for, say, terrorist images or content, they will have no choice but to comply. And that doesn’t begin to address what they will do when China and others bring economic pressure to bear. Apple’s CSAM scanning machinery is easy to adapt to other types of content monitoring. It’s really not something you want on your phone.

This isn’t just paranoid raving. The developers of the algorithm that Apple uses have written a WAPO op-ed in which they say they concluded the algorithm was too dangerous to deploy precisely because of its potential for abuse. If you read none of the other links in this post, you should read that one.

The takeaway from all this is that Apple should, once and for all, abandon this project. The dangers are manifest and the benefits few. Those dealing in CSAM images can easily work around Apple’s scanning, and the rest of us will be paying with our privacy for nothing. Part of the problem is that Apple’s protocol is clever and a nice piece of engineering. Those involved in its development are doubtless in love with the result and will hate to throw it away, but that’s the right thing to do.
