I ran across a couple of nice interview questions and an interesting story over at Tanya Khovanova’s Math Blog. The two questions are:

- Given a list of integers from 1 to n but with one of the integers missing, write an efficient algorithm for finding the missing integer.
- Given a list of integers from 1 to n but with one of the integers missing and another integer repeated, write an efficient algorithm for finding the missing and repeated integers.

In both problems the numbers in the list are in no particular order.

So the challenge is to write the two algorithms in Emacs Lisp. The interviewer expected a *O*(*n log n*) algorithm for time for the second problem. Can you do better?

Don’t go over to Khnovanova’s blog until you have solved the problems because there are spoilers in the comments. After you solve the problem, though, be sure to read her story about “hiring the smartest people in the world.”

Working within Emacs’ integer limitations, here’s an O(n) time, O(1) space solution for number 1. It just sums up the numbers and keeps track of candidates for n.

http://pastebin.com/Ucg2ZUf5

Or even simpler?

http://pastebin.com/GTtTFN5W

That solution requires the whole list exist at once in memory, so it’s not constant space anymore.

Here’s another O(n) time, O(1) space solution for problem 1.

(require 'cl)

(defun find-missing (l)

"Given a list L with the numbers for 1 to N in arbitrary order,

but one number missing (so the length of L is N-1), returns the

missing number."

(let ((len (length l)))

;; The correct sum of [1..N] is N * (N + 1) / 2

(- (/ (* (1+ len) (+ len 2)) 2)

;; substract the actual sum

(reduce #'+ l))))

Oh, it’s actually identical to dmanny’s solution. But it’s DOCUMENTED and thus better. ;-)

The main problem of emacs lisp here is indeed the lack of bignums.

In this circumstances, a “smart” solution is only valid up to list lengths of 13 or 19.

We could try to use rationals to compute f/p instead, but we couldn’t guarantee to stay within fixnum for numerators and denominators on all inputs (it would depend on the order of the numbers and their decomposition in prime factors).

;; Notice: the following missing-and-duplicate-integers functions

;; still use O(log N) space, hidden in the factorial and product

;; bignums (for lists of length>13 (32-bit) or length>19 (64-bit)).

(defun factorial (n)

(loop for i from n downto 1 for p = i then (* p i) finally (return p)))

(defun missing-and-duplicate-integers (lopi)

"

2. Given a list of integers from 1 to n but with one of the integers

missing and another integer repeated, write an efficient algorithm

for finding the missing and repeated integers. The numbers in the

list are in no particular order.

Return: a list (missing duplicate)

"

;; We won't count checking the pre-condition in the time or space complexity:

(assert (every (lambda (i) (and (<= 1 i) (<= i (length lopi)))) lopi))

(assert (= (1- (length lopi)) (length (remove-duplicates lopi))))

(let* ((n (length lopi))

(s (reduce '+ lopi))

(p (reduce '* lopi))

(f (factorial n))

(a (/ (* n (1+ n)) 2)))

(list (/ (* (- a s) f)

(- f p))

(/ (* (- a s) p)

(- f p)))))

(defun missing-and-duplicate-integers (lopi)

"

2. Given a list of integers from 1 to n but with one of the integers

missing and another integer repeated, write an efficient algorithm

for finding the missing and repeated integers. The numbers in the

list are in no particular order.

Return: a list (missing duplicate)

"

(assert (every (lambda (i) (and (<= 1 i) (<= i (length lopi)))) lopi))

(assert (= (1- (length lopi)) (length (remove-duplicates lopi))))

(loop

for i from 1

for e in lopi

for a = i then (+ a i)

for f = i then (* f i)

for s = e then (+ s e)

for p = e then (* p e)

finally (return (list (/ (* (- a s) f)

(- f p))

(/ (* (- a s) p)

(- f p))))))

(list (missing-and-duplicate-integers '(1 2 3 3 4 5 7 8))

(missing-and-duplicate-integers '(1 1 2 3 4 5 6 7))

(missing-and-duplicate-integers '(2 3 4 5 6 7 8 8)))

`((6 3) (8 1) (1 8))`

Mergesorting the list is O(n log(n)) and can take O(1) space. Then just iterate O(n) through the sorted list to find the missing and the duplicate numbers. I’m not seeing in the problem that can get by those complexities.

Although, Emacs Lisp’s mergesort is done in-place, it is not iterative so it’s hard to call it O(1) space.

(defun missdup (numlist)

"Find missing and duplicate values in NUMLIST."

(setq numlist (sort numlist ' delta 1)

(while (> delta 1) ;; Support multiple duplicates.

(add-to-list 'missing n)

(setq delta (1- delta)

n (1+ n)))

(setq n (1+ n)

prev next))

(when (null missing)

(add-to-list 'missing n))

(list 'missing missing 'duplicates duplicates)))

Commenting is broken.

I had used both a greater than and a less than symbol but now the text between the two is deleted, even though I used a

`code`

tag. I’ll leave the missing code as an exercise for the reader.