A Palindrome Predicate Coda

As I wrote yesterday, I upgraded my Common Lisp environment to SBCL 2.0 so of course I wanted to try it out. Writing Lisp programs in the excellent Slime environment that Emacs provides is truly a pleasure that I haven’t enjoyed nearly enough lately.

The natural thing to play around with is the palindrome predicates that I wrote about last month. That code was written in Scheme and produced some surprising results. If you read and enjoyed that post, be sure to check out the comments. Chris Wellons did a deep dive into what was going on in Guile that caused the surprising results. It’s definitely worth reading if only to appreciate Wellons’ virtuosity.

I rewrote the two predicates in Common Lisp and reran the benchmarks from the original post.

(defun palindrome-r-p (str)
  "Recursive version of a palindrome predicate."
  (let ((len (length str)))
    (cond
      ((zerop len) t)
      ((= len 1) t)
      ((not (eq (aref str 0) (aref str (1- len)))) nil)
      (t (palindrome-r-p (subseq str 1 (1- len)))))))

(defun palindrome-i-p (str)
  "Interative version of a palindrome predicate."
  (do ((left 0 (1+ left))
       (right (1- (length str)) (1- right))
       (palindromep t))
      ((or (>= left right) (not palindromep)) palindromep)
    (setf palindromep (eq (aref str left) (aref str right)))))

(defconstant str (make-string 100000 :initial-element #\x))

(defun bench (rcnt f)
  "Run f rcnt times with timing."
  (time
   (do ((i rcnt (1- i)))
       ((zerop i) t)
     (funcall f str))))

with the results

SB-SPROF> (bench 10 #'palindrome-i-p)
Evaluation took:
  0.006 seconds of real time
  0.006490 seconds of total run time (0.006487 user, 0.000003 system)
  100.00% CPU
  18,167,680 processor cycles
  0 bytes consed

T

  SB-SPROF> (bench 10 #'palindrome-r-p)
  Evaluation took:
    46.223 seconds of real time
    46.627848 seconds of total run time (44.819204 user, 1.808644 system)
    [ Run times consist of 14.377 seconds GC time, and 32.251 seconds non-GC time. ]
    100.88% CPU
    129,420,331,778 processor cycles
    99,994,412,160 bytes consed

  T

The timing for the recursive version shows why I ran the loop 10 times instead of 1000 as I did with Guile.

SBCL is an industrial strength Lisp that compiles to native code and has true tail recursion so if you haven’t read Wellons’ commentary to the original post, you might be surprised at the comparative speeds. The problem, of course, is that unlike Guile, Common Lisp always allocates a new sequence for a subsequence and never shares storage with the old sequence. The problem, then, is almost certainly that each (recursive) call to palindrome-r-p is allocating new storage and copying the substring into it.

Of course, one of our cardinal commandments is to avoid statements like the last and invoke the profiler:

CL-USER> (require :sb-sprof)
("SB-SPROF")
CL-USER> (in-package :sb-sprof)
#<PACKAGE "SB-SPROF">

SB-SPROF> (with-profiling (:reset t :sample-interval .00001) (palindrome-r-p str))
Profiler sample vector full (15,506 traces / approximately 499,999 samples), doubling the size
Profiler sample vector full (31,019 traces / approximately 999,997 samples), doubling the size
T
SB-SPROF> (report)

Number of samples:   50000
Sample interval:     0.00001 seconds
Total sampling time: 0.5 seconds
Number of cycles:    0
Sampled threads:
 #<SB-THREAD:THREAD "new-repl-thread" RUNNING {1002BC7883}>

           Self        Total        Cumul
  Nr  Count     %  Count     %  Count     %    Calls  Function
------------------------------------------------------------------------
   1  34735  69.5  34735  69.5  34735  69.5        -  SB-KERNEL:UB32-BASH-COPY
   2  14592  29.2  49975  99.9  49327  98.7        -  SB-KERNEL:VECTOR-SUBSEQ*
   3    647   1.3    647   1.3  49974  99.9        -  "foreign function __pthread_sigmask"
   4      9   0.0  49997 100.0  49983 100.0        -  PALINDROME-R-P
   5      6   0.0      6   0.0  49989 100.0        -  SB-KERNEL:HAIRY-DATA-VECTOR-REF/CHECK-BOUNDS
   6      1   0.0      1   0.0  49990 100.0        -  (SB-VM::OPTIMIZED-DATA-VECTOR-REF CHARACTER)
   7      1   0.0      1   0.0  49991 100.0        -  (FLET SB-THREAD::EXEC :IN SB-KERNEL::POST-GC)
   8      1   0.0      1   0.0  49992 100.0        -  "foreign function pthread_getspecific"
   9      0   0.0  50000 100.0  49992 100.0        -  "Unknown component: #x2276F230"
  10      0   0.0  50000 100.0  49992 100.0        -  SWANK::EVAL-REGION
  11      0   0.0  50000 100.0  49992 100.0        -  (LAMBDA NIL :IN SWANK-REPL::REPL-EVAL)
  12      0   0.0  50000 100.0  49992 100.0        -  SWANK-REPL::TRACK-PACKAGE
  13      0   0.0  50000 100.0  49992 100.0        -  SWANK::CALL-WITH-RETRY-RESTART
  14      0   0.0  50000 100.0  49992 100.0        -  SWANK/SBCL::CALL-WITH-DEBOOTSTRAPPING
  15      0   0.0  50000 100.0  49992 100.0        -  SWANK::CALL-WITH-BUFFER-SYNTAX
  16      0   0.0  50000 100.0  49992 100.0        -  SWANK-REPL::REPL-EVAL
  17      0   0.0  50000 100.0  49992 100.0        -  SIMPLE-EVAL-IN-LEXENV
  18      0   0.0  50000 100.0  49992 100.0        -  EVAL
  19      0   0.0  50000 100.0  49992 100.0        -  SWANK:EVAL-FOR-EMACS
  20      0   0.0  50000 100.0  49992 100.0        -  SWANK::PROCESS-REQUESTS
  21      0   0.0  50000 100.0  49992 100.0        -  (LAMBDA NIL :IN SWANK::HANDLE-REQUESTS)
  22      0   0.0  50000 100.0  49992 100.0        -  SWANK/SBCL::CALL-WITH-BREAK-HOOK
  23      0   0.0  50000 100.0  49992 100.0        -  (FLET SWANK/BACKEND:CALL-WITH-DEBUGGER-HOOK :IN "/Users/jcs/quicklisp/dists/quicklisp/software/slime-v2.22/swank/sbcl.lisp")
  24      0   0.0  50000 100.0  49992 100.0        -  SWANK::CALL-WITH-BINDINGS
  25      0   0.0  50000 100.0  49992 100.0        -  SWANK::HANDLE-REQUESTS
  26      0   0.0  50000 100.0  49992 100.0        -  (FLET SB-UNIX::BODY :IN SB-THREAD::NEW-LISP-THREAD-TRAMPOLINE)
  27      0   0.0  50000 100.0  49992 100.0        -  (FLET "WITHOUT-INTERRUPTS-BODY-4" :IN SB-THREAD::NEW-LISP-THREAD-TRAMPOLINE)
  28      0   0.0  50000 100.0  49992 100.0        -  (FLET SB-THREAD::WITH-MUTEX-THUNK :IN SB-THREAD::NEW-LISP-THREAD-TRAMPOLINE)
  29      0   0.0  50000 100.0  49992 100.0        -  (FLET "WITHOUT-INTERRUPTS-BODY-1" :IN SB-THREAD::CALL-WITH-MUTEX)
  30      0   0.0  50000 100.0  49992 100.0        -  SB-THREAD::CALL-WITH-MUTEX
  31      0   0.0  50000 100.0  49992 100.0        -  SB-THREAD::NEW-LISP-THREAD-TRAMPOLINE
  32      0   0.0  50000 100.0  49992 100.0        -  "foreign function call_into_lisp"
  33      0   0.0  50000 100.0  49992 100.0        -  "foreign function new_thread_trampoline"
  34      0   0.0    649   1.3  49992 100.0        -  "foreign function signal_emulation_wrapper"
  35      0   0.0      3   0.0  49992 100.0        -  "foreign function maybe_gc"
  36      0   0.0      3   0.0  49992 100.0        -  "foreign function interrupt_handle_pending"
------------------------------------------------------------------------
          8   0.0                                     elsewhere
#<CALL-GRAPH 50000 samples {1017C52A13}>

You can consult Section 16.2 of the SBCL manual for the details but what the profiling report says is that 99.9% of the time was spent in subseq and most of that was spent copying data.

The take away from all of this is that if you’re using Lisp and a frequently called function needs to take a substring, you might want to avoid subseq if you can. The other takeaway is that Guile’s implementation of substring with its copy-on-modify semantics is a real win.

This entry was posted in General and tagged , , , . Bookmark the permalink.