As I wrote yesterday, I upgraded my Common Lisp environment to SBCL 2.0 so of course I wanted to try it out. Writing Lisp programs in the excellent Slime environment that Emacs provides is truly a pleasure that I haven’t enjoyed nearly enough lately.
The natural thing to play around with is the palindrome predicates that I wrote about last month. That code was written in Scheme and produced some surprising results. If you read and enjoyed that post, be sure to check out the comments. Chris Wellons did a deep dive into what was going on in Guile that caused the surprising results. It’s definitely worth reading if only to appreciate Wellons’ virtuosity.
I rewrote the two predicates in Common Lisp and reran the benchmarks from the original post.
(defun palindrome-r-p (str) "Recursive version of a palindrome predicate." (let ((len (length str))) (cond ((zerop len) t) ((= len 1) t) ((not (eq (aref str 0) (aref str (1- len)))) nil) (t (palindrome-r-p (subseq str 1 (1- len))))))) (defun palindrome-i-p (str) "Interative version of a palindrome predicate." (do ((left 0 (1+ left)) (right (1- (length str)) (1- right)) (palindromep t)) ((or (>= left right) (not palindromep)) palindromep) (setf palindromep (eq (aref str left) (aref str right))))) (defconstant str (make-string 100000 :initial-element #\x)) (defun bench (rcnt f) "Run f rcnt times with timing." (time (do ((i rcnt (1- i))) ((zerop i) t) (funcall f str))))
with the results
SB-SPROF> (bench 10 #'palindrome-i-p) Evaluation took: 0.006 seconds of real time 0.006490 seconds of total run time (0.006487 user, 0.000003 system) 100.00% CPU 18,167,680 processor cycles 0 bytes consed T SB-SPROF> (bench 10 #'palindrome-r-p) Evaluation took: 46.223 seconds of real time 46.627848 seconds of total run time (44.819204 user, 1.808644 system) [ Run times consist of 14.377 seconds GC time, and 32.251 seconds non-GC time. ] 100.88% CPU 129,420,331,778 processor cycles 99,994,412,160 bytes consed T
The timing for the recursive version shows why I ran the loop 10 times instead of 1000 as I did with Guile.
SBCL is an industrial strength Lisp that compiles to native code and has true tail recursion so if you haven’t read Wellons’ commentary to the original post, you might be surprised at the comparative speeds. The problem, of course, is that unlike Guile, Common Lisp always allocates a new sequence for a subsequence and never shares storage with the old sequence. The problem, then, is almost certainly that each (recursive) call to palindrome-r-p
is allocating new storage and copying the substring into it.
Of course, one of our cardinal commandments is to avoid statements like the last and invoke the profiler:
CL-USER> (require :sb-sprof) ("SB-SPROF") CL-USER> (in-package :sb-sprof) #<PACKAGE "SB-SPROF"> SB-SPROF> (with-profiling (:reset t :sample-interval .00001) (palindrome-r-p str)) Profiler sample vector full (15,506 traces / approximately 499,999 samples), doubling the size Profiler sample vector full (31,019 traces / approximately 999,997 samples), doubling the size T SB-SPROF> (report) Number of samples: 50000 Sample interval: 0.00001 seconds Total sampling time: 0.5 seconds Number of cycles: 0 Sampled threads: #<SB-THREAD:THREAD "new-repl-thread" RUNNING {1002BC7883}> Self Total Cumul Nr Count % Count % Count % Calls Function ------------------------------------------------------------------------ 1 34735 69.5 34735 69.5 34735 69.5 - SB-KERNEL:UB32-BASH-COPY 2 14592 29.2 49975 99.9 49327 98.7 - SB-KERNEL:VECTOR-SUBSEQ* 3 647 1.3 647 1.3 49974 99.9 - "foreign function __pthread_sigmask" 4 9 0.0 49997 100.0 49983 100.0 - PALINDROME-R-P 5 6 0.0 6 0.0 49989 100.0 - SB-KERNEL:HAIRY-DATA-VECTOR-REF/CHECK-BOUNDS 6 1 0.0 1 0.0 49990 100.0 - (SB-VM::OPTIMIZED-DATA-VECTOR-REF CHARACTER) 7 1 0.0 1 0.0 49991 100.0 - (FLET SB-THREAD::EXEC :IN SB-KERNEL::POST-GC) 8 1 0.0 1 0.0 49992 100.0 - "foreign function pthread_getspecific" 9 0 0.0 50000 100.0 49992 100.0 - "Unknown component: #x2276F230" 10 0 0.0 50000 100.0 49992 100.0 - SWANK::EVAL-REGION 11 0 0.0 50000 100.0 49992 100.0 - (LAMBDA NIL :IN SWANK-REPL::REPL-EVAL) 12 0 0.0 50000 100.0 49992 100.0 - SWANK-REPL::TRACK-PACKAGE 13 0 0.0 50000 100.0 49992 100.0 - SWANK::CALL-WITH-RETRY-RESTART 14 0 0.0 50000 100.0 49992 100.0 - SWANK/SBCL::CALL-WITH-DEBOOTSTRAPPING 15 0 0.0 50000 100.0 49992 100.0 - SWANK::CALL-WITH-BUFFER-SYNTAX 16 0 0.0 50000 100.0 49992 100.0 - SWANK-REPL::REPL-EVAL 17 0 0.0 50000 100.0 49992 100.0 - SIMPLE-EVAL-IN-LEXENV 18 0 0.0 50000 100.0 49992 100.0 - EVAL 19 0 0.0 50000 100.0 49992 100.0 - SWANK:EVAL-FOR-EMACS 20 0 0.0 50000 100.0 49992 100.0 - SWANK::PROCESS-REQUESTS 21 0 0.0 50000 100.0 49992 100.0 - (LAMBDA NIL :IN SWANK::HANDLE-REQUESTS) 22 0 0.0 50000 100.0 49992 100.0 - SWANK/SBCL::CALL-WITH-BREAK-HOOK 23 0 0.0 50000 100.0 49992 100.0 - (FLET SWANK/BACKEND:CALL-WITH-DEBUGGER-HOOK :IN "/Users/jcs/quicklisp/dists/quicklisp/software/slime-v2.22/swank/sbcl.lisp") 24 0 0.0 50000 100.0 49992 100.0 - SWANK::CALL-WITH-BINDINGS 25 0 0.0 50000 100.0 49992 100.0 - SWANK::HANDLE-REQUESTS 26 0 0.0 50000 100.0 49992 100.0 - (FLET SB-UNIX::BODY :IN SB-THREAD::NEW-LISP-THREAD-TRAMPOLINE) 27 0 0.0 50000 100.0 49992 100.0 - (FLET "WITHOUT-INTERRUPTS-BODY-4" :IN SB-THREAD::NEW-LISP-THREAD-TRAMPOLINE) 28 0 0.0 50000 100.0 49992 100.0 - (FLET SB-THREAD::WITH-MUTEX-THUNK :IN SB-THREAD::NEW-LISP-THREAD-TRAMPOLINE) 29 0 0.0 50000 100.0 49992 100.0 - (FLET "WITHOUT-INTERRUPTS-BODY-1" :IN SB-THREAD::CALL-WITH-MUTEX) 30 0 0.0 50000 100.0 49992 100.0 - SB-THREAD::CALL-WITH-MUTEX 31 0 0.0 50000 100.0 49992 100.0 - SB-THREAD::NEW-LISP-THREAD-TRAMPOLINE 32 0 0.0 50000 100.0 49992 100.0 - "foreign function call_into_lisp" 33 0 0.0 50000 100.0 49992 100.0 - "foreign function new_thread_trampoline" 34 0 0.0 649 1.3 49992 100.0 - "foreign function signal_emulation_wrapper" 35 0 0.0 3 0.0 49992 100.0 - "foreign function maybe_gc" 36 0 0.0 3 0.0 49992 100.0 - "foreign function interrupt_handle_pending" ------------------------------------------------------------------------ 8 0.0 elsewhere #<CALL-GRAPH 50000 samples {1017C52A13}>
You can consult Section 16.2 of the SBCL manual for the details but what the profiling report says is that 99.9% of the time was spent in subseq
and most of that was spent copying data.
The take away from all of this is that if you’re using Lisp and a frequently called function needs to take a substring, you might want to avoid subseq
if you can. The other takeaway is that Guile’s implementation of substring
with its copy-on-modify semantics is a real win.