As I wrote yesterday, I upgraded my Common Lisp environment to SBCL 2.0 so of course I wanted to try it out. Writing Lisp programs in the excellent Slime environment that Emacs provides is truly a pleasure that I haven’t enjoyed nearly enough lately.
The natural thing to play around with is the palindrome predicates that I wrote about last month. That code was written in Scheme and produced some surprising results. If you read and enjoyed that post, be sure to check out the comments. Chris Wellons did a deep dive into what was going on in Guile that caused the surprising results. It’s definitely worth reading if only to appreciate Wellons’ virtuosity.
I rewrote the two predicates in Common Lisp and reran the benchmarks from the original post.
(defun palindrome-r-p (str) "Recursive version of a palindrome predicate." (let ((len (length str))) (cond ((zerop len) t) ((= len 1) t) ((not (eq (aref str 0) (aref str (1- len)))) nil) (t (palindrome-r-p (subseq str 1 (1- len))))))) (defun palindrome-i-p (str) "Interative version of a palindrome predicate." (do ((left 0 (1+ left)) (right (1- (length str)) (1- right)) (palindromep t)) ((or (>= left right) (not palindromep)) palindromep) (setf palindromep (eq (aref str left) (aref str right))))) (defconstant str (make-string 100000 :initial-element #\x)) (defun bench (rcnt f) "Run f rcnt times with timing." (time (do ((i rcnt (1- i))) ((zerop i) t) (funcall f str))))
with the results
SB-SPROF> (bench 10 #'palindrome-i-p)
Evaluation took:
0.006 seconds of real time
0.006490 seconds of total run time (0.006487 user, 0.000003 system)
100.00% CPU
18,167,680 processor cycles
0 bytes consed
T
SB-SPROF> (bench 10 #'palindrome-r-p)
Evaluation took:
46.223 seconds of real time
46.627848 seconds of total run time (44.819204 user, 1.808644 system)
[ Run times consist of 14.377 seconds GC time, and 32.251 seconds non-GC time. ]
100.88% CPU
129,420,331,778 processor cycles
99,994,412,160 bytes consed
T
The timing for the recursive version shows why I ran the loop 10 times instead of 1000 as I did with Guile.
SBCL is an industrial strength Lisp that compiles to native code and has true tail recursion so if you haven’t read Wellons’ commentary to the original post, you might be surprised at the comparative speeds. The problem, of course, is that unlike Guile, Common Lisp always allocates a new sequence for a subsequence and never shares storage with the old sequence. The problem, then, is almost certainly that each (recursive) call to palindrome-r-p is allocating new storage and copying the substring into it.
Of course, one of our cardinal commandments is to avoid statements like the last and invoke the profiler:
CL-USER> (require :sb-sprof)
("SB-SPROF")
CL-USER> (in-package :sb-sprof)
#<PACKAGE "SB-SPROF">
SB-SPROF> (with-profiling (:reset t :sample-interval .00001) (palindrome-r-p str))
Profiler sample vector full (15,506 traces / approximately 499,999 samples), doubling the size
Profiler sample vector full (31,019 traces / approximately 999,997 samples), doubling the size
T
SB-SPROF> (report)
Number of samples: 50000
Sample interval: 0.00001 seconds
Total sampling time: 0.5 seconds
Number of cycles: 0
Sampled threads:
#<SB-THREAD:THREAD "new-repl-thread" RUNNING {1002BC7883}>
Self Total Cumul
Nr Count % Count % Count % Calls Function
------------------------------------------------------------------------
1 34735 69.5 34735 69.5 34735 69.5 - SB-KERNEL:UB32-BASH-COPY
2 14592 29.2 49975 99.9 49327 98.7 - SB-KERNEL:VECTOR-SUBSEQ*
3 647 1.3 647 1.3 49974 99.9 - "foreign function __pthread_sigmask"
4 9 0.0 49997 100.0 49983 100.0 - PALINDROME-R-P
5 6 0.0 6 0.0 49989 100.0 - SB-KERNEL:HAIRY-DATA-VECTOR-REF/CHECK-BOUNDS
6 1 0.0 1 0.0 49990 100.0 - (SB-VM::OPTIMIZED-DATA-VECTOR-REF CHARACTER)
7 1 0.0 1 0.0 49991 100.0 - (FLET SB-THREAD::EXEC :IN SB-KERNEL::POST-GC)
8 1 0.0 1 0.0 49992 100.0 - "foreign function pthread_getspecific"
9 0 0.0 50000 100.0 49992 100.0 - "Unknown component: #x2276F230"
10 0 0.0 50000 100.0 49992 100.0 - SWANK::EVAL-REGION
11 0 0.0 50000 100.0 49992 100.0 - (LAMBDA NIL :IN SWANK-REPL::REPL-EVAL)
12 0 0.0 50000 100.0 49992 100.0 - SWANK-REPL::TRACK-PACKAGE
13 0 0.0 50000 100.0 49992 100.0 - SWANK::CALL-WITH-RETRY-RESTART
14 0 0.0 50000 100.0 49992 100.0 - SWANK/SBCL::CALL-WITH-DEBOOTSTRAPPING
15 0 0.0 50000 100.0 49992 100.0 - SWANK::CALL-WITH-BUFFER-SYNTAX
16 0 0.0 50000 100.0 49992 100.0 - SWANK-REPL::REPL-EVAL
17 0 0.0 50000 100.0 49992 100.0 - SIMPLE-EVAL-IN-LEXENV
18 0 0.0 50000 100.0 49992 100.0 - EVAL
19 0 0.0 50000 100.0 49992 100.0 - SWANK:EVAL-FOR-EMACS
20 0 0.0 50000 100.0 49992 100.0 - SWANK::PROCESS-REQUESTS
21 0 0.0 50000 100.0 49992 100.0 - (LAMBDA NIL :IN SWANK::HANDLE-REQUESTS)
22 0 0.0 50000 100.0 49992 100.0 - SWANK/SBCL::CALL-WITH-BREAK-HOOK
23 0 0.0 50000 100.0 49992 100.0 - (FLET SWANK/BACKEND:CALL-WITH-DEBUGGER-HOOK :IN "/Users/jcs/quicklisp/dists/quicklisp/software/slime-v2.22/swank/sbcl.lisp")
24 0 0.0 50000 100.0 49992 100.0 - SWANK::CALL-WITH-BINDINGS
25 0 0.0 50000 100.0 49992 100.0 - SWANK::HANDLE-REQUESTS
26 0 0.0 50000 100.0 49992 100.0 - (FLET SB-UNIX::BODY :IN SB-THREAD::NEW-LISP-THREAD-TRAMPOLINE)
27 0 0.0 50000 100.0 49992 100.0 - (FLET "WITHOUT-INTERRUPTS-BODY-4" :IN SB-THREAD::NEW-LISP-THREAD-TRAMPOLINE)
28 0 0.0 50000 100.0 49992 100.0 - (FLET SB-THREAD::WITH-MUTEX-THUNK :IN SB-THREAD::NEW-LISP-THREAD-TRAMPOLINE)
29 0 0.0 50000 100.0 49992 100.0 - (FLET "WITHOUT-INTERRUPTS-BODY-1" :IN SB-THREAD::CALL-WITH-MUTEX)
30 0 0.0 50000 100.0 49992 100.0 - SB-THREAD::CALL-WITH-MUTEX
31 0 0.0 50000 100.0 49992 100.0 - SB-THREAD::NEW-LISP-THREAD-TRAMPOLINE
32 0 0.0 50000 100.0 49992 100.0 - "foreign function call_into_lisp"
33 0 0.0 50000 100.0 49992 100.0 - "foreign function new_thread_trampoline"
34 0 0.0 649 1.3 49992 100.0 - "foreign function signal_emulation_wrapper"
35 0 0.0 3 0.0 49992 100.0 - "foreign function maybe_gc"
36 0 0.0 3 0.0 49992 100.0 - "foreign function interrupt_handle_pending"
------------------------------------------------------------------------
8 0.0 elsewhere
#<CALL-GRAPH 50000 samples {1017C52A13}>
You can consult Section 16.2 of the SBCL manual for the details but what the profiling report says is that 99.9% of the time was spent in subseq and most of that was spent copying data.
The take away from all of this is that if you’re using Lisp and a frequently called function needs to take a substring, you might want to avoid subseq if you can. The other takeaway is that Guile’s implementation of substring with its copy-on-modify semantics is a real win.