2018-10-29

Introduction

Lisp is the family of the most powerful (simple and expressive) programming languages available to date. Scheme is one of its most elegant dialects. (For a fast overview of Scheme see Appendix: Scheme of SICM; for a more detailed study see SICP; for Lisp motivational essays see here and here.)

Scheme is not a common language. This implies fewer available libraries than mainstream choices, and difficulty in interfacing Scheme programs with the code bases of large collaborations. Furthermore, there are different implementations and dialects, leading to a fragmented community. A program written for one implementation is unlikely to work with another. The language standards are themselves controversial, since R6RS broke compatibility with previous versions.

This is a comparison of four Scheme implementations/dialects (Chez 9.5, Chicken 4, Gambit 4.2, Racket 7.1) in view of modern scientific computing needs (see this post for a more general guide). These include purely numerical applications, but also symbolic computations. See scmutils (online README) for an exemplar package that shows the expressiveness of Scheme.

Comparison

For each item the implementations in bold outperform the others. Items are listed roughly from most to least important. Technical details refer to information retrieved in October 2018. Tests were performed under Debian 4.18.10-2 (2018-10-07) x86_64 GNU/Linux.

Speed and reliability

According to these benchmarks the fastest and most reliable Scheme implementations with larger communities are: Chez, Chicken, Gambit, Racket. All are available as Debian packages and are also simple to install from source.

(Bigloo (R5RS) and Larceny (R5RS, R6RS, R7RS) also perform well, but they are not available as Debian packages.)

That said, benchmarks of the kind mentioned above should be taken with a grain of salt. They only consider runtime efficiency (not always optimally, see the Chicken Scheme case) and neglect development time. Python is one of the most popular languages for science, yet it (CPython) can easily perform 100x slower than C. Having a Scheme implementation that is only 1x-10x slower than C/C++ is a significant improvement. It is also important to keep in mind that the variance between programmers can be larger than the variance between languages, suggesting that an appropriate programming style is more important for overall efficiency than the particular language choice (provided the languages are in the same ballpark).

3rd-party libraries and package manager

Foreign Function Interface

The Foreign Function Interface (to call procedures written in C or in languages that obey the same calling conventions as C, and vice versa to call Scheme procedures from C) is important given the lack of native libraries.
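As a minimal sketch of what such an interface looks like, the following binds the C library function cos in Chez Scheme. It assumes a Linux system where the C math library can be loaded as libm.so.6 (as on the Debian setup used here), and the name c-cos is chosen only for illustration. The other implementations offer analogous forms under different names, so this is not portable Scheme.

```scheme
;; Chez Scheme only: load the C math library (assumed to be libm.so.6).
(load-shared-object "libm.so.6")

;; Bind the C function `double cos(double)` to a Scheme procedure.
;; `c-cos` is an illustrative name, not part of any library.
(define c-cos
  (foreign-procedure "cos" (double) double))

(c-cos 0.0)   ; => 1.0
```

The argument and result types have to be declared by hand, which is part of the effort involved in wrapping a larger C library.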

Parallel computing

Debugging and profiling

Source documentation tools

Chez and Gambit are also supported by SLIB, which has a nice inline documentation language, Schmooz.

Community

Editor

That said, all implementations are supported in Emacs.

Standards support

R6RS is controversial and incompatible with R5RS. However, it also introduced useful features (e.g., exceptions, libraries) compared to R5RS.

R7RS-small is the most recent minimalistic Scheme revision, improving on the controversial R6RS.

Discussion

Here we discuss similar implementations in more detail.

Chez and Gambit

Gambit is smaller and simpler, and its documentation is clearer. Chez is larger and its documentation is more extensive (and complemented by the excellent The Scheme Programming Language); however, the chapter about software distribution could profit from an update. Chez has R6RS-compliant library support, but it has limited availability of 3rd-party libraries and no established package manager; Gambit has its own library system and a few 3rd-party libraries are available via semi-official package managers, but their installation is often difficult and several links to documentation are broken.

Working only with Gambit would mostly require writing interfaces to C programs even for the most basic tasks; distribution as shared libraries is nice, but the process requires cumbersome makefiles. Chez is better supported by 3rd-party libraries, and The Scheme Programming Language (written by the author of Chez) specifically features numerical examples. Of the two, Chez seems at present better suited for numerical computing.

Chicken and Racket

Chicken has a reliable package manager, libraries and a helpful, lively community (it is also the only Scheme discussed here whose source files are available in a self-hosted git repository). There are extensive tutorials on several practical aspects of the language, some of them targeted at programmers coming from other languages. The compiler requires experimenting (and can be quite slow with inappropriate options), but it leaves ample room for optimization if needed. Publicly available online benchmarks put Chicken in the ballpark of 10x slower than C, comparable to Racket but with a significantly smaller variance (it is especially convenient when capturing continuations).

Racket has many nice features. Racket's main language is not RnRS-compatible, which is not necessarily an advantage or a disadvantage. It has an excellent package manager and a large library repository. It is perhaps the friendliest implementation for people coming from Python. However, embedding Racket into C is likely to be challenging, should it be needed. Performance has a large variance: use of continuations and I/O operations can slow programs down by a factor of O(10-100). Overall it is placed in the ballpark of 10x slower than C (see also Benchmarks Game).

Conclusions

We compared Chez, Chicken, Gambit and Racket in view of usage in scientific programming. They can be divided into two groups:

Unfortunately the lack of scientific libraries is a serious practical issue and there is no ideal implementation to promptly replace popular languages like Python. Non-negligible effort is likely required to write wrappers to C/C++/Fortran libraries from scratch. The absence of a package manager in Chez and Gambit makes it impractical to use these highly efficient and reliable implementations. Due to Racket's large variance in runtime efficiency (especially when capturing continuations) and in package quality, Chicken seems the most suitable candidate for a Scheme implementation usable in scientific computing tasks.

A significant community effort around Chez (starting with establishing a package manager) is highly desirable. As Chez is being adopted by larger projects (e.g., there are plans to implement Racket itself in Chez), a critical mass of users may be reached.

Other Lisps

An obvious alternative to Scheme as a Lisp dialect is Common Lisp. Common Lisp has excellent, efficient implementations like SBCL (runtime usually of the same order as C, only a small factor slower; same ballpark as Chez Scheme), a large community, and an established package manager (Quicklisp) that gives easy access to an extensive collection of libraries receiving frequent new contributions. Scheme is conceptually simpler than Common Lisp thanks to having only one namespace for functions and variables (among other differences). E.g., Scheme only needs define for both variables and functions; there are no defun, setf, defvar, etc. (but, of course, they can be defined if needed as an extension of Scheme, e.g., in a Domain Specific Language). In particular, Scheme does not need the awkward funcall for function application.
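The single namespace can be illustrated with a short sketch in portable Scheme. The names square, limit and apply-twice are made up for this example. In Common Lisp the body of apply-twice would have to be written as (funcall f (funcall f x)), because a function passed as an argument lives in the variable namespace.

```scheme
;; `define` binds procedures and plain values in the same namespace.
(define (square x) (* x x))
(define limit 10)

;; A procedure received as an argument is called like any other
;; procedure: no funcall is needed.
(define (apply-twice f x)
  (f (f x)))

(apply-twice square 3)       ; => 81
(apply-twice square limit)   ; => 10000
```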

Another Lisp dialect worth considering is Clojure. It is actively maintained, and its community is significantly larger than those around Scheme implementations (although relatively small compared to mainstream languages like Python, it still provides excellent documentation and support). While being a batteries-included modern language, it feels much simpler than Common Lisp. In some respects its design choices also make it more immediate than Scheme; see for instance the implementation of hash tables and sets. Notably, the core language has excellent support for concurrency. The number of native Clojure packages is not competitive with Python, and integration of C/C++ code is not trivial. However, since it runs on the Java Virtual Machine, it is easy to access all libraries available for Java (one of the best supported languages for Machine Learning). Project building and automatic dependency resolution are managed via Leiningen (a build automation tool ideal for medium-sized and large projects). Benchmarks put it in the same ballpark as Java and SBCL (Common Lisp) for runtime performance, while it can easily require a factor of 2 more memory.

Given that Scheme implementations present serious practical difficulties for usage in scientific computing (e.g., as a replacement for typical joint C/C++ and Python use), Clojure seems an excellent, Lisp-based, ready-to-use alternative that still does not deviate too much from Scheme's simplicity.

Updates