Voluntocracy: October 2009

There has been a lot of concern over how to harness the speed potential of multi-core processors without a disruptive reinvention of software engineering. Implicit concurrency is a solution.

The specifications of the algorithmic languages C, C++, OCAML, and Scheme deliberately leave unspecified the order of evaluation of arguments to procedure-calls. If coders are to write programs which display the same behavior when (compiled and) run on different platforms, an easily mastered discipline concerning side-effects of argument expressions must be practiced.

It turns out that this discipline is very similar to that needed in order to write programs which display the same behavior even with concurrent evaluation of arguments (and on different platforms).

Changing from unspecified-order-of-evaluation-of-arguments to concurrent-envaluation-of-arguments in these language specifications would enable them to be implicitly-concurrent:

able to spawn concurrent subtasks at many points in programs,
produce the same result whether run in 1 core (or thread) or 1000,
involve no new language primitives, and
require no complicated code analysis.

Terminology

closed-expression

Let the term closed-expression designate

a Scheme expression and
all state accessible from the value of each free variable in the expression or free in any procedures called from the expression or procedures they call (this can include files).

After the closed-expression is evaluated, the resulting state (or output) is:

the returned value or values,
all state accessible from the value of each free variable in the expression or free in any procedures called from the expression or procedures they call, and
files written by evaluating the expression.

singly-terminate

Let the term singly-terminate designate the condition of the evaluation of a closed-expression that:

returns once and only once and
does not let any continuations escape from the dynamic extent of evaluating the closed expression.

portably-repeatable

Let the term portably-repeatable designate those closed-expressions which singly-terminate and produce identical output for any serial evaluation order allowed by the Scheme report.

The Scheme Reports mandate an unspecified serial evaluation order for arguments to procedure-calls, `let' bindings, `letrec' bindings, `do' bindings, and `map'.

A caveat to portably-repeatable is the non-associativity of floating-point operations. For the extent of this discussion, assume that `+', `*', `-', and `/' with more than 2 arguments are left-associative.

concurrently-repeatable: Let the term concurrently-repeatable designate those closed-expressions which singly-terminate and produce identical output for any concurrent evaluation order where the Scheme report mandates an unspecified serial evaluation order.
Implicitly-Concurrent-Scheme: Let Implicitly-Concurrent-Scheme (ICS) designate an implementation which allows concurrent evaluation anywhere the Scheme report mandates an unspecified serial evaluation order.

Theorems

Theorem 1: All singly-terminating purely functional closed-expressions (including procedures they call) are both portably-repeatable and concurrently-repeatable.

This is because purely functional (sub-)expressions have no side effects; hence they cannot effect the execution of other purely functional (sub-)expressions.

Theorem 2: All concurrently-repeatable closed-expressions are portably-repeatable.

For a given closed-expression, its serial evaluation orders are a subset of its concurrent evaluation orders.

Theorem 3: A single threaded, non-preemptive RnRS (Scheme) implementation is an ICS implementation.

This is because concurrently-repeatable implies portably-repeatable by Theorem 2.

The Programmer Contract

The programmer who wants to write closed-expressions which have repeatable behavior across platforms must currently write portably-repeatable code (among other constraints).

There are relatively few portably-repeatable expressions which are not also concurrently-repeatable. And it is avoiding these few expressions which represents the change of habit for coders. The few expressions involve the potential for multiple threads to have side-effects on shared state.

Examples of portably-repeatable expressions which are not concurrently-repeatable are symmetric calls of a pseudo-random-number-generator having shared state:

(begin
  (require 'random)
  (let ((rs (make-random-state "seed1")))
    (+ (random 9 rs) (random 9 rs))))

Notice that the expressions:

(- (random 9 rs) (random 9 rs))

and

(+ (random 9 rs) (random 19 rs))

are not portably-repeatable; for asymmetrical functions (or asymmetrical argument expressions), the order of calls to `random' causes different results to be returned.

A concurrently-repeatable version of the random sum can be written:

(begin
  (require 'random)
  (let* ((rs (make-random-state "seed1"))
         (rnd1 (random 9 rs)))
    (+ rnd1 (random 9 rs))))

The SLIB random module detects and signals an error if `random' is called concurrently or reentrantly on the same random-state byte-array. A concurrent implementation can thus sometimes catch concurrently-repeatable violations.

The Scheme Reports also leave the order of evaluation unspecified for `map'. Examples of portably-repeatable `map' expressions which are not concurrently-repeatable have side-effecting (set!) shared-state (cnt) within a `map'ped procedure:

(let ((cnt 0))
  (let ((res (map
              (lambda (x)
                (if (negative? x) (set! cnt (+ 1 cnt)))
                (* x x))
              '(-1 2 -3 4 -5 6))))
    (list res cnt)))

Such code can be made concurrently-repeatable by separating the functional and side-effecting parts:

(let ((cnt 0) (lst '(-1 2 -3 4 -5 6)))
  (for-each (lambda (x)
              (if (negative? x) (set! cnt (+ 1 cnt))))
            lst)
  (list (map
         (lambda (x) (* x x))
         lst)
        cnt))

Or the non-functional part can be made functional:

(let ((lst '(-1 2 -3 4 -5 6)))
  (list (map (lambda (x) (* x x)) lst)
        (apply + (map (lambda (x) (if (negative? x) 1 0))
                      lst))))

Or the functional part can be accumulated in the serial `for-each'. This code is forced to run serially by the `for-each'; so it will not gain a speed benefit from concurrent evaluation:

(let ((cnt 0)
      (lst '(-1 2 -3 4 -5 6))
      (nst '()))
  (for-each (lambda (x)
              (if (negative? x) (set! cnt (+ 1 cnt)))
              (set! nst (cons (* x x) nst)))
            lst)
  (list (reverse nst) cnt))

One could also write a serial `map*' procedure to do the same.

Continuations

This expression is portably-repeatable but not concurrently-repeatable because it would return twice:

(call-with-current-continuation
 (lambda (return)
   (/ (return 0) (return 0))))

This expression is neither portably-repeatable nor concurrently-repeatable because whether it returns 0 or 1 depends on the order of evaluation of arguments:

(call-with-current-continuation
 (lambda (return)
   (/ (return 0) (return 1))))

But escape continuations can be used in concurrently-repeatable ways:

(define (careful-div n d)
  (call-with-current-continuation
   (lambda (return)
     (/ n (if (zero? d) (return #f) d)))))

The Implementation Contract

When encountering an expression (list (u) (v)), a parallelizing RnRS implementation must commence an analysis of u and v and everything they call, possibly including other modules, to determine whether any side effects to shared state might depend on concurrent evaluation order. This is the same difficulty that any parallelizing C or Fortran compiler faces. Such an analysis cannot be exhaustive because there is no guarantee that the analysis will terminate (Halting Problem).

When a parallelizing Implicitly-Concurrent-Scheme compiler or interpreter encounters (list (u) (v)), it immediately knows that it can delegate the computation of (u) and (v) to distinct threads. The Programmer Contract to write concurrently-repeatable code insures that no side-effects to shared state will depend on concurrent evaluation order.

Exceptions

When an exception is raised in one thread evaluating an argument in concurrently-repeatable code, it does not effect the evaluation of the other arguments because its side-effects do not interact with the threads of the other arguments. Whether the handler returns or not determines whether the procedure is applied to the result of those argument evaluations.

Conclusion

Implicitly-Concurrent-Scheme is a minor modification to Scheme Report semantics which allows, but does not require, both interpreters and compilers to simply parallelize the execution of programs. No new primitives are introduced. In order to write ICS programs which produce the same output irrespective of concurrent evaluation order requires a regime very similar to the that required of RnRS programs to produce the same output irrespective of serial evaluation order.

Bibliography

[Feeley]
Marc Feeley, An Efficient and General Implementation of Futures on Large Scale Shared-Memory Multiprocessors, PhD thesis, Brandeis University, April 1993, 213 pages.

It has been my experience that dynamic search-paths are disastrous for libraries on commercial and educational computer networks. The search-paths at these sites seem to only grow; paths to newer versions are placed ahead of older versions. It is not uncommon for half of the directories in a path to refer to non-existent directories or non-functioning host machines! The timeouts from hosed or inaccessible networked-file-systems can balloon into minutes the startup time for programs.

Furthermore, transient router outages can cause incompatible versions of libraries to be loaded which, in the Scheme or Lisp case, may not be noticed until the program has run for hours; but more likely will just cause a mysterious failure.

The SLIB Scheme Library's solution to this problem is to have an explicit command [(require 'new-catalog)], typically run after installations, write to a file an association list of features and resolved pathnames. SLIB sessions read this file and thereafter have instantaneous latency to library paths.

If the network can't reach a particular file, the session fails immediately or after the first timeout rather than conducting covert experiments on library version compatibility.

Voluntocracy

Sunday, October 25, 2009

Implicit Concurrency

Terminology

Theorems

The Programmer Contract

Continuations

The Implementation Contract

Exceptions

Conclusion

Bibliography

Wednesday, October 21, 2009

Saturday, October 17, 2009

Climate Change: Anthropogenesis Doesn't Matter

Monday, October 12, 2009

Biological Space-Filling Curves

Wednesday, October 7, 2009

Search-paths considered harmful

Blog Archive

About Me