Governance by those who do the work.

Sunday, October 25, 2009

Implicit Concurrency

There has been a lot of concern over how to harness the speed potential of multi-core processors without a disruptive reinvention of software engineering. Implicit concurrency is a solution.

The specifications of the algorithmic languages C, C++, OCAML, and Scheme deliberately leave unspecified the order of evaluation of arguments to procedure-calls. If coders are to write programs which display the same behavior when (compiled and) run on different platforms, an easily mastered discipline concerning side-effects of argument expressions must be practiced.

It turns out that this discipline is very similar to that needed in order to write programs which display the same behavior even with concurrent evaluation of arguments (and on different platforms).

Changing from unspecified-order-of-evaluation-of-arguments to concurrent-envaluation-of-arguments in these language specifications would enable them to be implicitly-concurrent:

  • able to spawn concurrent subtasks at many points in programs,
  • produce the same result whether run in 1 core (or thread) or 1000,
  • involve no new language primitives, and
  • require no complicated code analysis.

Terminology

closed-expression
Let the term closed-expression designate
  • a Scheme expression and
  • all state accessible from the value of each free variable in the expression or free in any procedures called from the expression or procedures they call (this can include files).

After the closed-expression is evaluated, the resulting state (or output) is:
  • the returned value or values,
  • all state accessible from the value of each free variable in the expression or free in any procedures called from the expression or procedures they call, and
  • files written by evaluating the expression.

singly-terminate
Let the term singly-terminate designate the condition of the evaluation of a closed-expression that:
  • returns once and only once and
  • does not let any continuations escape from the dynamic extent of evaluating the closed expression.

portably-repeatable
Let the term portably-repeatable designate those closed-expressions which singly-terminate and produce identical output for any serial evaluation order allowed by the Scheme report.

The Scheme Reports mandate an unspecified serial evaluation order for arguments to procedure-calls, `let' bindings, `letrec' bindings, `do' bindings, and `map'.

A caveat to portably-repeatable is the non-associativity of floating-point operations. For the extent of this discussion, assume that `+', `*', `-', and `/' with more than 2 arguments are left-associative.

concurrently-repeatable
Let the term concurrently-repeatable designate those closed-expressions which singly-terminate and produce identical output for any concurrent evaluation order where the Scheme report mandates an unspecified serial evaluation order.


Implicitly-Concurrent-Scheme
Let Implicitly-Concurrent-Scheme (ICS) designate an implementation which allows concurrent evaluation anywhere the Scheme report mandates an unspecified serial evaluation order.

Theorems

Theorem 1
All singly-terminating purely functional closed-expressions (including procedures they call) are both portably-repeatable and concurrently-repeatable.
This is because purely functional (sub-)expressions have no side effects; hence they cannot effect the execution of other purely functional (sub-)expressions.

Theorem 2
All concurrently-repeatable closed-expressions are portably-repeatable.
For a given closed-expression, its serial evaluation orders are a subset of its concurrent evaluation orders.

Theorem 3
A single threaded, non-preemptive RnRS (Scheme) implementation is an ICS implementation.
This is because concurrently-repeatable implies portably-repeatable by Theorem 2.

The Programmer Contract

The programmer who wants to write closed-expressions which have repeatable behavior across platforms must currently write portably-repeatable code (among other constraints).

There are relatively few portably-repeatable expressions which are not also concurrently-repeatable. And it is avoiding these few expressions which represents the change of habit for coders. The few expressions involve the potential for multiple threads to have side-effects on shared state.

Examples of portably-repeatable expressions which are not concurrently-repeatable are symmetric calls of a pseudo-random-number-generator having shared state:

(begin
  (require 'random)
  (let ((rs (make-random-state "seed1")))
    (+ (random 9 rs) (random 9 rs))))

Notice that the expressions:

(- (random 9 rs) (random 9 rs))
and
(+ (random 9 rs) (random 19 rs))
are not portably-repeatable; for asymmetrical functions (or asymmetrical argument expressions), the order of calls to `random' causes different results to be returned.

A concurrently-repeatable version of the random sum can be written:

(begin
  (require 'random)
  (let* ((rs (make-random-state "seed1"))
         (rnd1 (random 9 rs)))
    (+ rnd1 (random 9 rs))))

The SLIB random module detects and signals an error if `random' is called concurrently or reentrantly on the same random-state byte-array. A concurrent implementation can thus sometimes catch concurrently-repeatable violations.

The Scheme Reports also leave the order of evaluation unspecified for `map'. Examples of portably-repeatable `map' expressions which are not concurrently-repeatable have side-effecting (set!) shared-state (cnt) within a `map'ped procedure:

(let ((cnt 0))
  (let ((res (map
              (lambda (x)
                (if (negative? x) (set! cnt (+ 1 cnt)))
                (* x x))
              '(-1 2 -3 4 -5 6))))
    (list res cnt)))

Such code can be made concurrently-repeatable by separating the functional and side-effecting parts:

(let ((cnt 0) (lst '(-1 2 -3 4 -5 6)))
  (for-each (lambda (x)
              (if (negative? x) (set! cnt (+ 1 cnt))))
            lst)
  (list (map
         (lambda (x) (* x x))
         lst)
        cnt))

Or the non-functional part can be made functional:

(let ((lst '(-1 2 -3 4 -5 6)))
  (list (map (lambda (x) (* x x)) lst)
        (apply + (map (lambda (x) (if (negative? x) 1 0))
                      lst))))

Or the functional part can be accumulated in the serial `for-each'. This code is forced to run serially by the `for-each'; so it will not gain a speed benefit from concurrent evaluation:

(let ((cnt 0)
      (lst '(-1 2 -3 4 -5 6))
      (nst '()))
  (for-each (lambda (x)
              (if (negative? x) (set! cnt (+ 1 cnt)))
              (set! nst (cons (* x x) nst)))
            lst)
  (list (reverse nst) cnt))

One could also write a serial `map*' procedure to do the same.

Continuations

This expression is portably-repeatable but not concurrently-repeatable because it would return twice:
(call-with-current-continuation
 (lambda (return)
   (/ (return 0) (return 0))))

This expression is neither portably-repeatable nor concurrently-repeatable because whether it returns 0 or 1 depends on the order of evaluation of arguments:

(call-with-current-continuation
 (lambda (return)
   (/ (return 0) (return 1))))

But escape continuations can be used in concurrently-repeatable ways:

(define (careful-div n d)
  (call-with-current-continuation
   (lambda (return)
     (/ n (if (zero? d) (return #f) d)))))

The Implementation Contract

When encountering an expression (list (u) (v)), a parallelizing RnRS implementation must commence an analysis of u and v and everything they call, possibly including other modules, to determine whether any side effects to shared state might depend on concurrent evaluation order. This is the same difficulty that any parallelizing C or Fortran compiler faces. Such an analysis cannot be exhaustive because there is no guarantee that the analysis will terminate (Halting Problem).

When a parallelizing Implicitly-Concurrent-Scheme compiler or interpreter encounters (list (u) (v)), it immediately knows that it can delegate the computation of (u) and (v) to distinct threads. The Programmer Contract to write concurrently-repeatable code insures that no side-effects to shared state will depend on concurrent evaluation order.

Exceptions

When an exception is raised in one thread evaluating an argument in concurrently-repeatable code, it does not effect the evaluation of the other arguments because its side-effects do not interact with the threads of the other arguments. Whether the handler returns or not determines whether the procedure is applied to the result of those argument evaluations.

Conclusion

Implicitly-Concurrent-Scheme is a minor modification to Scheme Report semantics which allows, but does not require, both interpreters and compilers to simply parallelize the execution of programs. No new primitives are introduced. In order to write ICS programs which produce the same output irrespective of concurrent evaluation order requires a regime very similar to the that required of RnRS programs to produce the same output irrespective of serial evaluation order.

Bibliography

[Feeley]
Marc Feeley,
An Efficient and General Implementation
of Futures on Large Scale Shared-Memory Multiprocessors
,
PhD thesis, Brandeis University, April 1993, 213 pages.

Wednesday, October 21, 2009

 

Noticed these Tricyrtis blooming after our first snowfall.
Posted by Picasa

Saturday, October 17, 2009

Climate Change: Anthropogenesis Doesn't Matter

Climate change deniers express a lot of concern over whether climate change is caused by human activities or not. Implicit in asking whether climate change is anthropogenic is an assumption that if climate change isn't caused by humans, then nothing can or should be done to reverse it.

That assumption is absurd; and only serves to derail discussion of how to combat climate change. To see how absurd it is, consider the case of asteroids. If an asteroid large enough to end life-as-we-know-it were discovered to be on a trajectory to impact the earth, it would not be anthropogenic. Should we then not attempt to deflect it from catastrophe?

The important question is: How can we slow or reverse climate change and the destruction of ocean reefs from increased atmospheric CO2?

Wednesday, October 7, 2009

Search-paths considered harmful

It has been my experience that dynamic search-paths are disastrous for libraries on commercial and educational computer networks. The search-paths at these sites seem to only grow; paths to newer versions are placed ahead of older versions. It is not uncommon for half of the directories in a path to refer to non-existent directories or non-functioning host machines! The timeouts from hosed or inaccessible networked-file-systems can balloon into minutes the startup time for programs.

Furthermore, transient router outages can cause incompatible versions of libraries to be loaded which, in the Scheme or Lisp case, may not be noticed until the program has run for hours; but more likely will just cause a mysterious failure.

The SLIB Scheme Library's solution to this problem is to have an explicit command [(require 'new-catalog)], typically run after installations, write to a file an association list of features and resolved pathnames. SLIB sessions read this file and thereafter have instantaneous latency to library paths.

If the network can't reach a particular file, the session fails immediately or after the first timeout rather than conducting covert experiments on library version compatibility.