Treating integers as two's-complement strings of bits is an arcane but important domain of computer science. It is used for:Many people's first impulse is to conflate integers with bit-vectors. But this is not useful when we realize that functional operations on immutable integers-as-bits are less inefficient than side-effecting operations on large bit-vectors.

- hashing;

- Galois-field[2] calculations of error-detecting and error-correcting codes;

- cryptography and ciphers;

- pseudo-random number generation;

- register-transfer-level modeling of digital logic designs;

- index arithmetic of Fast-Fourier transforms;

- packing and unpacking numbers in persistant data structures;

- space-filling curves with applications to dimension reduction and sparse multi-dimensional database indexes; and

- generating initial values for root-finders and transcendental function algorithms.

Bit-vectors can be used to construct very fast indexes for database tables having millions of records. To keep the working-set small, creation of megabit arrays is done explicitly; the bit-vector procedures accumulate their results into one of the operands. Functional operations on bignums would be impractical for such an application.

Let

*N*

- be the largest number of significant bits in bignum inputs to a bitwise computation; and

*P*

- be the number of binary bitwise operations (LOGAND, LOGOR, ...) in a computation.

*P*bit-vectors (

*a*OR

*b*OR

*c*...). Using (bignum) integers-as-bits,

*P*bignums will be allocated and

*P*-1 bignums will be discarded in the course of the computation. Thus it consumes O(

*PN*) space and time.

Accumulating the result of multiple disjunctions in one bit-vector consumes only O(*N*) storage and O(*PN*) time.

The worst case for bit-vector accumulation is a full binary tree of 2^{q} bit-vectors with *P*=2^{q}-1 binary operations. By reusing temporary bit-vectors, storage can be limited to O(*Nq*)=O(*N*log*P*) and O(*PN*) time.

A clever optimizer could be written to convert functional bitwise expressions into accumulating bit-vector forms, but I haven't found functional bitwise operators to be easier to work with than imperative bit-vector operators for database indexing.

In addition to the generic Array and Array Mapping procedures of SLIB, the following procedures in SCM were written to support the use of bit-vectors as database indexes. This technique works best for indexes having only a few indexed values (with many records per value). One bit-vector is devoted to each indexed value for a field; a 1-bit indicates that the corresponding record has that value in that field. Indexes for values which are sparse can instead be encoded by vectors of integers, each integer indicating the record having that value in that field. Indexes coded in these ways allow fast calculation of complicated queries.

## 5.4.3 Bit Vectors

Bit vectors can be written and read as a sequence of `0`

s and `1`

s prefixed by `#*`

.

#1At(#f #f #f #t #f #t #f) => #*0001010

Some of these operations will eventually be generalized to other uniform-arrays.

__Function:__**bit-count***bool bv*

- Returns the number of occurrences of
`bool`in`bv`.

__Function:__**bit-position***bool bv k*

- Returns the minimum index of an occurrence of
`bool`in`bv`which is at least`k`. If no`bool`occurs within the specified range`#f`

is returned.

__Function:__**bit-invert!***bv*

- Modifies
`bv`by replacing each element with its negation.

__Function:__**bit-set*!***bv uve bool*

- If
`uve`is a bit-vector, then`bv`and`uve`must be of the same length. If`bool`is`#t`

, then`uve`is OR'ed into`bv`; If`bool`is`#f`

, the inversion of`uve`is AND'ed into`bv`.

If

`uve`is a unsigned integer vector, then all the elements of`uve`must be between 0 and the`LENGTH`

of`bv`. The bits of`bv`corresponding to the indexes in`uve`are set to`bool`.

The return value is unspecified.

__Function:__**bit-count****bv uve bool*

- Returns

(bit-count (bit-set*! (if bool bv (bit-invert! bv)) uve #t) #t).

`bv`is not modified.

## No comments:

Post a Comment