

Experimental Datatypes

Data types have been as Lost In Time as programming syntax for the last few decades.

Pick any contemporary programming language at random, and what are you going to find?

Structs, arrays, objects, hashtables.

Yawn.

I think we've mastered those now, and can think about moving on!

I'm particularly interested in exploring "softer" datatypes and algorithms.

For example, the Bloom Filter is interesting. Take (say) a one-megabit bitvector and a hundred hash functions. Enter words into it by hashing each one with all 100 hash functions, setting the corresponding bits to 0 for the even-numbered hash functions and to 1 for the odd-numbered ones. (The classic Bloom Filter sets every addressed bit to 1; this variant's fifty-fifty split is what produces the graded behavior below.) Now, given a query word, hash it all 100 times and check the corresponding bits: if all 100 match, you almost certainly know the word; if about 50 match (the random expectation), you don't know the word; and in between you have different levels of "familiarity" for the word. Neat! We're now getting soft "that seems a bit familiar" responses back instead of just conventional hard Boolean "Definitely yes!" or "Definitely no!" answers. Notice that old values don't get suddenly overwritten; instead, they gradually become less familiar over time as later entries flip their bits. Very humanlike! A very unusual effect by the standards of conventional programming techniques! And extremely space-efficient. Is it coincidence that as our effects get more human-like, they also get more efficient...?
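Here's a minimal sketch of that filter in Python. The SHA-256-derived hash functions and the names enter() and familiarity() are just my illustrative choices, not part of the idea itself:

    import hashlib

    M = 1 << 20   # one-megabit state vector
    K = 100       # number of hash functions

    state = bytearray(M // 8)   # all bits start at 0

    def _hashes(word):
        """Yield K (bit-index, expected-bit) pairs for a word, one per hash
        function: even-numbered functions write 0, odd-numbered write 1."""
        for i in range(K):
            digest = hashlib.sha256(f"{i}:{word}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % M, i & 1

    def enter(word):
        """Store a word by forcing each of its K bits to the expected value."""
        for idx, bit in _hashes(word):
            byte, mask = idx >> 3, 1 << (idx & 7)
            if bit:
                state[byte] |= mask
            else:
                state[byte] &= ~mask

    def familiarity(word):
        """Fraction of the word's K bits that match: ~1.0 = well known,
        ~0.5 = never seen, in between = 'seems a bit familiar'."""
        hits = sum(((state[idx >> 3] >> (idx & 7)) & 1) == bit
                   for idx, bit in _hashes(word))
        return hits / K

Enter "hello" and familiarity("hello") comes back 1.0, while a word never entered scores about 0.5; as later entries flip a known word's bits, its score drifts back down toward 0.5.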

In mathematical essence, we're generating sparse megabit queries and projecting them onto our (nonsparse) state vector. (And thus, given an appropriate programming syntax, we should be able to write the whole thing in a few lines. What would that syntax look like?)
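To make the projection reading concrete, here is the same familiarity score phrased as an inner product, reusing _hashes() and K from the sketch above; the mapping of bits {0,1} to {-1,+1} is my choice of normalization:

    import numpy as np

    def familiarity_as_projection(word, state_bits):
        """The projection view: map bits {0,1} to {-1,+1}, then take a
        normalized inner product between the sparse query pattern and the
        dense state. Returns 2*familiarity(word) - 1, so +1 = fully known
        and ~0 = never seen. state_bits is the state as a numpy 0/1 array
        of length M."""
        idxs, bits = zip(*_hashes(word))
        query = 2 * np.asarray(bits) - 1         # sparse query, entries in {-1,+1}
        values = 2 * state_bits[list(idxs)] - 1  # the state entries it addresses
        return float(query @ values) / K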

The idea can be extended to other kinds of sparse query vectors (scalar or complex entries instead of bits), and we can search for the closest matches among a set of known vectors instead of just projecting onto the state vector. The result can be a very vague, soft notion of finding "similar" situations seen in the past, something at which conventional programming is very bad, which is one reason why conventional programs either run perfectly or stop completely. (3)
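A toy version of that closest-match lookup, assuming dense numpy vectors and cosine similarity as the notion of "similar" (both assumptions are mine; a serious version would want sparse representations and a real index):

    import numpy as np

    def most_similar(query, memory):
        """Soft recall: return the (label, vector) pair from memory whose
        vector is most similar to query, by cosine similarity."""
        def cos(a, b):
            return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
        return max(memory, key=lambda pair: cos(query, pair[1]))

Rather than a hard hit-or-miss lookup, this always returns its best guess, however weak; checking the similarity score against a threshold is where the soft judgment call lives.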

One can extend this idea further by doing hill-climbing learning on the hash functions: evaluate statistically which ones are contributing most and least, and replace the flops with random mutations of the successful ones.
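One possible shape for that hill-climbing step, assuming each hash function is parameterized by an integer seed and that some contribution statistic is maintained elsewhere (both assumptions of this sketch):

    import random

    def evolve_hash_seeds(seeds, scores, cull=10):
        """Replace the cull worst-scoring hash-function seeds with randomly
        mutated copies of the cull best. scores[i] is some running measure
        of how much hash function i contributes to correct answers."""
        ranked = sorted(range(len(seeds)), key=scores.__getitem__)
        for loser, winner in zip(ranked[:cull], ranked[-cull:]):
            seeds[loser] = seeds[winner] ^ random.getrandbits(8)  # small mutation
        return seeds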

For another example, implement a general notion of graph, together with a set of dataflow and network operators for conveniently computing interesting things. Almost any complex dataset can be cast as a graph, and there is a rich body of research literature on graph algorithms which could then be brought to bear. (Engineers have their own set of sparse-matrix algorithms, which are really graph algorithms seen from a different direction. Linear programming is a reasonably close relative as well.)
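The kernel of such a datatype can be tiny. Here's a minimal starting point in Python, with one simple network operator; the class and method names are illustrative only:

    from collections import defaultdict, deque

    class Graph:
        """A minimal general-purpose directed graph as adjacency sets."""
        def __init__(self):
            self.succ = defaultdict(set)

        def link(self, a, b):
            self.succ[a].add(b)

        def reachable(self, start):
            """One simple network operator: the set of nodes reachable
            from start, found breadth-first."""
            seen, frontier = {start}, deque([start])
            while frontier:
                for nxt in self.succ[frontier.popleft()]:
                    if nxt not in seen:
                        seen.add(nxt)
                        frontier.append(nxt)
            return seen

Everything interesting (dataflow evaluation, shortest paths, flows, clustering) would then be layered on as further operators over the same kernel.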

Or take off in some completely different direction! What other life is there beyond the beaten paths of structs, arrays, objects, hashtables and lookup trees? Let's use some imagination!

