The `vm.c' module implements a simple virtual memory manager. Blocks of storage are named by integers, and swapped to and from disk transparently to the rest of the program. Pragmatically, the module is tuned to the assumption that the set of objects being managed consists of many (tens of thousands to millions) small (tens of bytes) objects, with a working set which is small (a few percent) relative to the size of virtual memory, and reasonably static.
Design decisions driven by the above pragmatics include:
     hashtable               bigbuf
    ------------          -------------
    |          |          |  unused   |
    |          |          |-----------|
    |          |          |     0     | <------ (nullBlock)
    |          |          |     0     |
    |----------|          |-----------|
    |          |          |           |
    |   ...    |          |    ...    |
    |          |          |           |
    |----------|          |-----------|
    |    o-------------> |     o     |
    |----------|          |  next o-------+
    |          |          | user data |   |
    |          |          |-----------|   |
    |          |          |           |   |
    |   ...    |          |    ...    |   |
    |          |          |           |   |
    |          |          |-----------|   |
    |          |          |     o     | <-+
    |          |          |  next o-------+
    |          |          | user data |   |
    |          |          |-----------|   |
    |   ...    |          |    ...    |   |
    |          |          |-----------|   |
    |          |          |     o     | <-+
    |          |          |  next o--------
    |          |          | user data |
    |          |          |-----------|
    |          |          |           |
    |          |          |    ...    |
    |          |          |           |

This gives us a fixed overhead of sixteen bytes per in-memory object. (The hashtable vector itself requires four bytes per hashtable chain, of course, which must be amortized over the objects on the chain. To keep performance up, we want to keep the chains short, so this should likely be counted as an additional 2-4 or so bytes of overhead per ram-object.)

Recall that the bottom five bits of Vm_Obj pointers are free for use. This means in particular that the bottom five bits of the 'o' field in our bigbufBlocks are free for use. We always set the low bit to '1', for reasons which will become clear in a paragraph. We use the next bit as a DIRTY flag recording which objects need to be written to disk. (The remaining three bits are available to users via vm_Get_Userbits(o) and vm_Set_Userbits(o,bits)... at the moment only two are actually made available; this needs fixing.)

Since we don't really expect to use a 4Gbyte bigbuf any time soon, we can likewise steal a few bits from the bottom of our bigbufBlock->next field. We steal six bits (thus limiting ourselves to a bigbuf of no more than 64Mbytes --- remember that bigbufBlocks are int-aligned) and use them to hold size-in-bytes information for data blocks up through the 33-64 byte octave. Beyond this octave, we prepend a 64-bit length field to the bigbufBlock, containing fieldlength << 8.
(Since this length field always has the low bit set to 0, and our 'o' fields always have the low bit set to 1, we can still unambiguously step through bigbuf when compacting.) This headerblock scheme seems nearly optimal: Clearly any hashtable design will require a key field and a 'next' field for each object, plus a root array to anchor and index the chains, so we can hardly reduce our overhead much. Mallocs tend to have a similar amount of overhead, which also suggests we're near the practical minimum.
This design is, I hope, fairly immune to pathologically bad behavior; that immunity matters more to me than incrementally better average- or best-case behavior. The design adopted also appears to have near-minimal impact on the design and implementation of the rest of the system.
By the way: `vm.c' should work as a pretty decent compacting memory manager even if you never need to page anything to disk...