The `vm.c' module implements a simple virtual memory manager. Blocks of storage are named by integers and are swapped to and from disk transparently to the rest of the program. Pragmatically, the module is tuned to the assumption that the set of objects being managed consists of many small objects (tens of thousands to millions of them, tens of bytes each), with a working set that is small relative to the size of virtual memory (a few percent) and reasonably static.
The design decisions driven by these pragmatics are described below, starting with the in-memory layout:
 hashtable            bigbuf
------------       -------------
|          |       | unused    |
|          |       |-----------|
|          |       | 0         | <---------- (nullBlock)
|          |       | 0         |           |
|          |       |-----------|           |
|          |       |           |           |
|   ...    |       |    ...    |           |
|          |       |           |           |
|----------|       |-----------|           |
|     o----------> | o         |           |
|----------|       | next o----------      |
|          |       | user data |    |      |
|   ...    |       |-----------|    |      |
|          |       |           |    |      |
|          |       |    ...    |    |      |
|          |       |           |    |      |
|          |       |-----------|    |      |
|          |       | o         | <---      |
                   | next o----------      |
                   | user data |    |      |
                   |-----------|    |      |
                   |           |    |      |
                   |    ...    |    |      |
                   |           |    |      |
                   |-----------|    |      |
                   | o         | <---      |
                   | next o-----------------
                   | user data |
                   |-----------|
                   |           |
                   |    ...    |
                   |           |
This gives us a fixed overhead of sixteen bytes per in-memory
object. (The hashtable vector itself requires four bytes
per hashtable chain, of course, which must be amortized over
the objects on the chain. To keep performance up, we want
to keep the chains short, so this should likely be counted
as an additional 2-4 or so bytes of overhead per
ram-object.)
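The amortized arithmetic above can be checked with a one-line sketch (illustrative code, not from `vm.c'):

```c
#include <assert.h>

/* Per-object memory overhead: a fixed header plus one hashtable slot
 * shared by every object on the same chain.  With 16-byte headers and
 * 4-byte slots, chains averaging one to two objects cost an extra
 * 2-4 bytes per object, as claimed above. */
static double per_object_overhead(double header_bytes,
                                  double slot_bytes,
                                  double avg_chain_length)
{
    return header_bytes + slot_bytes / avg_chain_length;
}
```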
Recall that the bottom five bits of Vm_Obj pointers are
free for use. This means in particular that the bottom
five bits of the 'o' field in our bigbufBlocks are free for
use. We always set the low bit to '1', for reasons which
will become clear in a paragraph. We use the next bit as a
DIRTY flag recording which objects need to be written to
disk. (The remaining three bits are available to users via
vm_Get_Userbits(o) and vm_Set_Userbits(o,bits); at the moment
only two are actually made available, which needs fixing.)
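One plausible rendering of this bit layout in C (the masks, shifts, and lower-case names below are illustrative guesses, not the actual vm.c declarations):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative tag-bit layout for the 'o' field of an in-memory block,
 * following the text above: bit 0 is always 1 for a live object, bit 1
 * is the DIRTY flag, and bits 2-4 are the three user bits. */
#define LIVE_BIT    0x01u
#define DIRTY_BIT   0x02u
#define USER_SHIFT  2
#define USER_MASK   (0x07u << USER_SHIFT)

static uint32_t get_userbits(uint32_t o)                /* cf. vm_Get_Userbits */
{
    return (o & USER_MASK) >> USER_SHIFT;
}

static uint32_t set_userbits(uint32_t o, uint32_t bits) /* cf. vm_Set_Userbits */
{
    return (o & ~USER_MASK) | ((bits & 0x07u) << USER_SHIFT);
}

static int is_dirty(uint32_t o) { return (o & DIRTY_BIT) != 0; }
```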
Since we don't really expect to use a 4Gbyte bigbuf any time
soon, we can likewise steal a few bits from the bottom of
our bigbufBlock->next field. We steal six bits (thus
limiting ourselves to a bigbuf of no more than 64Mbytes ---
remember that bigbufBlocks are int-aligned) and use them to
hold size-in-bytes information for data blocks up through
the 33-64 byte octave. Beyond this octave, we prepend a
64-bit length field to our bigbufBlock, containing
fieldlength << 8. (Since this length field always has the
low bit set to 0, and our 'o' fields always have the low bit
set to 1, we can still unambiguously step through bigbuf
when compacting.)
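A sketch of the two encodings, and of the low-bit test that lets a compaction sweep tell a length word from an 'o' field (bit positions and names are illustrative, not vm.c's actual ones):

```c
#include <assert.h>
#include <stdint.h>

/* Small blocks (up through 64 bytes): size rides in the six low bits
 * stolen from 'next'.  Encoding size-1 lets sizes 1..64 fit in six bits. */
#define SIZE_BITS 6
#define SIZE_MASK ((1u << SIZE_BITS) - 1u)

static uint32_t pack_next(uint32_t next, uint32_t size_bytes)
{
    return (next << SIZE_BITS) | ((size_bytes - 1u) & SIZE_MASK);
}

static uint32_t unpack_size(uint32_t next_field)
{
    return (next_field & SIZE_MASK) + 1u;
}

/* Large blocks: a prepended length word holding length << 8, so its low
 * bit is always 0, while 'o' fields always have the low bit set to 1.
 * A compaction sweep can therefore classify any word it lands on. */
static uint32_t make_length_word(uint32_t nbytes) { return nbytes << 8; }
static uint32_t length_word_bytes(uint32_t word)  { return word >> 8;  }
static int      is_object_word(uint32_t word)     { return (word & 1u) != 0; }
```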
This headerblock scheme seems nearly optimal: Clearly any
hashtable design will require a key field and a 'next' field
for each object, plus a root array to anchor and index the
chains, so we can hardly reduce our overhead much. Mallocs
tend to have a similar amount of overhead, which also
suggests we're near the practical minimum.
This design is, I hope, fairly immune to pathologically bad behavior; that immunity is more important to me than incrementally better average- or best-case behavior. The design adopted also appears to have near-minimal impact on the design and implementation of the rest of the system.
By the way: `vm.c' should work as a pretty decent compacting memory manager even if you never need to page anything to disk...