Danny's Tech: Where West and East Intersect

Wednesday, November 16, 2005

GC and beyond

Last night I was playing with Squeak/Croquet, and it is an impressive tool/environment. However, I noticed that it was very slow to come up. Once it was "alive", the 3D graphics were very responsive on my computer (a top-of-the-line notebook when I bought it 2 months ago), almost like playing a 3D shoot-'em-up game.

I've been thinking about this slowness since then, off and on, and this morning it hit me: why not optimize code behind the scenes, just like GC (garbage collection)? Just as GC automatically reclaims dynamic memory that one's code no longer uses, a background optimizer (how about BGO, since BO has a negative connotation) could watch the execution flow and optimize code while things are running. Unlike FX!32 (which translated and optimized x86 machine code into Alpha machine code), this would look at higher-level usage of the language. [Note that I'm talking about doing it in a VM, or virtual machine, for languages like Smalltalk or Java.]
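To make the idea a bit more concrete, here is a toy sketch in Python. It is not how any real VM does it; the names `BackgroundOptimizer`, `watch`, and `hot_threshold` are made up for illustration, and the "optimization" is just memoization (which is only safe if the function is pure). The point is the shape: count how often something runs, and once it is "hot", swap in a cheaper implementation without the caller noticing.

```python
import functools


class BackgroundOptimizer:
    """Toy stand-in for a VM-level optimizer (hypothetical, for illustration).

    It counts calls to each watched function and, once a function is "hot",
    swaps in a memoized version. Assumes watched functions are pure and take
    hashable positional arguments.
    """

    def __init__(self, hot_threshold=3):
        self.hot_threshold = hot_threshold  # assumed tuning knob
        self.call_counts = {}
        self.optimized = set()

    def watch(self, func):
        name = func.__name__
        self.call_counts[name] = 0
        state = {"impl": func}  # the implementation currently in use

        @functools.wraps(func)
        def wrapper(*args):
            self.call_counts[name] += 1
            is_hot = self.call_counts[name] >= self.hot_threshold
            if is_hot and name not in self.optimized:
                # "Optimize" by caching results; a real optimizer would
                # recompile or specialize the code instead.
                state["impl"] = functools.lru_cache(maxsize=None)(func)
                self.optimized.add(name)
            return state["impl"](*args)

        return wrapper


bgo = BackgroundOptimizer(hot_threshold=3)


@bgo.watch
def slow_square(n):
    # Deliberately wasteful way to compute n * n.
    return sum(n for _ in range(n))


results = [slow_square(1000) for _ in range(5)]
```

A real BGO would do the analysis on a separate thread (like a concurrent GC) instead of inline in the call path, but the swap-when-hot idea is the same.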

The article "Virtual Machines and the OS" talks about BGO in a similar way. I like the idea of being able to optimize the VM itself and even share the optimizations over the network. With open-source sharing of ideas, peer review, and even formal proofs, "correct" optimizations could even be patched into the VM automatically (much like we patch Windows or Linux over the network).

Run-time code optimization is not enough, however. I want something like MTSO (multi-thread synchronization optimizer). In the multi-core world we live in today, you cannot avoid writing multi-threaded programs, which means you have to deal with mutexes (mutual exclusion on shared resources) and with synchronous and/or asynchronous communication between threads. These are the trickiest parts of writing multi-threaded software, and if they could be handled under the covers (and optimized at run time), both maintaining and debugging the code would get easier. Obviously, for a closed system like a game console, this run-time optimization would have to be done before production so that the fully optimized version can be fixed in read-only media (the M in ROM used to mean Memory, but Media is probably more appropriate today).
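Here is a small Python sketch of just the "under the covers" half of that wish. The `Synchronized` wrapper is a hypothetical name, and it does nothing clever: it takes one lock around every method call, so the class being wrapped never mentions a mutex. An actual MTSO would have to go much further and work out which locks are really needed.

```python
import threading


class Synchronized:
    """Hypothetical proxy that serializes every method call on `target`."""

    def __init__(self, target):
        self._target = target
        self._lock = threading.RLock()

    def __getattr__(self, name):
        # Only reached for names not found on the proxy itself.
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def locked(*args, **kwargs):
            with self._lock:
                return attr(*args, **kwargs)

        return locked


class Counter:
    """Plain class written with no thought for threads."""

    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1


counter = Synchronized(Counter())


def worker():
    for _ in range(1000):
        counter.increment()


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = counter.value
```

One coarse lock per object is the safe but slow choice. The optimization I have in mind would be the system noticing at run time where that lock is unnecessary or could be made finer-grained.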

Optimizing for a heterogeneous multiprocessor (or AMP, Asymmetric Multi-Processor) environment would take this to the next level. Some fancy chips like the Cell processor have one main CPU and 8 sub-CPUs or coprocessors. Fortunately, you don't have to use an exotic chip to be in this environment. Most computers today are already AMP, because they contain two distinct kinds of processor: the CPU (central processing unit) and the graphics chip(s). Most graphics chips today are programmable, are generally called GPUs (graphics processing units), and can take on some processing beyond what the CPU does. A BGO could optimize code to run on the GPU's free cycles.

With all this optimization going on, the next level of optimization would be to schedule tasks (programs, OS services, the various optimizers) efficiently. A meta-scheduler, if you will, would look at the run states and adjust priorities and time slices as needed.
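As a rough sketch of what "adjust priorities and time slices" could mean, here is a toy version in Python. Everything in it is an assumption for illustration: the `Task` fields, the `rebalance` function, and the rule itself, which is loosely borrowed from the feedback heuristic in multilevel feedback queue schedulers (tasks that burn their whole slice get a longer slice but lower priority; tasks that barely use theirs get a shorter slice and a priority boost).

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int        # larger means more urgent (assumed convention)
    time_slice_ms: int   # slice granted for the next run
    used_ms: int = 0     # CPU time actually used during the last slice


def rebalance(tasks, min_slice_ms=5, max_slice_ms=100):
    """Adjust each task from its last observed run, then order by priority."""
    for t in tasks:
        if t.used_ms >= t.time_slice_ms:
            # Used the whole slice: looks CPU-bound. Fewer, longer turns.
            t.time_slice_ms = min(max_slice_ms, t.time_slice_ms * 2)
            t.priority = max(0, t.priority - 1)
        elif t.used_ms < t.time_slice_ms // 2:
            # Used under half: looks interactive. Shorter, more urgent turns.
            t.time_slice_ms = max(min_slice_ms, t.time_slice_ms // 2)
            t.priority += 1
    return sorted(tasks, key=lambda t: t.priority, reverse=True)


tasks = [
    Task("optimizer", priority=5, time_slice_ms=20, used_ms=20),
    Task("ui", priority=5, time_slice_ms=20, used_ms=3),
]
run_order = rebalance(tasks)
```

With these numbers the background optimizer ends up behind the UI task, which is the behavior I would want: the optimizers should soak up spare cycles, not compete with whatever the user is doing.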

With virtualization getting so much news coverage, being able to manage the individual partitions automatically would be nice, too.
