I ran Safari with the tracer, which produced 20 MB worth of malloc library info, and then let Apple's malloc compete against the Mullocator on that trace. The results are a bit disappointing: the Mullocator is not twice as fast as Apple's malloc, only 60% faster. That means I will have to look into tweaking some parameters, as twice as fast really should be doable.
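For illustration, here is roughly what such a trace replay can look like. This is only a sketch under assumptions: the `trace_op` record layout, the `allocator` function-pointer table and `replay_trace` are hypothetical names invented for this example, not the tracer's real format or the Mullocator's actual interface. The same recorded operations are fed to each allocator and the CPU time is compared.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical trace record: which call was made, which live block
   ("slot") it refers to, and the requested size. */
enum op_kind { OP_MALLOC, OP_REALLOC, OP_FREE };

struct trace_op {
    enum op_kind kind;
    size_t       slot;   /* index into the table of live pointers */
    size_t       size;   /* ignored for OP_FREE; assumed nonzero otherwise */
};

/* Hypothetical allocator interface, so that two allocators can be
   driven by the same replay loop. */
struct allocator {
    void *(*alloc)(size_t);
    void *(*resize)(void *, size_t);
    void  (*release)(void *);
};

/* The system allocator as one contestant. */
static const struct allocator libc_allocator = { malloc, realloc, free };

/* Replays the trace against one allocator.
   Returns the CPU seconds used, or -1.0 on a bad slot or a failed
   allocation. slots must have n_slots entries, all NULL on entry. */
static double replay_trace(const struct allocator *a,
                           const struct trace_op *ops, size_t n_ops,
                           void **slots, size_t n_slots)
{
    clock_t start = clock();

    for (size_t i = 0; i < n_ops; i++) {
        const struct trace_op *op = &ops[i];
        void *p;

        if (op->slot >= n_slots)
            return -1.0;

        switch (op->kind) {
        case OP_MALLOC:
            p = a->alloc(op->size);
            if (p == NULL)
                return -1.0;
            slots[op->slot] = p;
            break;
        case OP_REALLOC:
            p = a->resize(slots[op->slot], op->size);
            if (p == NULL)
                return -1.0;
            slots[op->slot] = p;
            break;
        case OP_FREE:
            a->release(slots[op->slot]);
            slots[op->slot] = NULL;
            break;
        }
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Running `replay_trace` once with `libc_allocator` and once with a table pointing at the other allocator's entry points gives two times whose ratio is the speedup; "60% faster" would correspond to a ratio of about 1.6, "twice as fast" to 2.0.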
I suspect that I can improve realloc somewhat by writing my own memcpy, as the memcpy routine is apparently never inlined on PPC. I looked around the net and already found an Altivec memcpy. I wonder how much memory one needs to move before it becomes a win, as context switches for otherwise non-Altivec code are bound to become much slower. (The same goes for using doubles in memcpy.)
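One way to approach the "how much memory" question is to simply measure it. The following is a rough sketch, not the Altivec routine mentioned above: `copy_words` is a hypothetical stand-in for a hand-written copy (plain 32-bit words, no vector code), and `seconds_per_copy` and `find_crossover` are names made up for this example. It times the candidate against the library memcpy at doubling sizes and reports the first size at which the candidate wins. Note that it only measures the copy itself; the extra context-switch cost of saving vector or floating-point registers would not show up in a loop like this.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

typedef void (*copy_fn)(void *dst, const void *src, size_t n);

/* Baseline: the library routine behind a common signature. */
static void copy_libc(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Hypothetical hand-written copy: 32-bit words when both pointers are
   4-byte aligned, bytes for the rest. Buffers must not overlap. */
static void copy_words(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if ((((uintptr_t)d | (uintptr_t)s) & 3u) == 0) {
        uint32_t *dw = dst;
        const uint32_t *sw = src;
        size_t words = n / 4;

        for (size_t i = 0; i < words; i++)
            dw[i] = sw[i];
        d += words * 4;
        s += words * 4;
        n -= words * 4;
    }
    while (n--)
        *d++ = *s++;
}

/* Reading one copied byte into a volatile keeps the compiler from
   discarding the copies entirely. */
static volatile unsigned char sink;

/* Average CPU seconds for one copy of n bytes (n must be > 0). */
static double seconds_per_copy(copy_fn fn, void *dst, const void *src,
                               size_t n, long reps)
{
    clock_t start = clock();

    for (long i = 0; i < reps; i++) {
        fn(dst, src, n);
        sink = ((unsigned char *)dst)[n - 1];
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC / (double)reps;
}

/* Returns the smallest tested size (16, 32, 64, ...) at which the
   candidate beat memcpy, or 0 if it never did up to max_size. */
static size_t find_crossover(copy_fn candidate, size_t max_size, long reps)
{
    unsigned char *src = malloc(max_size);
    unsigned char *dst = malloc(max_size);
    size_t result = 0;

    if (src != NULL && dst != NULL && max_size > 0) {
        memset(src, 0xA5, max_size);
        for (size_t n = 16; n <= max_size; n *= 2) {
            double base = seconds_per_copy(copy_libc, dst, src, n, reps);
            double cand = seconds_per_copy(candidate, dst, src, n, reps);
            if (cand < base) {
                result = n;
                break;
            }
        }
    }
    free(src);
    free(dst);
    return result;
}
```

Something like `find_crossover(copy_words, 1 << 20, 100000)` would give a first rough number. The resolution of `clock()` is coarse, so small sizes need many repetitions, and the result will differ between machines and compiler settings.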