Zerofilling - Part IV - New and Improved Application
Here's a new version, quite a bit nicer than the previous version. Here are the highlights:
- mmap/munmap can be used instead of vm_allocate/vm_deallocate for memory allocation (thanks to Jim Magee for this)
- It is more apparent which fields are editable and which aren't.
- There is an option to loop the benchmark to stabilize the values (Apple-L)
- You can have more than one window, but you can't close them :)
The mmap feature is the biggy in this release. Using mmap to allocate
memory is much faster than using vm_allocate, as you will find out. On
my Cube 450, mmap is almost twice as fast as vm_allocate. That is for
pure allocation and deallocation. The actual page fault and zerofill
costs are, from my observations, virtually :) unaffected. So while this
is certainly of big interest, it doesn't in itself change the comments
made in part III.
|1p alloc + fault||41.0||51.5|
|2p alloc + fault||65.9||77.3|
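For the curious, here is a minimal sketch of what an "alloc + fault" measurement on the mmap side could look like. This is not the code from the application: the function names alloc_fault_free and bench_alloc_fault are made up for illustration, the timer (clock_gettime) is my assumption, and the numbers it produces are not comparable with the table above. It only shows the shape of the measurement: map anonymous pages, write to each one to force the fault and zerofill, then unmap.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS and clock_gettime on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

#ifndef MAP_ANON
#define MAP_ANON MAP_ANONYMOUS
#endif

/* Hypothetical helper: map `pages` anonymous pages, touch each one so it
 * is faulted in and zero-filled, then unmap. Returns 0 on success, -1 on
 * failure. */
int alloc_fault_free(size_t pages, size_t page_size)
{
    size_t len = pages * page_size;
    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_ANON | MAP_PRIVATE, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    volatile unsigned char *p = mem;
    for (size_t off = 0; off < len; off += page_size) {
        p[off] = 1;               /* the write forces the page fault */
        assert(p[off + 1] == 0);  /* the rest of a fresh page reads as zero */
    }
    return munmap(mem, len);
}

/* Hypothetical helper: average nanoseconds per alloc + fault + free cycle
 * over `iters` runs. Returns -1.0 on a bad argument or a failed allocation. */
double bench_alloc_fault(size_t pages, int iters)
{
    if (iters <= 0)
        return -1.0;

    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < iters; i++)
        if (alloc_fault_free(pages, page_size) != 0)
            return -1.0;
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / iters;
}
```

Calling bench_alloc_fault(1, 10000) and bench_alloc_fault(2, 10000) would give rough counterparts to the 1p and 2p rows. Looping many times matters for the same reason the application has a loop option: single runs are too noisy.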
If you want to know why mmap is faster, read this excerpt from a mail
from Jim Magee:
The Mach version of the call is an RPC into the kernel targeting whatever task/map happened to be passed in. Almost always, it is the current map. But the API (and the transport mechanism to get to the API) can't assume that. So, it has to do all the dirty work of adding Mach port rights to the destination port for this message, looking up a per-thread reply port and adding a reply port right, formatting and sending a message to that port, validating and atomically translating that destination port send right into a reference on a vm_map_t, and then finally making the vm_allocate() call in the kernel.
Some of the same overhead kicks in on the reply (tear down the
temporary map reference, build a reply message, etc...). This all
adds up to a bit of IPC-related overhead.
When you use the BSD API, it always refers to the map that the current thread is running in. Since each thread already holds a reference against their own address map, there is no need to mess around with trying to add a reference for the duration of this call. Just take the trap arguments, grab the cached reference on the map, and make the call. Copyout a single reply argument and away you go.
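The difference Jim describes is visible in the call signatures themselves. The sketch below puts the two side by side, under a few assumptions: the wrapper names bsd_alloc, bsd_free, mach_alloc and mach_free are invented for illustration, and the Mach half is only compiled on Apple platforms, where mach/mach.h is available. Note how the Mach calls take an explicit task argument (mach_task_self() here), while mmap has no such parameter and always acts on the caller's own map.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#ifdef __APPLE__
#include <mach/mach.h>
#endif

#ifndef MAP_ANON
#define MAP_ANON MAP_ANONYMOUS
#endif

/* BSD path: no task or map argument; the kernel uses the calling
 * thread's own address map. Returns NULL on failure. */
void *bsd_alloc(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_ANON | MAP_PRIVATE, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Returns 0 on success, -1 on failure. */
int bsd_free(void *p, size_t len)
{
    assert(p != NULL);  /* passing NULL here is a caller bug */
    return munmap(p, len);
}

#ifdef __APPLE__
/* Mach path: the target task is an explicit argument, even though it is
 * almost always our own task. Returns NULL on failure. */
void *mach_alloc(size_t len)
{
    vm_address_t addr = 0;
    kern_return_t kr = vm_allocate(mach_task_self(), &addr, len,
                                   VM_FLAGS_ANYWHERE);
    return kr == KERN_SUCCESS ? (void *)addr : NULL;
}

/* Returns 0 on success, -1 on failure. */
int mach_free(void *p, size_t len)
{
    kern_return_t kr = vm_deallocate(mach_task_self(), (vm_address_t)p, len);
    return kr == KERN_SUCCESS ? 0 : -1;
}
#endif
```

Both pairs hand back zero-filled, page-aligned memory, so swapping one for the other in a benchmark only changes the path taken into the kernel, which is exactly the overhead the mail is about.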