mulle-objc: fast methods make mulle_objc_object_call even faster
Continued from mulle-objc: investigating the pros and cons of inlining mulle_objc_object_call.
To speed up the most frequently used methods, mulle-objc uses a per-class, table-based indexing mechanism called _mulle_objc_fastmethodtable, or vtab for short, that is quite similar to the _vptr of C++.
Support for this faster indexing scheme is enabled when you compile with optimization level -O1 or above. At -O1 the following function, mulle_objc_object_constant_methodid_call, is used:
MULLE_C_ALWAYS_INLINE
static inline void   *mulle_objc_object_constant_methodid_call( void *obj,
                                                                mulle_objc_methodid_t methodid,
                                                                void *parameter)
{
   struct _mulle_objc_class   *cls;
   int                        index;

   if( __builtin_expect( ! obj, 0))
      return( obj);

   cls   = _mulle_objc_object_get_isa( obj);
   index = mulle_objc_get_fastmethodtable_index( methodid);
   if( index >= 0)
      return( _mulle_objc_fastmethodtable_invoke( obj, methodid, parameter, &cls->vtab, index));
   return( (*cls->call)( obj, methodid, parameter, cls));
}
The inline function mulle_objc_get_fastmethodtable_index converts a methodid into an index from 0 to 23 when it matches a fast method. For instance, the selector for init has an index of 1. Remember that the compiler knows the selector of the method and that this selector is an integer constant, so the compiler can evaluate this function at compile time. This conversion and the if( index >= 0) test are therefore “for free”.
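As a rough illustration of why the lookup costs nothing at runtime, here is a minimal sketch of such a mapping function. The type, the selector values and the function name are made up for this example; they are not the real mulle-objc declarations.

```c
#include <stdint.h>

/* Hypothetical stand-ins, not the real mulle-objc declarations. */
typedef uint32_t   example_methodid_t;

#define EXAMPLE_ALLOC_METHODID   0x1001u   /* made-up selector values */
#define EXAMPLE_INIT_METHODID    0x1002u

/*
 * Map a selector to its fast method slot, or -1 if it has none.
 * When the argument is a compile-time constant and the function is
 * inlined, an optimizing compiler can fold the whole switch into a
 * single constant.
 */
static inline int   example_get_fastmethodtable_index( example_methodid_t methodid)
{
   switch( methodid)
   {
   case EXAMPLE_ALLOC_METHODID : return( 0);
   case EXAMPLE_INIT_METHODID  : return( 1);
   }
   return( -1);
}
```

Called with a constant such as EXAMPLE_INIT_METHODID, the result is known at compile time, so a following index >= 0 branch can be removed as well.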
Assuming we know that self is not nil and we make a call to -init, this inlined method call simplifies to:
{
   struct _mulle_objc_class   *cls;

   cls = _mulle_objc_object_get_isa( obj);
   return( _mulle_objc_fastmethodtable_invoke( obj, @selector( init), parameter, &cls->vtab, 1));
}
_mulle_objc_fastmethodtable_invoke will then read the appropriate function pointer from the vtab and call the method.
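The table read and call could look roughly like the sketch below. It assumes the table is a plain array of 24 function pointers; the struct layout, the implementation signature and all example_ names are assumptions for illustration, not the actual runtime definitions.

```c
#include <stddef.h>

#define EXAMPLE_N_FASTMETHODS   24

/* hypothetical method implementation signature, modeled on the call above */
typedef void   *(*example_imp_t)( void *obj, unsigned int methodid, void *parameter);

/* assumed layout: one function pointer per fast method index */
struct example_fastmethodtable
{
   example_imp_t   methods[ EXAMPLE_N_FASTMETHODS];
};

static inline void   *example_fastmethodtable_invoke( void *obj,
                                                     unsigned int methodid,
                                                     void *parameter,
                                                     struct example_fastmethodtable *table,
                                                     int index)
{
   example_imp_t   imp;

   imp = table->methods[ index];   /* a single indexed load */
   return( (*imp)( obj, methodid, parameter));
}

/* a trivial method used to exercise the table: returns its receiver */
static void   *example_init( void *obj, unsigned int methodid, void *parameter)
{
   (void) methodid;
   (void) parameter;
   return( obj);
}
```

With a constant index the call site boils down to loading the class, loading one pointer at a fixed offset and jumping to it.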
Calculating the memory overhead
Obviously it’s much faster to call Objective-C methods this way, so why not do
it for all methods ? The answer is space constraints. Due to the nature of
message sending, each class of a class-pair must have a vtab like this.
Furthermore, every class-pair of the runtime must have a compatible vtab.
Lastly, the indexes of the selectors must be common across all classes. That is
quite restrictive.
The overhead per added fast method is therefore sizeof( IMP) * 2 * #classes. Assuming 512 classes on a 64-bit machine, that would be an 8 KB overhead.
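The arithmetic can be checked with a small helper. The function name and its parameters are invented for this sketch; the pointer size is passed in explicitly so the numbers do not depend on the machine the code is compiled on.

```c
#include <stddef.h>

/*
 * Memory used by fast method slots across the whole runtime.
 * imp_size  : size of one function pointer in bytes (8 on 64-bit)
 * n_slots   : number of fast method indexes
 * n_classes : number of class-pairs; the factor 2 accounts for
 *             both classes of each pair having their own table
 */
static size_t   example_fastmethod_overhead( size_t imp_size,
                                             size_t n_slots,
                                             size_t n_classes)
{
   return( imp_size * n_slots * 2 * n_classes);
}
```

For one slot and 512 classes with 8-byte pointers this gives 8192 bytes, the 8 KB mentioned above. Under the same assumptions a full table of 24 slots comes to 196608 bytes, or 192 KB.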
How many classes are there ?
In Xcode 7.2.1, the number of classes with an NS prefix was 489:
grep -Rho "@interface [a-zA-Z]\+" \
   /Applications/Xcode.app/Contents/Developer/Platforms | \
   cut -d" " -f2 | awk '!a[$0]++' | sort > classes.txt
egrep '^NS' classes.txt | wc -l
Choosing fast-method selectors
As we have seen in mulle-objc: some research about selectors, an average Objective-C program uses a small set of selectors a lot and most of the other selectors only seldom:
703,484 class
483,335 release
343,421 retain
300,315 dealloc
299,138 autorelease
294,309 characterAtIndex:
260,189 isEqual:
250,537 hash
228,409 allocWithZone:
197,911 alloc
195,324 count
179,983 length
171,437 objectForKey:
156,440 self
138,796 superview
130,172 init
It therefore makes sense to use these for fast methods. The mulle-objc-runtime predefines indices 0-7 for the following selectors in mulle_objc_fastmethodtable.h:
Index | Selectors |
---|---|
0 | @selector( alloc) |
1 | @selector( init) |
2 | @selector( finalize) |
3 | @selector( dealloc) |
4 | @selector( instantiate) |
5 | @selector( autorelease) |
6 | @selector( retain) |
7 | @selector( release) |
leaving 16 free slots (indices 8-23) for a Foundation or user apps to use.
MulleObjC currently defines indices 8-21 for its own purposes in ns_fastmethodids.h:
Index | Selectors |
---|---|
8 | @selector( length) |
9 | @selector( count) |
10 | @selector( self) |
11 | @selector( hash) |
12 | @selector( nextObject) |
13 | @selector( timeIntervalSinceReferenceDate) |
14 | @selector( lock) |
15 | @selector( unlock) |
16 | @selector( class) |
17 | @selector( isKindOfClass:) |
18 | @selector( objectAtIndex:) |
19 | @selector( characterAtIndex:) |
20 | @selector( methodForSelector:) |
21 | @selector( respondsToSelector:) |
A method like count is really a no-brainer. It is a method that one can
assume is just a simple accessor, and it is called very often. Therefore
the regular Objective-C method call overhead will be a significant part of the
runtime cost of calling count.
A method like timeIntervalSinceReferenceDate is debatable, because it may not
be called often enough. It could also be making a system call, which would
negate the advantage of making this a fast method.
How to use a fast method for your own code
As MulleObjC is greedy, it leaves you with only two indexes: 22 and 23.
Say we want to speed up a method called fastMethod:. It doesn’t matter if it is a class or an instance method. The selector for this method happens to be 0x006f9586.
Now, wherever you call this method, you must define MULLE_OBJC_FASTMETHODHASH_23 ahead of including <mulle_objc/mulle_objc.h>:
#define MULLE_OBJC_FASTMETHODHASH_23 0x006f9586
#include <MulleObjC/MulleObjC>
See: fastmethod.m for another example
Code that forgets to do this will not get the benefits of fast method calls, but that is the only consequence.
It is, however, a serious problem if you have multiple, differing definitions for MULLE_OBJC_FASTMETHODHASH_23 throughout your code.
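One way to guard against differing definitions is to keep the define in a single project header and let the preprocessor complain about any conflict it can see. The following is only a sketch: the file name project_fastmethods.h and the PROJECT_ macro are hypothetical and not part of mulle-objc.

```c
/* project_fastmethods.h (hypothetical file name) */
#ifndef PROJECT_FASTMETHODS_H
#define PROJECT_FASTMETHODS_H

/* selector of fastMethod:, as given above */
#define PROJECT_FASTMETHOD_SELECTOR   0x006f9586

#ifdef MULLE_OBJC_FASTMETHODHASH_23
/* someone defined the slot already: accept only the identical value */
# if MULLE_OBJC_FASTMETHODHASH_23 != PROJECT_FASTMETHOD_SELECTOR
#  error "conflicting definition of MULLE_OBJC_FASTMETHODHASH_23"
# endif
#else
# define MULLE_OBJC_FASTMETHODHASH_23   PROJECT_FASTMETHOD_SELECTOR
#endif

#endif
```

Every source file would then include this header before the MulleObjC include, instead of repeating the #define by hand. The check only catches conflicting definitions made before the header is included.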
Continued in mulle-objc: tagged pointers, boon or bane ?