Giving [] a boost

You can download my test cases. These projects were made with Xcode 1.1.

A Look over the Shoulder

You should have read and understood part 3, Method and function call innards, as I don't duplicate the necessary background information in this article.

In Method and function call innards I wrote under the headline Hidden pitfalls when using IMPs:

An often implemented compromise is to cache the method implementation during the lifetime of the caller, as in the following example.

- (void) operateAnnihilate
{
   SEL  sel;
   IMP  f;
   int  n;
   int  i;
   id   p;

   // resolve objectAtIndex: once, outside of the loop
   sel = @selector( objectAtIndex:);
   f   = [_array methodForSelector:sel];
   n   = [_array count];
   for( i = 0; i < n; i++)
   {
      p = (f)( _array, sel, i);   // call the cached implementation
      [p operate];
      [p annihilate];
   }
}

If it were ensured that only objects of one and the same class are ever stored in _array, it would be possible to optimize this method further by manually resolving not only the objectAtIndex: method address, but also the operate and annihilate method addresses.

By NOT doing this we keep the implementation more general and versatile.

Although the reasoning is OK, the subliminal implication that this method cannot be further optimized is wrong.
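For readers who want to try the caching idea from the quoted passage outside of the Objective-C runtime, here is a minimal sketch in plain C. Everything in it is hypothetical and only stands in for the real thing: lookup() plays the role of methodForSelector:, impl_t plays the role of IMP, and the table with its two functions is made up for illustration. The point is merely that the slow lookup happens once before the loop rather than once per element.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an Objective-C method implementation (IMP). */
typedef int (*impl_t)(int value);

static int twice(int value)  { return value * 2; }
static int negate(int value) { return -value; }

/* Hypothetical dispatch table: selector name -> implementation. */
struct entry { const char *name; impl_t impl; };

static const struct entry table[] = {
    { "twice",  twice  },
    { "negate", negate },
};

/* Slow path: a linear search by name, playing the role of method lookup. */
static impl_t lookup(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
        if (strcmp(table[i].name, name) == 0)
            return table[i].impl;
    return NULL;
}

/* Resolve the implementation once, then reuse it for every element. */
static long sum_twice(const int *values, size_t n)
{
    impl_t f   = lookup("twice");   /* cached for the lifetime of this call */
    long   sum = 0;
    size_t i;

    if (f == NULL)
        return 0;
    for (i = 0; i < n; i++)
        sum += f(values[i]);
    return sum;
}
```

As in the Objective-C version, the cached pointer is only valid as long as the lookup would keep returning the same function, which is why it is kept in a local variable and not stored anywhere longer lived.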

Optimizing this some more

What you can safely do to avoid the shared library stub code is to invoke objc_msgSend through a function pointer.

If you are writing [[NSAutoreleasePool alloc] init], the Objective-C compiler will translate this into

   objc_msgSend( objc_msgSend( objc_getClass( "NSAutoreleasePool"),
                               @selector( alloc)), 
                 @selector( init))

These calls will always go through the dyld stub code. Although this stub code is short, it takes up approximately 20-30% of the time needed for the method call dispatch.

The stub code can be avoided with the use of a function pointer. To write the very common statement [[NSAutoreleasePool alloc] init] utilizing function pointers one would write:

   id      (*p_objc_msgSend)( id p, SEL sel, ...);
   id      (*p_objc_getClass)( const char *s);
   Class   class;
   id      pool;

   // fetch the function addresses once, bypassing the dyld stubs afterwards
   p_objc_getClass = objc_getClass;
   p_objc_msgSend  = objc_msgSend;

   class = (*p_objc_getClass)( "NSAutoreleasePool");
   pool  = (*p_objc_msgSend)( class, @selector( alloc));
   pool  = (*p_objc_msgSend)( pool, @selector( init));

This is not very spectacular for setting up an NSAutoreleasePool, but it comes in handy in loops.

The rewritten operateAnnihilate looks like this:

- (void) operateAnnihilate
{
   SEL   operateSel;
   SEL   annihilateSel;
   SEL   sel;
   IMP   f;
   int   n;
   int   i;
   id    p;
   id    (*p_objc_msgSend)( id p, SEL sel, ...);

   sel = @selector( objectAtIndex:);
   f   = [_array methodForSelector:sel];
   n   = [_array count];

   // fetch the objc_msgSend address and the selectors once, outside of the loop
   p_objc_msgSend = objc_msgSend;
   operateSel     = @selector( operate);
   annihilateSel  = @selector( annihilate);

   for( i = 0; i < n; i++)
   {
      p = (f)( _array, sel, i);
      (*p_objc_msgSend)( p, operateSel);
      (*p_objc_msgSend)( p, annihilateSel);
   }
}

What you gain: Speed
What you lose: Code size increases a bit. Your code starts to depend on runtime mechanics.

In this real-world example the results were 16.7 s with function pointers vs. 18.7 s without. This means a speed gain of roughly 10% for this method when objc_msgSend is called through a function pointer. Unfortunately gcc doesn't optimize empty methods (used for operate and annihilate) that well with -O3, so the pure numbers aren't that impressive.
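If you want to run a comparison of this kind yourself, a timing harness could look roughly like the following sketch in plain C. It is not the test case behind the numbers above. The choice of labs() as the library function and the helper names sum_direct, sum_via_pointer and seconds_for are assumptions made purely for illustration. Depending on compiler, optimization level and linker, the two loops may well compile to the same code, so treat any difference you measure with care.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Sum labs(-i) for i in 0..n-1, calling the library function directly. */
static long sum_direct(long n)
{
    long i;
    long sum = 0;

    for (i = 0; i < n; i++)
        sum += labs(-i);
    return sum;
}

/* Same work, but through a function pointer that is fetched once. */
static long sum_via_pointer(long n)
{
    long (*p_labs)(long) = labs;   /* address resolved before the loop */
    long i;
    long sum = 0;

    for (i = 0; i < n; i++)
        sum += (*p_labs)(-i);
    return sum;
}

/* Run one variant, store its result, and return the CPU time used in seconds. */
static double seconds_for(long (*work)(long), long n, long *result)
{
    clock_t start = clock();

    *result = work(n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call seconds_for( sum_direct, n, &result) and seconds_for( sum_via_pointer, n, &result) with a large n, several times each, and compare the averages. Checking that both variants return the same result guards against measuring a loop the compiler has silently removed.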

Can this be optimized further? Yes, it sure can. Please stand by for the next installment, IMP Cacheing Deluxe.

The Compiler should do that for you

I think that calling objc_msgSend through function pointers is so beneficial that the compiler itself should be able to do that optimization for you. I'd suggest a compiler option like -fobjc-fp-calls or maybe -fdyld-call-mulle-style :). I figure the use of function pointers starts paying off with the third call to any dyld function.

If you want to discuss this article, please do so in this thread in the Mulle kybernetiK Optimization Forum.