Objective-C object allocation

Every Objective-C object has some memory associated with it. Even the smallest possible object is 4 bytes in size (on a 32-bit machine).
So what are those 4 bytes of overhead that each Objective-C object incurs? It is the so-called isa pointer, something C++ coders might associate with the __vtab entry in classes with virtual functions, although technically __vtab and isa differ dramatically. Where C++'s __vtab points to a list of virtual methods, the isa pointer points to the class of the object. Now what does that mean?
Place your doubles smartly
As each object gets an isa pointer in front after allocation, your first instance variable will usually not be 8-byte aligned. For a nice 25% access improvement on your double instance variables, lay out your class like this:
@interface MyClass : NSObject
{
   int      someVariable;   // reserve 4 more bytes
   double   myDouble;       // now 8 byte aligned
}
@end
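The effect of the extra int can be checked with offsetof on plain C structs that mimic the instance layout. This is only a sketch: the struct and function names are made up for illustration, and a uint32_t stands in for the 4-byte isa pointer of the 32-bit machine assumed above. Where the unpadded double ends up depends on the compiler's ABI, so no particular value is claimed for it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the memory layout of an instance on a
   32 bit machine. uint32_t models the 4 byte isa pointer. */
struct unpadded_instance
{
   uint32_t   isa;
   double     myDouble;       /* offset is ABI dependent */
};

struct padded_instance
{
   uint32_t   isa;
   int32_t    someVariable;   /* occupies bytes 4..7 */
   double     myDouble;       /* starts at byte 8 */
};

/* Offset of the double when it directly follows isa. */
size_t   unpadded_double_offset( void)
{
   return( offsetof( struct unpadded_instance, myDouble));
}

/* Offset of the double when an int fills the gap behind isa. */
size_t   padded_double_offset( void)
{
   return( offsetof( struct padded_instance, myDouble));
}
```

Printing both offsets on the target machine shows whether the compiler already pads the double to 8 bytes on its own. If it does, the explicit int merely puts the otherwise wasted 4 bytes to use.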

Understanding classes at runtime

Apple's documentation has become much better since this article was first written. The following section gives just a quick overview of the information necessary for the rest of the article. For in-depth information consult The Objective-C Programming Language, especially The Objective-C Runtime System.

Unlike in C++ and Java, Objective-C's classes are visible and manageable at runtime. That's because at runtime they are objects. But they are objects of a special kind. Although you can put them in NSArrays, get a description of them, or message them in any other way like normal objects, you cannot, for instance, usefully copy or release them. Each class at runtime is actually a pair of objects (instances of struct objc_class, defined in <objc/objc-class.h>). One of these objects maintains the information for instances, the other for the class itself. So -method would be stored in the former and +method in the latter. The former is called the class object and the latter the metaclass object.

Let's examine three classes at runtime: the classes Foo and Bar and the Foundation root class NSObject.

@interface Foo : Bar { } @end
@interface Bar : NSObject { } @end

[Diagram of the class and metaclass objects for Foo, Bar and NSObject. Class objects contain a lot more information, which is not shown here for simplification.]

You see on the left side the class objects and on the right side the metaclass objects.

The superclass pointer of each class object points to the class object a class inherits from. If you follow the superclass pointers from Foo, first you climb up the inheritance tree to Bar, then you reach NSObject and finally nil (since NSObject is a root class). (1)
Since we are dealing with objects here - remember: each class is an object - each of these class objects also has an isa pointer, that points to the metaclass object.

Now the metaclass object is also an object which can be messaged. All metaclass objects point with their isa to NSObject's metaclass, even NSObject's metaclass itself. (2)

As NSObject's metaclass inherits from NSObject, this means that all instance methods of NSObject are available for class objects and metaclass objects.
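The pointer structure described above can be rebuilt as a small C model and walked in code. This is a sketch under simplifying assumptions: struct model_class and superclass_depth are hypothetical names, and the model keeps only the isa and superclass pointers plus a name, whereas the real struct objc_class carries much more.

```c
#include <stddef.h>

/* Simplified, hypothetical model of a class object: only the two
   pointers discussed in the text plus a name for inspection. */
struct model_class
{
   struct model_class   *isa;
   struct model_class   *superclass;
   const char           *name;
};

/* defined below, needed by the root metaclass */
extern struct model_class   NSObject_class;

/* Metaclass objects: every isa points to NSObject's metaclass, and
   NSObject's metaclass inherits from the NSObject class object. */
struct model_class   NSObject_meta  = { &NSObject_meta, &NSObject_class, "NSObject (meta)" };
struct model_class   Bar_meta       = { &NSObject_meta, &NSObject_meta,  "Bar (meta)" };
struct model_class   Foo_meta       = { &NSObject_meta, &Bar_meta,       "Foo (meta)" };

/* Class objects: isa points to the class's own metaclass object. */
struct model_class   NSObject_class = { &NSObject_meta, NULL,            "NSObject" };
struct model_class   Bar_class      = { &Bar_meta,      &NSObject_class, "Bar" };
struct model_class   Foo_class      = { &Foo_meta,      &Bar_class,      "Foo" };

/* Counts the objects on the superclass chain, cls included,
   until nil (NULL) is reached. */
int   superclass_depth( const struct model_class *cls)
{
   int   depth;

   for( depth = 0; cls; cls = cls->superclass)
      depth++;
   return( depth);
}
```

Following superclass from Foo_class visits Foo, Bar and NSObject before hitting NULL. Starting from Foo_meta the chain is one step longer, because NSObject's metaclass has the NSObject class object as its superclass. Following isa instead never terminates, since NSObject_meta.isa points back to NSObject_meta.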

Here is a little Xcode project that was used to establish the previous graphic's validity.

How an object is created

To create an Objective-C object you only have to allocate an appropriate amount of word aligned memory and put the address of a class object into the isa pointer - the first 4 bytes of that memory/object. That's it. Now you can use the address of that memory (the object) as the first argument to method calls and it will work. To show this in a practical example:

#import <Foundation/Foundation.h>

@interface Bar : NSObject
{
}
@end

@implementation Bar
@end

@interface Foo : Bar
{
   int   value;
}
@end

@implementation Foo

- (NSString *) description
{
   return( [NSString stringWithFormat:@"%@ value=%d", [super description], value]);
}

@end

int main( void)
{
   NSAutoreleasePool   *pool;
   // 2 ints = 8 bytes, 1 for isa, 1 for value
   unsigned int        anObject[ 2];

   // need this for NSLog
   pool = [[NSAutoreleasePool alloc] init];

   anObject[ 0] = (int) [Foo class];
   anObject[ 1] = 2;
   NSLog( @"Using anObject as a Foo object: %@", (id) anObject);

   [pool release];
   return( 0);
}

How Foundation objects are created

Object creation in Foundation is usually a three-step process, called alloc-init-autorelease.

A straightforward example of an alloc-init-autorelease call chain for NSCalendarDate is

p = [[[NSCalendarDate alloc] init] autorelease];

which is the same as

NSCalendarDate   *p;

p = [NSCalendarDate alloc];
p = [p init];
[p autorelease];

just all the calls wrapped together. You could also use NSZones for allocation, which does not really change anything. alloc is just the same as allocWithZone:NULL.

[[[NSCalendarDate allocWithZone:someNSZone] init] autorelease];

alloc and allocWithZone: are class methods. Their task is to allocate the memory that an instance is going to need and to set the isa pointer to the class. By placing the address of the class object into the first four bytes of this memory, the mere chunk of memory is transformed into an object of this class, albeit an uninitialized one. Some naive but valid code for an alloc of NSCalendarDate might be

+ alloc
{
   NSObject   *p;

   p = (NSObject *) malloc( sizeof( NSCalendarDate));
   // clear the whole instance, not just the NSObject part
   memset( p, 0, sizeof( NSCalendarDate));
   p->isa = self;
   return( p);
}

The allocWithZone: class method of NSCalendarDate in Foundation though is likely to be coded like this:

+ (id) allocWithZone:(NSZone *) zone
{
   return( NSAllocateObject( self, 0, zone));  // mallocs and sets isa
}

using a Foundation function to do the task (about the same steps as shown in the alloc method). The init instance method of NSCalendarDate could look something like this:

- (id) init
{
   [super init];
   _somestorage = get_local_system_time();  // _somestorage: instance variable
   return( self);
}

These hypothetical allocation routines are just for illustration. Fixed-size classes like NSCalendarDate do not need to implement their own alloc routines; the allocation routines inherited from NSObject suffice.
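The two steps attributed to NSAllocateObject above (get cleared memory, set isa) can be modelled in plain C. This is a sketch and not Foundation's implementation: the sketch_ names and the instance_size field are assumptions made for illustration, and the zone argument is left out.

```c
#include <stdlib.h>

/* Hypothetical, stripped down class object that knows how many bytes
   one of its instances needs (isa included). */
struct sketch_class
{
   struct sketch_class   *isa;
   struct sketch_class   *superclass;
   size_t                instance_size;
};

/* Every instance starts with the isa pointer. */
struct sketch_object
{
   struct sketch_class   *isa;
};

/* Gets zero filled memory for one instance plus extra_bytes behind it,
   then turns the memory into an object by setting isa.
   Returns NULL if the allocation fails. */
void   *sketch_allocate_object( struct sketch_class *cls, size_t extra_bytes)
{
   struct sketch_object   *obj;

   obj = calloc( 1, cls->instance_size + extra_bytes);
   if( ! obj)
      return( NULL);
   obj->isa = cls;
   return( obj);
}
```

Because the size comes from the class object, a single routine like this can serve every fixed-size class, which is why inheriting the allocation code from the root class is enough.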


How Foundation creates objects (subtle difference)

As seen in the last part, Foundation separates allocation and initialization into two steps. Thanks to this separation, the allocation code in most user classes can just be inherited from NSObject, and only the specific initialization code needs to be written.

But what about new?
Although there is a Foundation method called new that does allocation and initialization in one step, this method is little else than a shortcut for writing alloc/init. At one time it had even been considered deprecated, but this does not seem to be the case anymore.

In the end the alloc/init paradigm trades a little bit of performance - two calls instead of one - for a lot more flexibility and convenience. That's the positive outcome.

The negative outcome is that some things look obscure and are hard to implement, as the next example shows:

[[[NSString alloc] initWithCString:"string"] autorelease]

It looks as if the NSString implementation cannot get around two allocation calls. At the time alloc is called, that method has no way of knowing how big the string passed in at init time is going to be. One likely implementation allocates space in the init routine and copies the string there, keeping a reference in the hypothetical instance variable myString. That means one allocation call for the object and one for the string data.

+ (id) alloc
{
   return( NSAllocateObject( self, 0, NULL));
}

static void   copy_unichar_chars( unichar *dst, char *src, unsigned int len)
{
   ...
   ...
}

- (id) initWithCString:(char *) s
{
   [super init];
   myString = malloc( strlen( s) * sizeof( unichar));
   copy_unichar_chars( myString, s, strlen( s));
   return( self);
}

An alternative would be to allocate space behind the object in the NSAllocateObject/NSReallocateObject calls, which allows you to specify some additional memory. Unfortunately we still have two allocations...

+ (id) alloc
{
   return( NSAllocateObject( self, 0, NULL));
}

static void   copy_unichar_chars( unichar *dst, char *src, unsigned int len)
{
   ...
   ...
}

- (id) initWithCString:(char *) s
{
   self = NSReallocateObject( self, strlen( s) * sizeof( unichar), [self zone]);
   [super init];
   copy_unichar_chars( self + 1, s, strlen( s));
   return( self);
}

This is not very efficient. Unfortunately this problem crops up in quite a few other often used classes such as NSDictionary, NSArray, NSNumber, NSData or NSSet (the list is not necessarily complete). It hits exactly those objects that you are allocating often.

A possible way out of that dilemma could be to use "class factory methods" like stringWithCString: or numberWithInt: and hope that Foundation does something better in its implementation than a straightforward

+ genericWithArgument:(id) arg
{
   return( [[[self alloc] initWithArgument:arg] autorelease]);
}

since the size of the object and the data can now be determined ahead of allocation time, which could translate into just one allocation call for both.
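What "one allocation call for both" could look like is sketched below in plain C. All names are hypothetical and nothing here is taken from Foundation; for brevity the sketch stores plain chars instead of unichars. The point is only that a factory function knows the data size before it allocates, so header and characters fit into one block.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical string object whose characters live directly behind
   the object header, in the same block of memory. */
struct inline_string
{
   const void   *isa;      /* would point to the class object */
   size_t       length;    /* number of characters, without the zero */
   char         chars[];   /* flexible array member: the string data */
};

/* Factory function: a single malloc for the object and its data.
   Returns NULL if the allocation fails. */
struct inline_string   *inline_string_create( const void *cls, const char *s)
{
   struct inline_string   *obj;
   size_t                 len;

   len = strlen( s);
   obj = malloc( sizeof( struct inline_string) + len + 1);
   if( ! obj)
      return( NULL);

   obj->isa    = cls;
   obj->length = len;
   memcpy( obj->chars, s, len + 1);   /* copies the terminating zero too */
   return( obj);
}
```

A separate alloc step cannot do this, because at that point the string has not been passed in yet.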

But then you cannot create objects that aren't automatically autoreleased, and you rarely - if ever - have the option of specifying the NSZone. Having objects that are not automatically autoreleased can be very desirable when you are creating lots of objects and the performance penalty of autorelease is really hurting you.

Does Foundation suck ?

But it is not quite as bad as it seems, because Foundation uses a trick called a placeholder class that saves the first, basically useless allocation call. Consider again:

[[[NSString alloc] initWithCString:"foo"] autorelease];

Foundation does not really need to allocate a proper object of that class in the first place, since all that is going to happen is that the init call will throw it away and create a new object of the proper size. So Foundation's NSString alloc does not dynamically allocate an NSString object but instead returns a static object of class NSPlaceholderString, which cannot be released or used for anything except the subsequent initialization. It may look as if this scheme would fall apart when you use allocWithZone:, since the placeholder now has to store the NSZone for the init call, but in reality only one more "static" placeholder object has to be allocated and maintained per NSZone.

As cool as this sounds, it basically turns the alloc call into a no-op that still takes up some time.
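The placeholder trick can be modelled in a few lines of plain C. This is a sketch of the idea only, not of Foundation's code: the ph_ names are invented, a class_name string stands in for the isa pointer, and the per-NSZone placeholders are left out.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical string object, class_name stands in for isa. */
struct ph_string
{
   const char   *class_name;
   size_t       length;
   char         chars[];
};

/* The one static placeholder: never freed, only good for a following init. */
static struct ph_string   placeholder = { "PlaceholderString", 0 };

/* "alloc": allocates nothing, every caller gets the same placeholder. */
struct ph_string   *ph_string_alloc( void)
{
   return( &placeholder);
}

/* "init": only now is the size known, so the real object is allocated
   here in one piece and returned instead of the placeholder. */
struct ph_string   *ph_string_init_with_cstring( struct ph_string *self, const char *s)
{
   struct ph_string   *obj;
   size_t             len;

   if( self != &placeholder)   /* only the placeholder may be initialized */
      return( NULL);

   len = strlen( s);
   obj = malloc( sizeof( struct ph_string) + len + 1);
   if( ! obj)
      return( NULL);

   obj->class_name = "InlineCString";
   obj->length     = len;
   memcpy( obj->chars, s, len + 1);
   return( obj);
}
```

Two calls to ph_string_alloc return the same address, while each init call returns a fresh object of a different "class". The caller must therefore always continue with the pointer that init returns, never with the one that alloc returned.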

Here is an example where you can see Foundation at work:

#import <Foundation/Foundation.h>

int main (int argc, const char *argv[])
{
   NSAutoreleasePool   *pool;
   NSString            *s1;
   NSString            *s2;
   NSZone              *zone1;
   NSZone              *zone2;

   pool = [[NSAutoreleasePool alloc] init];

   NSLog( @"alloc");
   s1 = [NSString alloc];
   s2 = [NSString alloc];
   NSLog( @"s1 = <%@:$%lX (zone=$%lX)>", [s1 class], (long) s1, (long) [s1 zone]);
   NSLog( @"s2 = <%@:$%lX (zone=$%lX)>", [s2 class], (long) s2, (long) [s2 zone]);
   s1 = [s1 initWithCString:"foobar"];
   s2 = [s2 initWithCString:"blabla"];
   NSLog( @"s1 = <%@:$%lX (zone=$%lX)>", [s1 class], (long) s1, (long) [s1 zone]);
   NSLog( @"s2 = <%@:$%lX (zone=$%lX)>", [s2 class], (long) s2, (long) [s2 zone]);

   //
   // Now the same but with one zone
   //
   NSLog( @"allocWithZone: (1 zone)");
   zone1 = NSCreateZone( 0x1000, 0x1000, YES);
   s1 = [NSString allocWithZone:zone1];
   s2 = [NSString allocWithZone:zone1];
   NSLog( @"s1 = <%@:$%lX (zone=$%lX)>", [s1 class], (long) s1, (long) [s1 zone]);
   NSLog( @"s2 = <%@:$%lX (zone=$%lX)>", [s2 class], (long) s2, (long) [s2 zone]);
   s1 = [s1 initWithCString:"foobar"];
   s2 = [s2 initWithCString:"blabla"];
   NSLog( @"s1 = <%@:$%lX (zone=$%lX)>", [s1 class], (long) s1, (long) [s1 zone]);
   NSLog( @"s2 = <%@:$%lX (zone=$%lX)>", [s2 class], (long) s2, (long) [s2 zone]);

   //
   // Now the same but with two zones
   //
   NSLog( @"allocWithZone: (2 zones)");
   zone1 = NSCreateZone( 0x1000, 0x1000, YES);
   zone2 = NSCreateZone( 0x1000, 0x1000, YES);
   s1 = [NSString allocWithZone:zone1];
   s2 = [NSString allocWithZone:zone2];
   NSLog( @"s1 = <%@:$%lX (zone=$%lX)>", [s1 class], (long) s1, (long) [s1 zone]);
   NSLog( @"s2 = <%@:$%lX (zone=$%lX)>", [s2 class], (long) s2, (long) [s2 zone]);
   s1 = [s1 initWithCString:"foobar"];
   s2 = [s2 initWithCString:"blabla"];
   NSLog( @"s1 = <%@:$%lX (zone=$%lX)>", [s1 class], (long) s1, (long) [s1 zone]);
   NSLog( @"s2 = <%@:$%lX (zone=$%lX)>", [s2 class], (long) s2, (long) [s2 zone]);

   [pool release];
   return( 0);
}

OUTPUT (with NSLog prefix removed)

alloc
s1 = <NSPlaceholderString:$CC68 (zone=$A088)>
s2 = <NSPlaceholderString:$CC68 (zone=$A088)>
s1 = <NSInlineCString:$D348 (zone=$A088)>
s2 = <NSInlineCString:$CC48 (zone=$A088)>

The same NSPlaceholderString object is returned for both alloc calls. The init call returns a newly allocated object of class NSInlineCString.

allocWithZone: (1 zone)
s1 = <NSPlaceholderString:$CCC0 (zone=$A088)>
s2 = <NSPlaceholderString:$CCC0 (zone=$A088)>
s1 = <NSInlineCString:$E4F0 (zone=$D3E8)>
s2 = <NSInlineCString:$E508 (zone=$D3E8)>

When an NSZone is used, something interesting happens. The NSPlaceholderString object does not live in the NSZone that we passed to allocWithZone:. Undoubtedly it contains a pointer to the passed-in NSZone, which is then used in the subsequent init call.

allocWithZone: (2 zones)
s1 = <NSPlaceholderString:$13F20 (zone=$A088)>
s2 = <NSPlaceholderString:$D418 (zone=$A088)>
s1 = <NSInlineCString:$2B640 (zone=$13F80)>
s2 = <NSInlineCString:$2C648 (zone=$13FB0)>

Foundation now uses two NSPlaceholderString objects. For every NSZone, a new NSPlaceholderString object must be created.

Wrap up

This article reviewed the basics of object creation and object allocation in Foundation. After this prerequisite we're ready to dive into different allocation strategies.


If you want to discuss this article, please do so in this thread in the Mulle kybernetiK Optimization Forum.

Thanks to Ben Dougall for pointing out that something was amiss in the class diagram and text.


(1) As you can see, traversing the isa pointers will make you loop endlessly once you reach NSObject's metaclass, but traversing the superclass pointers you will eventually reach nil.

(2) Now if you think about the metaclass object being an NSObject, it doesn't really make sense. An NSObject instance does not look like a class object: NSObject instances contain nothing except an isa pointer, whereas a class object also contains, for example, a superclass pointer. You will just have to accept this fact (3). Class and metaclass objects are a little special in some respects.

(3) A possibly more "proper" implementation would have each class's isa pointing to a hypothetical NSClass object. But this would lose the cheap benefit that each metaclass object automatically inherits all of NSObject's instance methods (including those added by categories). NSClass practically could not be a subclass of NSObject, to avoid endless loops of method lookups.