retain/release one more time

Once upon a time, on a website far, far away - no actually right here - and not so long ago, I wrote an article about implementing retain/release safely and speedily:

Atomic Increment and Decrement

Luckily -retain and -release only need to increment and decrement a variable safely. An increment or decrement is exactly the kind of read-modify-write operation that is a perfect match for an atomic operation.

- (id) retain
{
   MulleAtomicIncrement( &refCountMinusOne_);
   return( self);
}


- (void) release
{
   if( MulleAtomicDecrement( &refCountMinusOne_) == -1)
      [self dealloc];
}
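The code above relies on the decrement returning the new (post-decrement) value. Just so the semantics are clear, here is a minimal sketch of such primitives in terms of C11 <stdatomic.h>; the actual functions live elsewhere and may well differ, and the type of refCountMinusOne_ is my assumption:

#include <stdatomic.h>

typedef _Atomic( int)   mulle_atomic_int;   // assumed type of refCountMinusOne_

// returns the new value after incrementing
static inline int   MulleAtomicIncrement( mulle_atomic_int *p)
{
   // relaxed ordering is a common choice for the increment side of a refcount
   return( atomic_fetch_add_explicit( p, 1, memory_order_relaxed) + 1);
}

// returns the new value after decrementing
static inline int   MulleAtomicDecrement( mulle_atomic_int *p)
{
   // acquire/release ordering is a common choice for the decrement side
   return( atomic_fetch_sub_explicit( p, 1, memory_order_acq_rel) - 1);
}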

Of course time doesn't stand still and maybe, possibly, depending on circumstances there might be a slightly faster way to implement this.

So let me show the code first and then explain it. While -retain's implementation stays as is, -release is changed to:

- (void) release
{
   if( ! refCountMinusOne_)  // single-threaded and done ?
   {
      [self dealloc];
      return;
   }

   // still multi-threaded, so do it safely
   if( MulleAtomicDecrement( &refCountMinusOne_) == -1)
      [self dealloc];
}

The reference count is stored, as the variable name indicates, with 1 subtracted. So an object starts out with a refCountMinusOne_ value of 0, meaning an actual reference count of 1.
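As a side note, storing the count offset by one means a freshly allocated, zero-filled instance already represents a reference count of 1 without any extra initialization (that's my guess as to why the offset is used; the article doesn't say). A hypothetical -retainCount built on this convention would simply add the one back:

- (NSUInteger) retainCount
{
   // refCountMinusOne_ == 0 means a reference count of 1
   return( (NSUInteger) (refCountMinusOne_ + 1));
}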

The first question of course would be "is this threadsafe ?". Assuming that a read of refCountMinusOne_ is atomic, whenever another thread holds a claim to the object, the value of refCountMinusOne_ must be 1 or larger, since the currently executing thread also holds at least one reference. Reading a 0 (an actual reference count of 1) in -release therefore implies that the object is now owned solely by the current thread, so no other thread can retain or release it concurrently and the non-atomic path is safe.

The second question is, "why would this be better ?". That's something that needs to be tested on the given hardware, but my understanding of atomic operations is that they usually cause inter-CPU communication and synchronization overhead, an effect I expect to get worse as core counts increase. At least in my old article, there was a non-negligible difference.
Avoiding atomic operations should therefore be a win.
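If you want a first impression on your own machine, a quick and dirty micro-benchmark along these lines can help. All names and the iteration count are made up, and it only measures the uncontended single-core case, not the cross-core traffic that contended atomics cause:

#include <stdatomic.h>
#include <stdio.h>
#include <time.h>

#define N   100000000

int   main( void)
{
   volatile int    plain  = N;   // volatile, so the loop isn't optimized away
   _Atomic( int)   shared = N;
   clock_t         start, end;

   start = clock();
   while( plain)
      --plain;                   // plain, non-atomic decrement
   end = clock();
   printf( "plain : %.3fs\n", (end - start) / (double) CLOCKS_PER_SEC);

   start = clock();
   while( atomic_fetch_sub( &shared, 1) != 1)
      ;                          // atomic decrement until it hits zero
   end = clock();
   printf( "atomic: %.3fs\n", (end - start) / (double) CLOCKS_PER_SEC);

   return( 0);
}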

So here's a scenario where this could be useful: assume you have a subclass of NSArray, say MulleMutableArray, which allows you to initialize it with objects that are already retained, say via -initWithRetainedObjects:count:. Now fill this array with objects that have been created with +new. If you then release the array, all its items will be sent a -release message and you would never have executed an atomic instruction (see the sketch below).
This could also be useful for objects that are immediately added to an NSAutoreleasePool without ever having been retained.
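As a sketch of that scenario - MulleMutableArray, -initWithRetainedObjects:count: and the Foo class are all hypothetical, as in the text above:

   id                  objects[ 3];
   MulleMutableArray   *array;

   objects[ 0] = [Foo new];   // +new hands back already retained objects
   objects[ 1] = [Foo new];
   objects[ 2] = [Foo new];

   // ownership is transferred, so no -retain and no atomic increment happens
   array = [[MulleMutableArray alloc] initWithRetainedObjects:objects
                                                         count:3];

   // ... use the array from a single thread ...

   [array release];   // array and items each see refCountMinusOne_ == 0,
                      // so everything deallocs without a single atomic operation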

Negative aspects

  • could fail on hardware where int reads are not atomic, e.g. old 68000 multi-CPU (ha!) hardware - if I remember correctly - where the data bus was 16 bit but an int was 32 bit.
  • will slow down "normal" code that retains and releases. But on the bright side, refCountMinusOne_ is then pretty much guaranteed to be in the cache.
  • it's only a statistical advantage for certain scenarios.