m0nkeymafia

Right guys this is a trick which you either know or dont know.
It is the freakin tits.

I started using this technique when I wrote a GLIB for work.
It has to be cross platform so I couldnt take advantage of any architecture specific niceness.

Anyway in my benchmark tests it trounced memcpy for filling memory and also works on more complicated matters.

The technique is called

Duffs Device

 .  It uses case fallthrough to very quickly iterate through a list basically unrolling it.

Its in a

mac

 ro so cant really be used for beefy loops but for tight ones its amazing.

#define DUFF_DEVICE_8(aCount, aAction)
{
int count_ = (aCount);
int times_ = (count_ + 7) >> 3;
switch (count_ & 7){
case 0: do { aAction;
case 7: aAction;
case 6: aAction;
case 5: aAction;
case 4: aAction;
case 3: aAction;
case 2: aAction;
case 1: aAction;
} while (--times_ > 0);
}
}


If you know what loop unrolling is then youll know how to use it, if not reply and ill try answer any questions.

thedarkness

Radical stuff monkey, wasn't aware of this, keep 'em coming  Applause

I'll be looking for a place to use this.

Cheers,
td

m0nkeymafia

It can be used anywhere dude
Ill give u a rough example [excuse me if syntax is not 100% right]

const int BUFFER_SIZE = 1024*32;
DUFF_DEVICE(BUFFER_SIZE, *(dst++) = *(src++));

that should copy from src to dst buffers uber fast.
ill write a post in a sec on the notation used, not sure if many people use it
im an optimization FIEND lol

thedarkness

yeah, got you incrementing pointers as per the Stroustrup example in the Duff's Device page on wikipedia. I follow, I'm a believer ;-)

Cheers,
td

nutballs

for some reason i cant wrap my head around this one. conceptually im having a problem even getting it at all. is there a real example of how this would be used, that i could understand, maybe an analogy? (i think my ADD is getting ing the way). Obviously im an

ASP

  guy so this wont work for me, though I am curious to try it for kicks, but im sure the interlacing of statements wont be allowed. but aside from that, i am a coder of past, but maybe im outta shape. lol

perkiset

Thank god.

I read that code about a dozen times and - although it's been more than a dozen years since I've been a c++ster, I didn't think I was THAT old and stoopid... looking forward to more understanding as well - sorry boys for the Codezheimers...

thedarkness

You need to read this;
http://en.wikipedia.org/wiki/Duff's_device
http://en.wikipedia.org/wiki/Loop_unwinding

Loop unwinding is an ancient technique for optimization and, as monkey mentioned, should be done by the compiler but.......

Cheers,
td

thedarkness

Actually, I'm going to read all of this when I get the chance.

http://en.wikipedia.org/wiki/Category:Compiler_optimizations

Cheers,
td

perkiset

quote author=thedarkness link=topic=207.msg1384#msg1384 date=1179197187

http://en.wikipedia.org/wiki/Duff's_device
http://en.wikipedia.org/wiki/Loop_unwinding


Whew - not quite as crusty as a I feared. I've never used the term "Loop Unwinding" but it makes complete sense. Duffs as well - now I completely get where that's going.

Interesting point in @Wikip though - claimed that although this was pretty damn efficient, in the *to++ = *from++ form (modified Duffs) this is probably less efficient than what is optimized in the compiler - they've probably got it even more tightened down than that.

Interesting stuff - haven't thought that way in a while. Kinda nice to get there for a bit.

Thanks,
/p


Perkiset's Place Home   Politics @ Perkiset's