
perkiset
The reason to read this post
Coding specialized routines in C and adding them to PHP can make a <i>profound</i> difference in your server's efficiency and capability.

Background

Nutballs and SB were discussing <a href="http://www.perkiset.org/forum/asp_net/classicasp_ip_to_num_and_num_to_ip_for_database_storage-t72.0.html;msg2054#new">in this thread</a> whether it was best to index based on numerics or strings (and a bunch else). In the process I got intrigued by the notion of how fast I could convert IP addresses to long integers, and was drawn back to the Zend developer forum to look at creating custom extensions for PHP. I'd looked at it several times but did not have a compelling reason to tackle it.

In the mid 80s I was programming in Clipper, which was a dBase III compiler. The brilliant part about it was that you could augment the language with C functions of your own design. It was through this that I constructed a complete windowing library for Clipper that mimicked the Mac's trap set even before Windows existed. A verbose, readable language that could be augmented with blazing speed where required.

Then the world lulled through C++ and VB until Delphi was released in 1995. Once again I had my hybrid language - Object Pascal with inline assembly. It doesn't get much better than that. But it was doomed to eventually fail as well, as most compiled, early-bound languages are slowly finding out, at least for web development. Now with PHP we have a verbose and easily readable scripted language for web development that can be augmented with C for speed. Since I have shunned all OS-specific and client-side GUI libraries in favor of browser-based apps, speed has been an Achilles heel for many of my applications for a while now, and this looks to be a compelling solution.

Today's Task

I decided to create my first extension to PHP to see just how much difference there would be between a PHP solution to a coding task and a C function added to the language.
The chore was to convert an IP address to a double and back. As it happens, this falls perfectly into my notion of how things work - I REALLY like C strings and handling - so the idea of writing a custom string converter in C to augment PHP is right up my alley.

Extensions have a lot of prep and support items. There are special ways to allocate and earmark memory, a few little "glue" routines that must be added and a basic protocol for compiling and adding to the PHP instance. You should REALLY follow the tutorial at Zend <a href="http://devzone.zend.com/node/view/id/1021">which you can find here</a> if this is of more interest to you, but what I will illustrate here is exactly what I did to make my first little functions.

First off - there are a couple of compiler switches that the tutorial suggests you turn on. They are important for real projects, but unnecessary for little things like this. Any PHP installation where you have access to the php.ini file will accept this extension.

<i>IMPORTANT CAVEAT: I have not coded in C or C++ since about '94, so I'm a bit rusty. In fact, I may be talking completely out of my ass on several of these points and invite critique and clarification if I'm being stupid.</i>

First: config.m4. This file helps "configure" do ... erm ... what it does. This is part of the glue, and after I had created my own "Hello World" function I changed this very little to make it work with "PerksFuncs."

    PHP_ARG_ENABLE(perksfuncs, whether to enable the PerksFuncs Library,
    [  --enable-perksfuncs   Enable PerksFuncs Library Support])

    if test "$PHP_PERKSFUNCS" = "yes"; then
      AC_DEFINE(HAVE_PERKSFUNCS, 1, [whether you have PerksFuncs Library])
      PHP_NEW_EXTENSION(perksfuncs, perksfuncs.c, $ext_shared)
    fi

Next - the header file. I do not think that this needs to be separate, but based on the way they (Zend) had me put this all together I am unclear on whether it is called elsewhere. Probably not, but better safe than sorry.

    #ifndef PHP_PERKSFUNCS_H
    #define PHP_PERKSFUNCS_H 1

    #define PHP_PERKSFUNCS_VERSION "1.0"
    #define PHP_PERKSFUNCS_EXTNAME "perksfuncs"

    PHP_FUNCTION(ip2num);
    PHP_FUNCTION(num2ip);

    extern zend_module_entry perksfuncs_module_entry;
    #define phpext_perksfuncs_ptr &perksfuncs_module_entry

    #endif

As you can see, this is pretty much connection stuff between what we are about to write and PHP.

Finally, the real stuff. The top part of all this is glue as well. The real stuff is at PHP_FUNCTION(ip2num) and PHP_FUNCTION(num2ip).

    #ifdef HAVE_CONFIG_H
    #include "config.h"
    #include "string.h"
    #endif

    #include "php.h"
    #include "php_perksfuncs.h"

    static function_entry perksfuncs_functions[] = {
        PHP_FE(ip2num, NULL)
        PHP_FE(num2ip, NULL)
        {NULL, NULL, NULL}
    };

    zend_module_entry perksfuncs_module_entry = {
    #if ZEND_MODULE_API_NO >= 20010901
        STANDARD_MODULE_HEADER,
    #endif
        PHP_PERKSFUNCS_EXTNAME,
        perksfuncs_functions,
        NULL,
        NULL,
        NULL,
        NULL,
        NULL,
    #if ZEND_MODULE_API_NO >= 20010901
        PHP_PERKSFUNCS_VERSION,
    #endif
        STANDARD_MODULE_PROPERTIES
    };

    #ifdef COMPILE_DL_PERKSFUNCS
    ZEND_GET_MODULE(perksfuncs)
    #endif

    PHP_FUNCTION(ip2num)
    {
        char *inBuff, *outBuff;
        int inPtr, outPtr, octet;
        double result;

        // This will grab the first parameter as a string, place it on the pointer inBuff
        // and put the length into inPtr...
        if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &inBuff, &inPtr) == FAILURE)
            RETURN_NULL();

        // I want to start at the last character in inBuff, not at the terminating chr(0)...
        inPtr--;

        // Create an output buffer for the result. outPtr is set to the last character position.
        // octet will be used for "jumping" the output ptr as we pad.
        outBuff = estrdup("000000000000");
        outPtr = 11;
        octet = 3;

        // Do a reverse strcpy (effectively) to move what is in the inBuff into the outBuff but
        // skipping periods and padding at the same time. The outPtr test stops malformed
        // input (too many digits or dots) from writing in front of the buffer.
        while (inPtr >= 0 && outPtr >= 0) {
            outBuff[outPtr--] = inBuff[inPtr--];
            // Only look at the next character if there is one
            if (inPtr >= 0 && inBuff[inPtr] == '.') {
                outPtr = (octet-- * 3) - 1;
                inPtr--;
            }
        }

        // Return to PHP a double version of the string.
        // This will also automatically trim off the leading zeros.
        // The buffer has to be freed BEFORE returning, because RETURN_DOUBLE leaves the function.
        result = atof(outBuff);
        efree(outBuff);
        RETURN_DOUBLE(result);
    }

    PHP_FUNCTION(num2ip)
    {
        char *inBuff, *outBuff, *finalBuff;
        int inLen, inPtr = 0, outPtr = 0, inNum, currOct;

        // This will grab the first parameter as a string, place it on the pointer inBuff
        // and put the length into inLen...
        // Note that it was passed into PHP as a double, but I accept it here as a string
        // so PHP does the conversion for me.
        if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "s", &inBuff, &inLen) == FAILURE) {
            RETURN_NULL();
        }

        // Anything other than 10, 11 or 12 digits would walk off the end of the buffers below
        if (inLen < 10 || inLen > 12) {
            RETURN_NULL();
        }

        // Scratch buffer: the longest result ("nnn.nnn.nnn.nnn") is 15 characters plus the terminator
        outBuff = emalloc(16);

        // Since the input string is of variable length (either 12, 11 or 10 digits), I need
        // to act as if it was 12 digits and jump over the invisible zeros in front (if 10 or 11 digits)
        currOct = (12 - inLen);

        // This state flag will tell me if I am in padding zeros or into a real number. Since the first
        // digit of the number is always going to be non-zero, we set it to true to start...
        inNum = TRUE;

        while (inPtr < inLen) {
            while (currOct++ < 3) {
                if (!inNum)
                    if ((currOct == 3) || (inBuff[inPtr] != '0'))
                        inNum = TRUE;
                if (inNum)
                    outBuff[outPtr++] = inBuff[inPtr++];
                else
                    inPtr++;
            }
            if (inPtr < inLen)
                outBuff[outPtr++] = '.';
            currOct = 0;
            inNum = FALSE;
        }

        // Copy exactly outPtr characters into a right-sized buffer and terminate it.
        // Without room for the terminating chr(0), PHP reads past the end of the buffer
        // and you get a random char in the last position.
        finalBuff = emalloc(outPtr + 1);
        for (inPtr = 0; inPtr < outPtr; inPtr++)
            finalBuff[inPtr] = outBuff[inPtr];
        finalBuff[outPtr] = '\0';
        efree(outBuff);

        // The 0 tells PHP to take ownership of finalBuff rather than duplicate it
        RETURN_STRINGL(finalBuff, outPtr, 0);
    }

This is important: given all of this stuff, the only part you need to modify, if you want to add another function, is to add

    PHP_FUNCTION(anotherFunction);

to the header file and

    PHP_FE(anotherFunction, NULL)

to the .c file in the function_entry portion, and then of course

    PHP_FUNCTION(anotherFunction)
    {
        // and your real code here
    }

to the end of the C file. With this basic scaffolding you can add as many functions as you want to <your library> with minimal headache.

The recommended place to put these 3 files is in a subdir under your PHP ext directory. Mine is /usr/local/include/php/ext - where I added "perk" and placed these three files. Then I did a phpize, which created all the normal files required for configuration. Then I did a ./configure --enable-perksfuncs, which got the files in the subdirectory ready for compiling. Finally I did a make, which compiled my files into perksfuncs.so in the modules directory. One more time, that'd be /usr/local/include/php/ext/perk/modules/perksfuncs.so.

Next step - make PHP aware of the new extension. To do this, you must edit the php.ini file. Mine is at /usr/local/lib/php.ini. In that file you will find a directive, extension_dir, which points at the directory where PHP will look for your extensions. That directory is where you must place a copy of [yourlib].so. Also, you must add a reference to your lib in the ini:

    extension=perksfuncs.so

This will be available immediately to console-run apps, but not to Apache calls. You will need to stop and restart Apache for your new lib to be available to web calls.
Also, when I first started I was forgetting all the steps and would at times not see my updates in PHP. The steps to get your extension into PHP are:

    phpize
    ./configure --enable-perksfuncs
    make
    cp modules/perksfuncs.so /usr/local/lib/php/extensions/no-debug-non-zts-20060613/
    /usr/local/apache2/bin/apachectl stop
    /usr/local/apache2/bin/apachectl start

Note that if all you do is change a couple lines of code in the .c file, you will only need to make and install the .so. If you do more you will probably want to at least ./configure again. If things are not behaving as you'd expect, do the entire process over to make sure that PHP is seeing the latest version of your extension.

First Test

The first thing I wanted to do was to make sure that my extension was running properly. So, after having followed the steps above, I wrote this little chunk of PHP to test it out:

    <?php
    $addrs[] = '127.0.0.1';
    $addrs[] = '216.19.200.114';
    $addrs[] = '192.168.123.1';
    $addrs[] = '64.64.64.64';
    $addrs[] = '1.2.3.4';
    $addrs[] = '1.22.33.444';
    $addrs[] = '12.34.56.78';
    $addrs[] = '123.456.789.123';

    foreach ($addrs as $inAddr) {
        $nums[] = $thisNum = ip2num($inAddr);
        echo $inAddr, ' -> ', $thisNum, "\n";
    }

    foreach ($nums as $inNum) {
        $thisNum = num2ip($inNum);
        echo $inNum, ' -> [', $thisNum, "] length=", strlen($thisNum), "\n";
    }
    ?>

(RUN RESULT)

    127.0.0.1 -> 127000000001
    216.19.200.114 -> 216019200114
    192.168.123.1 -> 192168123001
    64.64.64.64 -> 64064064064
    1.2.3.4 -> 1002003004
    1.22.33.444 -> 1022033444
    12.34.56.78 -> 12034056078
    123.456.789.123 -> 123456789123
    127000000001 -> [127.0.0.1] length=9
    216019200114 -> [216.19.200.114] length=14
    192168123001 -> [192.168.123.1] length=13
    64064064064 -> [64.64.64.64] length=11
    1002003004 -> [1.2.3.4] length=7
    1022033444 -> [1.22.33.444] length=11
    12034056078 -> [12.34.56.78] length=11
    123456789123 -> [123.456.789.123] length=15

Success! I can convert both to and from a double! The next step was to develop the real test.
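For anyone who wants to poke at the packing scheme without building an extension first, here is a minimal standalone C sketch of the same idea: three zero-padded digits per octet, filled from right to left. The names ip_to_num and num_to_ip are hypothetical and not part of the extension. The return trip swaps in snprintf/sscanf rather than the character walk used in num2ip, and the rejection of malformed input is an assumption of this sketch.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of the extension): pack "a.b.c.d" into a
 * 12-digit number, three zero-padded digits per octet, copying right to left.
 * Returns -1.0 when the text is not four groups of one to three digits. */
double ip_to_num(const char *ip)
{
    char digits[13] = "000000000000";
    int in = (int)strlen(ip) - 1;   /* last character of the input */
    int octet = 3;                  /* octet being filled, rightmost first */
    int out = 11;                   /* next write position in digits */
    int count = 0;                  /* digits seen in the current octet */

    while (in >= 0) {
        char c = ip[in--];
        if (c == '.') {
            if (count == 0 || octet == 0)
                return -1.0;        /* empty group or too many dots */
            octet--;
            out = octet * 3 + 2;    /* jump to the last slot of the next octet */
            count = 0;
        } else if (c >= '0' && c <= '9') {
            if (++count > 3)
                return -1.0;        /* more than three digits in a group */
            digits[out--] = c;
        } else {
            return -1.0;            /* anything else is not an address */
        }
    }
    if (octet != 0 || count == 0)
        return -1.0;                /* fewer than four groups */
    return atof(digits);
}

/* Hypothetical helper: unpack the 12-digit number back into dotted form.
 * Returns 0 on success, -1 if num is out of range or out is too small. */
int num_to_ip(double num, char *out, size_t outSize)
{
    char digits[32];
    int a, b, c, d, written;

    if (num < 0.0 || num > 999999999999.0)
        return -1;
    /* Restore the leading zeros that the double dropped */
    snprintf(digits, sizeof digits, "%012.0f", num);
    /* Read the four fixed-width groups back as plain integers */
    if (sscanf(digits, "%3d%3d%3d%3d", &a, &b, &c, &d) != 4)
        return -1;
    written = snprintf(out, outSize, "%d.%d.%d.%d", a, b, c, d);
    return (written > 0 && (size_t)written < outSize) ? 0 : -1;
}
```

Calling ip_to_num("216.19.200.114") gives 216019200114, matching the run above, and num_to_ip(1002003004.0, buf, sizeof buf) with a 16-byte buf writes "1.2.3.4" into it.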
My testbed is as follows:

* Make the test times as tightly bound to ONLY the code that is applicable, i.e. do not bring in file system load times or anything like that.
* Create as tight a PHP solution as I could - don't bias the results by writing crap code so that the C looks good. I wanted to see for reals what it would do.
* Make the test of reasonable size. I chose 500,000 random IP addresses.
* I wanted to see 4 scenarios: a foreach loop handling the conversion, an array_map solution with PHP code handling the conversion, a foreach loop using the C conversion routine and an array_map solution using the C conversion.

The first thing I decided was to create all of the addresses and store them for quick retrieval. I also wanted to see the difference between implode() and serialize(). First though, here is the code I used to create 500K random addresses:

    <?php
    $addrs = array();
    for ($i = 0; $i < 500000; $i++) {
        if ($i % 1000 == 0) echo '.';
        $addrs[] = rand(1,255) . '.' . rand(0,255) . '.' . rand(0,255) . '.' . rand(0,255);
    }
    print "\nwriting...\n";
    file_put_contents('./addresses.ser', serialize($addrs));
    file_put_contents('./addresses.imp', implode(chr(10), $addrs));
    print "done\n";
    ?>

The file sizes were quite different, as would be expected:

    local-cc:/usr/local/include/php/ext/perk root# ls -l add*
    -rw-r--r-- 1 root wheel  7143319 Jun 12 17:36 addresses.imp
    -rw-r--r-- 1 root wheel 15031569 Jun 12 17:36 addresses.ser

Finally, I wrote the code to perform the actual speed tests:

    <?php
    $start = mtime();
    print "\n";

    $rawBuff = file_get_contents('./addresses.ser');
    elapsed("Load Serialized");
    $addresses = unserialize($rawBuff);
    elapsed("Process Serialized - count=" . count($addresses));

    $rawBuff = file_get_contents('./addresses.imp');
    elapsed("Load Imploded");
    $addresses = explode(chr(10), $rawBuff);
    elapsed("Process Imploded - count=" . count($addresses));

    // Scenario 1: conversion written in PHP, plain loop
    $outArr = array();
    echo "Starting PHP ForEach\n";
    foreach ($addresses as $address) {
        preg_match('/([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})/', $address, $octets);
        $outArr[] = (double) (substr("000{$octets[1]}", -3) . substr("000{$octets[2]}", -3) . substr("000{$octets[3]}", -3) . substr("000{$octets[4]}", -3));
    }
    elapsed("Elapsed");

    // Scenario 2: same PHP conversion, driven by array_map
    $outArr = array();
    echo "Starting PHP array_map\n";
    array_map("map_foreach", $addresses);
    elapsed("Elapsed");

    // Scenario 3: C extension, plain loop
    $outArr = array();
    echo "Starting Extension Convert\n";
    foreach ($addresses as $address) {
        $outArr[] = ip2num($address);
    }
    elapsed("Elapsed");

    // Scenario 4: C extension, driven by array_map
    $outArr = array();
    echo "Starting Extension array_map\n";
    array_map("ip2num", $addresses);
    elapsed("Elapsed");
    print "\n";

    // Print the time since the last call, then reset the clock
    function elapsed($msg) {
        global $start;
        $elap = mtime() - $start;
        echo "$msg: $elap\n";
        $start = mtime();
    }

    // Current time in seconds, with microseconds, as a float
    function mtime() {
        list($usec, $sec) = explode(' ', microtime());
        return ((float)$usec + (float)$sec);
    }

    // The PHP conversion wrapped as a callback for array_map
    function map_foreach($address) {
        global $outArr;
        preg_match('/([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})/', $address, $octets);
        $outArr[] = (double) (substr("000{$octets[1]}", -3) . substr("000{$octets[2]}", -3) . substr("000{$octets[3]}", -3) . substr("000{$octets[4]}", -3));
    }
    ?>

Here is the output of that routine:

    Load Serialized: 0.0953040122986
    Process Serialized - count=500000: 1.89832091331
    Load Imploded: 0.0511639118195
    Process Imploded - count=500000: 0.368293046951
    Starting PHP ForEach
    Elapsed: 5.69186878204
    Starting PHP array_map
    Elapsed: 7.13705587387
    Starting Extension Convert
    Elapsed: 0.967149972916
    Starting Extension array_map
    Elapsed: 0.860723018646

In a spectacularly unexpected result, the serialize version was WAY slower than the explode! However, upon a bit of reflection this makes a lot of sense.
Serialize is excellent for associative, complex arrays and objects. In this case, serializing a straight list of addresses adds a tremendous amount of overhead that is a useless and quite expensive addition to the overall processing time: about 1.9 seconds to unserialize against about 0.37 seconds to explode. A great reminder to always make sure you're using the best tool for the job... and to continually audit that you are, in fact, using the best tool for the job.

The next important result is that array_map is NOT a good choice for user-defined functions. See how adding the function call (and associated stack activity) VASTLY increases the length of time it takes to process the list: 7.1 seconds against 5.7 for the plain loop.

Finally, we can see that the C version is in fact blazingly fast compared to the PHP solution. The C string handling clearly makes a world of difference. And as would be expected, the array_map function adds to the efficiency: a whole 'nother tenth of a second is trimmed off by using array_map as opposed to a straight loop.

Explanation of results

The best result was obtained by using C string handling in a custom extension to PHP, accessed through array_map, because the least amount of PHP interpretation and stack movement needed to occur. The majority of processing stayed at the C level rather than moving up to the interpreter level. Additionally, the way you can handle strings in C is vastly superior (in terms of efficiency) to PHP, because we can address each character in the string as an array element which, if we were to look at the assembly, we would find is a REALLY efficient way of doing things.

It is arguable that, if you cannot code in C, or are afraid of this level of programming, or have offline console apps that are not speed dependent, this is a fun diversion that does not offer a great deal of bang-for-buck. But if you are a web developer or spammer that needs things to run really quickly, then this technique is very worthy of your time investment.

Good luck!
/p

esrun
Interesting read. I don't think I'd use it for everyday coding, but it could be useful for some of my bigger projects. Thanks for taking the time to do a write-up.
perkiset
'welcome esrun - and I agree - it's a great "also tool" to have in the box when you just need that little bit of boost for a particularly tough section. In fact, I've been waiting for 2 years to have a reason to do it!
thedarkness
This looks fascinating at first glance, perk. I haven't got the time to read it now but will ASAP. I've been interested in doing this for a while but haven't found the right situation yet. I seem to code exclusively in either PHP or C/C++, rarely mixing the two. I'm sure I can find a use for this somewhere though...

Cheers,
td

perkiset
I just did a little more today as an exercise (don't want to forget what I've learned) and then put it up onto one of my production servers. Slipped in like a kitten. Once you've got the glue/scaffolding in place, it's really as simple as adding a couple of lines to "present" the function to PHP and then you're off to C land.

It's gonna stay in my toolbox, I tell you...

/p

thedarkness
OK, read this now. The second code listing is actually the file php_perksfuncs.h, if that's not clear to anyone.

@perk: config.h, autogenerated by configure? What's in it? Actually, I don't understand this:

    #ifdef HAVE_CONFIG_H
    #include "config.h"
    #include "string.h"
    #endif

Do you really want the inclusion of string.h to be conditional? Where is HAVE_CONFIG_H defined and what does it signify (presumably more autoconf mumbo jumbo)?

Of course I believe you'll need php-devel installed, sports fans (right?). If so it's:

    yum install php-devel
    up2date php-devel
    aptitude install php-devel

depending on your distro.

Also, check out http://pecl.php.net for some funky PHP extensions others have made.

Cheers,
td

perkiset
Quote from thedarkness:
> @perk: config.h, autogenerated by configure? What's in it? Actually, I don't understand this:
>
>     #ifdef HAVE_CONFIG_H
>     #include "config.h"
>     #include "string.h"
>     #endif
>
> Do you really want the inclusion of string.h to be conditional? Where is HAVE_CONFIG_H defined and what does it signify (presumably more autoconf mumbo jumbo)?

I really don't know - I typed that portion in verbatim. Perhaps it should really be included at the top of the .c file, but the tutorial writer didn't catch it.

Quote from thedarkness:
> Of course I believe you'll need php-devel installed, sports fans (right?).

I don't think so - if you install PHP from source (configure, make, make install) I'm about 99% sure that it's all good to go, because on the 3 machines I've played with now that's all I've done: clean distros, download and build PHP. What I described in the article above was done without any more additions.

Quote from thedarkness:
> If so it's:
>
>     yum install php-devel
>     up2date php-devel
>     aptitude install php-devel

A good tip in any case, but I'll be interested to see if it is actually required... perhaps that comes automatically when you install PHP from source.

Quote from thedarkness:
> Also, check out http://pecl.php.net for some funky PHP extensions others have made.

No lie - but I'd make sure you were comfy enough to read the code and make sure that they were not doing anything too sneaky... Ope! There goes the PerksParanoia button again.

Thanks for your eyes on this TD... looking forward to what you come up with.

/p

thedarkness
@perk: STRONGLY recommend you change this:

    #ifdef HAVE_CONFIG_H
    #include "config.h"
    #include "string.h"
    #endif

    #include "php.h"
    #include "php_perksfuncs.h"

to:

    #ifdef HAVE_CONFIG_H
    #include "config.h"
    #endif

    #include <string.h>
    #include "php.h"
    #include "php_perksfuncs.h"

I don't use autoconf and don't know a lot about it, but from a brief look at this section of the manual it would appear that there is a risk that A: that code could be modified by autoconf, and B: string.h may not be included under certain circumstances, and you want it included unconditionally.

http://www.amath.washington.edu/~lf/tutorials/autoconf/autoconf/autoconf.html#SEC15

Also, string.h is a system header and you should use the <> syntax to distinguish it as such, as per:

http://gcc.gnu.org/onlinedocs/cpp/Include-Syntax.html

Oh, and http://rpmfind.net/linux/RPM/PLD/dists/ra/updates/security/i686/php-devel-4.2.3-10.i686.html (this is of course if you are not working from the PHP source). I don't think it's necessary to build from source, as all of this stuff is pretty modular these days.

Also, you can use dl to load your extension at runtime, leave it out of your ini file and only load it as necessary. http://php.net/dl

Cheers,
td

perkiset
Quote from thedarkness:
> @perk: STRONGLY recommend you change this:
>
>     #ifdef HAVE_CONFIG_H
>     #include "config.h"
>     #include "string.h"
>     #endif
>
>     #include "php.h"
>     #include "php_perksfuncs.h"
>
> to:
>
>     #ifdef HAVE_CONFIG_H
>     #include "config.h"
>     #endif
>
>     #include <string.h>
>     #include "php.h"
>     #include "php_perksfuncs.h"
>
> I don't use autoconf and don't know a lot about it, but from a brief look at this section of the manual it would appear that there is a risk that A: that code could be modified by autoconf, and B: string.h may not be included under certain circumstances, and you want it included unconditionally.

Agree re. unconditional string inclusion... but how do you MAKE without having first done a configure?

Quote from thedarkness:
> Oh, and http://rpmfind.net/linux/RPM/PLD/dists/ra/updates/security/i686/php-devel-4.2.3-10.i686.html (this is of course if you are not working from the PHP source). I don't think it's necessary to build from source, as all of this stuff is pretty modular these days.

Gonna have to re-read that part, because when I looked at it, it required all the source packages to compile correctly.

Quote from thedarkness:
> Also, you can use dl to load your extension at runtime, leave it out of your ini file and only load it as necessary. http://php.net/dl

That'd probably be of great interest to folks reading this who don't have the ability to recompile a shared instance of PHP, or where there isn't source on the box... have you tried it? If you do, would you post the method please?

Thanks for the eyeballs,
/p

thedarkness
Quote from perkiset:
> Agree re. unconditional string inclusion... but how do you MAKE without having first done a configure?

You should be able to do a configure without any problems if you lay the code out as specified.

Quote from perkiset:
> Gonna have to re-read that part, because when I looked at it, it required all the source packages to compile correctly.

You should only need the PHP headers and the PHP libs to compile and link. php-devel provides these.

Quote from perkiset:
> That'd probably be of great interest to folks reading this who don't have the ability to recompile a shared instance of PHP, or where there isn't source on the box... have you tried it? If you do, would you post the method please?

Tried dl? Yes, on many occasions; it used to be used a lot more than it is now. I'll tell you what, I'll add your functions to my PHP instance without using the source, loading them dynamically, and I'll outline the steps involved here.

Quote from perkiset:
> Thanks for the eyeballs,
> /p

np, of course.

Cheers,
td

perkiset
Quote from thedarkness:
> You should be able to do a configure without any problems if you lay the code out as specified.

Gotcha... I see now that autoconf is a different thing than configure (told you, not strong here...)

Quote from thedarkness:
> Tried dl? Yes, on many occasions; it used to be used a lot more than it is now.

That'd be cool... do you want my compiled version or did you do it yourself?

thedarkness
Quote from perkiset:
> Quote from thedarkness:
> > Tried dl? Yes, on many occasions; it used to be used a lot more than it is now.
>
> That'd be cool... do you want my compiled version or did you do it yourself?

Nah, I'll go through the whole process as an exercise and document it here. I'll make a start tonight.

Cheers,
td
