
perkiset
I've recently become involved in a project where I want to fire off multiple tasks to work concurrently, like forking or threading, from a web request. Some of the issues surrounding my project:
First and foremost, I use MySQL for the inter-process communication. My personal style is to have each "thread" have a row in a table and update it when it is complete. Details of this are beyond the scope of this article, however.

The problem with firing off these requests is that normally, webRequest2 (and virtually every other web pull class) will hang around until the "page" that Apache is serving is completed. I needed a way to short-circuit that so that I can fire and forget.

The solution involves a couple of items. First, I modified the webRequest2 class (available in the PHP repository - the latest version of the code is here: http://www.perkiset.org/forum/php/perks_new_webrequest_class-t616.0.html;msg5330#msg5330) so that I can supply an "early termination string." Essentially, as content comes back from the server I look at the entire content body and see if the string appears in it - if so, I shut down the socket and call it a day.

Second, I need to write my "thread script" to make use of this handy feature. For demonstration purposes, my "master script" will call just one instance of the "thread script" and echo that it is done. My "thread script" will echo the termination string, sleep for 10 seconds, then update a little file to let me know that it is complete.

The master script looks like this:

```php
<?php
// Requestor: calls the thread script and returns as soon as the marker arrives
require('/www/sites/lib/classes/class.webrequest2.php');

$req = new webRequest2();
$req->earlyTermStr = 'all_done';   // stop reading once this string shows up
$buff = $req->simpleGet('http://mydomain/testresponse.php');
echo "Received: $buff ";
?>
```

Obviously you need to change mydomain to ... well, your domain. I'll assume you save this file as "testpull.php."

The testresponse.php script looks like this:

```php
<?php
// Answering routine: send the marker right away, then do the slow work
echo "all_done";
ob_flush();   // push PHP's output buffer...
flush();      // ...and the web server's, so the requestor sees the marker now
sleep(10);
file_put_contents('/www/sites/mydir/output.txt', 'Process complete');
?>
```

Touch a little file called output.txt and give it 777 permissions so that this will work, obviously in a directory you have access to. If you call for testpull.php from a browser, you'll see "all_done" almost immediately. Watching the directory where you have output.txt, 10 seconds later you'll see the file updated.

Using this tiny method, you could have a loop that fires off as many processes as you need (based on your Apache capabilities of course) from a single web request. If you then busy-wait for a signal, i.e., perhaps a database row is updated or something, then you'd know when all of the processes were done.

The last item in my requirements above is scalability. Why this is most interesting to me: if I have a tiny database of machine IPs that can handle this little request, then I could pull all the available machines from the database, randomize which one I'll hit and then send the requests to that machine... more precisely, I could round-robin through ALL of the available processors and send my "threads" to each one - effectively distributing my workload to any number of potential back-end processors.

(As soon as I get into this notion, there will be whole bunches of questions on how I load balance this, which I do - I have great little scripts to help me handle my back-end process load balancing, but that is also beyond the scope of this article.)

/p
DangerMouse
Interesting stuff Perk, nice post
Would something like this help even further, do you think? http://en.wikipedia.org/wiki/HTTP_pipelining and http://syndk8.net/forum/index.php/topic,13602.0.html
DM
perkiset
I am thinking through how pipelining would be implemented in my webRequest2 class... just haven't put it together yet because I haven't had a need. Have you looked at the class and considered doing a mod and sharing it? That'd be vurrah cool...
Thanks BTW,
/p
DangerMouse
I based a curl webrequest class I built largely on inspiration from your original, so I'm sure modifying it shouldn't be too tricky - more a case of working out suitable error checking and use cases.
I'm not really sure which parts of the header get repeated when pipelining, nor do I have experience of working at socket level, but I'll certainly look into it.
DM
perkiset
Update: I'm using this methodology for the aforementioned projects and it *sings*.
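For anyone who wants to experiment without webRequest2, here is a minimal sketch of the early-termination idea from the first post. It is not the webRequest2 code: readUntilMarker() and fireAndForget() are hypothetical helpers, and the port, timeout and chunk size are assumptions picked for illustration.

```php
<?php
// Sketch only: these helpers are illustrative, not part of webRequest2.

/**
 * Read from an open stream until $marker appears in the accumulated body,
 * the stream ends, or $maxBytes have been read.
 */
function readUntilMarker($stream, string $marker, int $maxBytes = 1048576): string
{
    $body = '';
    while (!feof($stream) && strlen($body) < $maxBytes) {
        $chunk = fread($stream, 1024);
        if ($chunk === false || $chunk === '') {
            break; // read error or nothing more to read
        }
        $body .= $chunk;
        // Search the whole body so a marker split across two chunks is still found.
        if ($marker !== '' && strpos($body, $marker) !== false) {
            break; // marker seen: stop reading, the caller closes the socket
        }
    }
    return $body;
}

/**
 * Send a plain HTTP GET and hang up as soon as $marker has arrived.
 * Assumes port 80 and a 5 second connect timeout.
 */
function fireAndForget(string $host, string $path, string $marker = 'all_done'): string
{
    $fp = fsockopen($host, 80, $errno, $errstr, 5.0);
    if ($fp === false) {
        throw new RuntimeException("Connect failed: $errstr ($errno)");
    }
    fwrite($fp, "GET $path HTTP/1.0\r\nHost: $host\r\nConnection: close\r\n\r\n");
    $body = readUntilMarker($fp, $marker);
    fclose($fp); // shut the socket; the remote script carries on with its work
    return $body;
}
```

Something like fireAndForget('mydomain', '/testresponse.php') would return the headers plus "all_done" and come back long before the 10 second sleep finishes. If the thread script produces more output after the marker, calling ignore_user_abort(true) at its top is a reasonable precaution so PHP does not stop it when the requestor disconnects.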
I am extremely pleased with its stability and speed - and since it does not involve *real* multithreading but rather the same sort of "paging" that Apache normally employs, logging is easy and clean.
vsloathe
Bah.
Wish I'd been around last week to tell you I've been working on something similar for the past 2 weeks - I could've been sharing what I've done.
perkiset
Great minds and all... glad to hear that it's worked out for you as well.
I'm finding that I love how stable it is under load. This, of course, makes perfect sense because it's Apache and all... but it still is delightful to see.
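The round-robin and busy-wait ideas from the first post could be sketched roughly like this. Everything here is an assumption for illustration: the function names, the example IPs and the ?task= query parameter are made up, and the "done" check is left as a callback so it can be a MySQL row count or anything else.

```php
<?php
// Sketch only: names, hosts and the ?task= parameter are hypothetical.

/**
 * Spread $taskCount tasks over the backends in round-robin order.
 * Returns one URL per task.
 */
function assignRoundRobin(array $hosts, int $taskCount, string $path): array
{
    if (count($hosts) === 0) {
        throw new InvalidArgumentException('Need at least one backend host');
    }
    $hosts = array_values($hosts); // make sure keys are 0..n-1
    $urls = [];
    for ($i = 0; $i < $taskCount; $i++) {
        $host = $hosts[$i % count($hosts)]; // wrap around the host list
        $urls[] = "http://$host$path?task=$i";
    }
    return $urls;
}

/**
 * Busy-wait until $isDone() returns true or $timeoutSec has passed.
 * Returns true if the work finished in time, false on timeout.
 */
function waitUntil(callable $isDone, float $timeoutSec, int $pollMicros = 250000): bool
{
    $deadline = microtime(true) + $timeoutSec;
    do {
        if ($isDone()) {
            return true;
        }
        usleep($pollMicros); // do not hammer the database between checks
    } while (microtime(true) < $deadline);
    return false;
}
```

The master script would fire each URL with an early-terminating request, then call waitUntil() with a callback that checks whether every "thread" row has been marked complete. Shuffling the host list before assigning gives the randomized starting point mentioned earlier.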
