The Cache: Technology Expert's Forum
 
Author Topic: Sockets or LWP modules, Which is Faster?  (Read 5185 times)
Bompa
« on: July 13, 2008, 10:38:34 PM »

It's not easy getting information on something like IO::Socket.

Most just LOVE to say "why use sockets?  Why reinvent the wheel?  Why not use blahblah.pm module?"

Maybe cuz I WANT TO!!!! damn it!!!

:D

Anyways, I figure sockets must be quicker over the net.

I'm not talking about getting a few pages here or there, or logging in somewhere.

I'm talking about 10 or 20 thousand requests/hour.

Sockets must be quicker, but probably only by a few milliseconds.

This post was inspired by emonk's question here
http://www.perkiset.org/forum/php/http_pipelining_and_proxy_servers-t613.0.html;msg4148#msg4148

@emonk, while researching, I read that proxy servers don't like to do this, but you probably read the same
thing by now.


bye,
storm here,
Bompa


"The most beautiful and profound emotion we can experience is the sensation of the mystical..." - Albert Einstein
perkiset
« Reply #1 on: July 13, 2008, 10:43:55 PM »

Hey Bomps -

At the core, everything is sockets, so you won't gain a lot (or, frankly, much of anything) by going that low-level unless there is a protocol or mechanism that is not supported by another means. I assume that this has something to do with your searching on the Syndk8 for the pipelining thread... is it possible to outline your intentions a bit more than just your mechanical desires? Perhaps we could suggest a path.

All that said, I, probably more than most here, totally get going to the base and working my way up; however, you will be incurring a lot of protocol headaches by building your own comms.

It is now believed, that after having lived in one compound with 3 wives and never leaving the house for 5 years, Bin Laden called the U.S. Navy Seals himself.
dirk
« Reply #2 on: July 14, 2008, 02:05:09 PM »

Hi Bompa,

Using sockets avoids some overhead and is a bit faster.

That's OK for simple GETs. But following a 302 redirect is much
easier with LWP.

So normally I think that LWP is the better choice.
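
To make the trade-off concrete, here is a minimal sketch of a plain GET over IO::Socket::INET. The helper names (`build_get`, `parse_response`, `fetch_raw`) and the 10-second connect timeout are made up for illustration; it speaks HTTP/1.0 so the server closes the connection and the response can be read until EOF. It does no chunked decoding, no HTTPS, and no redirect handling.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Build a minimal HTTP/1.0 GET request. HTTP/1.0 is used so the server
# closes the connection when it is done and we can read until EOF.
sub build_get {
    my ($host, $path) = @_;
    return "GET $path HTTP/1.0\r\nHost: $host\r\nConnection: close\r\n\r\n";
}

# Split a raw response into (status code, hashref of lower-cased headers, body).
sub parse_response {
    my ($raw) = @_;
    my ($head, $body) = split /\r\n\r\n/, $raw, 2;
    my ($status_line, @lines) = split /\r\n/, $head;
    my ($code) = $status_line =~ m{^HTTP/\d\.\d\s+(\d{3})};
    my %headers;
    for my $line (@lines) {
        my ($name, $value) = $line =~ /^([^:]+):\s*(.*)$/ or next;
        $headers{lc $name} = $value;
    }
    return ($code, \%headers, defined $body ? $body : '');
}

# Hypothetical helper: connect, send the request, read everything, parse it.
sub fetch_raw {
    my ($host, $path, $port) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port || 80,
        Proto    => 'tcp',
        Timeout  => 10,              # connect timeout only (assumed value)
    ) or die "connect to $host failed: $@";
    print {$sock} build_get($host, $path);
    my $raw = do { local $/; <$sock> };   # slurp until the server closes
    close $sock;
    return parse_response($raw);
}
```

With this approach a 302 simply comes back as `$code == 302` plus a `location` header, and calling `fetch_raw` again on the new URL (and guarding against redirect loops) is up to you. LWP::UserAgent follows redirects for GET requests by default.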

But sometimes the Socket module is very useful, for example to resolve
a given IP address to its host name.

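Code:
use strict;
use warnings;
use Socket;    # exports inet_aton and AF_INET
...
my $addr = inet_aton($ip);                  # pack the dotted-quad IP into 4 bytes
my $host = gethostbyaddr($addr, AF_INET);   # reverse lookup; undef if nothing is found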
perkiset
« Reply #3 on: July 14, 2008, 02:45:34 PM »

Dirk -

In PHP (for example), I don't find that sockets are any faster (on the whole) than most standard protocols, and in fact, I can really mess things up by using sockets rather than all-C services like cURL ... unless I have a specific need. Why is it that in Perl, going to sockets would be faster and save overhead?
dirk
« Reply #4 on: July 14, 2008, 03:00:59 PM »

Ed,

the LWP overhead is caused by loading more stuff (modules).
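
One way to put a rough number on that loading cost is to time the `require` itself. This is only a sketch: `load_time` is a made-up helper, it measures module loading and nothing about network speed, and the result depends on the machine and on what is already loaded.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Return the seconds spent loading $module, or undef if it cannot be loaded.
# A module that is already loaded costs almost nothing, so use a fresh
# process for each module you want to measure.
sub load_time {
    my ($module) = @_;
    return undef unless $module =~ /^\w+(?:::\w+)*$/;   # plain module names only
    my $start = time();
    my $ok    = eval "require $module; 1";
    return $ok ? time() - $start : undef;
}

# Module name comes from the command line; the default is just an example.
my $module = @ARGV ? $ARGV[0] : 'IO::Socket::INET';
my $t      = load_time($module);
print defined $t
    ? sprintf("%s loaded in %.1f ms\n", $module, $t * 1000)
    : "$module could not be loaded\n";
```

Run it once with `IO::Socket::INET` and once with `LWP::UserAgent` as the argument. Since `require` loads a module only once per process, this is a startup cost, not a per-request cost.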

I have no exact speed data. It was only an impression.
And it's not easy to run a benchmark because too many
other factors are involved, like the network connection, response
times and so on.

There is an LWP timeout issue. Sometimes LWP hangs and hangs
and the script can't kill such processes. This is also an advantage
of sockets, where you can simply kill a connection after a defined
time.
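
A minimal sketch of that "kill it after a defined time" idea, assuming you already hold a connected socket: IO::Select waits for data but never longer than the time left on an overall deadline. The function name `read_with_deadline` and the 4096-byte chunk size are made up for illustration.

```perl
use strict;
use warnings;
use IO::Select;
use Time::HiRes qw(time);

# Read from $sock until EOF, or give up once $max_seconds have elapsed in total.
# Returns ($data, $timed_out).
sub read_with_deadline {
    my ($sock, $max_seconds) = @_;
    my $deadline = time() + $max_seconds;
    my $select   = IO::Select->new($sock);
    my $data     = '';

    while (1) {
        my $remaining = $deadline - time();
        return ($data, 1) if $remaining <= 0;       # out of time: caller closes the socket

        # Wait for readable data, but never longer than the time we have left.
        my @ready = $select->can_read($remaining);
        next unless @ready;

        my $n = sysread($sock, my $chunk, 4096);
        die "read error: $!" unless defined $n;
        return ($data, 0) if $n == 0;               # EOF: the peer closed the connection
        $data .= $chunk;
    }
}
```

Typical use would be `my ($data, $timed_out) = read_with_deadline($sock, 15); close $sock;` so a slow or stalled server costs at most 15 seconds per request.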

This is crucial if you want to spider hundreds of thousands of pages per day.


nop_90
« Reply #5 on: July 14, 2008, 03:06:23 PM »

This topic and the post by vsloathe kind of overlap with the part where I was hinting at why I build on top of libcurl.
The speed of a socket is the same regardless of language (they are all built on top of OS sockets).

The killer is the allocation of strings (memory buffers etc.; in Python every block of memory is called a string).
I found this out the hard way.

A bit of background.
The company I was working for wanted to build a VoIP system using the Speex codec.
Wisdom at the time said it had to be done in C/C++ or else you could not get the speed.
Anyway, I made it in Python with a C/C++ module. It was just as quick as the C/C++ version.
And I will explain how I did it.

Basically, in a nutshell, you have
UDP packets -> internal stack -> decoder -> output, and for transmission the exact opposite.

I will just illustrate with the internal stack: you have to line up packets before they can go into the decoder.
So you may have to start from, let's say, byte 3, and you have to join strings etc.
In every memory-managed language, behind the scenes it will create a new string of the proper size and destroy the old one.
So the speed loss is due to memory allocation and garbage collection (in most cases not a problem, but it is when you are pulling in 100+ packets a second).

Anyway, I used an approach similar to GStreamer (at the time GStreamer did not exist).
When UDP packets arrived, I put them in my own memory structure.
Then when they went to the internal stack, it just passed the pointers.
Same story when they went to the decoder etc.
So I just used Python like glue :).

Back to topic :)
So take the case of parsing an HTTP header.
When you parse the header, for each section the language is allocating a new string behind the scenes (malloc is expensive).
And worse yet, when you are done with it, the string has to be destroyed.
That is because in a scripting language you cannot access memory directly (not without doing some hacks :)).
If you access memory directly you can get a tremendous speed gain, but if something goes wrong you will bring down the entire VM.
Or worse yet, if you make a mistake you can corrupt other parts of it, and you get weird errors which are very hard to trace.
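
Staying inside the scripting language, you can at least cut down on the intermediate strings. The sketch below is in Perl (the language of the original question) and is only an illustration: `slurp_into` and `header_value` are made-up names, and Perl still manages the memory itself. It reads straight into one growing buffer using the 4-argument form of `sysread`, then pulls out a single header by scanning that buffer in place, so the header block is never split into a list of lines.

```perl
use strict;
use warnings;

# Append everything readable from $fh onto the end of $$buf_ref.
# The 4-argument sysread writes at the given offset inside the existing
# scalar, so no temporary chunk string is created and then concatenated.
sub slurp_into {
    my ($fh, $buf_ref) = @_;
    while (1) {
        my $n = sysread($fh, $$buf_ref, 8192, length $$buf_ref);
        die "read error: $!" unless defined $n;
        last if $n == 0;                         # EOF
    }
    return length $$buf_ref;
}

# Return the value of one header by scanning the buffer in place.
# Only the value that was asked for is copied out of the buffer.
sub header_value {
    my ($buf_ref, $name) = @_;
    my $end = index($$buf_ref, "\r\n\r\n");      # end of the header block
    return if $end < 0;                          # headers not complete yet
    if ($$buf_ref =~ /^\Q$name\E:[ \t]*([^\r\n]*)/mi && $-[0] < $end) {
        return $1;
    }
    return;                                      # header not present
}
```

The buffer is passed around by reference (`\$buf`), which is the Perl equivalent of "just pass the pointers": the response is never copied between subs.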





