nutballs

I am using perks webrequest class to grab amazon XML (no i cant use file_get_contents or curl). simple enough right?

I am adding line breaks to prevent the forum from blowing out...

$url = "http://ecs.amazonaws.com/onca/xml?Service=AWSECommerceService"
     . "&Operation=ItemLookup&AWSAccessKeyId=" . $GLOBALS['amazonapikey']
     . "&ItemId=" . $GLOBALS['asin']
     . "&ResponseGroup=Large&AssociateTag=" . $GLOBALS['amazontag'];
$req = new webRequest();
$response = $req->simpleGet($url);
echo $response;


so... wanna explain WTF this is about? The bold piece (the 5b4) is replacing some of the text, and happens at the end of a REALLY REALLY LONG STRING. I added breaks, but everything before that BOLD bit is actually all on one line. obviously i assume this has to do with the chunks, but not sure how to deal with it.

<?xml version="1.0" ?><ItemLookupResponse xmlns="http://webservices.amazon.com/AWSECommerceService/2005-10-05"><OperationRequest><HTTPHeaders><Header Name="UserAgent" Value="Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12"></Header></HTTPHeaders><RequestId>XXXXXXXXXXXXXXXXX</RequestId><Arguments><Argument Name="ItemId" Value="B000V5QV4S"></Argument><Argument Name="Operation" Value="ItemLookup"></Argument><Argument Name="Service" Value="AWSECommerceService"></Argument><Argument Name="AWSAccessKeyId" Value="XXXXXXXXXXXXX"></Argument><Argument Name="ResponseGroup" Value="Large"></Argument><Argument Name="AssociateTag" Value="amazonsite-20"></Argument></Arguments><RequestProcessingTime>0.0575670000000000</RequestProcessingTime>
</OperationRequest><Items><Request><IsValid>True</IsValid><ItemLookupRequest><Condition>New</Condition>
<DeliveryMethod>Ship</DeliveryMethod><IdType>ASIN</IdType><MerchantId>Amazon</MerchantId>
<OfferPage>1</OfferPage><ItemId>B000V5QV4S</ItemId><ResponseGroup>Large</ResponseGroup>
<ReviewPage>1</ReviewPage></ItemLookupRequest></Request><Item><ASIN>B000V5QV4S</ASIN>
<DetailPageURL>http://www.amazon.com
/gp/redirect.html%3FASIN=B000V5QV4S%26tag=XXXXXXXXXX-20%26lcode=xm2%26cID=2025%26ccmID=165953%26locat
5b4
ion=/o/ASIN/B000V5QV4S%253FSubscriptionId=XXXXXXXXXXXX</DetailPageURL>
<SalesRank>158</SalesRank><SmallImage><URL>http://ecx.images-amazon.com/images/I/41D03zRlZLL._SL75_.jpg</URL><Height

perkiset

Hey Nuts -

I am at Lake Mead (last 3 days) and will be home later tonight - I don't have a very good rig for debugging out here, so I'll prolly look into it tomorrow or so.

/p

nutballs

no problem. just my entire network is down is all... LOL

im gonna switch back to the previous version for the time being. Whenever you get a chance to look at it would be great.

perkiset

Will do mate, sorry for the inconv...

nutballs

Yes it's your fault. I feel good about it now. You should have psychically prevented me from deploying it to everything in 1 shot! Damn your lack of god-like psychic skills.

oh wait, I just press this button here...

ah rolled back to how it was. thanks to your prior suggestion for code deployment in a network. The downside to that method is I can just deploy total crap and not bug test it because after all, i can just roll it back... lol.

thedarkness

Nuts,

You could try using php-soap and accessing their SOAP API, it's similar to the XML response you are getting at the moment and I could give you some code to get you started if you want? May be a quick fix, see below and start looking through perk's class for anything that may output a hex number.

Why is it you can't use anything but perk's webrequest class (not that there is any reason not to use it besides the problem you are having)?

@perk: possible this is some debugging output for your chunking issue? That "5b4" looks like hex for approximately the amount of data that the request has pulled down up to that point (can't tell for sure because headers and cookies have been stripped and data has been changed to protect the guilty). That's my guess, something in there is reporting the amount of data received to that point, prolly the chunksize.

Cheers,
td

nutballs

thanks td. actually amazon is just an example. I need to use this for *feeds, some of which may not be xml, html, or anything remotely well-formed. read as, scraping, among other things. Also the hosts I am "wanting" to install this on generally have all the fun toys turned off. So... i cant get remote files via a consistent method. And in this case, consistency is king. 1 codeblock, deploy to *. since perks class uses sockets, they cant shut that off, or else websites wont answer requests.

I agree that its a chunk issue. it seems to happen at about the same length of each long line. yes you are right 5B4=1460
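A quick way to check that conversion, as a minimal sketch: PHP's built-in hexdec() turns the hex token into a decimal byte count. The variable names are only for illustration.

```php
<?php
// The stray token that showed up in the middle of the XML.
$token = '5b4';

// hexdec() accepts upper or lower case hex digits, so "5b4" and "5B4" agree.
$bytes = hexdec($token);

echo $token . ' hex = ' . $bytes . " bytes\n"; // prints "5b4 hex = 1460 bytes"
```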

here is the debug headers, which apparently i forgot to post...

simpleGet: Starts with [http://ecs.amazonaws.com/onca/xml?Service=AWSECommerceService&Operation=ItemLookup&AWSAccessKeyId=XXXXXXXXXXXX&ItemId=B000V5QV4S&ResponseGroup=Large&AssociateTag=XXXXXXXXX-20]
Outbound Header:
GET /onca/xml?Service=AWSECommerceService&Operation=ItemLookup&AWSAccessKeyId=XXXXXXXXXX&ItemId=B000V5QV4S&ResponseGroup=Large&AssociateTag=XXXXXXX-20 HTTP/1.1
Host: ecs.amazonaws.com
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.8.1.12) Gecko/20080201 Firefox/2.0.0.12
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding:
Accept-Charset: Accept-Charset:ISO-8859-1,utf-8;q=0.7,*;q=0.7
Connection: close

Content-Type: text/html
Content-Length: 0

Default beforeExecute()
Execute: Starts
Execute: HostStr=[ecs.amazonaws.com] Port:80
Execute: PHP 5 or greater, setting timeout of 30
Execute: Sending request
Execute: Request Sent
GetChunk: Starts
GetChunk: Received 1460
GetChunk: Received 1460
GetChunk: Received 1460
GetChunk: Received 1460
GetChunk: Received 1460
GetChunk: Received 1460
GetChunk: Received 1460
GetChunk: Received 1460
GetChunk: Received 1460
GetChunk: Received 1460
GetChunk: Received 1460
GetChunk: Received 1460
GetChunk: Received 1460
GetChunk: Received 2002
processHeaders: Starts
processHeaders: Result Code is 200
processHeaders: Array Follows
Array
(
    [Date] => Sun, 30 Mar 2008 01:42:48 GMT
    [Server] => Server
    [Content-Type] => text/xml;charset=UTF-8
    [Vary] => Accept-Encoding,User-Agent
    [Connection] => close
    [Transfer-Encoding] => chunked
)

processHeaders: Chunked Transfer - expected length is 1305
Execute: Successful Retrieve
postProcess: Content length is 20786
handleSuccess()
Default afterExecute()
Execute: Completes
Array
(
    [Date] => Sun, 30 Mar 2008 01:42:48 GMT
    [Server] => Server
    [Content-Type] => text/xml;charset=UTF-8
    [Vary] => Accept-Encoding,User-Agent
    [Connection] => close
    [Transfer-Encoding] => chunked
)

thedarkness

I'd say that that hex number is the chunk-size line for each chunk, which perk's class should be stripping but for some reason is not (apologies if I'm wrong perk). Maybe debugging output is turned on?
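For reference, here is a rough sketch of what stripping those markers could look like. In HTTP/1.1 chunked transfer encoding, each chunk is preceded by a line holding its size in hex, and a zero-size chunk ends the body. decodeChunkedBody() is a hypothetical helper written for this post, not code from perk's class, and it assumes the whole body is already in one string with the headers removed.

```php
<?php
/**
 * Decode an HTTP/1.1 "Transfer-Encoding: chunked" body.
 * Hypothetical helper, not part of perk's webRequest class.
 */
function decodeChunkedBody(string $body): string
{
    $decoded = '';
    $pos = 0;
    $len = strlen($body);

    while ($pos < $len) {
        // Each chunk starts with a hex size line terminated by CRLF.
        $lineEnd = strpos($body, "\r\n", $pos);
        if ($lineEnd === false) {
            break; // malformed or truncated input
        }
        $sizeLine = substr($body, $pos, $lineEnd - $pos);
        // Drop any optional chunk extension after ";".
        $sizeHex = trim(explode(';', $sizeLine)[0]);
        $size = (int) hexdec($sizeHex);
        if ($size === 0) {
            break; // a zero-length chunk marks the end of the body
        }
        // Copy exactly $size bytes of payload, then skip the CRLF after it.
        $decoded .= substr($body, $lineEnd + 2, $size);
        $pos = $lineEnd + 2 + $size + 2;
    }

    return $decoded;
}

// Hand-built example: two chunks ("Hello, " and "world") plus the terminator.
$chunked = "7\r\nHello, \r\n5\r\nworld\r\n0\r\n\r\n";
echo decodeChunkedBody($chunked), "\n"; // prints "Hello, world"
```

If a size line like "5b4" is left in the output, the decoder either never ran or lost track of where one chunk ends and the next begins.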

Use the source Luke. Or, post the source Luke, the bit that deals with chunks.

Cheers,
td

perkiset

Hey Nuts -

Looking at your debug log I'd say we should get on the phone... there're actually two types of chunks here - my internal packet chunking (based on the TCP stack of the server) and HTTP chunking, which is what the server is doing because of the size of the response. I'm not sure TD if you are correct, but that sure looks like a good notion... I'm gonna have to play with this and see if there is a piece of the HTTP chunking that I've gotten wrong in the pull.
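To make the two layers concrete, here is a small sketch using a made-up response. It is not how the webRequest class is written, just an illustration: str_split() stands in for repeated socket reads (20 bytes each here, where the log above shows 1460), and the HTTP framing is only inspected once the reads have been glued back together.

```php
<?php
// A made-up raw response: headers, then a chunked body with two chunks.
$raw = "HTTP/1.1 200 OK\r\n"
     . "Content-Type: text/xml;charset=UTF-8\r\n"
     . "Transfer-Encoding: chunked\r\n"
     . "\r\n"
     . "a\r\n<Item>abcd\r\n"   // "a" hex = 10 bytes of payload
     . "7\r\n</Item>\r\n"      // 7 bytes of payload
     . "0\r\n\r\n";            // terminating zero-size chunk

// Layer 1: socket reads. Each piece is whatever the TCP stack handed over,
// and its boundaries have nothing to do with the HTTP chunk boundaries.
$reads = str_split($raw, 20);

// Layer 2: HTTP framing. Reassemble first, because a header line or a
// chunk-size line can straddle two reads.
$buffer = implode('', $reads);
list($headerBlock, $body) = explode("\r\n\r\n", $buffer, 2);

$isChunked = stripos($headerBlock, 'Transfer-Encoding: chunked') !== false;

echo 'socket reads: ', count($reads), "\n";
echo 'chunked: ', $isChunked ? 'yes' : 'no', "\n";
echo 'first chunk-size line: ', strtok($body, "\r\n"), "\n";
```

The body still carries its size lines at this point, so the chunked decoding has to happen as a separate step after the reads are joined.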

If you are comfy with this, then email me the actual URL I'd need to use to send it and let me debug it here. I've some ideas, but kind of need my fingers on the pulse to get it.

/p

nutballs

PM'd

perkiset

Right then, I'll give'r a look in a bit, thx

