rjonesx

Anyone ever write a class that allows for posting to YouTube? I have been hacking at it for 2 weeks (tried cURL, Snoopy, and fsockopen hand coded) to no avail...

perkiset

Where is it failing? Is it trapping you / "catching you" and giving you an error msg or just not happening?

How I would tackle it if I was ever so inclined:  Applause

* I'd get a sniffer going on my machine - one that recombined multi-packet messages nicely and did an easy to read dump of everything in the packet. There are even a few out there that will do some interesting SSL interpretation for you.

* I'd clear all cookies/history from my machine - I'd even think about uninstalling Flash because there are ways of storing persistence there.

* Then I'd do a straight "Go to YouTube", login/post etc exactly as cleanly as I possibly could.

* In the packet logs will be the answer. The reason? There is no magic connection with your machine other than what is in those packets. There can be nothing more mysterious than cookies, hidden post variables and such... the only thing they could do to make things worse is to build some form of challenge-response mechanism in Javascript that your machine must figure out to send back as a hidden/cookie value at one point or another - but if your browser can figure it out, so can you. The bang for buck of figuring all this out: Aye, there's the rub.


Clearly this is a simple research project, right? You just want to do this just for the sake of automating an upload mechanism for your son's / daughter's school science project, right? Well then - if you do get around to grabbing those logs and there's something quizzical, I'm thinking that this is just the right place for us puzzlesters to take it apart...

/p

perkiset

Forgot to mention - any of the tools you listed above will be fine, although my personal favorite is to go right to fopen and do it all by hand because you have the ultimate level of control. Check in the PHP code repository for a WebRequest class I posted which gives you pretty much utter control of the packets.

/p
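To make the "do it all by hand" idea concrete, here is a minimal sketch in Python of assembling a raw form POST yourself. It is not the WebRequest class mentioned above; the function name `build_post_request`, the host, path, field names and cookie are all made up for illustration. The point is only that every header and every byte of the body is under your control, so you can match what the sniffer showed the browser sending.

```python
from urllib.parse import urlencode


def build_post_request(host, path, fields, cookies=None, user_agent="Mozilla/5.0"):
    """Return the raw bytes of an HTTP/1.1 form POST, assembled by hand.

    Hypothetical helper: header set and order are chosen for illustration
    and should be adjusted to match a captured browser request.
    """
    body = urlencode(fields).encode("ascii")
    lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        f"User-Agent: {user_agent}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body)}",
        "Connection: close",
    ]
    if cookies:
        # Cookies go out as a single header of name=value pairs.
        cookie_header = "; ".join(f"{k}={v}" for k, v in cookies.items())
        lines.append(f"Cookie: {cookie_header}")
    # A blank line separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


# Example with made-up values; nothing is sent over the network here.
raw = build_post_request(
    "example.com",
    "/login",
    {"user": "bob", "pass": "s3cret"},
    cookies={"sid": "abc123"},
)
print(raw.decode("ascii"))
```

The resulting bytes could be written to a plain socket (or, in PHP, to an fsockopen handle), which is what makes a side-by-side comparison with the sniffer dump straightforward.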

nop_90

quote author=perkiset link=topic=94.msg430#msg430 date=1177455737

Clearly this is a simple research project, right? You just want to do this just for the sake of automating an upload mechanism for your son's / daughter's school science project, right? Well then - if you do get around to grabbing those logs and there's something quizzical, I'm thinking that this is just the right place for us puzzlesters to take it apart...
/p


Applause Applause

perkiset

What? Whasso funny?  :Applause

vsloathe

cURL's cookiejar misses cookies set by javascript (sometimes?), and a lot of the "major" websites out there (Myspace, blogger, etc) have begun setting and more importantly deleting cookies via javascript. You can easily parse them out of the response headers if you know what to look for though. I've seen recently a lot of websites will set a dummy cookie value via javascript and delete it right before the actual POST (whether it be on a captcha page or whatever) to catch sloppy scripts. Make sure you look at the cookiejar your browser has vs. the cookiejar cURL (or snoopy, or whatever) has each step of the way.

Just my experience though, YMMV.
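A rough sketch of that jar comparison, written in Python rather than PHP purely for illustration. The helper names `cookies_from_headers` and `diff_jars`, and the cookie names `sid`, `trap` and `js_token`, are invented for the example. The first function pulls cookies out of raw response header lines; the second reports what the browser holds that the script does not, and vice versa.

```python
from http.cookies import SimpleCookie


def cookies_from_headers(header_lines):
    """Collect name -> value for every Set-Cookie line in a response."""
    jar = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        if name.strip().lower() != "set-cookie":
            continue
        parsed = SimpleCookie()
        parsed.load(value.strip())
        for key, morsel in parsed.items():
            jar[key] = morsel.value
    return jar


def diff_jars(browser_jar, script_jar):
    """Compare two name -> value dicts taken at the same step."""
    shared = set(browser_jar) & set(script_jar)
    return {
        # In the browser but not in the script: probably set by javascript.
        "missing": sorted(set(browser_jar) - set(script_jar)),
        # In the script but not in the browser: probably deleted by javascript.
        "extra": sorted(set(script_jar) - set(browser_jar)),
        "changed": sorted(k for k in shared if browser_jar[k] != script_jar[k]),
    }


# Made-up response headers as a script would see them.
headers = [
    "HTTP/1.1 200 OK",
    "Content-Type: text/html",
    "Set-Cookie: sid=abc123; Path=/",
    "Set-Cookie: trap=1; Path=/",
]
script_jar = cookies_from_headers(headers)

# Made-up browser state at the same step, copied by hand from the browser.
browser_jar = {"sid": "abc123", "js_token": "f00"}

report = diff_jars(browser_jar, script_jar)
print(report)
```

Anything listed under "extra" is a candidate for the dummy-cookie trick described above, and anything under "missing" has to be reproduced by the script before the POST.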

perkiset

Nice catch VS and clever of them.

mrsdf

I know a guy who knows a guy who did it with some browser scripting because it worked faster. Auto-it or something like that.

Browser automation in curl sucks but once you get it to work, it's great because it doesn't require a browser instance and you can run it remotely.

Clear the cookies in the browser, open liveheaders, go to the main page, and start doing stuff. The easiest way to spot js cookies is when the browser sends a request with a cookie that doesn't appear earlier as being set by the server replies.

When I do something like this I usually create the full account/whatever, I save the liveheaders log, and then start deleting from it anything that can't set a cookie: images, style sheets (take a look at those, after all they're just text). Look carefully when deleting them, you may see something like logo.gif that was received with a set cookie header. This will leave you with just the connections you need to make to get it working, and perhaps with some JS, hopefully not encrypted, that will have a simple cookie value algo. If you see the browser sent a cookie and you can't figure out where it was set, just start all over again. And again. And again... If FF can do it, curl can do it. Not by itself unfortunately. I also don't allow curl to auto-follow redirects, but that's just me...
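The "cookie sent but never set" check can be sketched like this in Python. It assumes the liveheaders log has already been reduced to an ordered list of exchanges, each with the cookie names the browser sent and the names the server set in its reply; that structure, the function name `find_unexplained_cookies`, and the sample URLs and cookie names are all invented for the example.

```python
def find_unexplained_cookies(exchanges):
    """Return (url, cookie_name) pairs for cookies the browser sent
    before any server reply in the log had set them."""
    set_by_server = set()
    reported = set()
    unexplained = []
    for ex in exchanges:
        for name in ex["sent"]:
            if name not in set_by_server and name not in reported:
                unexplained.append((ex["url"], name))
                reported.add(name)  # report each cookie only once
        # Cookies set by this reply are available to later requests.
        set_by_server.update(ex["set"])
    return unexplained


# Made-up log: note the image reply that quietly sets a cookie.
log = [
    {"url": "/", "sent": [], "set": ["sid"]},
    {"url": "/logo.gif", "sent": ["sid"], "set": ["img_seen"]},
    {"url": "/signup", "sent": ["sid", "img_seen", "js_check"], "set": []},
]

suspects = find_unexplained_cookies(log)
print(suspects)
```

Here `js_check` is the only cookie that shows up in a request without a matching server reply, so it is the one to hunt for in the page's javascript. `img_seen` is explained by the logo.gif reply, which is why that request has to stay in the trimmed log.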

perkiset

Nice protocol MrS... a nice blueprint for dissecting the comms.

Regarding browser automation with cURL - could you please talk about that a little? I am not entirely certain, but I might have an application that would be handsomely complemented by such a thing...

TIA,
/p

mrsdf

What I meant by browser automation was to make a script smart enough to simulate the browser. The ideal situation would be to input something like: 'go to domain.com; click register button; fill in form; click submit', or something as high level as possible. This means it should handle redirects (header, html meta, refresh, js), it should automatically identify forms, automatically get all that the browser gets (images, scripts, a lot of ajax stuff), handle cookies, even js ones and leave you with only a small script to write for each site.

What I got working: form detection, posting, some cookie handling, basic redirects, sending http headers in the same order as firefox does. Js proved to be a mess.. encryption, functions, I don't want to parse that.
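For a feel of what the form detection part involves, here is a small Python sketch using the standard library's `html.parser`. It is not the script described above; the class name `FormFinder` and the sample page are invented, and it only handles `<input>` fields (no `<select>` or `<textarea>`), which is an assumption made to keep it short.

```python
from html.parser import HTMLParser


class FormFinder(HTMLParser):
    """Collect each form's action, method and named input fields."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").lower(),
                "fields": {},
            }
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None:
            name = attrs.get("name")
            if name:
                # Hidden fields carry their value; visible ones start empty.
                self._current["fields"][name] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


# Made-up page with one registration form.
page = """
<html><body>
<form action="/register" method="POST">
  <input type="hidden" name="token" value="xyz">
  <input type="text" name="email">
  <input type="submit" value="Go">
</form>
</body></html>
"""

finder = FormFinder()
finder.feed(page)
print(finder.forms)
```

The hidden `token` field is picked up with its value, so a script can fill in `email` and post the whole field set back to the form's action. The unnamed submit button is skipped because it would not be sent anyway.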

After trying that on a few websites I figured it's almost impossible to handle all the things correctly, even if it worked for some sites, it parsed the forms, generated the posts, followed some redirects. So I tried parsing the liveheaders logs automatically but it's not really the way to go either because you'd have to guess where/how each post got generated, but it gives you a place to start from. Probably combining the log parsing with the above might do the trick.

Finally I hacked the source into a gb script and some non-generic site registration tools, because I never considered it got to a point where the curl browser could be used; for every new site it looked like I had to add a lot of stuff/rewrite the whole thing. I also have a friend who's tried something similar in C because some site kept changing form structure and links, he got it working to some degree.

perkiset

Gotcha I understand. I was hoping it was some kind of control over FF or IE or something...

... but sounds like a cool, if daunting, project anyway.

/p

JasonD

shhh don't tell everyone about this .......

http://developer.spikesource.com/wiki/index.php/Projects:TestGen4Web

and then use the bindings to replay (with variables etc) in your favourite language Applause

DangerMouse

There's also a project called Sahi out there that effectively does browser automation I believe.

DM

perkiset

quote author=JasonD link=topic=94.msg2748#msg2748 date=1185981817

shhh don't tell everyone about this .......


Hehe... I love the diagram. Nice JD, gonna have to look at that...

/p

georgiecasey

posting videos to youtube? jasond posted this link on another thread, http://www.phpclasses.org/browse/file/19567.html

if u get past all the ads and crap on phpclasses, the file is there somewhere

Karlo

quote author=perkiset link=topic=94.msg2745#msg2745 date=1185976609

Gotcha I understand. I was hoping it was some kind of control over FF or IE or something...

... but sounds like a cool, if daunting, project anyway.

/p

Regarding FF automation, I have become a great fan of greasemonkey recently.
Install an XVNC server on your linux box to run firefox.
Use Greasemonkey to do the dirty work.
It's still slow as hell. But it does the job for scraping those bot-protected thingys if you really need that data... Applause

Not sure about the "clear cookies" thing, this might require a "real" FF extension.

K.

perkiset

I've been really slow to look at greasemonkey... sounds like it's the cat's meow.

