
rjonesx
Anyone ever write a class that allows for posting to YouTube? I have been hacking at it for 2 weeks (tried cURL, Snoopy, and fsockopen hand coded) to no avail...
perkiset
Where is it failing? Is it trapping you / "catching you" and giving you an error msg or just not happening?
How I would tackle it if I was ever so inclined:
* I'd get a sniffer going on my machine - one that recombined multi-packet messages nicely and did an easy-to-read dump of everything in the packet. There are even a few out there that will do some interesting SSL interpretation for you.
* I'd clear all cookies/history from my machine - I'd even think about uninstalling Flash because there are ways of storing persistence there.
* Then I'd do a straight "Go to YouTube", login/post etc. exactly as cleanly as I possibly could.
* In the packet logs will be the answer.

The reason? There is no magic connection with your machine other than what is in those packets. There can be nothing more mysterious than cookies, hidden post variables and such... the only thing they could do to make things worse is to build some form of challenge-response mechanism in Javascript that your machine must figure out to send back as a hidden/cookie value at one point or another - but if your browser can figure it out, so can you.

The bang for buck of figuring all this out: Aye, there's the rub.

Clearly this is a simple research project, right? You just want to do this just for the sake of automating an upload mechanism for your son's / daughter's school science project, right? Well then - if you do get around to grabbing those logs and there's something quizzical, I'm thinking that this is just the right place for us puzzlesters to take it apart...
/p
perkiset
Forgot to mention - any of the tools you listed above will be fine, although my personal favorite is to go right to fopen and do it all by hand because you have the ultimate level of control. Check in the PHP code repository for a WebRequest class I posted which gives you pretty much utter control of the packets.
/p
nop_90
Quote from perkiset:
"Clearly this is a simple research project, right? You just want to do this just for the sake of automating an upload mechanism for your son's / daughter's school science project, right? Well then - if you do get around to grabbing those logs and there's something quizzical, I'm thinking that this is just the right place for us puzzlesters to take it apart... /p"
perkiset
What? Whasso funny?
vsloathe
cURL's cookiejar misses cookies set by javascript (sometimes?), and a lot of the "major" websites out there (Myspace, Blogger, etc.) have begun setting and, more importantly, deleting cookies via javascript. You can easily parse them out of the response headers if you know what to look for though. Recently I've seen a lot of websites set a dummy cookie value via javascript and delete it right before the actual POST (whether it be on a captcha page or whatever) to catch sloppy scripts. Make sure you look at the cookiejar your browser has vs. the cookiejar cURL (or Snoopy, or whatever) has each step of the way.

Just my experience though, YMMV.
perkiset
Nice catch VS and clever of them.
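vsloathe's advice to compare the browser's cookies with the script's at each step can be sketched roughly as follows. This is a minimal illustration in Python rather than cURL/PHP, using only the standard library; the helper names (`parse_cookie_header`, `diff_cookies`) and the sample header values are hypothetical, and the inputs are assumed to be raw `Cookie:` header values captured at the same point in the same flow.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(header_value):
    """Turn a raw 'Cookie:' header value into a {name: value} dict."""
    jar = SimpleCookie()
    jar.load(header_value)
    return {name: morsel.value for name, morsel in jar.items()}


def diff_cookies(browser_header, script_header):
    """Report how the script's cookies differ from the browser's."""
    browser = parse_cookie_header(browser_header)
    script = parse_cookie_header(script_header)
    return {
        # Cookies the browser sent but the script did not (e.g. set by JS).
        "missing_in_script": sorted(set(browser) - set(script)),
        # Cookies the script still sends but the browser dropped
        # (e.g. a dummy cookie deleted by JS before the POST).
        "extra_in_script": sorted(set(script) - set(browser)),
        # Same name, different value.
        "different_value": sorted(
            name for name in set(browser) & set(script)
            if browser[name] != script[name]
        ),
    }


# Hypothetical values, made up for illustration only.
browser_sent = "session=abc123; js_check=1; lang=en"
script_sent = "session=abc123; lang=de; stale=1"

report = diff_cookies(browser_sent, script_sent)
print(report)
```

Running the same comparison after every request in the flow narrows down the step at which the two jars first diverge.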
mrsdf
I know a guy who knows a guy who did it with some browser scripting because it worked faster. Auto-it or something like that.
Browser automation in curl sucks but once you get it to work, it's great because it doesn't require a browser instance and you can run it remotely. Clear the cookies in the browser, open liveheaders, go to the main page, and start doing stuff. The easiest way to spot js cookies is when the browser sends a request with a cookie that doesn't appear earlier as being set by the server replies.

When I do something like this I usually create the full account/whatever, save the liveheaders log, and then start deleting from it anything that can't set a cookie: images, style sheets (take a look at those, after all they're just text). Look carefully when deleting them, you may see something like logo.gif that was received with a Set-Cookie header. This will leave you with just the connections you need to make to get it working, and perhaps with some JS, hopefully not encrypted, that will have a simple cookie value algo. If you see the browser sent a cookie and you can't figure out where it was set, just start all over again. And again. And again...

If FF can do it, curl can do it. Not by itself unfortunately. I also don't allow curl to auto-follow redirects, but that's just me...
perkiset
Nice protocol MrS... a nice blueprint for dissecting the comms.
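The check mrsdf describes, spotting a cookie the browser sends that no earlier server reply set, might be sketched like this. It is a rough Python illustration under stated assumptions: the log is assumed to have already been reduced to an ordered list of dicts with hypothetical keys `url`, `cookie` (the request's `Cookie:` value) and `set_cookie` (a list of `Set-Cookie:` values from the reply). It does not parse the actual liveheaders log format, and all names and sample data are made up.

```python
from http.cookies import SimpleCookie


def cookie_names(header_value):
    """Return the cookie names in a Cookie or Set-Cookie header value."""
    jar = SimpleCookie()
    jar.load(header_value)
    return set(jar.keys())


def find_unexplained_cookies(exchanges):
    """Walk request/response pairs in order and flag each cookie that the
    browser sent before any server reply had set it (likely a JS cookie)."""
    known = set()
    findings = []
    for exchange in exchanges:
        # The request goes out first, so check it against what is known so far.
        sent = cookie_names(exchange.get("cookie", ""))
        for name in sorted(sent - known):
            findings.append((exchange["url"], name))
            known.add(name)  # report each unexplained cookie only once
        # Then record whatever this reply set, including replies for images.
        for header in exchange.get("set_cookie", []):
            known |= cookie_names(header)
    return findings


# Hypothetical, hand-written log used only to illustrate the idea.
exchanges = [
    {"url": "http://example.com/", "cookie": "",
     "set_cookie": ["sid=42; Path=/"]},
    {"url": "http://example.com/logo.gif", "cookie": "sid=42",
     "set_cookie": ["img=1; Path=/"]},
    {"url": "http://example.com/register", "cookie": "sid=42; img=1; token=9f",
     "set_cookie": []},
]

for url, name in find_unexplained_cookies(exchanges):
    print(f"{name} first sent to {url} without a prior Set-Cookie")
```

In this made-up log the `img` cookie is explained by the reply to `logo.gif`, which is why image requests should not be pruned blindly, while `token` has no server-side origin and would have to come from script on the page.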
Regarding browser automation with cURL - could you please talk about that a little? I am not entirely certain, but I might have an application that would be handsomely complemented by such a thing... TIA,
/p
mrsdf
What I meant by browser automation was to make a script smart enough to simulate the browser. The ideal situation would be to input something like: 'go to domain.com; click register button; fill in form; click submit', or something as high level as possible. This means it should handle redirects (header, html meta, refresh, js), it should automatically identify forms, automatically get all that the browser gets (images, scripts, a lot of ajax stuff), handle cookies, even js ones, and leave you with only a small script to write for each site.

What I got working: form detection, posting, some cookie handling, basic redirects, sending http headers in the same order as Firefox does. Js proved to be a mess... encryption, functions, I don't want to parse that. After trying that on a few websites I figured it's almost impossible to handle all the things correctly, even if it worked for some sites: it parsed the forms, generated the posts, followed some redirects. So I tried parsing the liveheaders logs automatically, but it's not really the way to go either because you'd have to guess where/how each post got generated, but it gives you a place to start from. Probably combining the log parsing with the above might do the trick.

Finally I hacked the source into a gb script and some non-generic site registration tools, because I never considered it got to a point where the curl browser could be used; for every new site it looked like I had to add a lot of stuff/rewrite the whole thing. I also have a friend who's tried something similar in C because some site kept changing form structure and links; he got it working to some degree.
perkiset
Gotcha I understand. I was hoping it was some kind of control over FF or IE or something...
... but sounds like a cool, if daunting, project anyway.
/p
JasonD
shhh don't tell everyone about this .......
http://developer.spikesource.com/wiki/index.php/Projects:TestGen4Web and then use the bindings to replay (with variables etc.) in your favourite language
DangerMouse
There's also a project called Sahi out there that effectively does browser automation, I believe.
DM
perkiset
Quote from JasonD:
"shhh don't tell everyone about this ......."
Hehe... I love the diagram. Nice JD, gonna have to look at that...
/p
georgiecasey
posting videos to youtube? jasond posted this link on another thread, http://www.phpclasses.org/browse/file/19567.html if u get past all the ads and crap on phpclasses, the file is there somewhere
Karlo
Quote from perkiset:
"Gotcha I understand. I was hoping it was some kind of control over FF or IE or something... ... but sounds like a cool, if daunting, project anyway. /p"
Regarding FF automation, I have become a great fan of Greasemonkey recently. Install an XVNC server on your Linux box to run Firefox. Use Greasemonkey to do the dirty work. It's still slow as hell. But it does the job for scraping those bot-protected thingys if you really need that data... Not sure about the "clear cookies" thing, this might require a "real" FF extension.
K.
perkiset
I've been really slow to look at greasemonkey... sounds like it's the cat's meow.