.... Maybe triggered off of the RSS feed or something. To keep it simple, let's just say I wanted to scrape a flat URL with LAMP and just store all the HTML in a varchar field. How would I go about it?
Forget about a forum for just a moment and let's just talk about straight-up scraping. You'll use something like cURL, file_get_contents(), or the Web request class here in the PHP code repository to simply grab a page, parse it the way you want to, then store it in a database. Scraping websites is much like scraping travel systems, as we did so many years ago.
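To make the grab-and-store part concrete, here's a rough sketch using plain cURL and PDO. This is not the Web request class from the repository, just the bare built-in functions. The helper names fetch_page() and store_page(), and a table called pages with url and html columns, are all made up for illustration. One thing to watch: a MySQL VARCHAR tops out at 65,535 bytes, so for whole pages a TEXT or MEDIUMTEXT column is probably the safer choice.

```php
<?php
// Sketch only: fetch_page() and store_page() are hypothetical helper names,
// and the `pages` table (url, html) is an assumed layout.

/** Fetch a URL with cURL and return the response body, or null on failure. */
function fetch_page(string $url): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow HTTP redirects
        CURLOPT_TIMEOUT        => 30,   // give up after 30 seconds
    ]);
    $html = curl_exec($ch);
    return $html === false ? null : $html;
}

/** Insert one scraped page using a prepared statement. */
function store_page(PDO $db, string $url, string $html): void
{
    $stmt = $db->prepare('INSERT INTO pages (url, html) VALUES (?, ?)');
    $stmt->execute([$url, $html]);
}

// Example usage (connection details are placeholders):
// $db   = new PDO('mysql:host=localhost;dbname=scraper;charset=utf8mb4', 'user', 'pass');
// $html = fetch_page('http://www.example.com/');
// if ($html !== null) {
//     store_page($db, 'http://www.example.com/', $html);
// }
```

The prepared statement matters here because scraped HTML is full of quotes and other characters that would break a hand-built INSERT string.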
Are there scrapers that accept cookies, etc., as if you were a browser, the way the Internet Explorer ActiveX control used to do?
VS is right, cookies are simply name-value pairs that are exchanged between a server and a client. They can be used for all sorts of purposes, not the least of which is to remember a session ID, much like this forum or any retail site does. If you look at the web requests class you'll see how I handle cookies, both sending them to a server and storing them locally. Cookies really don't have anything to do with scraping unless the server requires that you have a session ID when you request a page.
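If you'd rather not dig through the class, here's a minimal sketch of the same idea with plain cURL. It is not how the web requests class does it, and parse_set_cookie() and fetch_with_cookies() are hypothetical names. The first function shows that a cookie really is just a name and a value; the second lets cURL keep a cookie jar file so a session ID from one request gets sent back on the next.

```php
<?php
// Sketch only: both helpers are hypothetical and not part of any library.

/** Split a Set-Cookie header value into [name, value], ignoring attributes. */
function parse_set_cookie(string $header): array
{
    // "PHPSESSID=abc123; path=/; HttpOnly": the first segment is the pair itself
    $pair = explode(';', $header, 2)[0];
    [$name, $value] = array_pad(explode('=', $pair, 2), 2, '');
    return [trim($name), trim($value)];
}

/** Fetch a URL, letting cURL store received cookies in $jar and resend them. */
function fetch_with_cookies(string $url, string $jar): ?string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_COOKIEFILE     => $jar, // send cookies previously saved in this file
        CURLOPT_COOKIEJAR      => $jar, // save received cookies when the handle is released
    ]);
    $html = curl_exec($ch);
    return $html === false ? null : $html;
}

// Example usage (URLs and jar path are placeholders):
// $jar = sys_get_temp_dir() . '/scraper_cookies.txt';
// fetch_with_cookies('http://www.example.com/login', $jar);        // picks up the session cookie
// $html = fetch_with_cookies('http://www.example.com/page', $jar); // sends it back
```

Using the same file for both options is what makes the second request look like the same visitor as the first.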