The Cache: Technology Expert's Forum
 
Author Topic: Fantomaster spiderSpy download and parser  (Read 17533 times)
perkiset
« on: April 17, 2008, 06:47:05 PM »

Here is a little routine I use to pull down the spiderSpy table and update my MySQL database. It is designed to be run from a command prompt (I run it via a cron job), but it could be activated via a web call, provided your Apache will let a fairly long process run (it has taken as long as 3 minutes for me to get the whole thing at times). The Apache daemon would also need write access to your botbase.download file, although that file is simply a backup mechanism and not required.

Note that as it is written, it requires my class.dbconnection.php library which is also available here in the PHP repository.

Enjoy!
/p

Code:
<?php

/*
This little routine pulls the spiderSpy database down from fantomaster
line by line, and if it is an address line, updates a table in a database.
It then removes any spider records that have been removed from their list.

The table that this routine expects can be built with the following SQL:

CREATE TABLE spiders (
  address varchar(16) NOT NULL,
  lupdate datetime NOT NULL,
  `engine` varchar(128) NOT NULL,
  PRIMARY KEY  (address)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

*/

$classPath = '/www/sites/lib/classes';
$botFile = '/www/resources/botbase.download';
$fantomasterURL = 'http://userandpassword@fantomaster.com/dardanelles/registerdb/fabotbasecsv_xxl.cgi';
$dbHost = '127.0.0.1';
$dbUser = 'theuser';
$dbPass = 'thepass';
$dbName = 'thedatabase';

// Start with an empty backup file
file_put_contents($botFile, '');
$search = array('"', "\n", "\r");
$now = date('Y-m-d H:i:s', time());

require("$classPath/class.dbconnection.php");
$db = new dbConnection($dbHost, $dbUser, $dbPass, $dbName);

if (($handle = fopen($fantomasterURL, 'r')) === FALSE)
    die ('Cannot open Fantomaster');

while ($thisLine = fgets($handle))
{
    // Only lines that contain a quoted IP address are data lines
    if (preg_match('/[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"/', $thisLine))
    {
        file_put_contents($botFile, $thisLine, FILE_APPEND);
        echo '.';

        $parts = explode(',', $thisLine);
        // Both values go into the SQL below, so both are escaped
        $engine = mysql_escape_string(str_replace($search, '', $parts[0]));
        $address = mysql_escape_string(str_replace($search, '', $parts[3]));

        // Here's the google re-inclusion - do this only
        // if you are comfortable cloaking Google...
        if (preg_match('/google/i', $engine))
            $address = str_replace('#', '', $address);

        // Addresses that still start with '#' are skipped
        if (substr($address, 0, 1) <> '#')
            $db->query("replace into spiders(address, lupdate, engine) values('$address', '$now', '$engine')");

    } else echo "x";
}

fclose($handle);

// Anything not touched in this run is no longer on their list
$db->query("delete from spiders where lupdate<'$now'");

?>
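
A minimal sketch of just the line-parsing step from the routine above, pulled out into a function so it can be tried without a database or a download. The function name parseSpiderLine() and the sample lines are hypothetical. The column positions (engine in field 0, address in field 3) and the '#' prefix on skipped addresses are taken from the script, not from any spiderSpy documentation.

```php
<?php
// Hypothetical helper, not part of the original routine.
// Returns array('engine' => ..., 'address' => ...) for a usable line,
// or null for non-address lines and for '#'-prefixed addresses.
function parseSpiderLine($line, $includeGoogle = false)
{
    // Same test as the routine: the line must contain a quoted dotted quad
    if (!preg_match('/[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"/', $line)) {
        return null;
    }

    $strip = array('"', "\n", "\r");
    $parts = explode(',', $line);
    if (count($parts) < 4) {
        return null;
    }

    $engine  = str_replace($strip, '', $parts[0]);
    $address = str_replace($strip, '', $parts[3]);

    // Optional Google re-inclusion, as in the routine
    if ($includeGoogle && preg_match('/google/i', $engine)) {
        $address = str_replace('#', '', $address);
    }

    if (substr($address, 0, 1) === '#') {
        return null;
    }

    return array('engine' => $engine, 'address' => $address);
}

// Made-up sample lines, only to show the assumed shape of the data
$samples = array(
    '"ExampleBot","a","b","192.0.2.10"' . "\n",
    '"Googlebot","a","b","#192.0.2.11"' . "\n",
    "some header line\n",
);
foreach ($samples as $sample) {
    $row = parseSpiderLine($sample);
    echo ($row === null) ? "skipped\n" : "{$row['engine']} {$row['address']}\n";
}
```

Keeping the parsing separate from the database calls makes it easier to check what a changed feed format would do before the delete at the end of the routine runs.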

« Last Edit: April 19, 2008, 03:09:32 PM by perkiset »

It is now believed, that after having lived in one compound with 3 wives and never leaving the house for 5 years, Bin Laden called the U.S. Navy Seals himself.
dink
« Reply #1 on: April 17, 2008, 08:46:16 PM »

Short, sweet, and to the point.  As usual.

Thanks Perk.

[quote Nutballs]
the universe has a giant fist, and its got enough whoop ass for everyone.
[/quote]
craw
« Reply #2 on: April 18, 2008, 01:03:52 AM »

thanks

dbrown
« Reply #3 on: July 17, 2008, 06:04:54 PM »

This script will not work on PHP4, simply because file_put_contents() does not exist there.

You can add this to the top of the code to make it work on both PHP4 and PHP5:

Code:
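// PHP4 has no FILE_APPEND constant either; 8 is the value PHP5 uses
if (!defined('FILE_APPEND')) {
    define('FILE_APPEND', 8);
}

if (!function_exists('file_put_contents')) {
    function file_put_contents($filename, $data, $flags = 0) {
        // Append when FILE_APPEND is set, otherwise truncate and write
        $fp = fopen($filename, ($flags & FILE_APPEND) ? 'a' : 'w');
        if (!$fp) {
            trigger_error('file_put_contents cannot write in file.', E_USER_ERROR);
            return false;
        }
        $bytes = fwrite($fp, $data);
        fclose($fp);
        return $bytes;
    }
}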

perkiset
« Reply #4 on: July 17, 2008, 06:10:26 PM »

Nice add D, thanks!

nchaos
« Reply #5 on: September 27, 2008, 06:57:25 AM »

hi

My first post here, although I registered some time ago but forgot all my details. I've been lurking here from time to time, but this place is way too advanced for me.
I just spent the whole of last night on this routine trying to make it work for me, and happily I made it! Now I need to find out how to use it with some PHP. It will take me another few days I guess, lol.

Anyway, just wanted to say hello and thanks!

No links in signatures please
perkiset
« Reply #6 on: September 27, 2008, 01:26:26 PM »

Welcome to the Cache NChaos and congrats on getting the script to work, that's always a great feeling.

Don't hesitate to post your questions, that's a lot of the reason for this board.

Cheers,
/p

Bompa
« Reply #7 on: September 27, 2008, 11:13:09 PM »

[quote nchaos]
hi

My first post here, although I registered some time ago but forgot all my details. I've been lurking here from time to time, but this place is way too advanced for me.
[/quote]

Ditto

"The most beautiful and profound emotion we can experience is the sensation of the mystical..." - Albert Einstein
nchaos
« Reply #8 on: September 28, 2008, 10:09:12 AM »

hi perk!

Ok, so after another few hours of searching I found some PHP code on a blog that I think will work with the spiders db I created using your routine. I didn't have time to test it properly but it seems to work (it worked last night when I tested it against my own IP, which I put into the spiders db).
The code is as follows:

newbiecloak.php

<?php
// Lets connect to MySQL
mysql_connect('localhost', 'username', 'password');
mysql_select_db('database');
// First we grab the visitors IP
$visitorIP = $_SERVER['REMOTE_ADDR'];
// Then we search the database for the IP
$result = mysql_query("SELECT * FROM spiders WHERE address = '$visitorIP'");
$dbip = mysql_fetch_array($result);
$theip = $dbip['0'];
// We check if there is an instance
if(strlen($theip) > 0) {
echo 'uber SE content';
}else{
echo 'standard page';
}
?>

I know it's pretty basic, but I'm happy anyway that I got it to work (not the redirection though, it only prints "uber SE content"). Now, if I'm not mistaken, I need to modify my htaccess, as I want to pass every request to my site through my newbiecloak.php script. Is that correct? In that case my htaccess should look like this:

RewriteEngine on
Options +FollowSymlinks
RewriteBase /
RewriteRule  ^(.*)$   newbiecloak.php

I haven't tested it yet as I'm away from home, so I don't know if it works.
Anyway, if the example above does what I think it does, I simply need to put the header function into:

if(strlen($theip) > 0) {
    header('Location: http://www.myspidercontent.com');
}else{
    header('Location: http://www.myhumancontent.com');
}
?>

In the above example I'm 301 redirecting humans as well as spiders, correct? What if I want to simply leave spiders on the page they came in on? So when the script detects a spider it doesn't do any redirect, but skips the rest of the script and lets the spider crawl the page that was requested. Let's say a spider hits http://www.myspidercontent.com/spiderfood1.html: the request goes through newbiecloak.php where the spider is detected, there is NO redirection, and the spider accesses spiderfood1.html as normal. What should I put in the first branch in this case? How should I modify the script? After some reading I feel it can be done somehow with request_uri in the code and htaccess, but I'm super guessing here.

Ok, sorry for the long post and my silly approach. I must say that I feel kinda dumb asking these questions and looking for ready-made code, but any help would be much appreciated, some direction or something.
If this post hijacks the thread, or it's rubbish for the forum, or you want to move it or whatever, feel free to do so. You are more than welcome!
 
thank you!

perkiset
« Reply #9 on: September 28, 2008, 10:46:21 AM »

Heyya NC -

Let's consider a slightly different approach and see if this bakes your noodle. First, let's take all requests for all pages and ship them into a single script, regardless of surfer type. This is done in your .htaccess:


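RewriteEngine on
# Skip main.php itself so the rewritten request is not rewritten a second time
RewriteCond %{REQUEST_URI} !^/main\.php$
RewriteRule  ^(.*)$  /main.php?uri=$1 [L]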


Now in the main.php, let's evaluate what kind of surfer they are, and pass the correct page back. I'm also going to make use of the dbConnection class in the php repository here (it's my class) because you'll be using your connection to the database more than just for identifying spiders:

Code:
<?php

require "class.dbconection2.php";
$db = new dbConnection('127.0.0.1', 'user', 'password', 'database');

// This will return a zero or a 1 - true or false, if it is a spider
$isSpider = $db->singleAnswer("select count('x') from spiders where address='{$_SERVER['REMOTE_ADDR']}'");

if ($isSpider)
{
   require "spiderpages/{$_GET['uri']}";
} else {
   require "humanpages/{$_GET['uri']}";
}

?>


Now for this to work, you will need a directory called humanpages and a directory called spiderpages in the same place as your main.php. In both, you will create a page called index.html, holding the content that you want to display if that page is requested, for a human or for a spider respectively. Note also that it doesn't matter that the required scripts end in .html rather than .php: since you're already executing PHP, the interpreter will not care.
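
A minimal sketch of one way to turn the requested uri into a file inside humanpages or spiderpages, assuming you want an empty uri to fall back to index.html and want requests like "../../etc/passwd" refused rather than passed to require. The function name resolvePage() is hypothetical and not part of the dbConnection class or the script above.

```php
<?php
// Hypothetical helper: map a requested uri to a real file inside $baseDir.
// Returns the absolute path, or false if the file is missing or lies
// outside $baseDir.
function resolvePage($baseDir, $uri)
{
    // A request for the site root arrives with an empty uri
    if ($uri === null || $uri === '') {
        $uri = 'index.html';
    }

    $base = realpath($baseDir);
    if ($base === false) {
        return false;
    }

    // realpath() collapses "../" segments and returns false for missing files
    $path = realpath($base . DIRECTORY_SEPARATOR . $uri);
    if ($path === false || !is_file($path)) {
        return false;
    }

    // The resolved path must still start with the base directory
    if (strpos($path, $base . DIRECTORY_SEPARATOR) !== 0) {
        return false;
    }

    return $path;
}

// Usage inside main.php, with $isSpider set as in the script above:
//
//   $dir  = dirname(__FILE__) . '/' . ($isSpider ? 'spiderpages' : 'humanpages');
//   $page = resolvePage($dir, isset($_GET['uri']) ? $_GET['uri'] : '');
//   if ($page === false) { header('HTTP/1.0 404 Not Found'); exit; }
//   require $page;
```

The prefix check is the important part: whatever arrives in uri is visitor-controlled, so the script should only ever require files that resolve to somewhere under the two page directories.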

Alternately, consider this code:

Code:
<?php

require "class.dbconection2.php";
$db = new dbConnection('127.0.0.1', 'user', 'password', 'database');

// This will return a zero or a 1 - true or false, if it is a spider
$isSpider = $db->singleAnswer("select count('x') from spiders where address='{$_SERVER['REMOTE_ADDR']}'");

// The uri comes from the visitor, so escape it before it goes into SQL
$uri = mysql_escape_string($_GET['uri']);
$pageScript = $db->singleAnswer("select pagescript from pages where pagename='$uri' and spider=$isSpider");
echo eval($pageScript);

?>


This very tiny script will decide if the user is a spider, then pull a php script from the database and then execute the code (the eval line). This is another way of storing all of your pages, but in a database rather than in the file system. Of course this script assumes that (the page required) returns a string that is the output of the page. Another way to do this if pages are all completely static would be:

Code:
<?php

require "class.dbconection2.php";
$db = new dbConnection('127.0.0.1', 'user', 'password', 'database');

// This will return a zero or a 1 - true or false, if it is a spider
$isSpider = $db->singleAnswer("select count('x') from spiders where address='{$_SERVER['REMOTE_ADDR']}'");

// The uri comes from the visitor, so escape it before it goes into SQL
$uri = mysql_escape_string($_GET['uri']);
echo $db->singleAnswer("select pagescript from pages where pagename='$uri' and spider=$isSpider");

?>


... where we just push back out whatever is in the database with that name and no evaluation of "code" at all... just a straight page dump.

It gets way more complicated and cool from here, but I think this ought to keep you up for a few nights thinking about how you want to cloak. Note that same-server cloaking is way more (WAY more) effective than redirecting, both for spiders and surfers - especially if you are trying to sell them something, so I think you're on the right track.
« Last Edit: September 28, 2008, 10:49:52 AM by perkiset »

nchaos
« Reply #10 on: September 28, 2008, 11:35:14 AM »

wow!

This is definitely more than I asked for! I have read tons of threads on syndk8 where "same server cloaking" and related techniques were described as THE way to cloak nowadays, but considering my skills I thought I would leave it to the pros and start first with some simple a/b cloaking to see how it works and what the effects are.

BIG thanks for your code and time. Much appreciated!
Have a sunny day

perkiset
« Reply #11 on: September 28, 2008, 11:38:45 AM »

You're welcome NC - BTW, here is the link for the dbconnection class:

http://www.perkiset.org/forum/php/perks_mysql_connection_class_updated-t1169.0.html

sunny @ you as well (I live in Phoenix, Arizona - we have no lack of sun Wink )

/perk

nchaos
« Reply #12 on: September 28, 2008, 12:50:58 PM »

[quote perkiset]
You're welcome NC - BTW, here is the link for the dbconnection class:

http://www.perkiset.org/forum/php/perks_mysql_connection_class_updated-t1169.0.html

sunny @ you as well (I live in Phoenix, Arizona - we have no lack of sun Wink )

/perk
[/quote]

Thanks for the link. I was reading that thread yesterday but didn't understand a thing, lol. I wasn't sure if it was going to work with your routine from here, so I used your first dbconnection class, the one from the very first page of this subforum.

You are lucky with the sun! I'm currently living in Ireland and there was no summer this year, same as last year. At least for the last few days it didn't rain. Happy times!

perkiset
« Reply #13 on: September 28, 2008, 01:11:50 PM »

Whereabouts in Ireland? You know there just MIGHT be a gathering of the best BHs in the world there in the next few weeks...

nchaos
« Reply #14 on: September 28, 2008, 05:34:32 PM »

Yep, I know that actually. It's in Limerick, and I know that you are coming there too. I'm in Dublin, on the other side of the country, not very far away though, considering that the island itself isn't very big. Have a nice stay and nice weather when you come here.