The Cache: Technology Expert's Forum
 
Topic: rewrite maps
SEOidiot
« on: May 16, 2007, 02:57:27 PM »

Perk was helping me out with a particular task where I needed flexibility and power in rewriting data across a large number of URLs, and we talked about rewrite maps. Great info, and I hope he can expand here for all to see. It looks to me like what you started to explain has huge possibilities for lots of the things we all do...

dirk
« Reply #1 on: May 16, 2007, 06:08:55 PM »

Some time ago (about 4 years) I wrote some articles about mod_rewrite.

Below I have attached the part which explains the handling of the rewrite maps.

----------------------------------------------------

Another feature which is very handy for cloaking
purposes is the so-called Rewriting Map, declared
with the RewriteMap directive. Such a map is a file
consisting of key/value pairs, e.g. in the simple
format of an ordinary text file:

cde2c920.infoseek.com spider
205.226.201.32 spider
cde2c923.infoseek.com spider
205.226.201.35 spider
cde2c981.infoseek.com spider
205.226.201.129 spider
cde2cb23.infoseek.com spider
205.226.203.35 spider

These keys are, as you can see, hostnames or IPs.
In this simplistic example the value is always the
same, namely "spider".

The RewriteMap directive is entered either in the main
server configuration (section 2) or in a virtual host
(section 3) of file "httpd.conf". It is not allowed in
".htaccess":

RewriteMap botBase txt:/www/yourdomain/spiderspy.txt

The Rewriting Map will then be available throughout
that server or virtual host, including in ".htaccess"
files.
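
As a rough sketch of where the declaration might sit,
assuming a virtual host whose ServerName and
DocumentRoot are made up for illustration (only the
RewriteMap line comes from the article):

```apache
# httpd.conf: hypothetical virtual host. Every name and path except
# the RewriteMap line is an assumption for illustration.
<VirtualHost *:80>
    ServerName www.yourdomain.example
    DocumentRoot /www/yourdomain/htdocs

    RewriteEngine On
    # Declare the map once here; RewriteMap is not allowed in .htaccess.
    RewriteMap botBase txt:/www/yourdomain/spiderspy.txt

    <Directory /www/yourdomain/htdocs>
        # Let .htaccess files in this tree use mod_rewrite directives.
        AllowOverride FileInfo
        # Per-directory rewriting requires FollowSymLinks to be enabled.
        Options FollowSymLinks
    </Directory>
</VirtualHost>
```

The map name "botBase" is what the ${botBase:...}
lookups refer to, so it has to match exactly.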

The other directives are entered in file ".htaccess":

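# Condition is true if the host name or the IP address
# is listed in the map with the value "spider".
RewriteCond   ${botBase:%{REMOTE_HOST}} =spider [OR]
RewriteCond   ${botBase:%{REMOTE_ADDR}} =spider
# Rule 1 (spiders only): serve the requested .htm page.
RewriteRule   ^(.*)\.htm$  $1.htm [L]
# Rule 2 (everyone else): serve index.html instead.
RewriteRule   ^.*\.htm$  index.html [L]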

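The conditions make the system check whether the
request comes from a spider. To this effect a lookup
in file "spiderspy.txt" is triggered, with the remote
host name or the remote IP address as the key.

If the key is found, the value "spider" is returned
and the condition evaluates to true. If the key is not
found, the lookup returns an empty string and the
condition fails.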

Next, the first RewriteRule is executed. This one
determines that the requested ".htm" page is fed
to the spider. The variable $1 is equal to the part in
parentheses of "^(.*)\.htm$", i.e. the file name
remains the same.

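If the URL is requested by a normal human visitor, the
conditions fail and rule 2 applies: the visitor is
served page "index.html" instead. As the rule has no
[R] flag, this is an internal rewrite rather than an
external redirect, so the URL shown in the browser
does not change.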

As the ".htm" pages will only be read by spiders, they
can be optimized accordingly for the search engines.
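
The lookup syntax also accepts a default value after a
"|", which is returned when the key is missing. A
minimal sketch of a variant built on that, assuming the
same "botBase" map as above (the value "human" and the
reordered rules are illustrative, not part of the
original article):

```apache
# .htaccess: a variant of the rules above, not the article's original.
RewriteEngine On

# ${botBase:key|human} yields "human" when the key is not in the map,
# so the conditions can test for ordinary visitors directly.
# Conditions without [OR] are ANDed: both lookups must miss.
RewriteCond ${botBase:%{REMOTE_HOST}|human} =human
RewriteCond ${botBase:%{REMOTE_ADDR}|human} =human
# Ordinary visitor: internally rewrite any .htm request to index.html.
RewriteRule \.htm$ index.html [L]

# Spiders match no rule, so the requested .htm file is served unchanged.
```

This gets by with a single rule because spiders simply
fall through, at the cost of hiding the $1
backreference the article uses to explain rule 1.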

You may also use a file in dbm format instead of an
ordinary text file. The binary database format
speeds up the lookup, which is particularly important
if you are working with very large spider lists.
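
A sketch of the switch to dbm, assuming an Apache 2.x
installation that ships the httxt2dbm utility (older
setups need a different conversion tool) and a
hypothetical output file name "spiderspy.map":

```apache
# Build the dbm file from the text map. This is a shell command, run
# once and again after every change to the text file:
#   httxt2dbm -i /www/yourdomain/spiderspy.txt -o /www/yourdomain/spiderspy.map
#
# httpd.conf: same map name, but map type "dbm" and the converted file.
RewriteMap botBase dbm:/www/yourdomain/spiderspy.map
```

The rules in ".htaccess" stay as they are, since they
refer to the map only by its name. With some dbm types
the conversion produces more than one file sharing a
base name; the directive still takes just the base
name.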

The example given above offers a simple cloaking
functionality. All ordinary visitors will always be
served the site's "index.html" page and there
is no access logging beyond the mod_rewrite logs.

However, it does go to show how you can effectively
replace several lines of Perl code with just a few
lines of mod_rewrite.
perkiset
« Reply #2 on: May 16, 2007, 09:50:00 PM »

Very nice article Dirk...
I had told SEOI that you are really strong when it comes to this stuff, but I had forgotten reading this so many years ago - thanks for posting it.

Wasn't kidding about Dirk, was I? He's THE JUICE.

Yeah man... just call on me whenever you need the real info. I'm yer man  ROFLMAO

/p

m0nkeymafia
« Reply #3 on: May 17, 2007, 12:46:33 AM »

Top quality stuff, very nice way of doing it.
Perhaps we need an Apache section too.