Instead of eliminating unwanted shit, I only allow alphanumerics plus a few others like
the underscore (or whatever).
I also do something along the lines of what bomps does.
This is an ASP function that does exactly that without using regex. I actually found this to be faster than a regex for really long text. I know this is the Perl board, but the concept is the same, and it doesn't use any functions that wouldn't be available in any language.
Have a string of valid characters.
Check each letter in the dirty string against the valid string.
Replace the character if it's bad.
So for URLs I run it as stripnonalphanumerics(someURL,"-")
For content I run it as stripnonalphanumerics(someContent," ")
function stripnonalphanumerics(dirtystring,replacewith)
    ' Replace every character not found in validstring with replacewith.
    dim text,i,letter,validstring
    text=""
    validstring="1234567890abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
    for i = 1 to len(dirtystring)
        letter=mid(dirtystring,i,1)
        ' instr returns 0 when letter is not in validstring
        if instr(validstring,letter) > 0 then
            text=text&letter
        else
            text=text&replacewith
        end if
    next
    stripnonalphanumerics=text
end function
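
For anyone not on ASP, here is a rough sketch of the same whitelist idea in Python, just to show it ports without regex. The names strip_non_alphanumerics and VALID_CHARS are made up for illustration, and it assumes the same character set as validstring above (digits plus ASCII letters).

```python
import string

# Same whitelist as validstring in the ASP function: digits plus ASCII letters.
# (Assumption: extend this set if you also want to allow e.g. the underscore.)
VALID_CHARS = frozenset(string.digits + string.ascii_letters)


def strip_non_alphanumerics(dirty, replace_with):
    """Replace each character that is not in VALID_CHARS with replace_with."""
    # Build the pieces lazily and join once, which avoids repeated
    # string concatenation on long inputs.
    return "".join(ch if ch in VALID_CHARS else replace_with for ch in dirty)


print(strip_non_alphanumerics("my page/title", "-"))  # prints my-page-title
```

As with the ASP version, pass "-" as the replacement for URLs and " " for content.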