The Cache: Technology Expert's Forum
 
Author Topic: filtering bad characters  (Read 2814 times)
dirk
« Reply #15 on: July 10, 2007, 05:43:29 AM »

Bompa,

the empty curly braces {} mean that whatever the pattern in the preceding curly braces
matches is replaced by nothing. So the special characters will be deleted.

Dirk
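
A minimal sketch of that point, assuming a made-up sample string: with an empty second pair of braces each match is deleted, with a non-empty pair it is replaced by whatever the pair contains. The variable names are hypothetical, and the /g flag is added here so that every match is handled rather than only the first.

```perl
use strict;
use warnings;

# Hypothetical sample: "na\xEFve" contains one non-ASCII byte (0xEF).
my $sample = "na\xEFve";

# Empty second pair of braces: each match is replaced by nothing, i.e. deleted.
(my $deleted = $sample) =~ s{[^\x00-\x7E]}{}g;

# Non-empty second pair: each match is replaced by its contents instead.
(my $marked = $sample) =~ s{[^\x00-\x7E]}{?}g;

print "$deleted\n";   # nave
print "$marked\n";    # na?ve
```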
Bompa
« Reply #16 on: January 03, 2008, 05:07:57 AM »

Quote from: dirk on July 10, 2007, 05:43:29 AM
Bompa,

the empty curly braces {} mean that the string in the preceding curly braces
shall be replaced by nothing. So the special characters will be deleted.

Dirk

ahhh, I finally get it. 

Bompa <--- SLOW

You have an extra curly brace cuz YOU HAVE TO in order to have braces in pairs,
whereas, if we delimit with slashes, we can use just three.

damn!


dirk
« Reply #17 on: January 03, 2008, 10:06:59 AM »

Bompa,

here are some more examples:

Code:
$string =~ s/[^\x00-\x7E]//;
$string =~ s|[^\x00-\x7E]||;
$string =~ s~[^\x00-\x7E]~~;

$string =~ s([^\x00-\x7E])();
$string =~ s[[^\x00-\x7E]][];
$string =~ s{[^\x00-\x7E]}{};

Normally only three delimiters are required.

But if you use brackets you need 4 (two pairs).
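
As a quick check that the delimiter is only a matter of notation, here is a small sketch that runs all six forms against the same input and collects the results. The sample string and variable names are made up for illustration, and /g is added so that every non-ASCII byte is removed instead of only the first one.

```perl
use strict;
use warnings;

# Hypothetical input with two non-ASCII bytes (0xE9 and 0xFC).
my $input = "caf\xE9 m\xFCsli";

my @results;

# Single-character delimiters: three of them are enough.
{ my $s = $input; $s =~ s/[^\x00-\x7E]//g;   push @results, $s; }
{ my $s = $input; $s =~ s|[^\x00-\x7E]||g;   push @results, $s; }
{ my $s = $input; $s =~ s~[^\x00-\x7E]~~g;   push @results, $s; }

# Bracketing delimiters: pattern and replacement each get their own pair.
{ my $s = $input; $s =~ s([^\x00-\x7E])()g;  push @results, $s; }
{ my $s = $input; $s =~ s[[^\x00-\x7E]][]g;  push @results, $s; }
{ my $s = $input; $s =~ s{[^\x00-\x7E]}{}g;  push @results, $s; }

print "$_\n" for @results;   # six identical lines: caf msli
```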
Bompa
« Reply #18 on: October 23, 2008, 08:40:24 PM »

Using a regex you could skip all weird characters and keep only ASCII 0 - 126:
Code:
$string =~ s{ ( [^\x00-\x7E] ) }{}xmsg;   # keeps ASCII 0 - 126 (\x00-\x7E)


This thread is old, but it may need this bump, OR I'm reading the docs wrong (which is very likely).

ASCII 00-1F hex (0-31 decimal) are all control characters, I think.

So, to allow only printable characters and remove all others, shouldn't the basic regex be...

If the character is not 20h to 7Eh remove it.

Code:
$string =~ s/[^\x20-\x7E]//g;


Bompa
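
A small sketch of the printable-only filter, wrapped in a helper so it is easy to try out. The function name printable_only and the sample string are assumptions made for illustration, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only printable ASCII (0x20 to 0x7E), delete the rest.
sub printable_only {
    my ($string) = @_;
    $string =~ s/[^\x20-\x7E]//g;
    return $string;
}

# Sample input with a BEL (0x07), a tab (0x09), a DEL (0x7F) and a 0xE9 byte.
my $dirty = "a\x07b\tc\x7Fd\xE9e";
print printable_only($dirty), "\n";   # abcde
```

Note that the tab goes away along with the other control characters, since 0x09 is below 0x20.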

perkiset
« Reply #19 on: October 24, 2008, 09:05:03 AM »

Although I'm no good with the Perl syntax, you're right about the ASCII values for exclusion, Bomps. 0..31 are indeed unprintables and control characters, although it is arguable that you might want to send a chr(7) [beep] to the printer to drive the operator insane. It's been known to happen ;)
dirk
« Reply #20 on: October 24, 2008, 01:57:17 PM »

If you skip ASCII 00-1F hex (0-31 decimal) you would also filter tabs, line feeds and carriage returns.

This may be ok, but it depends on the purpose of the filtering.
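
If the tabs and line endings should survive, one possible variant is to list them explicitly in the negated character class. This is only a sketch under that assumption; the helper name printable_and_whitespace and the sample text are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: keep printable ASCII plus tab (0x09), line feed (0x0A)
# and carriage return (0x0D); delete everything else.
sub printable_and_whitespace {
    my ($string) = @_;
    $string =~ s/[^\x09\x0A\x0D\x20-\x7E]//g;
    return $string;
}

# Sample with a tab, a CR/LF pair, a BEL (0x07) and a 0xE9 byte.
my $text = "col1\tcol2\r\nbell\x07 caf\xE9\n";
print printable_and_whitespace($text);
```

Here the BEL and the 0xE9 byte are dropped, while the tab, carriage return and line feeds pass through unchanged.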