The Cache: Technology Expert's Forum
 
Author Topic: filtering bad characters  (Read 5305 times)
dirk
« Reply #15 on: July 10, 2007, 05:43:29 AM »

Bompa,

the empty curly braces {} mean that the string matched by the preceding curly braces
shall be replaced by nothing. So the special characters will be deleted.
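
A minimal sketch of that difference, assuming a made-up sample string (the variable names are only for illustration): an empty replacement part deletes each match, a non-empty one substitutes it.

```perl
use strict;
use warnings;

# Hypothetical sample: "caf\xE9" contains one character outside \x00-\x7E.
my $deleted  = "caf\xE9 au lait";
my $replaced = "caf\xE9 au lait";

# Empty second pair of braces: every match is replaced by nothing.
$deleted =~ s{[^\x00-\x7E]}{}g;

# Non-empty second pair: every match is replaced by a question mark.
$replaced =~ s{[^\x00-\x7E]}{?}g;

print "$deleted\n";    # caf au lait
print "$replaced\n";   # caf? au lait
```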

Dirk
Bompa
« Reply #16 on: January 03, 2008, 05:07:57 AM »

Quote from: dirk
Bompa,

the empty curly braces {} mean that the string matched by the preceding curly braces
shall be replaced by nothing. So the special characters will be deleted.

Dirk

ahhh, I finally get it. 

Bompa <--- SLOW

You have an extra curly brace because YOU HAVE TO in order to have braces in pairs,
whereas, if we delimit with slashes, we can use just three.

damn!


Do You Know?
dirk
« Reply #17 on: January 03, 2008, 10:06:59 AM »

Bompa,

here are some more examples:

Code:
$string =~ s/[^\x00-\x7E]//;
$string =~ s|[^\x00-\x7E]||;
$string =~ s~[^\x00-\x7E]~~;

$string =~ s([^\x00-\x7E])();
$string =~ s[[^\x00-\x7E]][];
$string =~ s{[^\x00-\x7E]}{};

Normally only three delimiters are required.

But if you use brackets you need 4 (two pairs).
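
As a rough check, the six forms can be run against the same input to see that they behave identically. This is only a sketch: the sample string and the `@styles` list are invented for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical input with one non-ASCII character (\xFC).
my $input = "M\xFCnchen";

# The same substitution written with six delimiter styles.
my @styles = (
    sub { my $s = shift; $s =~ s/[^\x00-\x7E]//;  $s },   # 3 slashes
    sub { my $s = shift; $s =~ s|[^\x00-\x7E]||;  $s },   # 3 pipes
    sub { my $s = shift; $s =~ s~[^\x00-\x7E]~~;  $s },   # 3 tildes
    sub { my $s = shift; $s =~ s([^\x00-\x7E])(); $s },   # 2 pairs of parentheses
    sub { my $s = shift; $s =~ s[[^\x00-\x7E]][]; $s },   # 2 pairs of square brackets
    sub { my $s = shift; $s =~ s{[^\x00-\x7E]}{}; $s },   # 2 pairs of curly braces
);

my @results = map { $_->($input) } @styles;
print "$_\n" for @results;   # each line: Mnchen
```

Without the g modifier each form removes only the first match, which is enough here because the sample contains a single offending character.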
Bompa
« Reply #18 on: October 23, 2008, 08:40:24 PM »

Using a regex you could skip all weird characters and keep only ASCII 0 - 126:
Code:
$string =~ s{ ( [^\x00-\x7E] ) }{}xmsg;   # keeps ASCII 0 - 126 (\x00-\x7E); \x7F (DEL) is removed too


This thread is old, but it may need this bump, OR I'm reading the docs wrong (which is very likely).

ASCII 0-1F hex (0-31 decimal) are all control characters I think.

So, to allow only printable characters and remove all others, shouldn't the basic regex be...

If the character is not 20h to 7Eh remove it.

Code:
$string =~ s/[^\x20-\x7E]//g;
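
A quick way to try this out, assuming a made-up test string that mixes printable letters with a few characters outside 20h-7Eh:

```perl
use strict;
use warnings;

# Hypothetical input: a tab, a bell (\x07), DEL (\x7F) and a high character (\xE9).
my $string = "a\tb\x07c\x7Fd\xE9e";

# Remove everything that is not in the printable range 20h-7Eh.
# The g modifier repeats the substitution for every match.
$string =~ s/[^\x20-\x7E]//g;

print "$string\n";   # abcde
```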


Bompa
perkiset
« Reply #19 on: October 24, 2008, 09:05:03 AM »

Although I'm no good with the Perl syntax, you're right about the ASCII value for exclusion, Bomps. 0..31 are indeed unprintables and control characters, although it is arguable that you might want to send a chr(7) [beep] to the printer to drive the operator insane. It's been known to happen ;)

It's the things you learn after you know it all that really count.
-John Wooden
dirk
« Reply #20 on: October 24, 2008, 01:57:17 PM »

If you skip ASCII 0-1F hex (0-31 decimal) you would also filter tabs, line feeds and carriage returns.

This may be ok, but it depends on the purpose of the filtering.
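
If those three should survive, one possible variant is to list them explicitly in the character class. This is only a sketch; the sample string is invented.

```perl
use strict;
use warnings;

# Hypothetical input with a tab, a bell (\x07) and a CR/LF line ending.
my $string = "col1\tcol2\x07\r\n";

# Keep printable ASCII plus tab (\x09), line feed (\x0A) and
# carriage return (\x0D); remove every other character.
$string =~ s/[^\x09\x0A\x0D\x20-\x7E]//g;

print $string;   # col1, a tab, col2, then CR LF; the bell is gone
```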