Okay, I'm trying to understand this...
[^0-9].+?
In English,
is it: not zero through nine, any character except newline, zero or one times, one or more times?
Is this translation correct?
Then why is it getting two characters after the slashes? What am I missing here?
I'm trying to just grab everything from every http:// up until a #32.
Close.
$re = '/(http:\/\/[^0-9].+?)/';
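That will match http:// followed by a single character that is not 0-9, followed
by one or more of anything. The ? makes the + non-greedy, and since nothing comes
after it in the pattern, .+? settles for a single character. That's why you only
get two characters after the slashes.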
As quantifiers, + means "one or more" and ? on its own means "zero or one".
Placed right after another quantifier, though, ? stops the greediness of matching,
like in my example below.
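To see the greedy versus non-greedy difference concretely, here is a minimal Perl sketch. The sample text is just two of the URLs from the post, and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Sample text: two of the URLs from the post, separated by a space.
my $text = 'http://www.privacyrights.org/fs/fs6a-facta.htm http://uspirg.org/uspirg.asp?id2=13649';

# Non-greedy: with nothing after .+? in the pattern, it settles for one character.
my ($lazy) = $text =~ /(http:\/\/[^0-9].+?)/;

# Greedy: .+ runs to the end of the line, swallowing the second URL too.
my ($greedy) = $text =~ /(http:\/\/[^0-9].+)/;

print "lazy:   $lazy\n";
print "greedy: $greedy\n";
```

The non-greedy version captures only http://ww, which is the "two characters after the slashes" from the question. The greedy version captures the whole sample string.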
The block of text that you're parsing is a tough one because it doesn't
have clear-cut endings for the match.
What I would try is to match everything from http:// to the first white space.
This Perl worked in a quick test:
$string =<<here;
Vermont was the first state to pass more aggressive privacy policies
with respect to credit reports in 1992. Maine and California are typically
early adopters when caring for their state citizen’s privacy.
http://www.privacyrights.org/fs/fs6a-facta.htm http://uspirg.org/uspirg.asp?id2=13649
http://en.wikipedia.org/wiki/Fair_and_Accurate_Credit_Transactions_Act
here
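while ($string =~ /(http:\/\/.*?)\s/msg) {
    # $1 holds the URL; the \s sits outside the parentheses so the
    # trailing white space is not captured
    $url = $1;
    print "$url\n";
}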
The * means "zero or more times", but the ? immediately following makes it
non-greedy, so the match stops at the first thing that satisfies the rest of
the pattern, which here is \s for any white space.
Oops, actually, mine gets all the URLs, but I didn't follow why you want to go
up to a #32, because I did not see a #32 in the text.
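One more variation, offered as a sketch rather than a replacement for the loop above: \S+ matches a run of non-whitespace characters, so it stops right before the first space (ASCII 32, if that is what #32 meant), tab, or newline. It also catches a URL sitting at the very end of the string with no white space after it. The sample text and variable names are assumptions for illustration.

```perl
use strict;
use warnings;

# Sample text; note that the last URL has no white space after it.
my $text = "See http://www.privacyrights.org/fs/fs6a-facta.htm and\n"
         . "http://uspirg.org/uspirg.asp?id2=13649";

# In list context with /g, the match returns every captured URL at once.
# \S+ means one or more non-whitespace characters.
my @urls = $text =~ /(http:\/\/\S+)/g;

print "$_\n" for @urls;
```

Neither this nor the .*?\s version can split two URLs that are jammed together with no white space between them.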
If you're not confused yet, let me know, I can type more nonsense.

Bompa