Thread: Boost regex
thedarkness

Good regex implementations in C++ can be hard to find (a major deficiency). Boost regex is about as good as they come. Here's a sample of how to strip out all href links from a page, and no, I don't want to talk about the regex it uses.

Compile with:
g++ -O gethrefs.cpp -o gethrefs -lboost_regex

Run with:

./gethrefs *.html

or something like that.


#include <algorithm>
#include <fstream>
#include <iostream>
#include <string>
#include <boost/regex.hpp>

using namespace std;

// Called once per match; what[1] is the captured href value.
bool regex_callback(const boost::match_results<std::string::const_iterator>& what)
{
   cout << what[1] << endl;
   return true;
}

// Read the whole stream into s, replacing its previous contents.
void load_file(std::string& s, std::istream& is)
{
   s.erase();
   if(is.bad()) return;
   // in_avail() can return 0 or -1, so only reserve when it is positive.
   std::streamsize avail = is.rdbuf()->in_avail();
   if(avail > 0)
      s.reserve(static_cast<std::string::size_type>(avail));
   char c;
   while(is.get(c))
   {
      if(s.capacity() == s.size())
         s.reserve(s.capacity() * 3);
      s.append(1, c);
   }
}

int main(int argc, const char** argv)
{
   std::string text;
   // Backslashes and double quotes must be escaped inside the C++ string literal.
   const char* re = "<\\s*a\\s+[^>]*href\\s*=\\s*[\"'](.*?)[\"'>]";
   boost::regex expression(re);

   for(int i = 1; i < argc; ++i)
   {
      std::ifstream fs(argv[i]);
      load_file(text, fs);
      fs.close();
      // construct our iterators:
      boost::sregex_iterator m1(text.begin(), text.end(), expression);
      boost::sregex_iterator m2;
      std::for_each(m1, m2, &regex_callback);
   }
   return 0;
}


And to those of you that were saying OO causes code bloat, does this look "bloated"? (I do know what you mean, but I think OO reduces code in the actual implementation and adds to readability, etc., but I digress...)

Cheers,
td

P.S. This heavily leverages another example so I don't claim this as "my code" at all.
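For anyone without Boost installed, here is a minimal sketch of the same extraction using std::regex from C++11. That is a swapped-in library, not what the post uses, and the function name extract_hrefs is hypothetical. It returns the captured href values instead of printing them, which makes it easier to check.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the post): collect every href value in html.
// Uses std::regex (C++11) in place of Boost.Regex, with the same pattern.
std::vector<std::string> extract_hrefs(const std::string& html)
{
   // Raw string literal, so the backslashes and quotes need no escaping.
   static const std::regex expression(
      R"re(<\s*a\s+[^>]*href\s*=\s*["'](.*?)["'>])re");

   std::vector<std::string> links;
   for (std::sregex_iterator it(html.begin(), html.end(), expression), end;
        it != end; ++it)
   {
      links.push_back((*it)[1].str());   // capture group 1 is the link target
   }
   return links;
}
```

Like the original, the pattern is case-sensitive and only handles quoted attribute values, so it is a quick scraper rather than a real HTML parser.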

nop_90

Never used Boost itself.
But I have used Boost.Python (www.boost.org/libs/python/) for wrapping C++ classes to be called from Python.
Forgot about it.
Downside: it is slow on the compile.
Upside: compared to SWIG, the classes are very elegant.

The Lisp/Scheme way. In this snippet I decided to use recursion.

(define (strip-links page-text)
    (let ((m-link (string-search "<\\s*a\\s+[^>]*href\\s*=\\s*[\"'](.*?)[\"'>](.*)" page-text)))
      (if m-link (cons (list-ref m-link 1) (strip-links (list-ref m-link 2))) '())))
(define links (strip-links (string-intersperse (read-lines "somefile.html"))))

links now contains the list of href links.

thedarkness

Very elegant Nop. Would it help my cause if I put all of mine on one line?

Cheers,
td

thedarkness

I figure when I've got some time I'm going to learn some lisp/scheme just to see whether you're jerking my chain.

Cheers,
td

nop_90

http://smartwin.sourceforge.net/
A GUI library for Windows.
Uses templates; most interesting is u can use it for PocketPC devices also.
What was my point?
Yah, that guy is like a C++ god. Probably he forgot more about shit than I will ever know.
Yet he uses C++.

Your implementation will run faster than mine.
If u are running it on a PocketPC that could be an issue.

1/2 of ur program size is caused by your file loader
(the price you pay when u have absolute control).
If you wrote the program in Python, it would basically just be what is contained in your for-loop.
If you had ur own bag of C++ functions like read_lines, load_file etc. you could cut down on program size a lot.

Problem with compiled languages is that you have to write the program,
compile, test, repeat.
How many times have you written little C/C++ proggies just to test how a function works?

With Python/Scheme/Lisp you have an interpreter where you can type in and test.
You do not have to compile.
And you can modify stuff using the console at runtime, that way u can test new ideas etc.
With Python you could attach a console to a Python webserver and at runtime play with the insides.

When I first learn something new I think it is like a magic bullet that can fix all problems.
He has a brain tumor? If only we could attach a Python console to his head so I could mess around with his brains.

Where Python/Scheme/Lisp (and others) excel is acting like "glue".

Functions where u need to interface with low-level stuff, or where speed is a problem, write in C/C++/assembler.
Then call each of these functions from the high-level language.
That way u get the best of both worlds.
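As a small illustration of the low-level half of that split, here is a sketch of a C++ function exported with C linkage so a high-level language can call it. The function name count_occurrences and the choice of task are made up for the example; nothing like it appears in the thread.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical example: count non-overlapping occurrences of needle in haystack.
// extern "C" keeps the symbol name unmangled, so a foreign-function interface
// can look it up by its plain name.
extern "C" std::size_t count_occurrences(const char* haystack, const char* needle)
{
   if (!haystack || !needle || *needle == '\0')
      return 0;

   std::size_t count = 0;
   const std::size_t step = std::strlen(needle);
   for (const char* p = std::strstr(haystack, needle); p != nullptr;
        p = std::strstr(p + step, needle))
   {
      ++count;
   }
   return count;
}
```

Built as a shared library (for example with g++ -shared -fPIC), a function like this can be loaded from Python through the standard ctypes module, which is one way to get the "glue" arrangement without a wrapper generator.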

Anyway, I blab too much.
Me, I like to learn new things. Probably next year I will learn some other freaky language. I wanna look at Erlang.
But I am sure I will still be using C/C++, Python, Scheme still.
I will just have another tool in my toolbox.

thedarkness

quote author=nop_90 link=topic=70.msg261#msg261 date=1177217840

Me, I like to learn new things. Probably next year I will learn some other freaky language. I wanna look at Erlang.
But I am sure I will still be using C/C++, Python, Scheme still.
I will just have another tool in my toolbox.


Well said nop, very cool.

Cheers,
td

perkiset

quote author=nop_90 link=topic=70.msg261#msg261 date=1177217840

Me, I like to learn new things. Probably next year I will learn some other freaky language. I wanna look at Erlang.
But I am sure I will still be using C/C++, Python, Scheme still.
I will just have another tool in my toolbox.


I so agree - I have always told PinkHat that I essentially have to let go of everything I know and relearn every 3 years - or I become a dinosaur :waits for TD harassment: even though I've about ruined my stomach from Ibuprofen, I have not ever been left in the dust by you youngsters...

/p

nutballs

so are you saying i should move on from a 12 year old language? lol

perkiset

I dunno man... is it working for you?

That being said, you have mentioned stretching out into some PHP and such... that'd prolly just expand your toolchest perfectly. That and some more JS/HTML stuff and you're good to go.

Dbyt3r

C/C++ aren't comparable performance wise.

But... Productivity wise.. well it pretty much sucks monkey balls, right?

For example, your little file loader would probably be something like,
for line in open("file.txt","r"): s+=line




Btw, why the hell do you keep using std:: when you've already defined it as a namespace? :p (C++ noob)

thedarkness

quote author=Dbyt3r link=topic=70.msg602#msg602 date=1177588254

C/C++ aren't comparable performance wise.


Comparable with what, with each other?

quote author=Dbyt3r link=topic=70.msg602#msg602 date=1177588254

But... Productivity wise.. well it pretty much sucks monkey balls, right?


It does, if you believe it does.

quote author=Dbyt3r link=topic=70.msg602#msg602 date=1177588254

For example, your little file loader would probably be something like,
for line in open("file.txt","r"): s+=line




Are we going to have this argument again? You personally are a *insert language here* zealot?

There's prolly 50 ways to read a file in each language.... your point?


quote author=Dbyt3r link=topic=70.msg602#msg602 date=1177588254

Btw, why the hell do you keep using std:: when you've already defined it as a namespace? :p (C++ noob)


My bad, this is an example extracted from a work in progress, not finished yet.

Cheers,
td

Dbyt3r

Not comparable with anything else; the best, if you will.

Ya my read file in one line example was pretty stupid.

I actually have a *thing* for C++'s :: and that's also a reason why I don't use .NET's C++, because it uses -> instead :p

thedarkness

OK, I dig, it's all good my brutha.

As you so correctly pointed out in your "language doesn't matter" thread, it's about you and sometimes it's about the challenge.

Kirk out

