The Cache: Technology Expert's Forum
 
Author Topic: Boost regex  (Read 6645 times)
thedarkness
Lifer
*****
Offline Offline

Posts: 585



View Profile
« on: April 21, 2007, 06:44:26 PM »

Good regex implementations in C++ can be hard to find (a major deficiency).

Boost regex is about as good as they come. Here's a sample of how to extract all the href links from a page, and no, I don't want to talk about the regex it uses  Grin

Compile with:
g++ -O gethrefs.cpp -o gethrefs -lboost_regex

Run with:

./gethrefs *.html

or something like that.

Code:
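#include <algorithm>   // std::for_each
#include <fstream>
#include <iostream>
#include <string>
#include <boost/regex.hpp>

using namespace std;

// Called once per match; what[1] is the first capture group (the href value).
bool regex_callback(const boost::match_results<std::string::const_iterator>& what)
{
   cout << what[1] << endl;
   return true;
}

// Read the whole stream into s, growing the buffer as needed.
void load_file(std::string& s, std::istream& is)
{
   s.erase();
   if(!is) return;   // also catches a file that failed to open, not just badbit
   // in_avail() is only a hint and may be 0 or -1, so never pass it on blindly
   std::streamsize avail = is.rdbuf()->in_avail();
   if(avail > 0)
      s.reserve(static_cast<std::string::size_type>(avail));
   char c;
   while(is.get(c))
   {
      if(s.capacity() == s.size())
         s.reserve(s.capacity() * 3);
      s.append(1, c);
   }
}

int main(int argc, const char** argv)
{
   std::string text;
   // Group 1 captures the href value, up to the next quote or '>'.
   const char* re = "<\\s*a\\s+[^>]*href\\s*=\\s*[\"'](.*?)[\"'>]";
   boost::regex expression(re);

   for(int i = 1; i < argc; ++i)
   {
      std::ifstream fs(argv[i]);
      load_file(text, fs);
      fs.close();
      // construct our iterators: m1 walks every match, m2 is the end marker
      boost::sregex_iterator m1(text.begin(), text.end(), expression);
      boost::sregex_iterator m2;
      std::for_each(m1, m2, &regex_callback);
   }
   return 0;
}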

And to those of you who were saying OO causes code bloat, does this look "bloated"?  Wink (I do know what you mean, but I think OO reduces code in the actual implementation and adds to readability, etc., but I digress.....)

Cheers,
td

P.S. This heavily leverages another example, so I don't claim this as "my code" at all.
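
Editor's sketch, since the regex itself goes unexplained above: the same pattern, wrapped in a function that returns the links instead of printing them. This is only an illustration under a few assumptions: it swaps in std::regex (C++11) for boost::regex so it builds without an extra library, and the function name extract_hrefs is made up for the example.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the href value of every <a ...> tag in html.
// Assumption: std::regex stands in for boost::regex; the pattern is the one
// from the post, written as a raw string literal to avoid double escaping.
//
//   <\s*a\s+      "<a" followed by at least one whitespace character
//   [^>]*href     any other attributes, then the href attribute
//   \s*=\s*["']   "=" and the opening quote
//   (.*?)         group 1: the shortest run of characters ...
//   ["'>]         ... up to the next quote or ">"
std::vector<std::string> extract_hrefs(const std::string& html)
{
   static const std::regex expression(
      R"re(<\s*a\s+[^>]*href\s*=\s*["'](.*?)["'>])re");

   std::vector<std::string> links;
   std::sregex_iterator it(html.begin(), html.end(), expression);
   const std::sregex_iterator end;   // default-constructed: end of matches
   for (; it != end; ++it)
      links.push_back((*it)[1].str());
   return links;
}
```

One thing worth knowing about the pattern: because the closing class accepts either quote character or ">", a URL that contains an apostrophe inside double quotes will be cut short at the apostrophe.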
« Last Edit: April 21, 2007, 06:58:00 PM by thedarkness » Logged

"I want to be the guy my dog thinks I am."
 - Unknown
nop_90
Global Moderator
Lifer
*****
Offline Offline

Posts: 2203


View Profile
« Reply #1 on: April 21, 2007, 07:06:50 PM »

Never used Boost itself.
But I have used Boost.Python (www.boost.org/libs/python/)
for wrapping C++ classes to be called from Python.
Forgot about it Smiley
Downside: it is slow on the compile.
Upside: compared to SWIG, the classes are very elegant Smiley

The Lisp/Scheme way Smiley In this snippet I decided to use recursion.
Code:
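; string-search returns (whole-match group-1 group-2) or #f when nothing matches
(define (strip-links page-text)
    (let ((m-link (string-search "<\\s*a\\s+[^>]*href\\s*=\\s*[\"'](.*?)[\"'>](.*)" page-text)))
      ; group 1 is the link, group 2 is the rest of the page, which we recurse on
      (if m-link (cons (list-ref m-link 1) (strip-links (list-ref m-link 2))) '() )))
(define links (strip-links (string-intersperse (read-lines "somefile.html"))))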
links now contains list of href links Wink
Logged
thedarkness
Lifer
*****
Offline Offline

Posts: 585



View Profile
« Reply #2 on: April 21, 2007, 09:17:05 PM »

Very elegant Nop. Would it help my cause if I put all of mine on one line?  Wink

Cheers,
td
Logged

thedarkness
Lifer
*****
Offline Offline

Posts: 585



View Profile
« Reply #3 on: April 21, 2007, 09:19:36 PM »

I figure when I've got some time I'm going to learn some lisp/scheme just to see whether you're jerking my chain  ROFLMAO

Cheers,
td
Logged

nop_90
Global Moderator
Lifer
*****
Offline Offline

Posts: 2203


View Profile
« Reply #4 on: April 21, 2007, 09:57:20 PM »

http://smartwin.sourceforge.net/
A gui library for windows.
It uses templates; most interesting is that you can use it for PocketPC devices too.
What was my point?
Yeah, that guy is like a C++ god Smiley. He has probably forgotten more about shit than I will ever know.
Yet he uses C++.

Your implementation will run faster than mine.
If you are running it on a PocketPC, that could be an issue Smiley

Half of your program's size is caused by your file loader
(the price you pay when you have absolute control).
If you wrote the program in Python, it would basically just be what is contained in your for-loop.
If you had your own bag of C++ functions like read_lines, load_file, etc., you could cut down on program size a lot.
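
Editor's sketch of what such a "bag of functions" might look like. Everything here is an assumption for illustration: the names load_stream, load_file and read_lines are hypothetical (load_file deliberately has a different signature from the one in the first post), and only the standard library is used.

```cpp
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: slurp everything left in a stream into one string.
std::string load_stream(std::istream& is)
{
   std::ostringstream buffer;
   buffer << is.rdbuf();   // copies the stream's remaining contents in one call
   return buffer.str();
}

// Hypothetical helper: whole file as one string, empty if it cannot be opened.
std::string load_file(const std::string& path)
{
   std::ifstream fs(path.c_str(), std::ios::binary);
   if (!fs) return std::string();
   return load_stream(fs);
}

// Hypothetical helper: one vector entry per line, without the trailing newline.
std::vector<std::string> read_lines(std::istream& is)
{
   std::vector<std::string> lines;
   std::string line;
   while (std::getline(is, line))
      lines.push_back(line);
   return lines;
}
```

With helpers like these kept in a shared header, the body of the for-loop in the original program would shrink to roughly one call that loads the file plus the iterator setup.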

Problem with compiled languages is that you have to write the program.
Compile, test repeat.
How many times have you written little C/C++ proggies just to test how a function works?

With python/scheme/lisp you have an interpreter where you can type in and test.
You do not have to compile.
And you can modify stuff using the console at runtime, that way u can test new ideas etc.
With python you could attach a console to a python webserver and at runtime play with the insides Smiley.

When i first learn something new i think it is like a magic bullet it can fix all problems.
He has a brain tumor? If only we could attach a Python console to his head so I could mess around with his brains Smiley.

Where python/scheme/lisp (and others) excel is acting like "glue".

Functions that need to interface with low-level stuff, or where speed is a problem, you write in C/C++/assembler.
Then you call each of these functions from the high-level language.
That way you get the best of both worlds Smiley.
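
Editor's sketch of the C++ half of that "glue" arrangement. The function name count_occurrences is hypothetical, and the routine is deliberately trivial; the point is only the extern "C" linkage, which keeps the symbol name unmangled so that a high-level language can look it up.

```cpp
#include <cstring>

// Hypothetical native routine: counts non-overlapping occurrences of needle
// in text. extern "C" gives it C linkage, so foreign-function interfaces can
// find it by its plain name.
extern "C" int count_occurrences(const char* text, const char* needle)
{
   if (!text || !needle || *needle == '\0') return 0;

   int n = 0;
   const std::size_t step = std::strlen(needle);
   for (const char* p = std::strstr(text, needle); p != 0;
        p = std::strstr(p + step, needle))
      ++n;
   return n;
}
```

Assuming this is compiled into a shared library (with g++, the -shared and -fPIC options), Python's ctypes module can load the library with ctypes.CDLL and call the function by name, passing byte strings for the two arguments. Boost.Python and SWIG, both mentioned earlier in the thread, are the heavier-weight routes when whole classes need wrapping.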

Anyway I blab too much Smiley
Me i like to learn new things. Probably next year i will learn some other freaky language. I wanna look at erlang.
But i am sure i will still be using C/C++,python,scheme still Smiley
I will just have another tool in my toolbox.
Logged
thedarkness
Lifer
*****
Offline Offline

Posts: 585



View Profile
« Reply #5 on: April 21, 2007, 10:07:36 PM »

Quote from: nop_90
Me i like to learn new things. Probably next year i will learn some other freaky language. I wanna look at erlang.
But i am sure i will still be using C/C++,python,scheme still Smiley
I will just have another tool in my toolbox.

Well said nop, very cool.

Cheers,
td
Logged

perkiset
Olde World Hacker
Administrator
Lifer
*****
Offline Offline

Posts: 10096



View Profile
« Reply #6 on: April 22, 2007, 11:41:24 AM »

Quote from: nop_90
Me i like to learn new things. Probably next year i will learn some other freaky language. I wanna look at erlang.
But i am sure i will still be using C/C++,python,scheme still Smiley
I will just have another tool in my toolbox.

I so agree. I have always told PinkHat that I essentially have to let go of everything I know and relearn every 3 years, or I become a dinosaur. :waits for TD harassment: Even though I've about ruined my stomach with ibuprofen, I have never been left in the dust by you youngsters...

/p
Logged

It is now believed, that after having lived in one compound with 3 wives and never leaving the house for 5 years, Bin Laden called the U.S. Navy Seals himself.
nutballs
Administrator
Lifer
*****
Offline Offline

Posts: 5627


Back in my day we had 9 planets


View Profile
« Reply #7 on: April 23, 2007, 08:44:10 AM »

so are you saying i should move on from a 12 year old language? lol
Logged

I could eat a bowl of Alphabet Soup and shit a better argument than that.
perkiset
Olde World Hacker
Administrator
Lifer
*****
Offline Offline

Posts: 10096



View Profile
« Reply #8 on: April 23, 2007, 08:51:14 AM »

I dunno man... is it working for you?  Wink

That being said, you have mentioned stretching out into some PHP and such... that'd prolly just expand your toolchest perfectly. That and some more JS/HTML stuff and you're good to go.
Logged

Dbyt3r
Rookie
**
Offline Offline

Posts: 19


View Profile
« Reply #9 on: April 26, 2007, 04:50:54 AM »

C/C++ aren't comparable performance wise.

But... Productivity wise.. well it pretty much sucks monkey balls, right?  Wink

For example, your little file loader would probably be something like,
Code:
s = ""
for line in open("file.txt","r"): s+=line



Btw, why the hell do you keep using std:: when you've already defined it as a namespace? :p (C++ noob)
Logged

No links in signatures please
thedarkness
Lifer
*****
Offline Offline

Posts: 585



View Profile
« Reply #10 on: April 26, 2007, 06:06:54 AM »

Quote from: Dbyt3r
C/C++ aren't comparable performance wise.

Comparable with what, with each other?

Quote from: Dbyt3r
But... Productivity wise.. well it pretty much sucks monkey balls, right?  Wink

It does, if you believe it does.

Quote from: Dbyt3r
For example, your little file loader would probably be something like,
Code:
s = ""
for line in open("file.txt","r"): s+=line

Are we going to have this argument again? Are you personally a *insert language here* zealot?

There's prolly 50 ways to read a file in each language.... your point?

Quote from: Dbyt3r
Btw, why the hell do you keep using std:: when you've already defined it as a namespace? :p (C++ noob)

My bad, this is an example extracted from a work in progress, not finished yet.
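
Editor's sketch on the std:: question, for anyone else wondering. With a using-directive in effect the prefix is redundant but harmless, and explicit qualification still earns its keep when names could collide. The namespace app and both function names below are made up for the example.

```cpp
#include <string>
#include <type_traits>

using namespace std;

// With the using-directive in effect, `string` and `std::string` name the
// same type, so writing the prefix anyway changes nothing.
static_assert(is_same<string, std::string>::value,
              "both spellings refer to the same type");

// Hypothetical helper living in its own namespace. A function called `count`
// also exists in std (in <algorithm>), which is why callers qualify it.
namespace app {
   int count(const std::string& s, char c)
   {
      int n = 0;
      for (std::string::size_type i = 0; i < s.size(); ++i)
         if (s[i] == c) ++n;
      return n;
   }
}

int count_slashes(const string& url)
{
   return app::count(url, '/');   // qualified, so there is no doubt which count
}
```

Many C++ style guides suggest dropping the using-directive and keeping the std:: prefixes rather than the other way around, especially in headers, but in a small single-file program like the one above either choice compiles to the same thing.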

Cheers,
td
Logged

Dbyt3r
Rookie
**
Offline Offline

Posts: 19


View Profile
« Reply #11 on: April 26, 2007, 07:41:21 AM »

I meant they aren't comparable with anything else: they're the best, if you will.

Ya my read file in one line example was pretty stupid.

I actually have a *thing* for C++'s :: and that's also a reason why I don't use .NET's C++, because it uses -> instead :p
Logged

thedarkness
Lifer
*****
Offline Offline

Posts: 585



View Profile
« Reply #12 on: April 26, 2007, 02:52:22 PM »

OK, I dig, it's all good my brutha.

As you so correctly pointed out in your "language doesn't matter" thread, it's about you, and sometimes it's about the challenge.

Kirk out
Logged
