
m0nkeymafia
any of u regex nutters got a few mins to help a monkey out?

ive got a script to grab all the classes and id's used on a webpage. i then want to pull any matching css code for each class out of the css files, but i cant figure out a regex to do it.

i.e. if i have found a class called "topNav" i want to search through the css files [already in a buffer] and find the corresponding definitions, i.e.

.topNav { padding-top: 5px; }

would return just "padding-top: 5px;", obviously with any other lines.

this way if i rob someones web layout / design and modify it, i can run it through this and discard any useless classes.

can anyone gimmie a hand on what to do? im stuck!! cheers

perkiset
First - are you doing this in PHP? (for the rest of this I'll assume so)

Second - you'll want to load all referenced CSS files (and their referenced files) just in case the definitions are not in the immediate page. Best way to find it then would be to create one big mess of all the linked files and do the regex on that string.

Then, assuming you got a class reference, like 'topNav', and have it in the variable $className, you could do something like this:

preg_match('/([#.]$className\s*{[^}]+/', $inputBuff, $matches);

... and then the $matches array would have anything that it found. Important to look for things that are assigned by id too, rather than just by class name - by this I mean:

<style>
#topNav { font-family: Arial; }
</style>
<div id="topNav">Here is the html</div>

Good luck!
/p

m0nkeymafia
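perkiset's lookup above could be sketched in PHP roughly as follows. The function name `findRuleBody` and the sample CSS are made up for illustration. The pattern is built by concatenation because a single-quoted PHP string does not interpolate `$className`, and the name is passed through `preg_quote()` in case it contains regex metacharacters. The closing brace is matched explicitly so that only the declarations are captured.

```php
<?php
// Hypothetical helper: return the declarations for a class or id name,
// or null when no matching rule is found in the CSS buffer.
function findRuleBody(string $name, string $css): ?string
{
    // [#.] accepts both ".name" (class) and "#name" (id) selectors.
    $pattern = '/[#.]' . preg_quote($name, '/') . '\s*\{([^}]*)\}/';
    if (preg_match($pattern, $css, $matches) === 1) {
        return trim($matches[1]);
    }
    return null;
}

$css = ".topNav { padding-top: 5px; }\n#main { padding: 10px; }";
echo findRuleBody('topNav', $css), "\n"; // padding-top: 5px;
```

This only handles selectors where the name is followed directly by the opening brace, so something like `.topNav:hover` would need a looser pattern.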
cheers perk, thats very similar to what i had but wasnt working, the bit at the end is different though.
will let ya know how it goes, and yeah its php. cheers matey

perkiset
Hey MM - I forgot something really important:
You really need to add an "m" as a modifier to the regex, i.e.,

preg_match('/([#.]$className\s*{[^}]+/m', $inputBuff, $matches);

... this dicked with me for a long time. PHP won't let the pattern span multiple lines without it... and if someone defines a class like this:

.topNav {
    font-family: courier;
    font-size: 12px;
}

.. then the first regex I posted won't work.

Hope that helps,
/p

m0nkeymafia
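For the multi-line rule in perkiset's post above, a quick check might look like the sketch below (sample CSS invented for the test, closing parenthesis added to the pattern). In PCRE a negated class such as `[^}]` already consumes newlines, so both versions capture the same text here. What the `m` modifier changes is where `^` and `$` are allowed to match, which matters once a pattern is anchored to the start of a line.

```php
<?php
// Sample rule spread over several lines.
$css = ".topNav {\n    font-family: courier;\n    font-size: 12px;\n}\n";

// The same pattern without and with the "m" modifier.
$withoutM = '/([#.]topNav\s*\{[^}]+)/';
$withM    = '/([#.]topNav\s*\{[^}]+)/m';

preg_match($withoutM, $css, $a);
preg_match($withM, $css, $b);

// [^}] is a negated character class, so it matches newlines either way.
var_dump($a[1] === $b[1]); // bool(true)

// "m" affects ^ and $: without it, ^ matches only at the start of the buffer.
var_dump(preg_match('/^#main/',  ".a {}\n#main {}")); // int(0)
var_dump(preg_match('/^#main/m', ".a {}\n#main {}")); // int(1)
```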
excellent perk thanks, just about to try it

m0nkeymafia
sorry to be lame but you omitted the closing )
where should it go?

Bompa
[quote author=m0nkeymafia link=topic=371.msg2415#msg2415 date=1182934987]
sorry to be lame but you omitted the closing )
where should it go?
[/quote]

My guess is at the end, but before the semicolon.

Bompa

m0nkeymafia
ahh amazin that worked - i think lol
cheers for the help guys, hopefully have this finished and tweaked by tonight o/

perkiset
Right on... BTW ... what's with the missing paren? There's no open parens in my post

m0nkeymafia
off the end apparently

ok so i now have the regex to pull classes and id's out of a webpage

'/(class|id) *= *"([a-zA-z0-9 ]*)"/i'

i then explode the results to pull out all classes even if two+ are specified

then i use perks code to grab the classes out of the style sheet

'/([#.]'.$val.'.*\s*{[^}]+)/m'

which all works great

only thing left to do is grab non-class selectors out of the style sheet, i.e. body {} and so forth

I originally tried something like this:

'/([^#.]'.$val.'.*\s*{[^}]+)/m'

but it didnt particularly work very well, as it seemed to pull all the classes off :/

anyone have any further ideas? cheers

perkiset
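The first step m0nkeymafia describes (collect every class and id, splitting attributes that list several classes) might look something like this. The HTML snippet is invented, and the character class is widened to `[\w\s-]` on the assumption that names may contain hyphens or underscores. Note that the `A-z` range in the posted class also spans the punctuation characters that sit between `Z` and `a` in ASCII. `preg_split()` is used in place of `explode()` so that runs of whitespace do not produce empty names.

```php
<?php
// Invented sample markup.
$html = '<div id="topNav" class="wt bold"><a class="bk s12" href="/">Home</a></div>';

// Group 1 is the attribute name, group 2 its quoted value.
preg_match_all('/(class|id)\s*=\s*"([\w\s-]*)"/i', $html, $matches, PREG_SET_ORDER);

$names = [];
foreach ($matches as $m) {
    // A class attribute may hold several names separated by whitespace.
    foreach (preg_split('/\s+/', trim($m[2]), -1, PREG_SPLIT_NO_EMPTY) as $name) {
        $names[$name] = true; // array keys de-duplicate repeated names
    }
}

print_r(array_keys($names)); // topNav, wt, bold, bk, s12
```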
One quick note: I'm assuming that the same code you used for style definitions could be used, except simply negate the period or pound sign in front and replace the $val with letters/numbers...

/([^#.][A-Za-z0-9_]+\s*{[^}]+)/m

... I think that'd do it, although you might possibly get some stuff on a page that you didn't want... that one might take a bit more testing...

m0nkeymafia
Yeah perk i tried that [although i now realise i copied the wrong syntax in my previous post]
for some reason it pulls up class definitions even though they have a preceding period! very odd?

p.s. i actually get the syntax u provided previously, when u first posted it i was like WTF !! but now it makes sense, i really hope my regex skills keep improving lol [slowly]

so any ideas why it may not work? even though we say we want text that is NOT directly preceded by a # or . ??

Bompa
The dot is a literal within square brackets.
(Don't escape it).

Bompa

thedarkness
Nice pickup Bomps.

Cheers,
td

m0nkeymafia
nice one bompa.
it still doesnt work though?

/([^.#][a-zA-Z0-9_].*\s*{[^}])/m

so if we expand it into constituent parts we have

Match a string that follows these rules:
- Starts NOT with a . or #
- Immediately followed by any number of letters or numbers or underscores
- Then match any amount of whitespace
- Then it needs to find an open parenthesis
- It then matches until it hits a closing parenthesis, at which point it stops
- * Works over multiple lines

I cannot see how this matches ALL classes within my stylesheet? The critical part, the NOT . or # reads fine to me? What am i missing? Cheers

Bompa
I would like to do some experimenting, but since I am a css idiot, maybe you could give me a sample of a page or a url to any page where this should work?

later,
Bompa

Bompa
[quote author=m0nkeymafia link=topic=371.msg2464#msg2464 date=1183208136]
nice one bompa.
it still doesnt work though?

/([^.#][a-zA-Z0-9_].*\s*{[^}])/m

so if we expand it into constituent parts we have

Match a string that follows these rules:
- Starts NOT with a . or #
- Immediately followed by any number of letters or numbers or underscores
- Then match any amount of whitespace
- Then it needs to find an open parenthesis
- It then matches until it hits a closing parenthesis, at which point it stops
- * Works over multiple lines

I cannot see how this matches ALL classes within my stylesheet? The critical part, the NOT . or # reads fine to me? What am i missing? Cheers
[/quote]

I think you're verbalizing it wrong, but anyways,

perk's code
/([^#.][A-Za-z0-9_]+\s*{[^}]+)/m

your code
/([^.#][a-zA-Z0-9_].*\s*{[^}])/m

perk has the + sign, you don't, instead you have .*
he also has a + near the end.

still if i had a sample of text to parse i could experiment

Bompa

Bompa
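One likely reason the negated class keeps returning class rules is that nothing ties `[^#.]` to the start of the selector. The engine can begin the match one character later, on the first letter of the class name. The sketch below, with invented sample CSS, shows perkiset's pattern doing exactly that, next to a variant anchored with `^` and the `m` modifier. The anchored variant is an illustration, not code from the thread.

```php
<?php
$css = ".topNav { color: red; }\nbody { margin: 0; }\n";

// Unanchored: [^#.] is free to match the "t" of ".topNav", so the
// class rule is still returned, just without its leading dot.
preg_match_all('/([^#.][A-Za-z0-9_]+\s*\{[^}]+)/', $css, $loose);
print_r($loose[1]);

// Anchored: with ^ and "m" the first character of the line itself
// must not be # or ., so only the "body" rule is returned.
preg_match_all('/^([^#.\s][A-Za-z0-9_]*\s*\{[^}]+)/m', $css, $strict);
print_r($strict[1]);
```

In the unanchored result the second entry also picks up the newline before `body`, since a newline is just another character that is neither `#` nor `.`.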
Ok, you all must be sleeping like babies.
I dug up a sample of css. This perl works for me:

while( $text =~ /(^[^#.]\w+{.*?})/msg ) {
    print "$1 ";
}

This says:
The character immediately after a newline can not be a # nor .
Then followed by one or more alphanumeric characters
Then followed by a left curly brace
Then followed by anything, (including a new line)
Stop matching at the first right curly brace

While the above worked for me, it's likely that I do not have a complete sample of the text you'll be parsing. Let me know.

There seems to be a lot of confusion with the ^ and $ metacharacters, as well as with the /m and /s flags.

m0nkeymafia
Hey Bompa
Cheers for goin out your way matey! I think i was sleeping / making up time with my gf lol

I managed to get a working version, some limitations, but works fairly well. Not had much chance to test it though, i got it working on my sample set then finished it off lol

Too much regex makes m0nkey a dull boy

/^\s*([\w,:]*)\s*{([^}]*)}/m

When I get chance I'll have a play with ur code, see if it works better

Cheers tho dude, you posted like 3 times in a row for me

thedarkness
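m0nkeymafia's final pattern has two capture groups, so it can hand back the selector and its declarations separately. A rough sketch of using it with `preg_match_all()` follows. The sample CSS is invented and the braces are escaped for readability.

```php
<?php
$css = "body { margin: 0px; }\n.wt { color: #FFFFFF; }\nh1,h2 { font-weight: bold; }\n";

// Group 1: selector made of word characters, commas and colons.
// Group 2: everything between the braces.
preg_match_all('/^\s*([\w,:]*)\s*\{([^}]*)\}/m', $css, $rules, PREG_SET_ORDER);

foreach ($rules as $rule) {
    printf("%s => %s\n", $rule[1], trim($rule[2]));
}
// body => margin: 0px;
// h1,h2 => font-weight: bold;
```

The `.wt` rule is skipped because a dot is not in `[\w,:]`. The same goes for mixed selectors such as `A.wt:HOVER`, which may be one of the limitations mentioned.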
Yeah Bomps,
You're like a true humanitarian......

Cheers,
td

m0nkeymafia
hahahaha
well were not curing world hunger, but a thankya was needed to y'all

perkiset
Hey all - sorry about not responding, turns out that even though I was mobile I was just too frigging happy being offline to make much use of my new broadband card to hook up.

Bomps - here is a bit of production CSS to look at (taken from a bunch of different places in my code, this wouldn't necessarily all be on one page, but gives a nice idea of what it all looks like):

<style type="text/css">
.arial { font-family: "Trebuchet MS", Verdana, Arial, Helvetica, sans-serif; line-height: 120%; }
.normal { font-weight: normal; }
.bold { font-weight: bold; }
.wt { color: #FFFFFF; }
.pad { padding: 1px 5px 1px 5px; }
A.wt { text-decoration: none; }
A.wt:HOVER { background-color: #671218; }
.bk { color: #220004; }
A.bk { text-decoration: none; }
A.bk:HOVER { background-color: #c78289; color: #ffffff; }
.s9 { font-size: 9px; }
.s10 { font-size: 10px; }
.s11 { font-size: 11px; }
.s12 { font-size: 12px; }
.s14 { font-size: 14px; }
.s16 { font-size: 16px; }
.s18 { font-size: 18px; }
.s20 { font-size: 20px; }
.s24 { font-size:24px; }
.left { text-align: left; }
.center { text-align: center; }
.right { text-align: right; }
body { margin: 0px 0px 0px 0px; background-image: url('/graphics/bgtile.gif'); background-repeat: repeat; }
#main { padding: 10px 10px 10px 10px; }
.placard { background-color: #ffffff; border-style: solid; border-color: #000000; border-width: 1px 3px 3px 1px; padding: 5px 10px 20px 10px; }
#clientArea { padding: 0px 30px 40px 40px; }
</style>

I am pretty consistent about my spacing, but there is no really strong standard out there for spacing / line breaking - I just write it so that it looks right. These examples are all valid:

A.bk:HOVER { background-color: #c78289; color: #ffffff; }
A.bk:HOVER { background-color: #c78289; color: #ffffff; }
A.bk:HOVER { background-color: #c78289; color: #ffffff; }
A.bk:HOVER{background-color:#c78289;color:#ffffff;}

I think your example of

/(^[^#.]\w+{.*?})/msg

is really close: I'd add that there *may* be white space between the class name and the first squiggly brace, but that's about it I think.

/p

m0nkeymafia
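A PHP port of Bompa's pattern with perkiset's optional whitespace added might look like the sketch below. The four layouts are assumptions made up for this test, not the exact ones from the post. Perl's `/g` has no PHP modifier (`preg_match_all()` plays that role), so only `m` and `s` are kept: `m` lets `^` match at the start of every line, and `s` lets `.` cross newlines.

```php
<?php
// Four layouts of the same rule (assumed for illustration).
$variants = [
    "body { margin: 0px; }",
    "body {\n    margin: 0px;\n}",
    "body\n{\n    margin: 0px;\n}",
    "body{margin:0px;}",
];

// \s* allows optional whitespace, including a line break, before the brace.
$pattern = '/(^[^#.]\w+\s*\{.*?\})/ms';

foreach ($variants as $css) {
    var_dump(preg_match($pattern, $css)); // int(1) for each layout
}

// Class and id rules are skipped: each line starts with . or #.
var_dump(preg_match($pattern, ".wt { color: #FFFFFF; }\n#main { padding: 10px; }")); // int(0)
```

Because `[^#.]\w+` needs at least two characters, a one-letter selector such as `p` would be missed by this version.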
yeah perk i think ur right dude
mobile? where are ya?

perkiset
Back now - I was at Lake Mead for the last 4 days - just bought a Sprint broadband (EVDO) card - it rocked. I was on the net and working for both the 4 hour drive up and back... but while I was there I just relaxed. Ahhh.

But now back to reality...

/p

m0nkeymafia
accounting day? ouch
in britain the company u work for takes care of that

perkiset
I haven't worked for someone else since my teens. Employment sucks, but so does accounting. To each his own I guess...

/p
