The Cache: Technology Expert's Forum
 
Author Topic: heeeeeeeeeeeeelp  (Read 1557 times)
m0nkeymafia
« on: June 26, 2007, 01:17:48 PM »

any of u regex nutters got a few mins to help a monkey out?

ive got a script to grab all the classes and id's used on a webpage
i then want to pull off any matching css code for each class out of the css files, but i cant figure out a regex to do it

i.e. if i have found a class called "topNav"
i want to search through the css files [already in a buffer] and find the corresponding definitions

i.e.

.topNav {  padding-top: 5px; }

would return just "padding-top: 5px;" (plus any other declarations in that rule, obviously)

this way if i rob someone's web layout / design and modify it, i can run it through this and discard any useless classes
can anyone gimme a hand on what to do? im stuck!!

cheers

I am Tyler Durden
perkiset
« Reply #1 on: June 26, 2007, 02:44:50 PM »

First - are you doing this in PHP?

(for the rest of this I'll assume so)

Second - you'll want to load all referenced CSS files (and their referenced files) just in case the definitions are not in the immediate page. Best way to find it then would be to create one big mess of all the linked files and do the Regex on <that string>.

Then, assuming you got a class reference, like 'topNav' and have it in the variable $className, you could do something like this:

preg_match('/([#\.]$className[\s]*{[^}]+/', $inputBuff, $matches);

... and then the $matches array would have anything that it found. Important to look for things that assign class by id too, rather than just class name - by this I mean:

<style>
#topNav { font-family: Arial; }
</style>
<div id="topNav">Here is the html</div>

Good luck!
/p
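A minimal sketch of the lookup described above, assuming the stylesheets are already joined into one string. The helper name `extractDeclarations` and the sample CSS are made up for illustration. It builds the pattern by concatenation because PHP does not expand `$className` inside a single-quoted string, it closes the capture group, and it runs the name through `preg_quote()`. This is a regex shortcut rather than a CSS parser, so comments or nested `@media` blocks may trip it up.

```php
<?php
// Sketch only: extractDeclarations() is a hypothetical helper, not code from the thread.
// It returns the declaration text of every rule whose selector contains .name or #name.
function extractDeclarations(string $css, string $name): array
{
    // preg_quote() escapes regex metacharacters in the name and the "/" delimiter.
    $quoted = preg_quote($name, '/');

    // [#.]         a class (.) or id (#) prefix
    // (?![\w-])    stops "topNav" from also matching "topNavBar"
    // [^{}]*       allows the rest of the selector, e.g. " a:hover"
    // \{([^}]*)\}  captures everything between the braces, newlines included
    $pattern = '/[#.]' . $quoted . '(?![\w-])[^{}]*\{([^}]*)\}/';

    preg_match_all($pattern, $css, $matches);

    // Trim the whitespace around each captured block.
    return array_map('trim', $matches[1]);
}

$css = ".topNav {  padding-top: 5px; }\n#topNav {\n  font-family: Arial;\n}\n.topNavBar { color: red; }";
print_r(extractDeclarations($css, 'topNav'));
```

With the sample string, the `.topNav` and `#topNav` rules should both come back, while `.topNavBar` should be skipped by the lookahead.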

If I can't be Mr. Root then I don't want to play.
m0nkeymafia
« Reply #2 on: June 27, 2007, 12:49:09 AM »

cheers perk thats very similar to what i had but wasnt working, the bit at the end is different though.
will let ya know how it goes

and yeah its php :D
cheers matey

perkiset
« Reply #3 on: June 27, 2007, 01:25:03 AM »

Hey MM - I forgot something really important:

You really need to add an "m" as a modifier to the regex ie.,

preg_match('/([#\.]$className[\s]*{[^}]+/m', $inputBuff, $matches);

... this dicked with me for a long time. PHP won't let the pattern span multiple lines without it... and if someone defines a class like this:

.topNav {
   font-family: courier;
   font-size: 12px;
}

.. then the first regex I posted won't work.

Hope that helps,
/p
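One thing worth checking on the `m` modifier: in PCRE it only changes where `^` and `$` can match, while `\s` and a negated class such as `[^}]` match newlines with or without it (the `s` modifier is the one that lets `.` match a newline). The sketch below, with a hard-coded class name and sample CSS chosen purely for illustration, compares the same pattern with and without `m` on the multi-line rule above.

```php
<?php
// Sketch only: the class name is hard-coded to keep the comparison easy to read.
$css = ".topNav {\n   font-family: courier;\n   font-size: 12px;\n}";

// No modifier at all: \s and [^}] are character classes, so they already match "\n".
$withoutM = preg_match('/[#.]topNav\s*\{([^}]+)/', $css, $a);

// Same pattern with "m": nothing changes, because the pattern has no ^ or $ anchors.
$withM = preg_match('/[#.]topNav\s*\{([^}]+)/m', $css, $b);

// Where "m" does matter: ^ may then match at the start of any line, not just the string.
$anchoredPlain = preg_match('/^\s+font-size/', $css);
$anchoredMulti = preg_match('/^\s+font-size/m', $css);

var_dump($withoutM, $withM, $a[1] === $b[1], $anchoredPlain, $anchoredMulti);
```

If the two unanchored calls capture the same text, whatever was going wrong with the first regex probably had another cause, such as the variable not being expanded or the unclosed group.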

m0nkeymafia
« Reply #4 on: June 27, 2007, 01:27:10 AM »

excellent perk thanks, just about to try it :D

m0nkeymafia
« Reply #5 on: June 27, 2007, 02:03:07 AM »

sorry to be lame but you omitted the closing )
where should it go?

Bompa
« Reply #6 on: June 27, 2007, 02:14:00 AM »

"sorry to be lame but you omitted the closing )
where should it go?"

My guess is at the end, but before the semicolon.


Bompa

m0nkeymafia
« Reply #7 on: June 27, 2007, 02:21:11 AM »

ahh amazin that worked - i think lol
cheers for the help guys
hopefully have this finished and tweaked by tonight \o/

perkiset
« Reply #8 on: June 27, 2007, 09:04:22 AM »

Right on... BTW ... what's with the missing paren? There's no open paren in my post ???

m0nkeymafia
« Reply #9 on: June 28, 2007, 09:17:30 AM »

off the end apparently :P

ok so i now have the regex to pull classes and id's out of a webpage
'/(class|id) *= *\"([a-zA-Z0-9_ -]*)\"/i'

i then explode the results to pull out all classes even if two+ are specified

then i use perks code to grab the classes out of the style sheet
'/([#\.]'.$val.'.*[\s]*{[^}]+)/m'

which all works great

only thing left to do is grab non-class selectors out of the style sheet, i.e. body {} and so forth
I originally tried something like this:

'/([^#\.]'.$val.'.*[\s]*{[^}]+)/m'

but it didn't work particularly well, as it seemed to pull out all the classes too :/
anyone have any further ideas?

cheers
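The first step mentioned above, pulling class and id names out of the page and splitting multi-class attributes, could be sketched roughly like this. The helper name `collectNames` and the sample HTML are assumptions for illustration, and the pattern only handles double-quoted attribute values.

```php
<?php
// Sketch only: collectNames() is a hypothetical helper, not the script from the thread.
// It returns every distinct class and id name found in double-quoted attributes.
function collectNames(string $html): array
{
    // Group 1 is the attribute name, group 2 the quoted value.
    // The leading \s keeps attributes such as data-id="..." from matching.
    preg_match_all('/\s(class|id)\s*=\s*"([^"]*)"/i', $html, $matches);

    $names = [];
    foreach ($matches[2] as $value) {
        // A class attribute may list several names separated by whitespace.
        foreach (preg_split('/\s+/', trim($value), -1, PREG_SPLIT_NO_EMPTY) as $name) {
            $names[] = $name;
        }
    }

    // Drop duplicates and reindex the array from 0.
    return array_values(array_unique($names));
}

$html = '<div id="topNav" class="menu wide"><span class="menu">x</span></div>';
print_r(collectNames($html));
```

Capturing `[^"]*` rather than a fixed character list means names with hyphens or underscores are picked up without having to enumerate them.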

perkiset
« Reply #10 on: June 28, 2007, 10:34:43 AM »

On a quick note, I'm assuming that the same code you used for style definitions could be used, except simply negate the period or pound sign in front and replace the $val with letters/numbers...

/([^#\.][A-Za-z0-9_]+[\s]*{[^}]+)/m

... I think that'd do it, although you might possibly get some stuff on a page that you didn't want... that one might take a bit more testing...

m0nkeymafia
« Reply #11 on: June 28, 2007, 03:21:50 PM »

Yeah perk i tried that [although i now realise i copied the wrong syntax in my previous post]

for some reason it pulls up class definitions even though they have a preceding period! very odd?
p.s. i actually get the syntax u provided previously, when u first posted it i was like WTF !! but now it makes sense, i really hope my regex skills keep improving lol [slowly ;)]

so any ideas why it may not work? even though we say we want text that is NOT directly preceded by a # or . ??

Bompa
« Reply #12 on: June 28, 2007, 11:39:28 PM »

The dot is a literal within square brackets.

(Don't escape it.)




Bompa
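A quick check, using made-up sample strings, suggesting that `[#\.]` and `[#.]` behave the same in PCRE: inside square brackets the dot is already literal, so the backslash is harmless but unnecessary. Dropping it tidies the pattern without changing what it matches.

```php
<?php
// Sketch only: sameResult() is a hypothetical helper for this comparison.
function sameResult(string $subject): bool
{
    $escaped   = preg_match('/[#\.]topNav/', $subject);
    $unescaped = preg_match('/[#.]topNav/', $subject);
    return $escaped === $unescaped;
}

foreach (['.topNav', '#topNav', 'xtopNav', 'topNav'] as $sample) {
    // Expect the two spellings to agree on every sample.
    var_dump(sameResult($sample));
}
```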


thedarkness
« Reply #13 on: June 29, 2007, 05:39:25 AM »

Nice pickup Bomps.

Cheers,
td

"I want to be the guy my dog thinks I am."
 - Unknown
m0nkeymafia
« Reply #14 on: June 30, 2007, 05:55:36 AM »

nice one bompa.
it still doesnt work though?

/([^.#][a-zA-Z0-9_].*[\s]*{[^}])/m

so if we expand it into constituent parts we have

Match a string that follows these rules:
- Starts NOT with a . or #
- Immediately followed by any number of letters or numbers or underscores
- Then match any amount of whitespace
- Then it needs to find an opening brace
- It then matches until it hits a closing brace, at which point it stops
- Works over multiple lines

I cannot see how this matches ALL classes within my stylesheet?
The critical part, the NOT . or # reads fine to me? What am i missing?

Cheers
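One possible explanation, sketched under assumptions (the sample stylesheet and both patterns below are illustrative, not taken from the thread): `[^.#]` does not mean "not preceded by"; it consumes one character that is anything except `.` or `#`. For `.topNav` the engine can simply begin the match at the `t`, so the class rule still matches, just without its dot. A negative lookbehind looks at the previous character without consuming it, which is closer to the stated intent.

```php
<?php
// Sketch only: sample stylesheet and patterns are illustrative, not from the thread.
$css = "body { margin: 0; }\n.topNav { padding-top: 5px; }\n#footer { color: red; }";

// [^#.] consumes one character that is not "#" or ".". The engine can start at the
// "t" of ".topNav", so the class rule still matches, minus its leading dot.
preg_match_all('/[^#.][A-Za-z0-9_]+\s*\{[^}]+\}/', $css, $naive);

// A negative lookbehind checks the preceding character without consuming it:
// the selector must not follow "#", "." or another name character.
preg_match_all('/(?<![#.\w-])[A-Za-z][A-Za-z0-9]*\s*\{[^}]+\}/', $css, $typeOnly);

print_r($naive[0]);
print_r($typeOnly[0]);
```

On the sample input the first pattern should pick up all three rules and the second only the `body` rule. For compound selectors such as `ul li { ... }` the lookbehind version would only capture the last part, so it would still need more testing on a real stylesheet.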
