
London
SEO
Hi,
I'm working on a new scraping script that simply detects whether the site I'm querying has a SITEMAP or not. I was playing with DOMXPath to load the HTML and work with it, but I think I'm limited to the attributes?

...some more code
$dom = new DOMDocument();
@$dom->loadHTML($myhtml);
$xpath = new DOMXPath($dom);
$hrefs = $xpath->evaluate("/html/body//a");
for ($i = 0; $i < $hrefs->length; $i++) {
    $href = $hrefs->item($i);
    $url = $href->getAttribute('href');
    //$alt = $href->getAttribute('alt');
    //$target = $href->getAttribute('target');
    echo "<br />Hey take a look: $url";
.... close

Any ideas how I can get DOMXPath to tell me whether a site contains a string (...anything...SITEMAP...anything)?

Thank you
J.

perkiset
Why use the DOM to get that info?
Just preg_match for the string you're looking for, then load it into the DOM if you find it has what you want.
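
For anyone landing here later, the two-step idea above could look roughly like this. It is only a sketch: the function names `mentionsSitemap` and `findSitemapLinks` and the sample `$myhtml` string are made up for illustration, and it assumes the page source has already been fetched into a string. The regex does the cheap check first; the DOM is only built if the string shows up. XPath is not limited to attributes either: `contains()` works on link text too, and since XPath 1.0 has no lower-case function, `translate()` is used here to fold the case.

```php
<?php
// Hypothetical helpers for illustration; not from the original post.

/** Cheap first pass: does the raw HTML mention "sitemap" anywhere (case-insensitive)? */
function mentionsSitemap(string $html): bool
{
    return preg_match('/sitemap/i', $html) === 1;
}

/** Second pass: hrefs of <a> elements whose href or link text contains "sitemap". */
function findSitemapLinks(string $html): array
{
    if ($html === '') {
        return []; // loadHTML() rejects an empty string on PHP 8
    }

    $dom = new DOMDocument();
    @$dom->loadHTML($html); // suppress warnings from sloppy markup, as in the question

    $xpath = new DOMXPath($dom);

    // translate() maps the uppercase letters of SITEMAP to lowercase,
    // which is enough to make contains() case-insensitive for this one word.
    $query = "//a[contains(translate(@href, 'SITEMAP', 'sitemap'), 'sitemap')"
           . " or contains(translate(normalize-space(.), 'SITEMAP', 'sitemap'), 'sitemap')]";

    $links = [];
    foreach ($xpath->query($query) as $a) {
        $links[] = $a->getAttribute('href');
    }
    return $links;
}

// Sample input, assumed for the example only.
$myhtml = '<html><body>'
        . '<a href="/about">About</a>'
        . '<a href="/SiteMap.xml">Site map</a>'
        . '<a href="/x">Sitemap</a>'
        . '</body></html>';

if (mentionsSitemap($myhtml)) {
    foreach (findSitemapLinks($myhtml) as $url) {
        echo "Hey take a look: $url\n";
    }
}
```

If all you need is a yes/no answer, the `preg_match` call alone is enough and you can skip the DOM entirely. Note this only inspects links in the page itself; it does not check whether a sitemap file actually exists on the server.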
