The Cache: Technology Expert's Forum
 
Author Topic: Objects? Where / what are the objects?  (Read 5704 times)
DangerMouse
« on: November 06, 2007, 01:26:46 PM »

Hello there,

I'm in the middle of creating my own spider system, and have taken great inspiration from Perk's spider class. I'm planning on adding a few things to it, simple SEO analysis type things ... Yahoo backlinks, PageRank, that sort of stuff.

I'm trying to model these things in terms of objects so I can define my classes correctly. I can understand how a "webpage" is an object: it has simple properties like title and meta tags, amongst plenty of other things. I guess my question is: would something like "pagerank", and the associated methods to acquire it, be an object in its own right, or part of, say, an "offsite factors" class?

More generally, I guess I'm asking to what degree it is sensible to break things down into objects. How low-level should you go?

It's a shame that all applications can't be as simple as the "dog" that's "brown" and capable of "barking". Tips appreciated.

Cheers,

DM

perkiset
« Reply #1 on: November 06, 2007, 01:48:05 PM »

Hey DM -

That's as broad a question as, "How should we build a building? I'm thinking it should have windows, and walls are good as well."

Sorry, not making fun, just pointing out that in OO thinking there is very little notion of "Right and Wrong." One of the beauties of OO thinking and architecture is that it can help you model your programmatic architecture around the way you perceive life/things around you.

For example: if I were to think about scrapers and an object hierarchy, it'd probably be something like this:
perksBaseClass
   requestBase
      httpRequest
         serpBase
            googleSerps
            yahooSerps
            askSerps
         scrapeBase
            blogScraper
      socketRequest
         secureSocketRequest
            merchantProcessingRequest
               linkPointRequest

I've added WAY more than you are asking so that you can see the way that I'd structure it - bear in mind that this is a quick notion and not well thought out. Another unfortunate truism in OO programming is that the 3rd time you write it is when you'll get it right...

Part of what you'll see in my hierarchies is levels where I MIGHT need to add stuff. Take, for example, the perksBaseClass - I like having the ability to add something to the entire family if I need to. Also, the names of the objects tend to outline what they will add to the family.
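
A rough sketch of the HTTP arm of that tree, written in Python rather than PHP purely for illustration. The class names come from the outline above; the `describe` method is a made-up stand-in for "something added to the entire family" and is not taken from Perk's actual spider class.

```python
class PerksBaseClass:
    # Anything added at this level is inherited by the entire family.
    def describe(self):
        # Hypothetical helper: list the chain of classes this object is built on.
        names = [cls.__name__ for cls in type(self).__mro__ if cls is not object]
        return " -> ".join(reversed(names))


class RequestBase(PerksBaseClass):
    """Knows what a request is, not how it is transported."""


class HttpRequest(RequestBase):
    """Adds HTTP."""


class SerpBase(HttpRequest):
    """Adds the general notion of a search results page."""


class GoogleSerps(SerpBase):
    """Adds the Google-specific parsing."""


chain = GoogleSerps().describe()
print(chain)
```

Each level names what it contributes, and `describe` only had to be written once, at the top.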

Although some might argue that this adds a lot of code to the compile process, this is where I employ APC. Here's the juice: if the code for a class is already compiled and ready to become an object, then you can have massively intricate and deep structures that work the way you think, and pay virtually no penalty for them. The VERY FIRST request to the system will pay the compile cost, but after that, all the complexity of your class trees is ready to go - and they'll have a huge amount of capability available to them at essentially no processor cost.

Just a quick
/p

It is now believed, that after having lived in one compound with 3 wives and never leaving the house for 5 years, Bin Laden called the U.S. Navy Seals himself.
DangerMouse
« Reply #2 on: November 06, 2007, 02:13:17 PM »

That actually helps quite a bit, thanks Perk, and you're right, the question was a little broad lol!

To a degree I've probably been getting caught up in the semantics of it all. While it's a good idea to code to standards, part of the beauty of PHP is its flexibility. Maybe I should think more about how I'll want to use and extend my code (and objects) in future, rather than breaking things down for the sake of it.

What started this off was reading a blog comment about class architecture being used as a different format for procedural code rather than providing a true OOP implementation. It got me thinking that, to a degree, this is what I've been doing with my little projects so far (a Yahoo! Answers API wrapper, for example). I'm trying to start thinking of classes as object blueprints rather than library components. This is straightforward in theory but seems to become a grey area in practice!

The idea of building objects on top of a Web Request object appeals. Currently I've been creating an instance of my web request class within the class definition for various scraper-type objects, but like you say, it would be more efficient to just extend the web request class, as it's essentially the base element of what's going on.

Hmm interesting stuff, thanks for the advice.

DM

nop_90
« Reply #3 on: November 06, 2007, 02:38:05 PM »

lol at 3rd time u get right :) true.

there are lots of books on how to do shit, usually crappy books on API etc
very few books on how to design shit

http://en.wikipedia.org/wiki/Design_Patterns
read the book described there.


DangerMouse
« Reply #4 on: November 06, 2007, 03:28:42 PM »

Quote from: nop_90
there are lots of books on how to do shit, usually crappy books on API etc
very few books on how to design shit

Yeah, I'm starting to notice that, Nop! Don't get me wrong, I've got some decent info from books and tutorials, but I tend to be able to find most of what I need with a simple Google search. However, there's a distinct lack of beginner information on how to structure code - it's all very well knowing how to create a class, how to extend one and how to use magic functions, but there's very little on where the best place to use them is.

perkiset
« Reply #5 on: November 06, 2007, 03:36:34 PM »

Quote from: DangerMouse
The idea of building objects on top of a Web Request object appeals. Currently I've been creating an instance of my web request class within the class definition for various scraper-type objects, but like you say, it would be more efficient to just extend the web request class, as it's essentially the base element of what's going on.

I do exactly that. As you may or may not know, when you have a class hierarchy, the code for each "level" of the tree is compiled only once and referenced from then on. So if you have 10 classes built on a huge base class, you'll only have one instance of the base class code in memory and 10 little wrappers for the additions to the class. It's very cost-effective.
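
The same idea can be poked at in Python. This is an illustration only and says nothing about how PHP or APC lay things out internally; the class and method names are hypothetical. The inherited method exists once, on the base class, and the subclasses simply refer to it.

```python
class HttpRequest:
    def fetch(self, url):
        # Stand-in for a big chunk of shared request code.
        return "GET " + url


class GoogleSerps(HttpRequest):
    pass


class YahooSerps(HttpRequest):
    pass


# Neither subclass carries its own copy of fetch...
has_own_copy = "fetch" in vars(GoogleSerps) or "fetch" in vars(YahooSerps)

# ...both resolve to the single function object defined on the base class.
shared = GoogleSerps.fetch is HttpRequest.fetch and YahooSerps.fetch is HttpRequest.fetch

print(has_own_copy, shared)
```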

/p

nop_90
« Reply #6 on: November 06, 2007, 04:02:53 PM »

99% of the problem with good code is proper design structure.
Most people suck at that.

That is why you read the Gang of Four book (or a similar book).
It is like a book of blueprints / design patterns.

Not to say Perk's way is wrong (there really is no right or wrong, just as long as it works).
But you can see from his layout he comes from the "old skool", where for a class to be polymorphic it had to inherit off a common ancestor with the same methods. (This is the way Object Pascal and C++ work; it has to be this way since they are compiled.)

Not sure how PHP works, but with languages like Perl, Ruby and Python it does not have to be this way, and you can harness the true power of the language.

So I would make a webclient class.

TWebclient
  • get
  • post

Then I make a separate class for each search engine.
Inside each class there would be a webclient,
and each class would have one function called search,
so it would look like this

class TSearchGoogle:
    def __init__(self):
        # the engine owns a webclient instead of inheriting from one
        self.webclient = TWebclient()

    def search(self, query):
        # fetch the results page via self.webclient and parse it here
        results = []
        return results

To make a class for another SE, just do exactly the same.

Now since you are using a scripting language you can do cool shit like this

def search_engines(engine_classes, query):
    results = []
    for engine_class in engine_classes:
        engine = engine_class()
        results.append(engine.search(query))
    return results

I would then use it like this
search_engines([TSearchGoogle, TSearchYahoo, TSearchMsn], query)

So I have the advantage of Perk's system: if I need to change my webclient code, it will propagate down.
But on the other hand I need a lot fewer classes.
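
A runnable Python sketch of the "change the webclient once and it propagates" point. Everything here is a stub under stated assumptions: `user_agent` and the `*.example` URLs are made up for illustration, and `get` returns a string instead of making a real request.

```python
class TWebclient:
    user_agent = "spider/1.0"  # hypothetical setting shared by every engine

    def get(self, url):
        # Stub: a real client would perform the HTTP request here.
        return self.user_agent + " GET " + url


class TSearchGoogle:
    def __init__(self):
        self.webclient = TWebclient()

    def search(self, query):
        return self.webclient.get("http://google.example/?q=" + query)


class TSearchYahoo:
    def __init__(self):
        self.webclient = TWebclient()

    def search(self, query):
        return self.webclient.get("http://yahoo.example/?q=" + query)


def search_engines(engine_classes, query):
    # No common ancestor required: anything with a search() method will do.
    return [engine_class().search(query) for engine_class in engine_classes]


before = search_engines([TSearchGoogle, TSearchYahoo], "seo")

# One change to the webclient reaches every engine that uses it.
TWebclient.user_agent = "spider/2.0"
after = search_engines([TSearchGoogle, TSearchYahoo], "seo")
```

The engines share no base class, yet both pick up the webclient change because each one holds a webclient rather than being one.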

perkiset
« Reply #7 on: November 06, 2007, 05:07:12 PM »

@ OldSkool - it's a fair cop ;)

I am confused, however: if we look at simply the last layer of the HTTP arm of my hierarchy, would this not be very similar? I was outlining that you'd have an HTTP request class, to which you would add the understanding of serps, and then add to that the understanding of Google, or Yahoo or such. So you have one body of code that understands HTTP, then perhaps the next layer is simply abstract functions like getTop10() or an array for the actual serps - no real implementation code, just the structures that the next layer will implement, specific to the idioms of each engine that you scrape. I dunno, just shitkickin.
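
One way that "abstract middle layer" could look, sketched in Python with the standard `abc` module instead of PHP. `get_top10` mirrors the getTop10() mentioned above; the `fetch` stub and its fake results are assumptions made only so the example runs.

```python
from abc import ABC, abstractmethod


class HttpRequest:
    def fetch(self, url):
        # Stub standing in for the real HTTP code.
        return ["result %d for %s" % (i, url) for i in range(1, 13)]


class SerpBase(HttpRequest, ABC):
    @abstractmethod
    def get_top10(self, query):
        """No implementation here; each engine supplies its own."""


class GoogleSerps(SerpBase):
    def get_top10(self, query):
        # Engine-specific idioms (URL format, parsing) would live here.
        return self.fetch("google?q=" + query)[:10]


top10 = GoogleSerps().get_top10("spiders")

try:
    SerpBase()  # the abstract layer cannot be instantiated on its own
    abstract_enforced = False
except TypeError:
    abstract_enforced = True
```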

As an interesting point of research, it is good to look at the Borland Delphi/C++ VCL hierarchy, in which *every single* class in the library ultimately descends from TObject. It is an excellent way to see some REALLY good design of a huge library, although it's a fantastic amount of overkill in about 99.9% of situations. But it's fair to say that my design methodology was hugely influenced by that work.

DangerMouse
« Reply #8 on: November 07, 2007, 04:15:11 AM »

This is a really interesting discussion, highlighting that there seems to be no right or wrong approach.

I've just been taking a look at this: http://framework.zend.com/manual/en/zend.service.flickr.html - they seem to provide results as iterations of an object itself. I've never thought of constructing classes like this (being a noob) and was just wondering what your views on this approach are?

DM
« Last Edit: November 07, 2007, 06:15:37 AM by DangerMouse »

stefan
« Reply #9 on: November 07, 2007, 05:39:00 AM »

Yes, interesting discussion! I want to say something as well, even though I don't do PHP but rather C#/C++ etc...

Agreed on no rights or wrongs - my basic idea is that no one can tell *me* what's best practice about anything, and that goes for object modelling as well. I'm looking for best practice for ME, and that might be totally wrong for others. I am lucky enough to be alone in my shop and therefore have no boss telling me what's wrong or not.

My best tip is: yes, do read up, yes, listen to others, but code/code/code and you'll find a model that works for YOU. And that's the most important thing.

My second idea when it comes to object modelling, and in the end mapping against physical storage (a database in 99% of the cases for me), is:

a) DON'T START WITH THE MODEL - start by writing your client. I.e. think about how you want the code that uses the objects to look:

o = new googleSerps();
o.Run();  (or however Perk was thinking it should be used - sorry if I am completely wrong...)

or maybe you fancy nop_90's notion:

search_engines([TSearchGoogle,TSearchYahoo,TSearchMsn])

Because the purpose of the object model is to be USED - not to be the perfect replica of the "real world" objects behind it. You will need to make changes either way: your requirements will change - or the WORLD (i.e. the objects, or even the meaning of the objects) will change, since the world around us does indeed change.

b) I will mess this serp example up even more and say we have a database as the backend. I typically create a class with EXACT mappings to the database fields (or stored procedure result fields), using a code generator.
Then I make a new class which inherits from that: a "business layer" where I put functions and properties. That's the base I start with, at least.
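
A small Python sketch of that two-layer idea (the post is about C#, so this is a translation of the pattern, not stefan's code). The table columns and the `is_top10` / `domain` helpers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SerpRow:
    # Exact mirror of a hypothetical "serps" table; normally machine-generated.
    engine: str
    query: str
    position: int
    url: str


class Serp(SerpRow):
    # "Business layer": behaviour only, no new stored fields.
    @property
    def is_top10(self):
        return self.position <= 10

    def domain(self):
        # "http://host/path".split("/") -> ["http:", "", "host", "path"]
        return self.url.split("/")[2]


row = Serp(engine="google", query="spiders", position=3, url="http://example.com/page")
print(row.is_top10, row.domain())
```

If the table changes, only `SerpRow` is regenerated; the hand-written code in `Serp` stays put.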





    
nop_90
« Reply #10 on: November 07, 2007, 05:47:35 PM »

Quote from: perkiset
@ OldSkool - it's a fair cop ;)

Old skool, not bad skool :)

Quote from: perkiset
I am confused, however: if we look at simply the last layer of the HTTP arm of my hierarchy, would this not be very similar? I was outlining that you'd have an HTTP request class, to which you would add the understanding of serps, and then add to that the understanding of Google, or Yahoo or such. So you have one body of code that understands HTTP, then perhaps the next layer is simply abstract functions like getTop10() or an array for the actual serps - no real implementation code, just the structures that the next layer will implement, specific to the idioms of each engine that you scrape. I dunno, just shitkickin.

The end result will be the same :) The difference is what happens behind the scenes.
The end result is that we have 3 classes, all with 1 function called search.
My previous example kinda sucked: I hid the webclient object inside the class. It really does not matter whether you do it Perk's way or mine.

Code:
class SearchGoogle:
    def search(self, query):
        ...  # Google-specific scraping goes here

class SearchYahoo:
    def search(self, query):
        ...  # Yahoo-specific scraping goes here

class SearchMsn:
    def search(self, query):
        ...  # MSN-specific scraping goes here

Behind the scenes in Perk's case (the Delphi case) they all inherit off serpBase, which has to have a virtual function called search.
In my case they are 3 independent classes; the only thing they have in common is one member function called search.

In my case you can do interesting things like this.
@dangermouse yes, I do stuff like that all the time; that is the power of a dynamic language.
Code:
seConstructors = [SearchGoogle, SearchYahoo, SearchMsn]
query = "blackhat seo"
results = []
for seConstructor in seConstructors:
    se = seConstructor()
    # whatever the object is, I am able to call its member function search
    results.append(se.search(query))

The advantage of my method is that it eliminates 1 class. Smaller code means quicker coding time, and it is also easier to debug.
The disadvantage of dynamic languages is that type errors will not be caught until the code runs.
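
As a side note on that disadvantage: current Python (3.8 and later, so well after this thread) lets you describe "anything with a search method" without a common ancestor, using `typing.Protocol`. The sketch below is only one possible approach; `SearchEngine`, `NotAnEngine` and `run_all` are invented names, and the runtime check only confirms that the method exists, not that its arguments are right.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SearchEngine(Protocol):
    def search(self, query: str) -> list: ...


class SearchGoogle:  # note: does not inherit from SearchEngine
    def search(self, query: str) -> list:
        return ["google result for " + query]


class NotAnEngine:
    pass


def run_all(engine_classes, query):
    results = []
    for engine_class in engine_classes:
        engine = engine_class()
        # Fail early with a clear message instead of an AttributeError later.
        if not isinstance(engine, SearchEngine):
            raise TypeError(engine_class.__name__ + " has no search() method")
        results.extend(engine.search(query))
    return results


ok = run_all([SearchGoogle], "blackhat seo")
```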

Quote from: perkiset
As an interesting point of research, it is good to look at the Borland Delphi/C++ VCL hierarchy, in which *every single* class in the library ultimately descends from TObject. It is an excellent way to see some REALLY good design of a huge library, although it's a fantastic amount of overkill in about 99.9% of situations. But it's fair to say that my design methodology was hugely influenced by that work.

Always a good idea to look at how others solved the problem.
And I suspect that the design of Python was strongly influenced by Delphi.
He basically took ideas from Lisp and then combined them with Delphi.
Python was the first mainstream language to do this. (Perl objects are a whole different ball of wax: basically you can take a reference to any type (hash, array, etc.) and "bless" it into a class.)
All objects in Python inherit off a common object (this occurs automatically).
Inside that Python parent object are functions that allow you to query/manipulate the classes/objects at runtime (a Lisp idea).

This allows you to do extra cool shit like http://pyro.sourceforge.net/
You can make a class that constructs another class that you specify. While the class is being constructed (since you can query it at runtime etc.), you can stick "hooks" in there to intercept the calls. Basically that is what Pyro does.
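
The general trick can be sketched in a few lines of Python. This is not Pyro's actual implementation, just a minimal illustration of hooking calls at runtime; `CallLogger` is a made-up name, and a remote-object library would send the call over the network where this one merely records it.

```python
class CallLogger:
    """Wraps any object and records each method call made through it."""

    def __init__(self, target):
        self._target = target
        self.calls = []

    def __getattr__(self, name):
        # Only invoked for names that are not found on CallLogger itself.
        attr = getattr(self._target, name)
        if not callable(attr):
            return attr

        def hooked(*args, **kwargs):
            self.calls.append((name, args))  # the "hook"
            return attr(*args, **kwargs)

        return hooked


class SearchGoogle:
    def search(self, query):
        return ["google result for " + query]


engine = CallLogger(SearchGoogle())
results = engine.search("blackhat seo")
print(engine.calls)
```

The caller uses `engine` exactly as it would use a plain `SearchGoogle`, which is what makes this kind of interception so convenient.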

http://psyco.sourceforge.net/ is even cooler.
Python compiles to bytecode for a virtual machine. Psyco gets hold of that VM code at runtime, the first time a function is executed, and checks whether it can be compiled into machine code. If possible it will do so, hence a massive speedup in code.

With Python the biggest thing they stole from Lisp is the idea of a console which you can attach to a running process.
(There are implementations that allow you to attach an SSL Python console to something like a webserver.)
You can then poke and prod around in the insides while the server is running.

I suspect that PHP allows similar things to be done.
(A casual glance at the PHP manual shows this: http://www.php.net/manual/en/language.oop5.reflection.php)
But these things are used very little, or not at all.

Example of Python's xmlrpc:
http://docs.python.org/lib/xmlrpc-client-example.html
Because the code uses introspection (not sure if that is the correct term; it builds itself at run time),
xmlrpc is a snap to use: basically you set the server, and then you can call xmlrpc functions just like you call regular ones.
Magically, at runtime, when a function is called using RPC it will convert into the correct Python types or vice versa (Perl is a similar story).
It is not really magic :) Basically when the function is called it inspects the arguments, and then converts them into the proper XML representation.
When the call is done it does the exact reverse.
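
The conversion step can be seen without any server at all. The sketch below uses the standard library's `xmlrpc.client` (the Python 3 name for what was `xmlrpclib` when this thread was written); the remote method name "search" and its arguments are made up for the example.

```python
import xmlrpc.client

# Marshal a call to a hypothetical remote method "search" into XML-RPC's wire format.
payload = xmlrpc.client.dumps(("blackhat seo", 10), methodname="search")
print(payload)

# And the exact reverse: XML in, native Python types out.
params, method = xmlrpc.client.loads(payload)
print(method, params)
```

A `ServerProxy` object does the same marshalling behind the scenes for every call made through it.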

This is not a problem with PHP; it is how people use it.
Part of the problem with PHP is that its major selling point was basically C++ using a scripting language.
Yeah, it sounds good when you sell the idea to the guys upstairs,
but the consequence is that you lose the advantages/safety of Delphi, where you have a compiler checking for proper types etc.,
and you do not gain any of the advantages of the language.

