The Cache: Technology Expert's Forum
 
*
Welcome, Guest. Please login or register. September 22, 2019, 06:36:03 PM

Login with username, password and session length


Pages: [1]
  Print  
Author Topic: new database columns or relational method?  (Read 4332 times)
nutballs
Administrator
Lifer
*****
Offline Offline

Posts: 5627


Back in my day we had 9 planets


View Profile
« on: August 23, 2008, 04:06:00 PM »

In my 'see if i want to do it' phase of creating a blog/cms/forum platform, I ran into a challenge that I am curious of opinions on.

First of, i realized that a blog post with comments is just like a forum post with followup posts. And a 'static' page is a blog post without comments.

So because of that I realized that you could do all three in 1 database, using 1 package to manage it, and the template would dictate the look and feel, not the data.

If you assume that this system would allow an arbitrary number of fields for an entry, for example, title, body, image, metas, where found, etc. I could store those 2 ways. either each in a column of 1 table, or all in a reference table with a key/value pair as a referential system.

In alot of respects, internal management of the data would be alot easier everything was just in 1 table, and as a result no need for multi-row parsing of a referential setup

for example, in the single table:
PostID   title   body   image
1          aaa   bbb     ccc

or the referential table
PostID    key   value
1           title   aaa
1           body  bbb
1           image ccc

The single table method would require columns to be added or removed from the actual table structure, at the whim of the administrator if they wanted to add a video column for example.
The referential table would require no structure changes, just new entries of that video column.

The advantage of single table is, 1 row = 1 set of data, but at the cost of a potentially huge row.
the advantage of referential is no structure change, but at the cost of multiple rows being dealt with.

My concern with the single row method would be allowing the administrative interface to affect the structure, which to me seems awfully dangerous. On the other hand, management of the data is far easier. Want to delete a column and all the data associated with it, boom gone. Referentially, you gotta do cleanup.

Does that make sense?
I am curious of thoughts on this.
Logged

I could eat a bowl of Alphabet Soup and shit a better argument than that.
dink
Expert
****
Offline Offline

Posts: 349


View Profile
« Reply #1 on: August 24, 2008, 05:30:21 PM »

I think the single table method would be best.

An admin module with the 'add col' , 'del col' , etc would make it easy enough.

I don't know of any reason that a huge row would cause problems either.  I've made excel tables with 45-50 cols in a row.  Didn't seem to make any difference in performance.

 
Logged

[quote Nutballs]
the universe has a giant fist, and its got enough whoop ass for everyone.
[/quote]
nutballs
Administrator
Lifer
*****
Offline Offline

Posts: 5627


Back in my day we had 9 planets


View Profile
« Reply #2 on: August 24, 2008, 05:43:22 PM »

i was also considering serializing the data, and storing in 1 big column, but that would make cleanup/management difficult.

it really does seem like 1 table it best, but it goes against everything I know... lol
Logged

I could eat a bowl of Alphabet Soup and shit a better argument than that.
vsloathe
vim ftw!
Global Moderator
Lifer
*****
Offline Offline

Posts: 1669



View Profile
« Reply #3 on: August 25, 2008, 03:24:52 AM »

You should normalize it to the nines, at which point you have a ton of flexibility and you will be able to code whatever you want into your CMS without touching the database again.

Well, except through code. Unless you're like me and write a database abstraction layer though, e.g.

Code:
<?php
/*****************************************
An insert*********************************
*****************************************/
$database->insert('myTable',$arrayOfMyData);
/*****************************************
A select**********************************
*****************************************/
$database->requireRelationship('postTag','doohickies');
$postsAboutDoohickies $database->getContent();
/***etc. etc. etc.***/
?>

Logged

hai
nutballs
Administrator
Lifer
*****
Offline Offline

Posts: 5627


Back in my day we had 9 planets


View Profile
« Reply #4 on: August 25, 2008, 07:21:39 AM »

see thats the rub.

normalized is my normal M.O.
however, there have been times where I have had to do some really retarded things in DB because it just made things work better.

an example is having about 100 boolean columns, and in order to search for true/false, or a specific column, I have a lookup table of the column friendly names which tells the column code name. This is to prevent injections because of the front end scenario. I hope to god noone ever looks at it other than me, because frankly they would think I was a loony, unless of course they realized the reason.

In the blog case, my concern is speed. Normalizing will require a hell of a lot more DB queries. But like you said, the DB is then done, etched in stone, no more worries.

Bah. I will head down the road i know, which is normalized. Im sure it will be the wrong one... lol
Logged

I could eat a bowl of Alphabet Soup and shit a better argument than that.
perkiset
Olde World Hacker
Administrator
Lifer
*****
Offline Offline

Posts: 10096



View Profile
« Reply #5 on: August 25, 2008, 08:01:15 AM »

Hey Nuts - sorry it took a while to get here.

I have a similar problem with my contacts, contact_addresses, contact_fops etc etc tables... what I really wanted was a single huge wide row, but that just didn't make sense. For example, contact_addresses makes it so that one contact can have any number of possible addresses - but a field in the main contact table, billing_id for example, points to a single address in the contact_address table that says, "This is the current one."

My solution was to use views. I create a view called contactview which is maleable and mimics whatever structure I want, regardless of the underlying tables (oh - forgot to mention - this db layout is slightly different for each client, based on needs [I can go into why in another space] - but I do not want to have to change my underlying logic for maintaining lists of people, eblasts, newsmedia etc etc)

Views are prepared SQL statements and run really fast, so you get the single table speed, but you're not recreating or complicating things when you change your layout. This also benefits from the older DB notion, that if you include variable size blobs (like posts) in a table you increase the lookup time, so you want your primary indexed table to be all fixed length - if your main table is fixed length, then joins to a blob table, yet it is perceived as a view, you'll get the benefit of all listed above.

Sorry, going fast on the way to take the kids to school.. more in a bit if I've been confusing.
/p
Logged

It is now believed, that after having lived in one compound with 3 wives and never leaving the house for 5 years, Bin Laden called the U.S. Navy Seals himself.
nutballs
Administrator
Lifer
*****
Offline Offline

Posts: 5627


Back in my day we had 9 planets


View Profile
« Reply #6 on: August 25, 2008, 08:21:13 AM »

That helps. forest through the trees and all that, apparently...

duh. views. I could also do SPs of course, which actually might make more sense in this case. Though there is still that issue of hosting companies now allowing SPs...

I am going to post what I think a basic table structure of this would be. And I am curious what the opinions are.
Logged

I could eat a bowl of Alphabet Soup and shit a better argument than that.
perkiset
Olde World Hacker
Administrator
Lifer
*****
Offline Offline

Posts: 10096



View Profile
« Reply #7 on: August 25, 2008, 09:07:15 AM »

I've done just a little talking with some folks that have downloaded phpMyIDE - it seems that getting access to your own SPs is considerably easier than triggers, since SPs apply to <your database> but triggers are global privileges.

I actually made use of a post you made a long while ago, and have minimized my trigger usage to only a couple extreme situations, choosing callable SPs from php instead to make my logic more easily followed. Great advice and it's working really well.
Logged

It is now believed, that after having lived in one compound with 3 wives and never leaving the house for 5 years, Bin Laden called the U.S. Navy Seals himself.
nutballs
Administrator
Lifer
*****
Offline Offline

Posts: 5627


Back in my day we had 9 planets


View Profile
« Reply #8 on: August 29, 2008, 12:30:49 PM »

glad it made sense and works for you. like I said, triggers are awesome, but can be scary. I don't like scary. I like to know where everything is coming from.

As for my little adventure into CMS/Blog/forum land.
I have been thinking quite a bit about it. On the back end, i realized a simple ruleset could apply.
Every entity, be it a post, a comment, a user, etc, could be defined in 3 tables, and allow for interaction between the two.
this would also allow for unlimited flexibility, and significant compactness of the core set of features. Of course this will bloat out to absurdity with the more stuff that gets bolted on, but the core is all i care about.

There are sections, which would allow a single installation to run multiple sites, blogs, forums, etc.
There are users, duh.
There are posts, which are associated with section and a user
There are comments, which are followups to a post.

like i said in the OP I realized that all posts online follow the same format, no matter what the type of system they are for.
A static website has a posts, with no comments.
a blog has posts with or without comments.
a forum has posts generally with followup comments.

since they all follow that criteria, it became a duh.

The only table I can see adding is a Usersections lookup so users can be restricted.
UsersectionID
fk_UserID
fk_SectionID

let the feedback rip.

Logged

I could eat a bowl of Alphabet Soup and shit a better argument than that.
nop_90
Global Moderator
Lifer
*****
Offline Offline

Posts: 2203


View Profile
« Reply #9 on: August 29, 2008, 01:42:24 PM »

i was also considering serializing the data, and storing in 1 big column, but that would make cleanup/management difficult.

it really does seem like 1 table it best, but it goes against everything I know... lol
That is the whole idea behind zope (which i posted about in another place).
And it solves all those problems, but it is dependant on python.

For a blog like system probably overkill.
Logged
nutballs
Administrator
Lifer
*****
Offline Offline

Posts: 5627


Back in my day we had 9 planets


View Profile
« Reply #10 on: August 29, 2008, 02:02:01 PM »

i had looked at zope, regardless of the python issue.
it sounds similar to what I am thinking, and actually i am planning to install it and check it out because i might steal some concepts Smiley
Logged

I could eat a bowl of Alphabet Soup and shit a better argument than that.
nop_90
Global Moderator
Lifer
*****
Offline Offline

Posts: 2203


View Profile
« Reply #11 on: August 29, 2008, 02:48:01 PM »

There have been other solution attempts in other languages.
Biggest reason for failure in my opinion is because of lack of resources. You might be able to come up with a good solution, but because of lack of time/funding u can never really complete it fine tune it. ZODB has backing of zope so it is also not going anywhere.
(if u just want to do for fun cool Smiley)

Biggest issue for indexing u need to come up with some sort of BTree storage device, pickling the arrays,dicts will not cut it in large structure.
Also cleanup/management. ZODB has utilities that handle all this stuff. But u have to "clean" the database regularly otherwise it will bloat up. (a simple cron job, but still it is an issue).

Also before ZODB became seperate project of zope, it had already been powering zope for 5 years.
Python seems to be quite big in scientific community. So they have invested quite a bit of time using other project based on zodb.

Another idea is http://en.wikipedia.org/wiki/Sedna_(database) and xml database (i have never actually used project).
It appears to have all the advantage of an odbms, and it has binding for other languages.
It saids it has a PHP binding, but i suspect it probably old, and u probably will have to poke arround in the source to make it go.
(probably still lot easier then making ur own odbms from scratch).
Biggest problem with these type of project is lack of documentation.
Lots of great academic based projects out there, but they all seem to have zero documentation.
Seems to be the attitude unless u poke arround in the source code, u are not a real man and should not use them Smiley.
Logged
nutballs
Administrator
Lifer
*****
Offline Offline

Posts: 5627


Back in my day we had 9 planets


View Profile
« Reply #12 on: August 30, 2008, 08:52:41 AM »

bah, i ran into a problem already. lol

templates. So now I added a template table, so each section can have many templates.
but that now made me realize another issue.

how to deal with a home page.
for any URL that is translatable to a page in the database, no problem. this seems like it will work fine.
For a URL that is not known, I can take of that too, probably just need to htaccess that to pull the correct template.
But the root page, index, home, whatever you want to call it. Thats a bit sticky it seems.
since often you will have a home page that is a different template than the rest of the site, and is only used once, I need to somehow determine what that template is. I guess putting a bool of IsIndex or something in the database would work, and if the URL has no params, or directory, then its the index, so display that template.
But then that creates a problem for sub-sites or sub-sections. I guess I need to store the URL for the root of a section.

So my structure works for the actual blog entries, but for 'default pages' its a not going to work.

mostly talking out loud in case your wondering...
Logged

I could eat a bowl of Alphabet Soup and shit a better argument than that.
perkiset
Olde World Hacker
Administrator
Lifer
*****
Offline Offline

Posts: 10096



View Profile
« Reply #13 on: August 30, 2008, 11:15:37 AM »

I haven't weighed in yet, because you're looking for feedback and as yet, I don't see anything wrong with where you are going.

At templates, home page, URLs etc:
I have this problem all day long because every single site I have is WAY dynamic. Even though some are simply the construct of my header and a simple HTML "client area" file, they are still dynamic. Some, like with my cinema client, almost all pages are specialized code, pulling schedules and movie data from a database - yet the url is quite simply, "/movies/h/Hancock.html"

My solution has been a translation table that instructs me how to behave for each page. In some cases, this table is literally a table in a database, in some cases it's a chunk of code and in some it's both. But consider this - if I look at the url above, I can convert that into any instruction that I want to behind the scenes, and then behave accordingly:

/movies/h/hancock.html -> ./simplePage.php?file=/h/hancock.html&header=simple&template=movieDetail.php&cached=false&menuID=-1

Now that might look like a bunch of meaningless work. However, here's what's important: the various parts of my dynamic framework have no idea about each other - they are very object oriented. Client area stuff has no idea how it's being display - the menu system has no connection to the header, the header has no idea what's in it - the caching system has no connection to anything except itself. By converting this URL into instructions, THEN processing through the dynamic construction of the page, each system can call the code that it needs to in it's own space. In this case, the menu ID is -1, meaning that there is nothing "highlighted" - but if I'd passed an ID, then the menu-generation code would have known to highlight a certain component of the menu - again, without knowledge or intervention by the header or content.

Note, for example, that I'm telling my header what kind of outer framework I want for this page - the client area display module will be movieDetail.php - this is a real requirement - I'll be doing something akin to this: require($_GET['template']) - which takes advantage of APC nicely as well.

(No, JasonD, I am not actually doing THAT and no, you will not be able to break any of my sites through this mechanism. As a side note, GET parameters are all stripped off a URL before I even begin processing a page - since I NEVER have get parameters on a client site, I strip them all off, then fill the GET array with what I do want myself. I just use the $_GET because it's automatically global and ... for other reasons outside the scope of this post).

I tried the .htaccess and mod_rewrite method for quite a while, but the gyrations that I needed to go through were just too much and still not powerful enough. My translation mechanism today looks more like this:
  • Is there a cached record that is EXACTLY what the inbound URL is? If yes, return that buffer and exit.
  • Does (this site) have a custom translation function? If yes, call it and see if the return value is TRUE (handled) or FALSE (not handled)
  • If the URL has not yet been handled, look to the database and see if there is a translation record for the inbound URL.
  • If still not handled, does the URL fit my simple mapping?
  • If not handled, register this as a hack attempt and throw the request away.
  • If handled, load the $_GET array with (whatever the URL looks like NOW)
  • Call the processor referenced as the NEW URI (in the above case, simplePage.php, which is a simple file_get_contents sort of page).
  • Echo the page buffer created by <the processing entity>
  • If $_GET['cached'] is true, then post the page buffer into the APC cache named <the original URL>

Note that I have dozens and dozens of different page construction modules, the activity of which is probably obvious from some of these names:
  • simplePage.php
  • simpleInclude.php
  • simpleForm.php
  • plaquePage.php
  • galleryPage.php
  • itemDetail.php
  • movieDetail.php
  • movieTrailer.php
  • scheduleGen.php

... and the list goes on. My framework is so very dynamic, that my "main.php" is untouched by me when I need a new site - I do a symbolic link to it in a new directory, modify my configSite.php file and off I go. Then in a /source/ directory under the <main for this site> I'll put symbolic links to the pages I need ... simplePage.php simpleInclude.php or create real new files for specific applications. This gives me the best of both worlds - utterly customizeable but making absolutely as much possible use of previous art as possible.

I'll bet you can see where I'm going for your CMS system, but ping back if my coffeed up self is making no sense Wink
« Last Edit: August 30, 2008, 11:19:31 AM by perkiset » Logged

It is now believed, that after having lived in one compound with 3 wives and never leaving the house for 5 years, Bin Laden called the U.S. Navy Seals himself.
nutballs
Administrator
Lifer
*****
Offline Offline

Posts: 5627


Back in my day we had 9 planets


View Profile
« Reply #14 on: August 30, 2008, 01:17:19 PM »

interesting. It's 'similar' to what I have been thinking for the cms, but a bit more complex than I was thinking. mostly because of your needs.

I am thinking 1 code handler page, basically index.php. Everything gets routed through that.

but you did just make me realize that I m overthinking it. LOL

every "post" would have a URL translated from the title was my assumption. And that still would be the case, but, I would allow an override, so... You could clear out the URL, and the page would be the default page. or have a checkbox that does it for you, to make it dumb-person-friendly.

So if a URL exists in the database, display the content in the associated template.
If a URL doesnt exist, do whatever is set for the 404. I assume I will need to add a special section for error pages and handling.
If a URL is blank, and there is a blank one in the database, bam! thats the index.

technically this will work down through subdirectories as well i think.
Logged

I could eat a bowl of Alphabet Soup and shit a better argument than that.
Pages: [1]
  Print  
 
Jump to:  

Perkiset's Place Home   Best of The Cache   phpMyIDE: MySQL Stored Procedures, Functions & Triggers
Politics @ Perkiset's   Pinkhat's Perspective   
cache
mart
coder
programmers
ajax
php
javascript
Powered by MySQL Powered by PHP Powered by SMF 1.1.2 | SMF © 2006-2007, Simple Machines LLC
Seo4Smf v0.2 © Webmaster's Talks


Valid XHTML 1.0! Valid CSS!