The Cache: Technology Expert's Forum
 
*
Welcome, Guest. Please login or register. September 23, 2019, 01:52:23 AM

Login with username, password and session length


Pages: [1]
  Print  
Author Topic: Why stored procedures?  (Read 3858 times)
dink
Expert
****
Offline Offline

Posts: 349


View Profile
« on: May 25, 2008, 06:37:23 PM »

I get to show off my newbness now.

I have a couple of ecommerce sites that make extensive use of MySql for data storage.  The sites use the php method(s) I learned a bunch of years ago.

I have a standard config file which gives instructions as to database name, host, user, and password.  The config file is included on each page that requires data from storage.  Various pages use standard select statements to make the data available then it is displayed for the user.  Pretty standard stuff.  From what I'm beginning to see, it's also a little old fashioned.

So, why should I use stored procedures?  What benefits are available?  Will it save time?  Bandwidth?  Speedier?

Logged

[quote Nutballs]
the universe has a giant fist, and its got enough whoop ass for everyone.
[/quote]
nutballs
Administrator
Lifer
*****
Offline Offline

Posts: 5627


Back in my day we had 9 planets


View Profile
« Reply #1 on: May 25, 2008, 06:51:44 PM »

there are a few reasons.

the first reason is mostly for reasons of separation of your code structures. Many believe that manipulation of data should be handled by the database itself, and as such, SPs are needed.

Another reason, related to the separation, is that it can make your application code, database agnostic. Although the SPs for msSQL and mySQL are different, the input and output from those can be exactly the same. therefore, you application could give a crap about which database is being used (for the most part).

SPs are faster than sending the query from PHP/ASP/etc. This is because the SP is compiled/cached, and so are the queries associated with it, or at least they are supposed to be.

SPs are more secure (in theory), since you pass in parameters and the parameter gets type checked. At least in msSQL, it escapes quotes, and wraps with quote automatically for you reducing a major vector for injection attacks.
Logged

I could eat a bowl of Alphabet Soup and shit a better argument than that.
dink
Expert
****
Offline Offline

Posts: 349


View Profile
« Reply #2 on: May 25, 2008, 09:57:54 PM »

there are a few reasons.

the first reason is mostly for reasons of separation of your code structures. Many believe that manipulation of data should be handled by the database itself, and as such, SPs are needed.

Are you saying that the select and results of same can be kept in the db?

Quote
Another reason, related to the separation, is that it can make your application code, database agnostic. Although the SPs for msSQL and mySQL are different, the input and output from those can be exactly the same. therefore, you application could give a crap about which database is being used (for the most part).

Interesting, but prolly doesn't apply to my case.

Quote
SPs are faster than sending the query from PHP/ASP/etc. This is because the SP is compiled/cached, and so are the queries associated with it, or at least they are supposed to be.

Now we're getting somewhere.  So, the php interpreter doesn't need to read and react each time the same query is served?  With 20k or so pages that becomes a factor to reckon with.

Quote
SPs are more secure (in theory), since you pass in parameters and the parameter gets type checked. At least in msSQL, it escapes quotes, and wraps with quote automatically for you reducing a major vector for injection attacks.

Security.  Aha.  That's good to know.

Looks like I have some studying to do.  Is my example in the first post a good candidate for a change to stored procedures?

What are some other areas where SP's shine?
Logged

[quote Nutballs]
the universe has a giant fist, and its got enough whoop ass for everyone.
[/quote]
thedarkness
Lifer
*****
Offline Offline

Posts: 585



View Profile
« Reply #3 on: May 26, 2008, 12:17:12 AM »

Quote
SPs are faster than sending the query from PHP/ASP/etc. This is because the SP is compiled/cached, and so are the queries associated with it, or at least they are supposed to be.

Now we're getting somewhere.  So, the php interpreter doesn't need to read and react each time the same query is served?  With 20k or so pages that becomes a factor to reckon with.

Nope, the query is cached, not the result set. If you want to cache results you'll need to do that yourself prolly using APC, and update them only when there is a write to the db.... but that's another story....

An example;

Let's say you have a new record to add to your database. let's also say that record has a vendor component, a category component, a relationship between the category and the vendor (1 to 1, belongs to), and the actual product info. Now when you insert this record you want to attend to the following steps;

1: Check if the vendor already exists, if they do do nothing otherwise add them
2: Check if the category already exists, if it does do nothing otherwise add it
3: Check that the vendor/category relationship is in place otherwise add it
4: Add the final product info. to the product table

Imagine if you will doing this in php (this is prolly the only way you can imagine it but we're about to change that). Imagine writing each query, interpreting the result, issuing the following query if required, etc.

Now here is a real life example of how it would be handled as an SP;

Code:
  # Do vendor
  SELECT COUNT(vendorID), vendorID INTO _vendor_exists, _vendor_id
    FROM vendor WHERE vendorName = i_vendorName GROUP BY vendorID LIMIT 1;
  IF _vendor_exists = 0 OR _vendor_exists IS NULL THEN
    INSERT INTO vendor (vendorName, vendorURL) VALUES (i_vendorName,i_vendorURL);
    SELECT last_insert_id() INTO _vendor_id; 
    INSERT INTO afNetwork_vendor_assign (afNetworkID, vendorID) VALUES (i_afNetID, _vendor_id);   
  END IF;

  # Do category
  SELECT COUNT(categoryID), categoryID into _category_exists, _category_id
    FROM category WHERE categoryName = i_categoryName GROUP BY categoryID LIMIT 1;
  IF _category_exists = 0 OR _category_exists IS NULL THEN
    INSERT INTO category (categoryName) VALUES (i_categoryName);
    SELECT last_insert_id() INTO _category_id;
    INSERT INTO vendor_category_assign (vendorID, categoryID) VALUES (_vendor_id, _category_id);
    SET _vca_done = 1;
  END IF;

  # Do vendor/category assign
  IF _vca_done = 0 THEN
    SELECT count(vendorID) into _vc_assign_exists
      FROM vendor_category_assign WHERE vendorID = _vendor_id AND categoryID = _category_id;
    IF _vc_assign_exists = 0 OR _vc_assign_exists IS NULL THEN
      INSERT INTO vendor_category_assign (vendorID, categoryID) VALUES (_vendor_id, _category_id);
    END IF;
  END IF;

I can't show you all the code I'm afraid but that will give you an idea. Now this is all handled as a single cached query and offers considerable speed enhancements on a big database (prolly a small one too).

So instead of making half a dozen queries from PHP you just issue a "CALL addproduct( 'vendcom', 'widgets', 'green', 'doohickey' );" and hand off the work to the DB. I have seen the benefits of this, I am a believer, but like so much in IT the problem has to be the right fit for the solution so YMMV.

Presumably you are using persistent connections so this will not reduce the number of connections to the DB but it will cut down on data being transferred back and forth between your app and the DB so should increase speed that way as well. Now if you want raw speed look at compiled procedures or UDFs written in C, etc. but that again is another story......

Cheers,
td

[edit] More info? Start here: http://dev.mysql.com/tech-resources/articles/mysql-storedprocedures.html [/edit]
Logged

"I want to be the guy my dog thinks I am."
 - Unknown
nutballs
Administrator
Lifer
*****
Offline Offline

Posts: 5627


Back in my day we had 9 planets


View Profile
« Reply #4 on: May 26, 2008, 08:10:01 AM »

Quote
Quote
there are a few reasons.

the first reason is mostly for reasons of separation of your code structures. Many believe that manipulation of data should be handled by the database itself, and as such, SPs are needed.

Are you saying that the select and results of same can be kept in the db?
nope. All i meant is that the actual manipulation of the data, such as looping through records, inserts, updates, edits, should be handled by the database technology itself. Part of the reason is so that you know that all your database tinkering is happening with code that is native to the database. Otherwise, if PHP for example, was to change the way the mysql_query works, you might get funky result, or even fatal, if you are not keeping on top of functionality changes in the PHP processor. Now that is unlikely to happen, but it has with probably every language out there. This is the reason why you actually do care about your next quote.

Quote
Quote
Another reason, related to the separation, is that it can make your application code, database agnostic. Although the SPs for msSQL and mySQL are different, the input and output from those can be exactly the same. therefore, you application could give a crap about which database is being used (for the most part).

Interesting, but prolly doesn't apply to my case.
Good practice though. Your projects today may just be practice for your projects tomorrow. realize too that it also makes it code agnostic. since the same db could be talked to by any application that knows the SPs.

Quote
Quote
SPs are faster than sending the query from PHP/ASP/etc. This is because the SP is compiled/cached, and so are the queries associated with it, or at least they are supposed to be.

Now we're getting somewhere.  So, the php interpreter doesn't need to read and react each time the same query is served?  With 20k or so pages that becomes a factor to reckon with.
what TD said, but I think you understood what I meant. The results are not cached without some kind of caching engine, which i think exists, or you can create. but the actual procedure, and the execution path are cached/compiled. So mySQL doesnt need to parse the query each time it runs. When you do it from PHP, mysql has to "decode" what crazy shit you wrote, EVERY TIME (there is a caveat here, but not my realm). The results of the SP are still retrieved realtime, or at least the same exact way that it would be from PHP.

Quote
Quote
SPs are more secure (in theory), since you pass in parameters and the parameter gets type checked. At least in msSQL, it escapes quotes, and wraps with quote automatically for you reducing a major vector for injection attacks.

Security.  Aha.  That's good to know.

Looks like I have some studying to do.  Is my example in the first post a good candidate for a change to stored procedures?

What are some other areas where SP's shine?

From a security point, SPs shelter the application from knowing anything about the internals of the database. like an API for a webservice in a lot of respects. You have no idea whats actually going on behind that API, but you dont care because you get what you need. Same with SPs, which also means, you can allow other applications to access your database via the SPs, and they dont need a username, or password, or anything. just the parameters of the SPs.

basically, an SP is probably always good to use. however, if you dont need to squeeze cycles, and you dont care about being Database agnostic, then probably not worth it to convert something. However, if you are currently building it, why not do it in SPs.

One more advantage is in your PHP. most SP calls are going to be much simpler to follow than the jumble of shit you most likely have to build crazy SQL strings. The call is like a magic box, you send in some parameters, and you get back your result. Just like how you currently do it, but in less PHP lines.

The downside to SPs is that your code is now fragmented. You have code to manage in 2 places. This is part of the reason I have not used SPs in alot of my client work. They take it over, and they only know ASP, or SQL scares them, or or or or.
Logged

I could eat a bowl of Alphabet Soup and shit a better argument than that.
dink
Expert
****
Offline Offline

Posts: 349


View Profile
« Reply #5 on: May 26, 2008, 10:12:25 AM »

Wow.!

@td.  The example fits my situation perfectly.  Thanks for that.  Much easier to understand when I can relate.

Thanks for the linky too.  I'll go read there as well.

@nb. 
Quote
...in your PHP. most SP calls are going to be much simpler to follow than the jumble of shit you most likely have to build crazy SQL strings....

I want to know how in the hell you can get your monitor to look thru my monitor and see my php code.   Don't make me...

The code fragmentation part doesn't worry me, at least not now.  I already have code fragmentation to some degree. 

The basic concept is beginning to seep in.  I think.

Doing a few searches on my (least) fav engine brings up a lot of results mentioning SP in conjunction with mySql 5.  Lots of praise for the inclusion, yada yada, etc.  Can mySql 4 be hacked into working with stored procedures?  Conversly, can a set of SP's be created to interface with mySql 4?  Or, are SP's agnostic?

My current setup has a modest db of about 275k rows.  It includes all of the  items you would expect in a retailish database.  The feed for this db is from several different entities.  Each week I download a feed from suppliers x, y, and z.  I massage the feeds from y and z so that they conform to the schema set by the original supplier (x).

After the normalization of the feeds, I run a script against the csv to get the data into the db.  The script needs to run n times to grab all the data and do the required insert, update, etc, stuff.

Sounds like SP's might be a good fit to use for the datafeed import script.  What about that?

**side thought.  Stored procedures sound sorta like OOP for sql.
Logged

[quote Nutballs]
the universe has a giant fist, and its got enough whoop ass for everyone.
[/quote]
perkiset
Olde World Hacker
Administrator
Lifer
*****
Offline Offline

Posts: 10096



View Profile
« Reply #6 on: May 26, 2008, 10:24:46 AM »

Since the mechanics of SPs is already under discussion, I'd like to tackle three completely different avenues.

First, Triggers: A trigger is an SP that runs based on an event in the database. I can say, essentially, "run this code when a row in the transactions table is updated" - which can be very handy, because let's say I like to keep a tally of the amount that a particular person has spent with me - well then by simply adding or subtracting a transaction that value is automatically updated. Not only is this faster than PHP, but the outside-the-db-logic (ie., PHP, PERL or what have you) cannot avoid this code - the code MUST and WILL run.

Which leads to the second part of the story, referential integrity. MySQL does not have "declared" referential integrity, because I cannot explicitly tie this table to another one with a foreign key. In other words, if table A is dependent on a foreign key in table B, then the row containing that key in table B must never be deleted if something is "tied to it" in table A - do you follow? because if the row in table B is deleted, table A now has orphans and the database is corrupt.

So referential integrity is a method of using triggers to ensure that your database stays the way you expect it. So for example, you would now place a Before Delete trigger on table B - and in that trigger, look to see if there are any records in table A associated with it, and if so, halt and fail the delete.

Lastly, let's look at "code location appropriateness." Things have come a long way from applications running in a terminal from the mainframe... we went to apps running completely locally to the newest distributed model, Ajax - where the appropriate code is running in the appropriate location ie., server connection stuff is, cleverly, at the server, and (if you're doing things right) client-specific stuff is running at the client. The whole mess contributes to our notion of an application, even though it is split into two pieces.

Well SPs, Funcs and Triggers can further the appropriate fragmentation of an application. Much like it makes no sense to have a server checking to see if the "shape" of a telephone number in a web form is right, there's a strong argument that the web server application having to perform all of the logic to make sure that the DB is straightened out is equally wrong. Old time DBers (like the DBDog, whom I am trying to get back here) see the world from the DB code all locked in and protected out - rather than logic pointing in. He sees the world from a declared and programmatic referential integrity perspective on his databases, and critical logic that builds database answers is part of the database. Web programmers simply ask him (the DB) for answers, they don't push a lot of logic against the database.

The downside to this is that if you're a one man show like me, you've now got code in 3 locations for a single app - and if you're not a Type A Loves To Code And Learn kind of guy, this can be problematic. On the other hand, if the appropriate logic is moved to the appropriate places, (client, server, database) then it's easy to understand because you'd "never place THAT kind of code anywhere except the database" and such.

I think this is going to be a great extension to the Cache discussions. Thanks TD and NB for jumping in so quick and hard - and Dinker, keep 'em coming!

/p
Logged

It is now believed, that after having lived in one compound with 3 wives and never leaving the house for 5 years, Bin Laden called the U.S. Navy Seals himself.
Pages: [1]
  Print  
 
Jump to:  

Perkiset's Place Home   Best of The Cache   phpMyIDE: MySQL Stored Procedures, Functions & Triggers
Politics @ Perkiset's   Pinkhat's Perspective   
cache
mart
coder
programmers
ajax
php
javascript
Powered by MySQL Powered by PHP Powered by SMF 1.1.2 | SMF © 2006-2007, Simple Machines LLC
Seo4Smf v0.2 © Webmaster's Talks


Valid XHTML 1.0! Valid CSS!