|
nutballs
|
 |
« on: January 20, 2009, 02:07:00 PM » |
|
can anyone think of an efficient way to combine two tables, keeping only the unique values?
for example, I have 2 tables which each have a sentence column. 1 sentence per row. this table, let call it tableA, might have duplicates within itself. tableB however, has unique sentences, which only occur once in the table. I want to add the sentences from tableA to tableB, which do not already exist in tableB.
right now I am doing it in a loop. first select tableA.sentence from tableB. if not exist, then insert tableA.sentence into tableB
this is slow, and stupid. I know there is a better way, but for some reason, this has just become completely cloudy to me and I am not seeing it.
sentence column is TEXT btw, because I need to to text queries on it.
|
|
|
|
|
Logged
|
I could eat a bowl of Alphabet Soup and shit a better argument than that.
|
|
|
|
perkiset
|
 |
« Reply #1 on: January 20, 2009, 05:07:17 PM » |
|
Make the sentence a primary key, then just insert away and let it deny the insertion of a dupe.
|
|
|
|
|
Logged
|
It is now believed, that after having lived in one compound with 3 wives and never leaving the house for 5 years, Bin Laden called the U.S. Navy Seals himself.
|
|
|
|
vsloathe
|
 |
« Reply #2 on: January 20, 2009, 07:04:05 PM » |
|
select into join something or other
but Perk's right, that's way simpler.
|
|
|
|
|
Logged
|
hai
|
|
|
|
nutballs
|
 |
« Reply #3 on: January 20, 2009, 09:09:07 PM » |
|
but the sentences are FULLTEXT. sorry, i meant fulltext, not text before. though I guess I could store the sentence twice. once in a unique key column and once in a fulltext column.
|
|
|
|
|
Logged
|
I could eat a bowl of Alphabet Soup and shit a better argument than that.
|
|
|
|
perkiset
|
 |
« Reply #4 on: January 20, 2009, 09:32:45 PM » |
|
FullText indicies work on regular text fields as well NBs. I'm gonna guess that you've got a TEXT field with a FULLTEXT index ... are your sentences under 255 chars? If so, you've got it with a simple primary key - that won't mess with your fulltext index. If it's bigger than that (need a regular TEXT field) and you can't primary key it then perhaps you write a little stored procedure that checks the existence of <exactly> that "new" sentence, and if it doesn't exist then inserts it. Something as simple as: PROCEDURE test(newText TEXT) BEGIN
declare theCount integer;
select count('x') into theCount from myTable where textfield = newText; if theCount = 0 then insert into myTable(textfield) values(newText); end if; END
... then you could do the same - just insert away and not worry.
|
|
|
|
|
Logged
|
It is now believed, that after having lived in one compound with 3 wives and never leaving the house for 5 years, Bin Laden called the U.S. Navy Seals himself.
|
|
|
|
nutballs
|
 |
« Reply #5 on: January 20, 2009, 09:57:47 PM » |
|
yea I am just gonna force 255 max length, so I can do PK unique in a varchar.
i am already doing what you just suggested, though from teh PHP side.
|
|
|
|
|
Logged
|
I could eat a bowl of Alphabet Soup and shit a better argument than that.
|
|
|
|
perkiset
|
 |
« Reply #6 on: January 20, 2009, 10:03:58 PM » |
|
If speed is an issue, a stored proc will be way faster... but if it's not, then it's way less complicated to do it your way.
For example, you could make a string like this: call insertSentence('My dog has fleas'); call insertSentence('My other dog has none.'); call insertSentence('My dog has fleas'); call insertSentence('And that is all I have to say about that.');
... then use the multi-request function from the mysqli functions to do it as a single call to MySQL and it would just *rock*.
But since you're going to PK it, then you're about as quick as you need to be.
|
|
|
|
« Last Edit: January 20, 2009, 10:07:23 PM by perkiset »
|
Logged
|
It is now believed, that after having lived in one compound with 3 wives and never leaving the house for 5 years, Bin Laden called the U.S. Navy Seals himself.
|
|
|
|
nutballs
|
 |
« Reply #7 on: January 21, 2009, 08:42:33 AM » |
|
i was being truly retarded this time...
$sql = "insert IGNORE into sentences (sentence) select SentenceRaw from sentencesraw order by SentenceRawID LIMIT 500;"; mysql_query($sql,$db); $sql = "delete from sentencesraw order by SentenceRawID LIMIT 500;"; mysql_query($sql,$db);
PK unique on sentences.sentence which is now varchar255 and fulltext.
duh...
thanks for the help guys. much faster now.
|
|
|
|
|
Logged
|
I could eat a bowl of Alphabet Soup and shit a better argument than that.
|
|
|
|