The Cache: Technology Expert's Forum
 
Author Topic: How to break a ton of sentences into one single line with line feed.  (Read 7219 times)
tommytx
Expert
****
Offline Offline

Posts: 123


View Profile WWW
« Reply #15 on: April 10, 2009, 08:35:58 AM »

Thanks for all the help, everyone. And yes, Vsloathe, the last one did retain the punctuation mark.
That makes it even better.
Wonder if the more complex formula will cause a noticeable slowdown when using large files..
I may try it on a large text file and measure the processing time. If there is a noticeable difference, I'll stick with adding a period after each sentence, since 99% of them will end in a period anyway. So what's a few missing question marks among us scrapers? He he..
Thanks again for all the help.
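
A rough way to run that measurement is sketched below. The lookbehind pattern, the synthetic input, and the helper name time_runs are all assumptions made for illustration; the regex actually posted earlier in the thread may differ, so swap in the real pattern and a real file before drawing conclusions.

```php
<?php
// Hypothetical helper: run $fn $runs times and return the elapsed seconds.
function time_runs(callable $fn, int $runs = 5): float
{
    $start = microtime(true);
    for ($i = 0; $i < $runs; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

// Synthetic input standing in for a large text file (assumption).
$inputBuff = str_repeat('First sentence. Is this the second? Yes it is! ', 20000);

// Variant A: regex split that keeps the punctuation mark (assumed pattern).
$regexTime = time_runs(function () use ($inputBuff) {
    preg_split('/(?<=[.?!])\s+/', $inputBuff, -1, PREG_SPLIT_NO_EMPTY);
});

// Variant B: plain split on ". " only, which drops the punctuation.
$plainTime = time_runs(function () use ($inputBuff) {
    explode('. ', $inputBuff);
});

printf("regex split: %.4f s\nplain split: %.4f s\n", $regexTime, $plainTime);
```

The absolute numbers depend on the machine and the PHP build, so only the ratio between the two variants on your own data is worth reading.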
Logged
nutballs
Administrator
Lifer
*****
Offline Offline

Posts: 5627


Back in my day we had 9 planets


View Profile
« Reply #16 on: April 10, 2009, 08:14:13 PM »

Quote
Wonder if the more complex formula will cause a noticeable slowdown when using large files..

Yes, it will.

Speed is a function of file size and regex complexity.
Realize, though, that regex complexity might not always be what you think.

But in the same vein, it's also probably the most efficient choice you usually have.
Logged

I could eat a bowl of Alphabet Soup and shit a better argument than that.
vsloathe
vim ftw!
Global Moderator
Lifer
*****
Offline Offline

Posts: 1669



View Profile
« Reply #17 on: April 11, 2009, 07:54:11 AM »

Any zero-width assertions in regex are pretty hefty. Negative and positive lookaheads and lookbehinds in particular make it CRAWL if you're doing enough of them.
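
For reference, here is a minimal sketch of a lookbehind split next to a pattern that uses no lookarounds at all. Both patterns and the sample text are assumptions for illustration, not the regex posted earlier in the thread, and neither copes with abbreviations such as "U.S.".

```php
<?php
// Sample text (assumption).
$text = 'It works. Does it? Yes! Done.';

// With a lookbehind: split on whitespace that follows ., ? or !
$withLookbehind = preg_split('/(?<=[.?!])\s+/', $text, -1, PREG_SPLIT_NO_EMPTY);

// Without lookarounds: match each run of non-terminators plus its terminator(s).
preg_match_all('/[^.?!]+[.?!]+/', $text, $matches);
$withoutLookaround = array_map('trim', $matches[0]);

print_r($withLookbehind);
print_r($withoutLookaround);
```

On this input both approaches produce the same four sentences with their punctuation intact, so the two can be timed against each other on a real file.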
Logged

hai
perkiset
Olde World Hacker
Administrator
Lifer
*****
Offline Offline

Posts: 10096



View Profile
« Reply #18 on: April 12, 2009, 10:12:33 AM »

At that point, it would be faster to convert all sentence-breaking instances (". ", "? ", "! ") into a single special delimiter and then explode on it, thus:

$newBuff = str_replace(array('. ', '? ', '! '), '###', $inputBuff);
$array = explode('###', $newBuff);
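
The two lines above drop the punctuation mark along with the space. A possible variation that keeps it, sketched under the assumption that '###' never occurs in the input, is to pass str_replace a matching array of replacements and then join the pieces with a line feed:

```php
<?php
// Sample input (assumption); in practice this would be the text being processed.
$inputBuff = 'It works. Does it? Yes! Done.';

// Replace each "mark + space" with "mark + delimiter" so the mark survives.
$newBuff = str_replace(
    array('. ', '? ', '! '),
    array('.###', '?###', '!###'),
    $inputBuff
);

// Split on the delimiter, then emit one sentence per line.
$array = explode('###', $newBuff);
$lines = implode("\n", $array);

echo $lines, "\n";
```

No regex is involved, so there is no pattern complexity to worry about; the cost is that the delimiter has to be something the text can never contain.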

Logged

It is now believed, that after having lived in one compound with 3 wives and never leaving the house for 5 years, Bin Laden called the U.S. Navy Seals himself.
vsloathe
vim ftw!
Global Moderator
Lifer
*****
Offline Offline

Posts: 1669



View Profile
« Reply #19 on: April 12, 2009, 10:52:08 AM »

But the regex I gave him should be fast enough. It doesn't use any advanced regex syntax at all; it's extremely basic.
Logged

hai
Bompa
Administrator
Lifer
*****
Offline Offline

Posts: 564


Where does this show?


View Profile
« Reply #20 on: April 12, 2009, 05:27:14 PM »

You geeks love to discuss processing speed. (rolls eyes)

I am pretty sure he was asking only about the additional code to capture the punctuation.

Quote
Wonder if the more complex formula will cause a noticeable slowdown when using large files

It's not going to be noticeable unless the file is FRIGGIN HUGE.

imo
Logged

"The most beautiful and profound emotion we can experience is the sensation of the mystical..." - Albert Einstein