
searchForPattern' (and therefore searchForPattern) will read the entire file, indeed. To check only the first 'n' chunks in a file you'd have to 1) receive a lazy bytestring, check first n*chuckSize bytes. 2) receive the handle, loop over n chunks. El jue, 14-04-2011 a las 14:50 +0200, Daniel Fischer escribió:
On Thursday 14 April 2011 14:30:59, Felipe Almeida Lessa wrote:
On Thu, Apr 14, 2011 at 8:50 AM, Daniel Fischer
wrote: $ cabal install stringsearch -- faster searches for a pattern than provided by bytestring
import qualified Data.ByteString.Char8 as BC import qualified Data.ByteString.Lazy as L import qualified Data.ByteString.Lazy.Search as L
blockSize = 512
main = do stuff <- L.readFile "blocks" case L.indices (BC.pack "PART") stuff of [] -> putStrLn "Not Found." (i:_) -> putStrLn $ "Found at " ++ show (i `quot` blockSize) ++ "." putStrLn "Done."
This solves a different problem. The original looked for PART only on the beginning of each block,
Well, I would have expected that, but
searchForPattern' index handle pat = do
eof <- hIsEOF handle if eof then return Nothing
else do
bytes <- B.hGet handle chunkSize case BC.breakSubstring pat bytes of
(x, y) | BC.null y -> searchForPattern' (index + 1) handle pat
| otherwise -> return (Just index)
searches the entire block and checks whether the second component is empty to see whether there's any match at all. So I thought that'd be the desired behaviour.
If one wants to only look at the start of the blocks, isPrefixOf would be a much more efficient way than breakSubstring. It would probably be still more efficient to read larger chunks than 512 bytes and step through them with 512-byte steps to check for a "PART" there (unless the "PART" appears in the first few blocks).
while your solution will find everything that has PART in it. Because this is I/O bound, probably the real time taken won't be very different, but the program will surely use more CPU time and report more false positives.
Cheers,
_______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners