Takusen sqlite3 insert is very slow

Hi Cafe, I have another problem, please look at code: storeInDb = withSession (connect "test.db") (do execDDL (sql "create table x (y int)") forM_ ([1..10000] :: [Int]) (\x -> do execDML (cmdbind ("insert into x (y) values (?);") [bindP x]) return ())) This code runs 16 seconds which is very slow for this simple task. RTS output is below. After profiling this program I found that 85% of its time it spends in 'Database.Sqlite.SqliteFunctions.stmtFetch'. Currently I don't know how to make it faster, maybe anyone had this problem later? HDBC inserts very fast, so this is not sqlite error. RTS output: 156,195,264 bytes allocated in the heap 591,280 bytes copied during GC 3,952 bytes maximum residency (1 sample(s)) 10,968 bytes maximum slop 1 MB total memory in use (0 MB lost due to fragmentation) Generation 0: 295 collections, 0 parallel, 0.00s, 0.01s elapsed Generation 1: 1 collections, 0 parallel, 0.00s, 0.02s elapsed INIT time 0.00s ( 0.03s elapsed) MUT time 0.87s ( 16.09s elapsed) GC time 0.00s ( 0.02s elapsed) RP time 0.00s ( 0.00s elapsed) PROF time 0.00s ( 0.00s elapsed) EXIT time 0.00s ( 0.00s elapsed) Total time 0.88s ( 16.14s elapsed) %GC time 0.5% (0.1% elapsed) Alloc rate 179,111,917 bytes per MUT second Productivity 99.5% of total user, 5.4% of total elapsed Regards, Vasyl

On Sat, Mar 20, 2010 at 3:32 AM, Vasyl Pasternak
Hi Cafe,
I have another problem, please look at code:
storeInDb = withSession (connect "test.db") (do execDDL (sql "create table x (y int)") forM_ ([1..10000] :: [Int]) (\x -> do execDML (cmdbind ("insert into x (y) values (?);") [bindP x]) return ()))
This code runs 16 seconds which is very slow for this simple task. RTS output is below. After profiling this program I found that 85% of its time it spends in 'Database.Sqlite.SqliteFunctions.stmtFetch'. Currently I don't know how to make it faster, maybe anyone had this problem later?
HDBC inserts very fast, so this is not sqlite error.
Can you show the HDBC version? Maybe they make different assumptions about transactions or fetching the number of affected rows? If I'm reading the source of takusen correctly it's using a different transaction for each insert and stmtFetch is getting called to return the number of rows inserted. Which should be 1 every time and for your purposes, ignorable. You should be able to change to execDDL, but I seriously doubt that will have any impact on performance. It looks like the only difference between execDDL and execDML is that execDDL has ">> return ()" at the end of it. You might try running your inserts inside withTransaction. The default behavior of sqlite is to use a separate transaction for each statement. Perhaps this is adding overhead that shows up during stmtFetch. How long does your HDBC version take? Is it a factor of 10? Factor of 2? Jason

Hi Jason,
Your recommendations worked for me. When I enclosed updating into
single transaction, the code executed in less than 0.5 seconds, which
is as fast as HDBC version. I didn't go deeper, hoping, that
everything will be OK from now.
Thank you,
Vasyl
2010/3/20 Jason Dagit
On Sat, Mar 20, 2010 at 3:32 AM, Vasyl Pasternak
wrote: Hi Cafe,
I have another problem, please look at code:
storeInDb = withSession (connect "test.db") (do execDDL (sql "create table x (y int)") forM_ ([1..10000] :: [Int]) (\x -> do execDML (cmdbind ("insert into x (y) values (?);") [bindP x]) return ()))
This code runs 16 seconds which is very slow for this simple task. RTS output is below. After profiling this program I found that 85% of its time it spends in 'Database.Sqlite.SqliteFunctions.stmtFetch'. Currently I don't know how to make it faster, maybe anyone had this problem later?
HDBC inserts very fast, so this is not sqlite error.
Can you show the HDBC version? Maybe they make different assumptions about transactions or fetching the number of affected rows? If I'm reading the source of takusen correctly it's using a different transaction for each insert and stmtFetch is getting called to return the number of rows inserted. Which should be 1 every time and for your purposes, ignorable. You should be able to change to execDDL, but I seriously doubt that will have any impact on performance. It looks like the only difference between execDDL and execDML is that execDDL has ">> return ()" at the end of it. You might try running your inserts inside withTransaction. The default behavior of sqlite is to use a separate transaction for each statement. Perhaps this is adding overhead that shows up during stmtFetch. How long does your HDBC version take? Is it a factor of 10? Factor of 2? Jason

This FAQ on the sqlite website seems relevant:
http://www.sqlite.org/faq.html#q19
David
On Wed, Mar 24, 2010 at 5:34 PM, Vasyl Pasternak
Hi Jason,
Your recommendations worked for me. When I enclosed updating into single transaction, the code executed in less than 0.5 seconds, which is as fast as HDBC version. I didn't go deeper, hoping, that everything will be OK from now.
Thank you, Vasyl
2010/3/20 Jason Dagit
: On Sat, Mar 20, 2010 at 3:32 AM, Vasyl Pasternak <
wrote:
Hi Cafe,
I have another problem, please look at code:
storeInDb = withSession (connect "test.db") (do execDDL (sql "create table x (y int)") forM_ ([1..10000] :: [Int]) (\x -> do execDML (cmdbind ("insert into x (y) values (?);") [bindP x]) return ()))
This code runs 16 seconds which is very slow for this simple task. RTS output is below. After profiling this program I found that 85% of its time it spends in 'Database.Sqlite.SqliteFunctions.stmtFetch'. Currently I don't know how to make it faster, maybe anyone had this problem later?
HDBC inserts very fast, so this is not sqlite error.
Can you show the HDBC version? Maybe they make different assumptions about transactions or fetching the number of affected rows? If I'm reading the source of takusen correctly it's using a different transaction for each insert and stmtFetch is getting called to return the number of rows inserted. Which should be 1 every time and for your purposes, ignorable. You should be able to change to execDDL, but I seriously doubt that will have any impact on performance. It looks like
vasyl.pasternak@gmail.com> the
only difference between execDDL and execDML is that execDDL has ">> return ()" at the end of it. You might try running your inserts inside withTransaction. The default behavior of sqlite is to use a separate transaction for each statement. Perhaps this is adding overhead that shows up during stmtFetch. How long does your HDBC version take? Is it a factor of 10? Factor of 2? Jason
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
-- David Sankel Sankel Software www.sankelsoftware.com

Using PostgreSQL on my computer, your code executes in 3.5 seconds with GHCi, while compiled it executes in 16.2 seconds! Clearly something is wrong, although I don't yet know enough about Takusen enough to be able to say what. I tried hoisting the preparation of the statement out of the loop, however, Takusen throws an error after the first insertion. However, the insertion was apparently successful, as a single row shows up in the table afterwards. Here's sample code and a sample shell session: Best, Leon
import Database.PostgreSQL.Enumerator import Database.Enumerator import Control.Monad(forM_)
main = do withSession (connect [ CAhost "localhost" , CAport "5432" , CAuser "takusen" , CApassword "takusen" , CAdbname "test" ]) $ do execDDL (sql "create table x (y text)") let query = sql "insert into x (y) values (?)" stmt = prepareQuery "testinsert" query [bindType (0::Int)] forM_ ([1..10000] :: [Int]) $ \x -> do withPreparedStatement stmt $ \pstmt -> do withBoundStatement pstmt [bindP x] $ \bstmt -> do execDML bstmt return ()
$ psql test psql (8.4.2) Type "help" for help. test=# select * from x; ERROR: relation "x" does not exist LINE 1: select * from x; ^ test=# \q $ ghci Tak.lhs GHCi, version 6.12.1: http://www.haskell.org/ghc/ :? for help Loading package ghc-prim ... linking ... done. Loading package integer-gmp ... linking ... done. Loading package base ... linking ... done. Loading package ffi-1.0 ... linking ... done. [1 of 1] Compiling Main ( Tak.lhs, interpreted ) Ok, modules loaded: Main. (0.09 secs, 40984568 bytes) *Main> main Loading package mtl-1.1.0.2 ... linking ... done. Loading package old-locale-1.0.0.2 ... linking ... done. Loading package old-time-1.0.0.3 ... linking ... done. Loading package time-1.1.4 ... linking ... done. Loading package Takusen-0.8.5 ... linking ... done. *** Exception: DBError ("","") 1 "" *Main> :q Leaving GHCi. $ psql test psql (8.4.2) Type "help" for help. test=# select * from x; y --- 1 (1 row)
participants (4)
-
David Sankel
-
Jason Dagit
-
Leon Smith
-
Vasyl Pasternak