Project postmortem II /Haskell vs. Erlang/

Simon,

Please see this post for an extended reply:
http://wagerlabs.com/articles/2006/01/01/haskell-vs-erlang-reloaded

Thanks, Joel

On Dec 29, 2005, at 8:22 AM, Simon Peyton-Jones wrote:
| Using Haskell for this networking app forced me to focus on all the
| issues _but_ the business logic. Type constraints, binary IO and
| serialization, minimizing memory use and fighting laziness, timers,
| tweaking concurrency and setting up message channels, you name it.
That's a disappointing result. Mostly I think Haskell lets you precisely focus on the logic of your program, because lots else is taken care of behind the scenes. You found precisely the reverse.
It'd be interesting to understand which of these issues are
- language issues
- library issues
- compiler/run-time issues
My (ill-informed) hypothesis is that better libraries would have solved much of your problems. A good example is a fast, generic serialisation library.
If you felt able (sometime) to distil your experience under headings like the above, only more concretely and precisely, I think it might help to motivate Haskellers to start solving them.

On Sun, Jan 01, 2006 at 11:12:31PM +0000, Joel Reymont wrote:
Simon,
Please see this post for an extended reply:
http://wagerlabs.com/articles/2006/01/01/haskell-vs-erlang-reloaded
Looking at this code, I wonder if there are better ways to express what you really want using static typing. To wit, with records, you give an example

    data Pot = Pot
        { pProfit :: !Word64,
          pAmounts :: ![Word64] -- Word16/
        } deriving (Show, Typeable)

    mkPot :: Pot
    mkPot = Pot { pProfit = 333, pAmounts = [] }

and complain about "having to explain to the customer how xyFoo is really different from zFoo when they really mean the same thing". I wonder: if they really are the same thing, is there a way to get the data types to faithfully reflect that?

Can you post a few more snippets of your data structures?

Peace,
Dylan

Dylan Thurston
http://wagerlabs.com/articles/2006/01/01/haskell-vs-erlang-reloaded
| Compare Erlang
|
| -record(pot, {
|     profit = 0,
|     amounts = []
| }).
[...] complain about "having to explain to the customer how xyFoo is really different from zFoo when they really mean the same thing".
Isn't the obvious solution to declare a class here? I.e.

    class HasProfits h where
        profits :: h -> Word64

    data Pot = Pot { pProfits :: !Word64, pAmounts :: ![Word64] }

    instance HasProfits Pot where
        profits = pProfits

And since you like to count LOC, why not use a more compact representation?

    mkPot = Pot 333 []

If it resides close to the data definition, it's easy to keep the two in sync.

-k
--
If I haven't seen further, it is by standing in the footprints of giants
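The class suggestion can be sketched as a complete module. This is only an illustration of the idea, not code from the thread: the `Table` record and the `totalProfits` helper are hypothetical, added to show one accessor name serving two different record types.

```haskell
import Data.Word (Word64)

-- One accessor name shared by every record that carries a profit.
class HasProfits h where
    profits :: h -> Word64

data Pot = Pot
    { pProfits :: !Word64
    , pAmounts :: ![Word64]
    } deriving (Show, Eq)

-- Hypothetical second record, not from the thread.
data Table = Table
    { tProfits :: !Word64
    , tSeats   :: !Int
    } deriving (Show, Eq)

instance HasProfits Pot where
    profits = pProfits

instance HasProfits Table where
    profits = tProfits

mkPot :: Pot
mkPot = Pot { pProfits = 333, pAmounts = [] }

-- Works for a list of any single record type that has an instance.
totalProfits :: HasProfits h => [h] -> Word64
totalProfits = sum . map profits
```

Loaded in GHCi, `profits` can be applied to a `Pot` or a `Table` alike. The prefixed field names are still needed in the record declarations, and each instance still has to be written by hand.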

Yes, that _is_ obvious but then puts the burden on the programmer to define the getters. It also misses the setters issue entirely.

Each field can definitely be made into a class and records can be composed dynamically, HList-style. I think HList is _the_ facility for doing this.

How do you create a "record" that has a field dictionary for updates and retrievals and stores the order of the fields for serialization? I think HList does provide for the order of records since it keeps them in a list of sorts.

Can serialization of these HList-style records be efficient in this case? Would updating fields be efficient?

I tried composing my records using HList and that made GHC run out of memory and bomb out. Simon Peyton-Jones fixed the problem promptly but since I was not familiar with the profiler at the time I never got to measuring the pickling efficiency. I figured if it's that tough on the compiler then maybe I should pick an easier path.

Thanks, Joel

On Jan 4, 2006, at 8:41 AM, Ketil Malde wrote:
Isn't the obvious solution to declare a class here? I.e.
class HasProfits h where profits :: h -> Word64
data Pot = Pot { pProfits :: !Word64, pAmounts :: ![Word64] }
instance HasProfits Pot where profits = pProfits
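For comparison, here is a minimal hand-written sketch of what getters, setters and a fixed field order look like without HList. It is not HList and not Joel's code: the `HasProfit` and `Pickle` classes and the list-of-`Word64` wire format are assumptions made up for illustration. It shows exactly the per-record burden discussed above.

```haskell
import Data.Word (Word64)

data Pot = Pot
    { pProfit  :: !Word64
    , pAmounts :: ![Word64]
    } deriving (Show, Eq)

-- Getter and setter bundled in one class, written once per record.
class HasProfit r where
    getProfit :: r -> Word64
    setProfit :: Word64 -> r -> r

instance HasProfit Pot where
    getProfit = pProfit
    setProfit x pot = pot { pProfit = x }

-- Hypothetical wire format: profit, number of amounts, then the amounts.
-- The field order lives in the instance, not in the record declaration.
class Pickle r where
    pickle   :: r -> [Word64]
    unpickle :: [Word64] -> Maybe r

instance Pickle Pot where
    pickle pot = pProfit pot : fromIntegral (length amts) : amts
      where amts = pAmounts pot
    -- Accept the input only if the declared count matches what follows.
    unpickle (p : n : rest)
        | fromIntegral (length rest) == n = Just (Pot p rest)
    unpickle _ = Nothing
```

Every new record needs its own instances, and nothing checks that `pickle` and `unpickle` agree on the field order, which is the gap a generic record facility would have to close.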

Sure. Type classes, as Ketil Malde has suggested.

On Jan 4, 2006, at 2:09 AM, Dylan Thurston wrote:
Looking at this code, I wonder if there are better ways to express what you really want using static typing. To wit, with records, you give an example

    data Pot = Pot
        { pProfit :: !Word64,
          pAmounts :: ![Word64] -- Word16/
        } deriving (Show, Typeable)

Hello Joel,

use the enclosed unstuff.hs:

    ghc -O2 --make unstuff.hs -o unstuff -lz
    ./unstuff trace.dat +RTS -s -A10m

analysis:

1) are you really not using the -O2 switch? :)
2) your original program spends 2/3 of its time in GC, so i used -A10m to reduce GC times
3) i also placed lock around `unstuff` call to decrease GC times
4) a small delay between starting threads allows each thread to start more smoothly

of course, try these changes in your real networking code

--
Best regards, Bulat mailto:bulatz@HotPOP.com
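Points 3 and 4 can be sketched roughly as follows. The enclosed unstuff.hs is not reproduced in this archive, so this is an assumption about the shape of the change rather than Bulat's code: `unstuff` here is a hypothetical stand-in that just sums its input, and `runWorkers` is an invented name.

```haskell
import Control.Concurrent (forkIO, threadDelay)
import Control.Concurrent.MVar (newEmptyMVar, newMVar, putMVar, takeMVar, withMVar)
import Control.Monad (forM)

-- Hypothetical stand-in for the real unpickling function.
unstuff :: [Int] -> Int
unstuff = sum

-- Start one thread per input, pausing delayMicros between starts (point 4),
-- and let only one thread at a time run `unstuff` (point 3).
runWorkers :: Int -> [[Int]] -> IO [Int]
runWorkers delayMicros inputs = do
    lock <- newMVar ()
    dones <- forM inputs $ \input -> do
        done <- newEmptyMVar
        _ <- forkIO $ do
            -- `$!` forces the result while the lock is still held.
            r <- withMVar lock $ \_ -> return $! unstuff input
            putMVar done r
        threadDelay delayMicros
        return done
    -- Collect the results in the order the threads were started.
    mapM takeMVar dones
```

With the lock in place the unpickling itself is sequential, so the threads only overlap in whatever work they do outside `withMVar`.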

Bulat,

On Jan 4, 2006, at 7:57 PM, Bulat Ziganshin wrote:
3) i also placed lock around `unstuff` call to decrease GC times
This sort of invalidates the test. We have already proven that it works much better when you do this but it just pushes the delays upstream.

I will profile your version in the next few days to compare it to my final results.

Thanks, Joel

--
http://wagerlabs.com/

Hello Joel, Wednesday, January 04, 2006, 11:31:48 PM, you wrote:
3) i also placed lock around `unstuff` call to decrease GC times
JR> This sort of invalidates the test. We have already proven that it
JR> works much better when you do this but it just pushes the delays
JR> upstream.

as i said, try this on the real program

--
Best regards, Bulat mailto:bulatz@HotPOP.com

Hello Bulat, Thursday, January 05, 2006, 3:14:12 AM, you wrote:
3) i also placed lock around `unstuff` call to decrease GC times
JR>> This sort of invalidates the test. We have already proven that it
JR>> works much better when you do this but it just pushes the delays
JR>> upstream.

on my 1ghz duron, unpickling speed (for sequential code) is about 2 mb/s. with 50 kb packets, it can process 40 packets/s, or 120 packets in 3 sec. my changes to the program ensure minimal threading overhead, so i can guarantee 120 working threads for this program. your processor is slightly faster, so it will run 150-200 threads. to further increase speed, you need to either

1) use a faster processor or many processors
2) speed up unpickling
3) learn the timeout strategy of the server and write the program according to it

i also recommend you try FD from my Binary package instead of Handles, because using 1000 Handles may involve a large memory/cpu pressure

--
Best regards, Bulat mailto:bulatz@HotPOP.com
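The estimate above is plain arithmetic and can be written out as a small sketch. It assumes decimal units (2 mb/s = 2000 kb/s) and that each thread has one packet to unpickle per 3-second window; the names are made up for illustration.

```haskell
-- Threads a sequential unpickler can keep up with, assuming every
-- thread needs one packet unpickled per window.
sustainableThreads :: Int -> Int -> Int -> Int
sustainableThreads kbPerSec packetKB windowSec =
    (kbPerSec `div` packetKB) * windowSec

-- Figures from the message: 2 mb/s throughput, 50 kb packets.
packetsPerSec :: Int
packetsPerSec = 2000 `div` 50                      -- 40 packets/s

-- The same rate over a 3 second window.
packetsPerWindow :: Int
packetsPerWindow = sustainableThreads 2000 50 3    -- 120 packets
```

Under the same assumptions, reaching 150-200 threads needs roughly 2.5 to 3.3 mb/s of unpickling throughput, which is why faster unpickling or a faster processor is on the list.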

Could you give us a bit more detail on this? How does using handles involve large memory/CPU pressure?

On Jan 5, 2006, at 10:01 AM, Bulat Ziganshin wrote:
i also recommend you to try FD from my Binary package instead of Handles because using 1000 Handles may involve a large memory/cpu pressure

Hello Joel,

Thursday, January 05, 2006, 2:01:43 PM, you wrote:

JR> Could you give us a bit more detail on this?

forget about this :) i forgot that direct i/o on sockets will block the entire app. GHC organizes its own complex non-blocking machinery

JR> How does using handles involve large memory/CPU pressure?

just look at the GHC.Handle, .IO, .IOBase modules

JR> On Jan 5, 2006, at 10:01 AM, Bulat Ziganshin wrote:
i also recommend you to try FD from my Binary package instead of Handles because using 1000 Handles may involve a large memory/cpu pressure
JR> --
JR> http://wagerlabs.com/

--
Best regards, Bulat mailto:bulatz@HotPOP.com
participants (4)
- Bulat Ziganshin
- Dylan Thurston
- Joel Reymont
- Ketil Malde