
2 Mar 2009, 5:34 p.m.
Hello Manlio, Monday, March 2, 2009, 8:16:10 PM, you wrote:
> By the way: I have written the first version of a program in D to parse the Netflix training data set. I also used ncpu * 1.5 threads to parse files concurrently.
> However, execution was *really* slow due to garbage collection. I also tried disabling garbage collection and manually running a collection cycle from time to time (every 200 files parsed), but the performance was the same.
Maybe it would be better to use something like MapReduce and split your job into 100-file parts, each processed by one of ncpu concurrently running scripts?

--
Best regards,
 Bulat                            mailto:Bulat.Ziganshin@gmail.com
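A minimal sketch of the split Bulat suggests, written in Python rather than D for brevity: partition the file list into 100-file chunks and hand each chunk to one of a pool of ncpu worker processes, so each worker has its own heap and its own garbage collector. The file names and the parse step below are placeholders, not the actual Netflix data layout.

```python
import os
from multiprocessing import Pool

CHUNK_SIZE = 100  # files per job, as suggested in the post


def chunks(items, size):
    """Split a list into consecutive slices of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_chunk(file_names):
    """Placeholder 'map' step: a real worker would open and parse each
    training file here; this stub just reports how many it was given."""
    return len(file_names)


if __name__ == "__main__":
    # Hypothetical file names standing in for the training set.
    files = [f"mv_{i:07d}.txt" for i in range(1, 451)]
    ncpu = os.cpu_count() or 1
    with Pool(processes=ncpu) as pool:
        counts = pool.map(process_chunk, chunks(files, CHUNK_SIZE))
    print(sum(counts))  # total files handled across all chunks
```

Because the workers are separate processes, none of them shares a garbage collector with the others, which is the point of the suggestion: GC pauses in one chunk's parser cannot stall the rest.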