Thanks. I actually prototyped in Perl the SA method intuitively (though I do not know its name). But it is way slow. So I want to improve the speed by means of both algorithm and language.
Hong
Hong Yang schrieb:I would say it depends on the distribution of values in the table. If there are rows with a lot of values and rows with few values, then I would first sort the rows after the number of cells with values. If you look at all the columns and the number of values for each row is unique then it would be perfectly solved. With a list of preferred columns and also a uniform distribution the problem might be hard (NP-complete?), but these hard problems can often be approximated, i.e with simulated annealing, which in short is switching two rows repeatedly as long as the result improves.
The question is more about algorithm than Haskell. But I am going to code in
Haskell which I am still learning.
Suppose I have a large table, with hundreds of columns and thousands of
rows. But not every cell has a value (of String, or Int, or Double type).
I want to shuffle the rows to maximize the number of columns whose first 100
rows have at least one number, given a list of preferred column names since
there is no guarantee that every number column will have at least one number
in its first 100 rows after shuffling.
Can someone provide a good algorithm for this problem? (I do not have any
background in algorithms.) You can assume I already know which columns are
of Int or Double type.
This is not a homework.
Thanks,
Hong
------------------------------------------------------------------------
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe