
Hong Yang wrote:
The question is more about algorithms than Haskell, but I am going to code it in Haskell, which I am still learning.
Suppose I have a large table, with hundreds of columns and thousands of rows. But not every cell has a value (of String, or Int, or Double type).
I want to shuffle the rows to maximize the number of columns whose first 100 rows contain at least one number. I also have a list of preferred column names, since there is no guarantee that every numeric column will have at least one number in its first 100 rows after shuffling.
Can someone provide a good algorithm for this problem? (I do not have any background in algorithms.) You can assume I already know which columns are of Int or Double type.
I would say it depends on the distribution of values in the table. If there are rows with a lot of values and rows with few values, then I would first sort the rows by the number of cells with values, so that the fullest rows come first. If you consider all the columns and the number of values in each row is unique, then that would solve it perfectly. With a list of preferred columns and a uniform distribution, the problem might be hard (NP-complete?), but such hard problems can often be approximated, e.g. with simulated annealing, which in short means repeatedly swapping two rows and keeping the swap when the result improves (and occasionally when it does not, to escape local optima).
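To make the idea of an approximation concrete, here is a minimal sketch in Haskell. Note that it is not simulated annealing and not the plain sort described above: it is a greedy heuristic that repeatedly moves to the front the row that adds the most numeric columns not yet covered. The representation is an assumption of mine: each row is reduced to the list of indices of numeric columns in which it actually holds a number, and the names Row, greedyFront and coverage are hypothetical. To honour a list of preferred columns, one could first filter each row down to the preferred indices.

```haskell
import Data.List (delete, maximumBy, nub, union, (\\))
import Data.Ord (comparing)

-- Assumed representation: a row is the list of indices of the
-- numeric columns in which that row actually holds a number.
type Row = [Int]

-- Number of distinct columns that have at least one number
-- within the first k rows.
coverage :: Int -> [Row] -> Int
coverage k = length . nub . concat . take k

-- Greedy heuristic: pick up to k rows, each time taking the row
-- that covers the most columns not covered by earlier picks.
-- The picked rows come first, followed by the remaining rows
-- in their original order.
greedyFront :: Int -> [Row] -> [Row]
greedyFront k = go k []
  where
    go _ _ [] = []
    go 0 _ rest = rest
    go n covered rows
      | gain best == 0 = rows   -- no row adds a new column; stop early
      | otherwise =
          best : go (n - 1) (covered `union` best) (delete best rows)
      where
        gain r = length (nub r \\ covered)
        best = maximumBy (comparing gain) rows
```

For the table in the question one would call greedyFront 100 on the reduced rows and compare coverage 100 before and after. The result is a heuristic answer, not a guaranteed optimum, and a swap-based search such as the one described above could be started from it.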
This is not homework. Thanks,
Hong
------------------------------------------------------------------------
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe