
Just thought I would toss the list a problem I need to solve and see what I can learn from the solutions here. My problem is that I need to take a list of Double and map them into "percentile" form (not sure if I have that terminology correct) - so that the lowest number is mapped to 0, the highest number is mapped to 1, the median is mapped to 0.5, and in general given a fixed delta D and an interval (x, x +D) in the percentile domain, the same number of elements get mapped into that interval for any x. Except there aren't an infinity of values and there may be ties, so it gets a little fuzzier how to define it. so I need: toPercentile :: [Double] -> [Double] I shall now work on it and report back. Dennis

Here is my implementation, which sorts the list of values. If there are N values, the first gets a percentile of 0/N, the second gets 1/N, and so on, so that the last value gets a percentile of (N-1)/N. It then generates the percentiles list from this mapping. import Data.List (sort) import Data.Maybe (fromJust) toPercentile xs = let pairs = zip (sort xs) . map (/(fromIntegral $ length xs)) . map fromInteger $ [0..] in map (fromJust . (flip lookup) pairs) xs Nick On Monday, October 01, 2012 03:11:40 PM Dennis Raddle wrote:
Just thought I would toss the list a problem I need to solve and see what I can learn from the solutions here.
My problem is that I need to take a list of Double and map them into "percentile" form (not sure if I have that terminology correct) - so that the lowest number is mapped to 0, the highest number is mapped to 1, the median is mapped to 0.5, and in general given a fixed delta D and an interval (x, x +D) in the percentile domain, the same number of elements get mapped into that interval for any x. Except there aren't an infinity of values and there may be ties, so it gets a little fuzzier how to define it.
so I need:
toPercentile :: [Double] -> [Double]
I shall now work on it and report back.
Dennis

I'm not sure if you considered duplicate elements. I came up with this
(which handles duplicates by averaging the percentile position of the first
index of the element in the sorted list and the last index of the element.
How one handles duplicates may be arbitrary or dependent on the deeper
problem). Also for my problem I wanted the maximum percentile to be 1
rather than (N-1)/N but that may just depend on your definition.
import qualified Data.List as L
toPercentile inp = map doValue inp
where
sorted = L.sort inp
len = length inp
perc idx = fromIntegral idx / (fromIntegral len - 1)
doValue x = case L.elemIndices x sorted of
[i] -> perc i
is -> ((perc (head is)) + (perc (last is)))/2
On Mon, Oct 1, 2012 at 4:28 PM, Nick Vanderweit
Here is my implementation, which sorts the list of values. If there are N values, the first gets a percentile of 0/N, the second gets 1/N, and so on, so that the last value gets a percentile of (N-1)/N. It then generates the percentiles list from this mapping.
import Data.List (sort) import Data.Maybe (fromJust) toPercentile xs = let pairs = zip (sort xs) . map (/(fromIntegral $ length xs)) . map fromInteger $ [0..] in map (fromJust . (flip lookup) pairs) xs
Nick
participants (2)
-
Dennis Raddle
-
Nick Vanderweit