
On Thu, Jun 05, 2003 at 08:09:02AM -0500, Stecher, Jack wrote:
I have an exceedingly simple problem to address, and am wondering if there are relatively straightforward ways to improve the efficiency of my solution.
Was there actually a problem with the efficiency of your first code?
The task is simply to look at a lengthy list of stock keeping units (SKUs -- what retailers call individual items), stores, dates that a promotion started, dates the promotion ended, and something like sales amount; we want to pull out the records where promotions overlap. I will have dates in yyyymmdd format, so there's probably no harm in treating them as Ints.
(Unless this is really a one-shot deal, I suspect using Ints for dates is a bad decision...)
My suggestion went something like this (I'm not at my desk so I don't have exactly what I typed):
I have a different algorithm, which should be nearly optimal, but I find it harder to describe than to show the code (which is untested):
import List(sortBy, insertBy)
data PromotionRec = PR {sku :: String, store :: String, startDate :: Int, endDate :: Int, amount::Float}
compareStart, compareEnd :: PromotionRec -> PromotionRec -> Ordering compareStart x y = compare (startDate x) (startDate y) compareEnd x y = compare (endDate x) (endDate y)
overlap :: [PromoRec] -> [[PromoRec]] overlap l = filter (lambda l. length l > 1) (overlap' [] (sortBy compareStart l))
overlap' _ [] = [] overlap' active (x:xs) = let {active' = dropWhile (lambda y. endDate y < startDate x) active} in (x:active') : overlap' (insertBy compareEnd x active') xs
The key is that, by keeping a list of the currently active promotions in order sorted by the ending date, we only need to discared an initial portion of the list. You could get a moderately more efficient implementation by keeping the active list as a heap rather than a list. Peace, Dylan