
On Saturday 18 September 2010 19:44:38, Jake McArthur wrote:
On 09/18/2010 02:51 AM, Christopher Tauss wrote:
I am trying to write a function that takes a list and returns the last n elements.
This may just be for the sake of learning, in which case this is fine, but usually, needing to do this would be a sign that you are using lists improperly (since this is a O(n) time operation).
By which he meant O(length list), not the number of elements you're asking for.
Let's call the function n_lastn and, given a list [1,2,3,4,5], I would like n_lastn 3 = [3,4,5]
n_lastn n = reverse . take n . reverse
Which is the most elegant definition, but it's an O(length list) space operation (as are all others proposed so far). That will be a problem for long lists (consider n_lastn 10 [1 .. 10^8]). You can implement it as an O(n) [the number of elements you want] *space* operation, provided that the list is generated lazily and not referred to by anything else, but the code is decidedly less nice: The first idea would be to keep a window of n elements and move it to the end of the list: n_lastn n xs = case splitAt n xs of (ys,[]) -> ys -- the list contains at most n elements, yay (ys,zs) -> loop ys zs where loop window [] = window loop window (v:vs) = loop (tail window ++ [v]) vs The space behaviour is now fine (if compiled with optimisations), but unfortunately, the time complexity has become O(n*length list) because the (++)-calls are left-nested: Suppose n = 4, loop (1:2:3:4:[]) [5 .. end] ~> loop ((2:3:4:[]) ++ [5]) [6 .. end] ~> loop (2:((3:4:[]) ++ [5])) [6 .. end] ~> loop (((3:4:[]) ++ [5]) ++ [6]) [7 .. end] The strictness analyser has been smart enough to transform the call to tail into a strict pattern match on window, as if we'd written loop (_:twindow) (v:vs) = loop (twindow ++ [v]) vs so the first few tails go fast, but later, we have to go through more layers to get at the head of the window to discard it ~> loop ((3:((4:[]) ++ [5])) ++ [6]) [7 .. end] ~> loop (3:(((4:[]) ++ [5]) ++ [6])) [7 .. end] -- finally! ~> loop ((((4:[]) ++ [5]) ++ [6]) ++ [7]) [8 .. end] -- bubble the 4 through four layers of parentheses: ~> loop(((4:([] ++ [5])) ++ [6]) ++ [7]) [8 .. end] ~> loop ((4:(([] ++ [5]) ++ [6])) ++ [7]) [8 .. end] ~> loop (4:((([] ++ [5]) ++ [6]) ++ [7])) [8 .. end] ~> loop (((([] ++ [5]) ++ [6]) ++ [7]) ++ [8]) [9 .. end] -- phew -- form now on, it's uniform, on each step we have to match an expression (((([] ++ [a]) ++ [b]) ++ [c]) ++ [d]) against (_:rest) 1. check whether ((([] ++ [a]) ++ [b]) ++ [c]) is empty, for that, 2. check whether (([] ++ [a]) ++ [b]) is empty, for that, 3. check whether ([] ++ [a]) is empty, for that, 4. check whether [] is empty, it is, hence [] ++ [a] = [a], 5. check whether [a] is empty, it's not, it's (a:[]), hence 6. (...) ++ [b] = a:([] ++ [b]), so 2's not empty, and 7. (...) ++ [c] = a:(([] ++ [b]) ++ [c]), so 1's not empty and 8. (...) ++ [d] = a:((([] ++ [b]) ++ [c]) ++ [d]) 9. at last, a can be dropped and we get to loop (((([] ++ [b]) ++ [c]) ++ [d]) ++ [e]) remainingList Blech! Appending to the end of a list is bad if it leads to left-nested parentheses (it's okay if the parentheses are right-nested). So we might be tempted to keep the window reversed and cons each element to the front, dropping the last. No good, removing an element from the end is O(length window) too. One possibility to fix it is to use a 'list-like' type with O(1) appending at the end and dropping from the front. Data.Sequence is such a type, import qualified Data.Sequence as Seq import Data.Foldable (toList) import Data.Sequence (ViewL(..), (|>)) n_lastn' :: Int -> [a] -> [a] n_lastn' k _ | k <= 0 = [] n_lastn' k xs = case splitAt k xs of (ys,[]) -> ys (ys,zs) -> go (Seq.fromList ys) zs where go acc [] = toList acc go acc (v:vs) = case Seq.viewl acc of _ :< keep -> go (keep |> v) vs _ -> error "teh impossible jus hapnd" fixes space and time behaviour. But the constant factors for Sequence are larger than those for lists, so we can beat it with lists: n_lastn :: Int -> [a] -> [a] n_lastn k _ | k <= 0 = [] n_lastn k xs = case splitAt k xs of (ys,[]) -> ys (ys,zs) -> go k (reverse ys) zs where m = k-1 go _ acc [] = reverse $ take k acc go 0 acc (v:vs) = case splitAt m acc of (keep,release) -> release `seq` go k (v:keep) vs go h acc (v:vs) = go (h-1) (v:acc) vs Every k steps, we perform the O(k) operation of removing the last (k+1) elements from a (2k)-element list, making the removal from the end an amortized O(1) operation. You can trade some space for speed and clip the window in larger intervals, say every (3*k) or every (10*k) steps.