
Suppose I have you the source code to some arbitrary function that takes a list and returns another list. It is possible to determine whether the function always examins the entire input list? Or would that be equivilent to solving the Halting Problem? (Last time I checked, the Halting Problem is unsolvable.)

On Fri, 2007-05-11 at 12:10 +0100, Andrew Coppin wrote:
Suppose I have you the source code to some arbitrary function that takes a list and returns another list.
It is possible to determine whether the function always examins the entire input list?
Presumably you mean in the case that you examine the entire output list, or do you mean when you only examine the output enough to see if it's a [] or (:) (i.e. to WHNF) ? Duncan

It is possible to determine whether the function always examins the entire input list?
Presumably you mean in the case that you examine the entire output list, or do you mean when you only examine the output enough to see if it's a [] or (:) (i.e. to WHNF) ?
f1 [] = [] f1 (x:xs) = if x < 0 then [x] else f1 xs This function does not always examine the entire input list. f2 [] = 0 f2 (x:xs) = x + f2 xs This function *does* examine the entire list. Always. There are many possible variations - length examines the whole list, but not the elements *in* the list. null does less than that. And so on. I'm sure there are many possible combinations. What I'm wondering is if it's possible to algorithmically decide which class of functions an arbitrary source fragment belongs to, or whether this is computationally impossible.

I think it might be impossible. For the function:
sumFirst n = sum . take n
then for any finite list, we can choose an n large enough that it will
examine the whole list, even though the function is able to stop. This means
the only way to check is to call it with an infinite list and see if it
terminates, which might be impossible. However, maybe it is possible if this
is a special case of the halting problem?
On 11/05/07, Andrew Coppin
It is possible to determine whether the function always examins the entire input list?
Presumably you mean in the case that you examine the entire output list, or do you mean when you only examine the output enough to see if it's a [] or (:) (i.e. to WHNF) ?
f1 [] = [] f1 (x:xs) = if x < 0 then [x] else f1 xs
This function does not always examine the entire input list.
f2 [] = 0 f2 (x:xs) = x + f2 xs
This function *does* examine the entire list. Always.
There are many possible variations - length examines the whole list, but not the elements *in* the list. null does less than that. And so on. I'm sure there are many possible combinations. What I'm wondering is if it's possible to algorithmically decide which class of functions an arbitrary source fragment belongs to, or whether this is computationally impossible.
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe

Andrew Coppin wrote:
There are many possible variations - length examines the whole list, but not the elements *in* the list. null does less than that. And so on. I'm sure there are many possible combinations. What I'm wondering is if it's possible to algorithmically decide which class of functions an arbitrary source fragment belongs to, or whether this is computationally impossible.
It's computationally impossible, in general. I can write something silly like f (x:xs) = x : (UniversalTuringMachine(x) `seq` xs) and it reduces to the halting problem. However, it might be tractable for a large class of sensible programs. Someone around here might be aware of some relevant research? Jules

Andrew, Jules wrote:
It's computationally impossible, in general. I can write something silly like
f (x:xs) = x : (UniversalTuringMachine(x) `seq` xs)
and it reduces to the halting problem.
However, it might be tractable for a large class of sensible programs. Someone around here might be aware of some relevant research?
This is rather typical in the field of program analysis. Getting the analysis precise is impossible and reduces to solving the halting problem. So, the next best thing is to get an approximate answer. An import design guideline to such an analysis is "to err on the safe side". That means you'll end up either - with an analysis that tells you that a given function *is guaranteed to* inspect all elements of its argument list or else that the function *may* inspect less elements; or - with an analysis that tells you that a given function *is guaranteed to* not inspect all elements of its argument list or else that the function *may* inspect all elements. Of course, it depends on your needs which of these modalities suits you best. I am not aware of any work on exactly this type of analysis, but you may want to have a look at strictness analysis first; I think there's quite some overlap with what you are trying to achieve. Cheers, Stefan

Stefan Holdermans wrote:
This is rather typical in the field of program analysis. Getting the analysis precise is impossible and reduces to solving the halting problem. So, the next best thing is to get an approximate answer. An import design guideline to such an analysis is "to err on the safe side". That means you'll end up either
- with an analysis that tells you that a given function *is guaranteed to* inspect all elements of its argument list or else that the function *may* inspect less elements; or
- with an analysis that tells you that a given function *is guaranteed to* not inspect all elements of its argument list or else that the function *may* inspect all elements.
Of course, it depends on your needs which of these modalities suits you best.
Right. So what you're saying is that for most program properties, you can partition the set of all possible problems into the set for which X is true, the set for which X is false, and a final set for programs where we can't actually determine the truth of X. Is that about right? I find it interesting that one quite hard looking property - "this program is well-typed" - can always be deduced. Perhaps that's because the language is specifically designed to constrain the set of possible programs in such a way that this is possible? Or perhaps it's just that the set of programs which can't be proved well-typed are simply rejected as if they were provel ill-typed? *shrugs* Can somebody explain to me exactly what the halting problem says? As I understand it, it doesn't say "you can't tell if this program halts", it says "you cannot write an algorithm which can tell whether *every* program halts or not". (Similar to the way that Galios dude didn't say "you can't solve high-order polynomials", he said "you can't solve all possible Nth order polynomials *with one formula* [involving only a specific set of mathematical operators]". As in, you *can* solve high-order polynomials - just not with a single formula for all of them. Or not just with the specified operators.)
I am not aware of any work on exactly this type of analysis, but you may want to have a look at strictness analysis first; I think there's quite some overlap with what you are trying to achieve.
I was actually thinking about having a chain of functions, and wondering whether you can remove the intermediate lists, and/or "optimise" them into some other data structure. But that would depend on how much of the list is consumed - which, as I suspected, is tricky to determine.

Andrew,
Right. So what you're saying is that for most program properties, you can partition the set of all possible problems into the set for which X is true, the set for which X is false, and a final set for programs where we can't actually determine the truth of X. Is that about right?
Yes, you can. (Although, as far as I know, one typically partitions into (a) the set for which is true (respectively false) and (b) the union of the set for which X is false (true) and the set for which we cannot determine X. But, of course, being able to do so implies that you can apply the partitioning that you suggest.)
I find it interesting that one quite hard looking property - "this program is well-typed" - can always be deduced. Perhaps that's because the language is specifically designed to constrain the set of possible programs in such a way that this is possible? Or perhaps it's just that the set of programs which can't be proved well-typed are simply rejected as if they were provel ill-typed? *shrugs*
That's exactly it. A prototypical example is the Haskell program (if True then 2 else 'x') + 3 Of course, this is a type-safe program in the sense that it will never actually result in adding a character and a number; but still, for a large class of statically typed languages, such a program will be rejected. Cheers, Stefan

Andrew Coppin wrote:
Right. So what you're saying is that for most program properties, you can partition the set of all possible problems into the set for which X is true, the set for which X is false, and a final set for programs where we can't actually determine the truth of X. Is that about right?
If "X is true/false" is replaced by "X can be determined to be true/false", then: yes, roughly. I say "roughly" because "the set of programs where we can't actually determine the truth of X" depends on the program analysis algorithm, i.e. is not uniquely determined by X. For instance, you can approximate the halting problem with: Procedure A(n): 1) run program for n steps 2) if it halts, return "Halts, I'm sure" 3) otherwise, return "Don't know" Each algorithm A(n) partitions programs into "halting" and "???". Each A(n) is better than A(n-1), i.e. more halting programs are recognised. Also, for each halting program P, there exists some n such that A(n) recognises P. So, "the set of programs where we can't actually determine termination" has no meaning unless you specify the actual algorithm. However, for any algorithm A you choose, you will indeed partition programs into the three classes "Yes/No/???" you mentioned. Also, a better algorithm than A surely exists (i.e. with a smaller "???" class), so you can keep on improving your program analysis techniques forever. (a.k.a. Eternal Employment Theorem ;-))
I find it interesting that one quite hard looking property - "this program is well-typed" - can always be deduced.
This is not true, in general. The "naive" polymorphic type system for the lambda calculus is undecidable. For some tricky cases, you might want to look up "polymorphic recursion". More practically, in some cases you need type signatures in your Haskell programs, since inference can not provide them (in general). Here's a simple one (involving rank-N): (\(x :: forall a . a->a) -> x x) (\x -> x) 3 Intuition: ... => (id id) 3 => id 3 => 3 . Also, using only rank-1: polyf :: Int -> a -> Int polyf x y = if x==0 then 0 else if x==1 then polyf (x-1) (\z->z) else polyf (x-2) 3 Here passing both 3 and (\z->z) as y confuses the type inference.
Perhaps that's because the language is specifically designed to constrain the set of possible programs in such a way that this is possible?
Well, I would say the "naive" polymorphic type system is restricted so to achieve decidability without rejecting too many good programs. (See e.g. Hindley-Milner) (And when designing a type system, decidability is a main concern) Examples as those above are arguably contrived, and rarely show up in actual code. And when it happens, adding a type signature is not too hard.
Or perhaps it's just that the set of programs which can't be proved well-typed are simply rejected as if they were provel ill-typed? *shrugs*
Yes, this is the case, as for the programs above. Also, one might argue that (if True then 42 else "a") is well-typed, and this is hard to check statically.
Can somebody explain to me exactly what the halting problem says? As I understand it, it doesn't say "you can't tell if this program halts", it says "you cannot write an algorithm which can tell whether *every* program halts or not".
Your formulation of "the halting problem is undecidable" seems rather precise to me. Regards, Zun.

On 5/14/07, Roberto Zunino
Also, using only rank-1:
polyf :: Int -> a -> Int polyf x y = if x==0 then 0 else if x==1 then polyf (x-1) (\z->z) else polyf (x-2) 3
Here passing both 3 and (\z->z) as y confuses the type inference.
Actually, I tried this in ghci... Should this work? polyf.hs: polyf x y = if x==0 then 0 else if x==1 then polyf (x-1) (\z->z) else polyf (x-2) 3 NOTE: no type signature Prelude> :l polyf [1 of 1] Compiling Main ( polyf.hs, interpreted ) Ok, modules loaded: Main. *Main> :t polyf polyf :: forall a t1 t. (Num (t1 -> t1), Num a, Num t) => a -> (t1 -> t1) -> t The inference assigns y the type (t1 -> t1) even though it is assigned the value 3? Regards, Chris

Roberto Zunino:
Here passing both 3 and (\z->z) as y confuses the type inference.
Christopher L Conway:
polyf :: forall a t1 t. (Num (t1 -> t1), Num a, Num t) => a -> (t1 -> t1) -> t
The inference assigns y the type (t1 -> t1) even though it is assigned the value 3?
Almost. It assigns y the type (Num (t1 -> t1) => t1 -> t1). If you're wondering where the Num context came from, note that the literal "3" is read as "fromInteger 3", which has type (Num a => a). Unifying (t1 -> t1) with (Num a => a) gives (Num (t1 -> t1) => t1 -> t1). So the type inference is not really confused at all. It just gives a not-very-useful type. To demonstrate the perfectly logical behaviour of the typechecker, we just need a little more absurd code:
{-# OPTIONS_GHC -fglasgow-exts #-}
-- ^ Extensions are needed for relaxed instances only, -- no rank-n polymorphism here.
instance Show (t -> t) where show _ = "<function>"
instance Eq (t -> t) where _ == _ = False
instance Num (t -> t) where _ + _ = id _ - _ = id _ * _ = id abs _ = id signum _ = id fromInteger _ = id
Et, voila: *PolyF> :t polyf polyf :: (Num a, Num t) => a -> (t1 -> t1) -> t *PolyF> polyf 3 3 0

Matthew Brecknell wrote:
Roberto Zunino:
Here passing both 3 and (\z->z) as y confuses the type inference.
So the type inference is not really confused at all. It just gives a not-very-useful type.
Yes, you are right, I didn't want to involve type classes and assumed 3::Int. A better example would be: polyf :: Int -> a -> Int polyf x y = if x==0 then 0 else if x==1 then polyf (x-1) (\z->z) else polyf (x-2) () Here, passing both () and (\z->z) _does_ make inference fail. Zun.

Christopher L Conway wrote:
On 5/14/07, Roberto Zunino
wrote: Also, using only rank-1:
polyf :: Int -> a -> Int polyf x y = if x==0 then 0 else if x==1 then polyf (x-1) (\z->z) else polyf (x-2) 3
Here passing both 3 and (\z->z) as y confuses the type inference.
Actually, I tried this in ghci... Should this work?
polyf.hs: polyf x y = if x==0 then 0 else if x==1 then polyf (x-1) (\z->z) else polyf (x-2) 3
NOTE: no type signature
I forgot to mention a *use* of that as in test = polyf 1 2 Then: No instance for (Num (t -> t)) (see also below) Also, we can avoid type classes and _still_ have inference problems: using (3 :: Int) in polyf yields: Couldn't match expected type `Int' against inferred type `t -> t' which is actually the type mismatch error I had in mind. So, even in Haskell - type classes, we need signatures in some cases.
Prelude> :l polyf [1 of 1] Compiling Main ( polyf.hs, interpreted ) Ok, modules loaded: Main. *Main> :t polyf polyf :: forall a t1 t. (Num (t1 -> t1), Num a, Num t) => a -> (t1 -> t1) -> t
The inference assigns y the type (t1 -> t1) even though it is assigned the value 3?
Yes, because Haskell natural literals are overloaded: 3 :: forall a. Num a => a So 3 :: (t1 -> t1) is "fine" provided you supply a Num instance for that. Alas, only trivial instances exists (0=1=2=..=id). Zun.

Can somebody explain to me exactly what the halting problem says? As I understand it, it doesn't say "you can't tell if this program halts", it says "you cannot write an algorithm which can tell whether *every* program halts or not". (Similar to the way that Galios dude didn't say "you can't solve high-order polynomials", he said "you can't solve all possible Nth order polynomials *with one formula* [involving only a specific set of mathematical operators]". As in, you *can* solve high-order polynomials - just not with a single formula for all of them. Or not just with the specified operators.)
I'm not an expert, but as I understand it, you can construct a Turing machine to run a number of other Turing machines in parallel. In the case of the halting problem, you would then be able to construct a machine that simulates as many halting problem solutions as you want in parallel and return the decision of the first simulated machine that finishes. This parallel machine (still a Turing machine) will also be unable to decide the halting problem. Thus, no combination of algorithms is able to decide the halting problem.
participants (9)
-
Andrew Coppin
-
Christopher L Conway
-
Duncan Coutts
-
Josiah Manson
-
Jules Bean
-
Matthew Brecknell
-
Roberto Zunino
-
Rodrigo Queiro
-
Stefan Holdermans