why is something different within a function when it comes out?


prad

14 Jul 2010

5:33 a.m.

i have this:

    main = do
        c <- readFile "test.txt"
        let tleP = "<title>\n(.*)\n</title>"
        let tle = c=~tleP::[[String]]
        putStrLn $ tle!!0!!1
        let g = xtract tleP c
        putStrLn $ show g

    xtract p c = do
        let r = c=~p::[[String]]
        return r!!0!!0!!1

for the first putStrLn i get:

    League of Humane Voters Home Page

for the second i get

    "League of Humane Voters Home Page"

but only after i have done r !!0 !!0 !!1, whereas i only needed tle !!0 !!1.

it seems to me that both tle and r serve the same purpose, though they come out as different types: r being a string and tle being i don't know what, because i get a "Not in scope: 'tle'". so even though both r and tle are set as [[String]], they don't seem to be the same creature.

-- 
In friendship,
prad

... with you on your journey
Towards Freedom
http://www.towardsfreedom.com (website)
Information, Inspiration, Imagination - truly a site for soaring I's


Michael Mossey

14 Jul

8:10 a.m.

I'm fairly new, so I can't fully explain the behavior you got, but I know at least one thing you did wrong:

prad wrote:

i have this:

    main = do
        c <- readFile "test.txt"
        let tleP = "<title>\n(.*)\n</title>"
        let tle = c=~tleP::[[String]]
        putStrLn $ tle!!0!!1
        let g = xtract tleP c
        putStrLn $ show g

    xtract p c = do
        let r = c=~p::[[String]]
        return r!!0!!0!!1

You probably mean to write

    xtract p c = r !! 0 !! 1
        where r = c=~p::[[String]]

You used do-notation when you meant to write a simple function. The use of "return" in do-notation is one point of confusion with beginners. It does not operate like "return" in an imperative language. I'm not sure what your 'xtract' did, but the compiler probably accepted the do-notation because it specified a list monad of some sort.

I recommend you pay close attention to introductory examples, noticing in particular when they are unlike imperative languages. Find some tutorials on monads and the do-notation.

Best,
Mike

Chaddaï Fouché

8:54 a.m.

On Wed, Jul 14, 2010 at 7:33 AM, prad wrote:

    xtract p c = do
        let r = c=~p::[[String]]
        return r!!0!!0!!1

return is just a function in Haskell, and given that function application has priority over any operator, this means your last line is:

    (return r) !! 0 !! 0 !! 1

The type of return is "(Monad m) => a -> m a", where m is a type constructor variable (a type constructor is a parametrized type; think of an array parametrized on the element type in most languages, or a template in C++) constrained to be an instance of the Monad typeclass. Monad is an important typeclass in Haskell, but here it's enough to look at the type of (!!), "[a] -> Int -> a", to see that (return r) should be of a list type. List is a monad, so this code typechecks, but return is perfectly redundant. For the list monad:

    return x = [x]

so in this case your last line puts r in a singleton list:

    [r] !! 0 !! 0 !! 1

then (!! 0) extracts r:

    r !! 0 !! 1

And you get what you wanted in the first place...
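The list-monad reading above can be checked with a small sketch that uses only the Prelude. The value `matches` is a made-up stand-in for the result of `c =~ p :: [[String]]`, not real output from a regex library, and the other names are hypothetical too.

```haskell
-- Hypothetical stand-in for (c =~ p :: [[String]]): one match,
-- holding the whole matched text followed by the captured group.
matches :: [[String]]
matches = [["<title>\nHome Page\n</title>", "Home Page"]]

-- In the list monad, return wraps its argument in a singleton list.
wrapped :: [[[String]]]
wrapped = return matches

-- Parses as ((return matches) !! 0) !! 0 !! 1; the first (!! 0)
-- only undoes the wrapping done by return.
viaReturn :: String
viaReturn = return matches !! 0 !! 0 !! 1

-- The same value without the detour through return.
direct :: String
direct = matches !! 0 !! 1
```

Loaded in GHCi, `wrapped == [matches]` is True, and both `viaReturn` and `direct` evaluate to "Home Page".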

So you could write just "r !! 0 !! 1" instead of "return r!!0!!0!!1" for the same result. In fact xtract could be written:

    xtract p c = (c =~ p :: [[String]]) !! 0 !! 1

do-notation (which is just syntactic sugar to write monadic code easily) and return (which is just a function, not a keyword) are only useful when you're working with monads; if you're not, you shouldn't use them.

-- 
Jedaï

prad

8:41 p.m.

On Wed, 14 Jul 2010 10:54:22 +0200 Chaddaï Fouché wrote:

In fact xtract could be written :

    xtract p c = (c =~ p :: [[String]]) !! 0 !! 1

this is exactly what i was trying to do but couldn't figure out how to write it because i couldn't get my head around the return. it seems so simple when someone else does it. :D

thx for your explanations chaddai and you too michael.

this (as well as some crude attempts i've used earlier) works nicely on text enclosed in single lines, but finds nothing for multiple lines. in python, there is a dotall or re.S flag so that things can be searched for over \n in the text. i can't figure out how to do this here.

i looked at http://hackage.haskell.org/packages/archive/regex-tdfa/1.1.3/doc/html/Text-R... but am having difficulty figuring things out from the documentation - and there doesn't seem to be any multiline feature. surely there is some way to do this!

also, python had a re.sub so you can replace things using regex searches. how would you go about doing that in haskell?

-- 
In friendship,
prad

Daniel Fischer

9:12 p.m.


On Wednesday 14 July 2010 22:41:00, prad wrote:

...

also, python had a re.sub so you can replace things using regex searches. how would you go about doing that in haskell?

There's subRegex in Text.Regex in the regex-compat package (don't know if it's also provided in other packages).
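As a rough illustration of that suggestion, here is a minimal sketch that assumes the regex-compat package is installed. `squeeze` is a hypothetical helper, not something from the thread; it does roughly what Python's `re.sub(r"\s+", " ", s)` would do for spaces, tabs and newlines.

```haskell
import Text.Regex (mkRegex, subRegex)

-- Replace every run of spaces, tabs, and newlines with a single space.
-- subRegex takes the compiled regex, the input, and the replacement text.
squeeze :: String -> String
squeeze s = subRegex (mkRegex "[ \t\n]+") s " "
```

For example, `squeeze "a  b\n\nc"` gives "a b c". The replacement string may also refer to captured groups with `\\1`, `\\2` and so on.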

Michael Mossey

15 Jul

12:28 a.m.


prad wrote:

...

i looked at http://hackage.haskell.org/packages/archive/regex-tdfa/1.1.3/doc/html/Text-R... but am having difficulty figuring things out from the documentation - and there doesn't seem to be any multiline feature. surely there is some way to do this!

also, python had a re.sub so you can replace things using regex searches. how would you go about doing that in haskell?

For a parsing job, you might consider Parsec rather than regular expressions. Parsec is present by default with the Haskell Platform. I have not found any single comprehensive documentation on the latest version of Parsec... I had to ask people for help a lot.

Mike
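To give a feel for what the Parsec route might look like, here is a minimal sketch that pulls out the text between a `<text>` and a `</text>` tag. The tag names and the helper names `textBlock` and `extractText` are assumptions chosen for illustration; this is one possible shape, not a recommended HTML parser.

```haskell
import Text.Parsec (ParseError, anyChar, manyTill, parse, string, try)
import Text.Parsec.String (Parser)

-- Skip everything up to the opening tag, then collect characters
-- (newlines included) until the closing tag.
textBlock :: Parser String
textBlock = do
  _ <- manyTill anyChar (try (string "<text>"))
  manyTill anyChar (try (string "</text>"))

-- Run the parser on a whole document held in memory.
extractText :: String -> Either ParseError String
extractText = parse textBlock "(input)"
```

With this, `extractText "x<text>\nhello\nworld\n</text>y"` yields `Right "\nhello\nworld\n"`, and input without the tags yields a `Left` carrying the parse error.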

prad

1:19 a.m.

On Wed, 14 Jul 2010 17:28:18 -0700 Michael Mossey wrote:

For a parsing job, you might consider Parsec rather than regular expressions.

ok thx for this, because it's bound to be useful down the road. i'm just trying to grab lines from something like this, the stuff between the <text> ... </text>:

<text>
The Mission of the League of Humane Voters® (LOHV) is to create, unite, and strengthen local political action committees, which work to enact animal-friendly legislation and elect candidates for public office who will use their votes and influence for animal protection.

As election time comes around, council candidates in Independence, Ohio have taken the typical tactic of refusing to even answer questions regarding their position on various matters. Their 'silence speaks volumes'.
</text>

i'm using what chaddai showed me:

    xtract p c = (c =~ p :: [[String]]) !! 0 !! 1

which works fine for single lines, but produces nothing for multiple lines - same with some of the other ways i tried it: with single lines good, nothing for multiple. python requires setting the re.S flag which i always found strange since \n i thought is a char as well.

-- 
In friendship,
prad

Chaddaï Fouché

7:39 a.m.


On Thu, Jul 15, 2010 at 3:19 AM, prad wrote:

...

which works fine for single lines, but produces nothing for multiple lines - same with some of the other ways i tried it with single lines good, nothing for multiple. python requires setting the re.S flag which i always found strange since \n i thought is a char as well.

The problem is classic in the regex world: by default "." matches any character except \n. I would suggest "<title>\n([^<]*)\n</title>", which is probably a bit more robust anyway.

Though you must be aware that parsing html (or any markup language) properly with regexps is just impossible in general, and you can only get crude and fragile approximations. There are proper html parsing libraries on hackage if your needs become too complex for simple regexps to handle.

-- 
Jedaï

prad

7:12 p.m.

On Thu, 15 Jul 2010 09:39:39 +0200 Chaddaï Fouché wrote:

Though you must be aware that parsing html (or any markup language) properly with regexp is just impossible in general and you can only get crude and fragile approximations.

i never realized that since my needs have been pretty simple and python's re has done things nicely.

There are proper html parsing libraries on hackage if your needs become too complex for simple regexp to handle.

i found tagsoup a while ago and i'm trying to learn how to use it better. it's a bit of an overkill for what i'm doing here which probably can be done even without regex - something i'm thinking about.

thx.

-- 
In friendship,
prad
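For the "without regex" idea, here is a minimal sketch using only Data.List from base. The helper names `after`, `before` and `between` are hypothetical, and it only handles the first occurrence of each tag, so it is a crude approximation in the same spirit as the regex version.

```haskell
import Data.List (isPrefixOf, stripPrefix, tails)

-- Everything after the first occurrence of tag, if it occurs at all.
after :: String -> String -> Maybe String
after tag s = case [rest | t <- tails s, Just rest <- [stripPrefix tag t]] of
  (rest : _) -> Just rest
  []         -> Nothing

-- Everything before the first occurrence of tag, if it occurs at all.
before :: String -> String -> Maybe String
before tag = go []
  where
    go acc rest
      | tag `isPrefixOf` rest = Just (reverse acc)
      | otherwise = case rest of
          []       -> Nothing
          (x : xs) -> go (x : acc) xs

-- Text between an opening and a closing tag.
-- Newlines are ordinary characters here, so multi-line bodies work.
between :: String -> String -> String -> Maybe String
between open close s = after open s >>= before close
```

For example, `between "<text>" "</text>" "<text>\na\nb\n</text>"` gives `Just "\na\nb\n"`, and a string with no opening tag gives `Nothing`.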

Chaddaï Fouché

7:44 a.m.


On Thu, Jul 15, 2010 at 3:19 AM, prad wrote:

...

python requires setting the re.S flag

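Note that Haskell also allows you to do things like that, though you must compile the regexp with the proper flags; you can't use a simple string as a regexp anymore. With Text.Regex from regex-compat, mkRegexWithOpts "regexp" False False compiles with "single-line" semantics (i.e. "." will match \n). The pattern comes first, the first Bool controls the newline handling, and the second Bool is case sensitivity, so False there also makes the match case-insensitive.

-- 
Jedaï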
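A small sketch of the compiled-regex approach, assuming Text.Regex from the regex-compat package, where the function is spelled `mkRegexWithOpts` and takes the pattern as its first argument. The helper name `textBody` and the `<text>` pattern are illustrative assumptions.

```haskell
import Text.Regex (matchRegex, mkRegexWithOpts)

-- Second argument False: newline is not treated specially, so "." may match '\n'.
-- Third argument True: matching stays case-sensitive.
-- matchRegex returns the captured groups of the first match, if any.
textBody :: String -> Maybe [String]
textBody = matchRegex (mkRegexWithOpts "<text>(.*)</text>" False True)
```

With these options, `textBody "<text>\na\nb\n</text>"` should give `Just ["\na\nb\n"]`, where the default `mkRegex` would find no match across the newlines.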

9 comments

4 participants

participants (4)

Chaddaï Fouché
Daniel Fischer
Michael Mossey
prad