why is something different within a function when it comes out?

i have this:

    main = do
        c <- readFile "test.txt"
        let tleP = "<title>\n(.*)\n</title>"
        let tle = c=~tleP::[[String]]
        putStrLn $ tle!!0!!1
        let g = xtract tleP c
        putStrLn $ show g

    xtract p c = do
        let r = c=~p::[[String]]
        return r!!0!!0!!1

for the first putStrLn i get:

    League of Humane Voters Home Page

for the second i get:

    "League of Humane Voters Home Page"

but only after i have done r !!0 !!0 !!1, whereas i only needed tle !!0 !!1.

it seems to me that both tle and r serve the same purpose, though they come out as different types: r being a string and tle being i don't know what, because i get a "Not in scope: 'tle'". so even though both r and tle are set as [[String]], they don't seem to be the same creature.

--
In friendship,
prad

... with you on your journey
Towards Freedom
http://www.towardsfreedom.com (website)
Information, Inspiration, Imagination - truly a site for soaring I's

I'm fairly new, so I can't fully explain the behavior you got, but I know at least one thing you did wrong.

prad wrote:
i have this:

    main = do
        c <- readFile "test.txt"
        let tleP = "<title>\n(.*)\n</title>"
        let tle = c=~tleP::[[String]]
        putStrLn $ tle!!0!!1
        let g = xtract tleP c
        putStrLn $ show g

    xtract p c = do
        let r = c=~p::[[String]]
        return r!!0!!0!!1

You probably mean to write

    xtract p c = r !! 0 !! 1
      where r = c=~p::[[String]]

You used do-notation when you meant to write a simple function. The use of "return" in do-notation is one point of confusion for beginners. It does not operate like "return" in an imperative language. I'm not sure what your 'xtract' did, but the compiler probably accepted the do-notation because it specified a list monad of some sort.

I recommend you pay close attention to introductory examples, noticing in particular when they are unlike imperative languages. Find some tutorials on monads and the do-notation.

Best,
Mike

On Wed, Jul 14, 2010 at 7:33 AM, prad wrote:

    xtract p c = do
        let r = c=~p::[[String]]
        return r!!0!!0!!1

return is just a function in Haskell, and given that function application has priority over any operator, this means your last line is:

    (return r) !! 0 !! 0 !! 1

The type of return is "(Monad m) => a -> m a", where m is a type constructor variable (a type constructor is a parametrized type; think of an array parametrized on the element type in most languages, or a template in C++) constrained to be an instance of the Monad typeclass. Monad is an important typeclass in Haskell, but here it's enough to look at the type of (!!), "[a] -> Int -> a", to see that (return r) should be of a list type. List is a monad, so this code typechecks, but return is perfectly redundant.

For the list monad:

    return x = [x]

so in this case your last line puts r in a singleton list:

    [r] !! 0 !! 0 !! 1

then (!! 0) extracts r:

    r !! 0 !! 1

And you get what you wanted in the first place...

So you could write just "r !! 0 !! 1" instead of "return r!!0!!0!!1" for the same result. In fact xtract could be written:

    xtract p c = (c =~ p :: [[String]]) !! 0 !! 1

do-notation (which is just syntactic sugar to write monadic code easily) and return (which is just a function, not a keyword) are only useful when you're working with monads; if you're not, you shouldn't use them.

--
Jedaï
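The point about return in the list monad can be checked without any regex library. This is only a minimal sketch: the value r below is a hand-written stand-in for the regex result (an assumption, not output from the original program), and the names wrapped, viaReturn and direct are made up for illustration.

```haskell
-- Stand-in for the regex result: one match holding the full text and one capture group.
r :: [[String]]
r = [["<title>\nHome Page\n</title>", "Home Page"]]

-- `return r !! 0` parses as `(return r) !! 0`.
-- (!!) needs a list on its left, so return is the list monad's: return r = [r].
wrapped :: [[String]]
wrapped = return r !! 0

-- The extra `!! 0` only undoes the wrapping added by return.
viaReturn :: String
viaReturn = return r !! 0 !! 0 !! 1

-- Same result without return.
direct :: String
direct = r !! 0 !! 1
```

Here wrapped is equal to r, and viaReturn is equal to direct. The quotes in the second output of the original program come from show rather than from a type difference: putStrLn (show s) prints the string in its literal form, quotes included.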

On Wed, 14 Jul 2010 10:54:22 +0200, Chaddaï Fouché wrote:
In fact xtract could be written:

    xtract p c = (c =~ p :: [[String]]) !! 0 !! 1

On Wednesday 14 July 2010 22:41:00, prad wrote:
also, python had a re.sub so you can replace things using regex searches. how would you go about doing that in haskell?
There's subRegex in Text.Regex in the regex-compat package (don't know if it's also provided in other packages).
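A minimal sketch of what a re.sub-style replacement might look like with subRegex, assuming the regex-compat package is installed. The function name maskDigits and the pattern are made up for illustration.

```haskell
import Text.Regex (mkRegex, subRegex)

-- Replace every run of digits with "#",
-- roughly what Python's re.sub(r"[0-9]+", "#", s) would do.
maskDigits :: String -> String
maskDigits s = subRegex (mkRegex "[0-9]+") s "#"
```

For example, maskDigits "room 101, floor 3" gives "room #, floor #". Note the argument order: compiled regex first, then the input, then the replacement.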

prad wrote:
i looked at http://hackage.haskell.org/packages/archive/regex-tdfa/1.1.3/doc/html/Text-R... but am having difficulty figuring things out from the documentation - and there doesn't seem to be any multiline feature. surely there is some way to do this!
also, python had a re.sub so you can replace things using regex searches. how would you go about doing that in haskell?

For a parsing job, you might consider Parsec rather than regular expressions. Parsec is present by default with the Haskell Platform. I have not found any single comprehensive documentation on the latest version of Parsec... I had to ask people for help a lot.

Mike
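As a rough idea of what the Parsec route could look like for pulling out the text between a pair of tags, here is a small sketch. It assumes the parsec package (module Text.Parsec); the names tagBody and extractTag are hypothetical, and this is not a real HTML parser, just a scan for one opening and one closing tag.

```haskell
import Text.Parsec (anyChar, manyTill, parse, string, try)
import Text.Parsec.String (Parser)

-- Skip everything up to "<tag>", then collect characters until "</tag>".
-- anyChar also accepts '\n', so multi-line bodies need no special flag.
tagBody :: String -> Parser String
tagBody tag = do
  _ <- manyTill anyChar (try (string ("<" ++ tag ++ ">")))
  manyTill anyChar (try (string ("</" ++ tag ++ ">")))

-- Run the parser, turning a parse error into a printable message.
extractTag :: String -> String -> Either String String
extractTag tag doc = either (Left . show) Right (parse (tagBody tag) "input" doc)
```

With this, extractTag "title" applied to the contents of a file returns Right with the body (including any newlines inside it), or Left with an error message if either tag is missing.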

On Wed, 14 Jul 2010 17:28:18 -0700, Michael Mossey wrote:
For a parsing job, you might consider Parsec rather than regular expressions.

ok thx for this, because it's bound to be useful down the road.

i'm just trying to grab lines from something like this, the stuff between the <text> ... </text>:

    <text>
    The Mission of the League of Humane Voters® (LOHV) is to create, unite, and strengthen local political action committees, which work to enact animal-friendly legislation and elect candidates for public office who will use their votes and influence for animal protection.

    As election time comes around, council candidates in Independence, Ohio have taken the typical tactic of refusing to even answer questions regarding their position on various matters. Their 'silence speaks volumes'.
    </text>

i'm using what chaddai showed me:

    xtract p c = (c =~ p :: [[String]]) !! 0 !! 1

which works fine for single lines, but produces nothing for multiple lines - same with some of the other ways i tried it: single lines good, nothing for multiple.

python requires setting the re.S flag, which i always found strange since \n i thought is a char as well.

--
In friendship,
prad

On Thu, Jul 15, 2010 at 3:19 AM, prad wrote:
which works fine for single lines, but produces nothing for multiple lines - same with some of the other ways i tried it with single lines good, nothing for multiple. python requires setting the re.S flag which i always found strange since \n i thought is a char as well.

The problem is a classic in the regex world: by default "." matches any character except \n. I would suggest "<title>\n([^<]*)\n</title>", which is probably a bit more robust anyway.

Though you must be aware that parsing html (or any markup language) properly with regexps is just impossible in general, and you can only get crude and fragile approximations. There are proper html parsing libraries on hackage if your needs become too complex for simple regexps to handle.

--
Jedaï

On Thu, 15 Jul 2010 09:39:39 +0200, Chaddaï Fouché wrote:
Though you must be aware that parsing html (or any markup language) properly with regexp is just impossible in general and you can only get crude and fragile approximations.

i never realized that since my needs have been pretty simple and

There are proper html parsing libraries on hackage if your needs become too complex for simple regexp to handle.

i found tagsoup a while ago and i'm trying to learn how to use it better. it's a bit of an overkill for what i'm doing here which

On Thu, Jul 15, 2010 at 3:19 AM, prad wrote:
python requires setting the re.S flag

Note that Haskell also allows you to do things like that, though you must compile the regexp with the proper flags; you can't use a simple string as a regexp anymore. For example, mkRegexWithOpts "regexp" False False (from Text.Regex) compiles with "single-line" semantics, i.e. "." will match \n (the second False also makes the match case-insensitive).

--
Jedaï
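A minimal sketch of the compiled-regex approach applied to the <text> ... </text> case, assuming Text.Regex from the regex-compat package. The names textPattern and extractText are made up for illustration. One caveat worth hedging: under POSIX newline-sensitive matching (REG_NEWLINE), negated bracket expressions such as [^<] refuse to match a newline just like "." does, so if that mode is on by default in your backend, turning it off as done here may be the more reliable route.

```haskell
import Text.Regex (Regex, matchRegex, mkRegexWithOpts)

-- First Bool False: no line-oriented matching, so "." may match '\n'.
-- Second Bool True: matching stays case-sensitive.
textPattern :: Regex
textPattern = mkRegexWithOpts "<text>(.*)</text>" False True

-- matchRegex returns the capture groups of the first match, if any.
extractText :: String -> Maybe String
extractText doc =
  case matchRegex textPattern doc of
    Just (body : _) -> Just body
    _               -> Nothing
```

Because the match is greedy, a document with several <text> blocks would yield everything from the first opening tag to the last closing tag; for that case a parser or an HTML library is the safer choice.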
participants (4)
- Chaddaï Fouché
- Daniel Fischer
- Michael Mossey
- prad