
I had thought about doing something like that, except that about 80% of the
time the words in the file won't have a delimiter, as in
"theblueskythegreenearth". I still need to be able to count "the", i.e. the
ordered sequence [ t, h, e ] within such a character string (or [ b, l,
u, e ], or whatever).
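To make the problem concrete, here is a minimal sketch of one way to count an ordered sequence inside an undelimited String: check every suffix of the input and count those that start with the pattern. The name countOccurrences is hypothetical, and the sketch assumes overlapping matches should each be counted.

```haskell
import Data.List (isPrefixOf, tails)

-- Count occurrences of a pattern inside a string with no delimiters.
-- Overlapping matches are counted; an empty pattern counts as 0.
-- (countOccurrences is a hypothetical name used for illustration.)
countOccurrences :: String -> String -> Int
countOccurrences pat str
  | null pat  = 0
  | otherwise = length (filter (pat `isPrefixOf`) (tails str))

main :: IO ()
main = do
  let input = "theblueskythegreenearth"
  print (countOccurrences "the" input)   -- prints 2
  print (countOccurrences "blue" input)  -- prints 1
```

This walks the string once per suffix, so it is fine for small files but may be slow on very large inputs.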
Again, thank you for your time & thanks in advance.
On Fri, Jan 22, 2021, 4:03 AM
Today's Topics:
1. Count Words from File (A. Mc.)
2. Re: Count Words from File (Francesco Ariis)
----------------------------------------------------------------------
Message: 1
Date: Thu, 21 Jan 2021 09:49:03 -0800
From: "A. Mc." <47dragonfyre@gmail.com>
To: beginners@haskell.org
Subject: [Haskell-beginners] Count Words from File
Message-ID:
Content-Type: text/plain; charset="utf-8"

Hello,
I apologize in advance if this is crossposting. My IRC client did not appear to be working properly.
I am new to Haskell and I need to find a way to count specific words in a file. The file could contain spaces between words, no spacing, uppercase, lowercase, etc., so I've standardized it: once the file is read in, I convert it to lowercase and remove the spacing. I've also read the postings about using ByteString instead of [Char], so I am trying to use that. But since Haskell still seems to treat the contents either as one fused string or as individual letters, I'm not entirely sure how to tackle this. The input after transforming would be something like "theblueskyisveryblue" for uniformity, and I would need to count "the" and "blue". It feels like I should be able to do a map and foldr(?), but I'm not sure how to get Haskell to recognize "the", for example, and not count all the t's, h's, e's, etc. in the file, nor am I entirely sure how to properly compose a map-fold for character arrays like this.
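The steps described above (lowercase, strip spacing, count a word) could be sketched with Data.ByteString.Char8 as follows. This is only an illustration under stated assumptions: the names normalize and countSub are hypothetical, the input is assumed to be ASCII, and matches are counted without overlap by repeatedly calling breakSubstring.

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Char (isSpace, toLower)

-- Lowercase the input and drop all whitespace (assumes ASCII text).
normalize :: B.ByteString -> B.ByteString
normalize = B.map toLower . B.filter (not . isSpace)

-- Count non-overlapping occurrences of a pattern.
-- breakSubstring splits at the first match; the second component
-- is empty when the pattern does not occur.
countSub :: B.ByteString -> B.ByteString -> Int
countSub pat = go 0
  where
    go n s
      | B.null pat = n
      | otherwise =
          let (_, rest) = B.breakSubstring pat s
          in if B.null rest
               then n
               else go (n + 1) (B.drop (B.length pat) rest)

main :: IO ()
main = do
  let text = normalize (B.pack "The blue SKY is very Blue")
  B.putStrLn text                        -- prints theblueskyisveryblue
  print (countSub (B.pack "the") text)   -- prints 1
  print (countSub (B.pack "blue") text)  -- prints 2
```

For a real file, the B.pack literal would be replaced by the result of B.readFile on the file's path.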
Thanks in advance and thank you for your time.