
Hello GHCers,
As you may have noticed, the GHC Trac instance has recently been hit with a significant amount of spam traffic. I believe I've cleared up most of the garbage, and the filter configuration has been tightened so that future spam attacks will be easier to quell.
Of course, this tightening means that there is a greater chance of false positives. If you do find that your ticket comments or Wiki edits are being rejected by the spam filter, please notify either Austin or me so that we can train the filter on your example.
Cheers,
- Ben

Hi Ben,
Could we not have a captcha instead of a reject, to avoid false positives? That would require no training.
Since I assume most Trac spammers are extremely unsophisticated, a simple hardcoded question like "What programming language is GHC all about?" may be sufficient.
On 15/04/16 17:28, Ben Gamari wrote:
Hello GHCers,
As you may have noticed, the GHC Trac instance has recently been hit with a significant amount of spam traffic. I believe I've cleared up most of the garbage, and the filter configuration has been tightened so that future spam attacks will be easier to quell.
Of course, this tightening means that there is a greater chance of false positives. If you do find that your ticket comments or Wiki edits are being rejected by the spam filter, please notify either Austin or me so that we can train the filter on your example.
Cheers,
- Ben
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs

Niklas Hambüchen writes:
Hi Ben,
Could we not have a captcha instead of a reject, to avoid false positives? That would require no training.
Since I assume most Trac spammers are extremely unsophisticated, a simple hardcoded question like "What programming language is GHC all about?" may be sufficient.
The CAPTCHAs being broken are the reason why this incident occurred. I have added some more CAPTCHAs to try to dilute the pool of answers that they already know, but they still seem to solve them easily enough regardless. I can only imagine they have some sentient beings sitting at computers solving CAPTCHAs.
I don't really feel like we can make the CAPTCHAs themselves any more difficult without excluding real new users, which I really want to avoid.
Regardless, my goal here is to err on the side of less filtering, not more, even if this does mean more manual maintenance. To this end, I've configured the filters such that the probability of legitimate activity being suppressed should be negligible:
* I've been careful to only train the Bayes filter on obvious spam; I have tested it against various snippets from the wiki and mailing list and have yet to see it score anything legitimate with a spam likelihood > 5%.
* Even if the Bayes filter does deem your content to be spammy enough to warrant further attention, you will merely be asked to solve a CAPTCHA. Posts will not be outright rejected unless it is quite clear that they are spam.
I am optimistic that the filtering will have negligible effect on legitimate traffic. As a smoke test I managed to create a new account, open a new ticket, and start a new Wiki page without even needing to solve a CAPTCHA.
Cheers,
- Ben
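For readers who want the tiered policy described above spelled out, here is a minimal sketch in Python. It is not the Trac spam filter's actual code or configuration: the names `FilterPolicy`, `Action`, and `decide` are hypothetical, and both threshold values are assumptions chosen purely for illustration, since the thread only says that legitimate samples scored at or below 5% and that outright rejection is reserved for clear spam.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"    # post goes through untouched
    CAPTCHA = "captcha"  # poster is asked to solve a CAPTCHA first
    REJECT = "reject"    # post is refused outright


@dataclass(frozen=True)
class FilterPolicy:
    # Hypothetical thresholds; the thread does not state the real values.
    captcha_threshold: float = 0.50
    reject_threshold: float = 0.99

    def decide(self, spam_probability: float) -> Action:
        """Map a Bayes spam probability in [0, 1] to an action."""
        if not 0.0 <= spam_probability <= 1.0:
            raise ValueError("spam_probability must be in [0, 1]")
        if spam_probability >= self.reject_threshold:
            return Action.REJECT
        if spam_probability >= self.captcha_threshold:
            return Action.CAPTCHA
        return Action.ACCEPT


policy = FilterPolicy()
# A score around the 5% mark seen for legitimate samples passes untouched.
print(policy.decide(0.05))
```

The point of the middle tier is that a false positive from the Bayes filter costs a legitimate user one CAPTCHA rather than a lost post.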

Question: are we talking about CAPTCHA or reCAPTCHA?
My understanding is that reCAPTCHA is better than an old-school CAPTCHA.
Have we evaluated it as an option?
http://www.google.com/recaptcha/intro/index.html
On Saturday, April 16, 2016, Ben Gamari wrote:
Niklas Hambüchen writes:
Hi Ben,
Could we not have a captcha instead of a reject, to avoid false positives? That would require no training.
Since I assume most Trac spammers are extremely unsophisticated, a simple hardcoded question like "What programming language is GHC all about?" may be sufficient.
The CAPTCHAs being broken are the reason why this incident occurred. I have added some more CAPTCHAs to try to dilute the pool of answers that they already know, but they still seem to solve them easily enough regardless. I can only imagine they have some sentient beings sitting at computers solving CAPTCHAs.
I don't really feel like we can make the CAPTCHAs themselves any more difficult without excluding real new users, which I really want to avoid.
Regardless, my goal here is to err on the side of less filtering, not more, even if this does mean more manual maintenance. To this end, I've configured the filters such that the probability of legitimate activity being suppressed should be negligible:
* I've been careful to only train the Bayes filter on obvious spam; I have tested it against various snippets from the wiki and mailing list and have yet to see it score anything legitimate with a spam likelihood > 5%.
* Even if the Bayes filter does deem your content to be spammy enough to warrant further attention, you will merely be asked to solve a CAPTCHA. Posts will not be outright rejected unless it is quite clear that they are spam.
I am optimistic that the filtering will have negligible effect on legitimate traffic. As a smoke test I managed to create a new account, open a new ticket, and start a new Wiki page without even needing to solve a CAPTCHA.
Cheers,
- Ben

Carter Schonwald writes:
Question: are we talking about CAPTCHA or reCAPTCHA? My understanding is that reCAPTCHA is better than an old-school CAPTCHA.
Have we evaluated it as an option? http://www.google.com/recaptcha/intro/index.html
We are using a GHC-specific CAPTCHA generator written by thomie.
We could switch to reCAPTCHA if there is a compelling reason, but at the moment things seem to work reasonably well so I'm not inclined to touch it.
Cheers,
- Ben

Gotcha.
On Sunday, April 17, 2016, Ben Gamari wrote:
Carter Schonwald writes:
Question: are we talking about CAPTCHA or reCAPTCHA? My understanding is that reCAPTCHA is better than an old-school CAPTCHA.
Have we evaluated it as an option? http://www.google.com/recaptcha/intro/index.html
We are using a GHC-specific CAPTCHA generator written by thomie.
We could switch to reCAPTCHA if there is a compelling reason but at the moment things seem to work reasonably well so I'm not inclined to touch it.
Cheers,
- Ben