CAPTCHA For LinkSpam

OurWork CAPTCHA For LinkSpam (Hassan Javeed)


Project: SpamMitigation

What (summary)

Require a captcha when an anonymous user adds an external link (or external image link) to a page, and when anyone creates a new account.

Why this is important

For four days in a row we've been getting hit by 1000-4000 spambot edits per day. Each time this happens it takes a person-week to deal with. If this type of edit were prevented, we could put those people on creative "high yield" work rather than the drudgery of undoing spam.

Applying this to external image links as well has three benefits: a) it encourages people to create an account and upload images directly; b) internal images can be resized, while external image links don't resize as well; c) external images are more susceptible to changes (servers going down, malicious image changes, etc.).

Additionally, from a broader standpoint, we should be encouraging internal links (images and wiki links) over external ones anyway, and the captcha system might help point us in this direction.

DoneDone

  • Anonymous edits that add external links (including images) require a captcha (may be disabled by flag)
  • Require captcha on account creation (may be disabled by flag)
  • Tweak request: Provide a flag that allows us to turn CAPTCHAs on and off for ALL anonymous edits to combat random spam attacks
  • Tweak request: Remove the requirement that logged-in users have to complete the CAPTCHA
    • Logged-in users only have to complete a captcha when an edit trips the spam filter ("adds spam"), not when adding external links

Links

Steps to get to DoneDone

  • Get the site-specific encryption keys (public and private) for AboutUs.org
  • Add the extension directory
  • Modify LocalSettings.php to include the extension
  • Check for conflicts with software modifications
  • Review the permissions settings for the extension
  • Test the Captcha with Sysop users, logged in users and non-logged in users
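
The "Modify LocalSettings.php" step above can be sketched roughly as follows. This is a minimal sketch, assuming the extension was unpacked into extensions/ConfirmEdit under the wiki root and is loaded with require_once, as PHP-file extensions of this era usually were. The settings that hold the public and private keys depend on the captcha backend and are not named on this page, so they are left out.

```php
// LocalSettings.php (sketch; the directory name is an assumption)
// $IP is the MediaWiki installation path that LocalSettings.php already defines.
require_once( "$IP/extensions/ConfirmEdit/ConfirmEdit.php" );

// Overrides go AFTER the require_once line; ConfirmEdit.php assigns its
// defaults when it is loaded and would overwrite anything set earlier.
$wgCaptchaTriggers['addurl']        = true; // captcha on edits that add URLs
$wgCaptchaTriggers['createaccount'] = true; // captcha on account creation
```

Both values shown are already the extension's defaults; they are spelled out only to show where overrides belong.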

Testing

http://msn.com

Config Info

ConfirmEdit introduces a 'skipcaptcha' permission for $wgGroupPermissions. It is useful for groups that should never see captchas (bots and sysops).

Defaults from ConfirmEdit.php:

$wgGroupPermissions['*'            ]['skipcaptcha'] = false;
$wgGroupPermissions['user'         ]['skipcaptcha'] = false;
$wgGroupPermissions['autoconfirmed']['skipcaptcha'] = false;
$wgGroupPermissions['bot'          ]['skipcaptcha'] = true; // registered bots
$wgGroupPermissions['sysop'        ]['skipcaptcha'] = true;
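
As a hedged example, the tweak request to spare logged-in users could be expressed by overriding one of these defaults in LocalSettings.php, after the extension is loaded. Whether AboutUs did it this way is not recorded on this page. Note that 'skipcaptcha' skips every trigger, so it cannot by itself express "captcha only when the spam filter is tripped".

```php
// LocalSettings.php, after ConfirmEdit has been loaded (sketch)

// Option A: every logged-in account skips captchas.
// Caveat: a spambot that manages to register is then never challenged on edits.
$wgGroupPermissions['user']['skipcaptcha'] = true;

// Option B (more conservative): only accounts that MediaWiki has
// autoconfirmed skip captchas. Use this instead of option A, not in addition.
// $wgGroupPermissions['autoconfirmed']['skipcaptcha'] = true;
```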

There are five triggers that can generate a captcha, each covering a different situation.

  • $wgCaptchaTriggers['edit'] = true; -- Check on every edit.
  • $wgCaptchaTriggers['create'] = true; -- Check on page creation.
  • $wgCaptchaTriggers['addurl'] = true; -- Check on edits that add URLs.
  • $wgCaptchaTriggers['createaccount'] = true; -- Check on account creation.
  • $wgCaptchaTriggers['badlogin'] = true; -- Check after a failed log-in attempt.

Default triggers from ConfirmEdit.php:

$wgCaptchaTriggers['edit']          = false; 
$wgCaptchaTriggers['create']        = false; 
$wgCaptchaTriggers['addurl']        = true; 
$wgCaptchaTriggers['createaccount'] = true;
$wgCaptchaTriggers['badlogin']      = true;
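
For illustration, an "under attack" configuration using only these triggers might look like the sketch below. It assumes the override is placed in LocalSettings.php after the extension is loaded. With the default group permissions, it challenges every user who lacks 'skipcaptcha', including logged-in users, not just anonymous ones.

```php
// LocalSettings.php, after ConfirmEdit has been loaded (sketch)

// During a spam wave: challenge every edit and every page creation.
$wgCaptchaTriggers['edit']   = true;
$wgCaptchaTriggers['create'] = true;

// After the wave: return to the defaults, leaving only the
// 'addurl', 'createaccount' and 'badlogin' triggers active.
// $wgCaptchaTriggers['edit']   = false;
// $wgCaptchaTriggers['create'] = false;
```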

LocalSettings.php

//Used for enabling and disabling captcha
$enableCaptcha                    = true;  // enable/disable entire captcha framework (default is false)
$captchaReplacesSpamFilter        = true;  // uses a captcha wherever the spam filter would have been triggered
$captchaCreateAccount             = true;  // uses a captcha when a new account is created
$captchaBadLogin                  = true;  // uses a captcha when user does a bad login 
$captchaFreeTime                  = 0;     // After a successful captcha, won't have another captcha for at least this many seconds
$captchaSysopEdits                = false; // Every edit for every user
$captchaSysopAddsExternalLink     = false; // Every addition of an external link for every user
$captchaSysopAddsSpam             = true;  // Every new page title (including move to) or added word that trips the spam filter for every user
$captchaLoggedInEdits             = false; // Every edit for every NON-SYSOP user has a captcha
$captchaLoggedInAddsExternalLink  = false; // Every addition of an external link for every NON-SYSOP user
$captchaLoggedInAddsSpam          = false; // Every new page title (including move to) or added word that trips the spam filter for every NON-SYSOP user
$captchaAnonymousEdits            = true; // Every edit for every anonymous user has a captcha
$captchaAnonymousAddsExternalLink = false; // Every addition of an external link for every anonymous user
$captchaAnonymousAddsSpam         = false; // Every new page title (including move to) or added word that trips the spam filter for every anonymous user
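
The values above challenge every anonymous edit, which matches the "turn CAPTCHAs on for ALL anonymous edits" tweak request. A calmer setting, closer to the original DoneDone item, might look like the following sketch. It assumes these site-specific flags behave exactly as their comments describe. They are AboutUs variables, not part of ConfirmEdit, and the 30-minute grace period is only the figure suggested in the discussion on this page.

```php
// LocalSettings.php (sketch using the site-specific flags listed above)

$captchaAnonymousEdits            = false; // stop challenging every anonymous edit
$captchaAnonymousAddsExternalLink = true;  // challenge only anonymous edits that add an external link
$captchaFreeTime                  = 1800;  // 30 minutes * 60 seconds without a repeat captcha
```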

Image Test

logo_npr_125.gif ambien

Captcha Issues

Tried to save the original bot scrape version of this page: 1stok.com. Captcha came up and didn't work no matter how many times I tried. I was able to save the page as a blank page. Isabel 17:27, 21 September 2007 (PDT)

Pages are still getting spammed by IP: PurchasePhentermine.info, UK-Wedding-Directory.com, Qinfeng-agri.com. Does that mean a problem with the captcha, or does it mean it's a human being adding these links? TedErnst | talk 09:40, 22 September 2007 (PDT)
Captcha comes up anytime I try to add the PossibleSpamSite, which is a bit annoying. Nathan (talk) 18:47, 24 September 2007 (PDT)
Unfortunately, my guess is that's going to be the situation, since that template has an external link in it. Maybe the Dev guys can cook up some sort of exception, but then the problem is that it becomes a way for the ne'er-do-wells to exploit. No one said the captcha wasn't going to have some pains, sorry it's affecting the constructive (and building) stuff you're doing. -- TakKendrick
Aha. I forgot about the obvious external link in the template. Makes sense now. Nathan (talk) 19:01, 24 September 2007 (PDT)
Is it possible to have a whitelist? TedErnst | talk 14:11, 26 September 2007 (PDT)
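
On the whitelist question: ConfirmEdit has a $wgCaptchaWhitelist setting, a regular expression matched against added URLs so that known-good sites do not trigger the 'addurl' captcha. The sketch below assumes the installed version of the extension supports it and that the template's link points at aboutus.org. The domain is only a placeholder, since the real target of the PossibleSpamSite link is not given here.

```php
// LocalSettings.php, after ConfirmEdit has been loaded (sketch)

// URLs matching this regex do not count as "added URLs" for the captcha.
// aboutus.org is a placeholder domain chosen for illustration.
$wgCaptchaWhitelist = '#^https?://([a-z0-9-]+\.)?aboutus\.org/#i';
```

A narrow pattern like this limits the exploit risk raised above, because only links to the listed domain bypass the check.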


  1. I really don't like the way it says, "don't spam, read books". Can it be changed somehow?
  2. Can't we change its settings so that it doesn't appear for a logged-in person, or at least only reappears after some time, like half an hour? We seem to be having fewer edits since CAPTCHA. Is there a way of measuring it? Asad
  3. Also, Mark suggested adding text that explains why the user is being asked to enter a captcha, especially as not all types of edits require captchas. Kasey

Discussion

Interesting. Only have to type one of the words. TedErnst ... Are you sure? I tried it, and when I entered one word I actually had to fill in two captchas in order to get my http://awesome-but-spammy-site-message.com to save. But then, the next time it was actually only one.

Not including Sysops in the CAPTCHA trap would be very useful. Can we make this happen? Kasey and MarkDilley



