Difference between revisions of "Rewrite PageCreationBot"

Revision as of 23:43, 23 August 2007

DevelopmentTeam

What (summary)

  • New page-building bot
  • Still relies on Java/Tomcat to do crawling (for now)
  • Carefully tested

Why this is important

  • We need to have control over the pages that are created on our site.
  • The old bot was known to pollute the database; we need control over all the access points that could screw up our data.
  • We need mastery over the code so that we can add new features easily.


DoneDone

  • Creates new pages based on a template
  • Monitoring and logging have been added (tests whether or not the bot succeeds)
  • Hooked in to all the points where the old bot was hooked in
  • Checks robots.txt before spidering the website.
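
The robots.txt check in the last item can be sketched as follows. This is only an illustration in Python using the standard library's urllib.robotparser; the actual crawler runs on Java/Tomcat and its code is not shown on this page. The user-agent name and the helper names build_parser and may_spider are assumptions made for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical user-agent name; the bot's real identifier is not given on this page.
BOT_USER_AGENT = "PageCreationBot"


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse the text of a robots.txt file that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_spider(parser: RobotFileParser, url: str) -> bool:
    """Return True if the parsed rules allow BOT_USER_AGENT to fetch the URL."""
    return parser.can_fetch(BOT_USER_AGENT, url)


if __name__ == "__main__":
    # Sample rules for illustration only: everything under /private/ is off limits.
    sample_rules = "User-agent: *\nDisallow: /private/\n"
    rules = build_parser(sample_rules)
    print(may_spider(rules, "http://example.com/index.html"))
    print(may_spider(rules, "http://example.com/private/page.html"))
```

In a sketch like this, the crawler would call may_spider for each URL before fetching it and skip any URL for which the check returns False.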

Bot insertion points into MediaWiki

  • /wiki/skins/common/generatePage.js (and some other JavaScript that we should remove)
  • /wiki/extensions/AboutUsDomainRedirect/SpecialRedirectToDomain.php

Retrieved from "http://aboutus.com/index.php?title=Rewrite_PageCreationBot&oldid=9151274"