Rewrite PageCreationBot
Revision as of 00:21, 26 October 2007 by VartanSimonian
What (summary)
- New page-building bot
- Still relies on Java/Tomcat to do crawling (for now)
- Carefully tested
Why this is important
- We need to have control over the pages that are created on our site.
- The old bot was known to pollute the database; we need control over all the access points that could screw up our data.
- Gaining mastery over the code so that we can add new features easily.
DoneDone
- Creates new pages based on a template
- Monitoring and logging have been added (tests whether or not the bot succeeds)
- Output goes to a log file, either on each squal box (with aggregation) or on an NFS volume. Have emailed Ethan and Michael about this.
- Hooked into all the points where the old bot was
- Not exactly the same points, but the same end-user functionality.
- Projects:BotTest problems fixed
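As a rough illustration of the template and logging items above, here is a minimal Ruby sketch. It is not the code in pagecreationbot.rb: the template text, the build_page helper, and the log messages are all hypothetical, and a StringIO stands in for the log file so the example runs on its own.

```ruby
require "logger"
require "stringio"

# Hypothetical template; the real bot's template is not shown on this page.
PAGE_TEMPLATE = "== %{domain} ==\n%{description}\n"

# Fill the template for one domain and log whether the build succeeded.
# Returns the page text on success, nil on failure.
def build_page(domain, description, logger)
  if domain.nil? || domain.strip.empty?
    logger.error("page build failed: missing domain")
    return nil
  end
  text = format(PAGE_TEMPLATE, domain: domain, description: description.to_s)
  logger.info("page built for #{domain}")
  text
end

log_output = StringIO.new        # stand-in for the log file (assumption)
logger = Logger.new(log_output)
page = build_page("example.org", "An example site.", logger)
puts page
```

Pointing Logger.new at a file path instead of the StringIO would write the same messages to disk, which is one way the per-box or NFS log file could be fed.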
Bot insertion points into Mediawiki
- /wiki/skins/common/generatePage.js (and some other JavaScript that we should remove)
- /wiki/extensions/AboutUsDomainRedirect/SpecialRedirectToDomain.php (deprecate and point to CaseSpace)
- /wiki/extensions/CaseSpace/CaseSpace.php (Ultimately, here is where the magic will happen.)
- /wiki/extensions/AboutUsBuildDomain/AboutUsBuildDomain.php should be the best place to keep it.
Schema
- New schema location: http://images.aboutus.org/images/b/be/Aboutusbot_new.zip. It's an SQL file, not a compressed one.
Issues
- Umar and Ghufran are on this project for now. We are still not sure what we have to do in this project. For now, exploring the database tables and getting a feel for the code in pagecreationbot.rb seems like a good idea, but we need more details on the description, scope, and timeline for the project.
Discussion
- I heard a rumor of a possible change in format for new pages. Is this true? Where is the discussion about the new format possibilities happening? TedErnst | talk 13:50, 25 October 2007 (PDT)
- I think that the bot is still using the <graphic> tag instead of the <email> tag with the new name. Please correct me if I'm wrong. :) Vartan 17:21, 25 October 2007 (PDT)