Difference between revisions of "Rewrite PageCreationBot"

(flagged as noextra)
 

(21 intermediate revisions by 9 users not shown)

Line 1: Line 1:
<noinclude><big>[[OurWork]] < [[DevelopmentTeam]] < [[DevelopmentTeamPriorities|Priorities]] < </noinclude>('''3''') [[Rewrite PageCreationBot]] ('''[[Who?|?]]''') {{JustTinyEditIcon|Rewrite PageCreationBot}}<noinclude></big>
+
<noinclude><big>[[OurWork]] < [[DevelopmentTeam]] < [[DevelopmentTeamPriorities|Priorities]] < </noinclude>('''2''') [[Rewrite PageCreationBot]] ('''[[Mohammad Ghufran|Ghufran]]''', '''[[Umar Sheikh]]''') {{JustTinyEditIcon|Rewrite PageCreationBot}}<noinclude></big>
 
__NOTOC__
 
__NOTOC__
 
== What (summary) ==
 
== What (summary) ==
Line 6: Line 6:
 
* Still relies on Java/Tomcat to do crawling (for now)
 
* Still relies on Java/Tomcat to do crawling (for now)
 
* Carefully tested
 
* Carefully tested
 +
 +
== Current Status ==
 +
* <s>Creates new pages based on a template</s>
 +
* <s>Monitoring and Logging has been added</s>
 +
* <s>Test cases added</s>
 +
* We have created a sample page which is a rough sketch of how a page looks like after being created by the bot. [[PageCreationBot_Sample | Here...]]
 +
* The current version of the PageCreationBot is not using the thumbnail extracted from Alexa. It is currently using the thumbnail tag being used in the Domain_Page template.
 +
** This can be changed by using the get_thumbnail function that is already in place.
  
 
== Why this is important ==
 
== Why this is important ==
 
+
* We need to have control over the pages that are created on our site.
* We need to have control over the pages that our created on our site.
 
 
* The old bot was known to pollute the database; we need control over all the access points that could screw up our data.
 
* The old bot was known to pollute the database; we need control over all the access points that could screw up our data.
 
* Gaining mastery over the code so that we can add new features easily.
 
* Gaining mastery over the code so that we can add new features easily.
 
  
 
== [[DoneDone]] ==
 
== [[DoneDone]] ==
 
* Creates news pages based on a template
 
* Creates news pages based on a template
 
* Monitoring and logging have been added (tests whether or not the bot succeeds)
 
* Monitoring and logging have been added (tests whether or not the bot succeeds)
 +
** Output to a log file.  Either on each squal box (with aggregation) or an NFS volume.  Have emailed Ethan and Michael about this.
 
* Hooked in to all the old points Bot was
 
* Hooked in to all the old points Bot was
 +
** Not exactly the same points, but the same end-user functionality.
 
* [[Projects:BotTest]] problems fixed
 
* [[Projects:BotTest]] problems fixed
  
 
== Bot insertion points into Mediawiki ==
 
== Bot insertion points into Mediawiki ==
* /wiki/skins/common/generatePage.js (and some other javascript that we should remove)
+
* <strike>/wiki/skins/common/generatePage.js (and some other javascript that we should remove)</strike>
* /wiki/extensions/AboutUsDomainRedirect/SpecialRedirectToDomain.php (deprecate and point to CaseSpace)
+
* <strike>/wiki/extensions/AboutUsDomainRedirect/SpecialRedirectToDomain.php (deprecate and point to CaseSpace)</strike>
* /wiki/extensions/CaseSpace/CaseSpace.php (Ultimately, here is where the magic will happen.)
+
* <strike>/wiki/extensions/CaseSpace/CaseSpace.php (Ultimately, here is where the magic will happen.)</strike>
 +
* /wiki/extensions/AboutUsBuildDomain/AboutUsBuildDomain.php should be the best place to keep it.
  
[[Category:DevelopmentTask]]
+
== Schema ==
 +
* New schema location http://images.aboutus.org/images/b/be/Aboutusbot_new.zip. Its an sql file and not a compressed one.
 +
==Discussion==
 +
* I heard rumor of a possible change in format for new pages.  Is this true?  Where is the discussion about the new format possibilities happening?  [[User:TedErnst|TedErnst]] | <small>[[User talk:TedErnst|talk]]</small> 13:50, 25 October 2007 (PDT)
 +
* I think that the bot is still using <nowiki><graphic></nowiki> tag instead of the tag <nowiki><email></nowiki> with the new name. Please correct me if I'm wrong. :) {{IconSig|Vartan|17:21, 25 October 2007 (PDT)}}
 +
[[Category:OpenTask]]
 +
[[Category:DevelopmentTeam]]
 
</noinclude>
 
</noinclude>

Latest revision as of 11:31, 19 December 2013

What (summary)

  • New page-building bot
  • Still relies on Java/Tomcat to do crawling (for now)
  • Carefully tested

Current Status

  • Creates new pages based on a template
  • Monitoring and logging have been added
  • Test cases added
  • We have created a sample page (PageCreationBot_Sample) that gives a rough sketch of what a page looks like after the bot creates it.
  • The current version of the PageCreationBot does not use the thumbnail extracted from Alexa; it uses the thumbnail tag from the Domain_Page template instead.
    • This can be changed by using the get_thumbnail function that is already in place.
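
The thumbnail choice described above could look roughly like the following. This is a minimal Python sketch, not the bot's actual code: the template text, the `build_page` and `default_thumbnail` helpers, the `<thumbnail>` tag syntax, and the assumed signature of `get_thumbnail` (domain in, wiki markup or nothing out) are all made up for illustration. Only the name `get_thumbnail` and the Domain_Page template come from the notes.

```python
from string import Template
from typing import Callable, Optional

# Hypothetical stand-in for the Domain_Page template; the real template
# text is not reproduced on this page.
DOMAIN_PAGE_TEMPLATE = Template(
    "== $domain ==\n"
    "$thumbnail\n"
    "$description\n"
)


def default_thumbnail(domain: str) -> str:
    """Placeholder for the thumbnail tag the template uses today (assumed syntax)."""
    return f"<thumbnail>{domain}</thumbnail>"


def build_page(
    domain: str,
    description: str,
    get_thumbnail: Optional[Callable[[str], Optional[str]]] = None,
) -> str:
    """Fill the template, preferring an extracted thumbnail when one is available.

    get_thumbnail is assumed to take a domain and return wiki markup for the
    thumbnail, or None/"" when nothing could be extracted.
    """
    thumbnail = get_thumbnail(domain) if get_thumbnail is not None else None
    if not thumbnail:
        # Fall back to the current behaviour: the template's own thumbnail tag.
        thumbnail = default_thumbnail(domain)
    return DOMAIN_PAGE_TEMPLATE.substitute(
        domain=domain, thumbnail=thumbnail, description=description
    )
```

Calling `build_page("example.com", "An example site.")` keeps the template tag, while passing a `get_thumbnail` callable switches to the extracted thumbnail without touching the template itself.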

Why this is important

  • We need to have control over the pages that are created on our site.
  • The old bot was known to pollute the database; we need control over all the access points that could screw up our data.
  • Gaining mastery over the code so that we can add new features easily.

DoneDone

  • Creates new pages based on a template
  • Monitoring and logging have been added (tests whether or not the bot succeeds)
    • Output goes to a log file, either on each squal box (with aggregation) or on an NFS volume. Have emailed Ethan and Michael about this.
  • Hooked in at all the points where the old bot was
    • Not exactly the same points, but the same end-user functionality.
  • Projects:BotTest problems fixed
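
The two logging options above (one log file per box, aggregated later, or a single file on a shared volume) might be sketched as follows. This uses Python's standard `logging` module; the logger name, the file naming scheme, and the `make_bot_logger` and `record_result` helpers are assumptions for illustration, since the notes do not say how the bot actually writes its log.

```python
import logging
import os
import socket


def make_bot_logger(log_dir: str, per_host: bool = True) -> logging.Logger:
    """Return a logger that appends to a file inside log_dir.

    per_host=True gives each box its own file (to be aggregated later);
    per_host=False uses one shared file name, e.g. on a shared volume.
    """
    os.makedirs(log_dir, exist_ok=True)
    if per_host:
        file_name = f"pagecreationbot-{socket.gethostname()}.log"
    else:
        file_name = "pagecreationbot.log"
    path = os.path.join(log_dir, file_name)

    # One logger per target file, so repeated calls do not stack handlers.
    logger = logging.getLogger(f"pagecreationbot.{path}")
    logger.setLevel(logging.INFO)
    if not logger.handlers:
        handler = logging.FileHandler(path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def record_result(
    logger: logging.Logger, title: str, succeeded: bool, detail: str = ""
) -> None:
    """Log whether the bot managed to create the page with the given title."""
    if succeeded:
        logger.info("created page %s", title)
    else:
        logger.error("failed to create page %s: %s", title, detail)
```

With per-host files, nothing is shared between boxes at write time, at the cost of a separate aggregation step; the single-file variant trades that step for a dependency on the shared volume.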

Bot insertion points into MediaWiki

  • /wiki/skins/common/generatePage.js (and some other javascript that we should remove) (struck out in the latest revision)
  • /wiki/extensions/AboutUsDomainRedirect/SpecialRedirectToDomain.php (deprecate and point to CaseSpace) (struck out in the latest revision)
  • /wiki/extensions/CaseSpace/CaseSpace.php (Ultimately, here is where the magic will happen.) (struck out in the latest revision)
  • /wiki/extensions/AboutUsBuildDomain/AboutUsBuildDomain.php should be the best place to keep it.

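Schema

  • New schema location: http://images.aboutus.org/images/b/be/Aboutusbot_new.zip. It is a plain SQL file, not a compressed one.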

Discussion

  • I heard a rumor of a possible change in format for new pages. Is this true? Where is the discussion about the new format possibilities happening? TedErnst | talk 13:50, 25 October 2007 (PDT)
  • I think that the bot is still using the <graphic> tag instead of the tag with the new name, <email>. Please correct me if I'm wrong. :) Vartan 17:21, 25 October 2007 (PDT)

Retrieved from "http://aboutus.com/index.php?title=Rewrite_PageCreationBot&oldid=40118904"