[maemo-community] [maemo2midgard] Wiki migration
From: Dave Neary bolsh at gnome.org
Date: Thu May 15 21:46:50 EEST 2008
- Previous message: [maemo2midgard] Minutes of maemo web monthly meeting, 13/5/2008
- Next message: [maemo2midgard] Wiki migration
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Hi all,

I've spent the day working on a strategy for migrating the wiki from Midgard/Markdown to MediaWiki, and with some help from Niels, I've got a proposal (not quite complete yet, but not far).

First off, I think we should split the steps of wiki clean-up and migration. Ideally, the migration should be mostly or completely automated, and the clean-up should be either pre-processing or post-processing.

The wiki clean-up will continue to live in https://maemo.org/community/wiki/wikireorg/ - I have a small team of volunteers who are prepared to help me with the wiki clean-up, and that's a decent place for us to start already.

So, without further ado, here's the migration plan for Midgard to MediaWiki:

Needs:
======

• Need a nice way to generate a list of pages
• Need a nice way to convert Midgard wiki text (Markdown) to MediaWiki
• Should probably do wiki clean-up work before migration

Converting a page from Markdown to MediaWiki:
=============================================

1. Install pandoc http://johnmacfarlane.net/pandoc/
   (apt-get install pandoc on Ubuntu 8.04)

2. Install HTML WikiConverter with MediaWiki dialect
   http://search.cpan.org/dist/HTML-WikiConverter/lib/HTML/WikiConverter.pm
   • apt-get install libhtml-wikiconverter-perl
   • perl -MCPAN -e 'install HTML::WikiConverter::MediaWiki'

3. Download the text of the page (for the purposes of the test: https://maemo.org/community/wiki/howdoibecomeroot/)
   • Page->Edit
   • copy wikitext
   • paste into a text file locally (HowDoIBecomeRoot.txt)
   • NEW! Thanks Niels: the three manual steps above can be replaced with
       wget http://maemo.org/community/wiki/source/HowDoIBecomeRoot/ -O HowDoIBecomeRoot.txt

4. Convert to HTML with pandoc
   • pandoc -f markdown -t html -o HowDoIBecomeRoot.html HowDoIBecomeRoot.txt

   Note: The markdown2html step isn't 100% reliable. Lines starting with " #!/bin/bash" get turned into h1s.

5. Convert from HTML to MediaWiki with WikiConverter:
   • html2wiki --dialect MediaWiki --encoding iso-8859-1 \
         --base-uri http://wiki.maemo.org/ \
         --wiki-uri http://wiki.maemo.org/ \
         HowDoIBecomeRoot.html > HowDoIBecomeRoot.wiki

6. Create the page in MediaWiki

7. Upload the wiki text to the MediaWiki page

This takes about 15 minutes *per page* because of all of the hassle of reading the source in Midgard and creating the page in MediaWiki.

Mass export from Midgard:
=========================

1. Get list of all wiki pages
   • Use the admin interface to query all wiki objects
   • Copy & paste filenames to a file

2. Run the conversion for every page in that file:

   # wikipages.txt stands for the file created in step 1 (the name is illustrative)
   while read -r wikipage; do
       wget "http://maemo.org/community/wiki/source/${wikipage}/" -O "${wikipage}.txt"
       pandoc -f markdown -t html -o "${wikipage}.html" "${wikipage}.txt"
       html2wiki --dialect MediaWiki --encoding iso-8859-1 \
           --base-uri http://wiki.maemo.org/ \
           --wiki-uri http://wiki.maemo.org/ \
           "${wikipage}.html" > "${wikipage}.wiki"
   done < wikipages.txt

Mass import into MediaWiki:
===========================

See http://meta.wikimedia.org/wiki/Help:Export and http://meta.wikimedia.org/wiki/Help:Import

We need to generate an XML file like this from all the pages: http://meta.wikimedia.org/wiki/Help:Export#Example - this is the one big remaining TODO.

See http://meta.wikimedia.org/wiki/Help:Import for importing the XML file. There's an option that has to be turned on in MediaWiki for this.

Comments? Suggestions? Improvements?

Cheers,
Dave.

--
Dave Neary
GNOME Foundation member
bolsh at gnome.org
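For the remaining TODO in the message above (generating one XML file from all the converted pages), a minimal sketch in shell might look like the following. It is not part of the original proposal. It assumes each page has already been converted to a PAGENAME.wiki file, uses the file name as the page title, and emits only a small subset of the elements shown in the Help:Export example (page, title, revision, text). The function names xml_escape, wiki_page and wiki_export are hypothetical, and a given MediaWiki version may want further elements such as a timestamp or contributor, so the output should be checked against Help:Import before use.

```shell
#!/bin/sh
# Sketch only: wrap converted *.wiki files in a MediaWiki-style import XML document.
# Function names and the reduced element set are assumptions, not a confirmed schema.

# Escape the characters that are special in XML text content ('&' first).
xml_escape() {
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Print one <page> element for the file in $1.
# The page title is the file name without its .wiki suffix.
wiki_page() {
    title=$(basename "$1" .wiki | xml_escape)
    printf '  <page>\n'
    printf '    <title>%s</title>\n' "$title"
    printf '    <revision>\n'
    printf '      <text xml:space="preserve">'
    xml_escape < "$1"
    printf '</text>\n'
    printf '    </revision>\n'
    printf '  </page>\n'
}

# Print the whole document for all files given as arguments.
wiki_export() {
    printf '<mediawiki xml:lang="en">\n'
    for f in "$@"; do
        wiki_page "$f"
    done
    printf '</mediawiki>\n'
}

# When run as a script with arguments, write the XML to standard output.
if [ "$#" -gt 0 ]; then
    wiki_export "$@"
fi
```

Saved under a name of your choosing (say wiki2xml.sh, also illustrative), it could be run after the mass export loop as `sh wiki2xml.sh *.wiki > import.xml`. Escaping `&` before `<` and `>` matters, since otherwise the ampersands introduced by the later substitutions would be escaped a second time.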