Author Topic: Locally accessible, readable HTML web archive of the wiki / New wiki restoration  (Read 375 times)

Offline bonion (OP)

  • Able Ordinary Rate
  • b
  • Posts: 3
  • Thanked: 6 times
I managed to get a significant portion of the old wiki into a more readable format than the XML dump. As described on Discord, there are some caveats:
  • Everything is in HTML, so you can read it without a local web server and database, and without reading the XML as raw text.
  • Every page is the most recent snapshot the web archive had before the site went down; however, not every page is available.
  • Hyperlinks, search bars, and anything relying on dynamic pages do not work. Images were removed, as ~90% of them were corrupted and not viewable.
  • I manually removed Template, Talk, and Edit pages, but I can get them if needed for the new wiki.
  • Since the search bar does not work, you can't use it to look up pages. Instead, the filename of every page is the request that produced it (the more savvy will see they are HTML pages generated via GET requests). To find a page, use your file explorer's search bar.
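
To make the filename scheme in the last point concrete, here is a minimal Python sketch that turns an archived filename back into a readable page title. It assumes the filenames are percent-encoded GET requests along the lines of index.php%3ftitle%3dAncient_races; the "index" prefix and the helper name page_title_from_filename are assumptions for illustration, so adjust them to what is actually in the archive.

```python
from urllib.parse import parse_qs, unquote


def page_title_from_filename(filename: str) -> str:
    """Best-effort guess at the wiki page title encoded in an archived filename."""
    name = filename
    # Drop a trailing ".html" if one is present.
    if name.lower().endswith(".html"):
        name = name[: -len(".html")]
    # Undo percent-encoding: "%3f" becomes "?", "%3d" becomes "=".
    decoded = unquote(name)
    # Everything after the first "?" is treated as the query string.
    _, _, query = decoded.partition("?")
    params = parse_qs(query)
    if "title" in params:
        return params["title"][0].replace("_", " ")
    # No title parameter found: fall back to the decoded name itself.
    return decoded.replace("_", " ")


if __name__ == "__main__":
    print(page_title_from_filename("index.php%3ftitle%3dAncient_races.html"))
```

Files that carry no title parameter simply come back as their decoded name, so the helper should be safe to run over a whole directory listing.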

The new wiki is also currently uneditable because it throws internal server errors when you try to create an account.

EDIT:
Important note:
Windows File Explorer also searches inside the contents of files, which is something you might not want it to do. To find pages with a particular word in their name, ignoring the contents of pages, use name:[word_you're_looking_for] in the search bar. This keyword is localized, so if your system language is something other than English, the keyword will be different. Wildcard characters are also allowed.
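
For anyone not on Windows, or who would rather not rely on the localized keyword, a name-only search with wildcards can be sketched in a few lines of Python. This is only an illustration: find_pages is a hypothetical helper, and the folder name in the usage line is a placeholder for wherever you unpacked the archive.

```python
import fnmatch
from pathlib import Path


def find_pages(archive_dir, pattern):
    """Return files under archive_dir whose *name* matches a wildcard pattern.

    Only filenames are compared (case-insensitively); file contents are never read.
    """
    pattern = pattern.lower()
    return sorted(
        p
        for p in Path(archive_dir).rglob("*")
        if p.is_file() and fnmatch.fnmatchcase(p.name.lower(), pattern)
    )


if __name__ == "__main__":
    # "wiki_archive" is a placeholder path, not the real folder name.
    for hit in find_pages("wiki_archive", "*races*"):
        print(hit)
```

The pattern uses shell-style wildcards (* and ?), so "*races*" matches any filename containing "races".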

EDIT 2:
The new wiki at https://aurorawiki2.pentarch.org is now functional!
« Last Edit: July 09, 2025, 05:57:04 AM by bonion »
 
The following users thanked this post: skoormit, lumporr, Ghostly, PastorOfMuppets

Offline bonion (OP)

WIKI code from XML dump
« Reply #1 on: July 09, 2025, 05:56:47 AM »
I took the XML dump and converted it all to raw MediaWiki text. This means that entire pages can simply be copy-pasted instead of slowly transcribed, which will make restoring the wiki considerably easier and faster.
It also contains information about the Templates used, though it appears the Templates themselves are not in there. We're going to have to recreate them from scratch.
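
For reference, a conversion like this can be sketched in Python as below. This is not necessarily how the attached files were produced. It assumes the dump follows the usual MediaWiki export layout (page elements containing a title and one or more revision/text elements), and pages_from_dump and templates_used are hypothetical helper names. The template regex is deliberately rough: it skips parser functions such as {{#if:...}} but will also pick up magic words.

```python
import re
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, e.g. '{http://...}page' becomes 'page'."""
    return tag.rsplit("}", 1)[-1]


def pages_from_dump(xml_text):
    """Yield (title, wikitext) per <page>, keeping the last <text> found in it."""
    root = ET.fromstring(xml_text)
    for page in root.iter():
        if local(page.tag) != "page":
            continue
        title, text = None, ""
        for el in page.iter():
            if local(el.tag) == "title":
                title = el.text
            elif local(el.tag) == "text":
                text = el.text or ""  # a later revision overwrites an earlier one
        yield title, text


# Matches the name part of {{Name}} or {{Name|...}}; names starting with '#' are skipped.
TEMPLATE_RE = re.compile(r"\{\{\s*([^|{}#]+?)\s*(?:\||\}\})")


def templates_used(wikitext):
    """Return the sorted, de-duplicated template names referenced in wikitext."""
    return sorted({m.group(1).strip() for m in TEMPLATE_RE.finditer(wikitext)})
```

Running templates_used over every page gives a list of the Templates that would need to be recreated. For a very large dump, ET.iterparse would be gentler on memory than loading the whole file at once.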
« Last Edit: July 09, 2025, 09:44:37 AM by bonion »
 
The following users thanked this post: lumporr

Offline bonion (OP)

I realized that while users may not need user talk pages, contribution histories, and so on, wiki editors probably will. Also, the images are not corrupt: the pointers to them inside the pages are invalid, but the images themselves are saved and openable.

Attached is the unfiltered, FULL, considerably larger dump of the web archive, images and all. The sole modification is that all pages made with GET requests that previously had a bad extension (.php%3ftitle%3dAncient_races, for example) have had .html appended to their name, so they are openable.
This dump is not useful for those who only wish to read the wiki.
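
For reference, the renaming step could be reproduced roughly like this. It is a minimal Python sketch, not necessarily how the attached dump was made. It assumes the affected files can be recognized by ".php%3f" appearing in their name; needs_html_suffix and append_html are hypothetical helper names, and the function defaults to a dry run so nothing is renamed by accident.

```python
from pathlib import Path


def needs_html_suffix(name):
    """True for names that look like saved GET requests and lack a .html ending."""
    lower = name.lower()
    return ".php%3f" in lower and not lower.endswith(".html")


def append_html(archive_dir, dry_run=True):
    """Append '.html' to matching files; return a list of (old_name, new_name)."""
    renamed = []
    # sorted() collects the full listing first, so renaming cannot disturb the walk.
    for path in sorted(Path(archive_dir).rglob("*")):
        if path.is_file() and needs_html_suffix(path.name):
            target = path.with_name(path.name + ".html")
            if not dry_run:
                path.rename(target)
            renamed.append((path.name, target.name))
    return renamed
```

Call append_html(folder) first to review the list of planned renames, then append_html(folder, dry_run=False) to apply them.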
 
The following users thanked this post: lumporr