For approximately 10 years, reviews.llvm.org served as the code review site for the LLVM project, running on a Phabricator instance. The website hosted numerous invaluable code review discussions. However, following LLVM's transition to GitHub pull requests, a read-only archive of the existing Phabricator instance is needed.
The goal is to avoid running a SQL engine. Phabricator operates on a complex database schema. To minimize time investment, the most feasible approach seems to be downloading the static HTML pages and applying a lightweight scraping process.
Raphaël Gomès developed phab-archive to serve a read-only archive for Mercurial's Phabricator instance. I have modified the code to suit reviews.llvm.org.
At this point, the only remaining requirement is for someone with domain access to redirect reviews.llvm.org to the archive website. Then we can obtain an HTTPS certificate.
Data
The file hierarchy is quite straightforward. archive/unprocessed/diffs contains raw HTML pages, while templates/diffs contains scraped HTML pages alongside patch files.
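
As a rough illustration of how the two directories relate, here is a minimal sketch that lists raw pages that do not yet have a scraped counterpart. It is not taken from phab-archive: the helper name pending_pages is hypothetical, and the sketch assumes that a scraped page keeps the same file name as its raw page, which may not match the actual layout.

```python
from pathlib import Path

# Directory names come from the description above.
RAW_DIR = Path("archive/unprocessed/diffs")
OUT_DIR = Path("templates/diffs")


def file_names(directory: Path) -> set[str]:
    """Return the names of regular files in `directory` (empty if it is missing)."""
    if not directory.is_dir():
        return set()
    return {p.name for p in directory.iterdir() if p.is_file()}


def pending_pages(raw_dir: Path = RAW_DIR, out_dir: Path = OUT_DIR) -> list[str]:
    """List raw pages with no same-named file in the scraped directory.

    Assumption (hypothetical): a scraped page reuses the raw page's file name.
    Extra files in `out_dir`, such as patch files, are simply ignored.
    """
    return sorted(file_names(raw_dir) - file_names(out_dir))
```

Calling pending_pages() from the repository root would return the raw pages still waiting to be scraped, under the naming assumption stated above.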