09-17-2018, 05:39 PM
The Zybez Forums ( http://forums.zybez.net ) should now be closed to new posts. There's no telling how long the forums will be kept online beyond this. For the immediate foreseeable future, it seems they will stay online (as a read-only archive). But beyond a year, I'd say it looks bleak. For that eventuality, I've gone ahead and done something... but first let me tell you some facts:
1. Archive.org has only saved a few pages of the website over the years. We can manually request that it archive individual pages, but saving the whole site that way would take forever, and Archive.org probably has limits in place for people attempting this sort of thing. It's safe to say that Archive.org will stay online forever. So, copies (of certain pages) of Zybez Forums will always be online. These copies will probably outlast any other archives of Zybez Forums in the long run.
2. I was not given the Zybez Forums database. Had I been given it, I could have put it online and the site would remain open and online.
3. So, the only other thing I could think of to ensure that we have the most complete possible archive of the site is to scrape it.
Scraping means that a crawler (like a bot with a web browser) is gonna try to browse Zybez Forums, clicking links randomly, and saving the pages one by one along with whatever pictures, JavaScript files, or anything else it can save. Here's the problem: Zybez Forums has over a decade's worth of posts. Forget the posts, the memberlist alone is so long that scraping each of the user profile pages will take several forevers.
Nevertheless, I decided to give it a try anyway.
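For anyone curious what "scraping" looks like in practice, here is a minimal sketch of the idea in Python. This is not the actual scraper that's running, just an illustration. The names `LinkExtractor`, `crawl`, `fetch` and `max_pages` are made up for the example, and unlike the description above it follows links in the order it finds them (breadth-first) rather than randomly. It also only collects page HTML and ignores pictures and scripts.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Crawl one host breadth-first and return {url: html}.

    fetch is a hypothetical callable: it takes a URL and returns the
    page's HTML as a string, or None if the page could not be loaded.
    """
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}          # URLs already queued, so nothing is fetched twice
    saved = {}
    while queue and len(saved) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        saved[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            link, _fragment = urldefrag(urljoin(url, href))
            # Stay on the same host; skip pages we already know about.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return saved
```

In real use, `fetch` would wrap an actual HTTP request (for example with `urllib.request.urlopen`) and each page would be written to disk instead of kept in memory. The `seen` set is what keeps the crawler from looping forever on pages that link back to each other.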
As of this post, the scraper has already saved over 2 GB worth of content, but I suspect that's not even a small fraction of the whole thing. It's going slowly so as not to overload any servers/networks, but it's going steady. I'll keep it running so the archive will get more and more complete.
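"Going slowly" basically means waiting between requests. A rough sketch of one way to do that in Python is below. The `Throttle` class and the delay value are assumptions for illustration only, not the settings the real scraper uses.

```python
import time


class Throttle:
    """Makes sure at least `delay` seconds pass between requests."""

    def __init__(self, delay, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self.clock = clock      # injectable so the class can be tested without waiting
        self.sleep = sleep
        self.last = None        # time of the previous request, None before the first

    def wait(self):
        now = self.clock()
        if self.last is not None:
            remaining = self.delay - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last = now
```

Calling `throttle.wait()` right before every page request caps the request rate: with `Throttle(2.0)` (a made-up number) the server sees at most one request every two seconds, no matter how fast the crawler could otherwise go.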
Fun fact: the scraper is saving EVERYTHING, including remotely hosted pictures. That means those screenshots or signatures or whatever in the posts are also being archived.
If you're interested in seeing it, here it is:
https://zybez.runescapecommunity.com/forums.zybez.net/index.html