
Introduction

Wadseeker is a library for searching for and automatically downloading a list of mods for Doom (1993). Wadseeker requires a small subset of the Qt Toolkit, namely the QtCore and QtNetwork modules. In addition, to interact with Wadseeker, you will need to use Qt to connect to the signals.

Usage

The steps required for use are:

  1. Initialize an instance of Wadseeker.
  2. Connect the Qt signals to your slots.
  3. Configure some values such as the sites to search in and target directory.
  4. Create a list of wads to search for and pass it into Wadseeker::startSeek().

The following example shows these steps.

Wadseeker wadseeker;
// ... connect the signals ...
wadseeker.setPrimarySitesToDefault(); // Use the default list of sites
wadseeker.setTargetDirectory("./");
QStringList wads;
wads << "doom2.wad"; // This will be ignored because of isForbiddenWad()
wads << "somemod.wad";
wads << "somemod.pk3";
wadseeker.startSeek(wads);

Wadseeker runs asynchronously. Wadseeker::startSeek() launches the seek session and returns true when the seek starts. Wadseeker then operates within Qt's event loop: it navigates the Web, parses websites looking for links, downloads files and archives, and installs or unpacks them. When it's done, successfully or not, it emits the Wadseeker::allDone() signal.

Any Wadseeker options must be set before Wadseeker::startSeek() is called. Once the seek session starts, any changes to the options are ignored for the currently running session.

Is it only for WADs?

The documentation often refers to "WADs", and the name of the library is "Wadseeker", but the library is not limited to WAD files. It can download anything that constitutes a Doom mod. So, wherever you see "WAD", in reality it also means "PK3" or even "DEH". It will also install "ZIP" or "7z" archives as-is, without unpacking them, if they are listed by their exact name and extension.

How does it work?

Wadseeker belongs to a type of tool known as a Web spider (or Web crawler). It specializes in searching (scraping) websites for links to Doom WADs.

While Wadseeker is scraping the Web, it is said to be in a "seek session".

Before a "seek session" can start, Wadseeker must first be told which WADs to look for; it's not designed to just go and download everything it encounters. Once it has the list of WADs to get, Wadseeker starts the search. For this purpose, it has an embedded (hardcoded) list of WAD hosting websites that it visits first unless told otherwise. This is where a seek session usually starts.

If the link it navigates to leads to a website, it scrapes this website looking for links that may lead to the sought WADs. When it finds a link it suspects of leading to a WAD download, it navigates to it and checks whether this link is another website or a binary file.
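The scraping step described above can be sketched roughly as follows. This is a minimal illustration using only the C++ standard library, not Wadseeker's actual parser: the functions extractLinks() and looksPromising() are hypothetical names, the regular expression only handles double-quoted href attributes, and treating a link as promising when it contains the wanted file's name without its extension is an assumption made for the example.

```cpp
#include <algorithm>
#include <cctype>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper, not part of the Wadseeker API.
// Collects the values of double-quoted href attributes from static HTML.
std::vector<std::string> extractLinks(const std::string &html)
{
    static const std::regex hrefPattern(R"re(href\s*=\s*"([^"]*)")re",
                                        std::regex::icase);
    std::vector<std::string> links;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(html.begin(), html.end(), hrefPattern);
         it != end; ++it)
    {
        links.push_back((*it)[1].str());
    }
    return links;
}

// Lower-cases a copy of the given text.
std::string toLower(std::string text)
{
    std::transform(text.begin(), text.end(), text.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return text;
}

// Hypothetical heuristic (an assumption for this sketch): a link is worth
// following if it contains the wanted file's name without the extension,
// compared case-insensitively.
bool looksPromising(const std::string &link, const std::string &wantedFile)
{
    const std::string nameWithoutExtension =
        wantedFile.substr(0, wantedFile.rfind('.'));
    return toLower(link).find(toLower(nameWithoutExtension)) != std::string::npos;
}
```

A page containing a link to "/files/somemod.zip" would yield that link from extractLinks(), and looksPromising() would accept it when the wanted file is "SomeMod.wad".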

If the link goes to another website, Wadseeker continues the scraping on that website. It will also follow HTTP redirects. However, there's a limit on the length of such link chains. Wadseeker doesn't let itself be led on a wild goose chase by prankster Webmasters ;).
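The idea of a link chain limit can be illustrated with a depth-limited, breadth-first crawl. The sketch below is an assumption-laden model, not Wadseeker's implementation: the FakeWeb type stands in for real HTTP requests, crawl() and maxDepth are hypothetical names, and the actual limit and traversal order used by Wadseeker are not stated in this documentation.

```cpp
#include <map>
#include <queue>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the Web: each URL maps to the links found on it.
typedef std::map<std::string, std::vector<std::string> > FakeWeb;

// Returns the URLs visited when starting at startUrl and following link
// chains of at most maxDepth links. Each URL is visited at most once.
std::vector<std::string> crawl(const FakeWeb &web,
                               const std::string &startUrl,
                               int maxDepth)
{
    std::vector<std::string> visitedOrder;
    std::set<std::string> seen;
    seen.insert(startUrl);

    std::queue<std::pair<std::string, int> > pending;
    pending.push(std::make_pair(startUrl, 0));

    while (!pending.empty())
    {
        const std::pair<std::string, int> current = pending.front();
        pending.pop();
        visitedOrder.push_back(current.first);

        // The chain limit: links found on pages at maxDepth are not followed.
        if (current.second == maxDepth)
            continue;

        const FakeWeb::const_iterator page = web.find(current.first);
        if (page == web.end())
            continue;

        for (const std::string &link : page->second)
        {
            // insert() reports whether the link was new; skip known URLs.
            if (seen.insert(link).second)
                pending.push(std::make_pair(link, current.second + 1));
        }
    }
    return visitedOrder;
}
```

With pages linking in a cycle a, b, c, d and back to a, a crawl from "a" with maxDepth 2 stops after visiting a, b and c. The set of seen URLs keeps the cycle from being walked again.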

If the link goes to a binary file, Wadseeker attempts to install it in the specified target directory. Downloaded files whose filenames match the sought WADs exactly are saved in the directory as-is. Archives are unpacked and scanned for the sought WADs. Downloads that do not match the sought WADs are discarded.
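The three outcomes described above can be modeled as a small decision function. This is only a sketch under stated assumptions: DownloadAction and classifyDownload() are hypothetical names, the filename comparison here is a plain case-sensitive equality check, and archives are recognized purely by a ".zip" or ".7z" suffix, which may differ from how Wadseeker actually detects them.

```cpp
#include <string>
#include <vector>

// Hypothetical outcome of handling one downloaded file.
enum class DownloadAction { SaveAsIs, UnpackAndScan, Discard };

// True if text ends with the given suffix.
bool endsWith(const std::string &text, const std::string &suffix)
{
    return text.size() >= suffix.size() &&
           text.compare(text.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Decides what to do with a downloaded file, given the list of wanted files.
DownloadAction classifyDownload(const std::string &downloadedName,
                                const std::vector<std::string> &wantedFiles)
{
    // An exact filename match is installed as-is, even if it is an archive.
    for (const std::string &wanted : wantedFiles)
    {
        if (wanted == downloadedName)
            return DownloadAction::SaveAsIs;
    }
    // Assumption: archives are recognized by extension only.
    if (endsWith(downloadedName, ".zip") || endsWith(downloadedName, ".7z"))
        return DownloadAction::UnpackAndScan;
    // Anything else does not match the wanted files.
    return DownloadAction::Discard;
}
```

Checking for an exact match first reflects that an archive requested by its exact name and extension is installed without being unpacked.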

Wadseeker declares that the "seek session" is completed when it either finds and installs all WADs that were requested (a success), or when it runs out of links to go to and websites to scrape (a failure). It can also be aborted prematurely on demand.

Aside from scraping websites, Wadseeker can also contact the API of the /idgames Archive and get the WADs from there.

It can also happen that Wadseeker's seek session is configured with a custom link going directly to the desired WAD. In such a case, the Web scraping is bypassed. Wadseeker may still do it, but its results may be discarded, as Wadseeker will immediately go for the direct download.

Limitations

  • Wadseeker can scrape links only from static HTML sites. It's not a Web browser and it doesn't run JavaScript, so it can't handle dynamically generated websites.
  • If Wadseeker fails to find the WAD, it doesn't mean that the WAD is not available online. The Web crawling is limited to the specified list of URLs. If Wadseeker runs out of them, it doesn't use search engines to look for more.
  • If a WAD is available on a website, but it's located in an archive with a slightly different name, Wadseeker will miss it. Imagine a case where you wish to obtain a file called Mod-Guns.wad, and this file belongs to a multi-file mod named "Mod". The author of "Mod" has put all this mod's files in an archive named "Mod.7z". Wadseeker has no knowledge of the structure of this particular mod (or any mods, really) and will not know to look for Mod-Guns.wad in Mod.7z. This limitation can be circumvented by looking for both Mod.wad (provided that there is Mod.wad in Mod.7z) and Mod-Guns.wad in a single seek session. Wadseeker will download Mod.7z looking for Mod.wad and, by chance, find Mod-Guns.wad there also.
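The archive naming limitation from the last point can be illustrated with a simplified matching rule. The sketch below assumes that an archive is considered only when its name without the extension equals the name without the extension of some wanted file, compared case-insensitively. Both that rule and the names lowerStem() and archiveWorthDownloading() are hypothetical and serve only to show why Mod.7z is skipped when Mod-Guns.wad is the sole wanted file.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Returns the filename without its last extension, in lower case.
std::string lowerStem(const std::string &fileName)
{
    // rfind() yields npos when there is no dot, so substr() keeps the whole name.
    std::string stem = fileName.substr(0, fileName.rfind('.'));
    std::transform(stem.begin(), stem.end(), stem.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return stem;
}

// Hypothetical rule: an archive is downloaded only if its stem equals
// the stem of at least one wanted file.
bool archiveWorthDownloading(const std::string &archiveName,
                             const std::vector<std::string> &wantedFiles)
{
    const std::string archiveStem = lowerStem(archiveName);
    return std::any_of(wantedFiles.begin(), wantedFiles.end(),
                       [&archiveStem](const std::string &wanted) {
                           return lowerStem(wanted) == archiveStem;
                       });
}
```

Under this rule, asking only for Mod-Guns.wad does not make Mod.7z a candidate, while asking for both Mod.wad and Mod-Guns.wad does. Once the archive is unpacked, both files can be picked up from it.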