This page describes the Background and Aims of these pages and the Methodology behind them. Numbers then considers how to quantify writing, followed by an explanation of Terminology and FAQ.

1. Background and Aims

I know what you are thinking: Why? Why attempt to categorise and rank writing about video games?

Even before the WWW, the internet was a great source of hints to help those trying to complete video games. The web made that material much easier to find. It also encouraged authors to develop simple bullet-point walkthroughs or lists of cheats into complete game guides – comparable to (often better than) commercially published books. Video games themselves matured, with ever greater depth of game-play. The internet itself meant that authors could draw on the observations, strategies and tips of many other players. Gradually the volume, quality, and length of video game FAQs increased.

Oh; don’t be worried that the term “FAQ” is applied to just about anything from a short list of hints to an encyclopaedic account of a game. It’s an accident of history. The term is used here because it still defines the style of presentation and method of distribution of information.

Some video-game related writing evolved with the internet. Authors started to publish their work on private web-space, rather than posting it in public space such as usenet. Rather than just provide a single text, an entire website would be spawned. This approach remains particularly popular for certain types of games: Computer (rather than console or portable) games tend towards bespoke websites, I suspect partly because of the tendency to offer downloadable modifications and patches. Games played primarily by adults tend towards bespoke websites, partly because this group are more likely to be able to pay for web-server space.

However, in many cases something odd happened. The old text-only standards were not only retained, but adopted by a whole new generation of authors. These text-only guides are “FAQs”. FAQs are characterised as being written solely in simple ASCII, with fixed line width and all information in a single file. These files are conventionally syndicated across the internet.

I think there are a few reasons why the FAQ format has thrived: It can be easily syndicated, greatly reducing the chance of information being lost completely. It is a very information-focused format. Within certain communities of gamers, the FAQ is still regarded as the best source of information, which in turn creates competition between authors to write “better” FAQs. Perhaps most important is that authors’ work will be hosted for free on some of the most prominent gaming websites – the lack of cost may help explain a tendency towards younger authors writing FAQs rather than websites. That tendency will skew the demographic slightly – for example, FAQs will account for a higher proportion of all internet writing about GameBoy Pokemon games than they will for PC strategy games.

I’ve written FAQs for some obscure genres. I found it was virtually impossible to find out who my contemporaries were – who else was writing similar sorts of FAQs. A few FAQ authors are well-rounded in their writing, but most favour certain genres or platforms. Whether this ultimately matters is a moot point. It is certainly true that different genres engender different styles of writing: A great roleplay game guide writer may transpire to be a lousy example for someone writing about a sports game.

My simple analysis of which authors had written the most for games in specific genres rapidly developed in response to requests: Who’s written the most for “old-skool” consoles? Does anyone write for arcade titles anymore? Who’s dedicated half their life to writing about Simpsons’ games? Is the DS more popular than the PSP? And just how many authors write in “Dutch”? In part, I’m aiming to provide answers to as many of these questions as possible. In doing so, the results are hopefully interesting to browse through, if you are generally interested in these sorts of things to start with.

I suppose I’m secretly also aiming to break the slightly elitist approach to “prolific authors”. I don’t wish to belittle the achievement of people there – I’m well aware just how much work is needed. But I’d like to offer some alternative lists on which individuals may appear, some of which don’t have such high entry requirements. Having said all that, my intention is not primarily to develop a form of competition between authors.

2. Methodology

All data is extracted from GameFAQs. This is the undisputed king of video game FAQ archives. While it is not the only site that can trace its history back to the dawn of the internet, it is the only such site that actively encourages submissions across all genres and platforms. It is unusual to find an FAQ that is not listed on GameFAQs.

I should clarify that none of the information here is in any way endorsed by GameFAQs (or current sponsor CNET) – it makes use of data extracted from reading public internet pages on their servers. (Also see Do you have permission to do this? below.)

Data has been parsed from the original HTML pages, and written into a database. The database is then used to generate these pages. Initially all author profile pages were parsed and checked for FAQs. Subsequent updates are based on “what’s new” pages only. This avoids the need to continually re-parse very many empty or unchanged author profile pages. Game data is parsed for any game that at least one author has written about. Note that this game data is not subsequently updated. Updates are only done if new games are added. Author names are re-parsed with each update. So while there is some scope for inaccuracies to be created by unusual or unexpected changes, the vast majority of the data will be unaffected. As such, the data analysed here should be a fair representation of the overall situation.
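The update rule described above (new FAQs come from the “what’s new” pages, and game data is parsed only the first time a game is seen) can be sketched roughly as follows. This is a minimal illustration, not the actual parser: the dictionary layout, the `apply_update` function and the `parse_game` callback are all hypothetical names invented for the example.

```python
def apply_update(db, whats_new, parse_game):
    """Fold "what's new" entries into db, parsing game data only for unseen games.

    db, whats_new and parse_game are hypothetical stand-ins for the real
    database, the parsed "what's new" page and the game-page parser.
    """
    for entry in whats_new:
        # Record the new FAQ against its author.
        db["faqs"].setdefault(entry["author"], []).append(entry["filename"])
        game = (entry["platform"], entry["game_id"])
        # Game data is parsed once and never refreshed afterwards.
        if game not in db["games"]:
            db["games"][game] = parse_game(game)
    return db


# One game is already known; the other is new and must be parsed.
db = {"faqs": {}, "games": {("PS2", 1): {"genre": "Action"}}}
calls = []

def fake_parse_game(game):
    calls.append(game)
    return {"genre": "Puzzle"}

apply_update(db, [
    {"author": "alice", "filename": "a.txt", "platform": "PS2", "game_id": 1},
    {"author": "alice", "filename": "b.txt", "platform": "DS", "game_id": 2},
], fake_parse_game)
```

Only the unseen DS game triggers a game-page parse here, which is also why stored game data can drift out of date.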

Images are included within the original parsing, but not other content such as reviews or cheats.

Games are identified by platform and game number, which should be more reliable than name. At the start of 2005 many games were re-numbered, but some of the author profiles continued to show the old number. The result is that precisely the same game is often identified by two separate numbers, depending on which author’s profile one reads. This isn’t a problem for most of the analysis, since data such as genre and platform is the same for each twin. It is only a problem for game-specific analysis, which currently uses a combination of platform and game name instead. However, there are occasional cases where the same name/platform combination is used for genuinely different games, and these are now grouped together.
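Grouping by platform and game name might look something like the sketch below. The record layout and the `game_key` function are hypothetical, and the case and whitespace normalisation is my own assumption rather than something the real analysis is known to do.

```python
def game_key(platform, name):
    """Build a grouping key from platform and game name.

    Lower-casing and collapsing whitespace are assumptions for this sketch.
    """
    return (platform.strip().lower(), " ".join(name.split()).lower())


# Hypothetical records: the same PS2 game appears under an old and a new number.
records = [
    {"platform": "PS2", "game_id": 1234, "name": "Example Quest"},
    {"platform": "PS2", "game_id": 915678, "name": "Example Quest"},
    {"platform": "Xbox", "game_id": 915679, "name": "Example Quest"},
]

games = {}
for rec in records:
    games.setdefault(game_key(rec["platform"], rec["name"]), []).append(rec["game_id"])
```

The two PS2 numbers collapse into one game, while the Xbox version stays separate. The same key would also merge two genuinely different games that happen to share a name and platform.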

3. Numbers

Quantifying FAQ authors’ writing is anything but straightforward. Conventionally the size of files has been used as a proxy for volume. Since simple text files essentially use one byte per character, this is a reasonable measure of the number of words written, although heavy use of decorative ASCII art and redundant formatting can skew the totals. An alternative method is simply to count the number of files. However, since FAQs can be split into separate files, this measure is fraught with difficulties.

Neither approach quantifies the quality or usefulness of the work. The file-count can be improved slightly by only considering completed guides (guides with a full spot – see What’s the difference between an “FAQ” and a “Guide”? and What are spots? below). However this penalises those that prefer to write in-depth guides, or write in foreign languages. And a completed guide can still read like a badly translated household appliance manual: There is no measure of quality.

Rankings therefore primarily use file size as a measure. File size is measured in KB. A count of files may also be displayed, but these counts generally aren’t ranked. Alternative measures are occasionally used – these are explained in the relevant section.

Ultimately ranking should be used not as a form of competition where adding some more elaborate ASCII art headers could make second place into first, but as a guide to which authors tend to write about certain topics.

4. Terminology and FAQ

Why are the KB totals wrong?

The file size totals are based on a literal reading of the published size of each file, factored for known co-authored FAQs (each author is attributed half the KB), and ignoring duplicate files (same filename and platform). Unfortunately there are a few quirks that continue to make the precise numbers slightly inaccurate. One is a simple rounding error – everything here is rounded to the nearest whole KB, while the original files are in bytes. Another less common error seems to be caused by anthologies and expansion packs: Where the author has re-written ostensibly the same content as a new file, this is counted only once. However, since I can’t tell which files are duplicated in this way, there is no way to factor the results.
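As a rough illustration of the rules above, the sketch below totals KB per author, skips duplicates (same filename and platform) and splits co-authored files evenly. The field names and the `author_kb_totals` function are hypothetical. Dividing by the number of authors is a generalisation of the “half each” rule for two authors. The last few lines show how rounding each file to whole KB can drift from the true total, assuming 1 KB = 1024 bytes.

```python
from collections import defaultdict


def author_kb_totals(files):
    """Sum published KB per author, skipping duplicates and splitting shared files."""
    seen = set()
    totals = defaultdict(float)
    for f in files:
        key = (f["filename"], f["platform"])
        if key in seen:  # duplicate: same filename and platform
            continue
        seen.add(key)
        share = f["kb"] / len(f["authors"])  # two authors -> half each
        for author in f["authors"]:
            totals[author] += share
    return dict(totals)


# Hypothetical data: one duplicate listing and one co-authored file.
files = [
    {"filename": "a.txt", "platform": "PS2", "kb": 100, "authors": ["alice"]},
    {"filename": "a.txt", "platform": "PS2", "kb": 100, "authors": ["alice"]},
    {"filename": "b.txt", "platform": "PS2", "kb": 51, "authors": ["alice", "bob"]},
]
totals = author_kb_totals(files)

# Rounding drift: two 1400-byte files, assuming 1 KB = 1024 bytes.
sizes_bytes = [1400, 1400]
sum_of_rounded = sum(round(b / 1024) for b in sizes_bytes)  # each file rounds to 1
rounded_sum = round(sum(sizes_bytes) / 1024)  # 2800 bytes is about 2.73 KB
```

Here the per-file rounding gives 2 KB where the true combined size rounds to 3 KB, which is the kind of small discrepancy described above.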

What’s the difference between an “FAQ” and a “Guide”?

An FAQ is any text (.txt) file. A guide is normally an English-language walkthrough for the game – although often such walkthroughs contain much more game information. Non-linear games may still have guides – these will cover things such as common strategies instead. Guides are intended to provide enough information to complete the game, rather than information on only one aspect of the game. Guides do not include non-English language texts, or “in-depth” texts. In-depth texts commonly cover one aspect of a game in detail – for example, character move lists or an explanation of one activity within a game.

What are spots?

GameFAQs uses colored spots to indicate how complete a guide is. These are used for guides only, not all FAQs. A full spot indicates a guide is reasonably complete. Half spots and empty spots indicate relatively incomplete guides.

What’s an “image”?

Image files are those ending in the extensions .gif, .jpg, .jpeg, or .png. Image files are discounted from most of the analysis, but do feature in certain parts. Files in any other non-text format are completely ignored. For example, .pdf files or .zip archives are ignored.
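The extension rules can be sketched as a small classifier. The `classify` function and its return labels are hypothetical, and matching extensions case-insensitively is an assumption on my part.

```python
from pathlib import PurePosixPath

IMAGE_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png"}


def classify(filename):
    """Return "faq", "image" or "ignored" based on the file extension."""
    ext = PurePosixPath(filename).suffix.lower()  # case-insensitive: an assumption
    if ext == ".txt":
        return "faq"
    if ext in IMAGE_EXTENSIONS:
        return "image"
    return "ignored"


labels = [classify(name) for name in ("walkthrough.txt", "map.PNG", "guide.pdf")]
```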

Why so many genres? And why are some wrong?

GameFAQs uses two genre-sets. The “master genre” is a single category. This groups games into one of eight “master genres” (Action, Adventure, Driving, Puzzle, Role-Playing, Simulation, Sports, or Strategy). A secondary genre-set consists of a “genre” followed by up to three sub-genres. For example, a shooter might be: Action > Shooter > First-Person > Sci-Fi. Most games are not categorised in such detail, with many defaulting to something like: Action > General. There will always be grey areas as to what constitutes a specific genre, and some of GameFAQs’ categorization is deeply suspect. However, GameFAQs’ definition of genres has been used throughout, warts and all. Note that because game data is only parsed once, if genres have subsequently been changed they may continue to be out of date. Without re-parsing several thousand game pages with each update, there is no way to know.
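A genre path such as the shooter example above could be split into a genre and its sub-genres roughly like this. The `parse_genre_path` function is hypothetical, and it assumes the path has already been extracted as a single string using " > " as the separator.

```python
def parse_genre_path(path):
    """Split "Genre > Sub > Sub > Sub" into a genre and up to three sub-genres."""
    parts = [p.strip() for p in path.split(">")]
    parts = [p for p in parts if p]
    if not parts:
        raise ValueError("empty genre path")
    genre, sub_genres = parts[0], parts[1:]
    if len(sub_genres) > 3:
        raise ValueError("more than three sub-genres: " + path)
    return {"genre": genre, "sub_genres": sub_genres}


detailed = parse_genre_path("Action > Shooter > First-Person > Sci-Fi")
default = parse_genre_path("Action > General")
```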

What’s the difference between a “game” and a “title”?

A game is a platform-specific implementation of a title. So one title may have been released as a PS2 game and an Xbox game – one title, two games. FAQs rarely cover titles because different platform versions tend to have minor variations. Also see How are multi-game FAQs handled? below.

How are co-authored FAQs handled?

FAQs written by two authors are counted only once for the purposes of overall analysis. Within author-specific analysis and rankings, co-authored FAQs are counted once for each author, but only half the filesize (KB) is attributed to each author.

How are multi-game FAQs handled?

FAQs that cover multiple games may be counted more than once, or may not: The count is of unique files. Two listings are treated as the same file if they have the same filename and the same platform. If the file itself is duplicated (in whole or part), then the count is duplicated. If precisely the same file is simply cross-referenced across several games, the file is only counted once. Multi-game FAQs are often attributed to a somewhat inappropriate game. For example, an FAQ that covers both the original and the expansion will sometimes be attributed to the expansion instead of the original. This is simply a quirk of only attributing one game to each file – the first game name listed on the author’s profile will be used.
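The counting rule can be sketched as keeping only the first listing for each filename and platform pair, in profile order. The listing layout and the `unique_files` function are hypothetical.

```python
def unique_files(listings):
    """Map each (filename, platform) to the first game it is listed under."""
    first_seen = {}
    for item in listings:  # listings are assumed to be in profile order
        key = (item["filename"], item["platform"])
        first_seen.setdefault(key, item["game"])
    return first_seen


# Hypothetical profile: one file cross-referenced under an expansion and its original.
listings = [
    {"filename": "combo.txt", "platform": "PC", "game": "Example Game: Expansion"},
    {"filename": "combo.txt", "platform": "PC", "game": "Example Game"},
    {"filename": "moves.txt", "platform": "PC", "game": "Example Game"},
]
files = unique_files(listings)
```

The cross-referenced file is counted once and attributed to the expansion, simply because that listing came first.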

What about anonymous authors?

Anonymous FAQs are discounted from author-specific analysis and rankings, but they are included elsewhere.

The niche listings are wrong!

Probably. The niche listings look for certain words within the game name or filename. This process isn’t flawless because naming conventions aren’t consistent. Take the precise numbers with a pinch of salt. Hopefully they will still give sufficient indication of the top authors in their class.
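A minimal sketch of this kind of keyword matching, and of why it is imprecise, is given below. The `in_niche` function and the sample names are hypothetical, and plain case-insensitive substring matching is an assumption about how the words are looked for.

```python
def in_niche(game_name, filename, keywords):
    """True if any keyword occurs in the game name or filename (case-insensitive)."""
    haystack = (game_name + " " + filename).lower()
    return any(k.lower() in haystack for k in keywords)


keywords = ["simpsons"]
hit = in_niche("The Simpsons: Hit & Run", "hit_and_run.txt", keywords)
false_positive = in_niche("Example Racer", "simpsons_cameo.txt", keywords)
miss = in_niche("Example Racer", "guide.txt", keywords)
```

The second case matches only because of the filename, which is the sort of inconsistency that makes the niche numbers approximate.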

When are you going to update? How current is the data?

I do not intend to update these pages. They are all now based on data as of 22 January 2006. The decision not to continue updates was due to layout changes at GameFAQs, which would require a lengthy parser re-write that I don’t have the inclination to do.

Why don’t you do this for reviews, codes, etc?

The simple answer is that I’m not very interested in those. So while it would be technically possible, it would require too much effort to code and maintain.

Hey, that’s wrong! What should I do?

Please read the text above to ensure it isn’t a known limitation, or isn’t simply caused by old data, then email me with details. Since I am no longer updating these pages, I can’t guarantee to fix problems.

Do you have permission to do this?

No (-: . These pages are not part of GameFAQs or the CNET empire. They are provided as a service to current and future FAQ authors, and maintained in response to positive feedback from them. They are based solely on the results of analysing data which is freely available by reading public internet pages. If you want data removed, and can prove ownership of it, please email me with details.

