AO3 News

Published: 2021-03-21 17:35:00 -0400

From time to time, we're contacted by students, scholars, and people interested in fandom stats who would like access to information about the fanworks in the AO3 database, such as frequently used tags or the growth of a fandom over time. While we're unable to respond to individual requests, today we're pleased to provide a one-time release of data for all of our users.

The data comes in two CSV files.

The first includes information about works:

  1. creation date
  2. language
  3. word count
  4. restricted or not
  5. complete or not
  6. associated tag IDs

The second provides the key to the tag IDs:

  1. tag ID
  2. tag type (e.g. Warning, Fandom, Relationship)
  3. tag name (unless the tag has fewer than 5 uses)
  4. canonical or not
  5. an approximate number of uses
  6. merger ID (i.e. the tag's canonical version, if it has one)
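
As a rough illustration of how the two files fit together, here is a minimal Python sketch that looks up each work's tag IDs in the tag key and follows the merger ID to the canonical tag. The column names, the "+" separator between tag IDs, and the sample rows are all assumptions made for the example; check the headers of the actual CSV files before relying on any of them.

```python
import csv
import io

# Hypothetical sample data. Column names and the "+" separator are assumptions.
WORKS_CSV = """creation_date,language,word_count,restricted,complete,tags
2021-01-05,en,1200,False,True,10+11
2021-01-06,en,300,False,False,12
"""

TAGS_CSV = """id,type,name,canonical,uses,merger_id
10,Fandom,Example Fandom,True,100,
11,Relationship,Bob/Alice,False,20,12
12,Relationship,Alice/Bob,True,50,
"""


def load_tags(handle):
    """Index the tag rows by integer tag ID."""
    return {int(row["id"]): row for row in csv.DictReader(handle)}


def canonical_name(tag_id, tags):
    """Return the name of the tag's canonical version, if it has one."""
    row = tags[tag_id]
    merger = row["merger_id"]
    if merger:
        # Fall back to the tag itself if the merger ID is not in the key.
        row = tags.get(int(merger), row)
    return row["name"]


def work_tag_names(work_row, tags):
    """Resolve one work's tag IDs to canonical tag names."""
    ids = [int(part) for part in work_row["tags"].split("+") if part]
    return [canonical_name(tag_id, tags) for tag_id in ids if tag_id in tags]


tags = load_tags(io.StringIO(TAGS_CSV))
works = list(csv.DictReader(io.StringIO(WORKS_CSV)))
names_per_work = [work_tag_names(work, tags) for work in works]
```

With the real files, the `io.StringIO` objects would be replaced by opened file handles. Given the size of the download, iterating over the works file row by row rather than loading it into a list is probably the safer choice.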

🏷️ Download both CSV files as a zip (417 MB)

We hope one day to provide regular, automatic dumps of this data, but for now our focus is on other projects. In the meantime, a number of tools are available for scraping publicly available data, or you're welcome to build your own. (If you're planning to scrape the Archive, we ask that you include a delay between requests to reduce the load on our servers, and that you avoid scraping on weekends, which are our busiest times. We'd also appreciate it if you could set your scraper's user agent string to include the word "bot.")
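
For anyone building their own scraper, the three requests above (a delay between requests, no weekend scraping, and "bot" in the user agent) could be sketched in Python roughly as follows. The user agent name, the five-second delay, and the use of UTC to decide what counts as a weekend are assumptions for illustration only; none of those specifics come from this post.

```python
import time
import urllib.request
from datetime import datetime, timezone

USER_AGENT = "example-research-bot/0.1"  # hypothetical name; includes "bot"
DELAY_SECONDS = 5.0  # assumed value; no specific delay is stated in the post


def is_weekend(moment):
    """True on Saturday or Sunday (weekday() returns 5 or 6)."""
    return moment.weekday() >= 5


def build_request(url):
    """Attach the bot user agent string to a request."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch_all(urls, delay=DELAY_SECONDS, now=None,
              opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch URLs one at a time, pausing between requests.

    Returns an empty list without fetching anything on weekends.
    The weekend check uses UTC by default, which is an assumption.
    """
    now = now or datetime.now(timezone.utc)
    if is_weekend(now):
        return []
    pages = []
    for index, url in enumerate(urls):
        if index:
            sleep(delay)  # wait before every request after the first
        with opener(build_request(url)) as response:
            pages.append(response.read())
    return pages
```

The `opener` and `sleep` parameters exist only so the function can be tried out without touching the network; in normal use the defaults apply and `fetch_all(list_of_urls)` is enough.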

If you use this data in one of your projects, we'd love to hear about it! Drop us a line here in the comments or tag us on social media to show us what you've done.