Chapter 1: Introduction
I downloaded data about all works tagged "The Old Guard (Movie 2020)"/"The Old Guard (2020 Movie)" as of 17th August 2020 (metadata only, i.e. title, number of hits, tags, etc., not the contents of each work).
I then had some fun calculating statistics, making graphs etc. You can find the details in the following chapters, but here are a few highlights.
- After an initial increase during July as people discovered the movie and the fandom grew, the number of works posted per day to AO3 seems to have stabilised at a little under 50 works per day on average.
- 56% of works are in the "M/M" category. The next most frequently used categories are "Gen" (14%), "M/M, Gen" (6%) and "F/F" (5%)
- The most frequently used rating is "Teen And Up Audiences" (34% of works), followed by "General Audiences" (25%), "Mature" (19%), "Explicit" (14%) and "Not Rated" (7%).
- Excluding character and relationship tags, the most frequently used tags are: Fluff (12% of works), Angst (11%), Hurt/Comfort (11%), Canon-Typical Violence (10%), Established Relationship (9%), Post-Canon (9%), Found Family (8%), Pre-Canon (8%), Team as Family (7%), Temporary Character Death (7%), Immortality (5%), Character Study (5%), Angst with a Happy Ending (5%), Emotional Hurt/Comfort (4%), Canon Compliant (4%), Immortal Husbands (4%), Romance (4%), Anal Sex (4%), Alternate Universe - Canon Divergence (3%), Porn with Feelings (3%), Canon Queer Relationship (3%), Slice of Life (3%), Fluff and Angst (3%), Canon Queer Character of Color (3%), Post-Movie (3%), Enemies to Lovers (3%), Plot What Plot/Porn Without Plot (3%)
- Among the specific content warnings, "Graphic Depictions Of Violence" and "Major Character Death" are the only ones used with any frequency. The other AO3 warnings are almost never used.
- Joe almost never appears without Nicky and vice versa. They appear in almost exactly the same number of works.
- 99% of works are in English.
- 52% of works have fewer than 2,000 words. 98% of works have fewer than 20,000 words. One work has 153,684 words!
- Mature and Explicit works tend to be longer than General Audiences works.
- In total, works tagged "The Old Guard" have received almost 3 million hits and more than 440,000 kudos so far.
- Besides English, the most popular languages for work titles are Italian, French and Latin. Other languages used include Arabic and Genoese.
A few notes
Statistical analysis of fanworks for a given fandom can be difficult because the works tend to be scattered over lots of different sites: AO3, fandom-specific archives, FanFiction.Net, LiveJournal, Dreamwidth, DeviantArt, Tumblr, fans' personal sites, Yahoo Groups mailing lists, GeoCities sites, and probably many others I don't even know about. But The Old Guard is a very recent fandom and seems to be quite heavily concentrated on AO3, for the fic at least (I think? I'm not sure...), so that makes the analysis a lot easier. Nevertheless, bear in mind that everything I say here takes into account works posted to AO3 only, and works posted to other sites might have totally different characteristics.
In the "Notes" section at the end of this work, I've given more details about the data included, the analysis performed, the tools used, etc.
Note for screenreaders: I have not included any description of the images in the alt text attribute, because I have tried to convey within the main text all the same information as is conveyed in the graphics.
This was a lot of fun! Please let me know if something isn't clear, if you spot any mistakes, or if there's something else you're curious about which I didn't include here.
Chapter 2: Growth of the fandom over time
Number of works posted to AO3 each day
The release date of The Old Guard on Netflix was 10 July 2020. After a steady increase in the number of works posted to AO3 per day between 11 and 27 July or so, that number seems to have stabilised at a little under 50 works per day on average, with a lot of fluctuation: lows of approximately 30 and highs of approximately 60 works per day.
The date I'm using is that shown by AO3 in the top right-hand corner of the display box for each work, i.e. the "publication date". Usually this is automatically set by AO3 as the date of posting of the most recent chapter, but creators can also choose this date themselves when they post, setting it in the past (backdating). They can also manually edit the date of a previously posted work. I'm not sure what timezone AO3 is using, but it seems to be UTC, a.k.a. GMT.
Note: in the graphic above, the number of works for 17 August is low because I downloaded this data on the 17th in the middle of the day. The number of works for 16 August is very high, but I expect this number will be lower if I check again in a few days. I notice that no matter which day I search for works, the number of works for the day before is always particularly high, and this number has then decreased if I come back and do the same search a few days later. I guess this is because of multi-chapter works which are updated every couple of days: each time a new chapter is posted, the work's date moves forward, so the work drops out of the count for the earlier day.
I wondered if I might be able to see some patterns in the data, such as more works being posted on Saturdays and Sundays, but that does not seem to be the case.
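To make the counting concrete, here is a minimal sketch of how works per day and works per weekday could be tallied. It uses only the Python standard library and a tiny made-up list of dates, so the names (`publication_dates`, `works_per_weekday`) and the sample values are assumptions for illustration, not my actual script or data.

```python
from collections import Counter
from datetime import date

WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]

# Hypothetical sample: one publication date per work (not the real data).
publication_dates = [
    date(2020, 8, 14), date(2020, 8, 14),
    date(2020, 8, 15), date(2020, 8, 15), date(2020, 8, 15),
    date(2020, 8, 16),
]

# Number of works for each calendar day.
works_per_day = Counter(publication_dates)

# Number of works for each day of the week, to look for a weekend effect.
works_per_weekday = Counter(WEEKDAY_NAMES[d.weekday()] for d in publication_dates)

for day in sorted(works_per_day):
    print(day.isoformat(), works_per_day[day])
for name in WEEKDAY_NAMES:
    print(name, works_per_weekday[name])
```

With real data, a weekend effect would show up as consistently higher counts for Saturday and Sunday than for the other days.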
I would love to see how reader activity is increasing, but I don't have any way to access that data. I guess it has followed a similar growth pattern.
Chapter 3: Creators' usage of different categories, ratings and warnings
Number of works tagged with each rating
(Ratings are mutually exclusive, so each work can have only one rating.)
The most used rating is "Teen And Up Audiences" (34% of works), followed by "General Audiences" (25%), "Mature" (19%), "Explicit" (14%) and "Not Rated" (7%).
Number of works tagged with each category
(Categories are not mutually exclusive, so each work can be tagged with multiple categories)
The most used category or combination of categories is "M/M" (56% of works), followed by "Gen" (14%), "Gen, M/M" (6%), "F/F" (5%), "F/F, M/M" (4%), and then many, many other combinations of categories at less than 4% each.
Number of works tagged with each warning
(Warnings are not mutually exclusive, so each work can be tagged with multiple warnings)
The most used warning or combination of warnings is "No Archive Warnings Apply" (59% of works), followed by "Choose Not To Use Archive Warnings" (22%), "Graphic Depictions Of Violence" (11%), "Graphic Depictions Of Violence, Major Character Death" (3%), "Major Character Death" (2%) and then various other warnings or combinations of warnings at less than 1% each.
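Because categories and warnings can be combined, the percentages above are counted per combination rather than per individual tag. Here is a rough sketch of that idea, assuming each work's categories are available as a list; the sample works and the helper name `combination_label` are hypothetical.

```python
from collections import Counter

# Hypothetical sample: the list of categories on each work (not the real data).
works_categories = [
    ["M/M"], ["Gen"], ["M/M", "Gen"], ["Gen", "M/M"],
    ["M/M"], ["F/F"], ["M/M"], ["M/M"],
]

def combination_label(categories):
    """Sort the categories so that 'M/M, Gen' and 'Gen, M/M' count as the same combination."""
    return ", ".join(sorted(categories))

combo_counts = Counter(combination_label(c) for c in works_categories)

# Share of all works for each combination, as a percentage.
total = len(works_categories)
combo_share = {label: 100 * n / total for label, n in combo_counts.items()}

for label, n in combo_counts.most_common():
    print(f"{label}: {n} works ({combo_share[label]:.0f}%)")
```

Sorting before joining matters: without it, the same pair of categories listed in a different order would be counted as two separate combinations.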
Number of works tagged with each language
(Languages are mutually exclusive, so each work can have only one language)
Note: The empty squares in the graphic are Russian. I could not get the Cyrillic characters to display properly :(
99% of works are tagged as being in English. The other languages used as of 17th August are Spanish, Russian, Italian and French.
Number of works tagged with each fandom
All works were tagged with "The Old Guard (2020 Movie)" or "The Old Guard (Movie 2020)" because that's the search term I used when downloading data. 15% of works were also tagged with "The Old Guard (Comics)".
I wondered if any popular crossover fandom might emerge, but it hasn't (not yet, anyway). The other fandoms most frequently occurring were "IT (Movies - Muschietti)", "IT - Stephen King", "Marvel Cinematic Universe", "Leverage", "Highlander: The Series", "The Avengers (Marvel Movies)" (none of them tagged in more than 0.5% of Old Guard works).
Number of works falling into each wordcount bucket
(For example, the label "(0, 2000]" means all works that have more than 0 and at most 2,000 words, i.e. between 1 and 2,000 words.)
52% of works have fewer than 2,000 words. 98% of works have fewer than 20,000 words. Excluding multi-fandom works, the longest work has 153,684 words!
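For anyone curious how the buckets work, here is a small sketch assuming the labels follow standard interval notation, where a round bracket excludes the endpoint and a square bracket includes it. The bucket edges and the function name `bucket_label` are assumptions for illustration; they are not necessarily the edges used in the graphic.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical bucket edges (the real graphic may use different ones).
BUCKET_EDGES = [0, 2000, 5000, 10000, 20000]

def bucket_label(word_count):
    """Return the label of the right-closed interval (low, high] containing word_count."""
    if word_count <= BUCKET_EDGES[0]:
        return f"<= {BUCKET_EDGES[0]}"
    if word_count > BUCKET_EDGES[-1]:
        return f"> {BUCKET_EDGES[-1]}"
    # bisect_left finds the first edge that is >= word_count,
    # so a work with exactly 2000 words lands in (0, 2000].
    i = bisect_left(BUCKET_EDGES, word_count)
    return f"({BUCKET_EDGES[i - 1]}, {BUCKET_EDGES[i]}]"

# Hypothetical word counts (only the last one is a figure from the text).
word_counts = [1, 850, 2000, 2001, 7400, 153684]
bucket_counts = Counter(bucket_label(w) for w in word_counts)

for label, n in bucket_counts.items():
    print(label, n)
```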
Word counts by rating and category
The plots below are called boxplots and they show the median and interquartile range of each value for each category, as well as any outliers. Basically this means that the coloured boxes show the central or "most typical" 50% of all works. The higher up the coloured box is, the higher the typical word count for that category.
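The numbers behind each coloured box can be sketched as follows. This is a minimal example using Python's standard `statistics` module on made-up word counts; the plotting library may compute quartiles with a slightly different interpolation rule, so treat the exact values as approximate. The helper name `box_summary` and the sample data are hypothetical.

```python
from statistics import median, quantiles

# Hypothetical word counts per rating (illustrative only, not the real data).
word_counts_by_rating = {
    "General Audiences": [300, 600, 900, 1500, 2100, 4000, 60000],
    "Mature": [900, 1800, 2500, 3500, 6900, 9000, 45000],
}

def box_summary(values):
    """Return (first quartile, median, third quartile) for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4)
    return q1, median(values), q3

summaries = {rating: box_summary(values)
             for rating, values in word_counts_by_rating.items()}

for rating, (q1, med, q3) in summaries.items():
    # The coloured box spans q1 to q3, i.e. the central 50% of works.
    print(f"{rating}: box from {q1:.0f} to {q3:.0f} words, median {med:.0f}")
```

Notice that the one very long work in each list barely moves the box, which is why boxplots are handy for this kind of data.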
"Mature" works tend to be the longest, typically having around 1800 to 6900 words, while "General Audiences" works tend to be the shortest, typically having around 600 to 2100 words.
Chapter 4: Creators' usage of different characters, pairings and other freeform tags
(See the end of the chapter for notes.)
Number of works tagged with each character
(I included all tags that I identified as characters within the top 200 most popular tags)
The most frequently used character tags are: "Nicky | Nicolo di Genova" (tagged in 83% of works), "Joe | Yusuf Al-Kaysani" (83%), "Andy | Andromache of Scythia" (54%), "Nile Freeman" (46%), "Booker | Sebastien le Livre" (34%), "Quynh | Noriko" (18%), "James Copley" (6%), "Lykon (The Old Guard)" (1%)
The number of works tagged with Joe or Nicky is not actually exactly the same. There do exist some works where one appears without the other! *g*
Number of works tagged with each pairing or relationship tag
(I included all tags that I identified as pairings or relationships within the top 200 most popular tags)
The most frequently used pairing tags are: "Joe | Yusuf Al-Kaysani/Nicky | Nicolo di Genova" (tagged in 75% of works), "Immortal Husbands Joe | Yusuf Al-Kaysani/Nicky | Nicolo di Genova" (12%), "Andy | Andromache of Scythia/Quynh | Noriko" (7%), "Andy | Andromache of Scythia & Quynh | Noriko" (3%), "Andy | Andromache of Scythia/Nile Freeman" (3%), "Andy | Andromache & Booker | Sebastien le Livre & Joe | Yusuf al-Kaysani & Nicky | Nicolo di Genova" (3%), and many others at less than 3% of works each.
Number of works tagged with each freeform tag
In this graphic I included all tags that are used in at least 2% of works, excluding those character and pairing tags that already appeared in the two graphics above.
The most frequently used tags are: Fluff (12% of works), Angst (11%), Hurt/Comfort (11%), Canon-Typical Violence (10%), Established Relationship (9%), Post-Canon (9%), Found Family (8%), Pre-Canon (8%), Team as Family (7%), Temporary Character Death (7%), Immortality (5%), Character Study (5%), Angst with a Happy Ending (5%), Emotional Hurt/Comfort (4%), Canon Compliant (4%), Immortal Husbands (4%), Romance (4%), Anal Sex (4%), Alternate Universe - Canon Divergence (3%), Porn with Feelings (3%), Canon Queer Relationship (3%), Slice of Life (3%), Fluff and Angst (3%), Canon Queer Character of Color (3%), Post-Movie (3%), Enemies to Lovers (3%), Plot What Plot/Porn Without Plot (3%), Light Angst (2%), Kissing (2%), First Kiss (2%), Explicit Sexual Content (2%), Domestic Fluff (2%), Love (2%), Humor (2%), Introspection (2%), Getting Together (2%), Pre-Relationship (2%), Friendship (2%), Slow Burn (2%)
Here is the same information as a tag cloud:
I would love to compare this list with some other fandoms. I notice some tags that seem frequent in other fandoms I am familiar with, such as "Character Death Fix" or "Alternate Universe", are relatively rare here.
You can see here a discussion of the most frequently used tags across AO3 as a whole, for comparison.
Which tags and ratings commonly occur together?
The graphic below is called a heatmap. The colour of each square shows how many works exist for a given combination of tags. The darker the colour, the more works have been tagged with that combination. For example, if you look at the first row in the plot below, you see that the "Fluff" tag is mostly used with the "General Audiences" and to a lesser extent with the "Teen And Up Audiences" rating. In the second row, you see that the "Angst" tag is mostly used with the "Teen And Up Audiences" and to a lesser extent with the "Mature" rating.
Some of these are pretty obvious *g*. Others were a surprise to me, at least. Here are a few of the highlights:
- "Canon-typical violence" is rarely used with works rated "General Audiences", which makes sense because the definition of the "General Audiences" rating usually includes "does not contain violence". That tag is also rarely used with works rated "Explicit". I guess that in "Explicit" works, characters are too busy with other activities to indulge in canon-typical violence *g*
- Similarly, "Angst", "Hurt/Comfort" and "Temporary Character Death" are most often used with the "middle" ratings, i.e. "Teen And Up Audiences" and "Mature", and less often with the lowest or highest ratings.
- Tags such as "Porn with Feelings", "Anal Sex", "Plot What Plot/Porn Without Plot" and "Explicit Sexual Content" are almost only ever used in works with an "Explicit" rating. That makes complete sense, of course. Interestingly, "Canon Queer Relationship" and "Canon Queer Character of Color" are also mostly used with the "Explicit" rating.
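The counts behind a heatmap like this one can be sketched as a simple table of (tag, rating) pairs. The example below uses a handful of made-up works and standard-library Python only; the field names `rating` and `tags` are assumptions, and the resulting `matrix` is what would then be handed to a plotting function.

```python
from collections import Counter

# Hypothetical sample works (not the real data).
works = [
    {"rating": "General Audiences", "tags": ["Fluff"]},
    {"rating": "Teen And Up Audiences", "tags": ["Fluff", "Angst"]},
    {"rating": "Mature", "tags": ["Angst"]},
    {"rating": "General Audiences", "tags": ["Fluff", "Found Family"]},
]

# Count how many works carry each (tag, rating) combination.
cooccurrence = Counter(
    (tag, work["rating"]) for work in works for tag in work["tags"]
)

# Lay the counts out as rows (tags) by columns (ratings): one heatmap cell each.
ratings = ["General Audiences", "Teen And Up Audiences", "Mature", "Explicit"]
tags = ["Fluff", "Angst"]
matrix = [[cooccurrence[(tag, rating)] for rating in ratings] for tag in tags]

for tag, row in zip(tags, matrix):
    print(tag, row)
```

One design choice worth noting: raw counts favour the most popular ratings, so dividing each row by its total would show instead how each tag is distributed across ratings.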
Note: I also have some stats on readers' interaction with works (i.e. hits and kudos) but didn't finish writing it up yet.
Chapter 5: Reader interaction with different categories and ratings
On AO3 we can see readers' interaction with works through things like number of hits, kudos, comments and bookmarks.
Total number of hits on all Old Guard works as of 17th August: 2,921,030
Total number of kudos on all Old Guard works as of 17th August: 441,202
Working with these metrics (hits, kudos, etc.) is a bit trickier than just counting the number of works in each category, as in the previous chapters I posted. Let's say we want to answer a question like: how many hits do "General Audiences" works typically receive? There are about 400 The Old Guard works rated "General Audiences", and each of those 400 works has a different number of hits. If we just take the average number of hits, that can be very misleading, because the value will be dominated by a small number of works that have an atypical number of hits. So instead, I use a type of plot called a boxplot.
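Here is a tiny illustration of why the average can mislead, using made-up hit counts where one work is wildly more popular than the rest. The numbers are hypothetical and only meant to show the effect.

```python
from statistics import mean, median

# Hypothetical hit counts for seven works; one of them is an extreme outlier.
hits = [40, 55, 60, 80, 95, 120, 25000]

average_hits = mean(hits)
typical_hits = median(hits)

print(f"mean:   {average_hits:.0f}")
print(f"median: {typical_hits}")
```

Six of the seven works have 120 hits or fewer, yet the mean comes out in the thousands because of the single outlier. The median stays in the range where most of the works actually are.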
The charts below, called boxplots, show the median and interquartile range of each value for each category, as well as any outliers. Basically this means that the coloured boxes show the central or most typical 50% of all works. For example, in the first plot below, works rated "Explicit" typically receive between 95 and 310 hits per day, while works rated "General Audiences" typically receive between 40 and 120 hits per day. The exact numbers are not meaningful, because we're only looking at a small number of works. To understand the charts, we just need to compare the positioning of the coloured boxes on the vertical axis: higher up means more hits or more kudos.
Works rated "Explicit" generally receive more hits per day than works rated "Teen And Up Audiences" or "General Audiences". However, they may receive slightly fewer kudos per hit (but we would need to look at a lot more works to say for sure).
Works in the "Gen" and "F/F" categories definitely receive far fewer hits per day than works in the "M/M" or "M/M, Gen" categories. However, the number of kudos per hit is pretty similar across all categories (maybe a little higher for "Gen" than for "M/M", but again, we'd need more works to say for sure).
Note: For these plots, I only included categories that appeared on at least 1% of all works, and I zoomed in a little on the y-axis so that we could see the details better. This means that some outlier works with a very high number of hits per day don't appear in the plots. (If I'd included them, you wouldn't be able to see any of the other works.) Note also that "hits per day" can be misleading for multi-chapter works, because it's the total number of hits received over all time, divided by the number of days since the most recent chapter was posted.
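The two metrics can be written out as a short sketch. The function names and the handling of edge cases (treating a work posted on the download day as one day old, and returning 0 for a work with no hits) are my assumptions for illustration, not necessarily what the plots use.

```python
from datetime import date

def hits_per_day(total_hits, last_chapter_date, snapshot_date):
    """Total hits divided by the days since the most recent chapter was posted."""
    # Assumption: count at least one day, to avoid dividing by zero
    # for works dated on the same day the data was downloaded.
    days = max((snapshot_date - last_chapter_date).days, 1)
    return total_hits / days

def kudos_per_hit(kudos, total_hits):
    """Kudos divided by hits; 0.0 for a work with no hits yet (assumption)."""
    return kudos / total_hits if total_hits else 0.0

snapshot = date(2020, 8, 17)

# Hypothetical one-shot posted ten days before the download.
print(hits_per_day(1000, date(2020, 8, 7), snapshot))

# Hypothetical multi-chapter work with the same hits, updated two days before.
print(hits_per_day(1000, date(2020, 8, 15), snapshot))
```

The second work looks five times more popular per day than the first, even though both have the same total hits. That is the distortion for multi-chapter works mentioned in the note above.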
Chapter 6: Other fun stuff
What languages are used for work titles?
Since this is a movie where all the main characters are multilingual (though Nile seems to speak less Pashto in the movie than in the comics) and there's a fair bit of non-English dialogue, I thought there was a good chance that titles in Italian, Arabic, etc. would be popular with work creators. Below are all the languages used for the titles of works tagged as being written in English, as of August 17th.
Data in the graphic above: titles in Italian (2% of works), French (1%), Latin (1%), Arabic (<1%), Unknown (<1%), Genoese (<1%), German (<1%), Greek (<1%), Vietnamese (<1%)
Excluded from the graphic are all works with titles in English, and all works where it doesn't make sense to assign a language, e.g. people's names, placenames, dates, numbers, etc.
Notes: I tried to classify the titles automatically, but language detection models don't work very well for short snippets of text, so I ended up mostly doing it by hand. (So any mistakes are my own fault!) For ambiguous titles, I mostly made the obvious assumptions, e.g. I assume "Sempre" is in Italian and not, for example, Catalan or Portuguese. However, I was still left with a few words where it's not really possible to know which of several languages the author had in mind, and those are classified as "Unknown". (They were 'kurgan', 'spectrum', 'atlas', 'mien', 'aperol spritz', and 'crescendo'.) For titles that had a mixture of two languages, I counted them as the least common language. These were mostly English/Latin, but one was French/Vietnamese.
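The rule for mixed-language titles can be sketched like this. Since the classification was mostly done by hand, this is only an illustration of the rule, not code that was actually run; the frequency counts and the function name `assign_language` are hypothetical, and I'm assuming "least common" is judged against the titles written in a single language.

```python
from collections import Counter

# Hypothetical counts of single-language titles (not the real figures).
single_language_titles = (
    ["English"] * 50 + ["Italian"] * 5 + ["French"] * 3
    + ["Latin"] * 3 + ["Vietnamese"] * 1
)
language_frequency = Counter(single_language_titles)

def assign_language(languages):
    """For a title mixing several languages, pick the least common one."""
    return min(languages, key=lambda lang: language_frequency[lang])

print(assign_language(["English", "Latin"]))
print(assign_language(["French", "Vietnamese"]))
```

The point of the rule is that a title mixing English and Latin is more interesting as a Latin title than as yet another English one.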
Changes over time in types of works posted
I wonder if there will be a shift over time from writing "backstory" or "how the characters first met" fics to writing various types of AUs. It's probably much too early to tell, and also it's very difficult to detect automatically, especially because there isn't really one dominant tag for "backstory" or "characters getting to know each other".
I tried to take a look at the evolution of all tags containing the phrase "alternate universe" (so this includes both canon divergence, and also all other types of AUs: coffee shop, college, etc.). I also looked at all tags containing the word "crusades", as a pretty unsatisfactory proxy for works about Joe and Nicky's first meetings. (I couldn't think of any suitable word to use as proxy for any of the other characters' backstories, and even for Joe and Nicky the word "crusades" is not great, but I don't have anything better.)
It looks a bit like the proportion of works tagged "Alternate Universe" is slowly increasing over time... maybe? It's too early to tell.
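The approach described above (matching any tag that contains a phrase, then tracking the share of matching works over time) could look roughly like this. The sample works, the field names and the choice of ISO calendar weeks as the time unit are all assumptions for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample works (not the real data).
works = [
    {"date": date(2020, 7, 13), "tags": ["Fluff", "First Crusade"]},
    {"date": date(2020, 7, 15), "tags": ["Angst"]},
    {"date": date(2020, 8, 10), "tags": ["Alternate Universe - Coffee Shops"]},
    {"date": date(2020, 8, 12), "tags": ["Hurt/Comfort"]},
]

def has_tag_containing(work, phrase):
    """True if any tag on the work contains the phrase, ignoring case."""
    phrase = phrase.lower()
    return any(phrase in tag.lower() for tag in work["tags"])

def weekly_share(works, phrase):
    """Proportion of works per ISO week with at least one tag containing the phrase."""
    totals = defaultdict(int)
    matches = defaultdict(int)
    for work in works:
        week = work["date"].isocalendar()[1]
        totals[week] += 1
        if has_tag_containing(work, phrase):
            matches[week] += 1
    return {week: matches[week] / totals[week] for week in sorted(totals)}

au_share = weekly_share(works, "alternate universe")
crusades_share = weekly_share(works, "crusade")

for week, share in au_share.items():
    print(f"week {week}: {share:.0%} of works tagged with an AU tag")
```

Using a proportion rather than a raw count matters here, because the total number of works posted per week was itself changing a lot during July.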
Chapter 7: Notes
I scraped metadata from AO3 using a homemade script, parsed it with the BeautifulSoup Python package and made the plots in matplotlib and seaborn. I included all works returned by AO3 when clicking on the tag "The Old Guard (Movie 2020)", i.e. 'https://archiveofourown.org/tags/The%20Old%20Guard%20(Movie%202020)/works', on 17 August 2020. This tag has been wrangled, so it also includes works tagged "The Old Guard (2020 Movie)". I didn't apply any other filters.
I was not logged in to AO3, so this analysis does not include any works that are visible only to logged-in users.