2012-12-15 15:36:14 -0500

The following is a post created by the Tag Wrangling Committee to address some ongoing questions and discussions involving tag wrangling on the Archive of Our Own.

The question has been raised in various places of how sustainable the Archive of Our Own’s tag wrangling system is, and whether it will continue to be viable as AO3 continues to grow and the number of fandoms and tags increases. The AO3 wrangling committee would like to address some of the concerns we’ve heard, from AO3 users as well as wranglers (including the staff).

In all honesty, it’s a fair question, and one without a clear or simple answer. The AO3 tag wrangling system is a special beastie, and because of its uniqueness, it is difficult to judge questions of long-term sustainability, since there is no real precedent to look to. But we have high hopes for it, which so far have been met or exceeded by our amazing team of wrangling volunteers.

To better understand our position, it may help to understand what makes the wrangling system special, and why it was implemented this way in the first place.

Why do AO3 tags work like they do?

The AO3 tag wrangling system was specifically designed as a compromise between the two standard tagging/organization models for online archives: a regulated taxonomy, versus a 'folksonomy'.

A regulated taxonomy – such as what's currently used on – allows creators to tag their work with a limited number of pre-determined options (such as genre or characters). This system is very good for keeping things ordered and preventing misspellings and otherwise inconsistent labeling. However, it also requires constant maintenance to add new tags as new fandoms arise, and greatly restricts what users can label or sort by. The latter condition can be especially problematic if data is not kept up-to-date. (For instance, on many fandoms have no character lists, and other fandoms don't include all characters, especially those recently introduced.)

A "folksonomy" - the tagging system used on most social bookmarking sites and Tumblr - allows users to tag their content with any tag of their choosing, and users can see all works using any given tag. This system has the advantage of flexibility and currentness - its tags are always up-to-date with user preferences - but can make browsing difficult. (For example: on Tumblr, if you want to see most posts about kid!Loki, you also have to look up "kid loki" and "bb!Loki", and will still miss the posts tagged "bbloki".)

When designing the tag system on AO3, both of these systems were considered. But both have significant drawbacks in meeting the demands of both creators and browsers of a growing multi-fandom archive.

Options & drawbacks

User tagging could be limited to only approved tags. This then puts the burden on the users to specifically request new tags to be added; it also requires wranglers to work quickly to make tags available as needed. For active fandoms like Homestuck that see on the order of five new relationships a day, these requests could quickly become overwhelming. To keep up with such demand, we would need a ridiculous number of volunteers, and/or a way to prioritize requests, limiting new tag creation to the most popular fandoms/most requested tags. Assuming users could post works without tags, many people wouldn't bother tagging their works at all if the tag they wanted wasn't available and they didn't have time to submit it. Works would also be left without tags if a user did submit the request, but failed to go back to add it to their old works when the tag was finally entered in the system.

To get around this last issue, we could regulate the tags – a user could enter any tag they like, but it must be approved before appearing on AO3. In that case, wranglers become the inadvertent gatekeepers of fandom, deciding what tags are or are not shown to users. Is "Feels" worthy of being displayed? What about "Wingfic"? Maybe we don't want to allow "Incest" or "BDSM" - we're not that kind of archive. (Obviously we totally are, but you get the idea!) And there would still be a period of time when the tags wouldn't be visible or useful, so an enormous team of volunteers would still be required to review the tags in a timely fashion.

Another option is to let users enter whatever they like and display all those tags, but moderate them by telling people how we want them to tag, and removing all the tags that don't fit, or requiring users to change them. Again, the burden on the moderators would be considerable, having to monitor the over half-million works on the AO3. It would also be difficult to justify regulating tags when the spelling, grammar, and format of posted works are not likewise moderated (and to do so would require modifying AO3's Terms of Service).

Otherwise we could take the opposite tack and not organize tags at all: allow users to enter any tags they like, display and filter by all these tags, and let people who want to read John Watson/Sherlock Holmes search for "John/Sherlock" and "sh/jw" and "Johnlock" and any other permutations they can think of. But this method becomes frustrating for browsing users who don't know or don't remember all the permutations. It's also a burden on creators who want their work to be found by as many people as possible, but have the same issue of not knowing or remembering the many variant names for the same concept. (It's worth noting that this is not an unviable system - Tumblr, Pinboard, Pixiv, and many other sites use similar systems; and AO3 could switch over to it with relatively little tweaking, if necessary.)

Or we could let users enter whatever tags they like, and display all those tags however the creator or bookmarker wants to display them. Then, behind the scenes, volunteers can organize and link tags together so the most commonly used and useful-for-browsing concepts are more readily available to the largest number of people – both creators and audience – with the smallest amount of required effort. This is how the AO3 tag wrangling system works.
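
As a rough illustration of that last option, here is a minimal Python sketch. It is not AO3's actual code or schema: the names SYNONYMS, canonical_for, build_index, and WORKS are all hypothetical, and the sample tags are made up for the example. It only shows the general idea that creators' tags are stored exactly as written, while a wrangler-maintained link table lets filtering treat variant spellings as one concept. A tag that no wrangler has linked yet simply acts as its own flat tag.

```python
from collections import defaultdict

# Hypothetical link table maintained by wranglers: variant tag -> canonical tag.
# Keys are case-folded so lookups ignore capitalization (an assumption of this sketch).
SYNONYMS = {
    "john/sherlock": "John Watson/Sherlock Holmes",
    "sh/jw": "John Watson/Sherlock Holmes",
    "johnlock": "John Watson/Sherlock Holmes",
}

def canonical_for(tag):
    """Return the canonical tag this tag is linked to, or the tag itself if unwrangled."""
    return SYNONYMS.get(tag.casefold(), tag)

def build_index(works):
    """Map each canonical (or unwrangled) tag to the set of work ids using it."""
    index = defaultdict(set)
    for work_id, tags in works.items():
        for tag in tags:
            index[canonical_for(tag)].add(work_id)
    return index

# Made-up sample works; each keeps the creator's tags exactly as entered.
WORKS = {
    1: ["John/Sherlock"],
    2: ["Johnlock", "Wingfic"],
    3: ["sh/jw"],
    4: ["Jawn/Sherlock"],  # not linked by any wrangler yet
}

INDEX = build_index(WORKS)
```

In this sketch, looking up INDEX["John Watson/Sherlock Holmes"] finds works 1, 2, and 3 in one step, while work 4 is still reachable under its own tag, "Jawn/Sherlock", until someone links it. The WORKS entries themselves are never rewritten, so each work still displays the tag its creator chose.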

But is this system sustainable?

It's impossible to be sure, but after observing wrangling on the beta archive over the last four years, the tag wrangling committee believes that yes, the AO3 tag wrangling system is sustainable in the long term. To begin with, our volunteer pool is currently as large as it’s ever been (at close to 160 wranglers), and is more than holding steady. When recruiting is open, we average more people volunteering than retiring, and get a surge with most donation drives as well. The AO3's expansion this year does mean there are more tags than ever, but it also means there are more fans willing to offer their time to keep those tags in order. And the fandoms with the most activity are also those with the most fans, so it's more likely for us to be able to find wranglers for them.

Additionally, archive growth doesn't correspond directly to an increase in tag wrangling work. The vast majority of new works posted on AO3 fall into two categories: very small fandoms – under 20 works – that require occasional wrangling rather than ongoing maintenance; or very large fandoms, which often are the best-wrangled, because we have lots of wrangling volunteers familiar with them! Looking at, half the available fandoms there are under the 20-work threshold; and on the Archive, while there are currently close to 5000 fandoms without an assigned wrangler, fewer than 300 of these have more than 20 works.

Even large fandoms may not produce many new tags. A popular fandom with a small core cast of characters may get 100 new works posted a day, but only one new relationship tag, because all the other works used existing tags. Fandoms from 'closed' canons (canceled shows, etc.) tend not to get many new tags because they aren’t introducing new characters. And many fandoms share tags – see the X-men metatag, which has 13 different sub-fandoms, but a number of the characters and relationships among these overlap and only need to be wrangled once for all the fandoms.

What if wrangling isn't viable in the long term?

It is undeniable that as AO3 grows, wrangling becomes an increasingly large task. We don’t believe it’s insurmountable, however. Nor do we believe that there is any real danger of the tag system collapsing entirely.

AO3 tag wrangling is designed to assist and facilitate users in labeling and finding works, but for the most part it is not crucial for these purposes. Many aspects of AO3 tags are still functional without any wrangling at all. An unwrangled AO3 tag acts like a Tumblr or Pinboard tag, showing all works and bookmarks using that tag. AO3 search brings up results both for wrangled tags and the text of unwrangled tags, and unwrangled tags can likewise be used in the new filters.

In other words, if all wranglers quit and all wrangling on AO3 stopped this instant, existing tags would continue to work as they do now, preserving the work wranglers had done up until this point; and all new tags on AO3 would still be as useful as tags on Tumblr or LiveJournal or any other service with flat tags. The filters of older but growing fandoms would be sparse, new fandoms would lack filters and only appear in the "Uncategorized" section, and a user would have to look for "Fullmetal Alchemist", "Full Metal Alchemist", and "Hagaren" separately to find all works; but the basic functionality of calling up all works with a tag would remain.

Obviously an end to all wrangling is the worst-case scenario and not one we expect to come to pass. The greater concern is that the wrangling committee and volunteers will keep working, but the bulk of the work will become too great for us to keep up with. The current wrangling system is definitely not perfect, and one of the wrangling committee’s primary goals is to look for ways to improve it and make it more sustainable.

So what does the future of AO3 tags look like?

The wrangling committee is working to improve the tag and wrangling experience both on the front-end (for users) and the back-end (for wranglers). On both sides, the two aspects of tags we're most concerned with at the moment are internationality and additional tags.

Currently, AO3 wrangling primarily deals with English-language/Roman alphabet tags. To be a more useful archive for fans around the world, we are developing better methods of sorting and linking tags across languages. We want to display tags of all languages in the appropriate filters and the auto-complete, while preserving the links between tags with the same meanings. We also need to develop better guidelines for non-English-language tags.

Our second focus is on the issue of Additional Tags (or "Freeforms", as wranglers know them). Presently we are seeing several hundred new additional tags on works and bookmarks added to AO3 daily.

It's important to note that these tags do not interfere with the wrangling of non-freeform tags. AO3 is designed to handle tags of different categories such that wranglers can view fandom, character, and relationship tags separately from freeforms; and the former get priority. Wranglers can also sort tags by number of uses, to easily see which freeforms are popular enough to warrant making them canonical. The majority of new freeforms are not made canonical and never will be; they are single-use, notes-style tags that only require being checked off a list by a single wrangler. This process is not as streamlined as it could be, and one of our top priorities for the back-end is features to simplify it.
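
To make the sorting step concrete, here is a small Python sketch of the kind of triage described above. It is only an illustration under stated assumptions, not the wrangling interface itself: the function names, the sample tags, and especially the threshold value are hypothetical, since the number of uses that makes a freeform worth making canonical is a judgment call for wranglers rather than a fixed number.

```python
from collections import Counter

def freeform_use_counts(works):
    """Count how many works use each freeform tag (once per work)."""
    counts = Counter()
    for tags in works:
        counts.update(set(tags))  # set() so a repeated tag on one work counts once
    return counts

def triage(counts, threshold):
    """Split freeforms into canonical candidates and single-use tags.

    `threshold` is a hypothetical cutoff chosen for this sketch.
    """
    candidates = [(tag, n) for tag, n in counts.most_common() if n >= threshold]
    single_use = sorted(tag for tag, n in counts.items() if n == 1)
    return candidates, single_use

# Made-up sample data: the freeform tags on three works.
WORKS_FREEFORMS = [
    ["Wingfic", "Feels"],
    ["Wingfic", "i wrote this at 3am"],
    ["Wingfic", "Feels", "no beta"],
]

COUNTS = freeform_use_counts(WORKS_FREEFORMS)
CANDIDATES, SINGLE_USE = triage(COUNTS, threshold=2)
```

With this sample data, "Wingfic" and "Feels" come out as the candidates worth a closer look, and the two notes-style tags land in the single-use list that only needs to be checked off.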

On the front-end, we're looking into ways for users to limit the display of freeforms, such as by making the view of single-use freeforms optional. At this point we have no plans to limit what tags users are allowed to put on their works, beyond what is mandated by the AO3 Terms of Service; but we want to give users better ways to view the particular tags they're interested in. (If you are looking for ways to limit them now, you may find the skins linked in this post helpful.)

Users & wranglers unite!

As well as improving the efficiency of the wrangling interface to make it easier for wranglers to do our job, we believe that a major way to keep wrangling sustainable is to employ the help of all users to keep tags in line. To that end, we’re seeking to open up aspects of the wrangler interface to regular users. We've already made wrangling connections visible to all users on AO3, and publicly posted our wrangling guidelines to explain what tags we make canonical. We also would like to find better ways for users to contact us – any message sent to Support concerning tags or wrangling is already forwarded to us, and we respond to messages on our Wrangler Twitter as well, but we hope to have more direct lines of communication. This might include allowing users to leave notes on individual tags, or other methods to call attention to specific problems.

Now that bookmarks are filterable, it's possible for users to filter for tags other than those the creators put on their works, allowing users to label and categorize works even if the creators don't opt to. We’re also considering giving all users limited wrangling capabilities, such as sorting tags into fandoms, making synonyms to existing canonical tags, or suggesting new canonical tags following the guidelines for wranglers to approve. Such features would require moderation from wranglers, but would take some of the burden off us (as well as potentially encouraging more users to volunteer for wrangling).

So when will this happen?

Most of these improvements require new features to be coded. This requires the attention of the AD&T committee’s diligent coding and testing volunteers, and must be prioritized against the hundreds of other features and bug-fixes also in demand. It is also contingent on having available coders and testers - the wrangling code is some of the more complex on AO3, so relatively few coders have the skills and experience to make significant changes to it. So it may be some time before changes appear on the beta archive; but new tag features are under development now.

In the meantime, the wrangling committee relies on all its awesome wrangling volunteers to keep up with the tag load! Thus far they have been more than up to the task, and we are confident that with improvements, the wrangling system will remain functional for both wranglers and users as the AO3 continues to expand in the years to come.