AO3 News

Systems spotlight: January 2013 - our server setup

Published: 2013-02-10 09:02:18 -0500

January 2013 - our server setup

Over the course of 2012, the demands on all our sites grew. In particular, the number of users accessing the Archive of Our Own each month dramatically expanded from 963,818 in January 2012 to 2,970,103 by December 2012. This demanded a significant expansion of our hardware: we bought 3 new servers and were lucky enough to have another 2 servers donated.

As of January 2013, the OTW owns 11 servers and one switch to communicate between them, and pays for space on 3 virtual servers (some of these will be decommissioned in the coming months).

Our physical servers are at colocation facilities. Our current hosting costs for our physical machines amount to $1,640 a month, and we pay another $370 a month for virtual hosting.

Planning for expansion

Systems have spent a lot of time thinking about how to manage the current demand and plan for continuing expansion. We have put processes in place that will let us add servers to the organization efficiently, so that additional growth can be handled with far less systems administration work than has been required in the past.

One of the issues we have had in the past is that the systems have been maintained individually. When the number of servers is small, this is manageable, but it becomes unwieldy as more servers are obtained. We have been trying to balance looking after the servers with committing to the work needed to automate the installation and configuration of the systems that provide the Archive, Fanlore and the other org sites.

Over November and December 2012 the group spent around 25 hours a week automating the installation (with fai, the fully automated install system) and configuration (with cfengine3). Once this work is complete we should be able to provision new servers, both physical and virtual, quickly and consistently.

Machine specifications

As of January 2013, the machine specifications and jobs were as follows:

Machine name | Specification | Purpose
otw-admin (was OTW1) | ProLiant DL360 G5, E5420@2.50GHz with 16GBytes of Ram and 140GB of RAID 10 disc. | OTW Tools. Administration host for the OTW. Hosts cf-engine3 (see below) servers, dhcp and tftp. Redis slave for the Archive, Mysql server for internal databases, local Debian Linux repository, xtrabackup manager (Mysql backup).
otw-gen01 (was OTW2) | ProLiant DL360 G5, E5420@2.50GHz with 16GBytes of Ram and 140GB of RAID 10 disc. | OTW Projects. New installation currently under testing. Will host Fanlore, Transformative Works and Cultures (the OTW journal) and Symposium (TWC’s blog). Uses Squid, Apache, memcached, MySQL.
OTW3 | Supermicro X8DTU, 24 gig of RAM, dual E5620 4 cores @ 2.40GHz, 4*143GB SAS discs | Archive of Our Own. Runs nginx and Squid to provide the front end of the Archive. Runs the following sets of unicorns: 5 unicorns for web spiders, 5 unicorns for comments, kudos and adding content, 16 unicorns for retrieving works or their comments and 30 general purpose unicorns.
OTW4 | Supermicro X8DTU, 24 gig of RAM, dual E5620 4 cores @ 2.40GHz, 4*143GB SAS discs | Archive of Our Own. Runs the following sets of unicorns: 18 unicorns for comments, kudos and adding content, 10 unicorns for retrieving works or their comments and 50 general purpose unicorns. Resque workers (these run jobs that do not need to be done immediately, such as sending email).
OTW5 | Supermicro X8DTU, 48 gig of RAM, dual X5650 6 cores @ 2.67GHz, dual intel X25 80GB disc | Archive of Our Own. Mysql primary (database server for storing works in the Archive). Memcached (used to speed up the archive). Redis (used to store data that needs to be stored quickly, such as page hits, etc).
Qnap | QNAP TS-809U, 8*2GB discs. | Archive of Our Own. Storage device. Used for backups, work downloads, and shared binaries. Hosts fai (fully automated install system, http://fai-project.org).
switch | 16 port netgear dumb switch. | Networking switch. Used for communications between internal servers.
tao | Virtual machine with 1GB of RAM | OTW Tools. Service machine: primary email support, Mailman (mailing lists), DNS hosting, other support services and tasks.
zen | Virtual machine with 1GB of RAM | OTW Projects. Currently used to host Fanlore, Transformative Works and Cultures (the OTW journal) and Symposium (TWC’s blog). Soon to be decommissioned.
buddha | Decommissioned. | All sites (transformativeworks.org, elections site and opendoors) have been moved to a third party Drupal vendor.
stage | Virtual machine with 1.5GB of RAM | Test server. Once the new versions of Fanlore and our other sites go live, the test sites will go to the new physical stage hosted in a separate colocation facility.
stage (2nd site) | HP ProLiant DL385 G2 2*Dual-Core AMD Opteron(tm) Processor 2218, 32GB of RAM, 1TB of raid 6 disc over 8 discs. | Test server. Used to host the Archive software before it goes live. Mysql secondary for all of the org’s mysql servers. Secondary for redis for the archive.
dev (2nd site) | HP ProLiant DL385 G2 2*Dual-Core AMD Opteron(tm) Processor 2218, 32GB of RAM, 1TB of raid 6 disc over 8 discs. | Development server. Used to provide a Unix environment for Archive developers, so people don’t need to set up the code on their own machines to code for us.
spine | Virtual machine with 2 GB of RAM | Service machine. Used to host offsite backups and other services.
ao3-db01 New! | Supermicro SuperServer SYS-6027R-TRF 8*cores @ 2.6GHz, with 256 GB of RAM, RAID controller with battery backed up cache, 2* 2TB SATA discs Intel 910 SSD 800GB PCIE | The new database server. We have spent a considerable amount of money here both on RAM and the intel 910 PCIe storage. This system should be able to support the Archive for a reasonable amount of time. If we need more performance we will have to buy additional machines and shift memcached and redis to other machines, later buying a replacement machine with even more memory. We are willing to spend so much money on RAM as our developer resources (the number of hours people can devote to coding the Archive) are even more tightly constrained than our financial resources.
ao3-app01 New! | CSE-815TQ-R700WB server, E5-2670 8*cores @ 2.6GHz, with 256 GB of RAM, RAID controller with battery backed up cache, 2* 2TB SATA discs | These machines will just run the Archive application. If there is significant spare RAM then we will run memcached instances and add them to the memcached cluster.
ao3-app02 New! | CSE-815TQ-R700WB server, E5-2670 8*cores @ 2.6GHz, with 256 GB of RAM, RAID controller with battery backed up cache, 2* 2TB SATA discs | Same as ao3-app01
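
For readers who want to see how the unicorn pools listed for OTW3 and OTW4 add up, here is a minimal sketch that tallies them. The counts come from the table above; the pool labels (`web_spiders`, `write_actions`, `work_reads`, `general`) and the function names are shorthand invented for this example and are not names used in the actual configuration.

```python
# Unicorn worker counts as listed in the January 2013 table.
# Pool labels are hypothetical shorthand, not real configuration keys.
UNICORN_POOLS = {
    "OTW3": {"web_spiders": 5, "write_actions": 5, "work_reads": 16, "general": 30},
    "OTW4": {"write_actions": 18, "work_reads": 10, "general": 50},
}


def totals_by_host(pools):
    """Return the total number of unicorns running on each host."""
    return {host: sum(counts.values()) for host, counts in pools.items()}


def totals_by_pool(pools):
    """Return the total number of unicorns per pool across all hosts."""
    totals = {}
    for counts in pools.values():
        for pool, n in counts.items():
            totals[pool] = totals.get(pool, 0) + n
    return totals


if __name__ == "__main__":
    print(totals_by_host(UNICORN_POOLS))
    print(totals_by_pool(UNICORN_POOLS))
```

On these figures the general purpose pool is the largest of the four, with the pools dedicated to retrieving works and to comments, kudos and adding content sharing most of the remainder.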

Definitions:

Virtual machine: a server that looks like an actual computer but is actually software built on top of a larger, higher performance server. Virtual machines are ideal for web servers and other basic workhorse systems.

Storage device: A system that is mostly disk space and networking. Imagine a gigantic external hard disk times a billion.

Service machine: A system that runs mostly behind the scenes programs that Joe User never sees, but OTW staff may need.

KVM: Keyboard/Video/Mouse: Servers generally do not come with these, but are just big boxes full of disks, memory, CPU, and lots and lots of fans. To talk to a server directly, while in front of it, you generally need a KVM.

DNS: Domain Name System. What tells other computers (like yours) where to find sites like archiveofourown.org.

Colocation: Remote hosting site where servers are kept. Hosting costs include power, cooling, and someone to physically work with the machine when needed.

The new servers have a total of 48 cores of compute, 768GB of RAM, and 800GB of very fast storage. We are currently providing the Archive on 28 cores of compute and 192GB of RAM.
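
To put those figures side by side, the short sketch below computes how the new servers compare with the capacity currently serving the Archive. It uses only the core and RAM numbers quoted in the paragraph above; the dictionary keys and the function name are made up for illustration.

```python
# Figures quoted in the paragraph above; key names are illustrative only.
NEW_SERVERS = {"cores": 48, "ram_gb": 768}
CURRENT_ARCHIVE = {"cores": 28, "ram_gb": 192}


def capacity_ratio(new, current):
    """Return new capacity divided by current capacity for each resource."""
    return {resource: new[resource] / current[resource] for resource in current}


if __name__ == "__main__":
    for resource, ratio in capacity_ratio(NEW_SERVERS, CURRENT_ARCHIVE).items():
        print(f"{resource}: {ratio:.2f}x")
```

The ratio is much larger for RAM than for cores, which is consistent with the decision described in the table to spend heavily on memory.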

Archive of Our Own - server setup

As of January 2013, buying new servers had allowed us to significantly restructure the systems architecture for the Archive of Our Own. The technically minded can see the basics of our setup below (this is a simplified version of the current configuration, with systems such as the mail server and the secondary mysql servers removed):

Diagram of Archive of Our Own server setup in January 2013

Systems spotlight: January 2012 - our server setup

Published: 2013-02-10 08:54:31 -0500

January 2012 - our server setup

At the beginning of 2012, the OTW owned 6 servers and paid for space on 6 virtual machines for the rest of our services. The Archive of Our Own was completely hosted on servers we owned (5 of the 6). This is important because owning the servers makes it easier for us to protect fanworks. We also had one switch for the Archive servers (this communicates between the different machines).

All our physical servers were at a colocation host - we pay them for space, electricity and bandwidth, and physical maintenance when required. The hosting costs at the start of the year were around $800 per month. The charges for the virtual servers were $420 per month.
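
As a rough worked example, the sketch below totals the monthly hosting bill at the start of 2012 and compares it with the January 2013 figures given in the companion spotlight post ($1,640 for physical hosting and $370 for virtual hosting). The 2012 physical hosting figure is described above as "around $800", so the results are approximate; the variable and function names are invented for this example.

```python
# Monthly costs in US dollars, taken from the two spotlight posts.
# The 2012 colocation figure is approximate ("around $800").
COSTS = {
    2012: {"colocation": 800, "virtual": 420},
    2013: {"colocation": 1640, "virtual": 370},
}


def monthly_total(year):
    """Return the combined monthly hosting cost for the given year."""
    return sum(COSTS[year].values())


if __name__ == "__main__":
    total_2012 = monthly_total(2012)
    total_2013 = monthly_total(2013)
    print(f"2012: ${total_2012} per month")
    print(f"2013: ${total_2013} per month")
    print(f"Increase: ${total_2013 - total_2012} per month")
```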

Machine specifications

As of January 2012, the machine specifications and jobs were as follows:

Machine name | Specification | Purpose
otw1 | ProLiant DL360 G5, E5420@2.50GHz with 16GBytes of Ram and 140GB of RAID 10 disc. | Archive of Our Own. Mysql secondary (database server for the Archive), Sphinx (used for free text searching), web stats
otw2 | ProLiant DL360 G5, E5420@2.50GHz with 16GBytes of Ram and 140GB of RAID 10 disc. | Archive of Our Own. Memcached (used to speed up the Archive). Resque workers (run jobs that do not need to be done immediately, such as sending email). Redis (used to store data that needs to be stored quickly, such as page hits etc.).
otw3 | Supermicro X8DTU, 24 gig of RAM, dual E5620 @ 2.40GHz, 4*143GB SAS discs | Archive of Our Own. Nginx (web services) and the application which provides the Archive.
otw4 | Supermicro X8DTU, 24 gig of RAM, dual E5620 @ 2.40GHz, 4*143GB SAS discs | Archive of Our Own. Same as otw3.
otw5 | Supermicro X8DTU, 48 gig of RAM, dual X5650 @ 2.67GHz, dual intel X25 80GB disc | Archive of Our Own. Mysql primary (database server for storing the works in the Archive)
Qnap | QNAP TS-809U, 8*2GB discs. | Archive of Our Own and other projects. Storage device used for backups, work downloads, and shared binaries.
switch | 16 port netgear dumb switch. | Archive of Our Own. Networking switch used for communication between internal servers
Tao | Virtual machine with 1GB of RAM | OTW tools. Service machine: primary email support, Mailman (mailing lists), DNS hosting, and other support services and tasks.
Zen | Virtual machine with 1GB of RAM | OTW projects. Web server: Hosts Fanlore, Transformative Works and Cultures (the OTW journal) and Symposium (TWC’s blog).
Buddha | Virtual machine with 1GB of RAM | OTW projects. Web server: Hosts transformativeworks.org, the OTW Elections site and Open Doors.
Stage | Virtual machine with 1.5GB of RAM | OTW Projects. Test Webserver: Used to test all websites’ code (including the AO3) before they go live.
Dev | Virtual machine with 2 GB of RAM | Archive of Our Own - internal. Development server: Used to provide a Unix environment for Archive developers, so people don’t need to set up the code on their own machines to code for us.
Spine | Virtual machine with 2 GB of RAM | Service machine: Used to host offsite backups and other services.

Definitions:

Virtual machine: a server that looks like an actual computer but is actually software built on top of a larger, higher performance server. Virtual machines are ideal for web servers and other basic workhorse systems.

Storage device: A system that is mostly disk space and networking. Imagine a gigantic external hard disk times a billion.

Service machine: A system that runs mostly behind the scenes programs that Joe User never sees, but OTW staff may need.

KVM: Keyboard/Video/Mouse: Servers generally do not come with these, but are just big boxes full of disks, memory, CPU, and lots and lots of fans. To talk to a server directly, while in front of it, you generally need a KVM.

DNS: Domain Name System. What tells other computers (like yours) where to find sites like archiveofourown.org.

Colocation: Remote hosting site where servers are kept. Hosting costs include power, cooling, and someone to physically work with the machine when needed.

Archive of Our Own - server setup

As you can see from the above, the Archive of Our Own uses the most servers and therefore has a more complicated server setup. For the curious (and technically minded) here’s how they were organised at the start of 2012:

Diagram of OTW server setup in January 2012

Highlights from Open Doors Chat

Published: 2013-02-08 13:03:40 -0500

As we reported early last month, due to delays in setting up the automated import for 852 Prospect, we are working to support authors who are interested in manually importing their stories into the Archive of Our Own.

A public chat, hosted by the Open Doors and Support committees, was held on Campfire (the online chat platform the OTW uses) on February 2. You can now read the highlights. The second chat will be on February 10 at 01:00 UTC. (Click the link to see when the chat is being held in your timezone.) You can access OTW’s public chatroom using this guest link.

If you have questions and are unable to make it to the chat or have additional questions after, you can always contact Open Doors for further information.

Site security (constant vigilance!)

Published: 2013-02-07 17:25:55 -0500

As we develop the Archive of Our Own, site security is one of our top priorities. In the last couple of weeks, we've been reviewing our 'emergency plan', and wanted to give users a bit more information about how we work to protect the site. In particular, we wanted to make users aware that in the event of a security concern, we may opt to shut the site down in order to protect user data.

Background

Last week we were alerted to a critical security issue in Ruby on Rails, the framework the Archive is built on. We (and the rest of the Rails community) had to work quickly to patch this hole: we did an emergency deploy to upgrade Rails and fix the issue.

As the recent security breach at Twitter demonstrated, all web frameworks are vulnerable to security breaches. As technology develops, new security weaknesses are discovered and exploited. This was a major factor in the Rails security issue we just patched, and it means that once a problem is identified, it's important to act fast.

Our security plans

If the potential for a security breach is identified on the site and we cannot fix it immediately, we will perform an emergency shutdown until we are able to address the problem. In some cases, completely shutting down the site is the only way to guarantee that site security can be maintained and user data is protected.

We have also taken steps for 'damage limitation' in the event that the site is compromised. We perform regular offsite backups of site data. These are kept isolated from the main servers and application (where any security breach could take place).

In order to ensure the site remains as secure as possible, we also adhere to the following:

  • Developers are subscribed to the Rails mailing list and stay abreast of security announcements
  • We regularly update Rails and the software we use on our servers, so that we don't fall behind the main development cycle and potentially fall afoul of old security problems
  • All new code is reviewed before being merged into our codebase, to help prevent us introducing security holes ourselves
  • All our servers are behind firewalls
  • All password data is encrypted

What you can do

The main purpose of this post is to let you know that security is a priority, and to give you a heads up that we may take the site down in an emergency situation. Because security problems tend to be discovered in batches, we anticipate that there is an increased risk of us needing to do this over the next month. In this case, we'll keep users informed on our AO3_Status Twitter, the OTW website and our other news outlets.

Overall site security is our responsibility and there is no immediate cause for concern. However, we recommend that you always use a unique username / password combination on each site you use. Using the same login details across many sites increases the chance that a security breach in one will give hackers access to your details on other sites (which may have more sensitive data).

We'd like to thank all the users who contacted us about the latest Rails issue. If you ever have questions or concerns, do contact Support.

Tiny Release Notes for Release 0.9.5 Redux

Published: 2013-02-04 13:10:09 -0500

After deploying version 0.9.5 of the Archive last weekend, we (along with the entire Ruby on Rails community) were alerted to a critical security issue that had to be fixed immediately. We had just upgraded to Rails 3.0.19 and were working on fixing an unexpected bug this upgrade had caused: work information in subscription emails had lost its line breaks and arrived in one hard-to-read blob.

We deployed the security patch, together with the updated work information code, last Monday, and are now working on the next regularly scheduled release. Many thanks to Elz, Jenn Calaelen, Lady Oscar, Sarken and Scott for their contributions to this code update! Some information about the current security concerns regarding Ruby on Rails, and the measures we take to protect our servers and users, will be posted later.

As always, you can find currently known issues (and some workarounds) on our Known Issues page, and you can always contact Support in case you run into problems or have any questions.

Release Details

Features

  • Added a Tumblr button to the "Share" box available for all works: it will create a new Link post with work title, URL, and work information already filled in - you just have to add tags and push the button!

Bug Fixes

  • Upgraded Rails
  • Fixed the "Share" text to include HTML for line breaks, making it display correctly in email notifications as well as any blogging platform that accepts HTML-formatted text
  • Also added Additional Tags to the work information block; they had been missing previously
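
To illustrate the line break fix described above, here is a minimal sketch of building a share text in which each piece of work information is separated by an HTML line break. It is not the Archive's actual code (which is written in Ruby on Rails); the function name `share_text`, its parameters, and the example values are all hypothetical.

```python
from html import escape


def share_text(title, url, info_lines):
    """Build an HTML snippet: a link to the work, then one info line per row.

    Lines are joined with <br /> so they stay separate in HTML email
    and on blogging platforms that accept HTML-formatted text.
    """
    link = f'<a href="{escape(url, quote=True)}">{escape(title)}</a>'
    return "<br />\n".join([link] + [escape(line) for line in info_lines])


if __name__ == "__main__":
    # Example values are placeholders, not real Archive data.
    print(share_text(
        "My Work",
        "http://example.org/works/1",
        ["Fandom: Example", "Additional Tags: Fluff & Angst"],
    ))
```

Without the explicit `<br />` markers, a renderer that ignores plain newlines would run the lines together, which matches the "hard-to-read blob" behaviour described for the subscription emails.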

852 Prospect - Manual Import Support Chat Reminder

Published: 2013-02-01 13:02:57 -0500

As we reported early last month, due to delays in setting up the automated import for 852 Prospect, we are working to support authors who are interested in manually importing their stories into the Archive of Our Own.

There will be two public chats, hosted by the Open Doors and Support committees, on Campfire (the online chat platform the OTW uses). The first will be on February 2 at 22:00 UTC. The second will be on February 10 at 01:00 UTC. (Click the links to see when the chat is being held in your timezone.) You can access OTW’s public chatroom using this guest link.

If you have questions and are unable to make it to the chat or have additional questions after, you can always contact Open Doors for further information.

Fandom Tags: Now with More Articles!

Published: 2013-01-27 13:52:21 -0500

Good news for users browsing fandoms on the AO3 -- alphabetizing titles by articles such as "the" or "das" or "los" is now a thing of the past!

With this latest AO3 release, the Fandom names on the media pages will now sort alphabetically regardless of articles. Previously, the code that generated pages like the Theater Fandoms page sorted by the first letter of the canonical fandom tag name. Because we wanted the tags to be sorted alphabetically, we had to remove articles from the names of the fandom, unless the fandom name was only two words or was otherwise confusing without the article. Needless to say, we've been seeking a solution to this for some time, but we needed something internationally compatible that wouldn't strain our servers.

This deploy gives wranglers the ability to set a "sort name" on canonical fandom tags that is separate from the "display name". So we can now have fandom names such as "The Crucible - Miller" display the article, but be sorted under "C".

The deploy also ran an automated process on our existing fandom tags that should have automatically changed the sort name for tags starting with: a, an, the, la, les, un, une, des, die, das, il, el, las, los, der, and den. In some cases, this process changed fandom sort names incorrectly ("Die Hard (1998)" sorting under "H", for example).
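
For the technically curious, the idea of a separate sort name can be sketched as follows. This is an illustration only, not the Archive's actual implementation: the function names, the `overrides` dictionary standing in for a wrangler's manual correction, and the fandom name "Elementary (TV)" are assumptions made for this example. The article list is the one given above.

```python
# Articles the automated process looked for, as listed above.
ARTICLES = {
    "a", "an", "the", "la", "les", "un", "une", "des",
    "die", "das", "il", "el", "las", "los", "der", "den",
}


def auto_sort_name(display_name):
    """Drop a leading article, mimicking the automated pass."""
    first, _, rest = display_name.partition(" ")
    if rest and first.lower() in ARTICLES:
        return rest
    return display_name


def sort_name(display_name, overrides=None):
    """Use a manually set sort name if one exists, else the automatic one."""
    overrides = overrides or {}
    return overrides.get(display_name, auto_sort_name(display_name))


if __name__ == "__main__":
    # "Elementary (TV)" is a hypothetical entry added to show the ordering.
    fandoms = ["Elementary (TV)", "Die Hard (1998)", "The Crucible - Miller"]
    # Hypothetical manual fix: keep "Die" because it is not an article here.
    overrides = {"Die Hard (1998)": "Die Hard (1998)"}
    print(sorted(fandoms, key=lambda name: auto_sort_name(name).lower()))
    print(sorted(fandoms, key=lambda name: sort_name(name, overrides).lower()))
```

The first listing shows the mistake described above, with "Die Hard (1998)" landing after the "E" entry because it is sorted as "Hard"; the second shows the order once a manual sort name is set.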

This still leaves a large number of tags that need to be manually adjusted, as they had an article removed to allow proper sorting under the old system. The Tag Wranglers are working through the fandom tags, restoring articles where the fandom name should have one, and fixing any incorrect changes. It will not be an instant process, given there are over 11,000 canonical fandom tags on the Archive, so we ask for your patience if it takes us a while to fix your particular fandom.

In the meantime, if you have questions you can ask here or send a question to our Support team, who'll pass it on to the Wranglers. The Tag Wrangling Committee also has a Twitter account at ao3_wranglers for all sorts of tag-related discussion.

Release Notes for Release 0.9.5

Published: 2013-01-26 09:43:05 -0500

Welcome to Release 0.9.5! Ariana, Elz, Enigel, Lal, Sarken, and Scott S contributed code to this release, which was tested by our awesome testing team: Ariana, Elz, Emilie K, Estirose, Jenn Calaelen, Kylie, Lady Oscar, Mark B, Sam J., Sarken, and Tai.

We're starting the new year with a small collection of fixes and improvements, with a bigger release slated for the February/March deploy. As always, if you run into any problems or have any questions, please contact Support. If you want to know if a feature you'd like to propose has already been suggested, or has been approved by our coders for a future update, visit our Feature Requests board (see the Internal Tools FAQ for more information).

Highlights!

Ignoring articles when sorting fandoms

On each media subpage, such as for Movies or Video Games, fandom tags were listed alphabetically, leading to somewhat irregular results when looking for fandoms starting with "The" or other articles. We have now changed the sorting code to ignore articles (a, an, the, la, le, les, un, une, des, der, die, das, den, il, el, las, los), while also giving the tag wranglers an option to manually override the sorting for a given tag in case of clashes. (For example, the German article "die" would lead to "Die Hard" being sorted under H, which is undesirable.)

No more OpenID

We've finally gone ahead and removed all support for OpenID accounts - a system we could never fully support, as the infrastructure behind it isn't without its own problems and our invitation system meant you couldn't just go ahead and use your OpenID login to create an account to begin with. We might consider different ways of accessing the site in the future, but as little more than a password replacement OpenID has outlived its use.

Activity log for admins

In this and the next several code updates, we'd like to focus on tools and enhancements for the people "behind the scenes" - members of the Abuse team, Open Doors, Support, Tag Wranglers, and so on. We're starting with a more convenient overview of recent admin activity, collecting all changes made to works by Abuse personnel, such as tag changes or deletions of works that were found in violation of the TOS.

Most of these enhancements will be invisible to the casual user, but we're hoping to make our volunteers' lives a little easier and enable a smoother experience for everyone.

Known Issues

See our Known Issues page for current issues. This list is updated with each release, so please make sure to give it a glance before contacting Support - it might just offer you a temporary solution to your problem right away.

Release Details

Features

  • Removed OpenID support
  • Added an activity log for Abuse admins
  • Made fandoms on media subpages ignore "the" and other articles when sorting alphabetically

Bug fixes and backend enhancements

  • The list of fandoms on a user's homepage was potentially breaking anonymity if the user had posted only anonymous works for a fandom, making it guessable which work in a collection belonged to them; this has been fixed to not display the fandom for anon works either in the list or the filters
  • Clicking "Edit Tags" from a work saved as a draft would take you to a form where your only option to save the tags would post the draft; this has been fixed
  • When marking a work for later, a success message would let you know it had been added to your history; it now helpfully links to your actual "Marked for Later" page
  • Accessing the "new comment" page attached to a restricted work would allow guests to leave comments on said work (without actually being able to see the work itself); this has been fixed to allow only logged-in users to comment
  • Fixed a problem with the caching on some collection fandom pages, where the works listing wasn't always updating properly
  • The notification emails for collection owners wouldn't be sent when someone added a work to a collection and also made it part of a series at the same time; this has been fixed
  • In preparation for the 852 Prospect archive import, we made some helpful changes to the page authors can access to claim their imported works
  • The page to change your username was quietly loading all usernames currently registered on the Archive, presumably in an attempt to make sure your choice hadn't been taken yet; this didn't actually work and was also a huge drain on the servers, so the code was changed
  • Changing your email would work even when the address given in the confirmation field didn't match your desired address; this has now been fixed
  • The Report Abuse form was behaving erratically; it now correctly sends a copy of the report if you enter an email address, and flashes an error if you request a copy but don't enter an address
  • A bug was preventing Abuse personnel from editing work tags and warnings on works that had been found to violate the guidelines for warnings; they can now follow through on procedure as laid out in the TOS
  • The invitation email was inviting you to join an "Organization of Transformative Works" project; the "of" has been silently replaced with the vastly more correct "for" now (oops)
  • On a user's Related Works page, their own translations were coded as an invalid mixture of tables and lists; this has now been fixed
  • Upgraded the version of Ruby on Rails our code runs on to make it easier to incorporate security updates and to pave the way for bigger upgrades in the next few months
  • We run a mirror version of the site that we use for testing, and it's now running in staging mode rather than production, which lets us customize and track things a little more easily
