Comment on Upcoming changes to kudos

  1. Can you add a more detailed explanation about why you had to remove existing kudos instead of implementing this change, going forward, for any new ones?

    Kudos are a huge element of the site, so this is obviously something that's worrying for both readers and writers. These kudos, whether as a result of a bug or not, have meant a lot to writers for a long time. And I know of multiple cases where readers have intentionally tried to leave multiple kudos in order to show their appreciation for a work they'd enjoyed and read more than once, or had enjoyed over a long period of time. (I know this was an exploitation of a bug, and I haven't tried to do it myself, but it's also a symptom of the inability to leave 1 kudos per chapter, for example...so leaving multiple kudos on a work upon its completion was a bit of a cheat-workaround.)

    I recognize that this is something that needs to be in place for future kudos, but was there no way to just leave our kudos counts intact and make the database adjustment to all future kudos? It's going to be rather heartwrenching for people to see their kudos suddenly drop across their works.

    (Note: I'm not looking for any sort of debate on this, so no replies, please; I understand why it's being done now. I'd just like more official information about this process and why it's being handled this way. Any change to kudos is a pretty big deal that really could've done with a more informative post from the start.)

    Last Edited Wed 11 Mar 2020 08:48PM EDT

    1. That's the way databases work... it's either EVERYTHING gets checked for the thing or nothing gets checked for the thing. Ya can't have your cake and eat it too.

      1. lol I'm all for this new policy but stop spreading lies. You talk like you know about databases, so what's the function of "WHERE" in database programming? Couldn't they simply remove only the duplicate kudos that are added after the policy is implemented? Even a basic database like MySQL can do that. They literally just need one additional line of code.

        Stop spreading lies, you really sound stupid.

        1. That would imply that they're also storing the date on which the kudos was registered. Which, maybe they do, but it sounds like a huge waste of space to store an additional date field for like the millions of kudos people are leaving.
          In which case, you can totally implement a DELETE FROM kudos_table WHERE kudos_date > '2020-03-16' kind of line, but then you'd have to run that query every time the database is updated or at certain intervals in order to ensure that the newer duplicate kudos get removed while the old ones are kept. Which...doesn't seem really feasible either. Are they going to run this query every week or something?

          Correct me if I'm wrong, but it would be much easier to just add a UNIQUE constraint to the WHOLE table that rejects duplicate kudos, covering all the kudos ever left...but in order to do that, you first have to remove the existing entries that violate the constraint, or else SQL doesn't let you alter the table. So it makes sense that they're removing ALL the duplicate kudos ever, rather than being really selective with the date range.
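To make the point above concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not AO3's code: the real site runs on MySQL, and the table layout, the column names (work_id, user_id) and the index name are simplified assumptions for illustration. It shows a uniqueness rule being refused while duplicates exist, then accepted once they are removed.

```python
import sqlite3

# In-memory database; schema and names are hypothetical simplifications.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE kudos (id INTEGER PRIMARY KEY, work_id INTEGER, user_id INTEGER)"
)
conn.executemany(
    "INSERT INTO kudos (work_id, user_id) VALUES (?, ?)",
    [(1, 10), (1, 10), (1, 11), (2, 10)],  # user 10 left two kudos on work 1
)


def add_unique_index(connection):
    """Try to enforce one kudos per (work, user); report whether it worked."""
    try:
        connection.execute(
            "CREATE UNIQUE INDEX idx_kudos_work_user ON kudos (work_id, user_id)"
        )
        return True
    except sqlite3.DatabaseError:  # IntegrityError is a subclass of this
        return False


# Fails: the existing rows already violate the rule.
first_attempt = add_unique_index(conn)

# Keep the oldest kudos (lowest id) in each (work, user) group, drop the rest.
conn.execute(
    "DELETE FROM kudos WHERE id NOT IN "
    "(SELECT MIN(id) FROM kudos GROUP BY work_id, user_id)"
)

# Succeeds now that no duplicates remain.
second_attempt = add_unique_index(conn)
remaining = conn.execute("SELECT COUNT(*) FROM kudos").fetchone()[0]
print(first_attempt, second_attempt, remaining)
```

The same ordering (clean up first, then add the constraint) applies in MySQL, although the exact error reported there differs.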

          --My two cents, coming from a CS database course student

    2. And re: my request for more information at the beginning of rollouts/announcements like this, I'd also like to point out the extremely worrying tweet: "We're running out of room in the kudos table, and without this change, it will no longer be possible to leave kudos."

      Although context was shortly added, that's...really something that should be more carefully considered before you tweet it. It made it sound VERY much like there was going to be an ongoing issue with kudos.

      I'm also wondering about this part:

      This means that if you have received multiple kudos from the same user or guest on one of your works, your kudos total will go down when we deploy this change.

      How do you know it's the same guest? Are you going off IP? By the timing of the kudos? I guess my concern here is that this change is going to end up stripping away more kudos than it actually should...which maybe you've accounted for very well behind the scenes, but the information provided thus far here and on twitter has done little to ease my concerns.

      I appreciate all your hard work, of course, and fully support AO3 (have for years now!). Would just like more details for this type of thing in the future so it doesn't set off mini panics.

      Last Edited Wed 11 Mar 2020 09:10PM EDT

      1. I don't know anything about how this stuff works, but I do know that I can't leave multiple kudos as a guest on the same fic from the same device. So there must be a way to know which kudos were left by the same guest.

      2. IP address is stored if user ID is not. It will almost certainly be by IP address.

        Stripping away more kudos than necessary would require some incompetence on the coding side. Really, the purge amounts to a per-work check: if a user ID appears more than once in the kudos for a work, delete the newer kudos record. Same for guest kudos, just with IP instead of user ID. The IP essentially functions the same as a user ID, as you can't kudos more than once on a work from an IP address anyway.
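The purge rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this comment, not the actual cleanup task: the Kudos record, its field names and the dedupe function are all hypothetical, and the sample IP is a documentation address.

```python
from typing import List, NamedTuple, Optional


class Kudos(NamedTuple):
    # Hypothetical, simplified record; not the real table layout.
    id: int
    work_id: int
    user_id: Optional[int]      # set for logged-in users
    ip_address: Optional[str]   # used for guests


def dedupe(kudos: List[Kudos]) -> List[Kudos]:
    """Keep the oldest kudos per (work, user) or, for guests, per (work, IP)."""
    seen = set()
    kept = []
    for k in sorted(kudos, key=lambda row: row.id):  # lower id = older record
        if k.user_id is not None:
            key = (k.work_id, "user", k.user_id)
        else:
            key = (k.work_id, "ip", k.ip_address)
        if key in seen:
            continue  # a newer duplicate: this is what the purge removes
        seen.add(key)
        kept.append(k)
    return kept


rows = [
    Kudos(1, 100, 7, None),
    Kudos(2, 100, 7, None),               # same user, same work: dropped
    Kudos(3, 100, None, "203.0.113.5"),
    Kudos(4, 100, None, "203.0.113.5"),   # same guest IP, same work: dropped
    Kudos(5, 101, 7, None),               # same user, different work: kept
]
kept_ids = [k.id for k in dedupe(rows)]
print(kept_ids)
```

In this sketch a guest kudos and a logged-in kudos on the same work use different keys, so both survive; whether the real cleanup treats them that way is not something this thread confirms.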

      3. I'm not one of the volunteer coders, so I'm not super familiar with the processes here, but they do have a public Jira for project management and codebase work happens on GitHub.

        The specific ticket for organizing these changes seems to be this one: https://otwarchive.atlassian.net/browse/AO3-5597

        This is the current schema for the kudos table: https://github.com/otwcode/otwarchive/blob/ecf3f58a8b6511523fc51eea2ba9d9c059e49432/db/schema.rb#L609
        - Looks like it's a many-to-many relationship table between pseud_id (pseudonym of a user?) and commentable_id (things like posted works?)

        My understanding is that these changes were motivated by a number of existing bugs:
        - If a user leaves a kudos and then changes their username/default pseud, their name in the kudos list is still the old username
        - The kudos table was starting to run out of space (primary key currently seems to be a signed integer, which has a max value of 2,147,483,647 in MySQL; for comparison, there were ~647,000,000 kudos on the site as of Feb. 14, 2020)
        - It is possible to leave duplicate kudos (if you look at the kudo model code from before the recent changes, Rails did already validate for uniqueness on pseud_id and/or ip_address; but the uniqueness is not validated at the database level, so presumably it was possible for duplicates to sneak through somehow)

        So the fix plan seems to be (according to the Jira epic):
        1. Database migration: Add user_id column to kudos table
        2. Change the code to save user_id when a new kudos is created by a logged in user
        3. Task to fill in the user_ids for existing kudos that were left by logged in users
        4. Update Rails to validate uniqueness on user_id instead of pseud_id
        5. Database migration: (1) Change primary key type from INT to BIGINT (this updates the max table size to 2^63 - 1), (2) add unique index for commentable_id + commentable_type + user_id, (3) add unique index for commentable_id + commentable_type + ip_address
        6. Task to update kudos count on indexed works (something something caching I think)
        7. Change kudos displays to use users instead of pseuds
        8. Drop the pseud_id from the kudos table

        Right now I think they're right around step 5, just before their changes might start having visible effects on the users.

        And n.b. that in that upcoming database migration, they'll be changing the primary key type from INT to BIGINT. This is how they're resolving the issue with space in the kudos table. INT is signed 32-bit, BIGINT is signed 64-bit. In other words, the max kudos table size before was 2,147,483,647 entries. The new max kudos table size will be 9,223,372,036,854,775,807 entries. I.e., after this migration, hard disk capacity will become a concern long before the max PK value does.
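The numbers above follow directly from the bit widths, which a short Python check makes visible. The kudos count is the approximate figure quoted earlier in this comment; treating it as the highest id in use is an assumption, since ids can run ahead of the row count when rows are deleted or id values are skipped.

```python
# Largest values of signed 32-bit and signed 64-bit integers.
INT_MAX = 2**31 - 1
BIGINT_MAX = 2**63 - 1

# Approximate kudos count quoted above (Feb. 14, 2020).
KUDOS_FEB_2020 = 647_000_000

# Share of the INT id range used, assuming ids track the row count.
fraction_used = KUDOS_FEB_2020 / INT_MAX

# How many times larger the BIGINT range is than the INT range.
growth_factor = BIGINT_MAX // INT_MAX

print(f"INT max:    {INT_MAX:,}")
print(f"BIGINT max: {BIGINT_MAX:,}")
print(f"INT range used: {fraction_used:.1%}")
print(f"BIGINT range is about {growth_factor:,} times larger")
```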

        1. Hm. That's interesting and helpful, thanks. It actually makes me wonder now, though, if the kudos will drop even more than expected, because while I have not exploited the multi-click kudos bug, I *have* left multiple kudos as a guest (before I had an AO3 account), then as my logged-in account on re-reading particular fics years later. I've also heard that it was possible to leave multiple kudos if you did so with enough time in between them. I suppose there's no use worrying about it, though, since it's something they have to go ahead with to ensure future kudos will continue working.

        2. Thank you for the explanation!

        3. Thank you for breaking this down, especially the INT to BIGINT detail; that should alleviate concern about running out of room for hits, and head off anyone saying that we should just remove kudos since we're going to run out of room for them anyway.

    3. Not a coding volunteer, but basically the changes being made to the database structure can't happen if there are duplicate entries. An oversight in the past (probably the assumption that the app code wouldn't allow multiple kudos to be created) meant the SQL server wasn't checking if kudos were unique. Now the SQL server will check it. The SQL server also won't be happy about the existing duplicates. They're RecordNotUnique as well. So, they gotta go to perform the migration. The database server doesn't make exceptions, it does what the schema tells it, so that's why there can't be an exception.
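As a rough illustration of "now the SQL server will check it", the sketch below uses Python's built-in sqlite3 module rather than the site's actual MySQL and Rails stack. The table layout, index name and leave_kudos helper are hypothetical. Once the unique index exists, the database itself refuses a second kudos from the same user on the same work, whatever the application code does; sqlite3.IntegrityError here plays roughly the role that RecordNotUnique plays in Rails.

```python
import sqlite3

# Hypothetical, simplified schema with the uniqueness rule enforced by the database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE kudos ("
    "id INTEGER PRIMARY KEY, work_id INTEGER NOT NULL, user_id INTEGER NOT NULL)"
)
conn.execute("CREATE UNIQUE INDEX idx_kudos_work_user ON kudos (work_id, user_id)")


def leave_kudos(connection, work_id, user_id):
    """Insert a kudos; return False if the database rejects it as a duplicate."""
    try:
        connection.execute(
            "INSERT INTO kudos (work_id, user_id) VALUES (?, ?)",
            (work_id, user_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    leave_kudos(conn, 1, 10),  # first kudos from user 10 on work 1
    leave_kudos(conn, 1, 10),  # duplicate: rejected by the unique index
    leave_kudos(conn, 1, 11),  # different user on the same work: accepted
]
total = conn.execute("SELECT COUNT(*) FROM kudos").fetchone()[0]
print(results, total)
```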

      I agree some more technical details would be nice, but really people are worried too much about this. The amount that kudos are going to shift is likely NOT going to be noticed by most people. There will just be a small set of authors who have been kudos glitched by readers that will notice a drop. I really don't think the bug was easy enough to trigger for it to make a substantial dent.
