
Tag Aliases

I've noticed a number of tags that mean essentially the same thing but are either spelt differently or use synonyms. It would be great to have a system like the one on sites such as Danbooru, where tags can be aliased to each other, so that whichever variant you tag your post with or search for, the post appears under all of them.

An example of this is: "ArtistsOnCohost" and "artists on cohost".

10 people like this idea

Bumping this because it's one of the worst things about running blogs on tumblr and I was sad to see it be a problem here too; this is coupled with the whole "canonical capitalization" thing where it seems whoever first used a tag gets to decide which letters are uppercase. We've got a worst-of-both-worlds situation going on where we get aliasing, but only in the sense that our grammar gets mangled by some past user's unwitting choice, and in no other way.

Obviously, though, booru mods work long and hard to manage their tag aliasing system and they have a much more structured tagging system and limited scope compared to what we have here. So, it might be a good idea to implement this in a semi-automated way:

  1. I know there are easy methods of turning OP's example tags into each other with simple text parsing, so in cases like that, the tags should simply all point to the same thing. (It could even be handled client-side to reduce server load, if you need it to.) That alone would help a lot.
  2. To go a step further, for cases where lots of tags describe similar things with slightly different wordings (e.g. "xenoblade" vs "xenoblade chronicles"), showing a rough count of how many posts already use the tag you're typing would help you see which tag the majority of people use.
  3. Finally, on the /rc/tagged/<tag> page, a list of frequently-associated tags (i.e., tags that often appear on posts alongside the currently searched tag) would help people discover related tags that aren't something they'd think to start typing.
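To make points 1 and 2 concrete, here is a minimal sketch of the "simple text parsing" idea, assuming tags are plain strings. The names `tag_key`, `group_by_key` and `usage_by_key` are made up for illustration and say nothing about how cohost actually stores tags. The key is only used for matching, so each post keeps the spelling its author typed.

```python
import re
import unicodedata
from collections import Counter


def tag_key(tag: str) -> str:
    """Collapse a tag to a comparison key; the tag itself is never rewritten."""
    # NFKC folds compatibility forms (e.g. full-width letters) before comparing.
    folded = unicodedata.normalize("NFKC", tag).casefold()
    # Drop whitespace, underscores and hyphens so word separators do not matter.
    return re.sub(r"[\s_\-]+", "", folded)


def group_by_key(tags):
    """Map each key to the distinct spellings seen for it, in first-seen order."""
    groups = {}
    for tag in tags:
        spellings = groups.setdefault(tag_key(tag), [])
        if tag not in spellings:
            spellings.append(tag)
    return groups


def usage_by_key(posts):
    """Count posts per key (point 2); each post counts once per key."""
    counts = Counter()
    for tags in posts:
        counts.update({tag_key(t) for t in tags})
    return counts


posts = [["ArtistsOnCohost"], ["artists on cohost", "xenoblade"], ["xenoblade chronicles"]]
print(group_by_key(t for tags in posts for t in tags))
print(usage_by_key(posts))
```

This only merges spellings that differ in case or separators. "xenoblade" and "xenoblade chronicles" stay separate, which is where the usage counts from point 2 would come in.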

I think all these would be deliverable without having to implement booru-style tag aliasing, be it crowdsourced or human-moderated or both.

2 people like this

AO3, a website devoted to fan fiction, has a well-defined concept and implementation of *tag wrangling* (see AO3 tag FAQ & wrangling guidelines).
On AO3, authors tag their posts, and tag wranglers (later) manually establish relations and equivalency between tags for purposes of search & indexing as well as other things like generating meaningful tag cloud diagrams.
Tag wranglers aren't the general userbase but rather are volunteers who make a commitment to wrangling tags following guidelines and within a particular scope.

One important aspect of tag wrangling that I think cohost should definitely follow is that *the original author's tags on their post are not modified in any way*.
This ensures that the author's original intention (or mistake) is preserved regardless of how tag wranglers later decide to interpret tags.

I would also point out that *tag disambiguation* is just as important as tag aliasing, and any solution for one should consider how to solve the other.

Wikipedia is an excellent example of the importance of disambiguation.
For example, Dysnomia is a medical condition, a music album, a mythological figure, a genus, and a celestial body, but Anomic aphasia is a synonym only for Dysnomia the medical condition. If I were posting about a topic I was describing as dysnomia, I might pick the tag `#dysnomia` myself, but the community of posters talking about the mythological figure Dysnomia might want some way to refer to their Dysnomia without ambiguity, such as `#Dysnomia (deity)`, which differs clearly in capitalization and adds a reasonably concise disambiguating suffix.
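As a rough illustration of how aliasing and disambiguation could share one lookup without touching the author's tags, here is a small sketch. `TagIndex`, its two tables and the sample entries are hypothetical; this is not how AO3 or cohost implement it. Lookups are exact-match to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class TagIndex:
    # alias spelling -> canonical tag chosen by wranglers (hypothetical data)
    aliases: dict = field(default_factory=dict)
    # ambiguous tag -> the canonical tags it might refer to
    ambiguous: dict = field(default_factory=dict)

    def resolve(self, tag: str) -> list:
        """Return the canonical tags that `tag` may stand for."""
        if tag in self.ambiguous:
            return list(self.ambiguous[tag])
        return [self.aliases.get(tag, tag)]

    def matches(self, post_tags, query: str) -> bool:
        """True if any tag on the post resolves to a tag the query covers."""
        wanted = set(self.resolve(query))
        # post_tags is only read here; the stored tags are never rewritten.
        return any(wanted & set(self.resolve(t)) for t in post_tags)


index = TagIndex(
    aliases={"anomic aphasia": "Dysnomia (medical condition)"},
    ambiguous={"dysnomia": ["Dysnomia (medical condition)", "Dysnomia (deity)"]},
)
post = ["anomic aphasia", "neurology"]
print(index.matches(post, "Dysnomia (medical condition)"))
print(index.matches(post, "Dysnomia (deity)"))
```

One design choice baked in here is that a post carrying the bare, ambiguous tag shows up under every meaning. A stricter design could instead ask the searcher to pick a meaning first.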

I am not familiar with how AO3 implements tag indexing, but I recall it being mentioned as open source, so there may be some opportunities to reuse rather than reinvent here. AO3 has a very mature system for discovering fan fiction by combinations of tags, and the tag wrangling volunteers ensure that fandom tag interpretations are joined sensibly so that typos, aliases, and disambiguations all "just work" for users.

Would anything get broken if spaces didn't count? I've seen a ton of posts that list the same tag multiple times, some with spaces and some without, just in case (#furry art, #FurryArt). I've also seen a *lot* of people put an extra # at the start by accident that should probably get automatically removed.
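For what it's worth, a quick sketch of that cleanup on a single post's tag list, assuming tags arrive as raw strings (`clean_tags` is a made-up name, not an existing cohost function):

```python
def clean_tags(raw_tags):
    """Strip stray leading '#' characters and drop space/case duplicates."""
    seen = set()
    cleaned = []
    for raw in raw_tags:
        # Remove surrounding whitespace, then any accidental leading '#'.
        tag = raw.strip().lstrip("#").strip()
        # Ignore spaces and letter case when checking for duplicates.
        key = "".join(tag.split()).casefold()
        if key and key not in seen:
            seen.add(key)
            cleaned.append(tag)  # keep the first spelling the author typed
    return cleaned


print(clean_tags(["#furry art", "FurryArt", "##sketch"]))
print(clean_tags(["now here", "nowhere"]))
```

The second call shows the one thing I can think of that would break: two different phrases such as "now here" and "nowhere" collapse into the same tag once spaces stop counting.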

Out of the bigger changes, listing related tags with some indication of popularity would be my pick, but it's potentially difficult to implement.
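A bare-bones version of related tags with popularity might look like the sketch below, assuming each post is just a list of tag strings. The function names and sample posts are made up for illustration. A real site would presumably precompute these counts rather than scan every pair per request, which is likely where the difficulty lies.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(posts):
    """Count tag usage and how often each pair of tags shares a post."""
    usage = Counter()
    pairs = Counter()
    for tags in posts:
        unique = sorted(set(tags))  # sorted so each pair has one fixed order
        usage.update(unique)
        pairs.update(combinations(unique, 2))
    return usage, pairs


def related_tags(tag, usage, pairs, limit=5):
    """Return (other_tag, shared_posts, total_posts), most shared first."""
    related = []
    for (a, b), shared in pairs.items():
        if tag == a:
            related.append((b, shared, usage[b]))
        elif tag == b:
            related.append((a, shared, usage[a]))
    # Sort by shared count descending, then alphabetically for stable output.
    related.sort(key=lambda row: (-row[1], row[0]))
    return related[:limit]


posts = [
    ["xenoblade", "fanart"],
    ["xenoblade", "fanart", "jrpg"],
    ["xenoblade chronicles", "jrpg"],
]
usage, pairs = build_cooccurrence(posts)
print(related_tags("xenoblade", usage, pairs))
```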
