Tags, Categories, and Knowledge Management

I’m inclined to view “tags” and “categories” as just two facets of the same thing: knowledge hooks we apply to content to help us find it.  My gut tells me that these things tend to be viewed as two different beasts because of the way they’ve historically been implemented.

While I’d love to get the concept of tags and categories right the first time, and do it in a way that’s cohesive and elegant, in the end, compatibility with other approaches is probably the right solution.  So we may just need both tags and categories regardless of personal beliefs on my part.

Of course, one way to kludge both into a common solution – one that isn’t terribly kludgy – is to have a predefined “category” called “Tags” (or “Keywords”) that contains all the “tags” and is wired through the API to the correct fields in LiveWriter.  The tags could be presented in the DNN module as a separate field that works just like “categories”, only without hierarchy separators.  They would still reside in the same data structures and produce the same lists and “related entries” capabilities.
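To make that concrete, here is a minimal sketch of the idea, not a description of how DNN or LiveWriter actually store anything.  The names `CategoryStore`, `TAGS_ROOT`, and `SEPARATOR` are hypothetical; the only point is that tags live under a reserved root in the same path-to-entries structure that holds the categories.

```python
from dataclasses import dataclass, field

SEPARATOR = "/"      # hypothetical hierarchy separator
TAGS_ROOT = "Tags"   # hypothetical reserved top-level category that holds all tags


@dataclass
class CategoryStore:
    # Maps a full category path (e.g. "Subject/KM" or "Tags/dnn") to entry ids.
    entries_by_path: dict[str, set[str]] = field(default_factory=dict)

    def add_category(self, entry_id: str, path: str) -> None:
        """File an entry under a (possibly hierarchical) category path."""
        self.entries_by_path.setdefault(path, set()).add(entry_id)

    def add_tag(self, entry_id: str, tag: str) -> None:
        """File an entry under a flat tag, stored as a child of TAGS_ROOT."""
        if SEPARATOR in tag:
            raise ValueError("tags are flat and may not contain the separator")
        self.add_category(entry_id, TAGS_ROOT + SEPARATOR + tag)

    def tags_for(self, entry_id: str) -> list[str]:
        """Return the entry's tags, with the reserved root stripped off."""
        prefix = TAGS_ROOT + SEPARATOR
        return sorted(
            path[len(prefix):]
            for path, ids in self.entries_by_path.items()
            if path.startswith(prefix) and entry_id in ids
        )

    def categories_for(self, entry_id: str) -> list[str]:
        """Return the entry's ordinary category paths (everything except tags)."""
        prefix = TAGS_ROOT + SEPARATOR
        return sorted(
            path
            for path, ids in self.entries_by_path.items()
            if not path.startswith(prefix) and entry_id in ids
        )


store = CategoryStore()
store.add_category("post-1", "Subject/Knowledge Management")
store.add_tag("post-1", "suggestions")
store.add_tag("post-1", "dnn")
```

Because both kinds of hook end up in `entries_by_path`, any list or “related entries” logic written against that one structure would cover tags and categories alike.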

I have a lot of experience with knowledge management solutions (I’ve been doing KM since 1993), and I think that “best of both worlds” is really the best approach, because it offers the flexibility of tags with the structure of categories – in other words, it’s a lot more like the way the mind manages these knowledge structures.  My experience is that “structured categories” always start out as “flexible keywords”.  At some point, there is sufficient comprehension to establish the structure that was previously invisible, and then your “keywords” become “categories”.
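If tags and categories share one structure, that promotion from keyword to category can be a simple re-filing step.  The following is only a sketch under that assumption; `promote_tag`, the `"Tags"` root, and the `"/"` separator are hypothetical names chosen for illustration.

```python
def promote_tag(entries_by_path, tag, new_path, tags_root="Tags", sep="/"):
    """Move every entry filed under a flat tag to a structured category path.

    `entries_by_path` maps a category path to a set of entry ids.
    Returns the number of entries that were re-filed.
    """
    old_path = tags_root + sep + tag
    moved = entries_by_path.pop(old_path, set())
    if not moved:
        return 0  # nothing was tagged this way, so there is nothing to promote
    entries_by_path.setdefault(new_path, set()).update(moved)
    return len(moved)


index = {
    "Tags/British Comedy": {"holy-grail", "life-of-brian"},
    "Genre/Comedy": {"holy-grail"},
}
# Once it is clear that "British Comedy" is really a sub-genre, re-file it.
count = promote_tag(index, "British Comedy", "Genre/Comedy/British")
```

After the call, the two entries sit under `Genre/Comedy/British` and the flat tag is gone, so nothing had to be re-tagged by hand.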

Let’s take a cue from Amazon.  They sell items in categories.  And items can have multiple categories.  They also assign author (artist) and other common attributes.  However, they only offer one “category” structure – a “genre” list.

They also offer tagging.  If you look at what sort of tags get created, it becomes plain that there are three sorts of tags:

1. Spurious tags – tags that duplicate existing attributes and shouldn’t be there in the first place, e.g. a “Monty Python” tag on the movie “Monty Python and the Holy Grail”, which already has an artist of “Monty Python”, or tags that just don’t add practical value, like “Movies that Include the word ‘swallow’”.

2. Tags that ought to be part of a data structure that just doesn’t exist – e.g. “Graham Chapman” tag on “Holy Grail”, which really ought to be part of a structure called “Actors” but isn’t provided by Amazon.  Or “British Comedy” which really should have been a subset of the “Comedy” genre, but Amazon didn’t provide this option.  Or “Arthurian Legends” – a tag that could easily have been a category in the “Subject Matter” hierarchy.  Or “Party” – a tag that could well belong in a “Mood” category.

3. Tags that don’t seem like they should be part of a data structure YET, because not enough material has been tagged along that dimension for the dimension to be understood.  For example, your hypothetical post that you wanted to tag “suggestions” could easily have been a node in an “Article Type” category, along with “Ratings”, “Reviews”, “Comparisons”, and “Recipes”.
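The first sort is the only one that lends itself to a mechanical check.  As a rough sketch, assuming each item carries a simple dictionary of attributes, a tag could be flagged as spurious when it merely repeats an attribute value.  The function name and the case-insensitive matching rule are my own assumptions, and this says nothing about how Amazon handles its tags.

```python
def spurious_tags(attributes, tags):
    """Return the tags that just repeat one of the item's existing attribute values.

    `attributes` maps an attribute name (e.g. "artist") to its value.
    Matching ignores case and surrounding whitespace.
    """
    known_values = {value.strip().casefold() for value in attributes.values()}
    return [tag for tag in tags if tag.strip().casefold() in known_values]


holy_grail = {"artist": "Monty Python", "genre": "Comedy"}
tags = ["Monty Python", "Graham Chapman", "British Comedy", "comedy"]
duplicates = spurious_tags(holy_grail, tags)
# duplicates == ["Monty Python", "comedy"]
```

The second and third sorts (“Graham Chapman”, “British Comedy”) pass through untouched, which is the point: those are the tags worth looking at when deciding what structure is missing.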

Back in the olden days of doing KM in Lotus Notes, there was often great confusion about the difference between “Categories” and “Keywords”.  The reality is that they’re both the same thing, with differing amounts of structure.

IMHO the problem that “tags” have shown up to address is the same one that “keywords” showed up to address – the problem of only offering one category dimension.  People get stuck in the paradigm and can’t get out.  Consider this article covering the subject – it’s clear that the author views “tags” and “categories” not as KM abstractions in and of themselves, but as artifacts of the particular implementation in WordPress.  WordPress has a particular implementation of Categories that lends itself to a limited use – e.g. you can’t have too many, because the sidebar list will be too long.  Well, that’s an artifact of putting the whole thing on a sidebar all at once, and not a consideration of the KM implications at all.

Consider that most books just have an “index”, which is a “subject category” structure.  But some reference books have indices for many different dimensions.  Likewise, most blogs just give you the capability to build the single “subject category” structure.  But if we built just one iota of flexibility into the “category view” module, then you could (for example) have one category hierarchy that is short and highly topical and shown on the sidebar (like WordPress) as well as other hierarchies that are deep and multi-layered, but viewed through a larger display on a different page, and other views like the “Related Entries” views that mine the hierarchies and just return the most relevant entries.
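A “Related Entries” view of that kind could be as simple as counting shared category paths across all the hierarchies.  This is a minimal sketch rather than a real module: `related_entries` and the overlap-count scoring are assumptions for illustration, and a production version would likely weight hierarchies differently.

```python
def related_entries(entry_id, paths_by_entry, limit=5):
    """Rank other entries by how many category paths they share with `entry_id`.

    `paths_by_entry` maps an entry id to the set of category paths it is filed
    under, across every hierarchy (subject, article type, tags, ...).
    """
    own_paths = paths_by_entry.get(entry_id, set())
    scores = {}
    for other_id, paths in paths_by_entry.items():
        if other_id == entry_id:
            continue
        shared = len(own_paths & paths)
        if shared:
            scores[other_id] = shared
    # Highest overlap first; ties are broken by entry id so the order is stable.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [other_id for other_id, _ in ranked[:limit]]


paths_by_entry = {
    "a": {"Subject/KM", "Article Type/Reviews", "Tags/dnn"},
    "b": {"Subject/KM", "Tags/dnn"},
    "c": {"Article Type/Reviews"},
    "d": {"Subject/SEO"},
}
related = related_entries("a", paths_by_entry)
# related == ["b", "c"]
```

Entry “d” shares nothing with “a” and is left out, while “b” outranks “c” because it overlaps on two dimensions instead of one.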

Not only does this offer a lot of UI flexibility for sites with a lot of structured content, but also, given the way the topics, keywords, and URLs are all associated, it’s a real SEO boon.  You can build a very strong semantic map into the linked content.

Having said all that, I will return to the position that I suppose we’ll have to support both tags and categories, but perhaps we can do it in a more powerful, elegant way than just duplicating a bunch of middle-of-the-road functionality.