on exploration, introspection and creation

Archive for August, 2010

The theory of classifying things

Tuesday, August 3rd, 2010

One of the most valuable abilities a person can boast, in my view, is the ability to classify. It’s very closely related to the ability to think in terms of layers of abstraction, since categories are just abstractions on top of the objects being classified.

Most people are really bad at any kind of categorization: they either simply don’t do it (just look at people’s computer desktops) or come up with very poor categorizations, and as a result find it difficult to locate items in a large set or to synthesize properties of sets efficiently. Those are the two operations that a good classification makes trivial, and both are needed fairly often. There is also a lot of money to be made on good categorization systems, for example in systems that let customers search for products to purchase.

A friend of mine, W.D., pointed out that taxonomies are dangerous. I agree with him: creating a classification system for its own sake is not only wasteful, but also risks inaccurate generalizations. But a good classification, backed by a clear goal for that classification, is invaluable.

Some principles that should guide a good taxonomy are:

  • Unique representation: everything should have a single, deterministic place in the hierarchy
  • Meaningful dimensions: ideally you should be able to express each dimension (or category) in as few words as possible. Arbitrary divisions don’t make it easy to find things and make for a weak hierarchy, even if they allow you to bifurcate your set of objects right down the middle
  • Reasonably sized dimensions: in a perfect classification, each added binary property halves the number of items that remain. This will, of course, never be exactly true, but there are usually good ways to split a set into roughly equal-sized subsets. This balances the categorization, so it won’t take a large number of dimensions to describe an object. For a perfect classification, you only need 12 bits of information to classify about four thousand objects (2^12 = 4096), which with a good category system may mean three dimensions that each take one of sixteen values (16 × 16 × 16 = 4096)
  • Separable dimensions: ideally each dimension should be fully independent of all the others, so it shouldn’t matter whether you apply a condition first or last. Unfortunately, most of the time the later dimensions vary depending on the values of the earlier ones. For a good example, visit Amazon.com and see how the filters change based on which category of items you select. If the dimensions are separable, you can find things more efficiently by picking the most relevant dimension first
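
The arithmetic behind the last two principles can be checked with a small Python sketch. The dimension names and values below are made up purely for illustration (nothing above prescribes them). The sketch builds three hypothetical dimensions of sixteen values each, counts how many slots they address, and checks that with separable dimensions the order in which you apply filters doesn’t change the result.

```python
from itertools import product
from math import log2

# Hypothetical dimensions: the names and values are assumptions for illustration.
DIMENSIONS = {
    "material": [f"m{i}" for i in range(16)],
    "size": [f"s{i}" for i in range(16)],
    "color": [f"c{i}" for i in range(16)],
}


def address_count(dimensions):
    """Number of distinct places the classification can describe."""
    count = 1
    for values in dimensions.values():
        count *= len(values)
    return count


def bits_needed(dimensions):
    """Bits of information required to pick one value in every dimension."""
    return sum(log2(len(values)) for values in dimensions.values())


def filter_items(items, **conditions):
    """Keep only the items that match every given dimension=value condition."""
    return [
        item for item in items
        if all(item[name] == value for name, value in conditions.items())
    ]


# One item per combination of values, so each item has exactly one place.
ITEMS = [dict(zip(DIMENSIONS, combo)) for combo in product(*DIMENSIONS.values())]

# Apply the same two conditions in both orders.
material_first = filter_items(filter_items(ITEMS, material="m3"), size="s7")
size_first = filter_items(filter_items(ITEMS, size="s7"), material="m3")

print(address_count(DIMENSIONS))     # 4096
print(bits_needed(DIMENSIONS))       # 12.0
print(material_first == size_first)  # True
```

Each filter here cuts the set by a factor of sixteen (4096, then 256, then 16), which is the balanced case. In a real catalog the dimensions are rarely this even or this independent, so treat the numbers as the ideal rather than the norm.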

Concept Naming

Monday, August 2nd, 2010

Naming some fairly natural (and intuitively understood) concepts after people seems to be a distinctly American idea. I had never used the term Venn diagram (although I had drawn two overlapping circles to illustrate a point countless times) before I came to the U.S. Similarly, I was shocked to hear that the idea of illustrating the degree of fulfillment with partially filled circles is actually credited to a dude who seems to have been the first to popularize (point out the obvious?) this particular visualization method.

How we Add Value

Monday, August 2nd, 2010

I’m becoming increasingly convinced that where one adds the most value in a workplace is not in the set of skills one possesses, but in the ability to make reasonable decisions when faced with imperfect information, which really comes down to three things:

  • An ability to see the possible outcomes (the ability to visualize; something you might call creativity)
  • An ability to enumerate the value and the possible risks of each
  • An ability to evaluate these trade-offs

In other words, all there is to a responsibility is the ability to make decisions, and each decision is simply the output of an evaluation function over all the pros and cons of all the possibilities.
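
To make the "evaluation function" idea concrete, here is a minimal Python sketch. The post doesn’t say how the pros and cons should be weighed, so the scoring rule below (a simple expected value) is my own assumption, and the options and numbers are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    name: str
    value: float      # estimated benefit if things go well (assumed number)
    risk: float       # estimated cost if things go badly (assumed number)
    p_failure: float  # guessed probability of the bad case


def evaluate(outcome):
    """Weigh the upside against the downside as a simple expected value."""
    return (1 - outcome.p_failure) * outcome.value - outcome.p_failure * outcome.risk


def decide(outcomes):
    """A decision is the option with the best evaluation."""
    return max(outcomes, key=evaluate)


# Step 1: see the possible outcomes. Step 2: enumerate value and risk for each.
options = [
    Outcome("ship now", value=10.0, risk=20.0, p_failure=0.5),
    Outcome("ship after testing", value=8.0, risk=20.0, p_failure=0.1),
    Outcome("do nothing", value=0.0, risk=0.0, p_failure=0.0),
]

# Step 3: evaluate the trade-offs and pick one.
print(decide(options).name)  # ship after testing
```

The hard part in practice is not the `max` call but the first two steps: seeing the options at all and putting honest numbers on them with imperfect information.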