home
on exploration, introspection and creation

Archive for the ‘paramathematics’ Category

Turn signals: how yours never blink with the same frequency as the guy’s in front of you

Friday, October 9th, 2009

If you’re a driver, you no doubt spend a nontrivial amount of time waiting at an intersection, another car in front of you, you both wanting to turn left. You probably noticed that the turn signal of the car in front of you doesn’t blink with quite the same frequency as one in your car.

In fact, I have a strong suspicion that no two cars have the same frequency–at least that’s what it seems to me since I can never find turn signals to be in phase.

And so here I am, waiting for the light to turn green, with two blinkers flashing with different frequencies. I don’t like to do nothing, so I often figure out what this difference in frequencies is. It’s not as hard as it may seem–and it involves no measuring devices! It’s a pretty cool trick that takes advantage of the fact that while it’s hard to measure or compare quantities (such as speed, frequency), it’s relatively easy to detect synchronicity. First, figure out which blinker is faster. Then wait for both blinkers to be momentarily synchronized (i.e. for both to flash at the same time). Count how many times the faster blinker flashed before both are synchronized again (make sure you don’t “skip” a cycle). If the faster one blinked n times, and you captured the cycle correctly, the slower one blinked n-1 times so the faster one is n/(n-1) times faster. I like to go a step further and memorize what fractions of the form n/(n-1) come out to be as percentages to impress people with some percentage estimates while sitting in the car, with no calculator. For example, if the slow one blinks 9 times and the fast one blinks 10 times, the fast one is 11% faster.

This trick works for car blinkers in part because most cars’ blinkers flash with similar but not the same frequency (if the ratio of frequencies is fairly large, you will find it hard to not skip steps–that is, the blinkers will not synchronize fast enough). I also like to go to the gym, get on the treadmill and figure out how fast the person next to me is running by observing the synchronicity of the markings on the treadmill (treadmills have their brand names displayed on the belts)–the trick works because most people run at similar speeds (between 6.5 and 8.5 mph) and, moreover, because people tend to run at quantized speeds, I am often able to figure out the speed precisely.

What is a cause?

Wednesday, October 7th, 2009

As I re-watched fragments of Benjamin Button recently (a movie that I’m not particularly fond of, I’m afraid), I recalled a scene in which a particular incident was shown to be preceded by a series of events that took place that ultimately led to it. If at least one of these events had not happened, the narrator argued, the incident could have been avoided. This got me thinking about causes and what it really means to call something a “cause” of some event (more narrowly, how one assigns “blame” to things, or gives credit to things).

As we can easily predict, there is often no single cause of an event. WARNING: spoilers abound! Even though the driver, who struck Daisy, seems like he was the cause, if many other things hadn’t happened, Daisy would have been fine; shouldn’t those other things also be considered causes? How far back do we go? How do we attribute causality–it seems to us, viscerally, that some factors were more instrumental to the incident than others, but how do we quantify that? (For example, the invention of the automobile was among the causes but was it as significant as the fact that the driver didn’t pay attention to what was in front of him?).

This is a difficult question and so let’s start small. The universe is massive, with so much happening in such short periods of time that it’s easy to get confused. Let’s simplify the question as much as we can (it’s still going to be pretty complicated so bear with me).

I’m going to assume that the universe consists of a series of discrete decisions that lead to some event. The decisions are binary (and let’s assume that the choices are 0 and 1–whatever these numbers may stand for. For example, 0 may mean “forget the jacket” and 1 may mean “take the jacket”) and could stand for anything that has two distinct outcomes, so I’m using a broad definition of the word “decision” here. We will try to come up with some framework that allows us to talk about these decisions–for example, determine which decisions contributed to the event more than others. First, some caveats:

  • We will not deal with intent, that is, we will not be interested in what a particular decision intended to achieve and instead look at what it ended up achieving. This means we’ll be looking at cause in a way which may be different from the way the law looks at cause.
  • We will deal with decisions which are not biased in any way, i.e. behind each decision there is an agent who is either picking 0 or 1, and is not subject to chance. Hence, a “decision” to rain in Sahara is not a valid one since it’s not particularly likely to rain in Sahara.
  • We can model decisions which are biased, or decisions which are not binary, as a sequence of binary decisions, so we can proceed without loss of generality (I never thought I’d use this phrase again…)
  • We’ll assume that all decisions are discrete, that we can determine the consequence of every decision and that we have captured them all

Moreover, in order to talk about causality at all, we have to define equivalence of decisions and events properly. We’ll be dealing with parallel universes (i.e. universes in which a particular decision had been made differently) and, strictly speaking, events across universes are never the same (because the universe is slightly different if a decision is made differently, even if it seems that a particular event was not at all affected). We will identify an equivalence class of events (and equivalence classes of decisions) — buckets, where we put many similar events and treat them as the same, as the event.

For example, consider the event “Daisy breaks her leg as a result of a car accident”. If, in a parallel universe, some decisions are made differently, and Daisy breaks her other leg, the event is now different, but we still care about it (for all intents and purposes, she’s broken a leg). So we’ll put the two events in the same equivalence class. Similarly, the driver’s decision to not pay attention should be treated as the same decision regardless of the decision to forget the jacket, so we need equivalence classes of decisions.

Let’s work with an example. Consider three decisions d1, d2, d3, that led to some event E. We can represent this as a tree of decisions:

The Partial Decision Tree

The Partial Decision Tree

The graph represents the current universe — the three decisions were made, and the event E happened. Now let’s introduce parallel universes — let’s determine what would have happened if either of the decisions had been made differently. Suppose that if d3 was 0, E could still have happened if two other decisions d4 and d5 were 1. If d2 was 0, there is no way for E to happen. If d1 was 0, E could happen if d3 was 1 and a new decision d6 was 1.

This is, of course, arbitrary — in the real world it may be difficult if not impossible to reason about what would have happened. But there again, I never made any claims about the practicality of this framework…

Note that in our example above decision d3 is reused: it appears in the parallel universe as well as in the actual one. This is fine — a decision may appear in many universes (especially if decisions are unlikely to influence one another, for example, because they happen far away from each other). This is also why we need equivalence classes of decisions (so that we can talk about similar decisions rather than treating every decision as a unique one). Those “reused” decisions will be tricky to analyze: on one hand, they are the same decision, so we could talk about it in the abstract, regardless of what our universe looks like (i.e. regardless of what decisions have actually been made); on the other hand, by the time a decision needs to be made, other branches of the tree (that may include the same decision!) can be pruned. We’ll solve this problem shortly.

We can complete the decision tree:

A Full Decision Tree

A Full Decision Tree

Now we can look at all combinations of different decisions and see if E occurred or not (x means “any value”):


1 2 3 4 5 6   E?  # distinct combinations
0 x 0 x x x   0   16
0 x 1 x x 0   0   8
0 x 1 x x 1   1   8
1 0 x x x x   0   16
1 1 0 0 x x   0   4
1 1 0 1 0 x   0   2
1 1 0 1 1 x   1   2
1 1 1 x x x   1   8


For example, if d1 was 0, d3 was 1 and d6 was 1, the event happened regardless of the values of the other three decisions (and there are 8 such combinations).

Now we can compare the decisions and see which one caused E the most. For each decision, determine for how many combinations E happened, and for how many it didn’t (if a decision didn’t matter for the outcome, for example d2 in the case when d1 is 0, we can exclude the scenario from our calculations). The difference between these two numbers is the extent to which that decision caused E. For example, if d1=0, E is caused in 8 out of 32 combinations. If d1=1, E is caused in 10 out of 32 combinations. So regardless of d1, E is caused in at least 8 out of 32 combinations (so that much wasn’t caused by d1). However, the remainder — 2 out of 32 combinations — were caused directly by d1 so the causality of d1 is 2/32 = 1/16.

Similarly we can compute the causality for the other decisions:

  • d2: if equal to 0, causes E in 0 combinations; if equal to 1, causes E in 10 out of 16 combinations. So its causality is 5/8
  • d4: causes 0 if equal to 0; 2 out of 4 if equal to 1. Its causality is 1/2
  • d5: causes 0 if equal to 0; and all (2 out of 2) if equal to 1. Its causality is 1
  • d6: causes 0 if equal to 0; and all if equal to 1. Its causality is 1

Let’s now look at the tricky case — d3. It appears in two branches of the tree. The answer to how much it caused E will depend on how much the agent making the decision knows about the decisions made up until now (specifically, d1 and possibly d2). In other words, if the agent knows which branch of the tree he is in, he’s causing E to a different degree than if he had no such information.

  • If the agent has no information, we need to look at all combinations. If d3 is equal to 0, E is caused in 2 combinations out of 24; if it’s equal to 1, in 16 out of 24. Its causality is 7/12
  • If the agent has perfect information, we need to consider which branch of the tree he’s in.
    • If he’s in the right branch of the tree: if the decision is equal to 1, E is always caused. Otherwise, E is caused in 1 out of 4 combinations. The causality is 3/4.
    • If he’s in the left branch of the tree: if the decision is equal to 1, E is caused in 1 out of 2 combinations. Otherwise, it’s never caused. The causality is 1/2.

    Hence d5 and d6 cause E the most (which makes sense: since they are at the bottom of the graph, they have full control over whether E happens or not). d1 causes E the least. However, it’s not always true that the higher up the tree a decision is, the less it contributes to the event: d2 contributes more than d3 (with no information).

Crowdsourcing Art, Part Two

Thursday, September 24th, 2009

Previously I introduced the idea of a crowdsourced graphic that I would have my friends generate. The experiment is still happening, and I encourage everyone who got a token to participate (so far very few people actually submitted their pixels). If you didn’t get a token, but are subscribing to/occasionally reading/just stumbled on this blog, let me know and I’ll send you one. Similarly, if you used up your tokens, let me know. I don’t want the lack of pixels (or the fear to lose them) to be the reason why people don’t contribute.

The work so far is fun to look at; undisputedly predictable was the penis that found its way on the canvas a few hours after the game started, but there have been efforts to turn it into a happy face (I must admit, I was tempted to use up some of my tokens but in the end I decided this piece of art is a no judgment art and anything goes).

An interesting pattern was that people used the background image and the fact that the canvas was translucent to trace elements of the image onto the canvas. I did not expect that, but I guess that’s an as good place to start as any.

Finally, people are not taking advantage of the collaboration element of the game. I’ll let you in on a secret: the “bonus” you get for merging tokens is actually pretty substantial. In fact, the number of pixels you can set is equal to the “size” of the token (the first three digits) raised to the power 1.25. This may not seem much (it’s close to 1, after all), but consider merging two tokens, each of “size” 10. Each individual token allows you to set 101.25 = 18 pixels, so in total both tokens let you set 36 pixels. If you merge the tokens, you will get one token of size 20 and 201.25 = 42 (you’ve just gained 6 pixels). The differences are even larger for larger tokens (or more tokens). Two tokens of size 20 give you 84 pixels when separate and 101 when merged! You get the idea.

Finally in the spirit of full collaboration I’m making the source code available — I’m not opening up the interface (which contains secret information that allows me to generate and keep track of tokens and make sure people don’t cheat) but most of the implementation is in the helper file below.

Source:
crowdsource-helper.rb

The knowledge of the collective

Thursday, September 24th, 2009

Consider the strange cause-and-effect loop involving science fiction. An example I like to bring up is the multi-touch screen as featured in the movie Minority Report, the one where you can scroll through the information by performing sweeping motions with your hands. I was somewhat surprised how much attention this fictional gadget got. Everyone talked about the “Minority Report screen” (particularly as technology was developed that made such a screen possible); when Microsoft came up with its Surface product, it looked surprisingly like those screens in the movie. I think it’s safe to say that science fiction inspired a number of scientists, engineers, product managers to come up with a product that mimicked the fictional entity.

In fact, it’s not the first time this has happened, even if often it’s difficult to trace the elements of our culture that have previously been introduced in a work of fiction. Today’s robots are largely based on renderings by the early science fiction illustrators and movie directors. Cars dropped their box-like shape. In a way, one can say, science fiction predicted what was going to happen. But of course science fiction itself, just like fiction in general, has a power to motivate, inspire, and influence, so one can also say that it influenced what was going to happen (we made cars look what they look because the public was used to seeing images of “futuristic” cars already and why not introduce a product the public is already familiar with. I believe that things such as the Star Trek franchise have been engrained in our culture so much that space ships may very well be similar in structure to USS Enterprise; the doors may slide sideways; and we may be greeted by a warm yet machine-like female voice of an omnipresent “computer” (incidentally, voiced by Gene Roddenberry’s wife, if I remember correctly).

If that’s the case, we may say that we really came up with those inventions of the future already: when we depicted them in the science fiction movies. They haven’t materialized yet but it’s just a matter of time. The future, indeed, is now.

I’ll take one more leap here. These days, as ideas presented in works of fiction (particularly visual works such as movies and TV shows) get tested and engineered more and more, the public becomes to have a controlling stake in what is presented. A small “control” set of viewers, for example, is often asked to rate pilots of TV shows (and decide–not quite as dramatically as the Emperor with his thumb in the ancient Roman Gladiator games, but in a similar fashion–which TV shows see the light of day and which never do) or help decide on which one of the two endings a film should have. I’m going to say that over time, the public will decide more and more on the details of many works of fiction.

In a way, then, the public may end up collectively deciding what the future will look like. Reducing further, since the public opinion as such is timeless, even though the individual outcomes may vary over time, it seems that today, the public has all the information to know what an arbitrary object, time period or an outcome in the future will look like.

How does that work? Assume that we want to know the status of a future outcome A. This outcome will be influenced by a series of events, and (and this is a big assumption) assume that we can trace this influence to some outcome B in a particular work of fiction, say, a TV show. The producers will want to gauge public opinion on B to maximize the chance of success of the show. Based on how the public reacts to different stimuli, B will be chosen. The value of B which causes A to materialize is therefore a result of a particular class of reactions of the public (it’s not necessarily the case that everyone has to have the same reaction; it’s also possible that many different sets of reactions influence the value of B). We can repeat this experiment many times, each time constructing outcomes based on the set of reactions, and having these outcomes influence some future outcomes.

The public, therefore, today has all the knowledge of the world at any time in the future. This knowledge pertains to a large number of outcomes, so we can compress it to some linear combination of the public’s reactions (we can reconstruct any outcome A by applying this combination to an outcome B and “playing the movie” to see how A would be influenced) the same way we compress a curve that passes through n specific points with a linear combination of monomials. This linear combination consists of a large number of coefficients (just like the curve is actually a polynomial with some coefficients) with which we can express the total future of the world. Hence, these coefficients are a de facto encoding of the world as we know it (now and into the future). Different sets of coefficients lead to different outcomes so they describe different worlds.

The worlds are therefore countably infinite.

A large group trying to go somewhere

Friday, September 18th, 2009

Very frequently I’ve been in a fairly large group of people (5 or more) and we were all trying to go somewhere, say, a movie theater. I noticed that the amount of time it took us to actually get going increased pretty rapidly with the size of the group. This fact by itself should be no surprise to anybody; I have a feeling, though, that the amount of time is superlinear with the number of people, and perhaps it’s even superpolynomial. Let’s see if we can derive this relationship. We’ll make some simplifying assumptions but the gist of the problem should be captured.

Suppose you have n people in a group. Every person is mostly ready to leave, with the exception of a small number of tasks the person has to do (or can do, given enough time) — you know, the “If we’re not going to leave in the next five minutes, I’m just going to quickly go to the restroom” sort. Assume that each person experiences some distribution of such “events” which derail the effort of leaving. The duration of such events is also a random variable. The group can’t leave if at least one of its members is currently occupied with an event. Let’s say that the probability at any given time that a person is not occupied with an event is p (p, therefore, is the measure of “readiness”; of course, if p=0, the group will never leave; if p=1, the group will leave at time t=0).

Assume it takes n people time t to leave. Now add a new person to the group. The group will leave at the expected time t only if the new person happens to be free at that time (with probability p). If not, the group will have to wait; assuming the events are independent, this will take another time t (at the end of which the new person may or may not be free). The expected time is therefore


tp + 2t(1-p)*p + 3t(1-p)(1-p)*p + 4t(1-p)(1-p)*p + … = tp(1 + 2(1-p) + 3(1-p)2) + …)

Let


S=1+2a+3a2+4a3+…

Then

S = (1+a+a2+a3+…) + (a + 2a2 + 3a3+…) = 1/(1-a) + aS

Hence

S = 1/(1-a)2

So the above becomes

tp*1/(1-(1-p))2 = tp/p2 = t/p

Suppose p=0.5. The expected time is 2t: an additional person doubles the amount of time it takes for the group to leave. The amount of time is therefore not only superlinear in the number of people, it’s actually exponential!

What is and what is not meta

Tuesday, July 14th, 2009

There was a period in my life that was clearly marked by my exuberant fascination with everything meta. Now I treat it as a remarkable but not indispensable concept. I remembered it after a post in Coding Horror with some (really unjustified) hate for all things meta.

The concept is too simple to make an entire blog themed after it, I think, so the blog died but I’d like to cite one post I submitted there that explains my view on various ideas in the space and disambiguates them from meta proper:

From metayada (specifically, this post):

meta

One can think of meta as a function that takes an object and returns an object that describes the input object using the technique characeristic of the object itself. MP3 metadata, for example, is data about the mp3 file, but not the audio itself (for example, artist information, album cover picture, etc.). It’s meta because, just like the mp3 file is described by data (sets of bits), the metadata is described by the same means.

In real-life meta is shorthand for a situation in which instead of the object, the focus is on the description of the object that shares the same characteristics as the object itself. For example, suppose we are documenting the goals of our projects. One project’s goal could be to understand the relationships between customers’ preferences and their purchasing power. Another project’s goal could be to determine all players in the currency exchange market. If we step back, we might want to determine what’s the goal of us documenting all those goals. This is a meta goal. It doesn’t refer to the object itself (documenting the goals) but it describes it, and uses the same technique (determining the goal).

It is crucial to understand that sharing the characteristic of the underlying object is a necessary condition. In the example above, for example, we could do many things with the given collection of goals to describe it — we could determine their number, for example, or figure out that they all take just one sentence. But it’s the fact that we’re determining the goal of writing up those goals that makes the situation meta.

Not everything can be metafied, but I claim that once it’s been metafied, it can then be metafied ad infinitum.

self-reference

This is most often confused with meta. If something refers to an instance of itself (either a different instance or the same), then it’s self-referential. For example, if you’re playing an RPG game and your character instantiates an RPG game, then you’re using self-reference. Self-reference is often used to provide paradoxes: for example, take the following statement: “There are two misteaks in this sentence.” There is only one mistake in this sentence, but the fact that the sentence failed to provide an accurate count of the number of mistakes is a mistake in itself — so there are two mistakes in the sentence. Which means that the sentence only contains one mistake. But then… (ad inf.) The sentence is self-referential. Self-referential statements can cause infinite loops (but iterative, not recursive infinite loops), like in the example above.

recursion

Recursion is a special case of self-reference when the object invokes a new instance of itself rather than refer to itself. The sentence in the above case is not recursive because it does not invoke a new instance. Similarly, a piece of code that does recursion is itself not recursive: it’s, however, still self-referential. An example of a recursive statement is “n factorial is n times larger than n-1 factorial”. This is recursive: to determine what 5 factorial is, you have to instantiate this sentence multiple times. In the mistakes example above, you don’t need to instantiate multiple copies of the sentence.

reflection

Reflection is the ability to inspect one’s own definition. Reflection is a special case of meta.

reflexivity

An operation is reflexive if it operates on itself. This is not really self-referential (is the statement “my dog is cleaning himself” self-referential?).

self-relativity

It seems to be that self-relativity is the ability to relate to oneself. A simplest example of self-relativity is the word “I”. If something is self-relative, it is capable of producing self-referential statements.

deep

A lot of things in the world are just deep. They’re not meta or self-referential. We like talking about them because they seem complex. That doesn’t make them any less sophisticated or interesting, but a distinction must be made. An example of something that’s not really meta or self-referential is the barber paradox:

“There’s a barber in the town that shaves everyone who doesn’t shave himself. Who shaves the barber?”. This is paradoxical because if the barber shaves himself, then by the first statement, the barber cannot does not shave himself. If the barber doesn’t shave himself, then by the first statement he shaves himself.

Actually, this is not really a paradox — there is a flaw in how the first statement is constructed. Let S(i) be a function that assigns to every person i a person that shaves them. Then S(i)=b ^ S(i)=i encodes the first statement. It is meant to be a tautology (since the statement describes something about this hypothetical world that’s supposed to be true for everyone in the town). But if we substitute b for i, we get S(b)=b ^ S(b)=b which is false. Hence the first statement cannot be a description of the world. In other words, it is impossible for the barber to shave everyone who doesn’t shave himself. It’s a little like saying, “The square root of every integer is an integer. What is the square root of 2?” and claiming it’s a paradox. In actuality, the first statement is simply false.

In any case, this is not really self-referential; it’s just deep. The fact that the barber shaves himself isn’t self-referential, but “shaving” becomes reflexive when it’s applied to the barber.

I wish humans had 12 fingers

Sunday, June 7th, 2009

We count in base 10 because we have 10 fingers. 10 is not a particularly good number to use every day: it divides only into 2 and 5, so many everyday operations (such as splitting something into a small number of equal parts) are just painful with a decimal numbering system. Of the remaining divisors — 3, 4, 6, 7, 8 and 9 — all but two yield nasty infinite decimal expansions (because 3, 6, 7 and 9 factor into primes that don’t appear in the prime factorization of 10), 3 being the most painful one since it appears in nature so much.

I wish we had 12 fingers. 12 is a number that is far superior to the number 10. It divides evenly into 2, 3, 4 and 6 parts. Dividing into 5, 7, 10 and 11 still produces infinite expansions, but who needs to divide into 5 in nature anyway.

K.R., a friend of mine, noticed that if the additional two fingers were opposing, like our thumbs, our hands would be infinitely more agile. Just think about the sorts of manipulation (and hence the sorts of tools) we would be capable of with two thumbs at each hand!