<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>blog.elevenseconds &#187; how i&#8217;d explain this to my mom</title>
	<atom:link href="http://blog.elevenseconds.com/category/how-id-explain-this-to-my-mom/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.elevenseconds.com</link>
	<description>on exploration, introspection and creation</description>
	<lastBuildDate>Sat, 21 Jan 2012 20:36:59 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>The greatest moment in my lifetime of interactions with computers</title>
		<link>http://blog.elevenseconds.com/the-greatest-moment-in-my-computer-interactions/</link>
		<comments>http://blog.elevenseconds.com/the-greatest-moment-in-my-computer-interactions/#comments</comments>
		<pubDate>Wed, 25 Nov 2009 04:03:43 +0000</pubDate>
		<dc:creator>me</dc:creator>
				<category><![CDATA[discovery]]></category>
		<category><![CDATA[how i'd explain this to my mom]]></category>
		<category><![CDATA[reductions]]></category>
		<category><![CDATA[retrospective]]></category>
		<category><![CDATA[technology]]></category>

		<guid isPermaLink="false">http://blog.elevenseconds.com/?p=916</guid>
		<description><![CDATA[For those who prefer brevity to beauty, here it is: the day I discovered DOS/4GW. For everyone else, read on. Thanks to my dad&#8217;s wonderful prescience, I grew up with computers&#8211;and by grew up, I mean grew up. I think the first computer in the house was a Commodore (although for some reason I imputed [...]]]></description>
			<content:encoded><![CDATA[<p>For those who prefer brevity to beauty, here it is: the day I discovered DOS/4GW.  For everyone else, read on.</p>
<p>Thanks to my dad&#8217;s wonderful prescience, I grew up with computers&#8211;and by grew up, I mean grew up.  I think the first computer in the house was a <a href="http://en.wikipedia.org/wiki/Commodore_64">Commodore</a> (although for some reason I imputed a memory of my dad owning an <a href="http://en.wikipedia.org/wiki/Amstrad_CPC">Amstrad-Schneider</a>).  I was too young to remember much other than sitting on my dad&#8217;s lap and staring at the computer screen.  Back in the day it was impossible (and I believe also illegal) to own &#8220;imperialist&#8221; equipment (I know one needed the governor&#8217;s permission to get a car) and so I am deeply impressed by my dad&#8217;s ability to do magic.</p>
<p>When I was six or seven (again, I don&#8217;t remember exactly), I got my very own <a href="http://en.wikipedia.org/wiki/Zx_spectrum#ZX_Spectrum.2B">ZX Spectrum +</a> with a black-and-white monitor.  I remember playing with it for hours (the fact that those years spent staring at a CRT screen haven&#8217;t made me blind confirms my theory that if you grow up with something you get used to it and it doesn&#8217;t harm you; ironically many of my friends who got their game consoles or computers when they were teenagers are nearly legally blind now).</p>
<p>Pretty quickly I started writing programs for it.  Looking back, this was crazy&#8211;the user manual was in English, and so at age seven I knew about eighty English words very well (the keywords used in the programs I&#8217;d write) but none of the grammar or vocabulary that native English speakers my age knew (a fascinating way to learn the language).</p>
<p>I don&#8217;t want to digress too much from the theme of this post; one day I&#8217;ll continue the rant.  The important thing was that at the time the computer had 48k of memory available (an equivalent to the amount of information contained in the text of the Constitution of the United States of America; to put things in perspective, computers come with 2GB now, which is about 40 thousand times more).  I didn&#8217;t feel I needed much, though, because the capabilities of the computer were limited and my brain was pretty small, too.  Over time, however, I did start bumping into these limits more and more.  I could define my own <a href="http://www.worldofspectrum.org/ZXBasicManual/chap16diag1.gif">characters</a> and sprites (for which I had to learn to operate in <a href="http://en.wikipedia.org/wiki/Binary_numeral_system">binary</a>; knowing binary before I knew how to divide is a funny thing, now that I think about it) but I could define at most 256 of them.</p>
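<p>To give a feel for what &#8220;operating in binary&#8221; meant: a character on the Spectrum was an 8x8 grid of pixels, and each row of the grid was stored as one byte.  The sketch below is in Python rather than Spectrum BASIC, and the face pattern and the function names are made up for illustration.</p>

```python
# A made-up 8x8 character: "1" is a lit pixel, "0" is a dark one.
GLYPH = [
    "00111100",
    "01000010",
    "10100101",
    "10000001",
    "10100101",
    "10011001",
    "01000010",
    "00111100",
]


def rows_to_bytes(rows: list[str]) -> list[int]:
    """Read each row of pixels as a binary number, one byte per row."""
    return [int(row, 2) for row in rows]


def render(rows: list[str]) -> str:
    """Draw the character with '#' for lit pixels and '.' for dark ones."""
    return "\n".join(row.replace("1", "#").replace("0", ".") for row in rows)


if __name__ == "__main__":
    print(render(GLYPH))
    print(rows_to_bytes(GLYPH))
```

<p>The eight numbers that come out are what had to be typed in to define the character, which is why knowing binary came before knowing how to divide.</p>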
<p>Fast forward six years, to my first PC (getting close to that ominous <em>greatest moment</em>).  Eternally curious about how to <em>make</em> games rather than <em>play</em> them, I kept programming.  My brain, now more fully developed, could process more information, and I expected my programs to do the same.  The PC was also capable of much more (I think I had 16MB of memory at this point?) yet due to the hardware limitations I could use at most 640kB of it.  You may think that a little more than 650 thousand bytes should suffice but that&#8217;s how much information is contained in a single uncompressed <a href="http://pictopia.com/pub2/images/photo-tulips.png">photo</a>!  I felt very constrained&#8211;my computer could play sound, display graphics, and perform calculations much faster than anything I&#8217;d seen before.  Yet I couldn&#8217;t take advantage of any of it.  There seemed to be some kind of a rule that stipulated those omnipresent limits&#8211;party poopers.  There seemed no way around it.</p>
<p>Then, one day, I discovered this utility called DOS/4GW.  It wasn&#8217;t a discovery as much as a result of an investigation: I&#8217;d see more and more advanced games pop up that <em>surely</em> couldn&#8217;t have been subject to the 640k limit (they combined music with graphics and seemed to store a lot of information about the virtual worlds they were depicting&#8230; my intuition told me that must have been more than the measly 640k).  All these games would launch a small application before they themselves started, and that application would simply pop &#8220;DOS/4GW&#8221; on the screen and disappear.  So I started digging.</p>
<p>I found out (at that point I didn&#8217;t have any access to the Internet&#8211;searching for information was so incredibly painful back then) that this little utility allowed the game programmers to bypass the 640k limit, effectively taking advantage of all the memory that the computer had available.  And to my shock it turned out that <em>I</em> could also take advantage of this application when making my programs.</p>
<p>This lifting of the limit was, in retrospect, the single most eventful day in my entire life of interaction with computers.  For one, the limit was <em>immediately</em> increased from 640k to 16M (a 25-fold increase!).  But, more importantly, it was a <em>soft</em> limit: I could simply buy more memory and have more available to my programs.  I felt empowered<sup>*</sup>.  I went crazy&#8211;there were finally no limits to my creativity.</p>
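<p>The ratios quoted in this post can be checked with a few lines of Python.  This is only a sanity check and assumes the usual binary units (1k = 1024 bytes); the helper name is made up.</p>

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB


def times_more(new_size: int, old_size: int) -> float:
    """How many times larger new_size is than old_size."""
    return new_size / old_size


if __name__ == "__main__":
    # Lifting the 640k limit on a 16M machine.
    print(times_more(16 * MB, 640 * KB))
    # A 2GB computer compared with the 48k Spectrum.
    print(round(times_more(2 * GB, 48 * KB)))
```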
<hr />
<p>* What I didn&#8217;t know was that DOS/4GW didn&#8217;t abolish the limit altogether; it simply upped it to 4GB.  But given that most of us don&#8217;t have that much memory even today, that fact wouldn&#8217;t have registered with me as something fundamental then.  Now it&#8217;s part of the sad reality about modern computing&#8211;it&#8217;s all painfully finite.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.elevenseconds.com/the-greatest-moment-in-my-computer-interactions/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>What is an &#8220;amp;&#8221; [sic] doing on this taxicab&#8217;s newsfeed?</title>
		<link>http://blog.elevenseconds.com/what-is-an-amp-sic-doing-on-this-taxicabs-newsfeed/</link>
		<comments>http://blog.elevenseconds.com/what-is-an-amp-sic-doing-on-this-taxicabs-newsfeed/#comments</comments>
		<pubDate>Wed, 30 Sep 2009 01:39:38 +0000</pubDate>
		<dc:creator>me</dc:creator>
				<category><![CDATA[The Daily Badness]]></category>
		<category><![CDATA[how i'd explain this to my mom]]></category>
		<category><![CDATA[technology]]></category>

		<guid isPermaLink="false">http://blog.elevenseconds.com/?p=436</guid>
		<description><![CDATA[Look very closely at the newsfeed at the bottom of screens installed at the back of some New York City taxicabs. If the newsfeed includes an ampersand, instead of it you will see a mysterious amp; (so, for example, &#8220;Crate &#038; Barrel reports quarterly loss&#8221; turns into &#8220;Crate amp; Barrel reports quarterly loss&#8221;). The first [...]]]></description>
			<content:encoded><![CDATA[<p>Look very closely at the newsfeed at the bottom of screens installed at the back of some New York City taxicabs.  If the newsfeed includes an ampersand, instead of it you will see a mysterious <tt>amp;</tt> (so, for example, &#8220;Crate &#038; Barrel reports quarterly loss&#8221; turns into &#8220;Crate amp; Barrel reports quarterly loss&#8221;).  The first few times this didn&#8217;t even reach my threshold of detail &#8212; I&#8217;m so used to seeing bugs that are a result of common programming mistakes that this error made a lot of sense to me (which wasn&#8217;t at all an excuse &#8212; in fact, I&#8217;m surprised that this defect was allowed to exist for so long!).  But then I stepped back and realized that this bug that I&#8217;m taking for granted is a fascinating case of how prevalent various technologies are, and how likely one technology is to build on another.  If I were to explain the cause of this error to my mom, I thought to myself, I would find it very difficult (I feared that there was so much context that I just assume is broadly known that the exercise would turn into telling the history of computing).</p>
<p>I like to do difficult things so I decided to try (explaining this to my mom):</p>
<p>1. This is a <em>bug</em> &#8212; which is just a term for an error that was caused by a mistake in the programming of the device</p>
<p>2. It happens whenever the newsfeed needs to display an ampersand &#8212; instead, that ampersand gets replaced with the word <tt>amp</tt> followed immediately by a semicolon</p>
<p>3. The device needs to somehow know what it is supposed to display.  I don&#8217;t know for sure the way it is done but programmers like to reuse parts of code, and standards, and conventions that have been widely accepted, so I have an idea for how this works</p>
<p>4. The device is connected to some kind of a component whose task it is to receive information from the outside (after all, the newsfeed is regularly updated with fresh news) and pass it on to the device so it can be ultimately displayed on screen</p>
<p>5. It doesn&#8217;t matter what that component is &#8212; it may be a radio receiver which receives all the information periodically from a radio transmitter somewhere in the City (just like the RDS on the radio which allows you to see what band is currently playing), or a 3G receiver which receives the information from a cell tower (just like your cell phone is able to receive email), or maybe the cab driver is plugging the device in to some kind of box when he parks the cab at the end of his shift, and the data is transferred then</p>
<p>6. In any case, the data is transmitted to the device.  There are many ways to transmit the data &#8212; remember that the transmission is digital which means that there has to be some code for each character of the newsfeed.  This is similar to Morse code.  A convention that is used a lot is to use a code called ASCII which assigns every character a combination of seven bits (usually stored in an eight-bit byte) &#8212; each bit is either a zero or a one.  I&#8217;m guessing this data is transmitted this way.  I&#8217;m guessing this because I haven&#8217;t seen anything more complicated in the newsfeed than regular text &#8212; if I saw little icons, or Greek characters, I would have to go for a more complicated code (ASCII lets you encode at most 128 different characters)</p>
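<p>As a rough illustration of what &#8220;a code for each character&#8221; looks like, here is a minimal Python sketch that prints the bit pattern of each character of a headline.  It assumes plain ASCII text, with each code padded out to a full eight-bit byte; how the taxi device actually encodes its data is unknown, and the function name is made up.</p>

```python
def to_bits(text: str) -> list[str]:
    """Return the bit pattern of each character, one eight-bit byte per character.

    Raises UnicodeEncodeError if the text contains anything outside ASCII,
    such as Greek letters.
    """
    return [format(byte, "08b") for byte in text.encode("ascii")]


if __name__ == "__main__":
    headline = "Crate & Barrel"
    for char, bits in zip(headline, to_bits(headline)):
        print(repr(char), bits)
```

<p>Every pattern starts with a zero: the standard ASCII codes fit in seven bits, and the eighth is padding.</p>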
<p>7. But the newsfeed is not the only thing that gets transmitted.  Weather information gets transmitted (and displayed as a number &#8212; the temperature &#8212; and an icon &#8212; rain, snow, etc.); those annoying commercials must also be transmitted the same way.  The latter is a lot more information than the newsfeed, but the fundamental way of transmission is the same</p>
<p>8. Because there is more than one different piece of information to transmit, there has to be some way to organize this information.  ASCII has no built-in way to do this because all it does is encode individual characters.  For example, if the device received just a &#8220;stream of consciousness&#8221; information like this:</p>
<blockquote><p>Crate &#038; Barrel reports quarterly loss.  President Obama arrived in Denmark.  Monday 67 degrees sunny.  Tuesday 70 degrees heavy rain.</p></blockquote>
<p>it would be difficult for the device to decipher what belongs to the newsfeed, and what belongs to the weather forecast.  Within each group, there are structures as well (the newsfeed has individual ticker items; each day of forecast contains the day of the week, the temperature, and the weather conditions).  So programmers use another convention called XML which allows them to organize information in a way that computer programs can read.  XML allows you to surround text with special words enclosed in brackets which are interpreted specially.  There are a few rules that stuck (for example, to use angle brackets, and to put a slash in front of the word at the end).  So, for example, the above transmission would look like this:</p>
<blockquote><p>
&lt;news><br />
&nbsp;&nbsp;&lt;item>Crate &#038; Barrel reports quarterly loss.&lt;/item><br />
&nbsp;&nbsp;&lt;item>President Obama arrived in Denmark.&lt;/item><br />
&lt;/news><br />
&lt;weather><br />
&nbsp;&nbsp;&lt;item><br />
&nbsp;&nbsp;&nbsp;&nbsp;&lt;day>Monday&lt;/day><br />
&nbsp;&nbsp;&nbsp;&nbsp;&lt;temperature>67&lt;/temperature><br />
&nbsp;&nbsp;&nbsp;&nbsp;&lt;condition>sunny&lt;/condition><br />
&nbsp;&nbsp;&lt;/item><br />
&nbsp;&nbsp;&lt;item><br />
&nbsp;&nbsp;&nbsp;&nbsp;&lt;day>Tuesday&lt;/day><br />
&nbsp;&nbsp;&nbsp;&nbsp;&lt;temperature>70&lt;/temperature><br />
&nbsp;&nbsp;&nbsp;&nbsp;&lt;condition>heavy rain&lt;/condition><br />
&nbsp;&nbsp;&lt;/item><br />
&lt;/weather>
</p></blockquote>
<p>You will see that all information is there, it&#8217;s just highly structured.  A computer program can then ask for things like, &#8220;give me every item in the <tt>weather</tt> block, and for every item, piece together what&#8217;s in the <tt>day</tt> block, what&#8217;s in the <tt>temperature</tt> block (adding the word &#8220;degrees&#8221; to the end), and what&#8217;s in the <tt>condition</tt> block.&#8221;</p>
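<p>The request described above can be sketched in a few lines of Python using the standard library&#8217;s XML parser.  This is only an illustration of the idea, not the code running in the taxi; the function name is made up, and only the <tt>weather</tt> block of the example is used.</p>

```python
import xml.etree.ElementTree as ET

WEATHER = """
<weather>
  <item>
    <day>Monday</day>
    <temperature>67</temperature>
    <condition>sunny</condition>
  </item>
  <item>
    <day>Tuesday</day>
    <temperature>70</temperature>
    <condition>heavy rain</condition>
  </item>
</weather>
"""


def forecast_lines(xml_text: str) -> list[str]:
    """Piece together day, temperature and condition for every weather item."""
    weather = ET.fromstring(xml_text)
    lines = []
    for item in weather.findall("item"):
        day = item.findtext("day")
        temperature = item.findtext("temperature")
        condition = item.findtext("condition")
        lines.append(f"{day} {temperature} degrees {condition}")
    return lines


if __name__ == "__main__":
    for line in forecast_lines(WEATHER):
        print(line)
```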
<p>9. XML has some limitations.  For example, an opening angle bracket cannot be used anywhere in the text because the program will think that it&#8217;s the beginning of a special block and probably not allow this transmission:</p>
<blockquote><p>
&lt;item>To write that 2+2&lt;5 will confuse the hell out of this program&lt;/item>
</p></blockquote>
<p>By &#8220;not allow&#8221; I mean, it&#8217;s possible that the part of the program doing the transmitting is expecting correct XML (that is, every opening angle bracket has a corresponding closing angle bracket, and so on).  Perhaps the part of the program doing the receiving is expecting correct XML.  Very likely both do (because programmers reuse code &#8212; somebody else wrote code for interpreting XML and they probably wrote it in a way that prevents common mistakes from happening).</p>
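<p>A standard XML parser does in fact refuse this item.  Here is a small Python check using the standard library; whether the taxi software uses a parser that behaves the same way is an assumption, and the function name is made up.</p>

```python
import xml.etree.ElementTree as ET

BAD_ITEM = "<item>To write that 2+2<5 will confuse the hell out of this program</item>"
GOOD_ITEM = "<item>President Obama arrived in Denmark.</item>"


def is_correct_xml(xml_text: str) -> bool:
    """Return True if a standard parser accepts the text, False if it rejects it."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(is_correct_xml(GOOD_ITEM))
    print(is_correct_xml(BAD_ITEM))
```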
<p>10. To get around this problem, if you want to display an opening angle bracket, you have to use a special code instead.  This code (again, by convention) is <tt>&amp;lt;</tt> (which stands for &#8220;less than&#8221;): the ampersand denotes the beginning of a special code, and the semicolon ends it (you need both; otherwise the word &#8220;altitude&#8221; would be rendered as &#8220;a&lt;itude&#8221;).  Similarly, <tt>&amp;gt;</tt> is the closing angle bracket (&#8220;greater than&#8221;).  So the above really needs to be written as</p>
<blockquote><p>
&lt;item>To write that 2+2&amp;lt;5 will confuse the hell out of this program&lt;/item>
</p></blockquote>
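<p>In practice nobody types these special codes by hand; a library function does the replacing.  The Python sketch below uses <tt>escape</tt> from the standard library on the way out and a standard parser on the way in.  The two helper names are made up for illustration.</p>

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

TEXT = "To write that 2+2<5 will confuse the hell out of this program"


def make_item(text: str) -> str:
    """Wrap text in an item block, replacing special characters with their codes."""
    return f"<item>{escape(text)}</item>"


def read_item(xml_text: str) -> str:
    """Let a standard parser turn the special codes back into the original text."""
    return ET.fromstring(xml_text).text or ""


if __name__ == "__main__":
    encoded = make_item(TEXT)
    print(encoded)
    print(read_item(encoded))
```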
<p>11. We&#8217;re almost there.  This is a cool way to solve one problem, but unfortunately it introduces another one: you can&#8217;t display an ampersand! (because the program will think that you mean a special character).  The way programmers solved this problem is to create a special code for ampersand itself &#8212; &amp;amp;.  So the news item that gets transferred to the device probably looks like this:</p>
<blockquote><p>
&lt;news><br />
&nbsp;&nbsp;&lt;item>Crate &amp;amp; Barrel reports quarterly loss.&lt;/item><br />
&lt;/news>
</p></blockquote>
<p>12. I&#8217;m pretty confident that so far I&#8217;ve been fairly right &#8212; &amp;amp; is seen pretty much whenever XML is involved.  One can come up with many theories at this point.  Here is one.</p>
<p>13. I already mentioned that usually, programmers would include a frequently-used piece of code that does the proper decoding (i.e. turns <tt>&amp;amp;</tt> into an ampersand).  It&#8217;s possible that in this case, they didn&#8217;t use that common code, and instead wrote their own (thinking it&#8217;s easy to write something simple like this).  That code may simply be ignoring all special codes and just passing on whatever it encounters.  Then, down the road, another piece of code would pick up whatever it received (at this point it&#8217;s no longer aware that the text came from XML), and, as a safety measure, simply strip any characters that it didn&#8217;t expect.  This includes ampersands.</p>
<p>14. Why is this a safety measure?  Programs are usually written in a defensive way, that is, they make very few assumptions about what they are given.  Instead, they err on the safe side and double, triple check everything.  One of the checks commonly performed is called <em>sanitization</em>, and it&#8217;s a process of turning possibly erroneous input into correct input by stripping bad data or data that could be malicious (if interpreted literally).  For example, suppose that the same newsfeed has a command that defines how fast the newsfeed is moving on screen (since the XML data is structured, we can just add a special block for this):</p>
<blockquote><p>
&lt;news><br />
&nbsp;&nbsp;&lt;item>Crate &amp;amp; Barrel reports quarterly loss.&lt;/item><br />
&nbsp;&nbsp;&lt;speed>16&lt;/speed><br />
&lt;/news>
</p></blockquote>
<p>So if the program ever encounters a <tt>speed</tt> block, it knows to set the speed of the ticker to the given number.  Now suppose that I can submit my own news items and they will be displayed.  Suppose that this gets integrated with all the other news items by replacing the word CUSTOM below with the user-provided item:</p>
<blockquote><p>
&lt;news><br />
&nbsp;&nbsp;&lt;item>This is the regular news item.&lt;/item><br />
&nbsp;&nbsp;&lt;item>CUSTOM&lt;/item><br />
&lt;/news>
</p></blockquote>
<p>This is all fine for simple news items (for example, if I provide &#8220;Hello world&#8221;, the word &#8220;CUSTOM&#8221; will be replaced with the phrase &#8220;Hello world&#8221; and everything is good).  But if I knew about the <tt>speed</tt> command, I could submit an item that looks like this (read carefully!):</p>
<blockquote><p>
My news item.&amp;lt;/item&amp;gt;&amp;lt;speed&amp;gt;99999&amp;lt;/speed&amp;gt;&amp;lt;item&amp;gt;
</p></blockquote>
<p>All the <tt>&amp;lt;</tt> and <tt>&amp;gt;</tt> will be interpreted as opening and closing brackets and so here is what the transmission will look like:</p>
<blockquote><p>
&lt;news><br />
&nbsp;&nbsp;&lt;item>This is the regular news item.&lt;/item><br />
&nbsp;&nbsp;&lt;item>My news item.&lt;/item>&lt;speed>99999&lt;/speed>&lt;item>&lt;/item><br />
&lt;/news>
</p></blockquote>
<p>I just added a <tt>speed</tt> command to the feed even though I wasn&#8217;t allowed to!  If this is not caught, I could crash the program by, for example, passing in a really large value for speed (or a negative value, or zero, or some nonsense).  If the program didn&#8217;t strip certain characters (such as angle brackets), it could be vulnerable to attacks like this (this, by the way, is called an <em>injection attack</em> because I&#8217;m injecting special code into the data that I&#8217;m allowed to provide).  Ampersands can also be interpreted as special characters (because <tt>&amp;lt;</tt> is translated into an angle bracket which I can use to form a news item that changes the speed!) so they are stripped.</p>
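<p>The attack is easy to reproduce in a few lines of Python.  The sketch below simplifies things in two ways that are assumptions, not part of the story above: the attacker types the angle brackets directly, and the &#8220;safe&#8221; version defends itself by replacing special characters with their codes rather than stripping them.  All names are made up.</p>

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

TEMPLATE = (
    "<news>"
    "<item>This is the regular news item.</item>"
    "<item>CUSTOM</item>"
    "</news>"
)
ATTACK = "My news item.</item><speed>99999</speed><item>"


def build_feed(custom: str, safe: bool) -> str:
    """Put the user-provided item in place of the word CUSTOM."""
    payload = escape(custom) if safe else custom
    return TEMPLATE.replace("CUSTOM", payload)


def speed_commands(feed: str) -> list[str]:
    """Return the value of every speed block the device would find in the feed."""
    return [speed.text or "" for speed in ET.fromstring(feed).findall("speed")]


if __name__ == "__main__":
    print(speed_commands(build_feed(ATTACK, safe=False)))  # the injected command gets through
    print(speed_commands(build_feed(ATTACK, safe=True)))   # the same text stays a harmless news item
```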
<p>15. Hence, <tt>&amp;amp;</tt> becomes <tt>amp;</tt>.  Now the newsfeed contains no special characters and hence we see <tt>amp;</tt> on the screen.</p>
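<p>The whole theory fits in a short Python sketch.  Both the home-grown reader and the stripping step are hypothetical stand-ins for whatever the taxi software really does; a standard parser is included for comparison.</p>

```python
import re
import xml.etree.ElementTree as ET

TRANSMISSION = "<news><item>Crate &amp; Barrel reports quarterly loss.</item></news>"


def proper_items(xml_text: str) -> list[str]:
    """What a standard parser returns: special codes are decoded."""
    return [item.text or "" for item in ET.fromstring(xml_text).findall("item")]


def naive_items(xml_text: str) -> list[str]:
    """Hypothetical home-grown reader: finds the items but decodes nothing."""
    return re.findall(r"<item>(.*?)</item>", xml_text)


def sanitize(text: str) -> str:
    """Hypothetical later step: strips characters that could be special."""
    return re.sub(r"[&<>]", "", text)


def on_screen(xml_text: str) -> list[str]:
    """The suspected pipeline: naive reading followed by stripping."""
    return [sanitize(item) for item in naive_items(xml_text)]


if __name__ == "__main__":
    print(proper_items(TRANSMISSION))
    print(on_screen(TRANSMISSION))
```

<p>The first line printed is the headline as it should appear; the second is the one with the stray <tt>amp;</tt>.</p>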
<p>Hopefully now my mom can see where all those years of computer science education went&#8230; and at the same time she learned about injection attacks.  Pretty good for one post.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.elevenseconds.com/what-is-an-amp-sic-doing-on-this-taxicabs-newsfeed/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

