apnorton | blog2018-12-07T07:35:14.354Zhttp://www.apnorton.com/blog/Andrew NortonHexoMeeting Don Knuthhttp://www.apnorton.com/blog/2018/12/07/Meeting-Don-Knuth/2018-12-07T07:35:14.000Z2018-12-07T07:35:14.354Z<p>Everyone has heroes, but not everyone gets to meet them. This past Tuesday, I was fortunate to meet one of mine: Don Knuth. If you are in computer science and don’t know who Don Knuth is, I highly recommend you take a break from this post and <a href="https://amturing.acm.org/award_winners/knuth_1013846.cfm" target="_blank" rel="external">do a bit of reading about him</a>. He is best known for his (still in progress) <em>Art of Computer Programming</em>, a compendium of all kinds of information on algorithms, but his contributions to computer science and software development reach far beyond that. He invented TeX (the typesetting system on which LaTeX is built), wrote <em>Concrete Mathematics</em> (a great mathematical foundations of computer science book), and received the Turing Award.</p>
<img src="/blog/2018/12/07/Meeting-Don-Knuth/knuth.jpg" alt="Me with Don Knuth" title="Me with Don Knuth">
<a id="more"></a>
<p>I’ve been enamoured with Knuth’s work ever since I received <em>Concrete Mathematics</em> as a birthday present in my teens. His level of detail and precision in both his proofs and programming is second to none, in my eyes. Thus, when I saw that he gives a yearly public lecture at Stanford, I knew I had to find a way to attend. So, this week I flew to California to go hear his twenty-fourth annual Christmas lecture, which was on new applications of Dancing Links that he has found (and which will be covered in <em>AoCP Volume 4B</em>, yet to be published). After the lecture, he signed a book of his and let me take a picture with him. When I told him that his work greatly inspired me, he encouraged me to attempt to find errors and report them, participating in his <a href="https://en.wikipedia.org/wiki/Knuth_reward_check" target="_blank" rel="external">reward check program</a>.</p>
<p>Some people would think it crazy to fly across the country to hear one person speak for an hour and a half, but it honestly was worth it. I got to meet a towering giant in the field of CS who has inspired me in many ways. </p>
<img src="/blog/2018/12/07/Meeting-Don-Knuth/knuth_autograph.jpg" alt="A newly-signed copy of Art of Computer Programming" title="A newly-signed copy of Art of Computer Programming">
Raspberry Pi Default Groupshttp://www.apnorton.com/blog/2017/11/25/Raspberry-Pi-Default-Groups/2017-11-25T18:54:28.000Z2017-11-25T19:09:37.770Z<p>In setting up my Raspberry Pi for a home fileshare, I noticed the <code>pi</code> user is a part of several default groups. These are:</p>
<figure class="highlight stylus"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">pi adm dialout cdrom sudo <span class="selector-tag">audio</span> <span class="selector-tag">video</span> plugdev games users <span class="selector-tag">input</span> netdev gpio i2c spi</span><br></pre></td></tr></table></figure>
<p>(I’m using the 2017-09-07 image of Raspbian Stretch Lite.)</p>
<p>This looked like a lot of groups to me! To make sure my new user only has the minimum permissions needed, let’s look at what each group is and why it’s there.</p>
<a id="more"></a>
<h2 id="Group-Descriptions"><a href="#Group-Descriptions" class="headerlink" title="Group Descriptions"></a>Group Descriptions</h2><table>
<thead>
<tr>
<th style="text-align:left">Name</th>
<th style="text-align:left">Notes</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left">pi</td>
<td style="text-align:left">User-specific group. A group is automatically created for each new user; you can ignore this.</td>
</tr>
<tr>
<td style="text-align:left">adm</td>
<td style="text-align:left">Allows access to log files in <code>/var/log</code> and using <code>xconsole</code></td>
</tr>
<tr>
<td style="text-align:left">dialout</td>
<td style="text-align:left">Allows access to serial ports/modem reconfiguration, etc.</td>
</tr>
<tr>
<td style="text-align:left">cdrom</td>
<td style="text-align:left">Uncreatively, this group enables access to optical drives.</td>
</tr>
<tr>
<td style="text-align:left">sudo</td>
<td style="text-align:left">Enables <code>sudo</code> access for the user.</td>
</tr>
<tr>
<td style="text-align:left">audio</td>
<td style="text-align:left">Allows access to audio devices like microphones and soundcards</td>
</tr>
<tr>
<td style="text-align:left">video</td>
<td style="text-align:left">Allows graphics card/webcam access.</td>
</tr>
<tr>
<td style="text-align:left">plugdev</td>
<td style="text-align:left">Enables access to external storage devices</td>
</tr>
<tr>
<td style="text-align:left">games</td>
<td style="text-align:left">I’m unsure of this. No files belong to this group by default, and I cannot find references to it online.</td>
</tr>
<tr>
<td style="text-align:left">users</td>
<td style="text-align:left">Appears to be a Pi-specific group enabling access to <code>/opt/vc/src/hello_pi/</code> directory and contained files.</td>
</tr>
<tr>
<td style="text-align:left">input</td>
<td style="text-align:left">Appears to give access to the <code>/dev/input/mice</code> device file and nothing else.</td>
</tr>
<tr>
<td style="text-align:left">netdev</td>
<td style="text-align:left">Enables access to network interfaces</td>
</tr>
<tr>
<td style="text-align:left">gpio</td>
<td style="text-align:left">Pi-specific group for GPIO pin access.</td>
</tr>
<tr>
<td style="text-align:left">i2c</td>
<td style="text-align:left">Similar to the above, but for I2C access. Generated after installing <code>i2c-tools</code>.</td>
</tr>
<tr>
<td style="text-align:left">spi</td>
<td style="text-align:left">Similar to the above, but for the SPI bus.</td>
</tr>
</tbody>
</table>
<p>So, based on my application (and future use of the Pi), I’m not adding the <code>cdrom</code>, <code>games</code>, and <code>users</code> groups to my new user.</p>
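<p>As a rough sketch of how that decision turns into a command, the snippet below filters the default list and builds a <code>usermod</code> invocation. The username <code>alice</code> and the helper name <code>usermod_command</code> are made up for illustration, and skipping the <code>pi</code> group is my assumption (the new user gets a user-specific group of its own).</p>

```python
# Groups the default `pi` user belongs to (copied from the listing above).
PI_DEFAULT_GROUPS = [
    "pi", "adm", "dialout", "cdrom", "sudo", "audio", "video", "plugdev",
    "games", "users", "input", "netdev", "gpio", "i2c", "spi",
]

# Groups left out for the new user. `pi` is the user-specific group of the
# `pi` account, so it is skipped as well (an assumption of this sketch).
SKIPPED_GROUPS = {"pi", "cdrom", "games", "users"}


def usermod_command(username, groups=PI_DEFAULT_GROUPS, skipped=SKIPPED_GROUPS):
    """Build a `usermod` command that appends the kept groups to `username`."""
    kept = [group for group in groups if group not in skipped]
    return "sudo usermod -a -G {} {}".format(",".join(kept), username)


if __name__ == "__main__":
    # "alice" is a hypothetical username; substitute your own.
    print(usermod_command("alice"))
```

<p>Printing the command instead of running it leaves room to review the group list before anything changes on the Pi.</p>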
<h2 id="Helpful-Resources"><a href="#Helpful-Resources" class="headerlink" title="Helpful Resources"></a>Helpful Resources</h2><p>The above descriptions are based on the following sources:</p>
<ul>
<li><a href="https://wiki.debian.org/SystemGroups" target="_blank" rel="external">SystemGroups - Debian Wiki</a></li>
<li><a href="https://wiki.ubuntu.com/Security/Privileges" target="_blank" rel="external">Privileges - Ubuntu Wiki</a></li>
<li>Molloy, Derek. <em>Exploring Raspberry Pi: Interfacing to the Real World with Embedded Linux.</em> <a href="https://books.google.com/books?id=ro0gCwAAQBAJ&pg=PA270&lpg=PA270&source=bl&ots=0T50hVUvy5&sig=6n_Hi0U2rCyu7pvx5LUqXJfDhbE&hl=en&sa=X&ved=0ahUKEwjG6qGorNrXAhUNRN8KHcndBDQQ6AEISTAE#v=onepage&q&f=false" target="_blank" rel="external">Page 270</a>. </li>
<li><a href="https://www.raspberrypi.org/forums/viewtopic.php?p=158107#p158107" target="_blank" rel="external">I2C group discussion on Raspberry Pi forums</a> </li>
</ul>
How to Request a Regradehttp://www.apnorton.com/blog/2017/07/04/How-to-Request-a-Regrade/2017-07-04T21:02:25.000Z2017-07-04T21:02:25.821Z<p>One of the highlights of my time at UVA was working as a teaching assistant for the computer science department. In this capacity, I proctored labs and exams, held office hours, created exam questions, and even graded homework and exams. Due, in part, to the large class sizes of our introductory courses and the necessity of multiple graders for each assignment, most professors have a “regrade” policy – if the grader has made an error in grading a student’s work, there is a formal process for the student to request a second look at his or her work.</p>
<p>For CS 2150 (the course I TA’d), I was one of two or three TAs who processed most – if not all – of the regrades for exams in the past two semesters. Although grades are ideally determined solely by the answer’s merit, there are a few simple ways you can make your grader’s life easier. (And that’s always a good thing, right?)</p>
<a id="more"></a>
<h2 id="Be-Brief"><a href="#Be-Brief" class="headerlink" title="Be Brief"></a>Be Brief</h2><p>Graders for every course, even technical ones, read a <em>lot</em> of student-generated text. In a class of around 300 students, we may receive 60 to 100 regrade requests per exam. (This is not to say we have a 1/5 to 1/3 error rate in grading, but rather that we receive a lot of frivolous regrades – more on that later.) If every student writes four or five paragraphs explaining why they got <code>101101</code> instead of <code>101100</code> as the base-two representation of some number, it becomes hard to separate important points from unimportant ones. Thus, my first recommendation is to <em>be brief</em>.</p>
<p>Of course, there is a difference between “brief” and “short.” Receiving “regrade plz” is just as bad as, if not worse than, receiving two pages on how you forgot a trivial detail of a problem. My Science, Technology, and Society professor put it well: when you hear <em>brief</em>, think <em>economical</em>. Maximize the “signal to noise” ratio of your regrade request – bullet points are your friend!</p>
<h2 id="Be-Informed"><a href="#Be-Informed" class="headerlink" title="Be Informed"></a>Be Informed</h2><p>Many courses have a <em>grading rubric</em> that describes how many points each part of an assignment is worth. Exhibiting familiarity with this document by referencing relevant parts is great when writing a regrade request; this helps you be brief and shows the TA that you have some understanding of how many points your answer should be worth. It’s surprising how many students, puzzled over their denied regrade, I’ve spoken with in office hours only to find that they never read the grading guidelines. Knowing what each part of the question is worth is invaluable in discussing regrades.</p>
<h2 id="Don’t-Complain"><a href="#Don’t-Complain" class="headerlink" title="Don’t Complain"></a>Don’t Complain</h2><p>I realize that we are humans and make mistakes. I also realize that it’s very stressful and can feel insulting when one of your assignments has been graded improperly – believe me, I’ve been there! However, venting this frustration in your regrade request does not help the grader sympathize with your cause. Insulting the person who originally graded you, acting “entitled” to points (regardless of whether or not you actually are), and other forms of complaining do not endear you to your grader. Acting humble (but not overly so, of course) goes a long way in getting the grader to subconsciously want you to get a higher grade.</p>
<h2 id="Don’t-Be-Wrong"><a href="#Don’t-Be-Wrong" class="headerlink" title="Don’t Be Wrong"></a>Don’t Be Wrong</h2><p>No, I’m not going to give some cheesy advice about not missing the question in the first place. What I mean by “don’t be wrong” is really “don’t be wrong a <em>second</em> time.” Double- and triple-check your work before submitting a regrade – nothing is more embarrassing than writing a long explanation on why your answer is correct, only to be disproved with a short response from the TA! Consult your textbook and discuss with your friends (or possibly the TAs) before submitting a regrade to reduce the likelihood it gets rejected.</p>
<p>While getting a “wrong” regrade can be mildly frustrating to a TA, we don’t really mind; after all, everyone makes mistakes and things like that occasionally slip through. However, one of the few ways to actually make a grader upset is to <em>knowingly</em> submit wrong work and act as if it’s correct. Please don’t do this. (I won’t write more; since this takes conscious effort to perform, I’ll assume those who are doing it can stop without much instruction.) Also in this category is “grade grubbing”: if your work is wrong and you’ve gotten what the rubric says you should get, don’t submit a regrade request asking for more points than your answer merits!</p>
<h2 id="Summary-Please-Respect-our-Time"><a href="#Summary-Please-Respect-our-Time" class="headerlink" title="Summary: Please Respect our Time"></a>Summary: Please Respect our Time</h2><p>This is a lot of words to basically say: make your work quick to grade. We (your graders) are busy. We’re either students with coursework and/or research to do, or we’re professors who have lectures to prep, office hours to hold, and meetings to attend. The fact that you’re requesting a regrade means that we already saw your paper once. Considered together, these factors mean that the less time we have to spend handling regrade requests, the better off everyone is. </p>
<p>Yes, as graders it is our job to grade your work correctly. Yes, if we messed up your grade, we do want to fix it! But, if you can make it so it’s really fast and easy to see why you need points back, it makes our life so much easier.</p>
A Brief Exploration of a Möbius Transformationhttp://www.apnorton.com/blog/2017/04/18/A-Brief-Exploration-of-a-Mobius-Transformation/2017-04-19T00:05:42.000Z2017-04-19T00:15:50.918Z<p>As part of a recent homework set in my complex analysis course, I was tasked with a problem that required a slight generalization of part of <a href="http://mathworld.wolfram.com/SchwarzsLemma.html" target="_blank" rel="external">Schwarz’s Lemma</a>:</p>
<blockquote>
<p><strong>Lemma (Schwarz):</strong> Let $f$ be analytic on the unit disk with $|f(z)| \leq 1$ for all $z$ on the disk and $f(0) = 0$. Then $|f(z)| \leq |z|$ for all $z$ on the disk and $|f'(0)|\leq 1$.<br>If either $|f(z)|=|z|$ for some $z\neq0$ or if $|f'(0)|=1$, then $f$ is a rotation, i.e., $f(z)=az$ for some complex constant $a$ with $|a|=1$. </p>
</blockquote>
<p>The homework assignment asked us (within the context of a larger problem) to consider the case when $f(\zeta) = 0$ for some $\zeta \neq 0$ on the interior of the unit disk. The secret to this problem was to find some analytic function $\varphi$ that maps the unit disk to itself, but <em>swaps</em> $0$ and $\zeta$. Then $f\circ \varphi$ sends $0$ to $f(\zeta) = 0$, so we may apply Schwarz’s Lemma to it.</p>
<a id="more"></a>
<h2 id="Properties-of-the-transformation"><a href="#Properties-of-the-transformation" class="headerlink" title="Properties of the transformation"></a>Properties of the transformation</h2><p>The appropriate map, which is a particular Möbius transformation, is given by the following:</p>
<p>$$\varphi_\zeta(z) = \frac{\zeta - z}{1-\overline{\zeta}z}$$</p>
<p>Now, if $|z| = 1$, then $|\varphi_\zeta(z)| = |\overline{z} \varphi_\zeta(z)| = \left|\frac{\overline{z}\zeta-1}{1-\overline{\zeta}z}\right| = 1$. Therefore, this map takes the boundary of the unit disk to itself.</p>
<p>Further, this $\varphi_\zeta$ is analytic within the unit disk, as its only singularity occurs when $|z| > 1$ (since this occurs when $z = \frac{1}{\overline{\zeta}}$ and $\left|\overline{\zeta}\right| < 1$). And, finally, since $\varphi_\zeta$ is non-constant, the <a href="http://mathworld.wolfram.com/MaximumModulusPrinciple.html" target="_blank" rel="external">maximum modulus principle</a> tells us that $|\varphi_\zeta(z)| < 1$ when $|z| < 1$. </p>
<p>Therefore, $\varphi_\zeta$ maps the unit disk onto itself, where $\varphi_\zeta(\zeta) = 0$ and $\varphi_\zeta(0) = \zeta$.</p>
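<p>These properties are easy to sanity-check numerically. The following is only a spot check at a handful of sample points (not a proof), using the same $\zeta = 0.2 + 0.38i$ that appears in the plotting script later in this post; the helper <code>on_circle</code> is a name I made up for this sketch.</p>

```python
import cmath

# Sample value of zeta; any complex number with |zeta| < 1 would do.
zeta = 0.2 + 0.38j


def phi(z, zeta=zeta):
    """The Mobius map phi_zeta(z) = (zeta - z) / (1 - conj(zeta) * z)."""
    return (zeta - z) / (1 - zeta.conjugate() * z)


def on_circle(k, n=12):
    """Return the k-th of n equally spaced points on the unit circle."""
    return cmath.exp(2j * cmath.pi * k / n)


# Moduli of the images of points on the boundary and inside the disk.
boundary_moduli = [abs(phi(on_circle(k))) for k in range(12)]
interior_moduli = [abs(phi(0.9 * on_circle(k))) for k in range(12)]

if __name__ == "__main__":
    print("phi(zeta) =", phi(zeta))
    print("phi(0)    =", phi(0))
    print("largest deviation from 1 on the boundary:",
          max(abs(m - 1) for m in boundary_moduli))
    print("largest modulus of an interior image:", max(interior_moduli))
```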
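<p>Another useful feature of this map is that it is an involution. That is, $\varphi_\zeta^{-1} = \varphi_\zeta$. An application of Schwarz’s Lemma shows this immediately: $\varphi_\zeta\circ\varphi_\zeta$ maps the unit disk to itself and fixes <em>two</em> points in it, namely $0$ and $\zeta$. Because it fixes $0$ and satisfies the modulus bound, the lemma applies; because it also fixes $\zeta \neq 0$, the equality case $|f(z)|=|z|$ holds at $z = \zeta$, so $\varphi_\zeta\circ\varphi_\zeta$ is a rotation $z \mapsto az$. Then $a\zeta = \zeta$ forces $a = 1$, so $\varphi_\zeta\circ\varphi_\zeta$ is the identity. Therefore, $\varphi_\zeta$ is its own inverse.</p>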
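<p>To spell out how these properties feed back into the original problem (a sketch of the standard argument, not necessarily the exact route the homework intended), suppose $f$ is analytic on the unit disk with $|f(z)| \leq 1$ and $f(\zeta) = 0$, and set $g = f\circ\varphi_\zeta$. Then</p>
<p>$$g(0) = f(\varphi_\zeta(0)) = f(\zeta) = 0 \quad\text{and}\quad |g(z)| \leq 1 \text{ on the disk,}$$</p>
<p>so Schwarz’s Lemma gives $|g(z)| \leq |z|$. Substituting $z = \varphi_\zeta(w)$ and using $\varphi_\zeta\circ\varphi_\zeta = \mathrm{id}$ yields</p>
<p>$$|f(w)| = |g(\varphi_\zeta(w))| \leq |\varphi_\zeta(w)| = \left|\frac{\zeta - w}{1-\overline{\zeta}w}\right|$$</p>
<p>for every $w$ in the unit disk, which is the bound of the lemma with the base point moved from $0$ to $\zeta$.</p>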
<h2 id="Impact-of-this-map-on-the-unit-disk"><a href="#Impact-of-this-map-on-the-unit-disk" class="headerlink" title="Impact of this map on the unit disk"></a>Impact of this map on the unit disk</h2><p>I was curious what this mapping does to the values on the unit disk. We’ve clearly swapped $\zeta$ and $0$, but the map must maintain analyticity on the unit disk, so it must do more than just that. I wanted to know how this distortion affects the rest of the values on the disk. So, I wrote a quick Python program to generate a couple of plots:</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span 
class="line">61</span><br><span class="line">62</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> numpy <span class="keyword">as</span> np</span><br><span class="line"><span class="keyword">import</span> matplotlib.pyplot <span class="keyword">as</span> plt</span><br><span class="line"></span><br><span class="line"><span class="keyword">from</span> math <span class="keyword">import</span> pi</span><br><span class="line"><span class="keyword">from</span> cmath <span class="keyword">import</span> phase <span class="keyword">as</span> arg</span><br><span class="line"></span><br><span class="line"><span class="comment"># Create a 2D grid on which to evaluate the function</span></span><br><span class="line">xs = np.linspace(<span class="number">-1</span>, <span class="number">1</span>, num = <span class="number">700</span>)</span><br><span class="line">ys = np.linspace(<span class="number">-1</span>, <span class="number">1</span>, num = <span class="number">700</span>)</span><br><span class="line">X, Y = np.meshgrid(xs, ys)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Round it off to be only the unit circle</span></span><br><span class="line">r = np.sqrt(X**<span class="number">2</span> + Y**<span class="number">2</span>)</span><br><span class="line">X = np.ma.masked_where(r > <span class="number">1</span>, X)</span><br><span class="line">Y = np.ma.masked_where(r > <span class="number">1</span>, Y)</span><br><span class="line"></span><br><span class="line"><span class="comment"># The new "zeta" value</span></span><br><span class="line">zeta = <span class="number">0.2</span> + <span class="number">0.38j</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># The involution, phi</span></span><br><span class="line"><span class="meta">@np.vectorize</span></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span 
class="title">phi</span><span class="params">(z)</span>:</span></span><br><span class="line"> <span class="keyword">return</span> (zeta - z) / (<span class="number">1</span>-zeta.conjugate()*z)</span><br><span class="line"></span><br><span class="line">vabs = np.vectorize(abs)</span><br><span class="line">varg = np.vectorize(arg)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Determine the argument and modulus of points on the unit circle</span></span><br><span class="line">Z = X+Y*<span class="number">1.0j</span></span><br><span class="line"></span><br><span class="line">F1 = vabs(phi(Z))</span><br><span class="line">F2 = vabs(Z)</span><br><span class="line">F3 = varg(phi(Z))</span><br><span class="line">F4 = varg(Z)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Plot them all!</span></span><br><span class="line">F = [F1, F2, F3, F4]</span><br><span class="line">fig, axes = plt.subplots(<span class="number">2</span>, <span class="number">2</span>)</span><br><span class="line">titles = [</span><br><span class="line"> <span class="string">'|$\\varphi_\\zeta(z)$|'</span>, </span><br><span class="line"> <span class="string">'|$z$|'</span>,</span><br><span class="line"> <span class="string">'Arg$(\\varphi_\\zeta(z))$'</span>,</span><br><span class="line"> <span class="string">'Arg$(z)$'</span>,</span><br><span class="line">]</span><br><span class="line"></span><br><span class="line">t = np.linspace(<span class="number">0</span>, <span class="number">2</span>*pi, <span class="number">100</span>)</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> i, ax <span class="keyword">in</span> enumerate(np.reshape(axes, (<span class="number">-1</span>,))):</span><br><span class="line"> <span class="comment"># draw the heatmap</span></span><br><span class="line"> ax.pcolormesh(X, Y, F[i])</span><br><span class="line"> </span><br><span class="line"> <span 
class="comment"># draw bounding circle</span></span><br><span class="line"> ax.plot(np.cos(t), np.sin(t), linewidth=<span class="number">2</span>, color=<span class="string">'black'</span>)</span><br><span class="line"> </span><br><span class="line"> <span class="comment"># adjust the axis labels</span></span><br><span class="line"> ax.set_aspect(<span class="string">'equal'</span>)</span><br><span class="line"> ax.set_xlim(<span class="number">-1.1</span>, <span class="number">1.1</span>)</span><br><span class="line"> ax.set_ylim(<span class="number">-1.1</span>, <span class="number">1.1</span>)</span><br><span class="line"> ax.set_title(titles[i])</span><br><span class="line"> </span><br><span class="line">plt.tight_layout() <span class="comment"># helps with spacing</span></span><br><span class="line">plt.show()</span><br></pre></td></tr></table></figure>
<p>Below, we’ve plotted the magnitude and argument (angle) of $z$ and $\varphi_\zeta(z)$ side-by-side. We can now see that, in terms of magnitude, it’s just as if the map “shifted” over the origin, squeezing and pulling the surrounding values to maintain analyticity. However, it also <em>twisted and reflected</em> the values of $z$ on each circle around the origin. (This can be seen through the curve of the $\mathrm{Arg}(\varphi_\zeta(z))$ plot.)</p>
<img src="/blog/2017/04/18/A-Brief-Exploration-of-a-Mobius-Transformation/plots.png" alt="Resulting plots" title="Resulting plots">
<h2 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h2><p>This doesn’t really serve much purpose in and of itself, but it helped build my intuition of what is happening when I apply the function $\varphi$ and developed my abilities in <code>numpy</code> and <code>matplotlib</code> usage. The Schwarz Lemma is an interesting topic in Complex Analysis, and I based some of my initial work on a 2010 paper by Dr. Harold P. Boas, entitled <a href="https://arxiv.org/abs/1001.0559" target="_blank" rel="external"><em>Julius and Julia: Mastering the art of the Schwarz lemma</em></a>. Of particular note is “Section 3: Change of Base Point,” where he develops and discusses the map $\varphi$.</p>
How I wrote a GroupMe Chatbot in 24 hourshttp://www.apnorton.com/blog/2017/02/28/How-I-wrote-a-Groupme-Chatbot-in-24-hours/2017-03-01T04:05:39.000Z2017-03-01T04:10:22.519Z<p>For the past couple years, I have worked as a teaching assistant for UVa’s CS 2150 (Program and Data Representation) course. We recently started a <a href="https://groupme.com/" target="_blank" rel="external">GroupMe</a> chat for the course staff, and I thought it would be fun to create a chatbot to help remind all the TAs to submit timesheets, keep track of when people are holding office hours, and remember when/where TA meetings are being held. Setting up a basic chatbot is a lot simpler than it sounds and is really fun–I wrote my bot from scratch using Python in just one day.</p>
<img src="/blog/2017/02/28/How-I-wrote-a-Groupme-Chatbot-in-24-hours/screenshot.png" alt="The 2150 chatbot" title="The 2150 chatbot">
<h2 id="GroupMe-Bot-Overview"><a href="#GroupMe-Bot-Overview" class="headerlink" title="GroupMe Bot Overview"></a>GroupMe Bot Overview</h2><p>GroupMe has a very <a href="https://dev.groupme.com/tutorials/bots" target="_blank" rel="external">brief tutorial</a> explaining how their API may be used for bots. The easiest way to create a bot is through their <a href="https://dev.groupme.com/bots/new" target="_blank" rel="external">web form</a>, which prompts you for the bot’s name, callback URL (technically optional, but you want it), avatar URL (optional), and the name of the group where the bot will live. Once you’ve done this, you will be given a unique bot ID token. Anyone with this token can pretend to be your bot, so keep it safe. (Security is somewhat laughable here: your bot asserts its ID and the server believes it with no “login” procedure.) We’ll talk more about the callback URL in a moment; for now, just leave it blank.</p>
<p>Once you’ve done these steps, you have created a bot–as far as GroupMe is concerned. If you send a specifically formatted JSON message, the newly created bot will post in your group. However, if left at this point, your “bot” is little more than a proxy for human-written messages submitted with <code>curl</code>. Your bot needs some way of reading messages sent to the group, formulating a response, and only then sending messages to the GroupMe servers. </p>
<a id="more"></a>
<p>This communication is performed using HTTP POST requests carrying JSON data between your bot and the server. Every time a message is sent to your bot’s group, GroupMe POSTs the data to the callback URL you specified above. When your bot wants to respond, it POSTs its response to <code>https://api.groupme.com/v3/bots/post</code>.</p>
<img src="/blog/2017/02/28/How-I-wrote-a-Groupme-Chatbot-in-24-hours/communication.png" alt="Looks simple enough..." title="Looks simple enough...">
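<p>The sending half of this exchange can be sketched with nothing but the Python standard library. This is a minimal illustration, not the code of the bot described in this post: it assumes the JSON body needs only the <code>bot_id</code> and <code>text</code> fields (my reading of GroupMe’s bot tutorial), and <code>build_post</code> and <code>YOUR_BOT_ID</code> are placeholder names.</p>

```python
import json
import urllib.request

# Endpoint the bot POSTs its messages to.
GROUPME_POST_URL = "https://api.groupme.com/v3/bots/post"


def build_post(bot_id, text):
    """Build (but do not send) the POST request that makes the bot speak."""
    # Assumed payload shape: just the bot's ID token and the message text.
    payload = json.dumps({"bot_id": bot_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        GROUPME_POST_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # "YOUR_BOT_ID" stands in for the token from the bot creation form.
    request = build_post("YOUR_BOT_ID", "Hello from the bot!")
    # Uncomment the next line to actually send the message:
    # urllib.request.urlopen(request)
    print(request.get_method(), request.full_url)
```

<p>Keeping request construction separate from sending makes it possible to inspect the payload without posting anything to a real group.</p>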
<p>We’ll take a closer look at the JSON format when our bot is ready to send messages. The important thing here is that your bot needs to 1) have a public-facing URL and 2) be able to process POST requests. To avoid security headaches or the possibility of downtime, running your bot in the cloud is a good approach.</p>
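<p>Whatever server ends up receiving the callback, the core of the bot is a function from the POSTed JSON to an optional reply. Here is a hedged sketch of that step: I am assuming the callback body carries <code>text</code>, <code>name</code>, and <code>sender_type</code> fields, and the <code>!ping</code> command and the function name <code>reply_for</code> are invented for illustration.</p>

```python
import json


def reply_for(callback_body):
    """Return the text the bot should post in response, or None to stay silent."""
    message = json.loads(callback_body)

    # Skip messages posted by bots (including this one) to avoid reply loops.
    # The "sender_type" field is an assumption about the callback format.
    if message.get("sender_type") == "bot":
        return None

    text = message.get("text") or ""
    # "!ping" is a made-up command used only in this sketch.
    if text.strip().lower() == "!ping":
        return "pong, {}!".format(message.get("name", "friend"))
    return None


if __name__ == "__main__":
    sample = json.dumps({"text": "!ping", "name": "Ann", "sender_type": "user"})
    print(reply_for(sample))
```

<p>Because the function never touches the network, it can be tested locally before the bot is deployed anywhere.</p>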
<h2 id="Heroku-A-cloud-solution"><a href="#Heroku-A-cloud-solution" class="headerlink" title="Heroku: A cloud solution"></a>Heroku: A cloud solution</h2><p>For my bot (and the rest of the tutorial), I used the <a href="https://heroku.com/" target="_blank" rel="external">Heroku</a> platform for hosting. I had two primary criteria for selecting a cloud platform for my chatbot. First, it had to be free or <em>really</em> cheap. I’m a student and this is a “for fun” project, so I’m not going to be spending money on a full server or something like that. Heroku has a free tier with 1000 hours of computation time per account per month, which is more than sufficient for the purposes of a hobby chatbot. Second, it needed an easy way to listen for visits to the callback URL. It turns out that it’s fairly simple to set up a stripped-down server in Python with Gunicorn and Flask for integration with Heroku. You can likely follow a similar process with AWS, Azure, or some other cloud service, though.</p>
<p>Heroku deployment operates through <code>git</code> pushes. I recommend <a href="https://devcenter.heroku.com/articles/heroku-cli" target="_blank" rel="external">installing the CLI</a>, as it allows fast access to log information. Using this tool, you automatically create a <code>heroku</code> remote for your project’s git repository, then do <code>git push heroku [branch-name]</code> to update the running version of your app. Since I already use git for all my projects, this is a nice integration to have.</p>
<p>After installing the Heroku CLI, run the following commands in your terminal to create your bot app:</p>
<figure class="highlight elixir"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="variable">$ </span>mkdir apnorton-demo-bot <span class="comment"># folder for your git repository</span></span><br><span class="line"><span class="variable">$ </span>cd apnorton-demo-bot</span><br><span class="line"><span class="variable">$ </span>git init . <span class="comment"># create a new git repository</span></span><br><span class="line"><span class="variable">$ </span>heroku <span class="symbol">apps:</span>create apnorton-demo-bot <span class="comment"># create heroku app </span></span><br><span class="line"><span class="variable">$ </span>git remote <span class="comment"># should show one remote target</span></span><br></pre></td></tr></table></figure>
<p>(Of course, you should use a name different from <code>apnorton-demo-bot</code>.)</p>
<p>After running the <code>heroku apps:create ...</code> step, you will see two URLs printed as output; the first is the public-facing address of your server. This should be placed in the “callback URL” field in the GroupMe settings for your chatbot. If you now log in to Heroku, you will see your newly created app in your dashboard. Heroku also needs some configuration files to successfully launch your bot (the names must be exactly as below):</p>
<ul>
<li><code>Procfile</code> : commands Heroku should use to launch your app</li>
<li><code>runtime.txt</code> : specifies a particular version of Python</li>
<li><code>requirements.txt</code> : any <code>pip</code> packages that need to be installed </li>
</ul>
<p>These will likely be super simple for your bot; here are the options I used for each:</p>
<h3 id="Procfile"><a href="#Procfile" class="headerlink" title="Procfile"></a>Procfile</h3><p>This starts up the gunicorn-based Python webserver and prints all log information to standard out.</p>
<figure class="highlight stata"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">web: gunicorn <span class="keyword">app</span>:<span class="keyword">app</span> --<span class="keyword">log</span>-<span class="keyword">file</span>=-</span><br></pre></td></tr></table></figure>
<h3 id="runtime-txt"><a href="#runtime-txt" class="headerlink" title="runtime.txt"></a>runtime.txt</h3><p>I prefer Python 3, so that’s the runtime I specified for my bot. If the runtime is not specified, Heroku defaults to Python 2 (as of this writing).<br><figure class="highlight css"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="selector-tag">python-3</span><span class="selector-class">.6</span><span class="selector-class">.0</span></span><br></pre></td></tr></table></figure></p>
<h3 id="requirements-txt"><a href="#requirements-txt" class="headerlink" title="requirements.txt"></a>requirements.txt</h3><p>The gunicorn package provides a lightweight server, while Flask is a framework to handle the incoming HTTP requests. You can require other packages, too, but these two are the most basic requirements:</p>
<figure class="highlight ini"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">Flask</span>==<span class="number">0.12</span></span><br><span class="line"><span class="attr">gunicorn</span>==<span class="number">19.6</span>.<span class="number">0</span></span><br></pre></td></tr></table></figure>
<h2 id="Core-Functionality-Python"><a href="#Core-Functionality-Python" class="headerlink" title="Core Functionality: Python"></a>Core Functionality: Python</h2><p>Now that you’ve set up the Heroku server, the next step is to build a lightweight Python server to handle incoming HTTP POST requests to the Heroku URL.</p>
<h3 id="A-Note-on-the-Bot-ID"><a href="#A-Note-on-the-Bot-ID" class="headerlink" title="A Note on the Bot ID"></a>A Note on the Bot ID</h3><p>Since Heroku is git-based, it’s really easy to upload your bot code to GitHub to show off your project. However, you have to be careful that you don’t leak your bot ID key to the public, as anyone who has this ID can impersonate your bot and send messages on its behalf. (This is a big problem, as explicit content or phishing messages could be sent from your bot with no (easy) way of tracing the source.) Fortunately, there’s a really easy way to circumvent this using environment variables.</p>
<p>Instead of hardcoding the bot ID into your Python code, you can create a “config variable” in Heroku. (“Config variables” is just Heroku’s name for environment variables; they work exactly like typical environment variables in bash or the Windows command line.) Create a new environment variable through the Heroku CLI as follows:</p>
<figure class="highlight mipsasm"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ heroku <span class="built_in">config</span>:set GROUPME_BOT_ID=[your <span class="keyword">bot </span>id]</span><br><span class="line">$ heroku <span class="built_in">config</span> <span class="comment"># should display all current configuration variables</span></span><br></pre></td></tr></table></figure>
<p>Now, whenever you need to access your secret bot ID, you can just reference the <code>GROUPME_BOT_ID</code> environment variable (for instance, through Python’s <code>os.getenv('GROUPME_BOT_ID')</code>) and the secret is not leaked when you upload your code to GitHub.</p>
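<p>One possible refinement, sketched below, is to wrap the lookup in a small helper that fails loudly when the variable is missing, rather than silently passing <code>None</code> along. The <code>get_bot_id</code> name is hypothetical and is not part of the bot code in this post.</p>

```python
import os


def get_bot_id():
    """Return the bot ID from the environment, raising if it is unset or empty."""
    bot_id = os.getenv('GROUPME_BOT_ID')
    if not bot_id:
        raise RuntimeError(
            'GROUPME_BOT_ID is not set; create it with `heroku config:set`.'
        )
    return bot_id
```

<p>When running locally, the same variable can be exported in your shell so that the code path is identical to the one used on Heroku.</p>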
<h3 id="Bot-Code"><a href="#Bot-Code" class="headerlink" title="Bot Code"></a>Bot Code</h3><p>Create a new file called <code>app.py</code>. This will contain some functions to handle any POST request to the root URL. It should start with some basic boilerplate for using Flask (some standard imports and setting the <code>app</code> variable to be a new instance of a Flask object):</p>
<figure class="highlight qml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span><span class="string"> os</span></span><br><span class="line"><span class="keyword">import</span><span class="string"> json</span></span><br><span class="line"></span><br><span class="line">from urllib.parse <span class="keyword">import</span><span class="string"> urlencode</span></span><br><span class="line">from urllib.request <span class="keyword">import</span><span class="string"> Request, urlopen</span></span><br><span class="line"></span><br><span class="line">from flask <span class="keyword">import</span><span class="string"> Flask, request</span></span><br><span class="line"></span><br><span class="line">app = Flask(__name__)</span><br></pre></td></tr></table></figure>
<p>After this, create a function that will be called whenever the Heroku URL receives a POST request, as in the snippet below. This uses the <code>@app.route</code> decorator to specify that the function handles the <code>'/'</code> URL and responds to <code>POST</code> requests. For this demo, the bot will simply echo back everything said by other people in chat. The basic idea is to use the <code>request.get_json()</code> method to get the JSON form of the incoming message, create a reply, and send that back to GroupMe.</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">@app.route('/', methods=['POST'])</span></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">webhook</span><span class="params">()</span>:</span></span><br><span class="line"> data = request.get_json()</span><br><span class="line"></span><br><span class="line"> <span class="comment"># We don't want to reply to ourselves!</span></span><br><span class="line"> <span class="keyword">if</span> data[<span class="string">'name'</span>] != <span class="string">'apnorton-test-bot'</span>:</span><br><span class="line"> msg = <span class="string">'{}, you sent "{}".'</span>.format(data[<span class="string">'name'</span>], data[<span class="string">'text'</span>])</span><br><span class="line"> send_message(msg)</span><br><span class="line"></span><br><span class="line"> <span class="keyword">return</span> <span class="string">"ok"</span>, <span class="number">200</span></span><br></pre></td></tr></table></figure>
<p>Note that we check to make sure the name isn’t the name of our bot. This is important for any echoing bot–otherwise, it will get stuck in an infinite loop of replies to itself. (Yes, I did learn this the hard way.)</p>
<p>The <code>data</code> dictionary has the following format (blatantly stolen from the <a href="https://dev.groupme.com/tutorials/bots" target="_blank" rel="external">GroupMe bot tutorial</a>):</p>
<figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line">{</span><br><span class="line"> <span class="attr">"attachments"</span>: [],</span><br><span class="line"> <span class="attr">"avatar_url"</span>: <span class="string">"http://i.groupme.com/123456789"</span>,</span><br><span class="line"> <span class="attr">"created_at"</span>: <span class="number">1302623328</span>,</span><br><span class="line"> <span class="attr">"group_id"</span>: <span class="string">"1234567890"</span>,</span><br><span class="line"> <span class="attr">"id"</span>: <span class="string">"1234567890"</span>,</span><br><span class="line"> <span class="attr">"name"</span>: <span class="string">"John"</span>,</span><br><span class="line"> <span class="attr">"sender_id"</span>: <span class="string">"12345"</span>,</span><br><span class="line"> <span class="attr">"sender_type"</span>: <span class="string">"user"</span>,</span><br><span class="line"> <span class="attr">"source_guid"</span>: <span class="string">"GUID"</span>,</span><br><span class="line"> <span class="attr">"system"</span>: <span class="literal">false</span>,</span><br><span class="line"> <span class="attr">"text"</span>: <span class="string">"Hello world ☃☃"</span>,</span><br><span class="line"> <span class="attr">"user_id"</span>: <span class="string">"1234567890"</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p>The <code>attachments</code> key would contain special features of the message, including mentions and pictures. Unfortunately, I have yet to find a way to allow users to @-ping my bot (though the bot can use the <code>attachments</code> key to mention other users). Your bot will likely only care about the <code>name</code> (or <code>user_id</code>) and <code>text</code> fields of the message, and not much else. That’s all we need for this “echo” bot.</p>
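<p>The payload above also carries a <code>sender_type</code> field, which suggests an alternative to comparing names when deciding whether to reply. The sketch below is only an illustration: the <code>should_reply</code> helper is hypothetical, and it assumes that messages posted by bots arrive with <code>sender_type</code> set to <code>"bot"</code>, which is worth verifying against your own bot’s traffic.</p>

```python
def should_reply(data, bot_name='apnorton-test-bot'):
    """Return True only for messages the echo bot should respond to."""
    # Assumption: bot-authored messages are tagged with sender_type == 'bot'.
    if data.get('sender_type') == 'bot':
        return False
    # Fall back to the name check used in the webhook function.
    return data.get('name') != bot_name


print(should_reply({'name': 'John', 'sender_type': 'user', 'text': 'hi'}))
```

<p>Keeping the name comparison as a fallback means the bot still avoids replying to itself even if the <code>sender_type</code> assumption turns out not to hold.</p>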
<p>The last remaining thing is to define the <code>send_message</code> function. (From a software engineering perspective, we probably would want separate <code>parse_message</code> and <code>send_message</code> functions so the core bot code could be used on any platform. However, for the echobot, the parsing is essentially nonexistent, so I rolled that into the <code>webhook</code> function above.) The duties of this function are to package up the message and bot ID and submit them as a POST request. (Notice how <code>os.getenv</code> is used to retrieve the bot ID from the Heroku environment variables instead of hard-coding the bot ID.)</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">send_message</span><span class="params">(msg)</span>:</span></span><br><span class="line"> url = <span class="string">'https://api.groupme.com/v3/bots/post'</span></span><br><span class="line"></span><br><span class="line"> data = {</span><br><span class="line"> <span class="string">'bot_id'</span> : os.getenv(<span class="string">'GROUPME_BOT_ID'</span>),</span><br><span class="line"> <span class="string">'text'</span> : msg,</span><br><span class="line"> }</span><br><span class="line"> request = Request(url, urlencode(data).encode())</span><br><span class="line"> json = urlopen(request).read().decode()</span><br></pre></td></tr></table></figure>
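<p>For a bot that does more than echo, the separation mentioned above could look roughly like the following. This is a minimal sketch, not the code used in my bot: <code>parse_message</code> is written here as a pure function that takes the incoming dictionary and returns the reply text (or <code>None</code> to stay silent), so it can be exercised without any network access.</p>

```python
def parse_message(data):
    """Return the reply text for an incoming message, or None to stay silent."""
    name = data.get('name')
    text = data.get('text')
    if name is None or text is None:
        # Malformed or unexpected payload: say nothing.
        return None
    return '{}, you sent "{}".'.format(name, text)


reply = parse_message({'name': 'John', 'text': 'Hello world'})
print(reply)
```

<p>With this split, the <code>webhook</code> function would only need to call <code>parse_message</code> and pass any non-<code>None</code> result on to <code>send_message</code>.</p>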
<p>To deploy, simply save and commit your files, then run <code>$ git push heroku master</code> to start your app running on Heroku. The bot should now echo whatever is said in its group. I recommend creating a “test” group for developing your bot; otherwise, the other members of the bot’s group will probably get annoyed during development.</p>
<h2 id="More-resources-NLP-and-Code"><a href="#More-resources-NLP-and-Code" class="headerlink" title="More resources: NLP and Code"></a>More resources: NLP and Code</h2><p>This blog post focuses primarily on the “implementation” details of deploying a chatbot, and doesn’t deal with any of the natural language processing, personality creation, or content generation. There’s a lot of good content out there; I found <a href="https://apps.worldwritable.com/tutorials/chatbot/" target="_blank" rel="external">this tutorial</a> on creating a chatbot with <code>Textblob</code> (a wrapper for the popular <code>nltk</code> Python library) to be helpful in my own bot explorations. For my TA chatbot, I dealt only with search-string matches and used a fair bit of hardcoding, but natural language processing is the next step in my bot’s development.</p>
<p>As another resource, I’ve created a GitHub repository with the relevant skeleton code used in this blog at <a href="https://github.com/apnorton/apnorton-demo-bot" target="_blank" rel="external">github.com/apnorton/apnorton-demo-bot</a>. The original chatbot I wrote for the CS 2150 TAs is also on GitHub <a href="https://github.com/apnorton/bloombot" target="_blank" rel="external">here</a>.</p>
<p>For the past couple years, I have worked as a teaching assistant for UVa’s CS 2150 (Program and Data Representation) course. We recently started a <a href="https://groupme.com/">GroupMe</a> chat for the course staff, and I thought it would be fun to create a chatbot to help remind all the TAs to submit timesheets, keep track of when people are holding office hours, and remember when/where TA meetings are being held. Setting up a basic chatbot is a lot simpler than it sounds and is really fun–I wrote my bot from scratch using Python in just one day.</p>
<img src="/blog/2017/02/28/How-I-wrote-a-Groupme-Chatbot-in-24-hours/screenshot.png" alt="The 2150 chatbot" title="The 2150 chatbot">
<h2 id="GroupMe-Bot-Overview"><a href="#GroupMe-Bot-Overview" class="headerlink" title="GroupMe Bot Overview"></a>GroupMe Bot Overview</h2><p>GroupMe has a very <a href="https://dev.groupme.com/tutorials/bots">brief tutorial</a> explaining how their API may be used for bots. The easiest way to create a bot is through their <a href="https://dev.groupme.com/bots/new">web form</a>, which prompts you for the bot’s name, callback URL (technically optional, but you want it), avatar URL (optional), and the name of the group where the bot will live. Once you’ve done this, you will be given a unique bot ID token. Anyone with this token can pretend to be your bot, so keep it safe. (Security is somewhat laughable here: your bot asserts its ID and the server believes it with no “login” procedure.) We’ll talk more about the callback URL in a moment; for now, just leave it blank.</p>
TensorFlow with the Surface Bookhttp://www.apnorton.com/blog/2017/01/04/Machine-Learning-with-the-Surface-Book/2017-01-04T20:32:45.000Z2017-01-04T20:34:28.087Z<p>While interning at Microsoft over the summer, I received a first-generation Surface Book with an i5-6300U CPU (2.4 GHz dual core with up to 3.0 GHz), 8GB RAM, and a “GeForce GPU” (officially unnamed, but believed to be equivalent to a GT 940). This is a huge step up from my older laptop, so I wanted to set it up for my ML work. In this post, I’ll outline how I set it up with TensorFlow and GPU acceleration.</p>
<h2 id="CUDA-cuDNN"><a href="#CUDA-cuDNN" class="headerlink" title="CUDA + cuDNN"></a>CUDA + cuDNN</h2><p>If you want to use GPU acceleration, the typical way to do so is with NVIDIA’s CUDA API. CUDA 8.0 is compatible with the Surface Book and is (as of this writing) the most up-to-date version of CUDA. Download it <a href="https://developer.nvidia.com/cuda-downloads" target="_blank" rel="external">from the NVIDIA website</a> and run their installer.</p>
<p>For work with deep learning, you’ll also want to install cuDNN. To install, just <a href="https://developer.nvidia.com/cudnn" target="_blank" rel="external">download</a> the library from NVIDIA’s website and unzip it in a convenient place (I chose <code>C:\cudnn</code>). The only “installation” you need to do is to add <code>C:\cudnn\bin</code> to your <code>PATH</code> environment variable.</p>
<a id="more"></a>
<h2 id="Python"><a href="#Python" class="headerlink" title="Python"></a>Python</h2><p>GPU acceleration is where you’re going to get the best performance improvements when running TensorFlow. However, we might as well set up Python in a way that will run as fast as it can.</p>
<p>I installed the <a href="https://software.intel.com/intel-distribution-for-python" target="_blank" rel="external"><em>Intel Distribution for Python</em></a>. This is a clone of Python 3.5, but compiled with optimizations for Intel CPUs and packaged with optimized versions of common libraries like <code>sklearn</code>, <code>pandas</code>, <code>numpy</code>, and more. It’s a free download from Intel, but it is officially still in Beta. Thus far, I haven’t run into any problems using it with TensorFlow (but will update this post if I do).</p>
<p>To install Intel Python, just download and run the installer; I installed this to <code>C:\IntelPython35</code>. If you install it in this location, add <code>C:\IntelPython35\</code> and <code>C:\IntelPython35\Scripts</code> to your <code>PATH</code> environment variable. (Adding <code>\Scripts</code> to your path allows you to use <code>pip</code> or <code>jupyter</code> directly from the command line.)</p>
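<p>Since both the cuDNN and Python steps depend on <code>PATH</code> being edited correctly, a quick check can save some debugging later. The sketch below uses only the standard library; <code>missing_from_path</code> is a hypothetical helper, and the directories listed are the install locations I chose above, so adjust them to match your own.</p>

```python
import os


def missing_from_path(required_dirs, path_value=None):
    """Return the entries of required_dirs that do not appear on PATH."""
    if path_value is None:
        path_value = os.environ.get('PATH', '')

    def normalize(p):
        # Ignore trailing slashes and (on Windows) case differences.
        return os.path.normcase(os.path.normpath(p))

    on_path = {normalize(p) for p in path_value.split(os.pathsep) if p}
    return [d for d in required_dirs if normalize(d) not in on_path]


required = [r'C:\cudnn\bin', r'C:\IntelPython35', r'C:\IntelPython35\Scripts']
print(missing_from_path(required))
```

<p>An empty list means every directory was found; anything printed still needs to be added to <code>PATH</code>. Remember to open a new terminal after editing environment variables so the change is picked up.</p>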
<p>If you decide to use a different installation of Python, make sure you’re installing Python 3.5 and not the recent release of Python 3.6; as of this writing, installing TensorFlow on Windows with Python 3.6 and above is <a href="https://github.com/tensorflow/tensorflow/issues/6533" target="_blank" rel="external">not supported</a>. </p>
<h2 id="TensorFlow"><a href="#TensorFlow" class="headerlink" title="TensorFlow"></a>TensorFlow</h2><p>My understanding is that compiling TensorFlow from source using Intel’s <code>icc</code> and BLAS/LAPACK libraries will give you the best performance, but I don’t have a permanent license to these, and so just installed with <code>pip</code>.</p>
<p>The version of <code>pip</code> included with Intel Python is quite old, so the first step here is to upgrade it using <code>python -m pip install --upgrade pip</code>. Following this, we need to download the TensorFlow wheel from Google. The up-to-date link can be found <a href="https://www.tensorflow.org/get_started/os_setup#pip_installation_on_windows" target="_blank" rel="external">here</a>, but the current link is for <a href="https://storage.googleapis.com/tensorflow/windows/gpu/tensorflow_gpu-0.12.1-cp35-cp35m-win_amd64.whl" target="_blank" rel="external">v0.12.1 with GPU support</a>. (I found that downloading the wheel has better success than running <code>pip</code> directly on the URL.)</p>
<p>Finally, execute <code>pip install --upgrade [TF-downloaded-file]</code> to install TensorFlow. This should finish somewhat quickly, and then you are done!</p>
<p>When I first installed TensorFlow, I had some issues with an existing <code>setuptools</code> installation and was getting an error similar to:</p>
<figure class="highlight gherkin"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">Installing collected packages: six, setuptools, protobuf, numpy, tensorflow</span><br><span class="line"> Found existing installation: setuptools 19.1.1</span><br><span class="line"> <span class="keyword">*</span><span class="keyword">*</span>Cannot remove entries from nonexistent file C:\IntelPython35\Lib\site-packages\easy-install.pth<span class="keyword">*</span><span class="keyword">*</span></span><br></pre></td></tr></table></figure>
<p>This was <a href="https://github.com/tensorflow/tensorflow/issues/622" target="_blank" rel="external">raised as an issue</a> (<a href="https://github.com/tensorflow/tensorflow/issues/135" target="_blank" rel="external">and another issue</a>) on the TensorFlow GitHub. The solution that worked for me was to run the following:</p>
<figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">pip <span class="keyword">install</span> -I <span class="comment">--upgrade setuptools</span></span><br><span class="line">pip <span class="keyword">install</span> <span class="comment">--upgrade [TF-downloaded-file]</span></span><br></pre></td></tr></table></figure>
<h2 id="Testing-the-Installation"><a href="#Testing-the-Installation" class="headerlink" title="Testing the Installation"></a>Testing the Installation</h2><p>Open up your terminal, and we’ll run a few commands: </p>
<figure class="highlight ruby"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">$ python</span><br><span class="line">...</span><br><span class="line"><span class="meta">>></span>> import tensorflow as tf</span><br><span class="line">...</span><br><span class="line"><span class="meta">>></span>> a = tf.constant(<span class="number">12</span>)</span><br><span class="line"><span class="meta">>></span>> b = tf.constant(<span class="number">30</span>)</span><br><span class="line"><span class="meta">>></span>> sess = tf.Session()</span><br><span class="line">...</span><br><span class="line"><span class="meta">>></span>> print(sess.run(a+b))</span><br><span class="line"><span class="number">42</span></span><br><span class="line"><span class="meta">>></span>></span><br></pre></td></tr></table></figure>
<p>If it printed 42 at the end, then it works! I recommend taking a look at the <a href="https://www.tensorflow.org/get_started/" target="_blank" rel="external">TensorFlow MNIST tutorials</a> after this, as they introduce TensorFlow’s capabilities quite nicely.</p>
<p>At one point, I had a problem where any call to <code>tf.Session()</code> or <code>tf.InteractiveSession()</code> would cause Python to crash without displaying any error (Windows would display a “This process has stopped responding” dialog box and kill Python after a minute or so). I never found out <em>why</em> this happened, but restarting my computer resolved the issue and I haven’t experienced it again.</p>
<p>If you’re curious whether your GPU was utilized, look at the debug information that was printed after running <code>tf.Session()</code>. If it includes lines like the below (some unnecessary path names trimmed), then the GPU was used:</p>
<figure class="highlight qml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">Found device <span class="number">0</span> <span class="keyword">with</span> <span class="attribute">properties</span>:</span><br><span class="line"><span class="attribute">name</span>: GeForce GPU</span><br><span class="line"><span class="attribute">major</span>: <span class="number">5</span></span><br><span class="line"><span class="attribute">minor</span>: <span class="number">0</span></span><br><span class="line">memoryClockRate (GHz) <span class="number">0.993</span></span><br><span class="line">pciBusID <span class="number">0000</span>:<span class="number">01</span>:<span class="number">00.0</span></span><br><span class="line">Total <span class="attribute">memory</span>: <span class="number">1.00</span>GiB</span><br></pre></td></tr></table></figure>
<p><em>Credit where credit is due! When I was performing my install, I was greatly aided by <a href="http://www.heatonresearch.com/2017/01/01/tensorflow-windows-gpu.html" target="_blank" rel="external">this blog post</a> from Heaton Research.</em></p>
Visualizing Multidimensional Data in Pythonhttp://www.apnorton.com/blog/2016/12/19/Visualizing-Multidimensional-Data-in-Python/2016-12-20T01:51:55.000Z2016-12-20T02:31:09.743Z<p>Nearly everyone is familiar with two-dimensional plots, and most college students in the hard sciences are familiar with three-dimensional plots. However, modern datasets are rarely two- or three-dimensional. In machine learning, it is commonplace to have dozens if not hundreds of dimensions, and even human-generated datasets can have a dozen or so dimensions. At the same time, visualization is an important first step in working with data. In this blog entry, I’ll explore how we can use Python to work with n-dimensional data, where $n\geq 4$.</p>
<h2 id="Packages"><a href="#Packages" class="headerlink" title="Packages"></a>Packages</h2><p>I’m going to assume we have the <code>numpy</code>, <code>pandas</code>, <code>matplotlib</code>, and <code>sklearn</code> packages installed for Python. In particular, the components I will use are as below:</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> matplotlib.pyplot <span class="keyword">as</span> plt</span><br><span class="line"><span class="keyword">import</span> pandas <span class="keyword">as</span> pd</span><br><span class="line"></span><br><span class="line"><span class="keyword">from</span> sklearn.decomposition <span class="keyword">import</span> PCA <span class="keyword">as</span> sklearnPCA</span><br><span class="line"><span class="keyword">from</span> sklearn.discriminant_analysis <span class="keyword">import</span> LinearDiscriminantAnalysis <span class="keyword">as</span> LDA</span><br><span class="line"><span class="keyword">from</span> sklearn.datasets <span class="keyword">import</span> make_blobs</span><br><span class="line"></span><br><span class="line"><span class="keyword">from</span> pandas.plotting <span class="keyword">import</span> parallel_coordinates</span><br></pre></td></tr></table></figure>
<h2 id="Plotting-2D-Data"><a href="#Plotting-2D-Data" class="headerlink" title="Plotting 2D Data"></a>Plotting 2D Data</h2><p>Before dealing with multidimensional data, let’s see how a scatter plot works with two-dimensional data in Python. </p>
<p>First, we’ll generate some random 2D data using <code>sklearn.datasets.make_blobs</code>. We’ll create three classes of points and plot each class in a different color. After running the following code, we have datapoints in <code>X</code>, while classifications are in <code>y</code>.</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">X, y = make_blobs(n_samples=<span class="number">200</span>, centers=<span class="number">3</span>, n_features=<span class="number">2</span>, random_state=<span class="number">0</span>)</span><br></pre></td></tr></table></figure>
<p>To create a 2D scatter plot, we simply use the <code>scatter</code> function from <code>matplotlib</code>. Since we want each class to be a separate color, we use the <code>c</code> parameter to set the datapoint color according to the <code>y</code> (class) vector. </p>
<a id="more"></a>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">plt.scatter(X[:,<span class="number">0</span>], X[:,<span class="number">1</span>], c=y)</span><br><span class="line">plt.show()</span><br></pre></td></tr></table></figure>
<img src="/blog/2016/12/19/Visualizing-Multidimensional-Data-in-Python/output_7_0.png" alt="2D Scatter Plot with Colors" title="2D Scatter Plot with Colors">
<h2 id="n-dimensional-dataset-Wine"><a href="#n-dimensional-dataset-Wine" class="headerlink" title="n-dimensional dataset: Wine"></a>n-dimensional dataset: Wine</h2><p>In the rest of this post, we will be working with the <a href="https://archive.ics.uci.edu/ml/datasets/Wine" target="_blank" rel="external">Wine</a> dataset from the UCI Machine Learning Repository. I selected this dataset because it has three classes of points and a thirteen-dimensional feature set, yet is still fairly small.</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">url = <span class="string">'https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data'</span></span><br><span class="line">cols = [<span class="string">'Class'</span>, <span class="string">'Alcohol'</span>, <span class="string">'MalicAcid'</span>, <span class="string">'Ash'</span>, <span class="string">'AlcalinityOfAsh'</span>, <span class="string">'Magnesium'</span>, <span class="string">'TotalPhenols'</span>, </span><br><span class="line"> <span class="string">'Flavanoids'</span>, <span class="string">'NonflavanoidPhenols'</span>, <span class="string">'Proanthocyanins'</span>, <span class="string">'ColorIntensity'</span>, </span><br><span class="line"> <span class="string">'Hue'</span>, <span class="string">'OD280/OD315'</span>, <span class="string">'Proline'</span>]</span><br><span class="line">data = pd.read_csv(url, names=cols)</span><br><span class="line"></span><br><span class="line">y = data[<span class="string">'Class'</span>] <span class="comment"># Split off classifications</span></span><br><span class="line">X = data.loc[:, <span class="string">'Alcohol'</span>:] <span class="comment"># Split off features (label-based slice)</span></span><br></pre></td></tr></table></figure>
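<p>If the UCI URL is unreachable, one possible fallback is the copy of the Wine data that ships with <code>sklearn</code>. The sketch below assumes a scikit-learn version recent enough to support <code>as_frame=True</code>; the names <code>X_alt</code> and <code>y_alt</code> are my own, the bundled column names differ from the <code>cols</code> list above, and the labels run from 0 to 2, so I shift them by one to match the UCI file.</p>

```python
from sklearn.datasets import load_wine

# Assumption: scikit-learn is new enough to accept as_frame=True.
wine = load_wine(as_frame=True)

X_alt = wine.data        # 13 feature columns (names differ from `cols` above)
y_alt = wine.target + 1  # shift labels 0..2 to 1..3 to match the UCI file

print(X_alt.shape)                     # (178, 13)
print(sorted(y_alt.unique().tolist())) # [1, 2, 3]
```

<p>The rest of the post keeps using the UCI column names, so the feature names would need adjusting if you go this route.</p>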
<h2 id="Method-1-Two-dimensional-slices"><a href="#Method-1-Two-dimensional-slices" class="headerlink" title="Method 1: Two-dimensional slices"></a>Method 1: Two-dimensional slices</h2><p>A simple approach to visualizing multi-dimensional data is to select two (or three) dimensions and plot the data as seen in that plane. For example, I could plot the <em>Flavanoids</em> vs. <em>Nonflavanoid Phenols</em> plane as a two-dimensional “slice” of the original dataset:</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># three different scatter series so the class labels in the legend are distinct</span></span><br><span class="line">plt.scatter(X[y==<span class="number">1</span>][<span class="string">'Flavanoids'</span>], X[y==<span class="number">1</span>][<span class="string">'NonflavanoidPhenols'</span>], label=<span class="string">'Class 1'</span>, c=<span class="string">'red'</span>)</span><br><span class="line">plt.scatter(X[y==<span class="number">2</span>][<span class="string">'Flavanoids'</span>], X[y==<span class="number">2</span>][<span class="string">'NonflavanoidPhenols'</span>], label=<span class="string">'Class 2'</span>, c=<span class="string">'blue'</span>)</span><br><span class="line">plt.scatter(X[y==<span class="number">3</span>][<span class="string">'Flavanoids'</span>], X[y==<span class="number">3</span>][<span class="string">'NonflavanoidPhenols'</span>], label=<span class="string">'Class 3'</span>, c=<span class="string">'lightgreen'</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Prettify the graph</span></span><br><span class="line">plt.legend()</span><br><span class="line">plt.xlabel(<span class="string">'Flavanoids'</span>)</span><br><span class="line">plt.ylabel(<span class="string">'NonflavanoidPhenols'</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment"># display</span></span><br><span class="line">plt.show()</span><br></pre></td></tr></table></figure>
<img src="/blog/2016/12/19/Visualizing-Multidimensional-Data-in-Python/output_11_0.png" alt="Cross-Section of Data scatter plot" title="Cross-Section of Data scatter plot">
<p>The downside of this approach is that there are $\binom{n}{2} = \frac{n(n-1)}{2}$ such plots for an $n$-dimensional dataset, so viewing the entire dataset this way can be difficult. While this does provide an “exact” view of the data and can be a great way of emphasizing certain relationships, there are other techniques we can use. A related technique is to display a <a href="http://pandas.pydata.org/pandas-docs/version/0.18.1/visualization.html#scatter-matrix-plot" target="_blank" rel="external">scatter plot matrix</a>.</p>
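<p>As a rough illustration of the scatter plot matrix idea, here is a minimal sketch using <code>pandas.plotting.scatter_matrix</code>. The data is a hypothetical stand-in (random points with made-up column names <code>A</code> through <code>D</code>), not the Wine dataset; with four features, the grid has a panel for each ordered pair of features, plus a histogram for each feature on the diagonal.</p>

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix

# Hypothetical stand-in data: 60 samples, 4 features, 3 classes.
rng = np.random.default_rng(0)
labels = np.repeat([1, 2, 3], 20)
demo = pd.DataFrame(rng.normal(size=(60, 4)) + labels[:, None],
                    columns=['A', 'B', 'C', 'D'])

# One panel per pair of features, colored by class;
# the diagonal shows a histogram of each individual feature.
axes = scatter_matrix(demo, c=labels, diagonal='hist', figsize=(8, 8))
plt.show()
```

<p>The grid grows quadratically with the number of features, so for something like the thirteen Wine features it may be more readable to pass only a subset of columns.</p>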
<h2 id="Feature-Scaling"><a href="#Feature-Scaling" class="headerlink" title="Feature Scaling"></a>Feature Scaling</h2><p>Before we go further, we should apply feature scaling to our dataset. In this example, I will simply rescale the data to a $[0,1]$ range, but it is also common to <em>standardize</em> the data to have a zero mean and unit standard deviation.</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">X_norm = (X - X.min())/(X.max() - X.min())</span><br></pre></td></tr></table></figure>
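<p>To make the difference between the two scaling options concrete, here is a small sketch on a hypothetical toy table (the values are invented for illustration; only the column names are borrowed from the Wine dataset). It applies the same min-max formula as above, then the standardization alternative.</p>

```python
import pandas as pd

# Hypothetical toy data with two features on very different scales.
toy = pd.DataFrame({'Proline': [500.0, 1000.0, 1500.0],
                    'Hue': [0.5, 1.0, 1.5]})

# Min-max rescaling, as used in this post: every column ends up in [0, 1].
toy_minmax = (toy - toy.min()) / (toy.max() - toy.min())

# Standardization: zero mean and unit standard deviation per column.
# Note that pandas' std() uses the sample standard deviation (ddof=1) by default.
toy_std = (toy - toy.mean()) / toy.std()

print(toy_minmax)  # both columns become 0.0, 0.5, 1.0
print(toy_std)     # both columns become -1.0, 0.0, 1.0
```

<p>Either way, the two columns become directly comparable. One caveat: a constant column makes the denominator zero in both formulas, so it would need to be dropped or handled separately.</p>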
<h2 id="Method-2-PCA-Plotting"><a href="#Method-2-PCA-Plotting" class="headerlink" title="Method 2: PCA Plotting"></a>Method 2: PCA Plotting</h2><p>Principal Component Analysis (PCA) is a method of dimensionality reduction. It has applications far beyond visualization, but it can also be applied here. It uses eigenvalues and eigenvectors to find new axes on which the data is most spread out. From these new axes, we can choose the two with the largest spread and project the data onto the plane they span. (This is an extremely hand-wavy explanation; I recommend reading more formal explanations of this.)</p>
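<p>For the curious, the eigenvalue idea can be sketched in a few lines of <code>numpy</code>. This is only an illustration under my own assumptions (the helper name <code>pca_project</code> and the random test data are made up), not how <code>sklearn</code> implements PCA internally: center the data, compute the covariance matrix, and project onto the eigenvectors with the largest eigenvalues.</p>

```python
import numpy as np

def pca_project(data, n_components=2):
    """Project the rows of `data` onto the top principal axes (illustrative sketch)."""
    centered = data - data.mean(axis=0)          # PCA works on mean-centered data
    cov = np.cov(centered, rowvar=False)         # feature-by-feature covariance
    eigvals, eigvecs = np.linalg.eigh(cov)       # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]  # indices of the largest ones
    return centered @ eigvecs[:, order], eigvals[order]

# Hypothetical data: 100 points in 3D, stretched differently along each axis.
rng = np.random.default_rng(0)
points = rng.normal(size=(100, 3)) * np.array([5.0, 1.0, 0.1])

projected, variances = pca_project(points)
print(projected.shape)  # (100, 2)
```

<p>Each eigenvalue equals the variance of the data along its eigenvector, which is why sorting by eigenvalue picks out the directions of greatest spread.</p>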
<p>In Python, we can use PCA by first fitting an <code>sklearn</code> PCA object to the normalized dataset, then looking at the transformed matrix.</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">pca = sklearnPCA(n_components=<span class="number">2</span>) <span class="comment">#2-dimensional PCA</span></span><br><span class="line">transformed = pd.DataFrame(pca.fit_transform(X_norm))</span><br></pre></td></tr></table></figure>
<p>The return value <code>transformed</code> is a <code>samples</code>-by-<code>n_components</code> matrix with the new axes, which we may now plot in the usual way.</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">plt.scatter(transformed[y==<span class="number">1</span>][<span class="number">0</span>], transformed[y==<span class="number">1</span>][<span class="number">1</span>], label=<span class="string">'Class 1'</span>, c=<span class="string">'red'</span>)</span><br><span class="line">plt.scatter(transformed[y==<span class="number">2</span>][<span class="number">0</span>], transformed[y==<span class="number">2</span>][<span class="number">1</span>], label=<span class="string">'Class 2'</span>, c=<span class="string">'blue'</span>)</span><br><span class="line">plt.scatter(transformed[y==<span class="number">3</span>][<span class="number">0</span>], transformed[y==<span class="number">3</span>][<span class="number">1</span>], label=<span class="string">'Class 3'</span>, c=<span class="string">'lightgreen'</span>)</span><br><span class="line"></span><br><span class="line">plt.legend()</span><br><span class="line">plt.show()</span><br></pre></td></tr></table></figure>
<img src="/blog/2016/12/19/Visualizing-Multidimensional-Data-in-Python/output_18_0.png" alt="PCA plot of Wine dataset" title="PCA plot of Wine dataset">
<p>A downside of PCA is that the axes no longer have meaning. Rather, they are just a projection that best “spreads” the data. However, it does show that the data naturally forms clusters in some way.</p>
<h2 id="Method-3-Linear-Discriminant-Analysis"><a href="#Method-3-Linear-Discriminant-Analysis" class="headerlink" title="Method 3: Linear Discriminant Analysis"></a>Method 3: Linear Discriminant Analysis</h2><p>A similar approach to projecting to lower dimensions is Linear Discriminant Analysis (LDA). This is similar to PCA, but (at an intuitive level) attempts to separate the classes rather than just spread the entire dataset.</p>
<p>The code for this is similar to that for PCA:</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">lda = LDA(n_components=<span class="number">2</span>) <span class="comment">#2-dimensional LDA</span></span><br><span class="line">lda_transformed = pd.DataFrame(lda.fit_transform(X_norm, y))</span><br><span class="line"></span><br><span class="line"><span class="comment"># Plot all three series</span></span><br><span class="line">plt.scatter(lda_transformed[y==<span class="number">1</span>][<span class="number">0</span>], lda_transformed[y==<span class="number">1</span>][<span class="number">1</span>], label=<span class="string">'Class 1'</span>, c=<span class="string">'red'</span>)</span><br><span class="line">plt.scatter(lda_transformed[y==<span class="number">2</span>][<span class="number">0</span>], lda_transformed[y==<span class="number">2</span>][<span class="number">1</span>], label=<span class="string">'Class 2'</span>, c=<span class="string">'blue'</span>)</span><br><span class="line">plt.scatter(lda_transformed[y==<span class="number">3</span>][<span class="number">0</span>], lda_transformed[y==<span class="number">3</span>][<span class="number">1</span>], label=<span class="string">'Class 3'</span>, c=<span class="string">'lightgreen'</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Display legend and show plot</span></span><br><span class="line">plt.legend(loc=<span class="number">3</span>)</span><br><span class="line">plt.show()</span><br></pre></td></tr></table></figure>
<img src="/blog/2016/12/19/Visualizing-Multidimensional-Data-in-Python/output_21_0.png" alt="LDA plot of Wine dataset" title="LDA plot of Wine dataset">
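<p>One practical difference from PCA is worth a quick sketch: LDA needs the class labels when fitting, and in <code>sklearn</code> it can produce at most <code>min(n_classes - 1, n_features)</code> components, so with three classes a 2D plot is the most we can get. The example below uses hypothetical synthetic data from <code>make_blobs</code> rather than the Wine dataset.</p>

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical synthetic data: 3 classes in 5 dimensions.
X_demo, y_demo = make_blobs(n_samples=150, centers=3, n_features=5, random_state=0)

# PCA is unsupervised: it only looks at the features.
pca_2d = PCA(n_components=2).fit_transform(X_demo)

# LDA is supervised: it also needs the labels.
# With 3 classes, n_components may be at most 2.
lda_2d = LinearDiscriminantAnalysis(n_components=2).fit_transform(X_demo, y_demo)

print(pca_2d.shape, lda_2d.shape)  # (150, 2) (150, 2)
```

<p>Because LDA uses the labels, it is only an option when the classes are already known; for unlabeled data, PCA is the one to reach for.</p>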
<h2 id="Method-4-Parallel-Coordinates"><a href="#Method-4-Parallel-Coordinates" class="headerlink" title="Method 4: Parallel Coordinates"></a>Method 4: Parallel Coordinates</h2><p>The final visualization technique I’m going to discuss is quite different from the others. Instead of projecting the data into a two-dimensional plane and plotting the projections, the Parallel Coordinates plot (imported from <code>pandas</code> rather than <code>matplotlib</code>) displays a vertical axis for each feature you wish to plot. Each sample is then plotted as a color-coded line passing through the appropriate coordinate on each feature. While this doesn’t always show how the data can be separated into classes, it does reveal trends within a particular class. (For instance, in this example, we can see that <em>Class 3</em> tends to have a very low OD280/OD315.)</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># Select features to include in the plot</span></span><br><span class="line">plot_feat = [<span class="string">'MalicAcid'</span>, <span class="string">'Ash'</span>, <span class="string">'OD280/OD315'</span>, <span class="string">'Magnesium'</span>,<span class="string">'TotalPhenols'</span>]</span><br><span class="line"></span><br><span class="line"><span class="comment"># Concat classes with the normalized data</span></span><br><span class="line">data_norm = pd.concat([X_norm[plot_feat], y], axis=<span class="number">1</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment"># Perform parallel coordinate plot</span></span><br><span class="line">parallel_coordinates(data_norm, <span class="string">'Class'</span>)</span><br><span class="line">plt.show()</span><br></pre></td></tr></table></figure>
<img src="/blog/2016/12/19/Visualizing-Multidimensional-Data-in-Python/output_23_0.png" alt="Parallel Coordinates Plot" title="Parallel Coordinates Plot">
<h2 id="Which-do-I-use"><a href="#Which-do-I-use" class="headerlink" title="Which do I use?"></a>Which do I use?</h2><p>As with much of data science, the method you use here is dependent on your particular dataset and what information you are trying to extract from it. The PCA and LDA plots are useful for finding obvious cluster boundaries in the data, while a scatter plot matrix or parallel coordinate plot will show specific behavior of particular features in your dataset.</p>
<p>I drafted this in a Jupyter notebook; if you want a copy of the notebook or have concerns about my post for some reason, you can send me an email at apn4za on the virginia.edu domain.</p>
<p>Nearly everyone is familiar with two-dimensional plots, and most college students in the hard sciences are familiar with three-dimensional plots. However, modern datasets are rarely two- or three-dimensional. In machine learning, it is commonplace to have dozens if not hundreds of dimensions, and even human-generated datasets can have a dozen or so dimensions. At the same time, visualization is an important first step in working with data. In this blog entry, I’ll explore how we can use Python to work with n-dimensional data, where $n\geq 4$. </p>
<h2 id="Packages"><a href="#Packages" class="headerlink" title="Packages"></a>Packages</h2><p>I’m going to assume we have the <code>numpy</code>, <code>pandas</code>, <code>matplotlib</code>, and <code>sklearn</code> packages installed for Python. In particular, the components I will use are as below:</p>
<figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> matplotlib.pyplot <span class="keyword">as</span> plt</span><br><span class="line"><span class="keyword">import</span> pandas <span class="keyword">as</span> pd</span><br><span class="line"></span><br><span class="line"><span class="keyword">from</span> sklearn.decomposition <span class="keyword">import</span> PCA <span class="keyword">as</span> sklearnPCA</span><br><span class="line"><span class="keyword">from</span> sklearn.discriminant_analysis <span class="keyword">import</span> LinearDiscriminantAnalysis <span class="keyword">as</span> LDA</span><br><span class="line"><span class="keyword">from</span> sklearn.datasets <span class="keyword">import</span> make_blobs</span><br><span class="line"></span><br><span class="line"><span class="keyword">from</span> pandas.plotting <span class="keyword">import</span> parallel_coordinates</span><br></pre></td></tr></table></figure>
Election 2016: Moving Forwardhttp://www.apnorton.com/blog/2016/11/09/Election-2016/2016-11-09T08:15:08.000Z2016-11-16T04:06:10.061Z<p><em>(Warning: political post ahead)</em></p>
<p>Like many of my fellow Americans, I stayed up late tonight to watch the polling results for the 2016 General Election. As of my writing this, it appears that Donald Trump will win by a slight margin. The New York Times is predicting that the popular vote will go to Hillary Clinton, while Politico and the Wall Street Journal are showing the <em>current</em> popular vote is Trump’s by about 1 million.</p>
<a id="more"></a>
<p>Also like most of my fellow Americans, I turned to social media (in my case, these are Facebook, Twitter, and Reddit) to find how my friends are taking the results. About half of my friends are exulting in the victory of their candidate, while about half of my friends are dismayed at what they see to be the impending doom of the next four years. As someone who hated both candidates with a vengeance, I’m watching the fireworks from a somewhat distant perspective. While I’m glad Clinton isn’t president, I’m not thrilled with the prospects of a Trump presidency. (Had this election gone the other way, the prior sentence would have simply switched the placement of names.)</p>
<p>What is most interesting to me is the repetitive nature of elections–I suppose I was too young to notice it before, but being cognizant of four elections in my lifetime has led me to believe two key things are invariant. (I recognize I am quite young to be making such bold claims; it will be interesting to look back on this post in a few decades to see how my opinions have shifted.) </p>
<p>Firstly, every election is “too important” to vote for a third party. The nature of a two-party system lends itself towards the selection of candidates with increasingly extreme views; waiting “until it’s safe” to vote for a third party means you will never vote for that third party. This was an interesting year for the Libertarian party; while they didn’t reach their “5% popular vote” goal to gain federal funding, they certainly made strides in becoming a more socially acceptable choice for voters.</p>
<p>Secondly, whoever loses an election believes that the country is doomed. A lot of my fundamentalist friends were convinced that Obama beating McCain in the 2008 election would result in the death of religious freedom. When he was re-elected in 2012, the number of people on my Facebook feed talking about leaving the country and the perils of the future was stunning. There was talk of trying to get Texas to secede. Also, they blamed the third party for stealing needed votes from Romney. However, after this election, I am seeing my more liberal friends posting their fears that the United States will cease to exist as we know it. Paul Krugman, a Nobel Prize-winning economist-turned-political-commentator, <a href="http://www.nytimes.com/interactive/projects/cp/opinion/election-night-2016/the-unknown-country" target="_blank" rel="external">wrote</a> in an op-ed for the New York Times: “I don’t know how we go forward from here. Is America a failed state and society? It looks truly possible.” Twitter was rife with comparisons to Brexit. I saw a friend of mine post about recommending California secede from the Union, and I’ve seen multiple posts about how the popular vote should correspond to the electoral college. To add to the confusion, I’ve even seen some friends blame the Libertarian party for stealing votes from Clinton. I can’t help but notice the parallels with the Republicans who lost in 2008 and 2012.</p>
<p>Whether you are a Republican, Democrat, or third-party member, please remember that <em>this too shall pass</em>. Excessive revelry in victory doesn’t serve a positive purpose and neither does assuming the worst about the next four years. Don’t give up–whether you view this as a great leap forward or a frightening setback, there is much to be done to improve our country and we each have a role to play. Let’s seek to understand and respect each other as we attempt to rebuild trust and cooperation after a very divisive race.</p>
New Feature: Commenting!http://www.apnorton.com/blog/2016/10/28/Commenting/2016-10-28T04:32:31.000Z2016-11-16T04:06:10.061Z<p>Thanks to a <a href="http://www.codeblocq.com/2015/12/Add-Disqus-comments-in-Hexo/" target="_blank" rel="external">helpful blog post</a> by CodeBlocQ, I’ve now enabled Disqus-powered comments on the blog! Let me know what you think about my posts, and I’ll keep an eye on discussions to respond to questions/comments/concerns!</p>
<p>The second part of the Microsoft series should be out soon; I wanted to get comments working before I did so, but it took me a while to find the time to actually get it up and running.</p>
A Microsoft Summer, Part 1: Seattle Funhttp://www.apnorton.com/blog/2016/09/30/A-Summer-with-Microsoft/2016-09-30T05:26:20.000Z2016-11-16T04:06:09.994Z<p>As suggested by this post’s title, I spent this past summer as an intern with Microsoft in Redmond, Washington. The experience was highly educational for me–as my first (and last!) “real” internship, I learned a lot about software development and the importance of corporate culture, as well as discovering a lot about myself. Overall, the experience was a positive one, though, and I had an enormous amount of fun! </p>
<p><em>This is the first of a three-part series on my time at Microsoft. This post focuses on fun recreational activities for interns in the Seattle area.</em></p>
<h2 id="Outdoors"><a href="#Outdoors" class="headerlink" title="Outdoors"></a>Outdoors</h2><img src="/blog/2016/09/30/A-Summer-with-Microsoft/north_cascades.jpg" alt="Hiking in the North Cascades" title="Hiking in the North Cascades">
<p>The Pacific Northwest is home to some of the most amazing views I’ve ever seen. Seattle is conveniently located close to the beach, the mountains, Puget Sound, rainforests, and many hiking trails and campsites. Exploring the outdoors also has the advantage of being very inexpensive, which is great if you’re saving your internship money for college expenses. If you visit National Parks, consider the <a href="https://www.nps.gov/elca/planyourvisit/passport-program.htm" target="_blank" rel="external">National Park Passport Program</a>–if you’re going to once-in-a-lifetime parks, it’s a good idea to get your passbook stamped!</p>
<a id="more"></a>
<p><strong>Olympic National Park</strong> <em>(1+ days, $5 parking permit)</em>: Olympic is a <em>gigantic</em> national park just a few hours away from Seattle. It has beaches, mountains, and rainforests–whatever you enjoy seeing in nature, you’ll probably find it here. The Microsoft Internz mailing list organizes some hikes in this park.</p>
<p><strong>North Cascades National Park</strong> <em>(1+ days, $5 parking permit)</em>: Less-frequented and a bit further away than Olympic, the North Cascades is a bit less developed/road-accessible, but totally worth it. I hiked the Maple Loop Trail one afternoon–it’s around 8 miles with ~2000ft of elevation gain–the view was amazing. If you’re looking for a good trail to do in around 5 hours, I recommend that one. There’s no cell reception anywhere in the park, so be sure to bring an actual map or print directions.</p>
<p><strong>Kayaking the sound</strong> <em>(2-3 hours, $18/hour)</em>: Kayaking in the sound is an awesome way to spend a few hours on an afternoon. Be sure to wear sunscreen! <a href="http://aguaverde.com/paddleclub/" target="_blank" rel="external">Agua Verde</a> is a popular place to rent kayaks.</p>
<p><strong>Camping</strong> <em>(2+ days)</em>: I didn’t bring my camping gear, but I sure wish I did! Camping in national or state parks (especially the Cascades or Olympic) would be amazing. </p>
<p><strong>Climbing</strong> <em>(~\$60/mo or ~\$20/day)</em>: Seattle is home to some awesome climbing gyms! <a href="https://www.stonegardens.com/" target="_blank" rel="external">Stone Gardens</a> has locations in both Seattle and Redmond, and it has both roped climbing and bouldering, so I joined that gym. A Stone Gardens membership costs around $60/month (bring your student ID!), but you can go for free on your first visit if a member brings you–reach out on Facebook or Slack for other interns to invite you. Some of my friends also went on outdoor climbs and mountaineering trips, which they seemed to really enjoy.</p>
<h2 id="Museums-Landmarks-etc"><a href="#Museums-Landmarks-etc" class="headerlink" title="Museums, Landmarks, etc"></a>Museums, Landmarks, etc</h2><img src="/blog/2016/09/30/A-Summer-with-Microsoft/museum_of_flight.jpg" alt="Boeing Museum of Flight" title="Boeing Museum of Flight">
<p>There’s a lot to do in Seattle, and one of your best tools for experiencing as much as you can while still being under budget is the Microsoft Prime Card. With discounts on everything from haircuts and food to zoos and museums, it’s a really awesome perk for interns who want to see everything. I’ve listed the Prime Card cost and the “normal” cost for as many of the activities below as I could find.</p>
<p><strong>Boeing Museum of Flight</strong> <em>(3hrs, $5 with Prime Card, <a href="http://www.museumofflight.org/Plan-Your-Visit/Hours-and-Admission" target="_blank" rel="external">$21 without</a>)</em>: The Museum of Flight is simply amazing. I’ve been to the Smithsonian Air and Space Museum in DC, the Udvar-Hazy Center in Dulles, and the National Museum of the USAF, and this is certainly in their tier. In fact, I’d contend some of the exhibit design far exceeds that of the Smithsonian, even though the collection seems to be smaller. Some of my favorite parts were the one-of-a-kind M-21 Blackbird (instead of the more common SR-71), being able to walk through a Concorde jet, and seeing my third B-29 Superfortress. </p>
<p><strong>Ada’s Technical Books</strong> <em>(1-2hrs, prices reasonable)</em>: This is the dream bookstore for any techie. I’ve <a href="http://www.apnorton.com/blog/2016/08/07/Ada-s-Technical-Books/">already written</a> about it, so I’m not going to duplicate my summary here. Suffice it to say that you can find books here on nearly any technical subject you’d ever be interested in.</p>
<p><strong>Woodland Park Zoo</strong> <em>(3-4hrs, \$8.50 with Prime Card, \$19.95 without)</em>: This zoo is well designed, and balances providing quality animal habitats while allowing guests to quickly move through exhibits and see many animals. It’s certainly worth a weekend afternoon for a visit; I only wish the Komodo Dragon exhibit had been operational when I went. My favorite section was the Northern Trail–they have really nice bear and wolf exhibits.</p>
<p><strong>Seattle Aquarium</strong> <em>(2hrs, \$10.25 with Prime Card, \$24.95 without):</em> The aquarium is really nice, but I’m afraid I was spoiled by growing up next to the National Aquarium in Baltimore. For the $10.25 it costs with the Microsoft Prime Card, I’d certainly recommend going (it’s also right next to the Ferris Wheel and the National Historic Site Klondike Museum, so perhaps it could be part of a larger trip downtown), though. The octopus exhibit is really cool, and they do public feedings, so try to be there when one is scheduled. </p>
<p><strong>Space Needle</strong> <em>(2hrs, \$17 with Prime Card, \$22 without):</em> Ok, sure, the Space Needle is expensive and really touristy, but you <em>must</em> go up to the top if you haven’t already. The view is spectacular, and there’s so much history tied up in this building about the World’s Fair (there are really nice exhibits on your way up).</p>
<p><strong>EMP Museum</strong> <em>(2-3hrs, $8 with Prime Card):</em> While “EMP” stands for the “Experience Music Project,” there is a lot more here than just music. Ever wonder where many of the costumes for <em>The Wizard of Oz</em> ended up? What about the costumes from <em>The Princess Bride</em> or Susan’s horn from <em>The Lion, The Witch, and the Wardrobe</em>? Yep, they’re all here. They even have a sci-fi exhibit with props from Doctor Who, Star Trek, 2001, and many other famous movies and TV series. I highly recommend going.</p>
<p><strong>Chihuly Glass Museum</strong> <em>(1hr, \$17 with Prime Card, \$22 without):</em> If you’re into glass art, this is a nice place. Chihuly really has a talent for making spectacular works from simple materials, and it’s all lit up in an amazing way. This, the EMP Museum, and the Space Needle are all within walking distance from each other, so the trio could make for a full day of fun.</p>
<p><strong>Pike Place Market & Starbucks</strong> <em>(1-2hrs):</em> The Pike Place Market is a pretty famous part of Seattle for many reasons, not the least of which being that the first Starbucks is here. You <em>must</em> eat the Salmon or Halibut sandwich at the Market Grill. It would be worth a trip from Redmond to Seattle in rush hour just to eat this delicious meal.</p>
<h2 id="Microsoft-Related"><a href="#Microsoft-Related" class="headerlink" title="Microsoft-Related"></a>Microsoft-Related</h2><img src="/blog/2016/09/30/A-Summer-with-Microsoft/intern_signature_event.jpg" alt="Ellie Goulding in Concert<br />at the Microsoft Intern Signature Event" title="Ellie Goulding in Concert<br />at the Microsoft Intern Signature Event">
<p>Microsoft itself hosts some pretty awesome events for interns. I didn’t do as many as were offered (and some of the ones I missed sounded pretty cool, like NERF Battles and Amazon vs. Microsoft Paintball), but I’ve listed below ones I participated in and enjoyed.</p>
<p><strong>Microsoft Intern Puzzle Day</strong>: Consistently reported by interns who participated as one of the best parts of the internship. If you’re interested in puzzles, codes, and problem-solving (and, let’s face it–if you’re in CS, the answer is probably “yes”), this is the event for you! Also, you can get a free t-shirt.</p>
<p><strong>Hunger Games</strong>: What could be more fun than running around Microsoft campus in the dark with NERF guns? Not much! This event is a great way to meet new friends and unwind after a hard day’s work.</p>
<p><strong>Assassin</strong>: Can you track down and tag your target before your deadline expires? This is the ultimate in workplace sneakery as, for approximately 3 weeks, interns will be hunting each other down and tagging each other with name badges. Kill or be killed–it’s a lot of fun either way.</p>
<p><strong>Internz Hikes</strong>: Sign up for the Internz mailing list, and go on at least one hike. These range from simple hikes that anyone could do to 14 milers that might benefit from some preparation. Like I’ve mentioned before, there’s a lot of great hiking in the area–don’t finish out the summer without at least trying one hike.</p>
<p><strong>Intern Signature Event</strong>: This is the best event for interns that Microsoft hosts, hands down. They typically rent out a large venue (past events have been held at the Boeing factory, Gasworks, and–most recently–the Space Needle and surrounding areas), host a private concert (e.g. Maroon 5, Ellie Goulding, etc), and give away a really nice gift to all their interns (this past year, we all got Surface Books). Definitely go–it’s a blast and you’ll have a lot of fun.</p>
<p>As suggested by this post’s title, I spent this past summer as an intern with Microsoft in Redmond, Washington. The experience was highly educational for me–as my first (and last!) “real” internship, I learned a lot about software development and the importance of corporate culture, as well as discovering a lot about myself. Overall, the experience was a positive one, though, and I had an enormous amount of fun! </p>
<p><em>This is the first of a three-part series on my time at Microsoft. This post focuses on fun recreational activities for interns in the Seattle area.</em></p>
<h2 id="Outdoors"><a href="#Outdoors" class="headerlink" title="Outdoors"></a>Outdoors</h2><img src="/blog/2016/09/30/A-Summer-with-Microsoft/north_cascades.jpg" alt="Hiking in the North Cascades" title="Hiking in the North Cascades">
<p>The Pacific Northwest is home to some of the most amazing views I’ve ever seen. Seattle is conveniently located close to the beach, the mountains, Puget Sound, rainforests, and many hiking trails and campsites. Exploring the outdoors also has the advantage of being very inexpensive, which is great if you’re saving your internship money for college expenses. If you visit National Parks, consider the <a href="https://www.nps.gov/elca/planyourvisit/passport-program.htm">National Park Passport Program</a>–if you’re going to once-in-a-lifetime parks, it’s a good idea to get your passbook stamped!</p>
Ada's Technical Bookshttp://www.apnorton.com/blog/2016/08/07/Ada-s-Technical-Books/2016-08-07T05:01:24.000Z2016-11-16T04:06:10.097Z<p>This summer, I’ve been working as an intern for Microsoft on the Direct2D/DirectWrite team. While I can’t really talk about what my work entails, I <em>can</em> talk about some of the fun things I’ve done this summer in my free time and the non-work-related components of my internship. I suppose most people wouldn’t start blogging about their internship by describing a bookstore, but I went to this place today and it was so incredible that I had to write about it.</p>
<p>In Capitol Hill, there’s a small store by the name <a href="http://www.seattletechnicalbooks.com/" target="_blank" rel="external"><em>Ada’s Technical Books</em></a>. It’s in a house that’s been converted to a cafe and bookstore, and is quite possibly the most amazing bookstore I’ve ever seen. As you walk in, you’re greeted by a small cafe counter to your left and an open area to your right with short bookcases and comfy chairs, along with toys, puzzles, and “Maker”-appropriate items like lockpicks and Raspberry Pis.</p>
<img src="/blog/2016/08/07/Ada-s-Technical-Books/widgets.jpg" alt="Who can resist a giant 555 timer?" title="Who can resist a giant 555 timer?">
<a id="more"></a>
<p>As I explored the shelves, I found dozens of copies of Petzold’s <em>CODE</em>, a strong science fiction section with nods to <em>The Princess Bride</em> and <em>The Lord of the Rings</em>, myriad <strong>Make:</strong> books, <a href="http://www.sparkfun.com/" target="_blank" rel="external">Sparkfun</a> kits, and many books I’ve only seen in online catalogs. Looking for a copy of <a href="https://mitpress.mit.edu/books/introduction-algorithms" target="_blank" rel="external">CLRS</a>? It’s there, along with many other famous textbooks. <a href="http://codebabies.com/product/html-for-babies" target="_blank" rel="external"><em>HTML for Babies</em></a>? It’s in the kids section alongside books for helping your child learn to program and <a href="http://www.snapcircuits.net/" target="_blank" rel="external">Snap Circuits</a> kits for teaching electronics. Math books ranging from basic statistics and calculus all the way through advanced algebra and cryptanalysis are available, as are language-specific programming texts for just about every language I’ve heard of.</p>
<img src="/blog/2016/08/07/Ada-s-Technical-Books/reading_room.jpg" alt="Wall-to-wall bookcases in the back reading room" title="Wall-to-wall bookcases in the back reading room">
<p>The back of the building has a “reading room” area with a chalkboard, large table, and a bar-like area for reading or coding. On the 3 sides of the room not covered by the chalkboard, there are wall-to-wall bookcases containing books on dozens of languages and technologies. </p>
<p>If you’re in the Seattle area, especially if you’re interning or going to school in the area, I <em>highly</em> recommend visiting this store. Even if you don’t plan on purchasing anything, it’s a great opportunity to read a few pages of a book you might not be convinced is worth adding to your personal collection, but can’t read enough of a preview online to make that determination. The fact that this store has sufficient customers to survive is a strong indicator of the impact Microsoft, Amazon, and Google have on the community–I wish there were a store like this where I live!</p>
Visualizing Graphs in Program Outputhttp://www.apnorton.com/blog/2016/03/08/Visualizing-Graphs-in-Program-Output/2016-03-08T22:11:35.000Z2016-12-19T22:55:04.718Z<p>Many computer science problems utilize <a href="https://en.wikipedia.org/wiki/Graph_%28discrete_mathematics%29" target="_blank" rel="external">graph</a>-based data structures. Their use can range from explicit inclusion in an algorithm-centric problem (like path-finding) to a more “behind-the-scenes” presence in Bayesian networks or descriptions of finite automata. Unfortunately, visualizing large graphs can be difficult to do, especially for debugging. Unlike lists or dictionaries, which can be represented clearly by plain text printing, depicting a graph tends to require more graphics overhead than is reasonable for most programmers to write simply for debugging purposes. I’ve found that <a href="http://graphviz.org/" target="_blank" rel="external">Graphviz</a>, a free graph visualization utility, can be quite useful in debugging graph-related programs.</p>
<h2 id="Installing-Graphviz"><a href="#Installing-Graphviz" class="headerlink" title="Installing Graphviz"></a>Installing Graphviz</h2><p>If you’re on a Debian-based Linux OS (e.g. Ubuntu), you can install Graphviz using <code>apt-get</code>. Just run <code>$ sudo apt-get install graphviz</code> and you’ll have everything you need to complete the steps in this blog post. Mac OS X users can use <code>brew</code> equivalently.</p>
<p>Windows users should install using a binary downloaded from the <a href="http://graphviz.org/Download_windows.php" target="_blank" rel="external">Graphviz Windows page</a>, but you may run into issues setting the <code>PATH</code> variable so that the tools can be run from the command line.</p>
<h2 id="Making-a-basic-graph"><a href="#Making-a-basic-graph" class="headerlink" title="Making a basic graph"></a>Making a basic graph</h2><p>Once you’ve installed, the next thing you’ll want to do is create a basic graph to ensure the installation succeeded and to gain practice using Graphviz tools. We do this by creating a <code>*.dot</code> file that describes the graph we wish to display. If you’re the type of person who likes to jump right in and experiment first before reading too much, or if you love formal language specification, the <a href="http://www.graphviz.org/content/dot-language" target="_blank" rel="external">DOT grammar</a> is fairly readable and can give a quick introduction to creating DOT files.</p>
<p>The below is a fairly representative DOT file to demonstrate some of the capabilities of Graphviz. Open your favorite text editor, copy/paste it in, and save it as <code>firstgraph.dot</code>:</p>
<figure class="highlight java"><figcaption><span>firstgraph.dot</span><a href="/blog/downloads/code/firstgraph.dot">view raw</a></figcaption><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line">digraph G {</span><br><span class="line"> <span class="comment">// Style information for nodes</span></span><br><span class="line"> A [style=<span class="string">"filled"</span>, color=<span class="string">".05 .3 1.0"</span>]</span><br><span class="line"></span><br><span class="line"> <span class="comment">// Edge declarations</span></span><br><span class="line"> A -> {B, C};</span><br><span class="line"> D -> E [label=<span class="string">"-1"</span>];</span><br><span class="line"> E -> F [label=<span class="string">"3"</span>];</span><br><span class="line"> F -> D [label=<span class="string">"10"</span>];</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p>This creates a directed graph (also called a <em>digraph</em>) with six nodes and two connected components. Some of the edges have labels, and one of the nodes is colored. After you’ve copied (or downloaded) this file, open up a terminal to the directory with <code>firstgraph.dot</code> in it and run <code>$ dot firstgraph.dot -Tpng -o firstgraph.png</code>. The resulting image file should look something like the below:</p>
<img src="/blog/2016/03/08/Visualizing-Graphs-in-Program-Output/firstgraph.jpg" alt="The rendered `firstgraph.dot`" title="The rendered `firstgraph.dot`">
<a id="more"></a>
<p>What did that terminal command do? The <code>dot</code> utility is used for producing an image corresponding to a <em>directed</em> graph. (If you want to create the diagram for an undirected graph, consider using <code>neato</code>, or the other variants listed in <code>man dot</code>.) The <code>-Tpng</code> flag selects PNG as the output image format, and <code>-o firstgraph.png</code> provides the output file name. If you don’t include the <code>-o</code> flag, <code>dot</code> will send its output straight to <code>stdout</code>, which will produce a lot of garbage on the terminal.</p>
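<p>As a sanity check on the description of <code>firstgraph.dot</code> (six nodes, two connected components), here is a minimal Python sketch that counts components while ignoring edge direction. The edge list is transcribed by hand from the DOT file, and the function name <code>weak_components</code> is my own choice for illustration, not part of Graphviz.</p>

```python
def weak_components(edges):
    """Group nodes into components, ignoring edge direction."""
    # Build an undirected neighbor map from the directed edge list.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    seen = set()
    components = []
    for start in sorted(neighbors):
        if start in seen:
            continue
        # Depth-first search from an unvisited node collects one component.
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        components.append(component)
    return components


# Edges of firstgraph.dot, transcribed by hand (A -> {B, C} expands to two edges).
edges = [('A', 'B'), ('A', 'C'), ('D', 'E'), ('E', 'F'), ('F', 'D')]
components = weak_components(edges)
print(len(components), sum(len(c) for c in components))
```

<p>If the rendered image shows a different number of clusters than this prints, the DOT file was probably copied incorrectly.</p>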
<h2 id="Creating-a-DOT-File-with-Python"><a href="#Creating-a-DOT-File-with-Python" class="headerlink" title="Creating a DOT File with Python"></a>Creating a DOT File with Python</h2><p>Most recently, I used Graphviz to depict the output of a graph coloring (approximation) algorithm within the register allocation routine for a compiler. I wanted to make sure that each pair of adjacent nodes never shared the same color; looking at an adjacency list and checking the coloring by hand would have been difficult; however, by having my program create a DOT file describing the graph coloring, I can check the results at a glance:</p>
<img src="/blog/2016/03/08/Visualizing-Graphs-in-Program-Output/rig_small.jpg" alt="Register interference graph (compiled with circo)" title="Register interference graph (compiled with circo)">
<p>This only required a few lines of Python code (see below), but produces very useful debugging information. The below assumes we represent a graph as a dictionary that maps a vertex label to a set of adjacent vertex labels (essentially an adjacency list, but more pythonic), and writes the output to a file named <code>rig.dot</code>.</p>
<figure class="highlight python"><figcaption><span>graphviz_dot_output.py</span><a href="/blog/downloads/code/graphviz_dot_output.py">view raw</a></figcaption><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">##</span></span><br><span class="line"><span class="comment"># export_graph()</span></span><br><span class="line"><span class="comment"># Saves the graph to a dot file</span></span><br><span class="line"><span class="comment"># graph : a mapping of vertices to sets of vertices</span></span><br><span class="line"><span class="comment"># (adjacency map form)</span></span><br><span class="line"><span class="comment">##</span></span><br><span class="line"><span class="function"><span class="keyword">def</span> <span class="title">export_graph</span><span class="params">(graph, 
color=None)</span>:</span></span><br><span class="line"> <span class="comment"># Map integer colorings to graphviz colors</span></span><br><span class="line"> cmap = { </span><br><span class="line"> <span class="number">0</span> : <span class="string">'brown'</span>,</span><br><span class="line"> <span class="number">1</span> : <span class="string">'maroon2'</span>, </span><br><span class="line"> <span class="number">2</span> : <span class="string">'orangered'</span>, </span><br><span class="line"> <span class="number">3</span> : <span class="string">'crimson'</span>,</span><br><span class="line"> <span class="number">4</span> : <span class="string">'lightseagreen'</span>,</span><br><span class="line"> <span class="number">5</span> : <span class="string">'gold'</span>, </span><br><span class="line"> <span class="number">6</span> : <span class="string">'cyan'</span>, </span><br><span class="line"> <span class="number">7</span> : <span class="string">'plum'</span>,</span><br><span class="line"> <span class="number">8</span> : <span class="string">'salmon'</span></span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="keyword">with</span> open(<span class="string">'rig.dot'</span>, <span class="string">'w'</span>) <span class="keyword">as</span> f:</span><br><span class="line"> f.write(<span class="string">'graph G {\n'</span>)</span><br><span class="line"></span><br><span class="line"> <span class="comment"># For each vertex u in the graph</span></span><br><span class="line"> <span class="keyword">for</span> u <span class="keyword">in</span> graph.keys():</span><br><span class="line"> <span class="comment"># Add coloring information</span></span><br><span class="line"> <span class="keyword">if</span> color: </span><br><span class="line"> f.write(<span class="string">' "%s" [color=%s, style=filled];\n'</span> % (u, cmap[color[u]]));</span><br><span class="line"></span><br><span class="line"> <span class="comment"># For 
each neighbor v of u, add the edge</span></span><br><span class="line"> <span class="keyword">for</span> v <span class="keyword">in</span> graph[u]: </span><br><span class="line"> <span class="keyword">if</span> (u < v):</span><br><span class="line"> f.write(<span class="string">' "%s" -- "%s";\n'</span> % (u, v))</span><br><span class="line"></span><br><span class="line"> f.write(<span class="string">'}\n'</span>)</span><br></pre></td></tr></table></figure>
<p>You’ll notice that the shape of the graph above is different than the shape of the first graph I showed. Instead of using <code>dot</code> to produce the output image for this graph, I instead used <code>circo</code> (which attempts to draw the graph using a circular layout) and used a command line argument to ensure <a href="http://stackoverflow.com/a/13420913/1110928" target="_blank" rel="external">nodes wouldn’t overlap</a>. The resulting command was <code>$ circo -Goverlap=scale rig.dot -Tpng -o rig.png</code>.</p>
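<p>The property I was checking by eye (no two adjacent nodes share a color) can also be tested directly. The following is a small sketch under some assumptions: it uses the same adjacency-map representation, but the names <code>graph_to_dot</code> and <code>coloring_is_proper</code> are hypothetical, the coloring maps vertices straight to Graphviz color names instead of integers, and the DOT text is returned as a string so it can be inspected without touching the filesystem.</p>

```python
def graph_to_dot(graph, color=None):
    """Return DOT text for an undirected graph given as {vertex: set_of_neighbors}.

    `color`, if given, maps each vertex to a Graphviz color name (an assumption
    made for this sketch; the original code maps integers through a lookup table).
    """
    lines = ['graph G {']
    for u in sorted(graph):
        if color:
            lines.append('  "%s" [color=%s, style=filled];' % (u, color[u]))
        for v in sorted(graph[u]):
            if u < v:  # emit each undirected edge only once
                lines.append('  "%s" -- "%s";' % (u, v))
    lines.append('}')
    return '\n'.join(lines) + '\n'


def coloring_is_proper(graph, color):
    """True when no edge joins two vertices of the same color."""
    return all(color[u] != color[v] for u in graph for v in graph[u])


# A made-up interference graph: a triangle plus one isolated vertex.
rig = {'a': {'b', 'c'}, 'b': {'a', 'c'}, 'c': {'a', 'b'}, 'd': set()}
names = {'a': 'gold', 'b': 'cyan', 'c': 'plum', 'd': 'gold'}

dot_text = graph_to_dot(rig, names)
print(dot_text)
print(coloring_is_proper(rig, names))
```

<p>The picture is still worth generating, since it tends to reveal <em>why</em> a coloring went wrong, but a boolean check like this is easier to drop into an automated test.</p>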
<h2 id="Conclusion"><a href="#Conclusion" class="headerlink" title="Conclusion"></a>Conclusion</h2><p>Graphs are an integral data structure for many computer science problems, yet are usually difficult to represent pictorially. Graphviz can help reduce the amount of effort required to produce valuable debug output. In this post, I’ve provided a short example of the DOT format and some example code to output a graph in DOT form. </p>
<p>For users who want more detailed information on how to use Graphviz, I strongly recommend the <a href="https://www.ocf.berkeley.edu/~eek/index.html/tiny_examples/thinktank/src/gv1.7c/doc/dotguide.pdf" target="_blank" rel="external">DOT User’s Manual</a> and reading the manpage (run <code>$ man dot</code> in the terminal).</p>
Deranged Exams: An ICPC Problemhttp://www.apnorton.com/blog/2015/10/15/Deranged-Exams-An-ICPC-Problem/2015-10-16T00:24:00.000Z2016-12-19T22:55:04.614Z<p>This past week, my ICPC team worked the 2013 Greater New York Regional problem packet. One of my favorite problems in this set was Problem E: Deranged Exams. The code required to solve this problem isn’t that complicated, but the math behind it is a little unusual. In this post, I aim to explain the math and provide a solution to this problem.</p>
<h2 id="Problem-Description"><a href="#Problem-Description" class="headerlink" title="Problem Description"></a>Problem Description</h2><p>The <a href="http://acmgnyr.org/year2013/e.pdf" target="_blank" rel="external">full problem statement</a> is archived online; in shortened form, we can consider the problem to be:</p>
<blockquote>
<p>Given a “matching” test of $n$ questions (each question maps to exactly one answer, and no two questions have the same answer), how many possible ways are there to answer at least the first $k$ questions wrong?</p>
</blockquote>
<p>It turns out that there’s a really nice solution to this problem using a topic from combinatorics called “derangements.” (Note that the problem title was a not-so-subtle hint towards the solution.)</p>
<h2 id="Derangements"><a href="#Derangements" class="headerlink" title="Derangements"></a>Derangements</h2><p>While the idea of a permutation should be familiar to most readers, the closely related topic of a derangement is rarely discussed in most undergraduate curricula. So, it is reasonable to start with a definition:</p>
<blockquote>
<p>A derangement is a permutation in which no element is in its original place. The number of derangements on $n$ elements is denoted $D_n$; this is also called the subfactorial of $n$, denoted $!n$. </p>
</blockquote>
<p>The sequence $\langle D_n\rangle$ is <a href="https://oeis.org/A000166" target="_blank" rel="external">A000166</a> in OEIS (a website with which, by the way, every competitive programmer should familiarize themselves).</p>
<p>It turns out that there is both a recursive and an explicit formula for $D_n$:</p>
<span>$$\begin{aligned}
D_n &= (-1)^n \sum_k\binom{n}{k} (-1)^k k! \\
&= n\cdot D_{n-1} + (-1)^n;\;(D_0=1)
\end{aligned}$$</span><!-- Has MathJax -->
<p>This is significant because we can use the explicit formulation for computing single values of derangements, or we can use dynamic programming to rapidly compute $D_n$ for relatively small $n$.</p>
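<p>To make the two formulas concrete, here is a short Python sketch (Python 3.8+ for <code>math.comb</code>) that computes $D_n$ both ways and confirms they agree for small $n$. The function names are mine and chosen only for illustration.</p>

```python
from math import comb, factorial


def derangements_explicit(n):
    """D_n from the explicit sum: (-1)^n * sum_k C(n, k) * (-1)^k * k!."""
    total = sum(comb(n, k) * (-1) ** k * factorial(k) for k in range(n + 1))
    return (-1) ** n * total


def derangements_table(limit):
    """D_0..D_limit from the recurrence D_n = n * D_{n-1} + (-1)^n, with D_0 = 1."""
    d = [1]
    for n in range(1, limit + 1):
        d.append(n * d[n - 1] + (-1) ** n)
    return d


table = derangements_table(8)
print(table)  # [1, 0, 1, 2, 9, 44, 265, 1854, 14833]
print(all(derangements_explicit(n) == table[n] for n in range(9)))  # True
```

<p>The table version is the dynamic programming approach: each entry costs one multiplication and one addition once the previous entry is known.</p>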
<a id="more"></a>
<h2 id="Problem-Approach"><a href="#Problem-Approach" class="headerlink" title="Problem Approach"></a>Problem Approach</h2><p>The key observation here is that, using the derangement formula, we may compute the number of ways to answer a given set of questions incorrectly, using only the answers corresponding to those questions. Instead of focusing on the first $k$ questions, which we must answer incorrectly, let us look to the remaining $n-k$ questions.</p>
<p>Consider the case when we answer $r$ questions correctly. There are $\binom{n-k}{r}$ ways of choosing which $r$ questions we answer correctly (since the first $k$ must be wrong).</p>
<p>The remaining $n-r$ questions must be answered incorrectly using only the answers to the same $n-r$ questions. Using our knowledge of derangements, there are $!(n-r)$ ways to assign those incorrect answers.</p>
<p>Finally, note that the number of correct answers, $r$, is bounded by $n-k$; summing over all possible values of $r$, we obtain:</p>
<p>$$S(n, k) = \sum_{r=0}^{n-k} \binom{n-k}{r}\cdot !(n-r)$$</p>
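<p>A formula like this is easy to get subtly wrong, so it is worth checking against brute force for small inputs. The sketch below (Python 3.8+; function names are my own) compares $S(n, k)$ with a direct enumeration of all $n!$ answer assignments, treating a permutation <code>p</code> as “question <code>i</code> receives answer <code>p[i]</code>.”</p>

```python
from itertools import permutations
from math import comb


def derangement_numbers(limit):
    """D_0..D_limit via D_n = n * D_{n-1} + (-1)^n."""
    d = [1]
    for n in range(1, limit + 1):
        d.append(n * d[n - 1] + (-1) ** n)
    return d


def count_by_formula(n, k):
    """S(n, k) = sum over r of C(n-k, r) * D_{n-r}."""
    d = derangement_numbers(n)
    return sum(comb(n - k, r) * d[n - r] for r in range(n - k + 1))


def count_by_brute_force(n, k):
    """Count permutations in which each of the first k questions is answered wrong."""
    return sum(
        1
        for p in permutations(range(n))
        if all(p[i] != i for i in range(k))
    )


# Compare the two counts on every (n, k) small enough to enumerate quickly.
agree = all(
    count_by_formula(n, k) == count_by_brute_force(n, k)
    for n in range(1, 8)
    for k in range(n + 1)
)
print(agree)                   # True
print(count_by_formula(3, 1))  # 4
```

<p>Two edge cases make good spot checks: $k = 0$ places no restriction, so $S(n, 0) = n!$, and $k = n$ forces every answer to be wrong, so $S(n, n) = D_n$.</p>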
<h2 id="Code"><a href="#Code" class="headerlink" title="Code"></a>Code</h2><p>Equations are great, but implementation is required for ICPC. First, we must consider input/output size. The problem statement gives the following ranges for $n$ and $k$:</p>
<p>$$\begin{aligned}<br> 1 \leq n \leq 17 \\<br> 0 \leq k \leq n<br>\end{aligned}$$</p>
<p>We can expect the answer to fit in a 64-bit integer, as $n! \leq 2^{63}-1$ for $n\leq 20$. Thus, we don’t even need to worry about intermediate overflow when computing binomial coefficients! I’ll let the code (and comments) speak for itself:</p>
<figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br></pre></td><td class="code"><pre><span class="line">import java.util.*;</span><br><span class="line"> </span><br><span class="line"></span><br><span class="line"><span class="keyword">public</span> <span class="keyword">class</span> Test {</span><br><span class="line"> <span class="comment">// Basic iterative factorial; just multiply all</span></span><br><span class="line"> <span class="comment">// the numbers less than or equal to 
n.</span></span><br><span class="line"> <span class="comment">// returns 1 if n < 1 (which is important for n=0)</span></span><br><span class="line"> <span class="function"><span class="keyword">private</span> <span class="keyword">static</span> <span class="keyword">long</span> <span class="title">fact</span><span class="params">(<span class="keyword">int</span> n)</span> </span>{</span><br><span class="line"> <span class="keyword">long</span> retval = <span class="number">1</span>; </span><br><span class="line"> <span class="keyword">while</span>(n > <span class="number">0</span>) </span><br><span class="line"> retval *= n--;</span><br><span class="line"> <span class="keyword">return</span> retval;</span><br><span class="line"> }</span><br><span class="line"></span><br><span class="line"> <span class="comment">// Naive binomial coefficient computation </span></span><br><span class="line"> <span class="comment">// Generally, you need to watch overflow. But,</span></span><br><span class="line"> <span class="comment">// we can ignore that here because fact(17) < 2^63-1</span></span><br><span class="line"> <span class="function"><span class="keyword">private</span> <span class="keyword">static</span> <span class="keyword">long</span> <span class="title">binom</span><span class="params">(<span class="keyword">int</span> n, <span class="keyword">int</span> k)</span> </span>{</span><br><span class="line"> <span class="keyword">return</span> fact(n)/(fact(k)*fact(n-k));</span><br><span class="line"> }</span><br><span class="line"> </span><br><span class="line"> <span class="function"><span class="keyword">public</span> <span class="keyword">static</span> <span class="keyword">void</span> <span class="title">main</span><span class="params">(String[] args)</span> </span>{ </span><br><span class="line"> <span class="comment">//While not recommended in general, we can use </span></span><br><span class="line"> <span class="comment">// a scanner because we're not reading a lot 
of input.</span></span><br><span class="line"> Scanner <span class="built_in">cin</span> = <span class="keyword">new</span> Scanner(System.in);</span><br><span class="line"> </span><br><span class="line"> <span class="comment">// Precompute the derangement numbers</span></span><br><span class="line"> <span class="keyword">long</span>[] d = <span class="keyword">new</span> <span class="keyword">long</span>[<span class="number">18</span>]; <span class="comment">// we might need values of D_n up to n=17</span></span><br><span class="line"> d[<span class="number">0</span>] = <span class="number">1</span>;</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">1</span>, j=<span class="number">-1</span>; i < d.length; i++, j*=<span class="number">-1</span>)</span><br><span class="line"> d[i] = i*d[i<span class="number">-1</span>] + j;</span><br><span class="line"> <span class="comment">//Process the input</span></span><br><span class="line"> <span class="keyword">int</span> P = <span class="built_in">cin</span>.nextInt();</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> caseNum = <span class="number">0</span>; caseNum < P; caseNum++) {</span><br><span class="line"> <span class="built_in">cin</span>.nextInt();</span><br><span class="line"> <span class="keyword">int</span> n = <span class="built_in">cin</span>.nextInt();</span><br><span class="line"> <span class="keyword">int</span> k = <span class="built_in">cin</span>.nextInt();</span><br><span class="line"> </span><br><span class="line"> <span class="comment">//S(n, k) = sum(binom(n-k, r)*d[n-r], r=0..n-k)</span></span><br><span class="line"> <span class="keyword">long</span> ans = <span class="number">0</span>;</span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> r = <span class="number">0</span>; r <= n-k; r++)</span><br><span class="line"> ans += binom(n-k, 
r)*d[n-r];</span><br><span class="line"> </span><br><span class="line"> System.out.<span class="built_in">printf</span>(<span class="string">"%d %d\n"</span>, caseNum+<span class="number">1</span>, ans);</span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
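<p>The closed form in the comment above can be cross-checked against a brute-force count for small inputs. The following is a minimal sketch, not part of the original solution: the class name <code>DerangedExamsCheck</code> and both method names are made up for illustration, and it assumes $0 \le k \le n$ with $n$ small enough for the brute force to finish (roughly $n \le 10$).</p>

```java
final class DerangedExamsCheck {
    // Brute force: count permutations of {0..n-1} in which none of the
    // first k positions holds its own index (questions 0..k-1 answered wrong).
    static long bruteForce(int n, int k) {
        return extend(new boolean[n], 0, k);
    }

    private static long extend(boolean[] used, int pos, int k) {
        int n = used.length;
        if (pos == n) return 1;                 // one complete permutation
        long count = 0;
        for (int v = 0; v < n; v++) {
            if (used[v]) continue;              // value already placed
            if (pos < k && v == pos) continue;  // this question must be wrong
            used[v] = true;
            count += extend(used, pos + 1, k);
            used[v] = false;
        }
        return count;
    }

    // Closed form: S(n, k) = sum over r of C(n-k, r) * D_{n-r}.
    static long closedForm(int n, int k) {
        long[] d = new long[n + 1];
        d[0] = 1;
        for (int i = 1, sign = -1; i <= n; i++, sign = -sign)
            d[i] = i * d[i - 1] + sign;         // D_i = i*D_{i-1} + (-1)^i

        long total = 0;
        long binom = 1;                         // C(n-k, 0)
        for (int r = 0; r <= n - k; r++) {
            total += binom * d[n - r];
            binom = binom * (n - k - r) / (r + 1);  // C(m, r+1) from C(m, r)
        }
        return total;
    }
}
```

<p>For example, <code>closedForm(4, 1)</code> and <code>bruteForce(4, 1)</code> both give $18$, which matches $4! - 3! = 18$ (all permutations minus those that get the first question right).</p>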
<h2 id="Further-Reference"><a href="#Further-Reference" class="headerlink" title="Further Reference"></a>Further Reference</h2><p>Derangements are discussed in <em>Concrete Mathematics</em> by Graham, Knuth, and Patashnik on pages 193-196, where the identities shown in this blog entry are derived. Also discussed is a closely related problem that may be called $r$-derangements.</p>
<p>In the $r$-derangement problem, we seek the number of arrangements in which exactly $r$ elements are in their original place. (The number of $0$-derangements, then, is just $D_n$.)</p>
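<p>As a worked instance of that count (a standard argument sketched here, not quoted from the book): to leave exactly $r$ elements in place, choose which $r$ stay put and derange the remaining $n-r$, so the number of $r$-derangements of $n$ elements is $\binom{n}{r}D_{n-r}$. For $n=4$ and $r=1$:</p>
<p>$$\binom{4}{1}D_3 = 4\cdot 2 = 8$$</p>
<p>Summing the counts for $r = 0, 1, 2, 3, 4$ recovers all $4! = 24$ permutations: $9 + 8 + 6 + 0 + 1 = 24$.</p>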
<p>This past week, my ICPC team worked the 2013 Greater New York Regional problem packet. One of my favorite problems in this set was Problem E: Deranged Exams. The code required to solve this problem isn’t that complicated, but the math behind it is a little unusual. In this post, I aim to explain the math and provide a solution to this problem.</p>
<h2 id="Problem-Description"><a href="#Problem-Description" class="headerlink" title="Problem Description"></a>Problem Description</h2><p>The <a href="http://acmgnyr.org/year2013/e.pdf">full problem statement</a> is archived online; in shortened form, we can consider the problem to be:</p>
<blockquote>
<p>Given a “matching” test of $n$ questions (each question maps to exactly one answer, and no two questions have the same answer), how many possible ways are there to answer at least the first $k$ questions wrong?</p>
</blockquote>
<p>It turns out that there’s a really nice solution to this problem using a topic from combinatorics called “derangements.” (Note that the problem title was a not-so-subtle hint towards the solution.)</p>
<h2 id="Derangements"><a href="#Derangements" class="headerlink" title="Derangements"></a>Derangements</h2><p>While the idea of a permutation should be familiar to most readers, the closely related topic of a derangement is rarely covered in undergraduate curricula. So, it is reasonable to start with a definition:</p>
<blockquote>
<p>A derangement is a permutation in which no element is in its original place. The number of derangements on $n$ elements is denoted $D_n$; this is also called the subfactorial of $n$, denoted $!n$. </p>
</blockquote>
<p>The sequence $\langle D_n\rangle$ is <a href="https://oeis.org/A000166">A000166</a> in OEIS (a website with which, by the way, every competitive programmer should familiarize themselves).</p>
<p>It turns out that there is both an explicit and a recursive formula for $D_n$:</p>
<span>$$\begin{aligned}
D_n &= (-1)^n \sum_k\binom{n}{k} (-1)^k k! \\
&= n\cdot D_{n-1} + (-1)^n;\;(D_0=1)
\end{aligned}$$</span><!-- Has MathJax -->
<p>This is significant because we can use the explicit formulation for computing single values of derangements, or we can use dynamic programming to rapidly compute $D_n$ for relatively small $n$.</p>
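<p>Both formulas can be sketched in a few lines of Java. This is an illustration under stated assumptions, not code from the original post: the class name <code>Derangements</code> and its method names are hypothetical, and the results only fit in a <code>long</code> for $n \le 20$.</p>

```java
final class Derangements {
    // Dynamic programming with the recurrence D_n = n*D_{n-1} + (-1)^n, D_0 = 1.
    // Returns the table D_0..D_max (valid for max <= 20 without overflow).
    static long[] table(int max) {
        long[] d = new long[max + 1];
        d[0] = 1;
        long sign = -1;                  // (-1)^n for n = 1
        for (int n = 1; n <= max; n++) {
            d[n] = n * d[n - 1] + sign;
            sign = -sign;
        }
        return d;
    }

    // Explicit formula: D_n = (-1)^n * sum_k C(n,k) * (-1)^k * k!.
    // C(n,k) * k! equals n!/(n-k)!, which is built up incrementally.
    static long explicit(int n) {
        long sum = 0;
        long term = 1;                   // n!/(n-k)! for k = 0
        long sign = 1;                   // (-1)^k for k = 0
        for (int k = 0; k <= n; k++) {
            sum += sign * term;
            term *= (n - k);             // advance to n!/(n-k-1)!
            sign = -sign;
        }
        return (n % 2 == 0) ? sum : -sum; // apply the leading (-1)^n
    }
}
```

<p>The table is the better fit when many values are needed (as in a multi-case contest problem), while <code>explicit</code> computes a single value without storing earlier ones. Both give $D_4 = 9$.</p>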
Eight Books on Math and Computer Sciencehttp://www.apnorton.com/blog/2015/06/09/Eight-Books-on-Math-and-Computer-Science/2015-06-09T04:41:23.000Z2016-12-19T22:55:04.614Z<p>A friend recently emailed me asking for titles of books I’d recommend to read over the summer, particularly to prepare for computer science and mathematics. I’ve adapted my suggestions into this post. I’d like to note that I’ve restricted my responses to “non-textbooks;” otherwise, I’d have several more additions that would increase the average page count and price quite drastically. As such, these books don’t have problems to work or present an extreme level of detail, but in many cases they present enough information to provide a strong foundation and context for math and CS classes.</p>
<h2 id="From-Mathematics-to-Generic-Programming"><a href="#From-Mathematics-to-Generic-Programming" class="headerlink" title="From Mathematics to Generic Programming"></a>From Mathematics to Generic Programming</h2><p><em>Alexander Stepanov and Daniel Rose (<a href="http://www.amazon.com/Mathematics-Generic-Programming-Alexander-Stepanov/dp/0321942043" target="_blank" rel="external">on Amazon</a>)</em></p>
<p>I will most likely write a separate blog post about this book. I read it during the end of the fall semester and found that it presented a very interesting approach to designing reusable code by utilizing principles from abstract algebra. It’s written to be accessible by someone who hasn’t studied abstract algebra yet, which means it also can serve as an introduction to that subject.</p>
<h2 id="CODE-The-Hidden-Language-of-Computer-Hardware-and-Software"><a href="#CODE-The-Hidden-Language-of-Computer-Hardware-and-Software" class="headerlink" title="CODE: The Hidden Language of Computer Hardware and Software"></a>CODE: The Hidden Language of Computer Hardware and Software</h2><p><em>Charles Petzold (<a href="http://www.amazon.com/Code-Language-Computer-Hardware-Software/dp/0735611319/" target="_blank" rel="external">on Amazon</a>)</em></p>
<p>Four years ago, I wrote <a href="http://robodesigners.blogspot.com/2011/04/code-hidden-language-of-computer.html" target="_blank" rel="external">a review</a> of this book on RoboDesigners. At that time, my perspective was that of a high school student and I thought the book was interesting; with the additional perspective of a year of college study in Computer Science, I cannot recommend this book highly enough.</p>
<p>By “building” a computer piece-by-piece from the idea of a relay through developing a simple assembly language, it covers nearly all of the same material as the Digital Logic Design course I took, but in an easy-to-read book. If you comprehend the material in this book, you will be able to coast through DLD.</p>
<h2 id="A-Mathematician’s-Apology"><a href="#A-Mathematician’s-Apology" class="headerlink" title="A Mathematician’s Apology"></a>A Mathematician’s Apology</h2><p><em>G. H. Hardy (<a href="http://www.amazon.com/Mathematicians-Apology-Canto-Classics/dp/110760463X/" target="_blank" rel="external">on Amazon</a>)</em></p>
<p>When a mathematician of Hardy’s stature writes a book on why he studies math, it’s probably advisable to read it! Multiple professors of mine have said it’s a book any mathematician should read and I wholeheartedly agree. It’s really short (the printing I’ve linked above is only 154 pages), but the content is amazing. Hardy addresses the complaints many have with pure math and embodies the spirit of “doing mathematics for mathematics’ sake.” If you are thinking about pursuing a theoretical route in either CS or math, I highly recommend you read this book.</p>
<a id="more"></a>
<h2 id="The-Code-Book"><a href="#The-Code-Book" class="headerlink" title="The Code Book"></a>The Code Book</h2><p><em>Simon Singh (<a href="http://www.amazon.com/Code-Book-Science-Secrecy-Cryptography/dp/0385495323/" target="_blank" rel="external">on Amazon</a>)</em></p>
<p>I love codes–anything resembling secret or hidden knowledge has a particular allure. Singh does a great job discussing the past, present, and likely future of codes, ciphers, and cryptography in this book. Starting with the Caesar shift (what good historic code book doesn’t?) and traveling through centuries to discuss the Vigenère cipher, RSA encryption, and the general idea of quantum cryptography, this book gave me a broad understanding of where we are now with codes and how we got here. It also includes the clearest description of the actual flaws in Enigma that I’ve ever read.</p>
<h2 id="In-Pursuit-of-the-Unknown-17-Equations-that-Changed-the-World"><a href="#In-Pursuit-of-the-Unknown-17-Equations-that-Changed-the-World" class="headerlink" title="In Pursuit of the Unknown: 17 Equations that Changed the World"></a>In Pursuit of the Unknown: 17 Equations that Changed the World</h2><p><em>Ian Stewart (<a href="http://www.amazon.com/Pursuit-Equations-That-Changed-World/dp/0465085989/" target="_blank" rel="external">on Amazon</a>)</em></p>
<p>Every equation has a story–how was this truth discovered, who discovered it, and what exists now because of it? This book examines 17 equations and their impact on society. If you’re actually reading this blog, the early chapters will probably be old hat, but after (and including) Chapter 4 the content becomes quite interesting. </p>
<p>Reading Chapter 4 helped me visualize vector fields for Vector Calculus and Differential Equations in the context of planetary movement (and it finally made me understand the age-old analogy for dense objects warping the fabric of spacetime). The last chapter was also memorable for its description of a particular equation used in high-frequency stock trades, the misapplication of which Stewart claims was partially responsible for the 2008 recession.</p>
<h2 id="The-Music-of-the-Primes"><a href="#The-Music-of-the-Primes" class="headerlink" title="The Music of the Primes"></a>The Music of the Primes</h2><p><em>Marcus du Sautoy (<a href="http://www.amazon.com/Music-Primes-Searching-Greatest-Mathematics/dp/0062064010/" target="_blank" rel="external">on Amazon</a>)</em></p>
<p>It is said that Gauss once asserted that “Mathematics is the queen of the sciences and number theory is the queen of mathematics.” This book outlines the history of the parts of this “queen of mathematics” that relate to prime numbers. It includes mini-biographies on the people who made great breakthroughs in the search for a formula for the nth prime; obviously the story isn’t finished yet, but it’s a pretty neat overview of how primes are important, who made them important, and other related topics.</p>
<p>As someone who finds Number Theory fascinating, I’d recommend this book especially to people who like Project Euler problems–not because it will help you solve the problems per se, but because it provides some historical background to the people who developed the equations you use to solve Project Euler problems.</p>
<h2 id="To-Engineer-Is-Human"><a href="#To-Engineer-Is-Human" class="headerlink" title="To Engineer Is Human"></a>To Engineer Is Human</h2><p><em>Henry Petroski (<a href="http://www.amazon.com/To-Engineer-Is-Human-Successful/dp/0679734163" target="_blank" rel="external">on Amazon</a>)</em></p>
<p>This is a wonderfully dry book–one that might bore some people, but I <em>loved</em> it. (I actually read this right after reading <em>CODE</em>.) Petroski discusses the role of failure in design, why the principles of engineering are intuitively “built-in” to the human brain, and how engineers must account for error in their designs. The most thought-provoking and vivid idea that has stuck with me since reading this book is a connection he drew between computerized design and an increased failure rate in new products (toys, furniture, or even buildings).</p>
<h2 id="The-Abolition-of-Man"><a href="#The-Abolition-of-Man" class="headerlink" title="The Abolition of Man"></a>The Abolition of Man</h2><p><em>C. S. Lewis (<a href="http://www.amazon.com/The-Abolition-Man-C-Lewis/dp/0060652942" target="_blank" rel="external">on Amazon</a>)</em></p>
<p>C. S. Lewis is known primarily as the author of <em>The Chronicles of Narnia</em>, then as a Christian Apologist. So then, <em>why</em> am I asserting that this particular book is relevant to all computer scientists? In fact, this book is of particular interest to anyone working in a field of applied science. It provokes thought on why we’re solving the problems that we are and why we’re even interested in the sciences. In particular, it discusses the effect of a morality outside ourselves on the purpose of science; the third and final chapter paints a vivid picture of what happens when we reduce mankind simply to “nature.” This book bears re-reading, and perhaps someday I’ll write another blog post on why every scientist (applied or not) should read this book.</p>
How to Learn Haskellhttp://www.apnorton.com/blog/2014/07/14/How-to-Learn-Haskell/2014-07-15T00:55:08.000Z2016-12-19T22:55:04.614Z<p>To grow my programming repertoire, I decided to learn a functional language; at the recommendation of a friend, I selected <a href="http://www.haskell.org/haskellwiki/Haskell" target="_blank" rel="external">Haskell</a>. Thus far, it seems great. As a mathematician at heart, I love the way that the notation and language constructs resemble math (list comprehensions, tuples, function composition, etc). In this blog post, I will outline the major resources I am using to learn Haskell.</p>
<p>To learn Haskell, I am using the ebook <a href="http://learnyouahaskell.com/chapters" target="_blank" rel="external"><em>Learn You a Haskell for Great Good</em></a>. Yes–terrible grammar in the title, but it’s (fairly) grammatically correct on the inside. This is a great introduction to Haskell, although I’d highly recommend prior knowledge of another programming language like Java or C++.</p>
<p>Unfortunately, that ebook is somewhat lacking in practice problems. It does have examples, but there isn’t a true “exercise” section like one would find in a textbook. This is a common fault with online programming tutorials; to be honest, creating a good exercise set is <em>hard</em> work. To remedy this problem, I turned to a favorite site of mine, <a href="http://www.hackerrank.com" target="_blank" rel="external">HackerRank.com</a>. While designed for competitive programmers, this site also has an “introductory” set of functional programming challenges (see <a href="https://www.hackerrank.com/categories/fp/intro" target="_blank" rel="external">here</a>). These range in difficulty from very easy to extremely hard. This provides a great complement to the tutorial I referenced above.</p>
<p>Finally, one last resource I am going to use after finishing <em>Learn You a Haskell</em> is a <a href="http://shuklan.com/haskell/" target="_blank" rel="external">set of lectures</a> by former University of Virginia student-teacher <a href="http://shukla.io/" target="_blank" rel="external">Nishant Shukla</a>. Although I have not been able to view them in great detail, they appear to present a great introduction to Haskell.</p>
Factorization and Divisor Counthttp://www.apnorton.com/blog/2014/07/14/Factorization-and-Divisor-Count/2014-07-14T15:59:15.000Z2016-12-19T22:55:04.614Z<p>How many divisors are there of the number $1281942112$? It turns out that determining the answer to this problem is (at most) only as difficult as determining the prime factorization of the number. In this blog post, I will outline a solution to this (and similar) problems.</p>
<h2 id="The-Math"><a href="#The-Math" class="headerlink" title="The Math"></a>The Math</h2><p>The <a href="http://mathworld.wolfram.com/FundamentalTheoremofArithmetic.html" target="_blank" rel="external">Fundamental Theorem of Arithmetic</a> guarantees each positive integer greater than $1$ a unique prime factorization. We write this factorization as:</p>
<p>$$N = p_0^{e_0}p_1^{e_1}\cdots p_n^{e_n}$$</p>
<p>where $p_k$ is a prime number, and $e_k$ is its corresponding exponent. This provides us with useful information regarding divisors of $N$: any divisor of $N$ must be composed of some combination of those prime factors, each with an exponent no larger than the one it has in $N$. Specifically, we can write any divisor, $D$, as:</p>
<p>$$D = p_0^{a_0}p_1^{a_1}\cdots p_n^{a_n}$$</p>
<p>where the $p_k$ are the same as in the factorization of $N$ and $a_k \in \{0, 1, \ldots, e_k\}$. To find the total number of divisors, we multiply together the number of options we have for each exponent. That is,</p>
<p>$$\text{Number of Divisors}\; = (e_0+1)(e_1+1)\cdots(e_n + 1)$$</p>
<p>Example: Consider $N = 20$. In this case, $N$ has $6$ divisors; to determine this without needing to list them all, we may note that $N = 2^2\cdot 5^1$. Using the notation described above, this means that $p_0 = 2,\;p_1 = 5$ and $e_0 = 2,\;e_1 = 1$. Each of our divisors will be of the form $2^{a_0}\cdot 5^{a_1}$, where $a_0$ could be $0, 1,$ or $2$ and $a_1$ could be $0$ or $1$. Since we have $e_0+1 = 3$ options for $a_0$ and $e_1+1 = 2$ options for $a_1$, we have $3\cdot 2 = 6$ total divisors. In case you were wondering, the list of divisors is:</p>
<p>$$\{2^0 5^0, 2^1 5^0, 2^2 5^0, 2^0 5^1, 2^1 5^1, 2^2 5^1\}$$</p>
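<p>The formula is easy to sanity-check against a direct count. The snippet below is only a sketch with hypothetical names (<code>DivisorCheck</code>, <code>countByTrial</code>, <code>countByFormula</code>); the brute-force side is far too slow for large $N$ and exists purely to confirm the example.</p>

```java
final class DivisorCheck {
    // Direct count: try every candidate d from 1 to n.
    static int countByTrial(int n) {
        int count = 0;
        for (int d = 1; d <= n; d++) {
            if (n % d == 0) count++;
        }
        return count;
    }

    // Formula: (e_0 + 1)(e_1 + 1)...(e_n + 1), given the exponents
    // of the prime factorization.
    static int countByFormula(int[] exponents) {
        int count = 1;
        for (int e : exponents) {
            count *= (e + 1);
        }
        return count;
    }
}
```

<p>For $N = 20 = 2^2\cdot 5^1$, <code>countByTrial(20)</code> and <code>countByFormula(new int[]{2, 1})</code> both return $6$.</p>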
<a id="more"></a>
<h2 id="The-Program"><a href="#The-Program" class="headerlink" title="The Program"></a>The Program</h2><p>We’re not out of the woods yet–we have a formula, but we need to write a program to make use of it. The first thing our program needs is a list of primes. I’m going to assume you have a function already that can generate a list of primes. A prime-listing function is an important tool in any programmer’s toolkit, but I’ll save that for a future post.</p>
<p>The pseudocode for our program is below:</p>
<figure class="highlight stata"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line">numberOfDivisors: int <span class="keyword">N</span> -> int divisorCount</span><br><span class="line"> divisorCount = 1</span><br><span class="line"> <span class="keyword">for</span> (p = 1 to <span class="built_in">floor</span>(<span class="built_in">sqrt</span>(<span class="keyword">N</span>)) && p prime):</span><br><span class="line"> exponent = 0 </span><br><span class="line"></span><br><span class="line"> <span class="comment">//Determine exponent of p in prime factorization</span></span><br><span class="line"> <span class="keyword">while</span> (p divides <span class="keyword">N</span>):</span><br><span class="line"> exponent++</span><br><span class="line"> <span class="keyword">N</span> = <span class="keyword">N</span> / p</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"> <span class="comment">//Update divisorCount</span></span><br><span class="line"> divisorCount = divisorCount * (exponent + 1)</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"> <span class="comment">//In this case, there is one prime factor greater than the square root of N</span></span><br><span class="line"> <span class="keyword">if</span> (<span class="keyword">N</span> != 
1) divisorCount = divisorCount * 2</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"> <span class="keyword">return</span> divisorCount</span><br></pre></td></tr></table></figure>
<p>This is mostly straightforward: We iterate through all prime numbers no greater than the square root of N. For each prime, we determine how many times it divides N–this is that prime’s exponent. We then multiply the current divisor count by one more than the exponent. I have pushed an update to my <a href="https://github.com/apnorton/math" target="_blank" rel="external">math GitHub repository</a> that includes a Java version of this algorithm in NumberTheory.java.</p>
<p>If we keep track of which primes divide $N$ (for example, adding them to a List whenever we enter the while loop), this program is easily modified to output the prime factorization of a number.</p>
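<p>A minimal Java sketch of that modification follows. It is not the NumberTheory.java from the repository; the class and method names are assumptions made for illustration. It also swaps in plain trial division by every integer $p \ge 2$ instead of a precomputed prime list, which is safe because a composite $p$ can never divide what is left of $N$ once its smaller prime factors have been divided out.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class Factorization {
    // Maps each prime factor of n to its exponent, in increasing prime order.
    static Map<Long, Integer> factorize(long n) {
        if (n < 1) throw new IllegalArgumentException("n must be positive");
        Map<Long, Integer> factors = new LinkedHashMap<>();
        // p <= n / p is the overflow-safe form of p * p <= n.
        for (long p = 2; p <= n / p; p++) {
            int exponent = 0;
            while (n % p == 0) {      // determine the exponent of p
                exponent++;
                n /= p;
            }
            if (exponent > 0) factors.put(p, exponent);
        }
        // Whatever remains is the single prime factor above the square root.
        if (n > 1) factors.put(n, 1);
        return factors;
    }

    // Multiplies (exponent + 1) over the whole factorization.
    static long divisorCount(long n) {
        long count = 1;
        for (int exponent : factorize(n).values()) {
            count *= (exponent + 1);
        }
        return count;
    }
}
```

<p>With this sketch, <code>factorize(20)</code> yields the map <code>{2=2, 5=1}</code> and <code>divisorCount(20)</code> returns $6$, matching the worked example.</p>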
<h2 id="The-Analysis"><a href="#The-Analysis" class="headerlink" title="The Analysis"></a>The Analysis</h2><p>Before analyzing the performance of the algorithm, it would be best to explain why we only need to use primes up to $\sqrt{N}$, not all primes up to $N$. This is because <strong>there can only be one prime factor of $N$ greater than $\sqrt{N}$, and (if there is one) it must be raised only to the $1$st power.</strong> A proof by contradiction works well here (I’m skipping some rigor, please don’t kill me):</p>
<blockquote>
<p>Assume that there are two prime factors (not necessarily distinct) $p$ and $q$ of $N$, such that $p,q \gt \sqrt{N}$. Let the product of the remaining prime factors be some integer $m \ge 1$. Then we have:</p>
<p> $$\begin{aligned} N &= p\cdot q\cdot m \\ &\ge p\cdot q \\ &\gt \sqrt{N}\sqrt{N} \\ &= N\end{aligned}$$</p>
<p>This is clearly a contradiction, since $N$ cannot be strictly greater than itself; thus there cannot be two or more prime factors of $N$ greater than $\sqrt{N}$. Equivalently, there may be at most one prime factor of $N$ greater than $\sqrt{N}$.</p>
</blockquote>
<p>This explains why we don’t need to use primes greater than $\sqrt{N}$: if, after “dividing out” all primes up to $\sqrt{N}$, we are left with a number other than $1$, then that number must be the single prime factor of $N$ greater than $\sqrt{N}$.</p>
<p>On to the performance of the algorithm. Assuming we are <em>given</em> a list of prime numbers (and don’t have to compute them), this procedure has a time complexity of $\mathcal{O}\left(\pi\left(\sqrt{N}\right)\text{lg}(N)\right)$ and $\Omega\left(\pi\left(\sqrt{N}\right)\right)$, where $\pi(x)$ is the <a href="http://mathworld.wolfram.com/PrimeCountingFunction.html" target="_blank" rel="external">prime counting function</a> and $N$ is the input number. ($\Omega$ provides a lower bound, and $\mathcal{O}$ provides an upper bound.) Let’s see why.</p>
<p>We have an outer-most for loop that makes one iteration for each prime no greater than $\sqrt{N}$. This gives us the “$\pi\left(\sqrt{N}\right)$” part of the bounds. For the lower bound, we would assume the inside of the for loop executes in constant time, every time. (That is, we never enter the while loop.) This occurs when $N$ is a prime number. For the upper bound, we may assume that we execute the while loop at most $\text{lg}(N)$ times each iteration. This is because $\text{lg}(N) = \log_2(N) \gt \log_b(N)$ for $N \gt 2$ and integer $b\gt 2$, and $\lfloor\log_{p_k}(N)\rfloor$ provides a fairly close upper bound on the exponent of $p_k$ in the prime factorization of $N$.</p>
<p>The upper bound can be improved by performing some summation and simplification, but it’s close enough to show how fast this algorithm is.</p>
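<p>One way to carry out that summation, sketched under the assumption that every division and comparison is constant-time: each pass through the while loop removes one prime factor from $N$, and every prime factor is at least $2$, so</p>
<p>$$N = p_0^{e_0}p_1^{e_1}\cdots p_n^{e_n} \ge 2^{e_0 + e_1 + \cdots + e_n} \quad\Longrightarrow\quad e_0 + e_1 + \cdots + e_n \le \text{lg}(N)$$</p>
<p>The while loop therefore runs at most $\text{lg}(N)$ times over the whole execution, not per prime, which tightens the upper bound to $\mathcal{O}\left(\pi\left(\sqrt{N}\right) + \text{lg}(N)\right)$.</p>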
"Big-O" notation: An Introduction to Asymptotics of Loopshttp://www.apnorton.com/blog/2014/06/09/Big-O-notation-An-Introduction-to-Asymptotics-of-Loops/2014-06-10T03:28:45.000Z2016-12-19T22:55:04.614Z<p>Algorithmic efficiency is imperative for success in programming competitions; your programs must be accurate and fast. To help evaluate algorithms for speed, computer scientists focus on what is called “asymptotics,” or “asymptotic analysis.” The key question answered by asymptotics is: <strong>“When your input gets <em>really</em> big, how many steps does your program take?”</strong> This post seeks to explain basic asymptotic analysis and its application to computing simple program runtime.</p>
<p>The underlying principle of asymptotic analysis is that a program’s runtime depends on the number of <em>elementary operations</em> it performs. The fewer elementary operations, the faster the program (and vice-versa). What do I mean by “elementary operation?” By this, I refer to any operation such that the runtime is not affected by the input size. This is more commonly referred to as a <em>constant-time</em> operation. Examples of such operations are assignment, basic arithmetic operations (<code>+, -, *, /, %</code>), accessing an array element, increment/decrement operations, function returns, and boolean expressions. </p>
<h2 id="A-First-Example"><a href="#A-First-Example" class="headerlink" title="A First Example"></a>A First Example</h2><p>So, a good way of gauging the runtime of a program is to count the number of elementary operations it performs. Let’s jump right in by analyzing a simple program. </p>
<figure class="highlight aspectj"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">public</span> <span class="keyword">static</span> <span class="function"><span class="keyword">int</span> <span class="title">test</span><span class="params">(<span class="keyword">int</span> N)</span> </span>{</span><br><span class="line"> <span class="keyword">int</span> i = <span class="number">0</span>;</span><br><span class="line"></span><br><span class="line"> <span class="keyword">while</span>(i < N) {</span><br><span class="line"> i++;</span><br><span class="line"> }</span><br><span class="line"> <span class="keyword">return</span> i;</span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p>Obviously, this program always returns $N$, so the loop is unnecessary. However, let’s just analyze the method as-is.</p>
<p>Lines 2 and 7 each contribute one constant-time operation. The loop contributes two constant-time operations per iteration (one for the comparison, one for the increment), plus one extra constant-time operation for the final comparison that terminates the loop. So, the total number of operations is:</p>
<p>$$1 + 1 + \underbrace{\sum_{i = 0}^{N-1} 2}_{\text{loop operations}} + 1 = 3 + 2N$$</p>
<p>(Notice how I used sigma (summation) notation to count the loop’s operations. This is useful, because loops and sigma notation behave in much the same way.)</p>
<p>Thus, it will take $3+2N$ operations to perform that method, given an input $N$. If each operation takes $5\times 10^{-10}$ seconds (about one cycle of a 2 GHz processor), it would take roughly 10 seconds to run this program for an input of $N=10^{10}$.</p>
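<p>As a sanity check on the $3+2N$ count, here is a minimal sketch that instruments the method above with a counter. The class name <code>OpCount</code> and the method <code>countOps</code> are my own hypothetical names, and the sketch assumes each assignment, comparison, increment, and return counts as exactly one operation.</p>

```java
// Hypothetical instrumented copy of the method above (names are assumptions).
class OpCount {
    // Returns how many elementary operations the original method performs for input n.
    static long countOps(int n) {
        long ops = 0;
        int i = 0;
        ops++;                 // line 2: the assignment i = 0
        while (true) {
            ops++;             // every evaluation of the comparison i < n
            if (!(i < n)) {
                break;         // the final, failing comparison ends the loop
            }
            i++;
            ops++;             // the increment i++
        }
        ops++;                 // line 7: the return
        return ops;
    }

    public static void main(String[] args) {
        for (int n : new int[] {0, 1, 10, 1000}) {
            System.out.println("N = " + n + ": counted " + countOps(n)
                    + ", formula gives " + (3 + 2L * n));
        }
    }
}
```

<p>For every non-negative $N$ the counted value and the formula should agree, which is a quick way to catch off-by-one mistakes in the summation bounds.</p>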
<a id="more"></a>
<h2 id="Let’s-make-that-easier…"><a href="#Let’s-make-that-easier…" class="headerlink" title="Let’s make that easier…"></a>Let’s make that easier…</h2><p>That was a lot of work for such a simple result; is there an easier way to get a similar answer? Fortunately, the answer is <strong><em>yes!</em></strong></p>
<p>First, let us introduce something that we will call “Big-O notation.” This is a way of describing the long-term growth of a function. The rigorous definition of Big-O is beyond the scope of this blog, but the following should suffice:</p>
<blockquote>
<p>We say $f(n)$ is $\mathcal{O}(g(n))$ if and only if some constant multiple of $g(n)$ is at least as large as $f(n)$ whenever $n$ is sufficiently large. Simply put, this means that, in the long term, $g$ grows at least as fast as $f$.</p>
</blockquote>
<p>As an example, we can say that $f(n) = 3n+2$ is $\mathcal{O}(n)$, because the function $g(n) = n$ grows exactly as fast as $f(n)$. Or, we can say, $f$ is $\mathcal{O}(n^2)$, because $n^2$ grows faster than $f$, for sufficiently large $n$. Basically, this means we can ignore two things:</p>
<ol>
<li>We can ignore anything that is “small” in the long-term. For example, if $f(x) = 4x^3 + 2x + 3 + \frac{1}{x}$, everything except the “$4x^3$” part becomes small (in comparison) as $x$ gets big.</li>
<li>We can also ignore coefficients. That is, we don’t have to worry about the difference between $4x^3$ and $x^3$. The two differ only by a constant factor, so they scale the same way: doubling $x$ multiplies either one by $8$.</li>
</ol>
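<p>As a quick check of the $f(n) = 3n+2$ example against the definition (the constant $4$ and the threshold $n \ge 2$ are just one convenient choice, not the only one that works):</p>
<p>$$ 3n + 2 \le 3n + n = 4n \quad \text{for all } n \ge 2 $$</p>
<p>So a constant multiple of $g(n) = n$, namely $4n$, stays above $f(n)$ once $n$ is large enough, which is exactly what $f(n) = \mathcal{O}(n)$ requires.</p>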
<p>To apply this to algorithm analysis, this means that we only have to worry about the “biggest time-user,” rather than all the individual steps. For most simple programs, this means focusing on loops. (In advanced problems, you must account for recursion.)</p>
<p>Next, we recall that a single loop can be represented with a single summation sign. One can fairly quickly see that a nested loop can be represented with a “sum of sums,” or multiple, nested summation signs. It can be proven that:</p>
<p>$$ \underbrace{\sum_N\left(\sum_N\left(\cdots\sum_N f(N)\right)\right)}_{k \text{ summation signs}} = \mathcal{O}(N^k\cdot f(N)) $$</p>
<blockquote>
<p><strong>Important Result</strong><br>Interpreted into programmer-speak, this means that <em>a program with nested loops (each executing ~$N$ times) to a maximum depth of $k$ will take $\mathcal{O}(N^k)$ operations to complete said loops.</em></p>
</blockquote>
<h2 id="Another-Example"><a href="#Another-Example" class="headerlink" title="Another Example"></a>Another Example</h2><p>So, let’s apply this idea to a bit more complicated program:</p>
<figure class="highlight hsp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line">public static <span class="keyword">int</span> test(<span class="keyword">int</span> N) {</span><br><span class="line"> <span class="keyword">int</span> total = <span class="number">0</span><span class="comment">;</span></span><br><span class="line"> </span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> i = <span class="number">0</span><span class="comment">; i < N; i++) {</span></span><br><span class="line"> <span class="keyword">for</span> (<span class="keyword">int</span> j = i<span class="comment">; j < N; j++) {</span></span><br><span class="line"> total++<span class="comment">;</span></span><br><span class="line"> }</span><br><span class="line"> }</span><br><span class="line"> </span><br><span class="line"> <span class="keyword">return</span> total<span class="comment">;</span></span><br><span class="line">}</span><br></pre></td></tr></table></figure>
<p>Now we have a nested loop! Looking at this program, we realize that the “deepest” nesting is only $2$ deep. Thus, by our important result, we know that this program runs in $\mathcal{O}(N^2)$ time.</p>
<p>This means that, as $N$ gets very large, doubling the input roughly <em>quadruples</em> the runtime.</p>
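<p>The nesting-depth rule only gives an upper bound, so it can be reassuring to count exactly. Since the inner loop starts at <code>j = i</code>, it runs $N - i$ times, and the total number of increments is $\sum_{i=0}^{N-1}(N-i) = \frac{N(N+1)}{2}$, which is still $\mathcal{O}(N^2)$. The sketch below (the class name <code>NestedCount</code> and method name <code>run</code> are hypothetical, and <code>total</code> is widened to <code>long</code> to avoid overflow) compares the loop against that closed form and prints the ratio when $N$ doubles.</p>

```java
// Hypothetical harness around the nested loop above (names are assumptions).
class NestedCount {
    // Same loop structure as the example; returns how many times total++ executes.
    static long run(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                total++;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long previous = 0;
        for (int n : new int[] {1000, 2000, 4000}) {
            long counted = run(n);
            long closedForm = (long) n * (n + 1) / 2;
            System.out.println("N = " + n + ": counted " + counted
                    + ", N(N+1)/2 = " + closedForm);
            if (previous != 0) {
                // The ratio approaches 4 as N grows.
                System.out.println("  ratio to previous N: " + (double) counted / previous);
            }
            previous = counted;
        }
    }
}
```

<p>The ratio is not exactly $4$ because of the lower-order $\frac{N}{2}$ term, but that term is precisely the kind of “small” contribution Big-O lets us ignore.</p>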
<h2 id="Other-Notes"><a href="#Other-Notes" class="headerlink" title="Other Notes"></a>Other Notes</h2><p>Obviously, there are more cases that can arise in algorithm analysis, instead of the simple loops given above. For example, recursion and atypical loops (e.g. loops that double the counter each iteration, rather than adding one) require other methods than the “Important Result” I gave here. Fortunately, there are a few common designations that arise:</p>
<p>$$ \mathcal{O}(\log_2(n)),\;\mathcal{O}(n^k),\;\mathcal{O}(2^n),\;\mathcal{O}(n!),\;\mathcal{O}(n^n) $$</p>
<p>I will note that I have written the above in increasing order of runtime. That is, for sufficiently large $n$, an algorithm that runs in $\mathcal{O}(\log_2(n))$ is faster than one that runs in $\mathcal{O}(2^n)$, etc.</p>
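<p>To illustrate where the $\mathcal{O}(\log_2(n))$ designation comes from, here is a minimal sketch of one of the “atypical” loops mentioned above: the counter doubles each iteration instead of increasing by one. The class name <code>DoublingLoop</code> and method name <code>iterations</code> are hypothetical, and the sketch assumes $n \le 2^{62}$ so that the doubling never overflows a <code>long</code>.</p>

```java
// Hypothetical example of a loop whose counter doubles (names are assumptions).
class DoublingLoop {
    // Counts the iterations of a loop that doubles i until it reaches n.
    // Assumes n <= 2^62 so that i *= 2 cannot overflow.
    static int iterations(long n) {
        int count = 0;
        for (long i = 1; i < n; i *= 2) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        for (long n : new long[] {8L, 1_000L, 1_000_000L, 1_000_000_000L}) {
            System.out.println("n = " + n + ": " + iterations(n) + " iterations");
        }
    }
}
```

<p>The loop stops as soon as $2^{\text{count}} \ge n$, so the iteration count is $\lceil \log_2 n \rceil$ for $n \ge 1$: multiplying the input by a thousand adds only about ten iterations.</p>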
<p>One can spend many hours studying asymptotic calculations. In fact, there’s an entire chapter devoted to this in <em>Concrete Mathematics</em> by Graham, Knuth, and Patashnik. (I <em>highly</em> recommend this book to anyone interested in programming; it is, quite literally, the best book I have ever opened related to computer science.) For a thorough guide to applying asymptotic calculations to programs, I recommend consulting a good Algorithms and Data Structures text.</p>
Green's Theorem and the Area of Polygonshttp://www.apnorton.com/blog/2014/06/05/Greens-Theorem-and-The-Area-of-Polygons/2014-06-05T22:48:45.000Z2016-12-19T22:55:04.614Z<p>I am an avid member of the <a href="http://math.stackexchange.com" target="_blank" rel="external">Math.StackExchange</a> community. We have recently reached a milestone, as our request to create a site blog has been approved by the Stack Exchange administration. I volunteered to write a post which I believe should be useful to competition programmers.</p>
<p>Using Green’s Theorem, <a href="http://math.blogoverflow.com/2014/06/04/greens-theorem-and-area-of-polygons/" target="_blank" rel="external">this post</a> derives a formula for the area of any simple polygon, dependent solely on the coordinates of the vertices. This is useful for some computational geometry problems in programming; for example, the formula can be used to compute the area of the <a href="https://en.wikipedia.org/wiki/Convex_hull" target="_blank" rel="external">convex hull</a> of a set of points.</p>
Hello World!http://www.apnorton.com/blog/2014/05/23/Hello-World/2014-05-24T03:58:00.000Z2016-12-19T22:55:04.614Z<p>After managing a fairly successful blog for many years about competitive robotics, I am attempting to re-brand myself as I begin my studies in the field of Computer Science and Mathematics.</p>
<p>This blog will be the place I post interesting pieces of code I either develop or find, as well as math concepts useful to competitive programmers. This blog will focus heavily on ACM-style competitions, and may occasionally contain hints or my solutions to problems from sites like USACO or UVA Online Judge. I will attempt to post most of my code from here on a GitHub repository, but I’m still experimenting with that.</p>
<p>Java is my “native language,” although I know Visual BASIC, C++ to a fair extent (few people can actually say they “really know” C++), and some Python. I’ll try to mix up the languages I post (I’ll have a tag for each language), but I predict most of my posts will be Java-oriented.</p>
<p>As always, topic suggestions are welcome via comments on any post.</p>