<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Ethan Fast &#187; Computer Science</title>
	<atom:link href="http://blog.ethanjfast.com/category/computer-science/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.ethanjfast.com</link>
	<description>Lambdas, Hacks, and Fiction</description>
	<lastBuildDate>Fri, 27 Aug 2010 12:50:02 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	
		<item>
		<title>CS Grad School Applications</title>
		<link>http://blog.ethanjfast.com/2010/08/cs-grad-school-applications/</link>
		<comments>http://blog.ethanjfast.com/2010/08/cs-grad-school-applications/#comments</comments>
		<pubDate>Tue, 24 Aug 2010 01:21:53 +0000</pubDate>
		<dc:creator>Ethan</dc:creator>
				<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Applications]]></category>
		<category><![CDATA[computer science grad school]]></category>
		<category><![CDATA[CS grad school]]></category>
		<category><![CDATA[CS graduate school]]></category>
		<category><![CDATA[CS Phd]]></category>
		<category><![CDATA[Grad School]]></category>
		<category><![CDATA[School]]></category>

		<guid isPermaLink="false">http://blog.ethanjfast.com/?p=494</guid>
		<description><![CDATA[For a few years now, I&#8217;ve suspected that I would be applying for graduate programs in computer science. Since late freshman year, then, I&#8217;ve been absorbing and collecting information related to the process from any place that such might be acquired. Although undoubtedly the best way to get insight about grad school is to talk to [...]]]></description>
			<content:encoded><![CDATA[<p>For a few years now, I&#8217;ve suspected that I would be applying for graduate programs in computer science. Since late freshman year, then, I&#8217;ve been absorbing and collecting information related to the process from any place that such might be acquired. Although undoubtedly the best way to get insight about grad school is to talk to professors or current grad students, a vast amount of high-quality information is available on the web. Here are the pieces I have found most helpful:</p>
<p><span style="text-decoration: underline;">Grad School Application Advice:</span></p>
<ul>
<li><a href="http://www.cs.cmu.edu/~harchol/gradschooltalk.pdf">CS Professor Mor Harchol-Balter at Carnegie Mellon</a></li>
<li><a href="http://www.stanford.edu/~pgbovine/grad-school-app-tips.htm">Philip Guo (a Stanford CS grad student)</a></li>
<li><a href="http://pages.cs.wisc.edu/~gleicher/Web/Advice/GradSchoolFAQ">CS Professor Mike Gleicher at Wisconsin</a></li>
<li><a href="http://matt.might.net/articles/how-to-apply-and-get-in-to-graduate-school-in-science-mathematics-engineering-or-computer-science/">CS Professor Matt Might at Utah</a></li>
</ul>
<p>It may also be useful to take a look at a few forums. The quality of information there is far less consistent, but there are certainly useful facts to be gleaned.</p>
<p><span style="text-decoration: underline;">Forums:</span></p>
<ul>
<li><a href="http://forum.thegradcafe.com/">The Grad Cafe</a></li>
<li><a href="http://forum.thegradcafe.com/">Urch CS Admission Forums</a></li>
<li><a href="http://talk.collegeconfidential.com/graduate-school/">Graduate School Forum on College Confidential</a></li>
</ul>
<p>I&#8217;ll resist editorializing on the insights I&#8217;ve gained from all these resources, but the links I&#8217;ve provided should give anyone interested in CS graduate school a good starting point for developing their own opinions.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.ethanjfast.com/2010/08/cs-grad-school-applications/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Allure of the Asymmetrical</title>
		<link>http://blog.ethanjfast.com/2010/03/the-allure-of-the-asymmetrical/</link>
		<comments>http://blog.ethanjfast.com/2010/03/the-allure-of-the-asymmetrical/#comments</comments>
		<pubDate>Sun, 28 Mar 2010 14:35:24 +0000</pubDate>
		<dc:creator>Ethan</dc:creator>
				<category><![CDATA[C]]></category>
		<category><![CDATA[Clojure]]></category>
		<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Functional Programming]]></category>
		<category><![CDATA[Gajure]]></category>
		<category><![CDATA[Git]]></category>
		<category><![CDATA[Syntax]]></category>

		<guid isPermaLink="false">http://blog.ethanjfast.com/?p=407</guid>
		<description><![CDATA[Thoughts on code asymmetry, as inspired by the lowly egg. Although I&#8217;ve lived in Charlottesville for quite a while now, I haven&#8217;t really taken advantage of the numerous and various local farms. This is perhaps odd, given my health-obsessive nature, and our local prevalence of natural, grass-fed animal products. But whatever the impediment to my [...]]]></description>
			<content:encoded><![CDATA[<p><em><span style="color: #808080;">Thoughts on code asymmetry, as inspired by the lowly egg.</span></em></p>
<p>Although I&#8217;ve lived in Charlottesville for quite a while now, I haven&#8217;t really taken advantage of the numerous and various local farms. This is perhaps odd, given my health-obsessive nature, and our local prevalence of natural, grass-fed animal products. But whatever the impediment to my action &#8212; let us suppose  schoolwork, research, or entrepreneurial activity &#8212; I finally got around to patronizing <a href="http://www.averysbranchfarms.com/">Avery&#8217;s Branch Farms</a> last week.</p>
<p>So what does this have to do with asymmetry?  Well, I&#8217;ll be short and banal and to the point: I found that the &#8220;natural&#8221;  Avery eggs look nicer than their industrial-farm begotten counterparts. Consider Avery eggs:</p>
<p style="text-align: left;"><img class="aligncenter" title="Avery Eggs" src="http://ethanjfast.com/images/egg.jpg" alt="Avery Eggs" width="500" /></p>
<p style="text-align: left;">Compare them with those of a typical no-name brand:</p>
<p style="text-align: center;"><img class="aligncenter" title="Eggs Normal" src="http://ethanjfast.com/images/eggs_bad.jpg" alt="Eggs Normal" width="493" height="335" /></p>
<p style="text-align: left;">At least to my eye, the local farm&#8217;s eggs look quite a bit more appealing, and not simply because of differences in color or lighting. To my unartistic perception, I would suggest that the appeal stems directly from asymmetry. If you look closely at the farm eggs, you might see that there are small but noticeable deviations in size across the carton, a slight variance in color, and a pseudo-random sprinkling of freckles across shell exteriors. This is all in comparison to the bleached look of the industrial eggs, with little or no variation in egg size across the carton.</p>
<p style="text-align: left;">Rest assured, lest I be accused of inane ramblings on this wonderful subject, that there is a larger point here. I am guessing that a slight asymmetry plays nicer with human perception. It might draw our eyes to important details, and make it easier for us to perceive the totality of written information. In short, our perception of asymmetry may have implications for how we write code.</p>
<p style="text-align: left;">Consider the comparison of a C function (taken sort-of randomly from <a href="http://github.com/git/git">Git</a>) and a Lisp function (taken from <a href="http://github.com/Ejhfast/Gajure">Gajure</a>). First let&#8217;s look at <em>add_files_to_cache</em>:<br />
<script src="http://gist.github.com/346771.js?file=function_from_git.c"></script> </p>
<p>Reading from the left, this function is actually quite symmetrical. The eye is not overly drawn to any particular piece of code within the curly braces &#8212; perhaps the longer lines, if anything &#8212; and functional properties are not clearly conveyed through its structure. Much of this, although not all of it, has to do with the imperative nature of C. Now, consider a second function, <em>list-crossover</em>: <script src="http://gist.github.com/346774.js?file=list-crossover.clj"></script></p>
<p>Here we see a nesting of sorts. Although this is a simpler function, to be sure, we can see by way of indentation certain functional qualities of the code (say, that <em>take</em> and <em>drop</em> are applied in the context of <em>concat</em>). The functional nature of Clojure (or any Lisp) lends itself well to this kind of visual deconstruction, and at least to my eye, meaning is more readily conveyed through such asymmetrical properties.</p>
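<p>Since the embedded gist may not render in every feed reader, here is a minimal sketch of a single-point list crossover in the same spirit. This is not the actual Gajure source: the name <em>list-crossover-sketch</em> and its explicit <em>point</em> argument are assumptions made purely for illustration.</p>
<pre><code>;; Hypothetical sketch, not the Gajure implementation.
;; Builds a child from the head of one parent and the tail of the other.
(defn list-crossover-sketch
  [parent-a parent-b point]
  (concat
    (take point parent-a)
    (drop point parent-b)))

(list-crossover-sketch [1 2 3 4] [:a :b :c :d] 2)
;; =&gt; (1 2 :c :d)
</code></pre>
<p>Even in this tiny example, the indentation shows that <em>take</em> and <em>drop</em> feed into <em>concat</em>.</p>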
<p>In closing, I don&#8217;t mean to imply that one language is better than another at conveying meaning through syntax and symmetry, merely to suggest that the way code is written (surprise, surprise, no?) has quite a bit to do with how easily it can be understood. To that end, I think symmetry, or a lack thereof, plays an important role.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.ethanjfast.com/2010/03/the-allure-of-the-asymmetrical/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Analyzing Word Frequencies with Clojure, Enlive and Incanter</title>
		<link>http://blog.ethanjfast.com/2010/03/analyzing-word-frequencies-with-clojure-enlive-and-incanter/</link>
		<comments>http://blog.ethanjfast.com/2010/03/analyzing-word-frequencies-with-clojure-enlive-and-incanter/#comments</comments>
		<pubDate>Mon, 08 Mar 2010 18:39:48 +0000</pubDate>
		<dc:creator>Ethan</dc:creator>
				<category><![CDATA[Clojure]]></category>
		<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Enlive]]></category>
		<category><![CDATA[Incanter]]></category>
		<category><![CDATA[Wordy]]></category>

		<guid isPermaLink="false">http://blog.ethanjfast.com/?p=381</guid>
		<description><![CDATA[I&#8217;ve long been interested in getting a better feel for Incanter, a statistical computing and graphical environment for Clojure. So gifted with the fleeting favors of my muse (otherwise known as free time), I thought I&#8217;d put together a small library &#8212; although it&#8217;s not quite a library, yet &#8212; for analyzing word-use patterns on blogs and webpages. To [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve long been interested in getting a better feel for <a href="http://incanter.org/">Incanter</a>, a statistical computing and graphical environment for Clojure. So gifted with the fleeting favors of my muse (otherwise known as <em>free time</em>), I thought I&#8217;d put together a small library &#8212; although it&#8217;s not quite a library, yet &#8212; for analyzing word-use patterns on blogs and webpages.</p>
<p>To do this, I drew a bit of help from <a href="http://github.com/cgrand/enlive">Enlive</a>, which functions primarily as a templating library, but has a few features useful for screen-scraping. This was perhaps a bit of overkill, as I only ended up using one of its functions, <em>html-resource</em>, which takes a URL as input and outputs a nested map/list structure that nicely represents a web page&#8217;s structure.</p>
<p>What I ended up with is <a href="http://github.com/Ejhfast/wordy">wordy</a>, which at the moment can do a simple word-count frequency analysis on a given page. That is, it counts how often words are used, filtering (if desired) on word length. In just a bit, I&#8217;ll get into some of the more interesting aspects of coding it up, but first, here is a simple use case.</p>
<p>Running the following in slime&#8230;<br />
<code>(graph-words "http://ethanjfast.com" 5 5 1)</code></p>
<p style="text-align: center;"><img class="aligncenter" title="As applied to this blog." src="/images/ethanjfast.com.png" alt="" width="500" /></p>
<p style="text-align: left;">Where the parameters correspond to:</p>
<ul>
<li>ethanjfast.com -&gt; web page to look at</li>
<li>5 -&gt; minimum length (letter count) of word for first analysis</li>
<li>5 -&gt; minimum length of word for last analysis</li>
<li>1 -&gt; the amount of word length to increment by between the first and last analysis</li>
</ul>
<p>To make this a bit clearer, consider a different run:<br />
<code>(graph-words "http://ycombinator.posterous.com" 3 10 3)</code></p>
<p style="text-align: center;"><img class="aligncenter" title="Ycombinator Run" src="/images/ycom2.png" alt="" width="500" /></p>
<p style="text-align: left;">Here wordy does three analyses, with minimum word lengths of 3, 6, and 9 respectively. Clearly, I have some work to do insofar as these graphs look rather pathetic, but it was nice to get Incanter working.</p>
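<p>The counting step described above can be sketched in a few lines of plain Clojure. This is an illustration rather than the code in wordy: <em>count-words</em> and <em>min-lengths</em> are hypothetical names, and the sketch works on a string instead of a fetched page.</p>
<pre><code>;; Hypothetical sketch, not the wordy source.
(require '[clojure.string :as str])

(defn count-words
  "Counts words in text that have at least min-len letters."
  [text min-len]
  (frequencies
    (filter #(&gt;= (count %) min-len)
            (re-seq #"[a-z]+" (str/lower-case text)))))

(defn min-lengths
  "Minimum word lengths from first to last (inclusive), stepping by step."
  [first-len last-len step]
  (range first-len (inc last-len) step))

(count-words "The cat and the other cat" 3)
;; =&gt; {"the" 2, "cat" 2, "and" 1, "other" 1}

(min-lengths 3 10 3)
;; =&gt; (3 6 9)
</code></pre>
<p>Treating the last length as inclusive is an assumption, chosen so that a call such as <em>(min-lengths 5 5 1)</em> still yields a single analysis.</p>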
<p style="text-align: left;">Now, onto some implementation details. Most of the code is quite simple, so I&#8217;ll just go through a few functions that may have some value to someone learning Clojure. For instance, here is <em>rec-map</em>, a function which recursively traverses the map/list structure returned by <em>html-resource</em>.</p>
<script src="http://gist.github.com/325414.js"></script>
<p>Basically, this function filters out all page content that doesn&#8217;t match specific tags (getting rid of links, CSS, JavaScript, etc.). At first glance, you might wonder why I used <em>trampoline</em> rather than <em>recur</em>. After all, <em>trampoline</em> is normally used for mutual recursion between different functions, and it looks very much like <em>rec-map</em> is calling itself. The trick is that the recursive call happens inside the anonymous function passed to <em>map</em>, where <em>recur</em> would target that anonymous function rather than <em>rec-map</em>, so it fails spectacularly (and in a very confusing manner). So watch out for recursion within anonymous functions!</p>
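<p>To make the pitfall concrete, here is a small sketch (the function name is hypothetical and unrelated to wordy). Inside an anonymous function, <em>recur</em> rebinds that anonymous function, not the enclosing named one, so the recursive step has to call the outer function by name:</p>
<pre><code>;; Hypothetical sketch illustrating the recur target.
;; Broken idea: (map #(recur %) node) would make the anonymous
;; function jump back to itself, never reaching the outer function.

(defn sum-nested
  "Sums every number in an arbitrarily nested collection."
  [node]
  (if (coll? node)
    ;; Call sum-nested by name; recur here would target the #() fn.
    (reduce + (map #(sum-nested %) node))
    node))

(sum-nested [1 [2 3] [[4]]])
;; =&gt; 10
</code></pre>
<p>Each level of nesting uses a stack frame in this version, which is one reason to reach for something like <em>trampoline</em> on very deep structures.</p>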
<p>Here is another bit of code, where I create the graph with Incanter.</p>
<script src="http://gist.github.com/325432.js"></script>
<p>The :group-by parameter is slightly unintuitive. To use it, you make a new vector of labels, each label mapping to a counterpart in the data vector. All data with the same label are then put into the same group (e.g., for data ["You" "Me" "I"] [3 2 4] one might use the label vector [0 1 1] to group &#8220;Me&#8221; and &#8220;I&#8221; together). The rest is fairly self-explanatory, but I&#8217;ll mention one thing that I didn&#8217;t know until this morning: you can&#8217;t nest the #() function shortcut. For instance, the following would not work:</p>
<p><code>(map #(map #(first %1) %1) lst)</code></p>
<p>It&#8217;s rather obvious in retrospect, I know. But I was dumb enough to try it. That&#8217;s all for now, and the code is available on <a href="http://github.com/Ejhfast/wordy">GitHub</a>.</p>
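<p>For reference, one way around the nesting restriction is to write the outer function with <em>fn</em>, or to pass <em>first</em> directly. The sample data below is made up for illustration:</p>
<pre><code>;; Illustrative data, not from wordy.
(def lst [[[1 2] [3 4]] [[5 6] [7 8]]])

;; Explicit fn for the outer function, #() for the inner one.
(map (fn [inner] (map #(first %) inner)) lst)
;; =&gt; ((1 3) (5 7))

;; Equivalent and shorter: first needs no wrapper.
(map #(map first %) lst)
;; =&gt; ((1 3) (5 7))
</code></pre>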
]]></content:encoded>
			<wfw:commentRss>http://blog.ethanjfast.com/2010/03/analyzing-word-frequencies-with-clojure-enlive-and-incanter/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Gajure Now on Clojars</title>
		<link>http://blog.ethanjfast.com/2010/02/gajure-now-on-clojars/</link>
		<comments>http://blog.ethanjfast.com/2010/02/gajure-now-on-clojars/#comments</comments>
		<pubDate>Mon, 22 Feb 2010 18:39:03 +0000</pubDate>
		<dc:creator>Ethan</dc:creator>
				<category><![CDATA[Clojure]]></category>
		<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Gajure]]></category>
		<category><![CDATA[Genetic Algorithms]]></category>

		<guid isPermaLink="false">http://blog.ethanjfast.com/?p=351</guid>
		<description><![CDATA[Gajure, my small genetic algorithm framework, is now up on Clojars. Hopefully, this should make it much more convenient to use in a real project. I also added Leiningen support, and if you use Clojure with any frequency, I&#8217;d recommend checking that out.]]></description>
			<content:encoded><![CDATA[<p><a href="http://github.com/Ejhfast/Gajure">Gajure</a>, my small genetic algorithm framework, is now up on <a href="http://clojars.org/gajure">Clojars</a>. Hopefully, this should make it much more convenient to use in a real project. I also added <a href="http://github.com/technomancy/leiningen">Leiningen</a> support, and if you use Clojure with any frequency, I&#8217;d recommend checking that out.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.ethanjfast.com/2010/02/gajure-now-on-clojars/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Tweeting Narcissist</title>
		<link>http://blog.ethanjfast.com/2009/12/the-tweeting-narcissist/</link>
		<comments>http://blog.ethanjfast.com/2009/12/the-tweeting-narcissist/#comments</comments>
		<pubDate>Thu, 31 Dec 2009 22:34:47 +0000</pubDate>
		<dc:creator>Ethan</dc:creator>
				<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[sinatra]]></category>
		<category><![CDATA[twitter]]></category>
		<category><![CDATA[web application]]></category>

		<guid isPermaLink="false">http://blog.ethanjfast.com/?p=261</guid>
		<description><![CDATA[I&#8217;ve been playing a bit with the Sinatra web framework, and after some intermittent coding, I ended up with a toy project I&#8217;m calling the Narcissist Quotient. It may seem that I&#8217;m poking fun at Twitter&#8217;s ego-centric bent, and perhaps this is true. It is equally possible, however, that my design is to satirize [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been playing a bit with the <a href="http://www.sinatrarb.com/">Sinatra</a> web framework, and after some intermittent coding, I ended up with a toy project I&#8217;m calling the <a href="http://narcissist.ethanjfast.com">Narcissist Quotient</a>. It may seem that I&#8217;m poking fun at Twitter&#8217;s ego-centric bent, and perhaps this is true. It is equally possible, however, that my design is to satirize those who boisterously lament &#8220;the rampant self-absorption&#8221; instilled through today&#8217;s technologies. I&#8217;ll leave the verdict up to the reader.</p>
<p>As always, the code is open source and available on <a href="http://github.com/Ejhfast/Narcissist">GitHub</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.ethanjfast.com/2009/12/the-tweeting-narcissist/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Slowly Programming in R</title>
		<link>http://blog.ethanjfast.com/2009/12/slowly-programming-in-r/</link>
		<comments>http://blog.ethanjfast.com/2009/12/slowly-programming-in-r/#comments</comments>
		<pubDate>Sat, 12 Dec 2009 13:43:23 +0000</pubDate>
		<dc:creator>Ethan</dc:creator>
				<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[R Programming Language]]></category>

		<guid isPermaLink="false">http://blog.ethanjfast.com/?p=231</guid>
		<description><![CDATA[Recently, I coded up a cross validation function in R, and things were moving rather less quickly than I would have liked. (The purpose of c.v. is to assess how well one&#8217;s statistical analysis will generalize to an independent data set.)  Anyhow, I was implementing 10-fold cross validation, and with a dataset containing around 100,000 observations, my [...]]]></description>
			<content:encoded><![CDATA[<p>Recently, I coded up a <a href="http://en.wikipedia.org/wiki/Cross-validation_(statistics)">cross validation</a> function in R, and things were moving rather less quickly than I would have liked. (The purpose of c.v. is to assess how well one&#8217;s statistical analysis will generalize to an independent data set.)  Anyhow, I was implementing 10-fold cross validation, and with a dataset containing around 100,000 observations, my code was taking hours to run. This was, of course, ridiculous.</p>
<p>Now, I doubt that it will come as a surprise, but I am rather a newbie at this whole R thing, and as I later found out, loops in R should be avoided at all costs. After hacking around with my code, I found that its critical path looked something like this:<br />
<code><br />
total &lt;- 0<br />
for(i in 1:nrow(dataset)){<br />
total &lt;- total + sum( dataset[i,1:25]*coef )<br />
}<br />
</code><br />
Now this is a very simple loop, and it seemed to me somewhat less than obvious that it would beget a significant performance bottleneck. Ever so naturally, then, it did.</p>
<p>Ironically, the solution here is to use code more along the lines of the map-reduce paradigm, something I would have loved to do in the first place, were I not overcome by the cryptic nature of R&#8217;s documentation. After all, my favorite languages are all variants of Lisp, and I am no stranger to functional programming. After some digging, I stumbled across <em>apply</em>, which more-or-less functions along the lines of <em>map</em> in Scheme or Clojure. So I tried:<br />
<code><br />
my_sum &lt;- function(x){ sum( x[1:25]*coef ) }<br />
# MARGIN = 1 applies my_sum to each row<br />
sum( apply( dataset, 1, my_sum ) )<br />
</code><br />
In addition to being more elegant, this is much, much faster. What was taking hours now takes tens of seconds. Apparently, R has a fast backend implementation for this sort of thing. So, this post is a warning to my fellow inexperienced users: avoid iterative loops in R!</p>
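<p>For comparison, the same map-then-sum shape looks like this in Clojure. This is only a sketch of the idea: <em>dataset</em> and <em>coef</em> are small made-up vectors, and the two-column rows stand in for the 25 columns used above.</p>
<pre><code>;; Illustrative data standing in for the real dataset.
(def coef [0.5 2.0])
(def dataset [[1 2] [3 4] [5 6]])

(defn weighted-sum
  "Dot product of one row with the coefficient vector."
  [row]
  (reduce + (map * row coef)))

;; Map over rows, then reduce the per-row results.
(reduce + (map weighted-sum dataset))
;; =&gt; 28.5
</code></pre>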
]]></content:encoded>
			<wfw:commentRss>http://blog.ethanjfast.com/2009/12/slowly-programming-in-r/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>For the Autodidact</title>
		<link>http://blog.ethanjfast.com/2009/10/for-the-autodidact/</link>
		<comments>http://blog.ethanjfast.com/2009/10/for-the-autodidact/#comments</comments>
		<pubDate>Thu, 08 Oct 2009 13:36:40 +0000</pubDate>
		<dc:creator>Ethan</dc:creator>
				<category><![CDATA[Computer Science]]></category>
		<category><![CDATA[ebook]]></category>
		<category><![CDATA[Math]]></category>

		<guid isPermaLink="false">http://blog.ethanjfast.com/?p=87</guid>
		<description><![CDATA[I recently stumbled upon several good (and free) books. All in pdf format: Linear Algebra (Jim Hefferon) Statistics (Michael Lavine) A Field Guide to Genetic Programming (Riccardo Poli, William B Langdon, Nicholas Freitag McPhee) Neural Networks (Raul Rojas) Introduction to Computing (David Evans)]]></description>
			<content:encoded><![CDATA[<p>I recently stumbled upon several good (and free) books. All in pdf format:</p>
<ul>
<li><a href="http://joshua.smcvt.edu/linearalgebra/">Linear Algebra</a> (Jim Hefferon)</li>
<li><a href="http://www.math.umass.edu/~lavine/Book/book.html">Statistics</a> (Michael Lavine)</li>
<li><a href="http://www.lulu.com/items/volume_63/2167000/2167025/2/print/book.pdf">A Field Guide to Genetic Programming</a> (Riccardo Poli, William B Langdon, Nicholas Freitag McPhee)</li>
<li><a href="http://page.mi.fu-berlin.de/rojas/neural/">Neural Networks</a> (Raul Rojas)</li>
<li><a href="http://www.computingbook.org/">Introduction to Computing</a> (David Evans)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.ethanjfast.com/2009/10/for-the-autodidact/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
