InquiryLabs

Politics, Programming and Possibilities

Archive for the ‘Software Engineering’ Category

Working With File Names

There is likely a way to do what I’m about to blog about, but I thought this might be useful in case there isn’t. Basically, I want a File object, but without the necessity of it actually being a file on the filesystem. For example, I’d kind of like to call "/tmp/myfile.txt".dirname and get "/tmp" back.

Ruby’s File class does not allow the following, unless /tmp/myfile.txt actually exists:

f = File.new("/tmp/myfile.txt")

And what’s worse (imo), you can’t do this once you have the file handle:

f.dirname

But it does allow this:

File.dirname("/tmp/myfile.txt")

When you have a lot of this kind of thing going on (getting dirname, basename, extname, etc.) it gets tiresome to type “File” all over the place. What we really need is a Filename object. And it should proxy methods to the File.* class methods. Here it is:

##
# Simple class that makes File.* class methods available on a
# Filename object
#
# doctest: Can call File's class methods on a Filename object
# >> Filename.new("/tmp/myfile.png").dirname
# => "/tmp"
#
# doctest: Can create a Filename from another Filename object
# >> path = "/tmp/myfile.png"
# >> Filename.new(Filename.new(path)).to_s
# => path
#
# doctest: Filename can create a filepath from segments
# >> Filename.new("/tmp", "inner", "other.txt").to_s
# => "/tmp/inner/other.txt"
#
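class Filename
  def initialize(*segments)
    # Accept strings or other Filename objects; to_s normalizes both
    @filepath = File.join(*segments.map(&:to_s))
  end

  def to_s
    @filepath
  end

  # Forward unknown methods to File's class methods, passing the path
  # as the first argument (e.g. filename.dirname => File.dirname(path))
  def method_missing(method, *args, &block)
    return super unless File.respond_to?(method)
    File.send(method, @filepath, *args, &block)
  end

  # Keep respond_to? consistent with what method_missing handles
  def respond_to_missing?(method, include_private = false)
    File.respond_to?(method) || super
  end
end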

GetDoc: A Simple CMS for eRuby

Following up on my earlier post about Rails deployment (it can still be a pain), I’ve made my first little Ruby CMS using eRuby (i.e. Ruby without Rails). The idea is simple: Google has a wonderful word processor that clients can use to edit web pages. Why not harness that ability for simple web pages that need just a little bit of content editing here and there?

The GetDoc class uses hpricot to parse and cache a public Google document, and then shows the result wherever it is needed in the web page. The source code is available at GitHub. Here is a sample .rhtml file from the project:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
   "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<%
  require File.join(File.dirname(__FILE__), "ruby/get_doc")
  # Default document is a Google word processing document
  google_doc = GetDoc.new("dcjnq5tv_4gw4qk2")

  # Getting a document from somewhere other than google
  # docs is possible, but it is a little more complicated,
  # because we need to specify the following:
  # 1. The URL of the post without the domain
  # 2. The domain of the blog, and the location in the
  #    string to insert the post (%s)
  # 3. A transformation proc which can extract the content
  #    from the blog using hpricot
  blog_post = GetDoc.new(
    "2008/07/15/lucky-to-be-a-programmer",
    "http://blog.inquirylabs.com/%s") do |hpricot|
      (hpricot / ".PostHead h1:first").to_html +
      (hpricot / ".PostContent:first").inner_html
  end

  # Uncomment the following line to have the cache cleared
  # for EVERY page load:
  # GetDoc.reset_cache
%>
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <title>Sample GetDoc Page</title>
    <style type="text/css" media="screen">
      #left {
        width: 50%;
        float: left;
      }
      #right {
        width: 50%;
        float: right;
      }
    </style>
  </head>
  <body>
    <div id="container">
      <div id="top">
        <div id="header">
          <div id="title">Sample GetDoc Page</div>
        </div>
        <div id="content">
          <div id="left">
            <h1>Google Doc</h1>
            <%= google_doc %>
          </div>
          <div id="right">
            <div class="sideitem">
              <h1>Blog Post</h1>
              <%= blog_post %>
            </div>
          </div>
        </div>
      </div>
    </div>
  </body>
</html>

Note that it is also possible (as demonstrated above) to customize GetDoc for blogs or other sources of HTML data. What a wonderful web!

Lazy Evaluation at Work

One of Haskell’s touted features is “lazy evaluation”. I’m starting to see the bigger picture with this language feature, and the following example really helped me. From Real World Haskell:

[The method] hGetContents is different [from the traditional open/read/close method of accessing files]. The String it returns is evaluated lazily. At the moment you call hGetContents, nothing is actually read. Data is only read from the Handle as the elements (characters) of the list are processed. As elements of the String are no longer used, Haskell’s garbage collector automatically frees that memory. All of this happens completely transparently to you. And since you have what looks like—and, really, is—a pure String, you can pass it to pure (non-IO) code.

Lazy evaluation is the ability to describe a model for all possible scenarios, and then compute just the scenario you need, on demand. In this case, “all possible scenarios” is any sequence of bytes, read from disk. As mentioned in the book, lazy evaluation of a file’s contents means you can read gigabytes of file contents without having to worry about buffering, chunking, looping, or garbage collection. Sweet!
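To make that concrete, here is a minimal sketch (mine, not from the book) that counts the lines arriving on standard input. Reading from stdin rather than a named file is an assumption made so the example runs without any particular file being present; hGetContents behaves the same way on any Handle.

```haskell
import System.IO (hGetContents, stdin)

main :: IO ()
main = do
  -- Nothing is read at this point; contents is a lazy String
  contents <- hGetContents stdin
  -- Input is pulled in only as `lines` and `length` demand it
  print (length (lines contents))
```

Run it with something like `runghc CountLines.hs < some-large-file` (both names are placeholders). Because nothing else holds on to contents, the already-counted part of the input can be garbage collected as the count proceeds, so memory use should stay roughly flat even for large inputs.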

A Glimmer of Monadic Hope

Melodramatic? Maybe, but I began to understand what a Monad might be in the Haskell language tonight. I’ve been banging my head against the wall trying to understand them. Finally, I decided to read the Haskell 98 Report (something I would never have done until I met a certain Russian friend who actually reads the manuals and specifications out there… it turns out that’s a pretty good idea.)

So here’s what the Report has to say about Monads:

A do expression provides a more conventional syntax for monadic programming. It allows an expression such as

  putStr "x: "    >>
  getLine         >>= \l ->
  return (words l)

to be written in a more traditional way as:

  do putStr "x: "
     l <- getLine
     return (words l)

Ahah! The do syntax creates a sequence of anonymous closures! Not only that, but each closure gets a single value (optionally) bound to a variable, so that, for example, in the case of the IO Monad, it appears we are stepping through time as a sequence of events.
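Here is a small sketch of that idea with every closure written out. The names promptDo and promptBind are made up for illustration; the only fact relied on is that `m >> k` means the same as `m >>= \_ -> k`.

```haskell
-- The do block from the Report
promptDo :: IO [String]
promptDo = do
  putStr "x: "
  l <- getLine
  return (words l)

-- The same action with each step as an explicit lambda.
-- The first closure ignores its argument (the () from putStr);
-- the second binds the line that was read to l.
promptBind :: IO [String]
promptBind =
  putStr "x: " >>= \_ ->
  getLine      >>= \l ->
  return (words l)

main :: IO ()
main = promptBind >>= print
```

Swapping promptBind for promptDo in main should give the same behaviour, since the do form is just notation for the chain of binds.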

Now I just need to understand why these Monad things are so valuable that we need to go through this convolution in the first place…

Using “foldr” in Haskell

I’ve been reading Real World Haskell lately in order to get a better grasp on Haskell and functional programming. It’s a book I’d highly recommend—especially when it’s done!

One of the fundamental building-blocks of Haskell is the foldr function which, I am told, is called a “primitive recursive” function because it can be used to build a whole set of useful recursive functions (e.g. foldl, map, filter, etc.). This is really important to me, because I had always liked foldl before, which seemed more intuitive (i.e. it iterates through a list from left to right, which is my natural way of thinking of lists). Anyway, it turns out that foldr can be used to build foldl, while foldl cannot fully replace foldr (it never returns on an infinite list, for example), which suggests foldr is the more “primitive” of the two.
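As a rough sketch of that claim, here are map, filter and foldl written with nothing but foldr. The my-prefixed names are chosen only to avoid clashing with the Prelude, and this is one possible encoding, not the library's own definitions.

```haskell
myMap :: (a -> b) -> [a] -> [b]
myMap f = foldr (\x acc -> f x : acc) []

myFilter :: (a -> Bool) -> [a] -> [a]
myFilter p = foldr (\x acc -> if p x then x : acc else acc) []

-- foldr builds a chain of functions, one per element; feeding z in
-- at the end makes the accumulator travel from left to right
myFoldl :: (b -> a -> b) -> b -> [a] -> b
myFoldl f z xs = foldr (\x k acc -> k (f acc x)) id xs z

main :: IO ()
main = do
  print (myMap (* 2) [1, 2, 3])     -- [2,4,6]
  print (myFilter even [1 .. 6])    -- [2,4,6]
  print (myFoldl (-) 10 [1, 2, 3])  -- 4, i.e. ((10 - 1) - 2) - 3
```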

One of the exercises in the Real World Haskell book is to create the concat function which takes a list of lists and turns them into one list. I spent about an hour trying various recursive tricks last night, and realized that this is a double-recursive problem—you need an append function available before you can recursively join a list of indefinite length.

What really delighted me in the end was that both the append and the concat functions can be very easily defined in terms of “foldr”:

myAppend a b = foldr (:) b a
myConcat = foldr myAppend []

In the first definition, myAppend uses foldr to replace the empty list that terminates a with “b” (the list to be appended), consing each element of a onto the front of that list.

In the second definition, myConcat uses foldr to start with an empty list and then, working from the right, prepend each inner list onto the result accumulated so far (i.e. the parameter passed to myConcat is expected to be a list of lists).
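Tracing both definitions by hand on small inputs may help. The type signatures below are added for clarity, and the evaluation steps in the comments are a simplified picture of what foldr does.

```haskell
myAppend :: [a] -> [a] -> [a]
myAppend a b = foldr (:) b a

myConcat :: [[a]] -> [a]
myConcat = foldr myAppend []

-- myAppend [1,2] [3]
--   = foldr (:) [3] [1,2]
--   = 1 : (2 : [3])
--   = [1,2,3]
--
-- myConcat [[1],[2,3]]
--   = myAppend [1] (myAppend [2,3] [])
--   = myAppend [1] [2,3]
--   = [1,2,3]

main :: IO ()
main = do
  print (myAppend [1, 2] [3])       -- [1,2,3]
  print (myConcat [[1], [2, 3]])    -- [1,2,3]
```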

Since myAppend is just the concatenation of two lists, it is exactly the same as the (++) operator in Haskell. So, the concat function can concisely be defined as:

myConcat = foldr (++) []

This code makes use of currying, which means I can define a function by partially applying another one (in this case, I’ve only passed two arguments to foldr when three are normally required). The resulting function still expects one unnamed argument (i.e. the list of lists), which gets passed to foldr as its third argument, thus completing the call.
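Writing the same function both ways shows what the missing parameter is. The names concatFull and concatPartial are made up for this comparison.

```haskell
-- All three of foldr's arguments written out
concatFull :: [[a]] -> [a]
concatFull xss = foldr (++) [] xss

-- Only two arguments supplied; the result is a function that is
-- still waiting for the list of lists
concatPartial :: [[a]] -> [a]
concatPartial = foldr (++) []

main :: IO ()
main = do
  print (concatFull [[1, 2], [3], []])     -- [1,2,3]
  print (concatPartial [[1, 2], [3], []])  -- [1,2,3]
```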

Update: I just found an even more concise way to define concat (assuming ++ is not available), thanks to the help on #haskell:

myConcat = foldr (flip (foldr (:))) []
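That one-liner is easier to read once you notice that `flip (foldr (:))` is myAppend with its definition inlined. The sketch below restates myAppend from earlier and checks that both versions agree on a sample input (the sample list is arbitrary).

```haskell
myAppend :: [a] -> [a] -> [a]
myAppend a b = foldr (:) b a

-- flip swaps the first two arguments, so
--   flip (foldr (:)) a b = foldr (:) b a = myAppend a b
myConcat :: [[a]] -> [a]
myConcat = foldr (flip (foldr (:))) []

main :: IO ()
main = do
  let xss = ["ab", "", "cde"]
  print (myConcat xss)                           -- "abcde"
  print (myConcat xss == foldr myAppend [] xss)  -- True
```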

That’s one reason Haskell is “precision engineering for programmers” :)

Cool Your MacBook Pro

I work at least 8 hours a day on my MacBook Pro. Around noon I start to notice that my left hand is uncomfortably hot and I’m tempted to either shut things down (and waste time eating, for example) or go on for little bits at a time with my hand moving on and off of the keyboard.

Today, I found a better solution. It turns out that I’m not the only one with this problem, and in fact, a kind soul in Germany has developed smcFanControl2 which lets you turn up one or both of the fans in the MacBook Pro so that you can pre-emptively get rid of that heat. I was so pleased that I immediately donated 4 euros. Thanks, eidac!

Lucky to be a Programmer

I just read Gustavo Duarte’s essay, “Lucky to be a Programmer” and had to re-post some of it here. Rarely do I call a blog post an essay, but this one is so personal and so fun that I wanted to elevate it a bit:

Few things are better than spending time in a creative haze, consumed by ideas, watching your work come to life, going to bed eager to wake up quickly and go try things out. I am not suggesting that excessive hours are needed or even advisable; a sane schedule is a must except for occasional binges. The point is that programming is an intense creative pleasure, a perfect mixture of puzzles, writing, and craftsmanship.

My brother Chris and I have often felt this “intense creative pleasure” and I think that’s why we are now designers / programmers. It’s such a fun job! Duarte goes on:

This analytical side is what most people associate with programming. It does make it interesting, like a complex strategy game. But in most software the primary challenge is communication: with fellow programmers via code and with users via interfaces. By and large, writing code is more essay than puzzle. It is shaping your ideas and schemes into a coherent body; it is seeking clarity, simplicity and conciseness. Both code and interfaces abound with the simple joy of creation.

All I can think of to say is, “Thanks for finding words to describe how I feel!”

Haskell is Popular on IRC

I hang out on irc.freenode.net like many other programmers in the open source community. Lately, I’ve noticed that the #haskell channel has exceeded #rubyonrails in participation and “attendance”:

[Image: Picture 3.png]

With all of the talk of parallelizing languages and multi-core processors lately, I think this can only be a good thing.

One of the things that gives me a lot of confidence in Haskell is the enormous academic brainshare invested in the language. My gut tells me we will be seeing a lot of innovation and stable, well-engineered programs coming from this area (i.e. more of the same, but at higher volume). It kind of reminds me of the Ruby community a few years back—there is a big emphasis on experimenting, playing, implementing new ideas, etc.

In other words, Haskell has matured to the point where I think it may have finally succeeded.

Intel CEO Andy Grove published an article at The American called, “Our Electric Future” in which he suggests energy independence is not only the wrong goal, but actually impossible to achieve in an increasingly global economy.

He suggests, rather, that our goal should be “energy resilience”—meaning we should be focused on shifting to electricity as our primary “energy source” since it is possible to create electricity from many different sources (e.g. wind, solar, diesel, oceans, etc.)

From a software engineering perspective, this makes a lot of sense—it’s like building a properly decoupled system where each layer can act on its own. For example, software engineers know that the “model, view, controller” paradigm is a useful one because it is important to be able to represent data in multiple ways—as charts, or as graphs, for instance, as well as in a spreadsheet or on printed paper. In Grove’s terms, this is “data resilience”, meaning that the data can be used to render multiple views without restructuring or re-writing the models.

Sounds good, Mr. Grove! Let’s work toward energy resilience (get your car converted to electric? :) ).

eRuby @ A Small Orange

Even though I like development in Rails, I’ve been frustrated at times with the whole Rails deployment rigamarole. The joy of Rails for me, however, comes primarily from its underlying language, Ruby, so I’ve been thinking about ways around Rails.

There are a number of up-and-coming solutions to the problem of deploying Rails, but one of the simplest has been around the longest: eRuby. Basically, this is a way to use Ruby like you would PHP… embedded in HTML files using <%= and %>. Another way to think of it is Rails without controllers.

I recently wrote an eRuby Howto at A Small Orange hosting. The “howto” describes how I set up my shared account.

If you’re interested in an inexpensive hosting solution for Ruby, or if you just want a reliable and friendly hosting solution, may I recommend A Small Orange? I have had nothing but good experiences with their responsive support department, and I am quite happy with their shared hosting service. I also get a small referral credit if you sign up via one of the links in this blog entry.

I may be writing a little more about simple eRuby solutions to problems in the future.