Log files were invented long before the JSON data exchange format existed.  In fact, they bring a sort of crisp 1970s feel whenever I look at them… text files streaming across a terminal screen.  It’s almost like the Matrix—anyone who sees text files streaming across a terminal instantly believes we might be hackers.  But I don’t think log files are that cool.

To be honest, I wonder why they are still in line-text format.  I realize it’s good to be able to process log files quickly (grep, sed, perl), but imagine the useful metadata that could be gleaned from log files—if only they allowed for metadata.  Is it time to start producing JSON log files?  It might be worth a look.  Here’s why I began moving in this direction.

I first started noticing a need for something more than a line of text when this sort of thing started showing up in my log files:

I, [2009-12-16T13:24:59.264678 #13623]  INFO — : [app:des.rb, line:26] Encode file: blah -> 

The area of focus here is the second set of square brackets, [app:des.rb, line:26].  I was essentially starting to “tag” my log output so that I could tell where the lines were coming from.  Next, I realized some of the log lines that were reporting the results of executing external commands could benefit from metadata.  I started tagging my log lines with things like “shell:true” with additional metadata such as “process_id:39212″ and “status_code:0″ (process return code for “success”).

Finally, I started to wonder, “Why am I not using JSON for this?”  I couldn’t think of a good reason except that JSON requires an (opening square brace) [ to begin and a (closing square brace) ] to end.  Why not add an open square bracket to the beginning of an empty log file, and let it hang open forever?  Here’s my new Ruby logging code:

log(”Encoding file”, :level => “info”) # info is the default and can be omitted
log(”Unable to encode”, :level => “error”, :reason => “missing command ‘des’”)

The resulting log file looks like this:

[{"level":"info","msg":"Encoding file","time":"2009-12-16 14:15:10 -0600","caller":["\/Users\/duanejohnson\/bin\/test.rb:10:in `<main>'"]},
{”level”:”error”,”reason”:”missing command ‘des’”,”msg”:”Unable to encode”,”time”:”2009-12-16 14:15:10 -0600″,”caller”:["\/Users\/duanejohnson\/bin\/test.rb:11:in `<main>'"]},

The hanging comma allows new JSON objects (hashes, essentially) to be appended to the end of the file.  A JSON reader then just has to replace the last comma with a closing square brace and it becomes a valid JSON file that can be imported to a database, or searched arbitrarily.  It’s still quite easy to use awk and grep, too, since each “object” ends in both a comma *and* a newline.

Finally, there is a simple way with the common unix tool ’sed’ to replace the final comma with a closing brace, thus making it fairly easy to deal with the hanging comma (thanks to some helpful people at stackoverflow for this suggestion):

sed ‘$ s/,$/]/g’

So, for example, you could fix the hanging comma with the following:

cat ruby.hanging.json | sed ‘$ s/,$/]/g’ >ruby.log.json

I like being able to store more information in my log files this way.  It seems to fit my need to zoom in on data subsets and find answers to meaningful questions.  Is there anyone else who feels the same way?

Posted via email from Duane’s Quick Posts