moving to blogspot

October 9, 2008

For the last 2 1/2 years I’ve been working on a series of prototypes that have morphed (thanks to lots of really smart people, the work they brought, and the work they’ve done since) into Evri. Evri’s main goal is to create a ‘data graph of the web’, where you can find the best media for specific people, places, and things, as well as navigate from one entity to the next via the relationship between the two.

Gee, that sounds like Semantic Web, and Semantic Web has not really shown itself to be Useful. Well at Evri we’ve been focused on the user experience, and while we’ve got a ways to go, we feel that providing the site as well as the tools to access the underlying data store is important.

One of these tools is the Evri content recommendation widget, which looks up all entities in your blog post and shows connections between the entities and recommends related media for those entities. Unfortunately the hosted version of wordpress is very restrictive when it comes to widgets, so I can’t put the Evri widget in my blog.

So, I’m moving my blog — to Waving Not Drowning, where I’ve embedded the widget. I’ll still use this blog, it represents a year and a lot of knowledge — of things that I forget, frequently. I’ll continue taking ‘notes to self’ in the new blog.

I guess this is why you should never confuse ambition with intelligence

September 26, 2008

I used to think that all people with outsize ambition were very intelligent, and used their intelligence to rise above the rest of us. Even when GWB ‘won’, and when he won for real 4 years later, I just thought it was Satan Dick Cheney being the puppet master. That was before I saw Sarah Palin in action. Go ahead. Watch this, it’s a powerful object lesson that while you can be a mayor and a governor and a mom — a juggling act if there ever has been one– and just because you are photogenically catty and can spin funny pitbull jokes when the teleprompter is rolling, those qualities don’t translate to the raw intelligence needed to be in the #2 slot.

In fact, the more she talks without the backup of a teleprompter, the worse I feel. As much as I hate to admit it, she’s got a decent shot of being one heartbeat away from the presidency. That said, I think, and fervently hope, that watching the VP debate is going to be like watching a slow motion train wreck. Hopefully Biden won’t let his predilection for talking way too much get in the way of putting Palin out to pasture.

If they win I’m outta here. Canada is looking pretty good right now.

Evri Beta goes Open!

September 24, 2008

A big day here at Evri, as we have taken the password protection off and opened ourselves up to the world. When we started looking at the problem of managing information on the web almost 3 years ago as a tiny research team of 2, our main goal was to make processing information easier for ourselves. We were stuck in an endless loop of keyword search -> sift through results -> alter keyword search -> forget what we were looking for in the first place.

Fast forward to now and Evri is an incredibly talented team who have gone far beyond the initial proof of concept prototypes and have delivered an intuitive and easy to use site that lets you find the content you want to find about the things you care about. Along the way I’ve been exposed to the real problems and solutions inherent in making a real product from a raw prototype, and I’ve got to say it’s been a great ride so far, and with the open beta we have just crossed the starting line…now it’s real!

Instead of writing about what we do, which is best summed up here, I encourage you to visit the site and poke around. If you have a blog, try installing the widget — note that my blog, which is hosted by wordpress, cannot run the widget, but this is a general wordpress problem, and there are known work-arounds that we are investigating. Stay Tuned!

RRD and averages

September 15, 2008

As noted in other posts, I’m writing a monitoring application. Because this is at best a part time effort that needs to be done quickly, I’ve made some technology choices that emphasize rapid development: the app is a Rails app, and I’m storing statistics with RRD.

I really like RRD. As I’ve mentioned before, it gets me out of the business of drawing graphs and storing data, both of which are hard problems I’d rather not solve. But I was having some issues with it when I tried to store values.

How I thought RRD worked

It seemed pretty simple. I thought I would create an RRD file (wow, lots of parameters, wonder what they mean?), then update it whenever I had a number. And the values would be graphed. And all would be good. But when I did all of the above, I noticed that the values being graphed were not the values I was storing. Hmmm. Time to figure out what some of those parameters mean.

How RRD actually works

Pretty well, actually, because it was designed by some smart people to store numbers that come in at any time and to average those numbers across an interval defined when the RRD file is created. Well, that’s one of the ways RRD works. It can also store counters, store the results of functions applied to raw values, and store the derivative of the line being graphed. I was using it in the simplest case, to store a value.

What I didn’t realize is that if my values were updated between interval boundaries (the interval length is known as the ‘step’), they would be averaged across that interval. If the gap between updates exceeded the specified ‘heartbeat’ value, RRD would store an ‘unknown’ value. A good explanation of how this works is found here (in an SNMP monitoring solution).

That is actually the way graphing in a loosely coupled environment _should_ work. The reason I was seeing strange numbers was that several of my insertions were falling within the same interval, so RRD averaged them. That may be rational, but it doesn’t jibe with the numbers I’m trying to (a) display and (b) alert on.
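To make the averaging concrete, here is a minimal sketch of a simplified model, not RRD’s actual implementation. It assumes each update’s value covers the time since the previous update (or since the start of the step) and that the stored value is the time-weighted mean. The step_average name and the sample numbers are made up for illustration.

```ruby
# Simplified model of how one step's stored value is built from several
# updates. Assumption: each update's value covers the time since the previous
# update (or since the start of the step), and the stored value is the
# time-weighted mean over the covered time. Expects at least one update.
def step_average(step_start, updates)
  previous = step_start
  weighted_sum = 0.0
  covered = 0
  updates.sort_by { |time, _value| time }.each do |time, value|
    duration = time - previous
    weighted_sum += value * duration
    covered += duration
    previous = time
  end
  weighted_sum / covered
end

# Two updates inside one 300 second step: 10 at t=100 and 40 at t=300.
step_average(0, [[100, 10], [300, 40]])  # => 30.0
```

Neither 10 nor 40 is what ends up stored, which matches the kind of surprise described above.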

How I got my app to work with RRD

The key for this app is that it is expecting an average value across a time interval. So in essence I have to make sure that only one data point is inserted per interval. I do this by munging the time of insertion in the RRD graph (I still keep the original insert time for purposes of reporting).

I insert the data point at the end of the interval, so if I have a 5 minute interval and I receive an update at 21:52:34, my actual insertion is at 21:55:00. The next value will be inserted at 22:00:00. If the interval was 1 minute, I would have inserted at 21:53:00, and the next value would be inserted at 21:54:00.
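The rounding itself is a few lines of arithmetic on epoch seconds. This is a sketch rather than the code from the app; align_to_interval_end is a hypothetical name, and it assumes interval boundaries are multiples of the step counted from the Unix epoch.

```ruby
# Rounds a timestamp up to the end of the interval it falls in.
# Assumption: boundaries are multiples of step_seconds since the Unix epoch.
# A timestamp that already sits on a boundary is returned unchanged.
def align_to_interval_end(time, step_seconds)
  epoch = time.to_i
  remainder = epoch % step_seconds
  aligned = remainder.zero? ? epoch : epoch + (step_seconds - remainder)
  Time.at(aligned).utc
end

received = Time.utc(2008, 9, 15, 21, 52, 34)
align_to_interval_end(received, 300)  # => 2008-09-15 21:55:00 UTC
align_to_interval_end(received, 60)   # => 2008-09-15 21:53:00 UTC
```

Calling to_i on the result gives the epoch seconds value to use as the insertion time, while the original received time can still be kept separately for reporting.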

More fun with RRD

I’m sure that more fun awaits. I have not hit the point where round robin averaging kicks in, and my ‘default’ values are based on my current (mis)understanding of RRD. I’ll update this post so I don’t repeat history.

ActiveResource as a Web Servicification Tool

September 11, 2008

That’s right, Web Servicification. Servicifying, just like the subtle art of Strategery, is an oft derided but subtly powerful part of my toolkit.

Or at least it is now. I have been working on a monitoring application so that we could figure out when things were going pear shaped, as opposed to finding out after the fact. This monitoring application was somewhat novel in that it successfully got me out of doing hard things of little quantifiable value and let me focus on doing easy things of much greater value (see above goal). Using RRD is a good example of outsourcing a hard problem (what to do with all that data?!?) to something that handles it for me. Not an especially hard leap to make, thanks to lots of SysAdmins who feel exactly the same way, but still, I’m really happy that I’m not collapsing data to keep my disk footprint somewhat finite.

This whole strategy of ‘doing more with less effort’ is really fun, and I’m searching for something non-geeky to try it in. If I get the same efficiency boost in my personal life I’ll have enough time to write a best seller, become a kickboxing champion, or both.

In the context of my (geeky) monitoring app, Web Servicification is something else that gets me out of a couple of fairly hard to do things. Wait, let me back up a bit. Web Servicification with ActiveResource gets me out of a couple of fairly hard to do things.

The first is writing a web service. Go ahead and sneer at the difficulties of writing an XML consuming service, but until you’ve rolled your own in Java, you just don’t have that Juan Valdez “I wrote this one bean at a time” feeling.

The second hard thing this gets me out of is writing monitors for a bunch of heterogeneous systems. We’ve got some things implemented in Java, some in Ruby, some in Perl, etc. I either slap a bunch of web interfaces on all of those systems, then write some centralized code to poll those interfaces, or I slap a web interface on my monitor app and let other people figure out which statistics are meaningful to them, and how often they should be updated.

A Web Service by Default

RESTful web service support comes more or less enabled by default in Rails 2.0, and it is exactly what ActiveResource expects to talk to. Every method in a default generated controller can be accessed either through the UI or through a REST action. Here is an example controller generated for one of my resources.

# GET /samples
# GET /samples.xml
def index
  # ...
  respond_to do |format|
    format.html  # index.html.erb
    format.xml  { render :xml => @monitor_instances }
  end
end

# GET /monitor_instances/foo
# GET /samples/1
# GET /samples/1.xml
def show
  # ...
end

# GET /samples/new
# GET /samples/new.xml
def new
  # ...
end

# GET /samples/1/edit
def edit
  # ...
end

# POST /samples
# POST /samples.xml
def create
  # ...
end

# PUT /samples/1
# PUT /samples/1.xml
def update
  # ...
end

# DELETE /samples/1
# DELETE /samples/1.xml
def destroy
  # ...
end

Notice that the standard REST verbs are implemented in the same methods that handle Rails application page requests. That’s pretty cool, and it means that you’ve got basic CRUD from the get go. The secret is in the respond_to block (the format.html and format.xml lines above), which returns either a page or XML content depending on the requested format. If the request path ends in .xml, it’s treated as a REST request; otherwise it’s treated as a standard page request.

Routing for both REST and page based requests is provided in routes.rb:

map.resources :monitor_instances

provides routing access to the default methods defined above.

Accessing the Default Web Service

ActiveResource::Base is the class that abstracts the wire format and provides basic CRUD access to the resource. To access the MonitorInstance objects defined above, I could do the following:

class MonitorInstance < ActiveResource::Base
  # define what you need in here (at minimum, self.site)
end


The ActiveResource based MonitorInstance acts similarly to an ActiveRecord based MonitorInstance:

monitor_instance = MonitorInstance.create(
  :name => monitor_name,
  :monitor_instance_id =>,
  :frequency_id =>,
  :status_id => @status_by_name['good'].id,
  :monitor_type_id => @monitor_type_by_name["stand_alone"].id)

creates a monitor with the parameters as specified above.

MonitorInstance.delete( OR


removes the monitor instance.

monitor_instance = MonitorInstance.find(1)

finds me the monitor instance with an ID of 1.

removes that Monitor. So far, so good.

Find (not by ID)

What if I want to find something by a secondary attribute, like name? The default rails app expects qualifying parameters to be passed in a params hash:

monitor = MonitorInstance.find(:first,:params=>{:name=>monitor_name})
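As far as I understand it, the :params hash simply ends up as a query string on the collection URL, and find(:first) takes the first element of whatever comes back. The sketch below only builds that URL with the standard library so the request is visible. The collection_url helper, the host, and the monitor name are all made up for illustration.

```ruby
require "uri"

# Hypothetical helper: shows the collection URL that a find with :params
# would be expected to request. It does not talk to any server.
def collection_url(site, collection, params)
  query = URI.encode_www_form(params)
  url = "#{site}/#{collection}.xml"
  query.empty? ? url : "#{url}?#{query}"
end

collection_url("http://localhost:3000", "monitor_instances", :name => "disk_free")
# => "http://localhost:3000/monitor_instances.xml?name=disk_free"
```

One consequence of this design is that the index action on the Rails side has to honor those parameters. If it ignores them, the first element returned is just the first monitor in the collection.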

My MonitorInstances can be nested under other MonitorInstances. In the Rails app model, each MonitorInstance model specifies that it belongs_to :monitor_instance.

This doesn’t quite have a counterpart in the ActiveResource world. ActiveResource is concerned with abstracting basic access to web based resources, and associations are not available via that abstraction layer. When I want to find a nested MonitorInstance, I do the following:

def get_monitor(monitor_name, parent_name = nil)
  if(parent_name != nil)
    # look up the parent first so its ID can scope the search
    parent = get_monitor(parent_name)
    @logger.debug("finding first instance of monitor #{monitor_name} under #{parent_name}")
    monitor = MonitorInstance.find(:first, :params => {:name => monitor_name, :monitor_instance_id => parent.id})
  else
    @logger.debug("finding first instance of monitor #{monitor_name}")
    monitor = MonitorInstance.find(:first, :params => {:name => monitor_name})
  end
end


So I need to first get the parent resource, then make a request with the parent ID in the params hash (the :monitor_instance_id entry above).

Updating — Avoid Non Writeable Parameters!

It was hard to find any updating doc that didn’t just say “to update, just invoke the ActiveResource-derived object’s save method”. That sounds great in theory, but it didn’t work, because the default implementation of save sends all attributes, even those that are considered immutable, to the web service endpoint. For instance, my MonitorInstance class has an id field that is immutable. That field is sent along with all the other (mutable) fields. There is a method in ActiveResource to remove all immutable/protected attributes, but that method calls an undefined logger object to notify you that you are trying to modify an immutable attribute, and an exception is raised.

To get around this, I stripped the immutable attribute (the id) out of the incoming params hash in the controller’s update method (in the Rails app), as the if block below shows:

class StatisticsController

  # PUT /statistics/1
  # PUT /statistics/1.xml
  def update
    @statistic = Statistic.find(params[:id])

    # id is immutable, so drop it before the mass update
    if(params[:statistic][:id] != nil)
      logger.debug("removing ID from input params!")
      params[:statistic].delete(:id)
    end

    respond_to do |format|
      if @statistic.update_attributes(params[:statistic])
        # ...
      end
    end
  end
end
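The same idea can be sketched without Rails, using a plain hash in place of the params object. This is an illustration under assumptions, not the controller code: writable_attributes and IMMUTABLE_KEYS are hypothetical names, and the only immutable key assumed here is the id.

```ruby
# Assumption for this sketch: id is the only attribute the client must not set.
IMMUTABLE_KEYS = [:id].freeze

# Returns a copy of the incoming attributes without the immutable keys.
# Accepts both symbol and string keys, since request params often use strings.
def writable_attributes(attributes, immutable_keys = IMMUTABLE_KEYS)
  attributes.reject { |key, _value| immutable_keys.include?(key.to_sym) }
end

writable_attributes({ :id => 1, :value => 42 })      # => {:value=>42}
writable_attributes({ "id" => "1", "value" => "42" }) # => {"value"=>"42"}
```

Returning a filtered copy instead of deleting in place keeps the original params around for logging, at the cost of one extra hash.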





Nested Resources

The StatisticsController above handles all posts to Statistics resources, which are 1..N measurements associated with a monitor. In order to enforce that kind of scoping in the request path, I need to update the monitor_instances routes to scope the statistics routes:

#map.resources :statistics

map.resources :monitor_instances, :has_many => [:statistics]

In the statistics controller, I now need to always be aware of the ‘owner’ MonitorInstance. I do this by adding a before_filter, a method that gets invoked prior to every method being called:

before_filter :find_monitor_instance

This before_filter corresponds to the find_monitor_instance method, which returns the appropriate MonitorInstance:


def find_monitor_instance
  @monitor_instance = MonitorInstance.find(params[:monitor_instance_id])
end

Now I have an attribute that I can refer to in my controller. Note that in all of the controller methods that handle both REST and page requests, I need to scope my model requests/updates with the @monitor_instance variable:

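def index
  # only return statistics that belong to the owning monitor instance
  @statistics = Statistic.find(:all, :conditions => {:monitor_instance_id => @monitor_instance.id})

  respond_to do |format|
    format.html # index.html.erb
    format.xml  { render :xml => @statistics }
  end
end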

I can also take advantage of Rails path freebies. For instance, after a create request for a statistic, I redirect to the appropriate monitor_instance scoped path like this:

flash[:notice] = 'Statistic was successfully created.'
format.html { redirect_to(monitor_instance_statistic_path(@monitor_instance,@statistic)) }
format.xml  { render :xml => @statistic.to_xml, :status => :created, :location => monitor_instance_statistic_path(@monitor_instance,@statistic) }

monitor_instance_statistic_path generates a path that looks like {path to server}/monitor_instances/1/statistics/3.html or .xml depending on the requested output format.

Some Helpful Links:

ActiveResource RDoc

REST + ActiveResource

Comments from this Railscast

Mac launchd and launchctl — the OSX alternative to cron

August 28, 2008

I was revisiting my metrics project, having used the first one as the prototype to refine requirements (nothing works better at getting real requirements out of people than showing them something that doesn’t quite do what they want).

When it came time to test a monitor, I tried to get one running under cron and it didn’t actually work for me. I can’t remember if cron has ever worked for me on a Mac, but I didn’t have the time to figure out why and how. It was time to make the jump to launchd.

Launchd is billed as an init.d, /etc/rc, xinetd, .profile, and crontab replacement, i.e. it can launch scripts at system startup, user login, or on a specified interval.

My use case was to do something cron-like. This was not entirely straightforward, because there is a difference between StartCalendarInterval (to run things at a specified date and time, or every minute if no value is specified) and StartInterval (to run things at a specified interval, similar to specifying */5 for every 5 minutes in cron).

Programs are loaded into launchd with launchctl. They are specified as plist files, with a pretty simple XML format of key/value pairs and/or keys with dictionaries of values. Here is my .plist file for running something every 5 minutes:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "">
<plist version="1.0">

Note that in key value parlance, StartInterval takes an integer which specifies the # of seconds. If I wanted to run something every day at a specified time, I would use StartCalendarInterval, which takes a dictionary element that contains time intervals.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "">
<plist version="1.0">

Note the difference between StartCalendarInterval syntax and StartInterval syntax: StartCalendarInterval takes a dict structure that contains key/value pairs. In other words, it takes a hash. You can also use arrays, as in the value for the ProgramArguments key. Just make sure your keys have the correct kind of values, as specified here.
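To show the two shapes side by side, here is a small Ruby sketch that renders a job definition either way. It is not the plist from this post. The launchd_plist helper, the com.example.monitor label, and the script path are all hypothetical, and the helper does no XML escaping, so it assumes plain values.

```ruby
# Hypothetical helper: renders a launchd job definition as plist XML.
# `schedule` is an Integer (seconds, emitted as StartInterval) or a Hash such
# as { "Hour" => 3, "Minute" => 15 } (emitted as a StartCalendarInterval dict).
# Assumption: values contain no characters that need XML escaping.
def launchd_plist(label, program_arguments, schedule)
  arguments = program_arguments.map { |arg| "    <string>#{arg}</string>" }.join("\n")
  schedule_xml =
    if schedule.is_a?(Hash)
      pairs = schedule.map { |key, value| "    <key>#{key}</key>\n    <integer>#{value}</integer>" }.join("\n")
      "  <key>StartCalendarInterval</key>\n  <dict>\n#{pairs}\n  </dict>"
    else
      "  <key>StartInterval</key>\n  <integer>#{schedule}</integer>"
    end

  <<~PLIST
    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
    <plist version="1.0">
    <dict>
      <key>Label</key>
      <string>#{label}</string>
      <key>ProgramArguments</key>
      <array>
    #{arguments}
      </array>
    #{schedule_xml}
    </dict>
    </plist>
  PLIST
end

command = ["/usr/bin/ruby", "/path/to/monitor.rb"]
every_five_minutes = launchd_plist("com.example.monitor", command, 300)
daily_at_3_15 = launchd_plist("com.example.monitor", command, { "Hour" => 3, "Minute" => 15 })
```

The first call produces an integer under StartInterval, and the second produces a dict under StartCalendarInterval, which is the difference described above. The output would still need to be saved to a .plist file and loaded with launchctl.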

More Rails-tarded ness: named resources

August 21, 2008

I was showing my monitoring app to a co-worker, who wanted to access some of the resources by URLs that contained their names. Hey, that actually makes sense! He wants to refer to resources by their actual names — brilliant. Unfortunately for my lazy ass, this is a departure from the standard rails resource routing conventions, where

map.resources :{controller name}

automagically generates routing like this:

/controller name/:id

I wanted to have both approaches, mainly because I’m lazy and don’t want to rework my code that navigates back to these resources by ID. My first attempt at doing this was to put a custom named resource in front of my default map.resources statement:

map.named_monitor_instances 'monitor_instances/:name', :controller=>'monitor_instances', :action=>'show_named_monitors'

This resulted in me getting a ‘missing template for show_named_monitors’ message, which was fine. I didn’t want to render the same view in another erb file.

The best solution I’ve found for having it both ways is by realizing that the default route :id parameter is just a parameter, and can contain a name as well as a number. Other named routes can be quite specific about what they contain, but the default route is pretty forgiving. I modified the controller code to look like this:

begin
  @monitor_instance = MonitorInstance.find(params[:id])
rescue ActiveRecord::RecordNotFound
  # not a known ID, so treat the parameter as a name
  @monitor_instance = MonitorInstance.find_by_name(params[:id])
end

to catch the instance where the find_by_id('foo') fails and try to find foo by name. Graceful? No. Elegant? Not really. I’m sure this level of rails-tardedness will get me flamed by Rails Zealots who think I’ve gone and dicked up a perfectly elegant solution. But is it easy? Hell to the Yeah it is.
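Stripped of Rails, the lookup order is: try the parameter as an ID, then fall back to treating it as a name. Here is a minimal sketch with an in-memory stand-in for the model. Monitor, MonitorDirectory, and the sample data are made up for illustration and are not part of the app.

```ruby
Monitor = Struct.new(:id, :name)

# In-memory stand-in for the model, used only to show the lookup order.
class MonitorDirectory
  def initialize(monitors)
    @monitors = monitors
  end

  # Tries the key as a numeric ID first, then falls back to a name lookup.
  # Returns nil when neither matches.
  def find_by_id_or_name(key)
    text = key.to_s
    by_id = @monitors.find { |monitor| monitor.id.to_s == text } if text =~ /\A\d+\z/
    by_id || @monitors.find { |monitor| monitor.name == text }
  end
end

directory = MonitorDirectory.new([Monitor.new(1, "disk_free"), Monitor.new(2, "queue_depth")])
directory.find_by_id_or_name("2").name       # => "queue_depth"
directory.find_by_id_or_name("disk_free").id # => 1
```

The obvious catch with having it both ways is a monitor whose name is itself a number: the ID lookup wins, so such a name only resolves when no monitor has that ID.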