Evri Goes Beta!

June 26, 2008

Whew! It’s been an exciting couple of weeks as we’ve gone through our first launch, but it’s official: Evri has a real beta release, and we’re telling the world.

It’s been two and a half years since I started down this road, building our very first, very crude prototype in a couple of weeks, and walking into the first demo knowing that we were onto something big, even if we didn’t know what that really meant. I consider myself very lucky to have had the opportunity to dive headfirst into such an engrossing, challenging space over the last couple of years. It’s been a great ride with a great team!

Neil’s blog post says it much better than I could ever rehash it (but I’ll try anyway): we are building a data graph of the web, a set of people, places, things, concepts, connected by specific relationships harvested from relevant content. The company motto: “search less. understand more.” is our way of saying that we have something truly disruptive: this is not search gone to eleven, or Wikipedia on steroids, but something much more powerful that gets around the current, completely disjoint user experience of “search, then browse, then search some more, then forget what you were originally searching for”.

Check it out! Sign up, and, most importantly, tell us what you think, so we can make it (even) better.

The Cron of Tab

June 19, 2008

Damn, setting up crons was supposed to be a walk in the park, so much so that I didn’t even budget mental energy for it! Maybe that was the problem…

Anyways, lessons learned from setting up simple cron jobs.

  1. How-Tos are good, but man is better.
  2. crontab -e will either load your current crontab, or load a blank template for you.
  3. crontab -r will kill your current crontab. Which is why it’s nice to keep the output of crontab -l, which lists your crontab jobs, in a backup file. Because unless you like setting up crons, you don’t want to be left high and dry without a backup.
  4. Try running your jobs before putting them in the crontab. Or, if you’re like me, do crontab -l, cut and paste, and figure out why /usr/local/binruby is not the command you want (/usr/local/bin/ruby was what I was looking for).
  5. To make sure your jobs are running, tail -f /var/log/syslog. Note that this only tells you that they ran, not whether they’re crap.
  6. Or, append your output to a log, and check to see that the log is growing.
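Lesson 4 is the kind of thing a few lines of Ruby can catch before cron does. This is only a rough sketch: it assumes a plain five-field crontab entry (no environment assignments, no @reboot style shortcuts), and the helper names and the /path/to/poll.rb script are made up for illustration.

```ruby
# Split a standard five-field crontab entry into its schedule and command.
# Returns nil for blank lines, comments, and entries with too few fields.
def parse_cron_line(line)
  stripped = line.strip
  return nil if stripped.empty? || stripped.start_with?("#")
  fields = stripped.split(/\s+/, 6)
  return nil if fields.length < 6
  { :schedule => fields[0, 5].join(" "), :command => fields[5] }
end

# The program a command line would execute (its first word).
def cron_executable(command)
  command.split(/\s+/).first
end

# True when the program named in the command exists and is executable.
def cron_executable_ok?(command)
  File.executable?(cron_executable(command))
end

# Hypothetical entry containing the typo from lesson 4.
entry = parse_cron_line("* * * * * /usr/local/binruby /path/to/poll.rb >> /tmp/poll.log 2>&1")
puts "#{cron_executable(entry[:command])} ok? #{cron_executable_ok?(entry[:command])}"
```

Feeding it each line of crontab -l output would flag a mangled path like /usr/local/binruby before the job silently fails.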

I’m still kind of cruxing about how I’m going to run this on a deployed app. I think I’m going to have to get ops to add deploy_user, and add the crons on deploy_user’s behalf. Of course, I’ll worry about that when I can actually smoothly install mod_rails on my semi jacked up box. I must be the only person in the universe who screwed up a mod_rails install, but more on that tomorrow.

Tomorrow (er, 1 week later)

In the mad rush to launch (more later), I forgot to update this page, which is bad because this is my scratchpad that has more than once saved me from repeating a painful process. Summary: anyone installing mod_rails should take 4 minutes and view the railscast. I was missing the following in my conf file:

LoadModule passenger_module /usr/local/lib/ruby/gems/1.8/gems/passenger-1.0.5/ext/apache2/mod_passenger.so
RailsSpawnServer /usr/local/lib/ruby/gems/1.8/gems/passenger-1.0.5/bin/passenger-spawn-server
RailsRuby /usr/local/bin/ruby

Metrics Part IV: RRDTool on Ubuntu 7.04 (Feisty)

June 19, 2008

The production instance of this metrics server is going to run on Ubuntu Feisty, which comes installed with rrdtool 1.2, but the ruby bindings I want (need) to use bind to 1.3. So this is how to install rrdtool on Feisty.

(1) download the source:

curl http://oss.oetiker.ch/rrdtool/pub/rrdtool.tar.gz > rrdtool.tar.gz

then follow the instructions on the rrdbuild page. Basically, run configure, then:


sudo make

sudo make install

However, in order to get configure to complete, I needed to install a couple of dependencies, pango and xml-2. I like configure: it’s very good about telling you what is missing and where to get it. And the rrdbuild page is also great at specifying exactly how to install the missing packages.

I figured I was up and running at that point. I built the ruby bindings from {src dir}/bindings/ruby, but when I tried to run the files I had been running on my Mac, I got:

/usr/local/lib/ruby/site_ruby/1.8/x86_64-linux/RRD.so: librrd.so.4: cannot open shared object file: No such file or directory - /usr/local/lib/ruby/site_ruby/1.8/x86_64-linux/RRD.so (LoadError)


How can a file that exists, /usr/local/lib/ruby/site_ruby/1.8/x86_64-linux/RRD.so, not be found? Was it a permissions thing? I tried building as sudo, same thing. Then I made sure that the file actually existed, just to check my head. Yes, it’s there. Yes, I’m getting the same error. Wait. Could it be the librrd.so ? I try a

ldconfig -v | grep rrd

and only get librrd.so.2. Hmmm. OK, desperate times…I edit /etc/ld.so.conf.d/local-installs.conf, and add


to the path. That worked. The ops team is not going to be super thrilled about the amount of jackassery it took to get this up and running, which is why I’m documenting it here. Because they’ll make me maintain it. Just like I would if I were in their shoes 🙂
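For next time, here is a small sketch of the check I was doing by eye. It parses the output of ldconfig -p (the cache listing, which is easier to parse than -v) and reports which library names the loader knows about. The helper name is made up, and the sample text is illustrative rather than captured from the real box.

```ruby
# Extract the shared-library names matching a pattern from `ldconfig -p` output.
# Each cache entry looks like: "\tlibfoo.so.1 (libc6,x86-64) => /usr/lib/libfoo.so.1"
def cached_libs(ldconfig_output, pattern)
  ldconfig_output.lines.map { |line| line.strip.split(/\s+/).first }.compact.grep(pattern)
end

# Illustrative input, not captured from a real machine.
sample = <<EOS
\tlibrrd.so.2 (libc6,x86-64) => /usr/lib/librrd.so.2
\tlibxml2.so.2 (libc6,x86-64) => /usr/lib/libxml2.so.2
EOS

found = cached_libs(sample, /\Alibrrd\.so/)
puts found.inspect
puts "librrd.so.4 visible to the loader? #{found.include?('librrd.so.4')}"
```

On a real machine you would pass in the actual output, for example cached_libs(`ldconfig -p`, /\Alibrrd\.so/), and rerun it after editing the ld.so.conf.d file and refreshing the cache.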

Metrics Fast ‘n Easy, part III: accessing Rails goodies outside of a Rails app

June 18, 2008

The basic architecture of this (very simple) metrics gathering and display application is:

  1. Scripts run as crons from within the rails directory. Every minute, the cron wakes up and checks the database for the last time run and the poll interval. If the poll interval has elapsed since the last run, the scripts do their work:
  2. They update RRD files and generate .pngs that reside in the rails /public dir.
  3. They update the latest value, and the last time run.
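The "is it time yet?" check in step 1 boils down to a comparison of timestamps. A minimal sketch, assuming the last run time and the interval (in seconds) have already been read from the database; the method names and the [name, last_run, interval] row shape are hypothetical, not the app's actual schema.

```ruby
# True when at least `interval` seconds have passed since `last_run`.
# A job that has never run (last_run is nil) is always due.
def due?(last_run, interval, now = Time.now)
  return true if last_run.nil?
  now - last_run >= interval
end

# Given rows of [name, last_run, interval], return the names that should run now.
def due_jobs(jobs, now = Time.now)
  jobs.select { |_name, last_run, interval| due?(last_run, interval, now) }
      .map { |name, _last_run, _interval| name }
end

now = Time.now
jobs = [["requests", now - 300, 60], ["errors", now - 10, 60], ["signups", nil, 3600]]
puts due_jobs(jobs, now).inspect
```

Passing now in explicitly keeps every job in a single cron wake-up measured against the same clock reading.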

The scripts and the rails app intersect at two points:

  • the interval and last time polled
  • the generated PNG file.

I may choose to store and display the last collected value, but that will happen after getting feedback from my customers (the development, operations, and product teams).

Since the script has to check the interval and the last time polled, it needs access to the database. I naturally wanted to use the ActiveRecord classes that I use in the rails app. I also wanted access to rails environment variables, like RAILS_ROOT.

By requiring environment.rb:

require File.dirname(__FILE__) + '/../../config/environment.rb'

I was able to get Rails-like behavior into my scripts.
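A slightly more defensive variant of that require is to normalize the path first, so the same file is always referred to by one absolute name no matter where the script is launched from. This is a sketch; the helper name and the example directory layout are assumptions, not the layout of my app.

```ruby
# Build the absolute path to config/environment.rb for a script that lives
# `levels_up` directories below the Rails root.
def rails_environment_path(script_path, levels_up = 2)
  parents = [".."] * levels_up
  File.expand_path(File.join(File.dirname(script_path), *parents, "config", "environment.rb"))
end

# In a real script this would be: require rails_environment_path(__FILE__)
puts rails_environment_path("/var/www/metrics/script/pollers/poll.rb")
```

The levels_up argument is just the number of ".." hops in the original one-liner (two, for a script living two directories below the Rails root).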

Metrics Fast ‘n Easy, Part II: actually using RRDTool from Ruby

June 14, 2008

Continued from part I:

Now that the RRD bundle is installed in Ruby’s default load path, I require RRD and access the convenience methods. The methods basically pass all parameters in as strings, which is fine, but I don’t like thinking of time and values as strings if I can avoid it. So I wrote a wrapper class that allows me to pass in values as typed options, and then casts them to the internal strings.

A couple of notes about creating, updating, and rendering RRD graphs using the built-in Ruby binding.

Creating an RRD graph

At create time, the first parameter is the name of the file, minus the .rrd extension (create will puke if you specify the extension). The start time is expressed in seconds; my code below takes it as a Time object and converts it to seconds. The step time is the minimal amount of time between updates: in other words, if your step is 50 seconds and you try to update at 10 seconds, you get an error.

The DS option defines a dataset as follows:

DS:[name]:[data source type]:[heartbeat, the maximum number of seconds between updates before the value is recorded as unknown]:[min value or unknown]:[max value or unknown]. More explanation of the suitable data source types is found here.

"--start", "#{@start.to_i}",
"--step", "#{@step}",

In the example above, I only create a single dataset. You can create 1..N; I’m sure N has an upper limit, but I haven’t found it specified anywhere. Also, the data set name is restricted to 19 characters or fewer. The RRA section syntax is as follows:
RRA:AVERAGE | MIN | MAX | LAST:xff:steps:rows
where the first field picks how datapoints are collapsed: by averaging them, by taking the min or the max, or by keeping the last one. The xff value limits how much unknown data can go into a collapsed datapoint: it is the maximum ratio of unknown datapoints to total datapoints in the interval being collapsed. The steps value specifies the number of datapoints collapsed, and the rows value specifies how many collapsed datapoints to keep. So RRA is where you really get a chance to limit the size of the RRD file.
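To make the DS and RRA syntax concrete, here is a rough sketch of the kind of wrapper I mean: typed options go in, and the flat list of strings that create expects comes out. The class and method names are hypothetical, and the sketch only builds the arguments. Handing them to the binding (something like RRD.create(*definition.create_args)) is left out, and I am assuming the binding accepts them in that order.

```ruby
# Hypothetical wrapper: typed options in, string arguments for create out.
class RrdDefinition
  def initialize(file, start, step)
    @file, @start, @step = file, start, step
    @sources = []
    @archives = []
  end

  # DS:name:type:heartbeat:min:max ; nil min/max become "U" (unknown)
  def data_source(name, type, heartbeat, min = nil, max = nil)
    raise ArgumentError, "name must be 1 to 19 characters" unless (1..19).include?(name.length)
    @sources << "DS:#{name}:#{type}:#{heartbeat}:#{min || 'U'}:#{max || 'U'}"
    self
  end

  # RRA:cf:xff:steps:rows
  def archive(cf, xff, steps, rows)
    @archives << "RRA:#{cf}:#{xff}:#{steps}:#{rows}"
    self
  end

  # Seconds of history kept by an archive: step * steps * rows.
  def retention(steps, rows)
    @step * steps * rows
  end

  # The file name is passed through exactly as given.
  def create_args
    [@file, "--start", @start.to_i.to_s, "--step", @step.to_s] + @sources + @archives
  end
end

definition = RrdDefinition.new("requests", Time.at(1213400000), 60)
definition.data_source("requests", :GAUGE, 120).archive(:AVERAGE, 0.5, 5, 288)
puts definition.create_args.inspect
puts definition.retention(5, 288)  # 60 * 5 * 288 = 86400 seconds, i.e. one day
```

The retention arithmetic is the useful part when sizing the file: with a 60 second step, collapsing 5 datapoints at a time and keeping 288 rows holds exactly one day of five-minute averages.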

Updating an RRD Graph

Once the graph has been created, it exists as the file you specified using the name parameter above. You update it with time:value statements: in the code below, I’m updating an array of time:value statements:
# simple update of multiple values
def update(times, values)

for i in 0..times.length-1


In the code above, as for all RRD operations, you specify the name of the RRD file you want to operate on in the first parameter.

Note that you cannot update a graph with a time less than its start time or a time that is less than the last time + the step time specified at creation.
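Since a bad timestamp only blows up at update time, it can be handy to build and check the time:value strings before handing them over. A minimal sketch with a hypothetical helper name: it only verifies that timestamps advance, so tighten the comparison if you also want to enforce the step.

```ruby
# Turn parallel arrays of times and values into "time:value" update strings,
# refusing timestamps that do not advance past the previous one.
def update_strings(times, values, last_update = nil)
  raise ArgumentError, "times and values differ in length" unless times.length == values.length
  previous = last_update && last_update.to_i
  times.zip(values).map do |time, value|
    stamp = time.to_i
    if previous && stamp <= previous
      raise ArgumentError, "timestamp #{stamp} is not later than #{previous}"
    end
    previous = stamp
    "#{stamp}:#{value}"
  end
end

start = Time.at(1213400000)
puts update_strings([start + 60, start + 120], [12, 15.5], start).inspect
```

Each resulting string is what would be passed to the update call, after the RRD file name.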

Displaying an RRD Graph

Graph display is the most complex operation with RRD. I’m not going to go into all of the details: some really good examples are found here.

I’ve taken the simplest approach to displaying a graph:

"--title", title,
"--start", start.to_i.to_s,
"--end", finish.to_i.to_s,
"--imgformat", "PNG",

Unlike the update method, the first parameter is the name of the image file to generate, not the name of the RRD file. The RRD file to load is specified in the DEF line. You can specify multiple DEF values to display datasets from different RRD files. You also need to specify how each dataset is rendered: in the above example, the DEF statement defines a value a, which the LINE statement following it then references.

In order to render data, you will need to specify how you want to display it (as a line, as an area under a line, as a tick mark, etc.). More details about how to define data sets, including creating datasets via the CDEF statement, are found in the graph data documentation. Details about how to display data are in the rrdgraph method documentation. CDEF expressions are written in RPN (reverse Polish notation).

Make sure to specify start and end in a way that shows values as you would like to see them, i.e. make sure your latest value is in the specified start and end range.
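Putting the pieces of the graph call together, here is a sketch of how the argument list for a single-line graph could be assembled. The helper name, file names, and color are all made up for illustration. The DEF and LINE strings follow the rrdgraph syntax, and the result is meant to be splatted into the binding's graph call (assumed here to be RRD.graph(*args)).

```ruby
# Hypothetical helper: build the string arguments for a single-line graph.
def graph_args(image, rrd_file, ds_name, title, start, finish, color = "0000FF")
  [
    image,                                   # the image to write comes first
    "--title", title,
    "--start", start.to_i.to_s,
    "--end", finish.to_i.to_s,
    "--imgformat", "PNG",
    "DEF:a=#{rrd_file}:#{ds_name}:AVERAGE",  # load ds_name from rrd_file as value "a"
    "LINE1:a##{color}:#{ds_name}"            # draw "a" as a 1-pixel line with a legend
  ]
end

finish = Time.at(1213403600)
args = graph_args("public/requests.png", "requests.rrd", "requests",
                  "Requests per minute", finish - 3600, finish)
puts args.inspect
```

Deriving start from finish, as in the usage lines, is one simple way to keep the latest value inside the displayed range.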

Metrics fast ‘n easy (?) with RRDTool and Ruby: Part 1: Installing RRDTool and ruby bindings on a Mac

June 10, 2008

I’m trying to set up a simple heads up display of key information about our running system as we move towards releasing our first product. This display is grouped into themes; each theme contains 1..N graphs of relevant system data.

As a manager, this is not my full time job, so I didn’t want to get into navel gazing mode wrt the ‘ultimate’ stats monitoring system, especially since that wheel has already been invented. I settled on using RRDTool to display statistics, mainly because RRDTool has pretty advanced, robust ways to age data out, while keeping the database down to a finite size. It also generates PNG files, which I could update and throw up on a page as needed. There would be a need to track application specific metadata (like what each graph maps to), but no need to handle storing and retrieving metrics data, which gets me (and the poor bastard that inherits this tool) out of a world of pain.

It’s been years since I’ve set up RRDTool, and that was on a Red Hat system. Now I’m a Mac fanboy, and so here are my notes on how to set up RRDTool with the Ruby binding on Mac OS X Tiger.

RRDTool can be installed via MacPorts, which makes pulling in the dependencies ‘not my problem’, the best kind of problem to have :)

sudo port install rrdtool

To build the ruby binding, you need to get the source from here. The ruby binding is in {src folder}/bindings/ruby. In order to get the generated Makefile to point to the installed rrd libs, you need to change the following line in extconf.rb:




Then run

ruby extconf.rb

to generate the Makefile, then

make

make install

to respectively build and then install the RRD.bundle to /opt/local/lib/ruby/site_ruby/1.8/i686-darwin8.9.1.

To confirm that things have run successfully:

ruby test.rb should generate a test.png file.