Garmin TCX to KML: the Prelude (splitting my huge exported exercise file)

February 19, 2008

TCX is Garmin's proprietary file format for logging exercise information. Here is a snippet:

<Activity Sport="Running">
<Lap StartTime="2008-01-26T18:29:26Z">
<AverageHeartRateBpm xsi:type="HeartRateInBeatsPerMinute_t">
<MaximumHeartRateBpm xsi:type="HeartRateInBeatsPerMinute_t">
<HeartRateBpm xsi:type="HeartRateInBeatsPerMinute_t">
<Creator xsi:type="Device_t">

KML is Google's file format for displaying geodata. Here is a snippet of a path that is overlaid on a map:

<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="">
    <description>Examples of paths. Note that the tessellate tag is by default
      set to 0. If you want to create tessellated lines, they must be authored
      (or edited) directly in KML.</description>
    <Style id="yellowLineGreenPoly">
      <name>Absolute Extruded</name>
      <description>Transparent green wall with yellow outlines</description>
        <coordinates> -112.2550785337791,36.07954952145647,2357

In order to display geodata, I need to convert the geolocation-specific part of TCX to KML. Fortunately, this guy had run into this issue before and provided some XSLT to do the job here: Thanks, Jorn, and sorry about the missing umlaut on your name; my codepage foo is not what it should be.

Unfortunately, when I export data from my Mac-based Garmin Training Center, I get over a year’s worth of information. There is no way in this program to export a day, a week, or a month. So my first task is to break this huge a** file into digestible chunks. I’m opting to break it out by activity right now; maybe later I can break it out by time.

I thought about the quickest way to do this; after all, I’m not in the mood to do anything laborious after putting the kids to bed. I’ve written SAX parsers before, and I’m way too lazy to keep around a bunch of state I need to refer to whenever I get a ‘tag encountered’ event. Plus, I had a sneaking suspicion that sed or something sed-like would do the job using regex. One of my mentors used to tell me ‘Arun, you think you’re really smart and you go around inventing all of these rounder wheels. Why don’t you just take the time to read a couple of man pages?’ He went on to say that those man pages were written by much smarter people than he or I, which really used to piss me off 🙂

Turns out csplit does an admirable job of splitting out files based on context that matches a specific regex. There are a few ‘gotchas’:

(1) Put your regex in quotes, otherwise it will be interpreted by the command shell. This _really_ sucks when using XML tag syntax in your regex, i.e. an unquoted /<Activity Sport=.*>/ gets interpreted as input and output redirection (the < and > symbols) with arbitrary characters in between.

(2) With its default two-digit suffix, csplit creates at most 100 files, named xx00 through xx99. The -f option changes the prefix and -n changes the number of suffix digits (and with it the ceiling); I stuck with the default two digits, so for any XML file with > 100 sections of extractable XML, this poses a problem.

(3) If you don’t specify -k (keep written files on error) and the pattern matches fewer times than the repeat count asks for (here, < 100 files written out), csplit exits with an error and erases all files written during that run.
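To make the idea of a context split concrete, here is a minimal Ruby sketch: start a new chunk in front of every line that matches the pattern. This is not csplit, just the same idea expressed with Enumerable#slice_before; the helper name split_on_context and the sample lines are made up for illustration.

```ruby
# Sketch only: mimics the idea of a context split, not csplit's exact behaviour.
# split_on_context is a hypothetical helper name.
def split_on_context(lines, pattern)
  # slice_before starts a new chunk in front of every element matching the pattern
  lines.slice_before { |line| line =~ pattern }.map(&:join)
end

# Made-up sample input, loosely shaped like a TCX export
sample = [
  "<Activities>\n",
  "<Activity Sport=\"Running\">\n",
  "<Lap StartTime=\"2008-01-26T18:29:26Z\">\n",
  "<Activity Sport=\"Biking\">\n",
  "<Lap StartTime=\"2008-01-27T09:00:00Z\">\n"
]

chunks = split_on_context(sample, /<Activity Sport=.*>/)

# Report each chunk the way csplit names its files: prefix plus two digits
chunks.each_with_index do |chunk, i|
  puts format("act%02d: %d bytes", i, chunk.bytesize)
end
```

Because this runs inside Ruby, there is no shell quoting to worry about and no 100-file ceiling, but it does read the whole input into memory.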

The csplit invocation that split out the chunks:
csplit -k -f act exercise.out.tcx '/<Activity Sport=.*>/' '{100}'
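When csplit is launched from Ruby rather than typed at a prompt, gotcha (1) can be sidestepped entirely: Open3.capture3 with separate argument strings does not go through a shell, so the pattern needs no quoting. A minimal sketch, assuming csplit is on the PATH; the helper names csplit_argv and run_csplit are hypothetical.

```ruby
require "open3"

# Hypothetical helper: builds the csplit argument list for one pass.
# Note that the pattern carries no shell quotes, because no shell is involved.
def csplit_argv(input_file, prefix = "act")
  ["csplit", "-k", "-f", prefix, input_file, "/<Activity Sport=.*>/", "{100}"]
end

# Hypothetical helper: runs csplit without a shell and returns the byte
# counts it printed on stdout, one integer per created file.
def run_csplit(input_file)
  stdout, _stderr, _status = Open3.capture3(*csplit_argv(input_file))
  stdout.split.map(&:to_i)
end

argv = csplit_argv("exercise.out.tcx")
puts argv.inspect

# Only run the real command when the export file is actually present
puts run_csplit("exercise.out.tcx").inspect if File.exist?("exercise.out.tcx")
```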

This seems like a great time to actually write some code (as opposed to writing a SAX parser): I need to drive csplit until there are no more <Activity> tags to individually extract. Ruby has become my scripting language of choice lately, primarily because I can maintain it over time, and also because of irb, the interactive Ruby shell, which allows me to ‘test drive’ commands I eventually want to put into a script.

csplit writes out the number of bytes in each created file to stdout, which we can take advantage of:

ret = `csplit -k -f act input.tcx '/<Activity Sport=.*>/' '{100}'`
puts a newline-delimited list of the byte sizes of the output files into ret; the files themselves all start with ‘act’ and are numbered from 00 to 99.

vals = ret.split

if(vals.length == 100)

lets us see whether we have more work to do, i.e. all 100 files (act00 through act99) have been created. We take the last file, act99, copy it to a new directory to start over, and repeat until vals.length < 100:
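The check can be wrapped in a small helper so it is easy to try out in irb without running csplit at all. A minimal sketch; the name more_work_left? and the constant are hypothetical, and the byte counts below are made up.

```ruby
# act00 through act99 with the default two-digit suffix
CSPLIT_FILE_LIMIT = 100

# Hypothetical helper: csplit_output is the text csplit printed on stdout,
# one byte count per line. Hitting the limit means act99 probably still
# contains unsplit activities.
def more_work_left?(csplit_output)
  csplit_output.split.length == CSPLIT_FILE_LIMIT
end

# Made-up byte counts, just to exercise the check
partial = "1200\n3400\n560\n"
full    = (["1024"] * 100).join("\n")

puts more_work_left?(partial)
puts more_work_left?(full)
```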

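# input_file (the TCX file name) and gen_new_dir(count) are defined elsewhere in the script.
# The counter, the chdir, and the else/end lines are filled in here so the loop terminates.
count = 0
continue = true
while continue
  # run csplit here.
  puts "splitting files by <Activity> tag in #{Dir.pwd}..."
  ret = `csplit -k -f act #{input_file} '/<Activity Sport=.*>/' '{100}'`
  vals = ret.split
  if vals.length == 100
    # act99 holds everything csplit did not get to: move it to a fresh directory and go again
    count += 1
    newdir = "../#{gen_new_dir(count)}"
    puts "creating #{newdir}/#{input_file}"
    Dir.mkdir(newdir) unless File.exist?(newdir)
    `cp act99 #{newdir}/#{input_file}`
    Dir.chdir(newdir)
  else
    continue = false
  end
end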
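The gen_new_dir helper is not listed here. A minimal sketch of what such a helper could look like, assuming one directory per pass; the ‘pass’ prefix and the two-digit numbering are assumptions for illustration, not the original helper.

```ruby
# Hypothetical stand-in for gen_new_dir: returns a directory name for the
# given pass number. The naming scheme is an assumption.
def gen_new_dir(count)
  format("pass%02d", count)
end

puts gen_new_dir(1)
puts "../#{gen_new_dir(2)}"
```

Zero-padding keeps the directories in order when listed with ls.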



What is left: take these files and see if the XSLT code above works with them or pukes. These are not standard TCX files anymore (each chunk is a fragment cut out of the original document), so I’m not expecting much love. Also, extracting KML is only one part of what I want to do with these files; showing heart rate vs. distance vs. altitude, etc. is also something that isn’t done super well in the existing freeware.