Monday, May 30, 2016

Strava and a curious tale of WW2 bombers

The city of Waterloo published an apparently innocent tweet that stopped me in my tracks.

Now, this is a really interesting move on Waterloo's part, but it left me feeling concerned. I'd like to talk about why, in the hopes that it helps Waterloo and other places avoid falling into a common trap.

First, though, credit where credit is due: the city of Waterloo has been doing some excellent stuff over the last few years, by connecting and improving its trails, continuing to extend bike lanes, and supporting a new uptown streetscape with protected bike infrastructure (which will be joined by more trails along Caroline and Erb as part of a regional reconstruction). There are signs of progress in finally making the Lexington overpass bike-friendly. It's really encouraging and I'd like to see other cities (like Kitchener) take a closer look at their initiatives.

Also positive: Waterloo has taken an interest in data. Loop counters are now installed on several trails across town, counting travelers on foot and on bike year round. Waterloo wants feedback on its improvements, and wants real data to help drive its decisions. Maybe they can also show that data live at some point!

Now, it seems Waterloo has taken up with Strava to use Strava's health and fitness tracking data to show where people are jogging and biking, using Strava's Metro planning service, which looks like an attempt by Strava to monetize its users' data. (Update: I've been informed that Waterloo isn't using Metro, just looking at the publicly available information on Strava.)

If this is the case, Waterloo had best tread very carefully and realize what they are getting, and what they are not.

There's no arguing Strava's cycling heat map is a beautiful thing. Who doesn't like seeing where Strava users are biking? How couldn't this help us make decisions on cycling infrastructure?

It's so beautiful.
Have you spotted the problem yet? You should ask who Strava's users are, and what they are doing. Are they actually representative of the people you're trying to encourage to use a bicycle?

A cautionary tale

Here's a lesson from history as to why knowing where your data comes from is so important.

During World War II, Allied bombers would fly out over Europe to bomb targets and sometimes they would be shot down. The odds of crews surviving the many missions of their "tour" were depressingly low because, mission after mission, the chance of falling victim to guns or flak caught up with many of them.

In an effort to increase the survivability of their bombers, military leaders and engineers decided they needed to add armour to their bombers. They couldn't add much-- armour is heavy, and bombers need to fly. So, where would it help the most?

They came up with a clever idea: look at all of the surviving bombers and where they were shot up. Where they were being hit most, it was argued, was where the aircraft should be strengthened.

Statistician Abraham Wald turned this argument on its head. You're only looking at the bombers that came back, he said. They don't need more armour where they've been hit. They need armour where they haven't been hit. It turns out this was around the pilots and in the tail.

All the bombers that had been hit around the pilots and in the tail hadn't returned, so nobody had good data on them. But bringing those bombers and their crew home was the goal. And until Wald pointed this out, nobody realized the mistake they were making.

The black areas mark hits on the surviving bombers

This is called Survivorship Bias and it happens all over the place.
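Wald's point can be sketched with a small simulation. This is only an illustration: the aircraft sections and the per-section chance that a hit brings the bomber down are invented numbers, not historical data. Hits land evenly across the aircraft, yet the damage counted on the bombers that return looks very different.

```python
import random

# Hypothetical sections and the assumed chance that one hit there downs the bomber.
# These values are made up for illustration only.
LETHALITY = {"wings": 0.05, "fuselage": 0.10, "cockpit": 0.60, "tail": 0.50}


def simulate(n_bombers=10_000, hits_per_bomber=3, seed=1):
    """Return (hits on all bombers, hits seen on returning bombers) per section."""
    rng = random.Random(seed)
    sections = list(LETHALITY)
    all_hits = dict.fromkeys(sections, 0)
    survivor_hits = dict.fromkeys(sections, 0)
    for _ in range(n_bombers):
        # Every section is equally likely to be hit.
        hits = [rng.choice(sections) for _ in range(hits_per_bomber)]
        # The bomber returns only if it survives every one of its hits.
        survived = all(rng.random() >= LETHALITY[s] for s in hits)
        for s in hits:
            all_hits[s] += 1
            if survived:
                survivor_hits[s] += 1
    return all_hits, survivor_hits


all_hits, survivor_hits = simulate()
for s in LETHALITY:
    print(f"{s:9s} all hits: {all_hits[s]:6d}   seen on survivors: {survivor_hits[s]:6d}")
```

Under these assumptions, the survivors show plenty of wing damage and very little around the cockpit and tail, even though every section was hit about equally often. Counting holes on the returning aircraft measures where a bomber can be hit and still fly home, not where bombers get hit.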

Survivorship Bias and Self-selection Bias

It's not the only kind of statistical bias that Waterloo needs to worry about here, but it's a big one. People who don't bike but could, don't use Strava. The only people Strava can collect data on are those who have managed to make themselves able to tolerate riding on our streets and roads. Everyone who can't has been filtered out.

There's more.

Strava calls itself a "social network for athletes". Its cycling userbase has a lot of athletic cyclists who use Strava to compare their performance to others. In other words, Strava's data is heavily weighted towards the enthusiastic, confident and athletic cyclists who look for places they can ride fast for long distances, and who are more willing to ride in traffic.

This presents a second source of bias: People choose whether to use Strava. This is Self-selection Bias. The people who ride bikes and who choose to use Strava may not reflect all people riding in general.
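Here is a rough sketch of how self-selection can skew a heat map. Every number in it is an assumption made up for illustration (the share of confident riders, how often each group rides on arterial roads, and how likely each group is to log rides in an app); none of it comes from Strava or from Waterloo's counts.

```python
import random


def simulate_riders(n=20_000, seed=2):
    """Compare the true share of arterial riding with the share an app would record."""
    rng = random.Random(seed)
    true_arterial = 0
    app_arterial = 0
    app_users = 0
    for _ in range(n):
        # Assumed: 15% of riders are confident, athletic cyclists.
        confident = rng.random() < 0.15
        # Assumed: confident riders mostly use arterials, everyone else mostly avoids them.
        on_arterial = rng.random() < (0.8 if confident else 0.2)
        # Assumed: confident riders are far more likely to log their rides.
        uses_app = rng.random() < (0.5 if confident else 0.03)
        true_arterial += on_arterial
        if uses_app:
            app_users += 1
            app_arterial += on_arterial
    return true_arterial / n, app_arterial / app_users


true_share, app_share = simulate_riders()
print(f"share of all riders on arterials:     {true_share:.0%}")
print(f"share of logged rides on arterials:   {app_share:.0%}")
```

With these made-up inputs, the logged rides put roughly twice as large a share of cycling on arterial roads as actually happens. The app data is accurate about its own users; it just isn't a sample of everyone else.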

(Look ma, I'm finally putting my math degree to good use!)

So what does this mean for Waterloo, which, like many cities, wants to grow cycling? The question here is: are the people they want to attract to cycling similar to Strava users? Do these users' choices about where to ride reflect where the hypothetical 60% Interested But Concerned want to ride, or the trips they'd make?

I would say there are very significant differences between these groups. Strava's cyclists are the "survivors" of bicycle-unfriendly city design that keeps all but the bravest of us off our bikes. They are making very different choices about their cycling than the majority of people who say they would like to ride a bike but don't. The first such choice is that these cyclists are already riding on streets and roads that would make most of us very nervous.

This is not to say the data is useless. Some of it reflects overall truths: the great success that is the Iron Horse Trail, for instance. But let's take a closer look at these maps. They can lead you into some strange interpretations that don't make sense.

1. Students on bikes are missing

University of Waterloo

Around UW, there is heavy bike traffic on the Laurel trail, and on Ring Road, and on University, Columbia and Westmount. There are virtually no traces within Ring Road.

This is highly surprising. The interior UW campus has many bike racks littered with bikes. Why aren't paths to these points showing? Could it be that students on bikes aren't using Strava? If so, what does that say about how Strava data represent their travel patterns off UW campus?

This is a sign that Strava is completely silent on a major bike-using demographic. Yikes.

2. It takes a lot of nerve to cross 85 on University Avenue

University between Weber & Bridge

According to Strava, there's little difference in cycling on University Ave. where bike lanes exist (west of Lincoln Road) and where they don't (over the expressway). In fact, it looks like there are plenty of people on bikes crossing the expressway on University Avenue. My own experience has been that anyone with a choice avoids crossing here, unless they are supremely confident about mixing with high speed traffic merging on and off the highway.

Someone might look at this and interpret it as a case where bike lanes aren't really having an effect, and that usage by people on bikes is just fine on University Ave. For a certain kind of athlete, that is completely true. When considering the general public, that would be a mistake.

3. Where are the neighbourhood riders?

Eastbridge neighbourhood

If you look at this map, you'd be led to believe that virtually everyone is biking on arterial roads. My experience is that there are a lot of casual riders on neighbourhood streets within this area, and younger students who traverse the quieter streets. These are clearly not being captured in Strava's data set.

What's more, this image doesn't really capture who wants to ride, but can't. The southeast portion of the image is Conestoga Mall, with a supermarket and a transit terminal. Does the lack of traces here mean that nobody bikes to Conestoga Mall? Or just that people who use Strava aren't going shopping or connecting to transit? We can't know. Nor can we determine if the neighbourhood would appreciate a little more bike accessibility at the mall.

The map says nothing.

Approach with Caution

These are just a few examples of where Strava as a planning tool comes up short. As I said before, the data is not without value, but if your goal is to make cycling attractive to the majority of people, then at best this information is incomplete, and at worst it can be deceptive. And it won't always be obvious when that happens, either.

What Strava shows is where a very particular kind of cyclist rides. It doesn't show where improvements would do the most good for the people who could ride but choose not to, which is where cycling growth will come from. Strava wants to make money with its data, so I don't trust the company to be forthcoming about this, nor would I say that its agenda matches that of the cities hiring it.

The city of Waterloo should be commended for seeking out data to base their decisions on. I trust that they know that Strava is just one tool among many in their toolbox. It can help them visualize how the city looks from two wheels, but they still need to take a step back from these maps and analyze Waterloo's bike network with careful thought and direct observation.

Let's not lose sight of that.

Meanwhile, I guess I should install Strava. If nothing else, I want my own trips to be represented!

Wednesday, May 25, 2016

A tale of fickle, elusive Transit

Update 27/5: Good news! GRT is reinstating the Weber/Guelph stop to serve the midtown area and it will be back up by mid-next week. The Weber/Union stop will remain closed, as it sees additional service from route 8, and was less used than Weber/Guelph. It was also only about 500m away.

There is concern about keeping the 200 on schedule, but it's encouraging to see responsiveness from GRT on this! I guess there was more than one crotchety blogger on the phone with them after all. For those of you who provided feedback about how you were affected, thank you.

Maybe I need to change the ending to this fairy tale...


Once upon a time, there was a happy central neighbourhood called Mount Hope. In Mount Hope, being very central (directly between two great cities), Transit was present and available, as you might expect. Some residents in the neighbourhood chose to live there because of this, as they could travel the great Central Corridor through the Cities without needing a Car.
But one day, the Great Road Eating Rail Monster came, and Transit ran far away. And the people of Mount Hope were sad.

A little later, just as sad people in Mount Hope were trying to figure out how to live without Transit (likely by buying Cars to drive the few roads not being eaten), Transit came back again! In a different place, but still reachable to the residents of Mount Hope, and they breathed a great sigh of relief.

Unfortunately, it was not to be for long. One day just a few months later, Transit went away again. The Central Corridor buses were to drive by, with strict orders to not stop for the poor people of Mount Hope. In its place, a different bus would come every now and then, once in a while on Saturdays and never at all on Sundays, and it would take people where it went, which was usually not where they wanted to go. And it would take people when it showed up, which was often not when people needed it, because it only came every now and then, only once in a while on Saturdays and never-- it was decreed, never! on Sundays. This was called "Transit", but the people of Mount Hope were not fooled.

But they were once again sad-- not to mention, confused-- and they started browsing new car ads on the Internet because who can trust Transit that comes and goes? The Great Road Eating Rail Monster would eventually leave, having birthed a beautiful Train. But that Train would see fewer riders, because the people of Mount Hope were all driving the Cars they had to buy to get around while the Monster raged and Transit had proven too fickle and elusive to rely on.

The End.

... OK, so, not a great fairy tale. Let me explain. Once, Mount Hope had good access to transit, as the fairy tale told.

And then:
  • The 4, marginally useful at the best of times, was rerouted to cover the cancelled 18.
  • When the 200 iXpress detoured over to Weber because of ION construction, it didn't stop anywhere in between uptown and downtown, leaving a lot of riders in the lurch. 
  • Simultaneously, route 7 shifted over to Park St., which was great for the Belmont neighbourhood.
  • Simultaneously, the low-frequency route 6 turned off Wellington at Weber.
  • The Mount Hope neighbourhood was left high and dry, with only the very occasional route 4, which is a milk run bus with limited service (and no Sunday service).

But then, the 4 needed to be detoured as well! So to infill the 4 on Weber, GRT added stops for 200 iXpress (which runs every 10-15 minutes and has good Saturday and Sunday service) at Weber/Union and Weber/Guelph.

This gave back good transit access to the Mount Hope neighbourhood.

Now that the 4 has returned to its normal route, the 200 no longer stops at these locations. This is despite the fact that it drives right by them, every 10-15 minutes.

The result is awful:

In fact, 200 iXpress travels for an astounding 3.5 km without stopping even once, from the north end of Uptown all the way to King/Victoria. Certainly, Mount Hope isn't the only area affected by this-- Mary/Allen is, as is anyone in the Midtown area relying on the 200 to get them to points south of Fairview.

But with all these changes put together, the Mount Hope neighbourhood is once again left like an island without transit, unless you count the occasional lonely route 4 bus. And very few people will count this for anything. Frequency matters, and routing matters. GRT has ignored these things when deciding the 200's detour.

In 2015, GRT saw transit ridership fall for the second year in a row. Even accounting for changes to school student busing, and even with new service being added, GRT ridership still fell.

The problem with Mount Hope is just part of a larger issue affecting ridership, and that is the chaos and disruption of ION construction. But unlike a lot of the other aspects of this problem, this one can be solved easily and locally. In fact, the solution was already in place. All we have to do is return to it.

All we have to do is reintroduce stops at Weber/Guelph and Weber/Union for 200 iXpress.

GRT, are you listening?