Vulnerability in Strava Home Address Privacy

Strava and Privacy

This article details an information leakage vulnerability in the activity-tracking app Strava. Strava enables users to record and share their activity metrics, including location details, in a social media format with friends and a wider community of over 120 million users. Strava has faced scrutiny from privacy advocates before, notably in 2018 when it inadvertently exposed the locations of secret military bases, and again in 2023 when its heatmap feature was found to potentially leak users’ home addresses.

To protect users, Strava introduced a feature that allows them to obfuscate sensitive locations by creating "hidden zones" around them. Initially, these were perfect circles centered on the protected location. A 2018 study by Hassan et al. demonstrated that this method left users vulnerable: by identifying the edges of a hidden zone, an attacker could pinpoint the address at its center. In response, Strava added a randomly generated offset to the hidden zone, eliminating this attack. A bad actor could still determine the contours of the hidden zone but would gain no information about the exact location of the protected address within it.

Offset hidden zone of testing location

Vulnerability

During a run, I devised a potential method allowing a malicious actor to discern the location of a protected address within a "hidden zone". The attack requires only publicly available information. The vulnerability originates from Strava's practice of displaying the total distance of an activity to high precision while redacting the GPS track inside the hidden zone. The activity can therefore be divided into two segments: the visible distance and the hidden distance.

Hidden section of activity shown in grey

An attacker could find the hidden distance by subtracting the visible distance from the total distance. With the hidden distance known, it is possible to infer that the true start and end of the activity lie that distance away from the publicly displayed start and end points. By analyzing data from multiple activities that enter and exit the hidden zone at geographically spaced points, an attacker could triangulate the protected address.
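
As a minimal sketch of the arithmetic (the numbers below are made up for illustration, not data from my experiment):

    # Core inference: the distance Strava hides is simply the difference
    # between the precisely reported total and the length of the visible track.
    # Example values are hypothetical.
    total_distance_m = 2004.3    # total distance reported by Strava
    visible_distance_m = 1362.8  # distance summed along the public GPS track

    hidden_distance_m = total_distance_m - visible_distance_m

    # The visible endpoint at the hidden zone boundary, paired with this
    # hidden running distance, becomes one constraint on the protected address.
    print(f"Hidden distance: {hidden_distance_m:.1f} m")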

Implementation

I began by gathering activity data: I created an alternate account and randomly selected a residential address in Vancouver for the experiment. Strava permits a hidden zone radius ranging from 200 to 1600 meters, with 200 meters as the default. I opted for a 400-meter radius, accepting the first randomized offset presented.

I conducted four running activities, each starting and ending just outside the chosen test address and heading in a different cardinal direction. My objective was to gather data conducive to the attack while reflecting typical user activities. Running in varied directions and circling a local park before returning ensured geographically spaced entry and exit points, aiding triangulation. Additionally, I opted for relatively short distances of approximately 2 km to reduce distance calculation errors. This setup models a scenario with a limited, nearly optimal sample set. Real-world data naturally presents more variance; however, this can be compensated for with more data.

Hidden sections in grey, visible sections in red

In reviewing the GPS logs, I anticipated discrepancies in matching my distance estimates with Strava's calculations; however, the results were surprisingly accurate. I managed to estimate a friend's 25 km run to within 50 meters of Strava's calculation.
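
For reference, here is a minimal sketch of how a visible distance can be estimated from the public GPS points, summing consecutive haversine distances. This is my own approximation; Strava's exact distance calculation is not documented here, and the example track below is hypothetical.

    import math

    def haversine_m(lat1, lon1, lat2, lon2):
        """Great-circle distance in meters between two lat/lon points."""
        r = 6371000  # mean Earth radius in meters
        phi1, phi2 = math.radians(lat1), math.radians(lat2)
        dphi = math.radians(lat2 - lat1)
        dlmb = math.radians(lon2 - lon1)
        a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
        return 2 * r * math.asin(math.sqrt(a))

    def track_length_m(points):
        """Sum consecutive point-to-point distances over a (lat, lon) track."""
        return sum(
            haversine_m(lat1, lon1, lat2, lon2)
            for (lat1, lon1), (lat2, lon2) in zip(points, points[1:])
        )

    # Short hypothetical track, not real activity data.
    visible_track = [(49.2827, -123.1207), (49.2831, -123.1199), (49.2836, -123.1190)]
    print(f"Visible distance: {track_length_m(visible_track):.1f} m")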

Subtracting the visible distance from the total distance for each of the four activities gave me estimates of the hidden distances. This yielded four sets of coordinates and their corresponding distances from the protected address. One complication emerged in how the hidden distance is defined: it is the distance run, not the straight-line distance between the activity's hidden-zone entry point and the protected address. Since runners follow street contours, calculating point-to-point distances requires prior knowledge of the street network.
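
In terms of bookkeeping, each activity reduces to one constraint: the point where the visible track meets the hidden zone, plus the street distance run beyond it. A sketch with hypothetical values (the field names and coordinates are illustrative, not my real data):

    # Hypothetical per-activity records: Strava's reported total, the visible
    # distance summed from the public track, and the visible endpoint where
    # the track meets the hidden zone. Values are illustrative only.
    activities = [
        {"total_m": 2004.3, "visible_m": 1362.8, "zone_edge": (49.2610, -123.1470)},
        {"total_m": 2210.9, "visible_m": 1545.1, "zone_edge": (49.2655, -123.1389)},
        # ...one record per activity
    ]

    # Each activity contributes one constraint: the protected address lies
    # hidden_m of *street* distance away from zone_edge.
    constraints = [
        {"zone_edge": a["zone_edge"], "hidden_m": a["total_m"] - a["visible_m"]}
        for a in activities
    ]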

To determine distances along city streets, I used OpenStreetMap, a free, open-source geographical database. From OpenStreetMap data, I constructed a boundary containing every geographic point reachable along city streets at exactly a given walking distance from a starting point. With the starting point placed where an activity crosses the edge of the hidden zone and the walking distance set to that activity's hidden distance, the protected house should fall on one of the boundary points.
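
The exact tooling used for this step is not important; the sketch below is one way to approximate such an isodistance boundary with the OSMnx and NetworkX libraries, keeping street-network nodes whose walking distance from the zone-edge point falls within a small tolerance of the hidden distance. Treat it as an assumption-laden approximation rather than the original implementation.

    # Rough approximation of an "isodistance" boundary using OSMnx + NetworkX.
    import osmnx as ox
    import networkx as nx

    def isodistance_points(zone_edge_latlon, hidden_m, tolerance_m=25):
        """Return (lat, lon) of street-network nodes roughly hidden_m of
        walking distance away from the point where the track meets the zone."""
        lat, lon = zone_edge_latlon

        # Download the walking network around the zone-edge point.
        G = ox.graph_from_point((lat, lon), dist=hidden_m + 500, network_type="walk")

        # Snap the zone-edge point to the nearest graph node.
        start = ox.distance.nearest_nodes(G, lon, lat)

        # Shortest-path walking distance (edge attribute "length", in meters)
        # from the start node to every reachable node.
        dists = nx.single_source_dijkstra_path_length(
            G, start, cutoff=hidden_m + tolerance_m, weight="length"
        )

        # Keep nodes whose distance falls within a band around the hidden distance.
        return [
            (G.nodes[n]["y"], G.nodes[n]["x"])
            for n, d in dists.items()
            if abs(d - hidden_m) <= tolerance_m
        ]

Running this for each constraint from the previous sketch yields one candidate point set per activity, i.e. one "color" on the map below.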

Every possible location the hidden distance away from the edge of the hidden zone

I repeated this process for all four activities and generated a map with a different color for each one. To pinpoint the protected house, I identified where all the colors overlapped: the smallest circle enclosing a point from each color locates the protected address. Using this method, I narrowed the address down to just a few houses.
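
One simple way to compute that overlap programmatically is brute force: for each candidate point, take its largest distance to the nearest candidate of every other activity, and keep the point that minimizes it. This sketch (reusing the haversine_m helper defined earlier) is my own take, not necessarily how the original map was produced.

    def nearest_gap_m(point, candidates):
        """Distance from point to the closest candidate in another activity's set."""
        return min(haversine_m(point[0], point[1], c[0], c[1]) for c in candidates)

    def best_overlap(candidate_sets):
        """candidate_sets: one list of (lat, lon) isodistance points per activity.
        Returns (point, radius_m) for the tightest agreement across all sets."""
        best_point, best_radius = None, float("inf")
        for point in candidate_sets[0]:
            # Worst-case gap to the other activities' candidate sets.
            radius = max(
                (nearest_gap_m(point, other) for other in candidate_sets[1:]),
                default=0.0,
            )
            if radius < best_radius:
                best_point, best_radius = point, radius
        return best_point, best_radius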

Left: Possible home locations for each activity. Right: cluster outside protected address

Findings Disclosure

Before publishing, I disclosed my findings to the security team at Strava and delayed publication by four months. In testing this vulnerability, I used only data generated by myself.

Lastly, shortly before publishing, I discovered a paper from KU Leuven in Belgium outlining the same inference method used here. In it, the authors measure the effectiveness of the method using 1.4 million real-world activities.