README - GTFS Real-Time Operator Summary Metrics

GTFS Real-Time Trip Updates - Performance Metrics

One of the most common transit user behaviors is to consult an app (Google Maps, Apple Maps, NextBus, etc.) to find out when the bus or train is going to arrive.

That widely desired piece of information is powered by GTFS Real-Time Trip Updates, specifically the Stop Time Updates specification.

This report covers performance metrics for GTFS Real-Time Trip Updates, specifically the stop time update messages. Accurate and reliable information should be provided to transit users for journey planning. These performance metrics provide insights into:

- availability and completeness of RT - is there information available for users?
- prediction inconsistency - how much are predictions changing from minute to minute as the bus approaches time of arrival?
- prediction reliability and accuracy - are these predictions accurate (when compared to our estimated actual time of arrival)?

Reliable Prediction Accuracy

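The prediction is considered accurate if it falls within the bounds of this equation: -60 ln(Time to Prediction + 1.3) < Prediction Error < 60 ln(Time to Prediction + 1.5). Here Prediction Error is in seconds and Time to Prediction is in minutes; the table below converts the bounds to minutes.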

As the bus approaches each stop, the software keeps predicting when the bus will arrive. When the bus is 30 minutes from arrival, the accuracy buffer is more generous; the buffer tightens as the bus nears the stop.

Minutes Until Bus Arrives	Accurate Within Bounds
0 min	-0.26 min (early) to +0.41 min (late)
10 min	-2.42 min (early) to +2.44 min (late)
30 min	-3.44 min (early) to +3.45 min (late)
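
As a rough illustration of the bounds above, the sketch below evaluates the equation in Python. It assumes Prediction Error is in seconds and Time to Prediction is in minutes, which is what reproduces the table values. The function names are made up for this example and are not taken from the report code.

```python
import math


def reliable_accuracy_bounds_seconds(time_to_prediction_min: float) -> tuple[float, float]:
    """Return the (lower, upper) bounds on prediction error, in seconds.

    Assumes time_to_prediction_min is in minutes, as in the equation above.
    """
    lower = -60 * math.log(time_to_prediction_min + 1.3)
    upper = 60 * math.log(time_to_prediction_min + 1.5)
    return lower, upper


def is_reliably_accurate(prediction_error_sec: float, time_to_prediction_min: float) -> bool:
    """True if the prediction error falls strictly inside the bounds."""
    lower, upper = reliable_accuracy_bounds_seconds(time_to_prediction_min)
    return lower < prediction_error_sec < upper


# Reproduce the table rows, converting the bounds from seconds to minutes.
for minutes_out in (0, 10, 30):
    lower, upper = reliable_accuracy_bounds_seconds(minutes_out)
    print(f"{minutes_out:>2} min: {lower / 60:.2f} to {upper / 60:.2f} min")
```

For example, a prediction made 10 minutes out with an error of +120 seconds would count as accurate, while the same error at 0 minutes out would not.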

Positive values = arrival came after the prediction.
- actual_arrival = 8:05 am
- predicted_arrival = 8:00 am
- actual_arrival - predicted_arrival = +5 minutes
- if you follow the prediction, you will catch the bus
Negative values = arrival came before the prediction.
- actual_arrival = 8:05 am
- predicted_arrival = 8:10 am
- actual_arrival - predicted_arrival = -5 minutes
- if you follow the prediction, you will miss the bus. This is very bad!
- we want fewer of these kinds of predictions; riders would much rather wait for the bus than miss it
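
The sign convention can be sketched in a few lines of Python. The function names and the calendar date are placeholders chosen for this example, and treating an error of exactly zero as "catches the bus" is an assumption, not something the report specifies.

```python
from datetime import datetime


def prediction_error_minutes(actual_arrival: datetime, predicted_arrival: datetime) -> float:
    """Prediction error = actual arrival minus predicted arrival, in minutes."""
    return (actual_arrival - predicted_arrival).total_seconds() / 60


def rider_catches_bus(error_minutes: float) -> bool:
    """Positive error means the bus came after the prediction, so the rider catches it.

    Counting an error of exactly 0 as a catch is an assumption of this sketch.
    """
    return error_minutes >= 0


# The date is arbitrary; only the clock times matter here.
actual = datetime(2024, 1, 1, 8, 5)
early_prediction = datetime(2024, 1, 1, 8, 0)
late_prediction = datetime(2024, 1, 1, 8, 10)

print(prediction_error_minutes(actual, early_prediction))  # 5.0
print(prediction_error_minutes(actual, late_prediction))   # -5.0
```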

Reliable Prediction Accuracy Metrics in Report

Goal	Metric Columns
Bus Catch Likelihood: 75%+ of predictions result in catching the bus.	Bus Catch Likelihood, % Early / On-Time / Late Predictions
Prediction Error: closer to zero, with small positive values. Late predictions = negative values = riders miss the bus.	Average prediction error (minutes)
Prediction Error Variability: variability is the interquartile range (IQR = 75th - 25th percentile). Smaller values = better = more consistent experience for riders using the app. Ex1: 25th percentile = -5 minutes = a quarter of riders get predictions that are 5 or more minutes late. Ex2: 75th percentile = 3 minutes = a quarter of riders get predictions that are 3 or more minutes early. Ex3: half of riders get predictions between 5 minutes late and 3 minutes early.	10th, 20th, ..., 90th percentiles; Variability = IQR = 75th percentile - 25th percentile; Accuracy Loss = 10th percentile / 50th percentile
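
To make the variability row concrete, here is a minimal sketch that computes the quartiles and IQR of a list of prediction errors with Python's standard library. The sample values are made up, the function name is hypothetical, and the "inclusive" interpolation method is an assumption; the report may compute percentiles differently.

```python
import statistics


def prediction_error_variability(errors_minutes):
    """Return (p25, p50, p75, iqr) for prediction errors given in minutes."""
    # method="inclusive" is an assumption of this sketch.
    p25, p50, p75 = statistics.quantiles(errors_minutes, n=4, method="inclusive")
    return p25, p50, p75, p75 - p25


# Made-up prediction errors (minutes), for illustration only.
sample_errors = [-5, -3, -1, 0, 1, 2, 3, 4, 6]
p25, p50, p75, iqr = prediction_error_variability(sample_errors)
print(f"25th={p25}, 50th={p50}, 75th={p75}, IQR={iqr}")
```

A smaller IQR means the middle half of predictions cluster more tightly around the median error.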

Availability and Completeness of Predictions

This metric is the easiest to achieve. For starters, having information is better than no information.
It measures the completeness within the RT data we are capturing, regardless of coverage gaps across dates.
For each instance of a scheduled stop arrival, RT information counts as complete when there are at least 2 predictions each minute (one every 30 seconds).
For the 30 minute period before the bus arrives at each stop, each minute is an observation that goes into this calculation (up to 30 observations).
This ensures that we have a fairly equal number of observations for each stop and can compare across stops.
- We want to avoid having 30 minutes of predictions for the 1st stop and 60 minutes of predictions for the last stop, which would mean comparing metrics that have different denominators.
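
The per-minute counting described above might look like the sketch below for a single trip and stop. The function name and inputs are hypothetical, and dividing by the full 30-minute window (rather than by the minutes that have any prediction at all) is an assumption of this example.

```python
from collections import Counter
from datetime import datetime, timedelta


def complete_minutes(message_times, arrival, window_minutes=30, min_predictions=2):
    """Count one-minute bins before arrival that have enough predictions.

    Returns (n_complete_minutes, share_of_window). Using the full window as
    the denominator is an assumption of this sketch.
    """
    window_start = arrival - timedelta(minutes=window_minutes)
    counts = Counter()
    for ts in message_times:
        # Only messages inside the window before arrival are counted.
        if window_start <= ts < arrival:
            minute_bin = int((ts - window_start).total_seconds() // 60)
            counts[minute_bin] += 1
    n_complete = sum(1 for c in counts.values() if c >= min_predictions)
    return n_complete, n_complete / window_minutes


# Made-up example: two messages per minute for 15 minutes, then one per minute.
arrival = datetime(2024, 1, 1, 8, 30)
start = arrival - timedelta(minutes=30)
messages = [start + timedelta(seconds=30 * i) for i in range(30)]
messages += [start + timedelta(minutes=15 + i) for i in range(15)]

n_complete, share = complete_minutes(messages, arrival)
print(n_complete, share)  # 15 0.5
```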

Availability and Completeness Metrics in Report

Goal	Metric Columns
2+ vehicle positions and trip updates messages per minute.	[Trip Updates / Vehicle Positions] Messages per Minute
100% routes are covered by RT, and 75%+ of trips have RT. Out of scheduled trips, how many trips have RT, regardless of completeness? Out of scheduled routes, how many routes have at least 1 trip with RT?	[Trip Updates / Vehicle Positions] % Trips, [Trip Updates / Vehicle Positions] % Routes
90%+ of minutes have predicted arrival information. How many minutes have 2+ messages, in the 30 minutes before the bus arrives?	% Minutes with 2+ Predictions

Prediction Inconsistency

This metric (also called jitter or wobble) captures another aspect of transit user experience. Any change in prediction is counted, so this metric only has positive values, but smaller positive values are better.
- If the prediction is changing from minute to minute, a large spread would show up.
- If the prediction is fairly consistent, we would see a small spread.
There is research around how transit users perceive wait time, and that users perceive longer wait times than what is actually experienced. Decreasing the perceived wait time by providing real-time information has positive benefits for user experience.
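
One way to picture the spread is sketched below for a single trip and stop. It assumes the spread within a time bin is the latest predicted arrival minus the earliest predicted arrival, and it uses simple one-minute bins rather than overlapping two-minute windows. Both choices, and the function name, are assumptions of this example rather than the report's implementation.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def average_prediction_spread_minutes(predictions, arrival, window_minutes=30):
    """Average per-minute spread of predicted arrival times, in minutes.

    predictions: iterable of (message_time, predicted_arrival) pairs.
    Spread per bin = max predicted arrival - min predicted arrival (assumed).
    Returns None if no message falls inside the window.
    """
    window_start = arrival - timedelta(minutes=window_minutes)
    by_minute = defaultdict(list)
    for message_time, predicted_arrival in predictions:
        if window_start <= message_time < arrival:
            minute_bin = int((message_time - window_start).total_seconds() // 60)
            by_minute[minute_bin].append(predicted_arrival)
    if not by_minute:
        return None
    spreads = [(max(p) - min(p)).total_seconds() / 60 for p in by_minute.values()]
    return sum(spreads) / len(spreads)


# Made-up example: the first minute wobbles by 2 minutes, the second is stable.
arrival = datetime(2024, 1, 1, 8, 30)
predictions = [
    (datetime(2024, 1, 1, 8, 0, 0), datetime(2024, 1, 1, 8, 30)),
    (datetime(2024, 1, 1, 8, 0, 30), datetime(2024, 1, 1, 8, 32)),
    (datetime(2024, 1, 1, 8, 1, 0), datetime(2024, 1, 1, 8, 31)),
    (datetime(2024, 1, 1, 8, 1, 30), datetime(2024, 1, 1, 8, 31)),
]
print(average_prediction_spread_minutes(predictions, arrival))  # 1.0
```

Because the spread is a maximum minus a minimum, it can never be negative, which matches the note that this metric only has positive values.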

Prediction Inconsistency Metrics in Report

Goal	Metric Columns
Less wobbly or jittery predictions, to a point. Real-time predictions should reflect traffic conditions and convey updated information to riders, so aiming for zero is not the goal. Higher = predictions change more = worse rider experience. Lower = predictions are not fluctuating minute to minute = riders trust the real-time arrival information.	Prediction Spread (minutes)
Lower padding = riders add less time to prevent missing the bus. Riders adjust their behavior to catch the bus, and add time to adjust for receiving late predictions. Late predictions (negative prediction error values) become the time a rider adds to make sure they don’t miss the bus next time, signaling a lack of trust in the information.	Prediction Padding (minutes): absolute value of the 5th percentile prediction error.
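
Prediction padding, as defined in the table above, can be sketched with the standard library. The function name is hypothetical, the sample data is made up, and the "inclusive" percentile method is an assumption; a different interpolation rule would give slightly different values.

```python
import statistics


def prediction_padding_minutes(errors_minutes):
    """Absolute value of the 5th percentile of prediction errors (minutes)."""
    # n=20 splits the data into 5% steps, so the first cut point is the 5th percentile.
    p5 = statistics.quantiles(errors_minutes, n=20, method="inclusive")[0]
    return abs(p5)


# Made-up prediction errors from -10 to +10 minutes, for illustration only.
sample_padding_errors = list(range(-10, 11))
print(prediction_padding_minutes(sample_padding_errors))
```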

Master Services Agreement

Exhibit H definitions (p. 53 of the PDF)

Item	Report Metric	Definition	Implementation Notes
3. Availability of Acceptable StopTimeUpdate Messages	pct_tu_complete_minutes, n_tu_complete_minutes, n_tu_minutes_available	Percent of time riders have up-to-date prediction information available, calculated as the percent of one-minute time bins for a given trip and stop during a Trip Time Span where there are two (2) or greater GTFS-RT StopTimeUpdate arrival predictions per minute.	Each minute for the 30 minute period before each stop’s arrival for equal comparison across stops
9. Experienced Wait Time Delay	prediction_error_label, avg_prediction_error_minutes	The amount of time a transit rider perceives they have waited after seeing the real-time information in their Journey Planning Application and the arrival of the next vehicle at their stop. This is calculated as the time interval between the next trip to arrive at a stop for a given route_id/shape_id/stop_id combination and the next predicted arrival time from a StopTimeUpdate message for that route_id/shape_id/stop_id combination as sampled for each minute of the day that the route_id/shape_id/stop_id combination is in service.	Use a simpler derived version with average prediction error. Current aggregation does not support route aggregation yet.
23. Measurement Time Windows		A series of 30 consecutive time windows, each starting one (1) minute apart and lasting two (2) minutes.	Each minute for the 30 minute period before each stop’s arrival for equal comparison across stops
27. Prediction Error	avg_prediction_error_minutes	Actual Trip Stop Arrival Time minus the Predicted Trip Stop Arrival Time in seconds. Note that while Prediction Error is not the final metric in this case, it is useful to retain this value into the future and in archival storage in the event that the definition of the frontier defined in Reliable Accuracy is changed in the future based on a specific agency’s needs
28. Prediction Inconsistency	avg_prediction_spread_minutes	How much the prediction changes in the last thirty (30) minutes before a vehicle arrives at a stop, calculated for a given trip and stop as the average Predicted Trip Stop Arrival Spread of all Measurement Time Windows where a given window has a StopTimeUpdate message for the trip and stop with a timestamp in that window.
29. Prediction Reliability	pct_tu_accurate_minutes, n_tu_accurate_minutes, n_tu_minutes_available, pct_tu_predictions_early/ontime/late, n_predictions, n_predictions_early/ontime/late	The percent of time transit riders are looking at a reliably good prediction – understanding that the closer a vehicle is to a stop, the better the prediction should be, calculated as the percent of minutes for each stop for each trip where predictions have Reliable Accuracy, starting sixty (60) minutes before the first scheduled stop for the trip.	Each minute for the 30 minute period before each stop’s arrival for equal comparison across stops
32. Reliable Accuracy	pct_tu_accurate_minutes, n_tu_accurate_minutes, n_tu_minutes_available, pct_tu_predictions_early/ontime/late, n_predictions, n_predictions_early/ontime/late	A prediction has reliable accuracy if: -60ln(Time to Prediction+1.3) < Prediction Error < 60ln(Time to Prediction+1.5).
39. Time to Prediction		The current time until the Predicted Trip Stop Arrival Time in minutes
43. Trip Start Time		Time of the first scheduled stop arrival of the trip per the GTFS Schedule for trips with ScheduleRelationship = SCHEDULED or CANCELED or the first predicted arrival time for other ScheduleRelationship values.
44. Trip Time Span		Time in minutes from the Trip Start Time to the arrival time at the stop being measured	Each minute for the 30 minute period before each stop’s arrival for equal comparison across stops

References

Caltrans GTFS RT Master Service Agreement Contract
- Swiftly provides a prediction accuracy exponential equation
Professor Gregory Newmark’s paper: Assessing GTFS Accuracy
- This project is a work in progress for productionizing and implementing all the ideas presented in this paper.
- This paper provides the basis of policy and planning interpretations around the various metrics.
- We replicate as many of the visualizations and tables as possible.
Yingling Fan, Andrew Guthrie, David Levinson’s paper on Waiting time perceptions at transit stops and stations