Reported: How the NYC Taxi complaints app was created

The story of honking complaints, rules that taxi drivers have to follow and the penalties if they don’t, how 311 complaints are handled, FOIL requests and lots of data.

Part 1: Introduction to the Reported App

There are about 175 million NYC cab trips each year. You can even see all of the ones from 2013. Yet there were only 13,000 consumer complaints in 2013 about drivers (that led to a summons against the driver). To give a sense of scale, there are about 67,000 registered medallion taxi drivers and, as of Dec 2013, about 25,000 of them were leasing about 14,000 yellow cabs.

So there’s about 1 summons for every 13,400 taxi rides. Compare that to how Uber not only prompts users for a star rating at the end of every trip, but also uses this feedback to regularly review drivers. They do not hesitate to penalize drivers with poor reviews. If you can’t keep your rating up, you get cut. It’s why every Uber trip I’ve taken — in SF, Chicago, NY and Phoenix — has been exceptional.

The problem is that the Taxi and Limousine Commission (TLC) isn’t getting the quantity of feedback it needs to get unsafe drivers off the road.

Even David Yassky, former TLC commissioner, was quoted by the WSJ as saying:

Without passenger complaints, we would have a very difficult time enforcing our rules. We simply don’t have enough enforcement personnel to really do the job on issues like cellphone use, distracted driving and passenger refusals.

Reported is an iOS app that lets you submit a taxi complaint in less than 30 seconds and helps you understand exactly what the next steps will be in the complaint process. Reported also pulls together official complaints and correlates them with charges, penalties and outcomes. Now you’ll not only understand your rights as a passenger or pedestrian, but also understand the rule that the driver potentially violated and its penalty, the likelihood of a guilty outcome, and the other charges against the driver.

The app was built for the 5th annual NYC BigApps competition and has a profile page here.

Have you ever witnessed a taxi driver…

  • Honk their horn when there was no emergency
  • Drive recklessly (go through red light, fail to yield, speed, etc.)
  • Ask where you’re going and refuse to drive you there
  • Use a cellphone, text or watch TV while driving
  • Be rude or threatening

If so, you should have called 311 and made an official complaint because these are all illegal. Understandably, most of the time it’s just not worth the hassle of calling or filling out a form online. Sure, you might get upset and fire off a tweet or post something to Facebook, but without an official complaint to 311 and the TLC your voice isn't heard.

Did you know

  • The fine for a driver who honks when there is no emergency is $100
  • The #1 complaint by far (23%) is of drivers refusing passengers which carries a minimum $350 fine
  • The fine for using a cell phone or other electronic devices while driving is $250

Reported aims to allow NYers to report incidents immediately when they happen, as easily as saying what happened, within seconds. A person on our team will transcribe your recording, parse the necessary info from it along with your GPS, contact info and other info and file an official 311 complaint on your behalf. You’ll then be notified about the status of the case as it works its way through the TLC system.

Now your voice can be heard and counted and together we can work within the system to get bad drivers off the road.

What the app does

  • Submit an official 311 complaint in less than 30 seconds and automatically track its status
  • Make it easy to refer to the incident when you are asked about it
  • Look up what the relevant charges and fines are for any incident
  • Look up how common a charge is
  • See how many similar complaints were made in 2012 and 2013
  • Look up a driver and see what complaints have been made against them

This resonates with New Yorkers

I have been actively following Twitter searches for “NYC taxi” and “NYC cab” and anything to/from @nyctaxi (the official TLC Twitter handle) to better understand how frequently people have bad interactions with cabbies.

The results shouldn't be surprising. Here are just a few examples:

If complaints were as easy to make as tweets, I believe the TLC would enjoy a whole lot more community engagement and feedback on drivers which would help them manage the fleets better and get the reckless drivers off the road. That’s what Reported is all about.

I’d like to share our story. It involves lots of data, FOIL requests, a lot of honking complaints, rules that taxi drivers have to follow and the penalties if they don’t, and how 311 complaints are handled.

Part 2: How I got interested in taxi complaints

I know a thing or two about TLC complaints…

I live on Frederick Douglass Circle, up at the corner of W110th St and Central Park West. It’s a lovely place and we enjoy having a view of Central Park. But the downside is the noise. There is a confluence of factors that makes the area very noisy: 110th Street is a commercial route for tractor trailers, the roundabout is poorly designed with lights that are confusing, there are several bus stops, and until recently, there was a gas station next door.

The most frustrating thing was that cars would regularly honk. Red light turned green? Honk. Car trying to make a left out of the gas station? Honk honk. All day long. Seriously, I did an entire analysis of it.

I wanted to have an impact. I noticed a very large percentage of these impatient drivers were either livery cabs or yellow taxis. So I started calling 311 and quickly understood that, in fact, you could make complaints to the TLC about honking and going through red lights, without having to be a passenger and even without video evidence.

I submitted 14 complaints to the TLC about honking cabbies I saw while walking my dog or returning from a run. Several ran red lights. I would make a note of what I saw and the medallion or license and then submit the complaint online when I got home.

I kept a detailed spreadsheet of everything: the description of what happened (called the “complaint narrative”), all the info I submitted online as well as the TLC’s response dates, the service request number, what the outcome was (driver paid summons, dismissed or trial date set), and other notes.

My spreadsheet of taxi complaints.

This spreadsheet would prove very helpful. You see, these complaints are handled through a special court called OATH, so lawyers are the ones who actually deal with the cases. That means an attorney will call and ask something like “I’m calling about an incident you reported on [a date 2 months ago]. Tell me what happened.” I have a poor memory to begin with, but for incidents like these it’s nonexistent. We’re talking about cars going through red lights and honking: small, fleeting stuff that doesn’t burn into your memory the way, say, an assault would. So you answer the best you can, but they might ask additional questions like “was there a car in front of the taxi?”, “how busy was the intersection at that moment?”, or “were there other people waiting at the crosswalk?” These aren’t hard to answer moments after an incident, but after 2 months all you can say is “I don’t remember.” Hence the spreadsheet.

I learned that you absolutely do not need photographic evidence of anything to make your case compelling. You just need details. A complaint with only a few can still go forward, but the more detail you have, the better.

When I saw a taxi honk, I would pull out my iPhone and dictate a note to Siri in the notepad. Here’s an example:

I had just finished a run and was waiting at the South side of the East side of Frederick Douglass Circle waiting for the light so I could cross to the North side of W110th St. As the light turned from yellow to red, a yellow cab barely made it through in time, and then was followed by a white livery sedan that just drive through the red light (there was already a walk signal when he sped through). I was waiting to walk north so I had a clear line of sight both to the sedan as well as the traffic light.

I noticed when I reported honking, the attorneys would often ask how I knew that particular car was the one that honked. I love this question for two reasons:

First, it all but admits that honking is a completely worthless thing to do because even a pedestrian watching from the curb might not reliably tell which car was honking. It basically says that honking is ambient noise and determining its source is difficult. Then why exactly do we allow horns in the city? That’s another issue.

Second, it makes your ears seem like very inaccurate sensors. Your ears are not microphones. They can triangulate sound. They’re pretty damn good at determining the location of a sound. 110th street only has 1 westbound and 1 eastbound lane, so it’s not like I’m standing in the middle of the Holland Tunnel entrance during rush hour with 1,000 cars honking and no way to figure out who’s making noise.

To handle these clarifying questions, I started including detail on how I could tell which car was honking. “I had a clear line of sight” or “I was 20 feet from the car” or “I saw the driver’s hand.”

But I digress…

Ultimately, these resulted in $850 in fines.

I calculated a few other interesting facts about my 14 complaints:

  • It took an average of 51 days for the complaint to resolve.
  • 8 were found guilty (6 immediately paid the summons)
  • 4 were ultimately dismissed: 1 driver had an expired license (what?!), 1 driver couldn’t be identified, and 1 was the wrong medallion (my fault, I misread the number). And after 1 scheduled hearing, the attorney called me and said she decided to dismiss the case because there wasn’t enough evidence, which I was not happy about since all the information was clearly explained.
  • 1 is still open
  • 1 was found not guilty

In the only hearing that I participated in, the driver was found not guilty. Here’s what happened:

The TLC prosecutor calls me and says he’s sitting in a room with the driver, driver’s defense, and an administrative judge. I’m on speaker phone and they are recording it.

I’m sworn in when the judge says “raise your right hand. Do you swear to tell the truth.” (That was all she said).

The TLC attorney says “direct the witness to the events of December 24”.

I explain my story which was essentially:

I just finished walking my dog and was standing on the sidewalk approaching my apartment when I watched 3 cars queued up on the westbound side of 110th street at the red light. The light turned green and the first car, a green taxi, proceeded through. The 2nd car, a white livery car, delayed for a split second and that was when the driver of this yellow taxi honked once at that white sedan. Honking is for emergencies only and this was just impatience. I was walking on the North side of 110th towards the light, so these cars were directly to my left.

“Was there an emergency”

“No there wasn’t.”

“How did you determine this driver was the one who honked?”

“I had a clear line of sight to the vehicle and read the medallion number and took down a note on my phone before going to my apartment and logging the complaint.”

“No further questions for the witness. Now I’d like to direct attention to other evidence:”

  • The trip sheet that shows the driver was working that night and these were his pickups before and after.
  • Here is the narrative of the original 311 complaint
  • Here is Section 375(1)(a) of the NY State Vehicle and Traffic Law, which says “[a horn] shall not be used other than as a reasonable warning nor be unnecessarily loud or harsh.”
  • And NYC Traffic Rules Section 4-12(i), which says “(i) Horn for danger only. No person shall sound the horn of a vehicle except when necessary to warn a person or animal of danger.”

The driver then had his turn to explain the incident and his defense. I couldn’t understand much of what he said because I was on speakerphone and he didn’t speak clear English. But he said something like “I don’t want an accident,” so he will sometimes honk at drivers who are swerving into his lane. This was tough to swallow because 110th street is just one lane and he was the 3rd car at a RED LIGHT. So no one swerved at all. It was a very common (incredibly obnoxious) incident of “the light turned green, so I’m going to tell the driver in front of me to go.” But there were no more questions to me about the incident.

I was thanked for my testimony and the call ended.

About 30 minutes later the TLC attorney called me and gave me the news. The driver had been found not guilty.

I can almost guarantee the defense for the driver advised him to say something about the swerving and talk about how he’s trying to avoid accidents. The problem is that wasn’t true since I watched the whole thing and took notes. You win some you lose some.

Complaints don’t end up in a black hole

I think a lot of people suspect complaints to 311 end up in an abyss and city employees simply disregard them. But that’s not true at all. The most surprising thing I found was that each and every complaint is handled by a TLC prosecuting attorney and is taken seriously. I learned a lot through phone calls with several of the lawyers handling my cases because I spent time asking them additional questions about the process.

Most incidents end with a summons to a driver who then pays the fine. I have tracked how long it takes to get responses, the resolutions of all my complaints, and I only testified in a trial once (which was a 5 minute phone call).

There are some previous articles about how the TLC handles complaints.

Consumer complaints are a critical piece of how the TLC enforces rules. A 2011 WSJ article quoted TLC Commissioner David Yassky:

“Without passenger complaints, we would have a very difficult time enforcing our rules,” he said. “We simply don’t have enough enforcement personnel to really do the job on issues like cellphone use, distracted driving and passenger refusals.”

A 2009 NYT article describes a bit more about the hearing process. But I can provide a lot more detail here:

The life of a TLC complaint

  1. You submit a taxi complaint online or via 311.
  2. The TLC checks the taxi’s GPS to ensure it was in the area where you said the incident happened. They also check who was driving at the time (several drivers can have access to one medallion taxi).
  3. If they can identify the driver and location, within 5-7 days a TLC attorney will call you to confirm your story (to ensure you meant what you said). The conversation is very quick. They basically ask “can you tell me what happened on that day” and you tell them what you wrote or said when you submitted the initial complaint. The more detail the better.
  4. If sufficient evidence is presented, the attorney will try to categorize your “complaint narrative” into one or more explicit charges. For example, if you say “this driver asked me where I was going and then didn’t take me to Brooklyn,” that will align with the charge 54-20(a)(1), which carries a $350 fine.

A summons with the charges is sent to the driver and he has 6 weeks to respond. He can either:

A. Pay the fine directly and the complaint is closed. Most complaints end like this. (In fact, this is what led me to FOIL the violation data — to better understand the typical outcomes for various charges and how common certain charges are, as I describe in Part 3.)

B. Plead not guilty. If the driver pleads not guilty, a court date will be set for the hearing. You will likely be asked if a certain date works for you and your preference for a particular block of time. The driver has to physically show up. You don’t, nor should you. You will be given a 1 hour window of time in which you’ll be called to give your testimony over the phone. There will be a specific time set for the hearing but the driver has up to 1 hour to show up. (In 2 of 3 hearing dates scheduled for my complaints, the driver didn’t even show up and was automatically found guilty.)

5. A day or two before the hearing, the TLC attorney who was assigned your case will give you a call and possibly go through a “mock hearing” where you’ll be asked questions as though it’s the real thing. Not a big deal.

6. When your court date arrives, assuming the driver shows up, the phone call takes about 10-15 minutes. You’re sworn in and asked to provide your story. The prosecutor will ask a few additional questions (nothing you wouldn’t already have discussed). Then you hear the defense and the driver might provide his own testimony about the story. When that is finished, you are thanked for your time and dismissed.

7. A verdict is rendered by a TLC tribunal judge (technically through OATH) and the TLC attorney will give you a call explaining what happened. You’ll also receive a notice in the mail or email about it. The complaint is then closed.

Part 3: Making sense of TLC data

Note: We plan to publicly release all of the data used in our app.

The TLC is sitting on a lot of data. You just have to ask and you can get it.

A lot of the value of Reported is in the data. When New Yorkers submit complaints about a taxi driver, it often feels like they are the only ones who have ever gone through the process. There is never a mention of how these things get processed, how many similar complaints were made or what the typical outcomes are. Those were the key questions I wanted to find out. Here’s a breakdown of all the data I accumulated — some through public sources and some through FOIL requests — to better understand outcomes of taxi complaints.

Having seen how complaints make their way through 311 and the TLC, I knew there was a bunch of data that related to the types of charges brought against drivers as well as the outcomes. I submitted a FOIL request for TLC data. Specifically:

TLC complaints filed in 2012 and 2013 via 311 or the TLC website with the following data: Date filed, Date & time of incident, LOCATION of incident, taxi medallion or license (if livery car), DESCRIPTION of incident, TLC response, Final status (ie “driver pled guilty”), and any additional information available

A few weeks later, here’s what I got:

2012 & 2013 Complaints and Violations

I received a spreadsheet that, with a little cleaning up, listed ~21,000 consumer complaints constituting ~32,000 violations (a single complaint can have multiple violations). A violation is something that neatly correlates to an exact charge from the NYC Rulebook (below). They all have disposition codes, which are internal identifiers for the status or outcome of the violation: essentially guilty, not guilty or dismissed.

I spoke with the TLC FOIL coordinator at length about exactly what each of the fields meant and I learned that I had only requested consumer complaints. I asked “can I get all the violation data?” That is, all the internal complaints made when, for example, the TLC’s own enforcement assesses their drivers, or the NYPD pulls over a car, or a TLC enforcement agent writes a summons for a driver doing something illegal at the airport.

With this request, I got another Excel file with ~95,000 complaints connected to ~108,000 violations. Here’s what that looks like:

Complaints table
Violations table

Ok so let’s quickly deconstruct the Complaints table:

  • Lic_no: the official hack license number for the driver (as all complaints are against drivers)
  • Vio_type: the type of violation. For consumer complaints that would be CV and CC. For others, I’m not sure what the codes mean.
  • Vio_number: identifier for the specific violation which is tied to a single charge.
  • Vio_date: Date the violation happened (or was reported)
  • Appear_date: If it went to trial, this is the date the driver appeared in court for his hearing

Next up is the Violations table:

  • Vio_number: the violation id number.
  • Charge_code: this is the specific charge that was filed. A charge code will correlate to one of the NYC rulebooks, such as Chapter 54. This is really important because it tells you exactly what rule was potentially broken, the description of the rule and the penalty and fine for it.
  • Disp_code: This is the disposition code. It is essentially the outcome for that particular violation, ie was he guilty or not.
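To make the relationship between the two tables concrete, here is a rough Ruby sketch of joining them on Vio_number. It is only an illustration under assumptions: the Struct names, the join_complaints helper and the idea of loading the rows into memory are hypothetical, and the field names simply mirror the column descriptions above.

```ruby
# Hypothetical in-memory shapes for the two tables described above.
Complaint = Struct.new(:lic_no, :vio_type, :vio_number, :vio_date, :appear_date)
Violation = Struct.new(:vio_number, :charge_code, :disp_code)

# Join each complaint row to its violation rows via the shared vio_number.
# A vio_number with no match in the violations list yields no output rows.
def join_complaints(complaints, violations)
  by_number = violations.group_by(&:vio_number)
  complaints.flat_map do |complaint|
    (by_number[complaint.vio_number] || []).map do |violation|
      {
        lic_no: complaint.lic_no,
        vio_type: complaint.vio_type,
        vio_date: complaint.vio_date,
        charge_code: violation.charge_code,
        disp_code: violation.disp_code
      }
    end
  end
end
```

Joined this way, every row carries the driver, the charge and the outcome together, which is the shape needed to ask how often a given charge ends in a guilty disposition.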

Disposition Codes

I also received a poorly copied PDF of the disposition codes with no explanation of what they meant.

Disposition codes

I had to manually type them into a table. I spoke with the TLC at length about what some of them meant and understood that they are something of an internal record keeping system for OATH, the Office of Administrative Trials and Hearings — a separate court system for city agencies so they can expedite these sorts of violations and not clog up the civil court system. A very good idea indeed! I learned that drivers could appeal if they were found guilty. Or if a driver had to pay a fine, but couldn’t pay all at once (such as a $3,000 fine), that would be “Open, Settlement Accepted”. There are a lot of oddities hidden in these codes but I tried to get the gist of which meant guilty and not guilty.

  • CSA and OSA: means the driver immediately accepted the summons
  • CDH, CDR, CWC: are basically not guilty because they are dismissed or withdrawn. Based on the data these show up most frequently.

This would be a partial query (one column of a larger SELECT) for the percent of guilty outcomes:

100 * SUM(CASE WHEN d.disposition_code IN ('CAD', 'CDI', 'CGA', 'CGH', 'CGI', 'CGP', 'CSA', 'CSW', 'OGH', 'OGI', 'OGP', 'OOT', 'OSA') THEN 1 ELSE 0 END)
    / SUM(CASE WHEN d.disposition_code IN ('CDH', 'CDR', 'CWC', 'CAD', 'CDI', 'CGA', 'CGH', 'CGI', 'CGP', 'CSA', 'CSW', 'OGH', 'OGI', 'OGP', 'OOT', 'OSA') THEN 1 ELSE 0 END)
    AS "Percent guilty outcomes",
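The same calculation can be sketched in Ruby, which makes the grouping of codes easier to see. The two code lists are copied from the query above; the constant names and the percent_guilty helper are hypothetical, and any code outside both lists is simply left out of the denominator, as it is in the SQL.

```ruby
# Disposition codes treated as guilty and as not guilty, copied from the query above.
GUILTY_CODES = %w[CAD CDI CGA CGH CGI CGP CSA CSW OGH OGI OGP OOT OSA].freeze
NOT_GUILTY_CODES = %w[CDH CDR CWC].freeze

# Percent of decided violations that ended in a guilty outcome.
# Codes in neither list are ignored; returns nil when nothing was decided.
def percent_guilty(disp_codes)
  guilty = disp_codes.count { |code| GUILTY_CODES.include?(code) }
  not_guilty = disp_codes.count { |code| NOT_GUILTY_CODES.include?(code) }
  decided = guilty + not_guilty
  return nil if decided.zero?

  100.0 * guilty / decided
end
```

Unlike the SQL, this version uses floating-point division, so it does not truncate to a whole percent.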

Charges (The NYC Rulebook)

Chapter 54 — One of many rules, 54-14(e)(1)

This is just one charge in Chapter 54 of the NYC Rulebook. It’s 70 pages of specific rules and regulations that taxi drivers must follow and it spells out exactly what constitutes a violation and the penalties for the infraction. There are actually several such chapters: Chapter 55, 59, 60, 02 and more!

I converted the relevant charges that showed up often in the data into a table because, honestly, it’s just a spreadsheet masquerading in this PDF. The list of charges (from the violations table) maps directly to these charge codes. Any complaint submitted must be categorized into at least one charge. And all charges have a specific penalty. This document tells you exactly what rules a driver may be breaking. This is also the key to understanding just what kind of justice would be served in the end.

Table of Medallions (vehicle owners) and the drivers (licenses) who lease those cars

I searched for something that connected vehicles to drivers and found it here.

Drivers who lease medallion vehicles. The key to linking drivers and medallion numbers!

Unfortunately, PDFs aren’t the best format and it didn’t look like the TLC had a clean Excel file to share. So I took the following steps to clean things up:

  1. Saved the PDF as XML — this at least made things much more structured and easier to manage.
  2. Many many search/replacements — to clear leading spaces, newlines, odd characters, etc, so that I could…
  3. Paste it into Excel in a structured format
  4. In order to fill the medallion number and owner name into each of the rows for the TLC driver info, I added an Excel formula to check and copy. Nifty. =IF(ISNUMBER(F3),IF(D2<>0,D2,A2),0)
  5. One irksome fact about Excel is that if you paste something like “4E12”, the field will auto-annoyingly convert it to exponential format, like 4 * 10^12. Even if you convert the cell format to “text”, it doesn’t fix the problem. So I used a little trick to convert a large number (ie 3000000000000) back to a medallion number. It takes the leftmost digit, adds “E”, and then takes the LOG of the number divided by that leftmost digit (ie 3) to get just the exponent. And then, because Excel is so annoying, I added a “.” so it wouldn’t convert it back to a number. =LEFT(B2,1) & "E" & LOG10(B2/LEFT(B2,1)) & "."
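For anyone who would rather skip Excel for steps 4 and 5, the same two fixes can be sketched in Ruby. This is not what was actually done here, just an equivalent under assumptions: the row layout (hashes with a :medallion key that is nil on driver rows) and both helper names are hypothetical, and zero-padding the exponent to two digits assumes medallions like “4E12” always have two digits after the letter.

```ruby
# Step 4 equivalent: carry the most recent medallion down onto the driver rows beneath it.
def fill_down_medallion(rows)
  current = nil
  rows.map do |row|
    current = row[:medallion] if row[:medallion]
    row.merge(medallion: current)
  end
end

# Step 5 equivalent: turn a number such as 3000000000000 back into "3E12".
# Returns nil when the value is not a single digit times a power of ten.
def restore_medallion(value)
  number = Float(value)
  scientific = format("%.0e", number) # e.g. 3000000000000.0 becomes "3e+12"
  match = scientific.match(/\A(\d)e\+(\d+)\z/)
  return nil unless match

  lead = match[1]
  exponent = match[2].to_i
  # Reject values that only look right because of rounding, such as 3500000000000.
  return nil unless Float("#{lead}e#{exponent}") == number

  format("%sE%02d", lead, exponent)
end
```

Working on strings in a script avoids the trailing “.” trick entirely, since nothing tries to reinterpret the result as a number.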

Unused, but cool data…

175 Million Taxi Rides in Google Bigquery

There has been a lot of exciting work done with taxi data recently. Chris Whong chronicled his FOIL request for all 180 million taxi trips in 2013. Then someone posted the 30GB of data into Google’s BigQuery machine which can be queried online in seconds.

This is great, but the hack licenses (drivers) and medallions (vehicles) are hashed (anonymized), so it wasn’t very useful for my app. But then, Vijay Pandurangan wrote a wonderful post explaining that the hashes were simply one way MD5 hashes of the original strings or numbers and could easily be uncovered in a sort of brute force method.

I had a feeling this was the way the data was set up. I was able to replicate this in Ruby with effectively one line of code:

require 'digest'
Digest::MD5.hexdigest(license_or_medallion).upcase  # license_or_medallion is the plain string, e.g. '1A10'
Example: 1A10 is converted to '35A2BDABB3011645C2F7612ABA558C76'
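The brute force part works because the space of possible medallion numbers is tiny. As a rough sketch, assuming medallions follow the digit, letter, two digits pattern seen in examples like 1A10 and 4E12 (other formats exist and are not covered here), you can hash every candidate once and keep a reverse lookup table. The build_medallion_lookup name is hypothetical.

```ruby
require "digest"

# Hash every candidate of the form digit + letter + two digits (e.g. "1A10")
# and index the plain text by its uppercase MD5 hex digest.
def build_medallion_lookup
  lookup = {}
  ("0".."9").each do |digit|
    ("A".."Z").each do |letter|
      (0..99).each do |number|
        plain = format("%s%s%02d", digit, letter, number)
        lookup[Digest::MD5.hexdigest(plain).upcase] = plain
      end
    end
  end
  lookup
end

# Usage: look up a hashed value taken from the trip data.
# lookup = build_medallion_lookup
# lookup[some_hashed_medallion]  # returns the plain medallion, or nil if the format is not covered
```

That is only 26,000 hashes for this pattern, which takes well under a second to compute.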

Look up in Bigquery

SELECT *
FROM [833682135931:nyctaxi.trip_data]
WHERE medallion = '35A2BDABB3011645C2F7612ABA558C76';
Query for a hashed medallion number

What this unlocks for Reported is the ability to connect trip data to drivers and medallions. The most obvious one in my mind was the number of trips a driver made that had an average speed of greater than 45mph. We’ll work on that soon.
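As a sketch of what that speed check could look like once the trips are pulled out of BigQuery: average speed is just distance divided by time. The Trip struct and helper names below are hypothetical, and the field names (hack_license, trip_distance in miles, trip_time_in_secs) are an assumption about the trip data's columns, so check them against the actual table before relying on this.

```ruby
# Hypothetical shape for one row of trip data.
Trip = Struct.new(:hack_license, :trip_distance, :trip_time_in_secs)

# Average speed in miles per hour, or nil when the duration is missing or zero.
def average_mph(trip)
  seconds = trip.trip_time_in_secs.to_f
  return nil if seconds <= 0

  trip.trip_distance.to_f / (seconds / 3600.0)
end

# Count, per driver, the trips whose average speed exceeds the threshold.
def fast_trip_counts(trips, threshold_mph = 45.0)
  counts = Hash.new(0)
  trips.each do |trip|
    mph = average_mph(trip)
    counts[trip.hack_license] += 1 if mph && mph > threshold_mph
  end
  counts
end
```

An average above 45mph over a whole trip is a blunt signal, since highway trips will trip it too, so the counts are better read as a starting point than as proof of speeding.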

Phone number and Garage of a Medallion

Turns out there is an API to retrieve information about a medallion. On this page you can provide a medallion and get info about it. That’s just linked to this simpler API call:

311 Service Requests (2010-present)

This data file was over 1 GB, so I wasn’t able to simply open the CSV in Excel and filter for just TLC complaints. That would have been too easy. So I wrote a little ruby script to open the file and copy only the relevant lines to a new file.

This took a few minutes to run and produced a much more manageable 27k line CSV file (13MB).
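The original script isn't reproduced here, but a minimal version of the idea might look like the sketch below. It streams the file one line at a time so the whole CSV never has to fit in memory. The assumptions: the first line is a header, the agency code is the 4th column, TLC rows carry the value "TLC" there, and the columns before it contain no commas. The filter_tlc_lines name and the file names in the usage comment are hypothetical.

```ruby
# Assumed zero-based position of the agency column in the 311 export.
AGENCY_COLUMN = 3

# Copy the header plus every line whose agency column matches, reading one line at a time.
# Returns the number of data lines kept.
def filter_tlc_lines(input, output, agency: "TLC")
  kept = 0
  input.each_line.with_index do |line, index|
    if index.zero?
      output.write(line) # keep the header row
      next
    end
    # Only split far enough to isolate the agency column; the rest of the line stays intact.
    fields = line.split(",", AGENCY_COLUMN + 2)
    next unless fields[AGENCY_COLUMN] == agency

    output.write(line)
    kept += 1
  end
  kept
end

# Usage with hypothetical file names:
# File.open("311_service_requests.csv") do |input|
#   File.open("tlc_complaints.csv", "w") { |output| filter_tlc_lines(input, output) }
# end
```

A plain line filter like this will mishandle any record whose quoted description contains a line break, so a proper CSV parser is the safer choice if the free-text fields matter.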

Daily Hearings at OATH

I just stumbled across this while looking up a taxi driver’s license. This is OATH’s “Chronological Hearing Report”, a daily schedule of all the drivers who have hearings scheduled with OATH (the Office of Administrative Trials and Hearings, basically a court that handles city hearings so they don’t clog up the civil courts).

OATH Daily Hearings Report

I would love to build a little daily script that scrapes this and converts it into an API for much more current statuses of drivers. Unfortunately you don’t get the outcomes of the hearing and this only covers violations that lead to court dates (most never get scheduled for a hearing). But it’s definitely a start!

Part 4: The App and BigApps

So that’s why we built Reported, an iOS app that lets you submit a taxi complaint in less than 30 seconds and helps you understand exactly what the next steps will be in the complaint process.

We know New Yorkers are busy so Reported is designed to let you submit a complaint on-the-go, within a few minutes of it happening… so it’s accurate and timely.

Note: When a complaint gets submitted via Reported, an email is sent that contains all your relevant info to a Reported team member who manually submits it on your behalf using the 311 online form. About 5-10 minutes after you submit it on the app, you’ll get an update that your complaint was successfully submitted along with the Service Request number from 311 that lets you check the status.

Value for New Yorkers

Reported is an app designed by New Yorkers for New Yorkers — particularly:

  • People who regularly take taxis
  • People who regularly have bad interactions with taxis (ie running a red light outside their apartment)

While there has been a lot of recent interest in taxi trip data, virtually no one has analyzed the violation data or these rulebooks.

By making it significantly easier for New Yorkers to more regularly submit complaints about dangerous taxi drivers, the TLC will do a better job of tallying points and sending reckless drivers to the Critical Driver Program or taking them off the streets.

Even a small number of beta testers for Reported could bring the number of legitimate TLC complaints up by a lot. Some people have asked if this is a business. No it’s not. It is a passion project that we are excited to work on.

A bit of control in a city that pushes you around

People often feel powerless in this city. A lot happens and you have no control over it. The interesting thing about taxi complaints is that regular people can participate in bringing reckless taxi drivers to justice by reporting them.

My 14 complaints led to $850 in fines. My wife submitted a complaint for a driver who ran a red light and nearly hit her and our dog. That driver got hit with 4 points and a $2,100 fine.

These drivers will be motivated to be more conscientious of pedestrians and passengers. Likewise, New Yorkers will feel empowered by Reported by better understanding and engaging in the complaint process and NYC streets will improve for everyone.

This project was a collaboration with Josh Weitzman, who did all the iOS development and design.

Questions? Comments? Email me at jeffnovich [at] gmail