Exploring San Francisco’s Public Data with BigQuery

Reto Meier
Google Cloud - Community
6 min read · Mar 7, 2017

San Francisco. Fog City. The City by the Bay. No matter what you call it, those 47 mi² are home to over 800,000 people about whom we can draw outrageous conclusions using the new San Francisco public dataset in BigQuery.

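Thanks to the City and County of San Francisco’s SF OpenData project and Bay Area Bike Share, Google BigQuery’s Public Datasets program now includes San Francisco public data. The tables explored in this post cover SFPD incident reports, 311 service requests, fire department calls, street trees, and bike share trips.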

SF Protip #1: Don’t leave valuables in your car when parked on Bryant. Especially on Friday. Definitely not before lunch.

In 2016, stolen cars and thefts from locked cars accounted for over 21% of all SFPD crime incidents.

SELECT
  -- Share of 2016 incidents that were stolen automobiles
  ROUND(100 * COUNTIF(STRPOS(descript, "STOLEN AUTOMOBILE") > 0)
    / COUNT(*)) AS stolen_pct,
  -- Share of 2016 incidents that were thefts from locked cars
  ROUND(100 * COUNTIF(STRPOS(descript, "THEFT FROM LOCKED AUTO") > 0)
    / COUNT(*)) AS theft_pct
FROM
  `bigquery-public-data.san_francisco.sfpd_incidents`
WHERE
  -- Exclude non-crime categories from the denominator
  category NOT IN ("NON-CRIMINAL", "SECONDARY CODES", "WARRANTS")
  AND EXTRACT(YEAR FROM timestamp) = 2016

Reports of thefts from locked cars peak on Friday at lunch-time and after-work, presumably as folks are arriving at their cars to find them ransacked. Parking on Bryant is easily the best way to maximize your chances of having your stuff stolen.
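The day and hour pattern can be explored with a query along the following lines. This is a minimal sketch rather than the query behind the original chart: it reuses the `descript` and `timestamp` columns from the query above, and it assumes the table is reachable as `bigquery-public-data.san_francisco.sfpd_incidents`. Both `FORMAT_TIMESTAMP` and `EXTRACT` default to UTC, so if the timestamps are not already stored as local time you may need to pass a time zone such as "America/Los_Angeles".

```sql
-- Sketch: which day/hour combinations see the most reported
-- thefts from locked cars in 2016?
SELECT
  FORMAT_TIMESTAMP("%A", timestamp) AS day_of_week,  -- e.g. "Friday"
  EXTRACT(HOUR FROM timestamp) AS hour_of_day,       -- 0 to 23
  COUNT(*) AS reports
FROM
  `bigquery-public-data.san_francisco.sfpd_incidents`  -- assumed table path
WHERE
  STRPOS(descript, "THEFT FROM LOCKED AUTO") > 0
  AND EXTRACT(YEAR FROM timestamp) = 2016
GROUP BY
  day_of_week, hour_of_day
ORDER BY
  reports DESC
LIMIT 10
```

Sorting by `reports` puts the busiest day and hour combinations at the top, which is enough to check whether Friday lunchtime really stands out.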

If you’re looking to minimize your chances of being a victim of grand theft from a locked auto, this heatmap shows where most thefts from cars have been reported — avoid parking there.

SF Protip #2: If you lose something on Bryant, try looking for it on Valencia and Eddy.

Nearly 20% of all lost property was reported on Bryant Street. The best odds for finding something are over on Valencia and Eddy, where more than twice as many items are reported found as lost.

You’re most likely to come back to your car and find it missing entirely if you parked on Mission Street. Especially on Friday.

SF Protip #3: Sunset Boulevard is the leafiest street in San Francisco.

You can get a taste of Monterey by visiting Sunset Boulevard — the leafiest street in San Francisco, with more trees than any other street, including nearly 1,400 Monterey Pines and Monterey Cypress. If you’re looking for a shady spot, the heatmap below shows the density of tree distribution throughout the City.

SF has a greater density of trees than NYC, featuring 2,637 trees per square mile compared to NYC’s 2,242.

Like New York, the most common tree in San Francisco is the London Plane Tree, which represents 7% of all SF trees, compared to 13% of the trees in New York. San Francisco’s trees are much more diverse: you’ll find 492 different species, compared to New York’s 183.

San Francisco can boast at least one example of each of New York’s top 7 tree species, whereas New York has none of San Francisco’s top species apart from the London Plane.

SF Protip #4: San Francisco residents complain more than New Yorkers.

Per capita, San Francisco uses its 311 municipal complaint line more often than NYC. The most common complaints in SF are requests for sidewalk cleanups, rubbish pickups, and graffiti removals.

SF Protip #5: There’s more graffiti, but at least it’s getting more polite.

The biggest growth in complaints is in the categories that already represent the largest volume, with bulky item pickup requests and illegal homeless encampments rising most quickly.

Note that non-offensive graffiti is on the rise, while offensive graffiti is the fastest-dropping complaint category. So either San Franciscans are writing less offensive graffiti, or the residents have grown immune to the current attempts to offend.

SF Protip #6: Steep hills make for slower bike rides.

Riders of the Bay Area Bike Share network average 14 minutes per ride — nearly the same as the 15 min average for New York Citi Bike riders. San Francisco is a smaller town though, and riders average only 1.4 km per trip, compared to New Yorkers, who go 1.8 km.

Or put another way, San Francisco cyclists are 0.3 m/s slower than their New York counterparts. I blame the steep hills.
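The speed comparison is simple arithmetic on the averages quoted above, and BigQuery will happily do it without touching a table. The sketch below only uses the rounded figures from this post (14 minutes and 1.4 km for San Francisco, 15 minutes and 1.8 km for New York), so it reproduces the estimate rather than recomputing it from trip data. The column names are made up for illustration.

```sql
-- Average speed = distance in metres / duration in seconds
SELECT
  ROUND(1400 / (14 * 60), 2) AS sf_metres_per_sec,
  ROUND(1800 / (15 * 60), 2) AS nyc_metres_per_sec,
  ROUND(1800 / (15 * 60) - 1400 / (14 * 60), 2) AS difference_metres_per_sec
```

That works out to roughly 1.67 m/s for San Francisco and 2.0 m/s for New York, a gap of about 0.33 m/s.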

SF Protip #7: Marijuana legalization will reduce arrests in Haight-Ashbury by 7%.

Marijuana represents 26% of all drug-related arrests in San Francisco. In 2016, it was overtaken by meth as the most likely drug involved in an arrest.
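A share like this can be approximated with the same `COUNTIF` pattern used for car thefts. The query below is a hedged sketch, not the query behind the figure above: it assumes drug incidents are labelled with the category "DRUG/NARCOTIC", that marijuana cases can be spotted by the word "MARIJUANA" in `descript`, and that the table is reachable as `bigquery-public-data.san_francisco.sfpd_incidents`. It also counts all reported incidents rather than only those that ended in an arrest, so the result may differ from the number quoted here.

```sql
-- Sketch: what share of drug-related incidents mention marijuana?
SELECT
  ROUND(100 * COUNTIF(STRPOS(descript, "MARIJUANA") > 0)
    / COUNT(*)) AS marijuana_pct
FROM
  `bigquery-public-data.san_francisco.sfpd_incidents`  -- assumed table path
WHERE
  category = "DRUG/NARCOTIC"  -- assumed category label
```

Running `SELECT DISTINCT category` against the table first is a quick way to confirm the label before trusting the percentage.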

Overall, drug crime is down 62% since 2003, with fewer arrests for all drugs — except methamphetamine, which is up 14%.

The map below shows all areas where there have been at least 5 drug-related arrests since 2003. Red pins are for marijuana, blue for crack cocaine, pink for methamphetamine, green for heroin, and yellow for cocaine.

The larger pins indicate over 200 drug-related arrests in the same location.

The Tenderloin is the center of San Francisco’s drug arrests, particularly for crack cocaine — the arrests for which are clustered around specific locations; methamphetamine arrests, however, are more likely to follow roads.

Marijuana arrests spike in the Haight-Ashbury neighborhood, where they account for 7% of all arrests.

SF Protip #8: The best place to have your house catch fire is Chinatown. The most likely place is the Tenderloin. If you’re lucky they’ll dispatch Engine 3.

The average response time between calling 911 and firefighters arriving at your burning building is 6 minutes 8 seconds, with units arriving at Chinatown blazes at an impressive city-wide best of 4 minutes and 30 seconds.

The fastest fire truck in the SFFD is Engine 3, based in the 4th battalion, with an average response time of just over 3 minutes.

This is a taste of what you can explore and find using this new San Francisco dataset.

Join us here every week for Today I Learned with BigQuery, as we dig into each of these tables in more detail, use the NOAA weather tables to explore the effect of weather on crimes and 311 calls, and compare what we know about San Francisco with other cities, starting with New York.

If you’re new to BigQuery, follow these getting started instructions, and remember that everyone gets 1 TB at no charge every month to run queries. When you’re done, remember to share the results with us using #TILwBQ.

Developer Advocate @ Google, software engineer, and author of “Professional Android” series from Wrox. All opinions are my own.