California solar. How much do we have right now?

My job is to help homeowners go solar from start to finish, and I have often wondered just how much solar we have in the residential sector. Unfortunately, the data I found was sparse at best and often missing entirely. In lieu of residential solar capacity, I opted to investigate utility-scale solar. I’ve become very familiar with how residential solar works through my job, but I have not had much interaction with solar at the utility scale. I am curious how much large-scale solar we can generate and where we are producing it. I thought it might be helpful to contrast that capacity with the current infrastructure for fossil fuels, specifically coal and natural gas plants. I also include crude oil refineries and pipelines even though petroleum is not used to generate electricity. I think oil is relevant as a gauge of how invested we are in fossil fuel infrastructure, since that investment can also pose a barrier to adopting renewables. For the sake of brevity I do not include other renewables in this analysis, but I may incorporate wind later because of its increasing relevance within California’s renewable energy portfolio.

Some moderate data manipulation was needed in order to understand the distribution of solar projects in the state. The data was obtained from the US Department of Energy and the US Census Bureau. First I needed to join the power plant data to the geospatial county data using the following query:

-- Attach each county's solar plant count and solar potential to its geometry
SELECT ca_counties.the_geom_webmercator, pp.solarplants, pp.county,
       pp._olar_otential_ AS solarpotential
FROM ca_counties
INNER JOIN solarplants_bycounty AS pp
  ON ca_counties.name = pp.county

Then I had to aggregate the information by county:

-- Count solar plants and total their capacity (MW) for each county
SELECT county, COUNT(cartodb_id), SUM(mw)
FROM power_plants
WHERE general_fuel = 'Solar'
GROUP BY county

The query above produces the base layer used to indicate utility-scale solar projects grouped by county. This data is represented as the varying shades of yellow within each county on the map. A darker shade indicates a higher number of solar power plants within that county, while gray indicates zero. You can click on the map to get more details about any one feature. One figure which I consider particularly insightful is the “solar potential”, which you can view for any county by clicking on it. This represents the sum total of the potential output for all solar power plants in the county. The figure is in megawatts (MW), which is convenient because it allows us to compare capacity across different fuel types. For example, if a county has a solar capacity of 100 MW and a coal plant rated at 20 MW, then it has five times as much solar capacity as coal capacity. Keep in mind that this compares rated capacity rather than the energy actually generated, which also depends on how many hours each plant runs.
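To make the aggregation and the capacity comparison concrete, here is a minimal Python sketch of the same logic. The rows, the county names, and the helper capacity_by_county are made up for illustration and are not taken from the actual dataset; only the column meanings (county, general_fuel, mw) mirror the query.

```python
from collections import defaultdict

# Hypothetical sample rows (county, general_fuel, mw); not real data.
plants = [
    ("Example A", "Solar", 60.0),
    ("Example A", "Solar", 40.0),
    ("Example A", "Coal", 20.0),
    ("Example B", "Solar", 25.0),
]


def capacity_by_county(rows, fuel):
    """Return {county: (plant_count, total_mw)} for one fuel type.

    Mirrors COUNT(...) and SUM(mw) with GROUP BY county.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for county, general_fuel, mw in rows:
        if general_fuel == fuel:
            totals[county][0] += 1
            totals[county][1] += mw
    return {county: (count, mw) for county, (count, mw) in totals.items()}


solar = capacity_by_county(plants, "Solar")
coal = capacity_by_county(plants, "Coal")

# Ratio of installed capacity, not of energy actually generated.
ratio = solar["Example A"][1] / coal["Example A"][1]
print(solar["Example A"], ratio)
```

With these made-up numbers the county has 100 MW of solar against 20 MW of coal, which is the five-to-one capacity ratio from the example above.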

On top of the solar choropleth (a fancy word for a map whose changing colors represent the value of its features), I could then layer contrasting energy sources to gain perspective on how they compare. Below you can see the solar choropleth with an overlay that identifies some basic crude oil infrastructure. These maps were created using Carto online. You can zoom, drag, and click on all three of these maps to look around. The maps don’t work so well on a phone or tablet, so I recommend a desktop browser.

Crude Oil and Solar

In order to identify which counties were affected by the fossil fuel infrastructure, I had to use two data processing methods. The first is common to all three maps because each of them works with point data, and it uses the ST_Contains function. The second is specific to the crude oil infrastructure because that data also contains lines. Here is the first query, shown for the coal plants:

SELECT ca_counties.the_geom_webmercator, ca_counties.name
FROM ca_counties, cacoalplants
WHERE 
ST_Contains(
ca_counties.the_geom_webmercator,cacoalplants.the_geom_webmercator)
GROUP BY ca_counties.name, ca_counties.the_geom_webmercator
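For anyone curious what a containment test like this does under the hood, here is a rough Python sketch using the classic ray-casting method. This is an illustration of the idea only, not how PostGIS implements ST_Contains, and the square "counties" and plant coordinates are invented.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices; the last vertex connects back
    to the first. Points exactly on the boundary get no special handling.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical square "counties" and plant locations, for illustration only.
counties = {
    "Example A": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "Example B": [(4, 0), (8, 0), (8, 4), (4, 4)],
}
plant_points = [(1, 1), (2, 3), (9, 9)]

# Like the query: keep each county once if it contains at least one plant.
counties_with_plants = sorted(
    name
    for name, shape in counties.items()
    if any(point_in_polygon(p, shape) for p in plant_points)
)
print(counties_with_plants)
```

The any(...) call plays the role of the GROUP BY in the query: a county appears once no matter how many plants fall inside it.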

The second query uses the ST_Intersects function to find the counties that a pipeline passes through:

SELECT * FROM (
  SELECT
    ca_counties.the_geom_webmercator,
    ca_counties.name AS cname,
    ST_Intersects(ca_counties.the_geom_webmercator,
                  pipelines.the_geom_webmercator) AS haspipeline
  FROM ca_counties, pipelines
) AS popo
WHERE haspipeline = TRUE
GROUP BY the_geom_webmercator, cname, haspipeline

Here is the second map, which contains the solar base map and coal-powered electric plants.

Coal Plants and Solar

As expected, there were not many coal-powered plants, given the widespread shift towards renewables and cheaper fossil fuels like natural gas. One detail which I added to all three maps was the distance of each location from San Francisco. This is generally useful as a metric to gauge proximity to one of California's renewable energy leaders and major metropolitan centers. The distance was calculated using the ST_Distance function from PostGIS. On geography values it returns meters, so dividing by 1000 gives kilometers. This information, along with other relevant info, can be viewed by clicking on the facility marker within the map. The code to generate the distance was simple:

SELECT *,ST_Distance(
the_geom::geography,
CDB_LatLng(37.776429,-122.451287)::geography
) / 1000 AS dist FROM coalplants
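If you want to sanity-check a distance outside the database, a haversine calculation in Python gets close. This is a sketch under a simplifying assumption: it treats the Earth as a sphere, whereas ST_Distance on geography values works on a spheroid, so the results will differ slightly. The function name haversine_km and the sample plant coordinates are my own placeholders; only the San Francisco reference point comes from the query.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Reference point used in the query (latitude, longitude).
SAN_FRANCISCO = (37.776429, -122.451287)

# Hypothetical plant location, for illustration only.
plant = (35.0, -118.0)

dist_km = haversine_km(SAN_FRANCISCO[0], SAN_FRANCISCO[1], plant[0], plant[1])
print(round(dist_km, 1))
```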

The final map is the one I found most informative. It layers all natural gas electric plants in California over the solar choropleth base layer.

Natural Gas Plants and Solar

This analysis was helpful in gauging the relative contribution of solar to utility-scale electric generation, but it is not quantitative enough to support any definitive assertions. I consider it a preliminary look, so I will avoid drawing any direct conclusions and let you explore for yourself. I would love to hear any thoughts in the response section down below.