Machine Learning and Climate Change
Intersections and Opportunities
In preparation for the coming NeurIPS conference, I was thinking about sitting in on a workshop session with presentations on the intersection of machine learning and climate change, or more specifically on how the tools of machine learning could be applied towards mitigating the risks of climate change and its impacts on society. I find myself in a somewhat unusual spot in the Venn diagram of the issue, in that I’ve spent a good part of my career in the power sector, in roles ranging from supporting aspects of the fossil supply chain to extensive experience in renewable energy development. Oh, and more recently a little machine learning here and there to go with it. So with the hope that I might be able to contribute some to the discussion, I figured it might be worth the effort to try to organize my thoughts a little by way of this essay, which after all is what this blog has always been about. (I find the ‘skin in the game’ aspects of public speech quite helpful in forcing a little intention about forming cohesive opinions and positions; otherwise I mostly tend to waffle and side-step around politics and stuff when caught unprepared.)
Basically I’m going to approach this essay as a brainstorming session of sorts: I’ll walk through a few different domains in proximity to the issue and try to extrapolate from my experience where there might be potential for applications of machine learning, primarily from the standpoint of reducing carbon emissions. One thing this essay won’t be is a defense of the premise that climate change is an urgent issue of known cause with potential to derail the whole human trajectory. Plenty of better qualified people have covered that ground much better than I could, so I’m going to take it for granted that the reader accepts the reality of our situation. Great, so without further ado.
On the surface wind turbines may appear to be somewhat mechanical beasts without a great deal of overlap with the digital realm. After all, the extraction of the force of the wind into mechanical rotational energy dates back centuries, and to this day turbines retain most of the core features of the earliest designs: a set of aerodynamic foils, or blades, is presented to the wind and affixed to a shaft which translates the rotation to some use, whether the agricultural tasks of earlier times or the modern application of rotating the coils in a generator to induce electrical current. It turns out that in modern practice there are several subtle process control considerations that may introduce complexities beyond the reach of traditional control feedback loops. The mechanical interface points on a wind turbine are few: the orientation of the shaft can be yawed to face the predominant direction of the wind, and the orientation of the blades can be pitched to accommodate different wind speeds. (Some more elaborate controls may also allow the rotational speed of the generator to deviate, such as to facilitate some preferred mix of active and reactive power based on grid conditions, but I digress.)
The complexities arise first from the unpredictability of wind fluctuations (such as in speed and direction), but are compounded by interactions between turbines. Just as anyone who has looked off the back of a motor boat at the froth behind its propeller may recognize, the wind downstream of a turbine carries a kind of cone of turbulence, or wake, with slower and more turbulent air that hampers the power extraction of any turbine sitting in it. One opportunity for machine learning may be more elaborate control strategies that optimize for total wind farm output instead of controlling each turbine on its own, since a collection of settings that are each optimal for a single turbine is not necessarily optimal for the farm as a whole. I speculate that given the fixed positions of wind turbines, this may be a good candidate for reinforcement learning. However, there may be benefit to incorporating some elements of “continual learning” for non-stationary distributions over a model that is trained once and frozen, especially when you consider that the terrain within the quite large boundary of an onshore wind farm may not stay fixed over 20+ years of operation. After all, new buildings may be erected, tree canopy within or surrounding the farm may evolve, and some of the individual turbines may go out of service or be repowered with longer blades. Not to mention that the prevailing wind conditions may change over the years for onshore and offshore alike, possibly due to the very climate change that we’re trying to mitigate.
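To make the farm-versus-single-turbine point a little more concrete, here is a minimal sketch, not anything taken from a real wind farm controller. It assumes textbook actuator-disc theory (power coefficient 4a(1-a)^2 for an axial induction factor a, which roughly stands in for how hard the blades are pitched to load the rotor) and a Jensen-style top-hat wake for two turbines in a row. The spacing, the wake decay constant, and all of the function names are illustrative assumptions of mine.

```python
def power_coefficient(a):
    """Actuator-disc power coefficient for axial induction factor a."""
    return 4.0 * a * (1.0 - a) ** 2


def downstream_speed_ratio(a_upstream, spacing_diameters=5.0, wake_decay=0.075):
    """Jensen-style wake: fraction of the free-stream wind speed that reaches
    a turbine sitting directly behind another one.

    spacing_diameters and wake_decay are assumed, illustrative values.
    """
    # The wake radius grows linearly with distance: r = r0 * (1 + 2 * k * spacing)
    expansion = 1.0 + 2.0 * wake_decay * spacing_diameters
    return 1.0 - 2.0 * a_upstream / expansion ** 2


def farm_power(a_upstream, a_downstream=1.0 / 3.0):
    """Normalized power of the two-turbine row (free-stream wind speed = 1)."""
    u2 = downstream_speed_ratio(a_upstream)
    # Power scales with the cube of the wind speed each rotor actually sees.
    return power_coefficient(a_upstream) + power_coefficient(a_downstream) * u2 ** 3


# Brute-force search over the front turbine's setting only.
candidates = [0.05 + 0.001 * i for i in range(401)]
greedy_a = max(candidates, key=power_coefficient)  # best for the front turbine alone
farm_a = max(candidates, key=farm_power)           # best for the pair

print(f"greedy setting: a = {greedy_a:.3f}, farm power = {farm_power(greedy_a):.4f}")
print(f"farm setting:   a = {farm_a:.3f}, farm power = {farm_power(farm_a):.4f}")
```

In this toy setup the farm-level search settles on a lighter loading of the front turbine than the greedy one-third optimum, giving up a little power up front to leave more wind for the machine behind it. A real farm has many turbines, shifting wind directions, and turbulence that this sketch ignores, which is where a learned controller might earn its keep.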
An interesting dynamic to this problem that I think may differ from traditional candidates for reinforcement learning is that we would be controlling a collectively high-dimensional set of actions (the pitch and yaw of potentially several dozen or more wind turbines) in response to an environment of roughly comparable dimensions. After all, while the range of wind and weather conditions that a farm may experience is quite wide, the measurement points are primarily the sensors affixed to each turbine for monitoring conditions like temperature, pressure, wind speed, and wind direction. The fact that the configuration, composition, or fidelity of these sensors may materially differ between manufacturers or wind farms reinforces the need for a flexible solution. The target variable itself, total wind farm power output over time, is just one number, so perhaps that makes this fairly tractable. Of course one issue with trying to use reinforcement learning on a live system is that a significant amount of power would be squandered during the early exploration phases while the system tries to home in on optimal strategies. I see two ways around this. One would be to initialize the controls with the original single-turbine methods and only allow the learner to deviate slightly from those settings during exploration. Another would be for the exploration not to take place on live equipment at all, but in a kind of ‘digital twin’ of the plant, a simulated environment that is kept up to date with readings from the field as sensor values and the corresponding pitch and yaw configurations change.
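The two ideas above, exploring only within a small band around the single-turbine baseline and doing that exploration against a simulator, can be sketched together. To be loud about the assumptions: the simulator below is a made-up toy stand-in for a digital twin (three turbines in a row, with a hypothetical rule that yawing an upstream machine costs it power but weakens its wake), and the search is plain random hill climbing rather than an actual reinforcement learning algorithm. Every name and constant is hypothetical.

```python
import math
import random

BASELINE_YAW = [0.0, 0.0, 0.0]  # single-turbine practice: face straight into the wind
MAX_DEVIATION_DEG = 10.0        # assumed safety band around the baseline


def simulated_farm_power(yaw_offsets_deg):
    """Toy stand-in for a digital twin of a three-turbine row."""
    total, deficit = 0.0, 0.0
    for yaw in yaw_offsets_deg:
        inflow = 1.0 - deficit
        # Assumed: a yawed rotor loses power as cos^3 of its misalignment.
        total += inflow ** 3 * math.cos(math.radians(yaw)) ** 3
        # Assumed: the wake deficit seen by the next turbine shrinks with yaw.
        deficit = 0.25 * max(0.0, 1.0 - abs(yaw) / 25.0)
    return total


def clip_to_band(candidate, baseline, max_dev):
    """Keep every setting within max_dev degrees of its baseline value."""
    return [min(b + max_dev, max(b - max_dev, c)) for c, b in zip(candidate, baseline)]


def bounded_search(steps=2000, step_size=1.0, seed=0):
    """Random hill climbing that never leaves the band around the baseline."""
    rng = random.Random(seed)
    best = list(BASELINE_YAW)
    best_power = simulated_farm_power(best)
    for _ in range(steps):
        proposal = [y + rng.gauss(0.0, step_size) for y in best]
        proposal = clip_to_band(proposal, BASELINE_YAW, MAX_DEVIATION_DEG)
        power = simulated_farm_power(proposal)
        if power > best_power:  # only accept improvements over the incumbent
            best, best_power = proposal, power
    return best, best_power


baseline_power = simulated_farm_power(BASELINE_YAW)
best_yaw, best_power = bounded_search()
print(f"baseline power: {baseline_power:.4f}")
print(f"bounded search: {best_power:.4f} at yaw offsets {[round(y, 1) for y in best_yaw]}")
```

Because the search starts from the baseline and only accepts improvements, the worst it can do in the simulator is match the existing single-turbine settings, and the clipping keeps any setting it proposes within the assumed band. Whether gains found in a simulator carry over to the field depends entirely on how faithful the twin is.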
All of this wind turbine discussion is applicable to what is already in the field, meaning it could be applied retroactively to existing installations. An extension that is perhaps a slightly tougher nut to crack comes when we lift the requirement of fixed turbine positions, as is the case when we are trying to design a layout of wind turbines for a given geography. The initial siting and layout of a wind farm is not as simple as an equidistant checkered grid of points; it requires a whole optimization study taking into account the same considerations mentioned earlier, such as expected wind distributions and the potential for interference between adjacent turbines. I’m not sure this kind of optimization problem is as suitable for direct application of machine learning, but I speculate that a corpus of historical output for a large fleet of equipment, along with the associated geographic and atmospheric exposures, such as might be available to a wind turbine manufacturer or top-tier developer, could potentially be used to train some type of supervised model that provides guidelines for layout in novel terrain. Of course once those foundations are poured and the turbines erected our hands are kind of tied for the next 20+ years, so some extra attention to this phase of development is certainly worth consideration.
Solar power is a little different kind of beast than wind. (I’ll focus on photovoltaics for this discussion.) While wind generation is predominantly a utility-scale endeavor, the range of solar installations is kind of bimodal, so to speak. Utility-scale installations certainly carry economies of scale that translate to lower installed cost, but the availability of ample rooftop space close to consumers means residential solar needs less transmission and distribution to reach its load, along with added benefits such as potential for grid independence or backup power. Generally, consumer-grade solar is a passive, fixed installation with limited opportunities for advanced control such as might take advantage of machine learning. Although some facility or utility-scale installations incorporate mechanics to rotate the panels throughout the day, such control is a simple matter of tracking the position of the sun, which follows a fully predictable path that depends only on location, time of day, and time of year. Thus, in looking for avenues to incorporate machine learning into the ecosystem, we’re going to have to look in domains adjacent to the installation itself.
Upstream of the generation side, a few channels that come to mind include manufacturing, installation, and customer acquisition, each of which contributes to lower cost and therefore, by proxy, to increased penetration and increased carbon offset. Outside of new paradigms of photovoltaic materials and embeddings that may facilitate higher efficiency, traditional photovoltaic panels (unlike Tesla’s photovoltaic roofing, for instance) are largely a commodity manufacturing business. Although I am not an expert, I suspect that manufacturing facilities (largely based in Asia, I believe) are pretty sophisticated already, given the decades of successful realization of Swanson’s Law, sort of an analogue of Moore’s Law from the microchip realm, which finds that PV module prices tend to fall around 20% for every doubling of cumulative shipped volume, and which has recently worked out to around 75% per decade. I’m not sure if photovoltaics are traditionally manufactured in batch processes like LCD panels, or if there might be potential for alternatives to glass backing that could make use of continuous ‘roll-to-roll’ production, such as with “printed” coatings on polymer substrates. Not really machine learning territory, but I expect that if Swanson’s Law is to have a chance going forward as module prices approach the cost of materials, this might be one avenue.
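The two Swanson’s Law figures quoted above (about 20% per doubling of volume, about 75% per decade) can be tied together with a little arithmetic. The sketch below only uses those two numbers from the text; the function names are mine, and treating the law as an exact power law is a simplifying assumption.

```python
import math

LEARNING_RATE = 0.20      # price drop per doubling of cumulative volume (from the text)
DECADE_PRICE_DROP = 0.75  # recent price drop per decade (from the text)


def price_multiplier(volume_ratio, learning_rate=LEARNING_RATE):
    """Relative module price after cumulative volume grows by volume_ratio."""
    doublings = math.log2(volume_ratio)
    return (1.0 - learning_rate) ** doublings


def doublings_for_drop(price_drop, learning_rate=LEARNING_RATE):
    """Number of volume doublings needed for a given fractional price drop."""
    return math.log(1.0 - price_drop) / math.log(1.0 - learning_rate)


doublings = doublings_for_drop(DECADE_PRICE_DROP)
print(f"doublings per decade: {doublings:.2f}")
print(f"implied volume growth per decade: {2 ** doublings:.0f}x")
print(f"implied doubling time: {120 / doublings:.1f} months")
```

Taken at face value, a 75% price drop per decade at 20% per doubling implies a bit over six doublings of cumulative volume every ten years, which gives a feel for how much manufacturing growth has been riding along with those price declines.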
It turns out, and forgive me, I’m too lazy to grab a citation here, that particularly in residential solar the costs of an installation are not driven only by the photovoltaic panels themselves, but have material contributions from adjacent functions like installation labor and customer acquisition. Utility-scale solar, given the potential scale of facilities, is surely a viable candidate for advanced robotics to take on the repetitive tasks of erecting structures and mounting panels in a controlled greenfield environment. In residential solar every rooftop is different, which makes robotic automation a considerably more sophisticated challenge. Perhaps a more likely source of near-term robotics for residential solar is the simple routine maintenance of washing off dust and debris to improve efficiency, which, based on how my kitchen floor looks, I’m guessing a lot of consumers neglect. I could totally see Windex offering a branded drone here, just a thought.
Which brings us to what I think is a real opportunity for machine learning in residential solar. If you look at customer acquisition at players like Vivint, Sunrun, or SunPower, I believe they generally pursue tactics like door-to-door sales (at times supplemented by retail partnerships). You know, a salesman canvasses a neighborhood, knocks on doors, and using some mix of charisma and people skills convinces a homeowner to sign on the dotted line. Kind of like how encyclopedias are sold. Such manpower is not cheap: when I wrote the essay “Distributed Rivalry” a little while back, these folks’ annual reports indicated customer acquisition costs of around 30–40% of total operating expenses. I think there is real opportunity to be a little more intelligent here by applying machine learning, which could look like targeted online advertising or even mixed methods such as canvassing targeted by demographic profile. I’m not really an expert, but I’ll offer a tip: when I was looking into this a while back I recall being struck by the potential of one startup’s platform for demographic targeting, by the name of Faraday, which was exhibiting at the Solar Power International conference. (Oh, and how on earth is no one partnered with Best Buy yet? Geez.)
It might be a little counterintuitive to discuss a fossil generation resource like gas turbines in the context of exploring intersections of machine learning and climate change, but it turns out these platforms are somewhat interlinked. While it is certainly feasible to build a 100% carbon-neutral generation mix for a given region, such as by some combination of renewables, hydro, nuclear, and (extensive) energy storage, at least for the time being the cost of such a mix puts this solution well outside any reasonable standard of economic feasibility. The extent of renewable penetration is constrained by the ability of our controllable variable-output resources, such as gas turbine ‘peaker plants’ or eventually energy storage, to offset the (not wholly predictable) intermittency of renewable energy that comes with inevitable atmospheric fluctuations for wind and transient cloud cover for solar, plus of course the fully predictable pause in solar production between dusk and dawn. Such constraints could partly be offset by improved interstate transmission capacity to bring more of the wind-rich Midwest belt’s output to our coastal population centers, for instance using efficient high voltage direct current (HVDC) lines, but I believe a political push by the current administration to privatize transmission infrastructure might make that all the more difficult (somewhat speculation on that point). Another improvement might be to adopt the European appetite and standards for offshore wind production, where winds are steadier and more closely resemble a base-loaded resource.
Turning back to gas turbines, I expect the real potential for machine learning revolves around the grid-scale interplay between renewables and variable resources like peaker plants or combined cycle plants put into peaker use. I noted in my essay “Wind Power in America” that operating gas turbines in response to variations in renewable generation can push them outside of their peak efficiency, resulting in increased NOx emissions or, for our considerations, increased fuel usage and corresponding carbon emissions. A gas turbine on its own may take on the order of 10 minutes to ramp up to full production, while a combined cycle plant, which uses a steam turbine to recover additional energy from the gas turbine’s waste heat, I believe is closer to 30 minutes to approach full load. The grid itself has some capacity to absorb imbalances between generation and load, which show up as deviations in grid frequency, although if allowed to get too far out of balance brownouts may occur. I am not an expert on this point, but I believe the interplay between utility-scale renewable production and managing variable gas turbine loads is supported by market pricing signals, with markets managed by regional grid operators such as RTOs or ISOs under the regulation of FERC. For example (again not an expert on this point), I think a wind farm that is not under a long-term power purchase agreement and is selling its power at spot prices would provide forecasts of production into the market at different time scales, and the extent to which it meets these forecasts influences its remuneration. Thus it seems that better forecasting of renewable production, such as might be the purview of machine learning, might actually allow for more stable gas turbine operation, helping to keep those turbines at the sweet spots near maximum efficiency.
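For a sense of what a machine learning forecast would have to beat, a common yardstick in forecasting is the naive ‘persistence’ baseline, which simply assumes that output some hours ahead will equal the latest reading. The sketch below scores that baseline at two horizons. The hourly output figures are made-up numbers purely for illustration, and the function names are mine.

```python
def persistence_forecast(history, horizon):
    """Naive baseline: the output `horizon` steps ahead equals the current reading."""
    return [history[t - horizon] for t in range(horizon, len(history))]


def mean_absolute_error(actual, predicted):
    """Average absolute gap between what happened and what was forecast."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical hourly wind farm output in MW (made-up numbers).
output_mw = [42, 45, 51, 48, 40, 33, 30, 36, 47, 55, 58, 52]

errors = {}
for horizon in (1, 3):
    predicted = persistence_forecast(output_mw, horizon)
    actual = output_mw[horizon:]
    errors[horizon] = mean_absolute_error(actual, predicted)
    print(f"{horizon}-hour-ahead persistence error: {errors[horizon]:.2f} MW")
```

The error grows with the forecast horizon, and those longer horizons happen to line up with the ramp times mentioned above, which is where a learned model drawing on weather data would have the most room to improve on the baseline.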
I’m probably not going to win any friends in the renewable energy sphere with this thought, but I speculate that on some occasions it might even be beneficial to intelligently curtail the output of wind farms over short time periods to help keep gas turbines operating near peak efficiency, allowing some degree of ramp up / ramp down to be borne by the pitching of the blades instead of by the gas turbines. Such cross-plant coordination would certainly require a great deal of sophistication at scale.
I’m not going to spend a lot of time on coal generation because I believe this is one area where, long term, via negativa is the clear solution. It has been a great blessing of economics and enhanced drilling techniques that in the last decade natural gas has become materially more competitive with coal as a base-loaded resource, and we have the fracking revolution to thank for a macro trend away from coal generation, at least in the US anyway. As a base-loaded resource, coal lacks the ability to support the intermittency of renewable energy on its own, so increased coal production dampens our ability to integrate carbon-neutral generation. But more importantly, coal is by far the most carbon-intensive form of generation in wide use, with emissions roughly twice those of a gas turbine combined cycle plant of equivalent output, not to mention sulfur emissions which may contribute to acid rain and stuff. While modern practice has certainly reduced the sulfur emissions with scrubbing technology, comparable “carbon scrubbing” technology is a far cry from reasonable economics, and is only now starting to be realized in some of the first demonstration plants. I am not suggesting that we shouldn’t continue research into carbon capture technology for coal. It may eventually prove viable, and I think any technology that gives us more carbon-neutral options will prove extremely valuable in the long term, but at the time scale under consideration for the Paris Agreement, for instance, incorporating carbon capture into coal generation at scale will not be a workable solution. There are no avenues obvious to me for applying machine learning to this specific problem, so let’s turn our focus elsewhere.
I’ll first disclose that I was recently employed by a midstream oil and gas firm, though I no longer have any affiliation. Oil is and will always be synonymous with transportation. Unlike with coal, we actually do have very actionable means today to influence the carbon intensity of oil-based transportation systems, simply by the lever of fuel economy standards for production vehicles. This isn’t really a machine learning problem, more one of politics, in that the internal combustion automobile manufacturers, and oil industry lobbyists for that matter, have a very clear incentive to avoid these regulations. From a manufacturer standpoint, meeting them increases the upfront cost of their vehicles, making them less competitive with electric cars, and from an oil interests standpoint, less efficient vehicles mean more gas purchased at the pump and higher oil prices. Now it’s probably worth an asterisk that, unlike natural gas and to some extent coal, oil is much more of a globally balanced market, in that oil extracted on one continent may be easily shipped around the world, resulting in reasonably balanced prices between regions. Thus one region’s policy won’t necessarily have as significant an impact on oil prices as, say, shutting down coal plants would have on coal.
Actually it’s certainly worth noting, and an important part of the conversation, that the US has recently taken the helm as the world’s largest oil exporter. True story. And so the question of oil policy is a question impacting our national interests. It is kind of an unfortunate irony that the revolution in drilling technologies made this possible just as the mobilization of climate action started to come to a head. Now the US government finds itself with a hard choice between serving the economic interests of its Fortune 500 constituents and the environmental interests of its citizens and coastal communities. I don’t think there is any single easy answer here, and I expect this will continue to be politically contested, but I would like to offer an observation: by maintaining policy which explicitly favors the carbon-intensive forms of industry, we are handicapping our ability to compete globally in emerging domains critical to our future economic prospects, such as electric cars and renewable energy for that matter.
Electric vehicles are of course relevant and adjacent to the oil discussion. First, as an aside, I’m not sure why self-driving cars and electric vehicles are so often joined in perception. There’s no reason that internal combustion vehicles can’t tack on a few cameras or a lidar and go from there. I suspect there are two advantages for electric here. One is that refueling an electric vehicle has no need for dealing with hazardous fluids (which would necessitate human interaction unless you add expensive gas pump robotics). Two, and please forgive me, I don’t have a citation handy, the energy cost per mile driven is significantly lower for an electric vehicle, which is of course part of the reason why people are willing to pay a premium for a Tesla (notwithstanding the torque, the design, the two trunks, the interface, the self-driving options, oh and did I mention the torque?). Other than the slight inconvenience of the current density of charging stations for long-distance travel, it is sort of an across-the-board sweep.
Well, it certainly shouldn’t be forgotten that just because an electric vehicle does not fuel at a pump does not mean it is a carbon-free form of transportation. That energy has to come from somewhere, and since the car is likely charging overnight in its owner’s garage, there’s probably not a huge amount of solar in that mix unless said owner opted to install some batteries. So one step I think could improve the carbon intensity of electric vehicles is to find ways to shift more of the charging from overnight to working hours, especially in solar-favorable geography, which would let us make better use of the energy-rich daylight hours. You know, with enough electric cars on the road that could even be feasible, with a reasonable fee schedule for charging stations. Consider that an owner with a known commute has a relative amount of certainty about where they will be parking over the long term, and thus could potentially enter into some kind of pseudo-commitment for making use of charging infrastructure in employer parking or public garages. Perhaps a little regulatory push might get this ball rolling.
With respect to machine learning for transportation, the obvious point of influence comes from the race for full self-driving in all environments, which may not be far off. How will self-driving translate to reduced carbon emissions? I think the real opportunity lies in the potential for improved traffic flow at higher car density. (And if the uber-ization continues, perhaps higher passenger density for that matter.) I’ll just offer a suggestion that one way to help this along would be increased use of roundabouts instead of traffic lights at high-traffic intersections. Roundabouts on their own offer some improvement in traffic flow, but I expect that when you couple them with high-density multi-car coordination you could see really sizable improvements over traffic lights. Anyhoo, just a suggestion. (This kind of multi-car coordination for high-density traffic is I think certainly worth investigation, since what is good for each car individually might not be best for overall flow. The challenge is how to phase it in when you have a mix of bumper cars and self-driven cars sharing the road; perhaps dedicated self-driving lanes would be a place to start.)
The real concern I have is that as the mix shifts from car ownership to ride-sharing apps, and especially once those ride-shares become self-driven, there is potential for real havoc with gridlock in urban environments, something that could actually hurt the carbon cause. When it comes to managing a fleet of robots roaming the streets, where idling is cheaper than parking, we might need to consider placing some limits on the density of self-driven ride-shares. This is a complex issue and I don’t have an answer off the top of my head; certainly a head-scratcher.
Given that this essay is intended for organizing my thoughts ahead of a machine learning research conference, I’ll briefly touch on data center energy use. Last I checked, and this was a little earlier in the cryptocurrency mania, I believe data centers globally were collectively on the order of one percent of energy use, which as a rough proxy means something like one percent of our carbon emissions. (Don’t have a citation handy, this is going by memory from a while back.) So it’s encouraging to see cloud vendors do things like contract with large wind and solar farms to achieve carbon neutrality. That is a real thing that many of the major players do, and I sincerely applaud it, as I think it is not as fully appreciated as an investment in the environment as it should be. But I think these types of investments in large infrastructure are kind of a band-aid for a larger problem: software development that still follows the philosophy of designing around Moore’s Law, the assumption that we can always count on shrinking transistors to overcome our one-way street of increasing code complexity. This philosophy works at the scale of personal computers, but I speculate that it leaves us with large-scale systems that don’t come close to the frontiers of computational efficiency. I think those of us who design systems intended for mass-scale use (looking at you, cryptocurrencies) have a certain ethical responsibility not to incentivize the wasting of computation on mechanisms without a commensurate return. If mining cryptocurrencies needs to be a thing, can we at least find a way to put all of this “puzzle solving” to use on real problems? Why can’t we have a cryptocurrency mining system where the puzzle being cracked is some problem of real-world use? Could that be done? I mean, I’m sure there are some smart people at NeurIPS; perhaps some of you could put a little thought into that.
Anyway I’m rambling. I’ll close with one more suggestion related to the use of machine learning to combat climate change. I recently read a short collection of speeches by the noted young climate change activist Greta Thunberg, who by the simple power of example mobilized millions of teenagers to action. She had a very simple story to tell, and it was striking for its clarity as a call to action, a clarity that is really hard to come by in mainstream media discussions of the issue. Greta likes to ask a hypothetical along these lines: when your house is burning, how should you respond? Do you take a measured assessment and plan next steps, or does the situation by necessity require a little panic in your response? Greta tells us that there is a time and a place for measured debate, and there is a time for rational panic. With our emissions rapidly on pace to exceed the targets that were collectively agreed under the Paris Agreement, originally even by this country, the world’s largest oil exporter, we are jeopardizing the lives of our children and grandchildren, and failing at a moral obligation to protect our ecosystem from the tragedy of the commons. I’ll offer in closing that perhaps we in the machine learning community may also have something to contribute to the political environment, in helping to elect leadership with a demonstrated will to pursue what is needed. Machine learning can be a very powerful tool. Let’s put that tool to use for the greater good.
Books that were referenced here or otherwise inspired this post:
No One is Too Small to Make a Difference — Greta Thunberg
Dushyant Rao, Francesco Visin, Andrei A. Rusu, Yee Whye Teh, Razvan Pascanu, and Raia Hadsell (2019). Continual Unsupervised Representation Learning. arXiv:1910.14481
Julie K. Lundquist, Andrew Clifton, Scott Dana, Arlinda Huskey, Patrick Moriarty, Jeroen van Dam, and Tommy Herges (2019). Wind Energy Instrumentation Atlas. NREL Technical Report NREL/TP-5000-68986
Alan Goodrich, Ted James, and Michael Woodhouse (2012). Residential, Commercial, and Utility-Scale Photovoltaic (PV) System Prices in the United States: Current Drivers and Cost-Reduction Opportunities. NREL Technical Report NREL/TP-6A20-53347
Swanson’s law. Wikipedia
Distributed Rivalry. Medium
#SPIconvention 2016. Medium
Government Reform. Medium
Wind Power in America. Medium
Regional transmission organization (North America). Wikipedia
Greta Thunberg (2019). No One is Too Small to Make a Difference. Penguin Books