Why Should Today’s Electric Utilities Hire Data Scientists?

Hui Z
Published in The MegaWatts
Oct 27, 2022

In today’s business world, energy companies, including utilities that operate as natural monopolies, face an increasingly competitive landscape. To reach and hold a leading position, a company must collect, digest, and strategically use the information available to it to drive sound business decisions. As the most common carrier of information, data has become the new (digital) gold mine from which valuable information can be extracted, synthesized, and built into business plans.

What Is A Data Scientist And What Does A Data Scientist Do?

The data scientist position emerged in the tech industry around 2012, when large amounts of data needed to be analyzed for insights to guide business decisions. Since then, the role has come into high demand and has been quickly adopted by industries such as investment, pharmaceuticals, and manufacturing. Energy companies, and utilities in particular, have been late to the trend. To date, most utilities in the United States either have no data analytics initiative or rely on their vendors’ off-the-shelf analytics services, which in most cases amount to no more than a handful of dashboards showing descriptive plots of historical data.

A data scientist’s job is not limited to analyzing and interpreting complex data; it also includes developing algorithms and models that learn from historical patterns and predict future outcomes. At a minimum, a data scientist should be proficient in tools such as SQL, Python, R, and Git, and be familiar with statistical analysis (such as regression) and modern machine learning algorithms. Additionally, a successful data scientist must be able to turn abstract data models into easy-to-understand “stories” and communicate them clearly to business decision-makers with or without a technical background.

Why Should Electric Utilities Hire Data Scientists?

Data scientists can benefit a utility’s business in several areas:

  • Reserve requirement/operating limit prediction. Every year, utilities in the United States spend billions of dollars to carry the operating reserves required by NERC to protect against contingencies. Data scientists can examine historical operational data, including power flows, generation dispatch patterns, and weather conditions, to predict day-ahead and real-time reserve requirements so that utilities can allocate their resources between energy and ancillary services more efficiently. Maintaining operating limits can be challenging during peak seasons or when the system is under severe stress. It is critical to know what the transmission system is capable of, especially in a deregulated environment. Operating too liberally can cause unintended reliability consequences, while operating too conservatively incurs unnecessary operating costs. With advanced metering and sensor technologies that detect and record real-time temperature and wind data around transmission facilities, data scientists can use these data to develop dynamic operating limits that more accurately reflect real-time energy transfer capability.
  • Wildfire prediction and maintenance scheduling. Climate change has made wildfires more frequent. With more weather sensors installed around fire-prone areas, data scientists can use the resulting data to develop more targeted mitigation plans that minimize public power interruptions and recovery time. Historical data analysis can also identify high-risk line sections for which mitigation plans, such as prioritized line patrols, can be developed in advance. Additionally, data scientists can quantify the system impact of an outage (or combination of outages) by applying regression or machine learning to historical outage data and key operating metrics (e.g., congestion and voltage profiles). Such analysis helps utilities schedule major facility maintenance strategically, improving operating efficiency while minimizing operating risk.
  • Energy trading/rate design. Advanced data analytics is widely used in finance to inform trading decisions, and energy trading is no different. In addition to fundamental power analysis, data scientists can help reveal hidden relationships among variables and derive data-backed trading recommendations by learning from the vast amount of data that RTOs/ISOs and vendors provide today. Similarly, comprehensive data analysis should help utilities and policymakers see through the data and arrive at an economical and equitable rate design.
  • Marketing and customer engagement. The decentralization of the grid has brought more competition into the energy space, giving today’s customers more choices in their energy services. Aggregated, anonymized usage data may be collected and analyzed (in compliance with all privacy laws) to help utilities better understand the energy consumption patterns of different customers, allowing them to segment their customers effectively and offer tailored energy products. In addition, data scientists may gather customers’ comments from surveys or social media and conduct sentiment analysis to better understand customers’ needs and pain points, which is valuable information for improving customer engagement and satisfaction.
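
As a rough illustration of the regression idea mentioned in the maintenance-scheduling bullet above, the sketch below fits an ordinary least-squares line relating the capacity taken out of service to a congestion metric. The numbers, the helper names fit_line and predict_cost, and the choice of congestion cost as the metric are all hypothetical and chosen only for illustration; a real study would use the utility’s own outage records and many more variables.

```python
# Minimal sketch: ordinary least squares with one predictor, standard library only.
# All data below are made up for illustration; they are not from any real utility.

def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least two points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical history: MW of transmission capacity on outage
# versus the daily congestion cost observed (in thousands of dollars).
outage_mw = [0, 100, 200, 300, 400]
congestion_cost_k = [5, 25, 45, 65, 85]

slope, intercept = fit_line(outage_mw, congestion_cost_k)

def predict_cost(mw):
    """Estimated congestion cost ($k) for a planned outage of `mw` megawatts."""
    return slope * mw + intercept

print(f"slope={slope:.3f} $k/MW, intercept={intercept:.3f} $k")
print(f"estimated cost for a 250 MW outage: {predict_cost(250):.1f} $k")
```

The synthetic data here are perfectly linear, which real outage data will not be; the point is only that a fitted relationship lets a planner compare the estimated system impact of candidate maintenance windows before committing to one.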

The Challenges

Despite the benefits listed above, several challenges still prevent utilities from making full use of advanced data analytics.

  • Lack of a fully interpretable result-generating process. In regression-based analysis, the process of reaching a conclusion can be fully tracked and explained. Results from learning-based models, by contrast, may not be easily interpretable. Due to the nature of the algorithms, the learning process is essentially a black box to users and even to the data scientist who built the model. This lack of explainability has limited the use of tools based on advanced machine learning models in areas such as real-time power operations, even when their results are, in most cases, superior to those of traditional approaches, because there could be significant compliance implications if something goes wrong and the results cannot be explained.
  • Not ready to accept a probabilistic presentation. People are so used to believing the world is deterministic that they rarely realize it is not. For example, weather forecasts used to be quite deterministic, telling people what tomorrow’s weather would be. Then, if it did not rain as forecast, people blamed the forecast for being wrong. Not too long ago, forecasts changed to a probability-based presentation: instead of stating that it will rain tomorrow, the forecast says that the chance of precipitation is X%, where X ranges from 0 to 100. Why? Clearly, how the weather system works has not changed; what has changed is the way people perceive it. With a probabilistic model, there is arguably no wrong answer anymore, because if it doesn’t rain tomorrow, it just means that the alternative event, with a (100-X)% chance, was realized. So, should you bring an umbrella? The answer depends on how much risk of getting wet you are willing to accept! Similarly, many data science applications present their final results probabilistically. Decision-makers should get used to such presentations and choose a course of action based on risks and rewards.
  • Over-expectation. Data analytics is not a silver bullet. It provides insights that help reach an educated decision, but it is by no means designed to answer every question. Even the most potent predictive algorithms today need historical data with similar patterns to learn from in order to produce a reasonable prediction. For example, climate change, the growth of behind-the-meter renewables/DERs, and the pandemic have together posed significant challenges to load forecasting, partly because the historical data that the learning algorithms depend on have become less informative about the future. Companies that use analytics outputs blindly will eventually be caught off guard by the negative consequences.
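
The umbrella question in the probabilistic-presentation bullet above can be framed as a simple expected-cost comparison. The sketch below is one possible way to express that reasoning, not a prescribed method; the function name should_act and the cost figures are hypothetical and expressed in arbitrary units.

```python
# Minimal sketch: acting on a probabilistic forecast by comparing expected costs.
# The costs are illustrative assumptions, not measured values.

def should_act(prob_event, cost_of_acting, loss_if_unprepared):
    """Return True when the expected loss of doing nothing exceeds the cost of acting.

    prob_event: forecast probability of the event, between 0 and 1.
    cost_of_acting: what it costs to prepare (e.g., carrying an umbrella).
    loss_if_unprepared: what it costs if the event happens and you did not prepare.
    """
    if not 0.0 <= prob_event <= 1.0:
        raise ValueError("prob_event must be between 0 and 1")
    expected_loss = prob_event * loss_if_unprepared
    return expected_loss > cost_of_acting

# Hypothetical numbers: carrying an umbrella costs 1 unit of hassle,
# getting soaked costs 10 units.
bring_umbrella_at_30 = should_act(0.30, cost_of_acting=1, loss_if_unprepared=10)
bring_umbrella_at_05 = should_act(0.05, cost_of_acting=1, loss_if_unprepared=10)

print("30% chance of rain, bring umbrella:", bring_umbrella_at_30)
print("5% chance of rain, bring umbrella:", bring_umbrella_at_05)
```

With these assumed costs, the break-even probability is 1/10, or 10%. A more risk-averse decision-maker would assign a larger loss to being unprepared and would therefore act at lower forecast probabilities, which is the risk/reward trade-off described above.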

External Hire or Internally Grow?

Because data analytics is still a relatively new initiative at most utilities, it is better to hire data scientists externally in the short term. A qualified data scientist can apply their knowledge to clean existing data, design new data structures, and set up analytics tools and platforms, which are the foundation for any further analysis. In the meantime, it is good practice to pair the data scientist (assuming they are a generalist) with someone who has deep business knowledge, both to build the data scientist’s understanding of the business and to ensure the models and analyses make business sense.

In the longer term, utilities can keep hiring generalist data scientists and repeat the training process to build their business knowledge. Alternatively, the data scientist can train engineers and analysts in data analytics and programming so that they can do similar analytics work themselves. The permanent solution, however, is to hire data scientists with business knowledge or engineers/analysts with solid data analytics backgrounds.

What Could Be Academia’s Role?

Should engineers understand statistics and data analytics? Most will say yes. Fundamental statistics and data analytics training should be included in the curriculum of every engineering program. However, when 80% of my recent interview candidates couldn’t answer a basic linear regression question (not the actual calculation, but simply recognizing that the question could be solved by linear regression), I knew something was wrong.

To better prepare future engineers for the changing energy industry, engineering programs, and power engineering programs in particular, should keep up with the times by giving more weight in the curriculum to data analytics and programming courses built around real-world projects. Furthermore, training in communication skills, both written and oral, should be emphasized to prepare engineers for a business environment with stakeholders from cross-functional teams. The new era requires engineers not only to calculate numbers accurately, as they always should, but also to use the latest tools and models to analyze data, ultimately leading to informed business decisions.


I talk about Power Systems, Electricity Market, and Energy Transition. Founder of The Megawatts—an energy-focused publication: https://medium.com/the-megawatts