The Three Dimensions of Modern Power Systems Operations and Their Optimization

Hui Z
Published in The MegaWatts
8 min read · Sep 13, 2021



Modern power systems (a.k.a. grid) operations refer to a series of manual and automated actions, taken in real time (from milliseconds to minutes), that deliver electric energy from generators to loads. These actions, including forecasting, communicating, controlling, and monitoring, happen 24/7/365 in the “control rooms” of today’s utilities and ISOs to achieve one goal: keeping your lights on. Without getting too technical, this article introduces the three critical dimensions of modern power systems operations and discusses ways to improve real-time operations.

Dimension 1: Reliability

Reliability in the context of electricity services means that electricity is there when you need it. You can flip a switch or plug in a cord to get it. In power systems operations, reliability is the ability of a power system to serve electric loads while withstanding anticipated and unanticipated disturbances. The greater that ability, the more reliable the power system is. Disturbances can happen on transmission elements (e.g., lines and transformers) and on generators. In the power industry, we usually refer to anticipated disturbances as planned outages or maintenance and to unanticipated disturbances as forced outages or contingencies. At a minimum, power systems should be designed following the “N-1” rule, which means that an outage of any single element should not affect the overall system’s ability to serve loads. For today’s utilities and ISOs in the United States, reliability remains the top service metric because electricity, as a basic infrastructure service, is needed to keep essential services and businesses running and to support citizens’ livelihoods.
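To make the “N-1” idea concrete, here is a minimal sketch in Python. It only checks generation capacity: for each unit in turn, it asks whether the remaining units can still cover the peak load. The unit names, capacities, and load values are made up for illustration, and the check ignores transmission limits, reserves, and everything else a real N-1 study would include.

```python
def survives_n_minus_1(capacities_mw, peak_load_mw):
    """Return True if the peak load can still be served after losing any single unit.

    Simplified capacity-only check; transmission constraints are ignored.
    """
    total = sum(capacities_mw.values())
    # Try every single-unit outage in turn and confirm the rest can cover the load.
    return all(total - cap >= peak_load_mw for cap in capacities_mw.values())


# Hypothetical fleet (MW); not taken from any real system.
fleet = {"unit_a": 500, "unit_b": 300, "unit_c": 300}

print(survives_n_minus_1(fleet, 580))  # True: losing unit_a still leaves 600 MW
print(survives_n_minus_1(fleet, 650))  # False: losing unit_a leaves only 600 MW
```

Note that the largest unit drives the result here, which is why losing a big plant is usually the outage operators worry about first.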

In a simple world where reliability is the only dimension of interest, optimizing the system operations means maximizing reliability to the extent possible. However, the world is never simple, and more dimensions need to be accounted for. Two factors are almost always associated with every decision made in real-time system operations: money and risk. People may argue that risk, even reliability, eventually can be expressed in terms of dollars. True. But calculating the accurate values of reliability and risk is not a trivial task. Therefore, for this article, we will keep these factors separate.

Dimension 2: (Economic) Efficiency

In modern power systems, loads should also be served efficiently, at the lowest possible cost, while maintaining reliability. According to data published by the U.S. Energy Information Administration, 72% of U.S. electricity customers are served by Investor-Owned Utilities (IOUs), and the rest by publicly owned utilities and cooperatives. While the nature of the business allows IOUs to pass most cost increases down to customers through rate increases, IOUs are still incentivized to reduce their costs for two reasons. First, IOUs are for-profit companies that must deliver a decent return to keep their investors. Second, they can only increase rates periodically, and every attempt is subject to scrutiny and approval by regulators such as Public Utility Commissions (PUCs). Publicly owned utilities and cooperatives, on the other hand, are usually formed as not-for-profit entities that aim to provide reliable electricity at an affordable rate. To achieve that, keeping costs low is critical.

Furthermore, supply-side deregulation, which allows generators to bid and compete with one another to serve load, has pushed the need for efficient grid operations to a new level. As market operators, ISOs perform market optimizations that dispatch the most cost-effective generators across multiple utilities’ service areas to serve load. As a result, to remain competitive in the energy market, traditional utilities have to rethink how they have operated the system for decades and plan their operations activities strategically.

Dimension 3: Complexity

Compared to reliability and economic efficiency, complexity is a dimension that may not have received as much attention but can be equally important. As advanced algorithms and complicated rules are implemented in software tools and operating procedures to gain accuracy (a benefit), risks, whether technology-driven or human-driven, also increase (a cost), sometimes by more than the gain. Common risks include undertested features, problems that are impossible to debug in real time, lack of user training, and information overload. Thus, while striving for constant improvement is a good thing, we also need to be careful not to chase incremental gains and forget about the law of diminishing marginal returns. The rule is simple: all else being equal, the least complex operating plan is preferred.

Is there a fourth dimension? Maybe. But before adding another one, always think about the complexity dimension and ask whether the addition is really necessary.

Pareto Efficiency and Power System Operations

Pareto efficiency, or Pareto optimality, refers to a state in which it is impossible to reallocate existing resources to make one individual better off without making at least one other worse off. It is a concept widely used in solving multi-objective optimization problems. If we assume the three dimensions are independent and can be isolated from each other, then operating a power system is equivalent to solving a Pareto efficiency problem.
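For readers who like to see definitions as code, the following is a minimal Python sketch of Pareto dominance and of filtering a set of candidate operating plans down to its Pareto frontier. Each plan is a tuple of scores where higher is better on every metric; the function names and the scores are illustrative assumptions, not part of any operations tool.

```python
def dominates(p, q):
    """True if p is at least as good as q on every metric and strictly better on one."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def pareto_frontier(points):
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (reliability, economic efficiency) scores for four operating plans.
plans = [(0.90, 0.60), (0.95, 0.50), (0.85, 0.55), (0.80, 0.80)]

# (0.85, 0.55) is dropped because (0.90, 0.60) beats it on both metrics.
print(pareto_frontier(plans))
```

The brute-force comparison is quadratic in the number of plans, which is fine for the handful of options an operator typically weighs.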

Let’s start by analyzing a simple case where reliability and economic efficiency are the only metrics of interest. In the figure, the dark blue curve represents the pre-studied Pareto frontier, and the current system operates at point A. Any area beyond the curve is unstudied and, therefore, inoperable. Clearly, the current operating point is Pareto inefficient because a path exists to move away from A and improve reliability or economic efficiency without hurting the other. Suppose a system operator wants to keep the current efficiency level and improve reliability. In that case, he or she can move the operating point to B. Notice that B is on the Pareto frontier, meaning that reliability cannot be further improved without hurting economic efficiency. On the other hand, if the system operator wants to improve the system’s economics and keep the current reliability level, then he or she can push the operating point to D, another Pareto-efficient point. Or, even better, the system can be operated towards point C, at which both reliability and economic efficiency are improved compared to the status quo. In fact, any point on the frontier between B and D is an acceptable operating point because, by moving there, at least one metric is improved and no metric is hurt. The final decision on where to operate depends on the system’s characteristics and the operator’s goal. However, the path to point E is not acceptable because efficiency is improved at the cost of reliability. The same conclusion applies even if point E is on the frontier.
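The walk-through above boils down to a simple rule for judging a move away from A: reject it if any metric gets worse, and accept it if at least one improves. Here is a small Python sketch of that rule. The coordinates for A through E are invented purely to mirror the description (the original figure has no numeric scale), and `classify_move` is a hypothetical helper name.

```python
def classify_move(current, target):
    """Label a move between operating points; higher is better on every metric."""
    deltas = [t - c for c, t in zip(current, target)]
    if any(d < 0 for d in deltas):
        return "rejected"  # some metric is sacrificed
    if all(d > 0 for d in deltas):
        return "improves both"
    if any(d > 0 for d in deltas):
        return "improves one"
    return "no change"


# Hypothetical (reliability, economic efficiency) coordinates on a 0-10 scale.
A = (5, 5)
targets = {"B": (8, 5), "C": (7, 7), "D": (5, 8), "E": (3, 9)}

for name, point in targets.items():
    print(name, classify_move(A, point))
# B improves one
# C improves both
# D improves one
# E rejected
```

Note that E is rejected no matter how much efficiency it gains, which matches the point that being on the frontier does not by itself make a move acceptable.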

Now, let’s add the third dimension. Notice that simplicity, the opposite of complexity, is used so that the axis directions in the plot below are consistent. The idea is similar to before: by moving the current operating point towards the efficient frontier, the system becomes more Pareto efficient. The only difference is that, compared to the two-dimensional case above, the efficient frontier in this case is a surface rather than a curve. A good practice is to always go with the simplest operating plan once the reliability and economic efficiency levels are determined.
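One way to read the “simplest plan” practice is as a two-step selection: first keep the plans that meet the chosen reliability and efficiency levels, then pick the simplest among them. The Python sketch below illustrates that reading. The `Plan` fields, the scores, and the thresholds are all assumptions made up for the example.

```python
from typing import NamedTuple, Optional


class Plan(NamedTuple):
    name: str
    reliability: float
    efficiency: float
    simplicity: float  # higher means less complex


def simplest_acceptable(plans, min_reliability, min_efficiency) -> Optional[Plan]:
    """Among plans meeting both thresholds, return the one with the highest simplicity."""
    acceptable = [
        p for p in plans
        if p.reliability >= min_reliability and p.efficiency >= min_efficiency
    ]
    return max(acceptable, key=lambda p: p.simplicity) if acceptable else None


# Hypothetical candidate operating plans.
candidates = [
    Plan("manual redispatch", 0.95, 0.70, 0.40),
    Plan("market constraint", 0.95, 0.75, 0.50),
    Plan("rely on existing scheme", 0.95, 0.90, 0.90),
    Plan("do nothing", 0.80, 0.95, 1.00),
]

best = simplest_acceptable(candidates, min_reliability=0.95, min_efficiency=0.70)
print(best.name)  # rely on existing scheme
```

The "do nothing" plan is the simplest overall, but it is filtered out first because it misses the reliability level.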

Case study 1: It is 5:30 on a Friday morning, and you have just started your day shift on the Generation Desk. Upon reviewing today’s outages, you realize that a large combined-cycle plant (Unit A) will finish its last fixed-schedule test at 12:00 today and, if everything goes well, will start generating at 13:00. You learn from the night shift that the previous test on that generator was successful. The plant operator is confident of meeting the testing schedule and bringing the unit back on time. Energy trading also knows about it and has submitted bids for the unit starting at 13:00 today. You feel relieved because the cooler-than-normal weather, together with the fixed-schedule tests on the generator, has made procuring flex capacity challenging for the last couple of days. If the unit comes online and generates normally, it will help provide more flex down capacity.

At 6:00, an Outage Coordinator (OC) calls in and asks whether it is possible to accommodate a last-minute request from another combined-cycle plant (Unit B). “They want to schedule a fixed-schedule unit test at Pmax from 10:00 to 12:00, but also indicated that the time can be flexible. They would like to know our decision as soon as possible so they can make arrangements accordingly,” the OC says. You tell the OC that you need to check on something and will call him back. Today will be another cooler-than-normal day, and you know that you will need to move some water during the day to prepare for the precipitation forecasted in two days... Your phone rings again, and as expected, it is the OC.

Question: What do you plan to tell the OC?

Based on the forecasted system conditions, it is better to tell the OC to reschedule the test on Unit B until after Unit A is back online. The test requires Unit B to generate at its Pmax, which, if done before Unit A returns, will reduce the available flex down capacity during the period when it is needed. Also, moving water means that hydro units will likely be self-scheduled at high outputs, further aggravating the flex problem. The OC mentioned that the test time on Unit B is flexible. Hence, it is better to perform the test after Unit A is back and, if possible, during peak load hours when more generation is needed. In this case, by proposing to reschedule the test, the system operator moves the operating point from point A to somewhere between B and D on the efficient frontier, improving reliability without hurting (and probably helping) the system’s economic efficiency.
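A rough back-of-the-envelope calculation shows why the timing matters. In the Python sketch below, flex down capacity is approximated as the room the flexible units have between what they must generate (load minus all fixed-schedule output) and their combined minimum output (Pmin). Every number is hypothetical, load is held constant across both scenarios, and the model ignores ramp rates, interchange, and losses, so treat it as an illustration rather than how a real desk computes flex.

```python
def flex_down_mw(load_mw, fixed_output_mw, flexible_pmin_mw):
    """Approximate downward room on flexible units, in MW."""
    # Flexible units must cover whatever load the fixed-schedule units do not.
    flexible_output = load_mw - sum(fixed_output_mw)
    # They can only back down as far as their combined minimum output.
    return flexible_output - sum(flexible_pmin_mw)


load = 3000             # hypothetical system load (MW)
hydro_self_sched = 800  # hydro moving water at a fixed high output
unit_a_test = 400       # Unit A still on its fixed test schedule
unit_b_test = 500       # Unit B testing at its (assumed) Pmax
other_pmin = [300, 400, 500]  # Pmin of the other flexible units online
unit_a_pmin = 200       # Unit A's Pmin once it is back as a flexible unit

# Scenario 1: Unit B tests while Unit A is still on a fixed schedule.
before = flex_down_mw(load, [hydro_self_sched, unit_a_test, unit_b_test], other_pmin)

# Scenario 2: Unit B tests after Unit A has returned as a flexible unit.
after = flex_down_mw(load, [hydro_self_sched, unit_b_test], other_pmin + [unit_a_pmin])

print(before, after)  # 100 300
```

Under these assumed numbers, waiting for Unit A triples the flex down margin, and running the test during peak hours would widen it further because the load term would be larger.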

Case study 2: You see a post-contingency overload in the real-time contingency analysis tool. There is no sign that the contingency is about to happen, and even if it materializes, you know there is a Remedial Action Scheme (RAS) that can run back the most effective generator to mitigate the overload. The RAS is in service and functioning. There are three choices:

  1. Mitigate by manually dispatching down the unit that the RAS will run back post-contingency.
  2. Enforce the overloaded line as a constraint and activate the contingency in the market. Then, let the market bind the constraint and dispatch units to mitigate.
  3. Run a quick study to confirm that the RAS will mitigate the overload post-contingency without causing other reliability issues, and take no further action.

The most appropriate choice is the third one. First, this is a post-contingency overload that the RAS already protects against. Second, since the RAS is functioning, why make things more complicated by manually dispatching pre-contingency or enforcing the constraint and contingency in the market? Doing so turns a protected probabilistic event into a deterministic one, which gives the same reliability level but is much more expensive and unnecessarily complicated.
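The “probabilistic versus deterministic” point can be illustrated with a simple expected-cost comparison. In the Python sketch below, pre-contingency redispatch costs money whether or not the contingency occurs, while relying on the RAS only costs money if it does. The probability and dollar figures are invented for illustration; real values are hard to estimate, which is exactly why this article keeps risk and money as separate factors.

```python
def expected_cost(p_contingency, cost_if_no_contingency, cost_if_contingency):
    """Probability-weighted cost over the two outcomes."""
    return ((1 - p_contingency) * cost_if_no_contingency
            + p_contingency * cost_if_contingency)


p = 0.01  # assumed chance the contingency occurs during the period

# Choices 1 and 2: out-of-merit redispatch is paid for in either outcome.
redispatch = expected_cost(p, cost_if_no_contingency=20_000, cost_if_contingency=20_000)

# Choice 3: nothing is paid unless the contingency happens and the RAS runs back the unit.
rely_on_ras = expected_cost(p, cost_if_no_contingency=0, cost_if_contingency=50_000)

print(f"redispatch: ${redispatch:,.0f}, rely on RAS: ${rely_on_ras:,.0f}")
```

Even with a post-contingency cost that is assumed to be higher than the redispatch cost, the low probability makes the expected cost of relying on the RAS far smaller.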


Modern power systems should be operated reliably, efficiently, and as simply as possible. As discussed in this article, Pareto efficiency is a useful framework for improving one or more of the three critical dimensions without compromising the others. In real-time operations, it is important to understand the system conditions, evaluate the available options, and operate with these three dimensions in mind.


