Software Energy-Efficiency: Code Optimization Tactics

Max Meinhardt
10 min read · May 25, 2023


According to Roskilde University's Energy-Aware Software Development Methods and Tools research paper [1], techniques for application software energy efficiency fall into four categories: Computational Efficiency, Low-Level or Intermediate Code Optimization, Parallelism, and Data and Communications Efficiency. In this article, we will explore the definitions of these categories and provide implementation suggestions and unique ideas for incorporating them effectively.

This article belongs to the Code Optimization sub-category, which is a part of the Resource Adaptation category discussed in the first article of this series on Software Energy-Efficiency.

1) Computational Efficiency


Achieving computational efficiency is crucial for optimizing energy consumption in applications, especially when considering single-threaded and multi-threaded scenarios. While the correlation between time and energy is clear for single-threaded applications, the distribution of functionality and resource-utilization density across threads in multi-threaded applications adds complexity to this relationship. However, the development of computationally efficient algorithms plays a significant role in improving the energy efficiency of both architectures.

1.1) Algorithm Design

Energy-efficient algorithm design relies on utilizing algorithms within their intended context and conditions, rather than applying a generic “one-size-fits-all” approach. A helpful perspective is to view energy-efficient algorithm design through the lens of Darwinian theory or the path of least resistance. Recent advancements in AI research, such as the work by scientists in Korea [2] emulating the human brain, demonstrate the benefits of emulating nature’s optimization of resource utilization.

In their research, the scientists realized that the traditional “one-size-fits-all” approach to hardware design, where CPUs and memory utilize a fixed architecture, created a bottleneck. To address this constraint, they developed hardware that can dynamically alter its architecture, similar to the brain’s ability to change the connectivity structure of synapses. This innovative approach resulted in a 37% reduction in energy usage compared to current neural network implementations, without sacrificing accuracy. This breakthrough exemplifies how nature’s path of least resistance model can be applied to both architecture and algorithm design.

It is important to note that the fastest or smallest algorithms may not always be the most energy-efficient. Recursive algorithms, for instance, can be less energy-efficient than their iterative counterparts due to increased stack memory usage.
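To make the recursion-versus-iteration point concrete, below is a minimal sketch of two functionally identical summations in Java. The recursive version allocates one stack frame per element, while the iterative version keeps its working set in two local variables; actual energy use will depend on the JIT compiler, hardware, and input size.

```java
/** Minimal sketch: two functionally identical summations. The recursive
    version allocates one stack frame per element; the iterative version
    uses constant stack space. Actual energy use depends on the JIT,
    hardware, and input size. */
public class SumComparison {
    // Recursive: O(n) stack frames; risks StackOverflowError for large n.
    static long sumRecursive(int n) {
        return (n == 0) ? 0 : n + sumRecursive(n - 1);
    }

    // Iterative: constant stack usage, typically cheaper in time and energy.
    static long sumIterative(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) total += i;
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumRecursive(10_000));
        System.out.println(sumIterative(10_000));
    }
}
```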

1.2) Energy-Efficient Algorithm Design Process

Designing an energy-efficient software algorithm involves following a holistic “path of least resistance” throughout the design process. This approach does not imply doing the minimal amount of work but rather consistently scoping efficiency to minimize the algorithm’s energy footprint while maximizing performance.

Analogous to nature, which maintains efficiency through the path of least resistance, the energy-efficient algorithm design process can be visualized as shown in Figure 1 and described below:

Figure 1: Energy-Efficient Algorithm Design Process
  1. Identify objective and variables: Identify the algorithm’s objective and its dependent and independent variables.
  2. Optimize algorithm architecture: Holistically minimize the compute time, memory requirements, and network utilization needed to reach this objective, drawing on software energy-efficiency best practices.
  3. Assess data outlier probabilities: Evaluate the impact of data outliers on the algorithm’s energy efficiency. If the effect is significantly negative, re-evaluate the algorithm’s architecture by returning to Step 2 to minimize any resulting overhead and inefficiency from additional code required to handle outliers.
  4. Implement: Once the algorithm design is complete, implement it with an understanding of the energy efficiency of programming languages and individual language instructions (e.g., Java [3][7][8]) used in the application. Utilize energy-efficiency static analysis tools to aid in this process.
  5. Measure and assess: Employ dynamic analysis tools to measure the energy efficiency of the software (see the measurement sketch after this list). If this is the initial iteration, use the measurement as a baseline for comparing successive iterations. Repeat Steps 2 through 5 until optimal energy efficiency is achieved.
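Step 5 can be bootstrapped with something as simple as reading the CPU's energy counter before and after a workload. The sketch below assumes a Linux host that exposes Intel RAPL through the powercap interface; the sysfs path and its availability are platform-specific, reading it may require root, and runWorkload() is a hypothetical stand-in for the algorithm under test.

```java
import java.nio.file.Files;
import java.nio.file.Path;

/** Minimal sketch of a Step 5 baseline measurement, assuming a Linux host
    that exposes Intel RAPL via the powercap interface. The path varies
    across machines and reading it may require root privileges. */
public class EnergyBaseline {
    private static final Path RAPL =
            Path.of("/sys/class/powercap/intel-rapl:0/energy_uj");

    static long readMicroJoules() throws Exception {
        return Long.parseLong(Files.readString(RAPL).trim());
    }

    // Hypothetical stand-in for the algorithm under test.
    static void runWorkload() {
        long x = 0;
        for (int i = 0; i < 50_000_000; i++) x += i;
        System.out.println("workload result: " + x);
    }

    public static void main(String[] args) throws Exception {
        long before = readMicroJoules();
        runWorkload();
        long after = readMicroJoules();
        // The counter wraps periodically; a production tool would handle overflow.
        System.out.printf("Energy used: %.3f J%n", (after - before) / 1e6);
    }
}
```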

1.3) Simplify Boolean-Logic Expressions

During software development, there are instances where long and complex boolean logic expressions are required. These expressions can be simplified manually using techniques like Karnaugh maps, or with online tools designed for this purpose, such as the Calculators.tech Boolean Algebra Calculator [6]. Simplification improves code readability and maintainability, and can reduce computational overhead, enhancing energy efficiency.
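As a small illustration, the expressions below are invented for this example: the unsimplified condition and its Karnaugh-map reduction are logically equivalent, but the reduced form evaluates fewer terms.

```java
/** Minimal sketch of boolean simplification. The expressions are
    hypothetical; both forms are logically equivalent, but the reduced
    form evaluates fewer terms. */
public class BooleanSimplification {
    // Original: three ANDed/ORed terms.
    static boolean original(boolean a, boolean b) {
        return (a && b) || (a && !b) || (!a && b);
    }

    // Simplified via a Karnaugh map: collapses to a single OR.
    static boolean simplified(boolean a, boolean b) {
        return a || b;
    }

    public static void main(String[] args) {
        // Verify equivalence over all input combinations (run with -ea).
        for (boolean a : new boolean[]{false, true})
            for (boolean b : new boolean[]{false, true})
                assert original(a, b) == simplified(a, b);
        System.out.println("Equivalent");
    }
}
```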

2) Low-Level or Intermediate Code Optimization

Low-level software energy optimization techniques are typically performed within a compiler, a runtime engine (such as a JVM), or by selecting an energy-efficient programming language. Additionally, an application-specific power-scaling algorithm can be implemented in middleware that acts as an interface between the runtime engine and the OS kernel's DVFS-based (dynamic voltage and frequency scaling) [9][10][11][12][13] power-governor API. For more details on this implementation, refer to Section 3.1 (DVFS) below.

Intermediate code optimizations focus on individual or small groups of code. It is ideal to start by using an energy-efficiency static analysis tool to identify and classify energy inefficiencies in the code. This tool can be likened to an application security vulnerability static analysis tool that automatically categorizes findings as Critical, High, Medium, or Low impact. If an energy static analysis tool is not available, manual code analysis becomes an alternative. In the case of Java, several research papers [3][7][8] delve into this topic.

2.1) Selecting Optimal Programming Languages

It is a common misconception that the fastest programming language is always the most energy-efficient. However, as Pereira et al. highlight [14], this is not always the case, as demonstrated in Table 1 below.

The speed and energy efficiency of a programming language can vary depending on the specific use case. Some languages offer universality and cross-platform capabilities, while others excel in specific tasks and environments. It’s also important to consider how libraries and extensions are implemented, as they can impact both speed and energy utilization. For example, the implementation of event handling can significantly influence energy efficiency. In most cases, interrupt-driven event handling proves to be more energy-efficient than using a polling mechanism.
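To make the polling-versus-interrupt-driven point concrete, here is a minimal, hypothetical sketch in Java: the polling path keeps the core busy checking a flag, while the blocking path parks the thread until work arrives. Real event sources (hardware interrupts, epoll) differ, but the energy intuition is the same.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Minimal sketch contrasting polling with event-driven handling.
    Hypothetical example; real event sources differ, but the energy
    intuition carries over. */
public class EventHandling {
    private final BlockingQueue<String> events = new ArrayBlockingQueue<>(16);
    private volatile String polledEvent;

    // Polling: repeatedly invoking this keeps the core busy even when idle.
    void pollOnce() {
        if (polledEvent != null) {
            handle(polledEvent);
            polledEvent = null;
        }
    }

    // Event-driven: take() blocks, letting the OS idle the core until work arrives.
    void handleNextEvent() throws InterruptedException {
        handle(events.take());
    }

    void handle(String event) {
        System.out.println("handled " + event);
    }

    public static void main(String[] args) throws InterruptedException {
        EventHandling eh = new EventHandling();
        Thread consumer = new Thread(() -> {
            try {
                eh.handleNextEvent();
            } catch (InterruptedException ignored) { }
        });
        consumer.start();
        eh.events.put("sensor-reading"); // wakes the blocked consumer
        consumer.join();
    }
}
```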

Furthermore, employing multiple programming languages within an application can help optimize energy efficiency, speed, and memory usage, as each language can be leveraged for its strengths in different components or modules.

Table 1: Energy, speed, and memory use of programming languages (Pereira et al. [14])

3) Parallelism

Multi-threaded applications that utilize multiple CPU cores generally exhibit higher energy efficiency compared to their single-threaded counterparts, thanks to their ability to effectively utilize computing resources. However, the energy efficiency of multi-threaded applications depends on how their functionality is distributed across process threads and the CPU cores that host them.

3.1) DVFS

In certain scenarios, multi-threaded applications can optimize their energy efficiency at runtime by utilizing a CPU's Dynamic Voltage and Frequency Scaling (DVFS) [9][10][11][12][13] technology, which allows for controlling the frequency of its cores. By default, the OS kernel's driver automatically manages this technology. However, it is possible to override the default behavior by setting the CPU's power governor to userspace mode, enabling manual configuration. The OS's thread-affinity runtime library can then be used to manually assign process threads with different speed requirements to cores operating at varying speeds.
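As a concrete illustration of the userspace override, the sketch below writes to the standard Linux cpufreq sysfs files. It assumes root privileges, a driver that supports the userspace governor, and core and frequency values valid for the machine; it is a sketch, not a production power-management policy.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Minimal sketch: switching a core's cpufreq governor to "userspace"
    and pinning a frequency via the standard Linux sysfs paths. Assumes
    root privileges and a driver that supports the userspace governor. */
public class DvfsControl {
    private static Path cpufreq(int core, String file) {
        return Path.of("/sys/devices/system/cpu/cpu" + core + "/cpufreq/" + file);
    }

    public static void setUserspaceFrequency(int core, long khz) throws IOException {
        // Override the default governor so the frequency can be set manually.
        Files.writeString(cpufreq(core, "scaling_governor"), "userspace");
        // Request the target frequency in kHz; voltage follows automatically.
        Files.writeString(cpufreq(core, "scaling_setspeed"), Long.toString(khz));
    }

    public static void main(String[] args) throws IOException {
        // Example: run core 2 at 1.2 GHz for a low-power I/O-bound thread.
        setUserspaceFrequency(2, 1_200_000);
    }
}
```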

It's worth noting that application-level control of these DVFS defaults can be particularly useful for applications with dedicated hardware, such as embedded IoT devices. In such cases, the CPU's energy profile can be tailored specifically for the application during selected phases of runtime.

When a CPU changes a core’s frequency via DVFS, its voltage is automatically altered.

In a multi-tenant environment, running multiple instances of the same application with identical energy profiles on one CPU could enhance energy efficiency. However, if the system architecture leads to conflicting energy profiles among processes, with no synergistic effects, letting the application control DVFS directly may defeat the purpose and yield worse CPU energy efficiency than leaving the system governor in charge.

Additionally, it is important to note that due to architectural and hardware access restrictions, cloud-hosted applications cannot directly control their DVFS CPU registers. However, the underlying cloud orchestration implementer may choose to exercise control over DVFS settings.

3.1.1) Energy Profiling for DVFS

Assigning process threads of an application to specific CPU cores should be based on the application’s energy profile. For instance, a CPU core can be designated to host an application thread that predominantly performs I/O operations and spends a significant amount of time in a wait state. Such a thread would be classified as low power in the application’s energy-efficiency static analysis and can be assigned to a low-frequency CPU core with a similar classification during application initialization.
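Standard Java exposes no thread-affinity API, so the sketch below pins the whole JVM process to a (presumably low-frequency) core using the Linux taskset utility; per-thread pinning would require native thread IDs via JNI or a third-party library. The core number is illustrative.

```java
import java.io.IOException;

/** Minimal sketch: pinning the current JVM process to a specific core
    with the Linux taskset utility. Per-thread pinning would need native
    thread IDs (JNI or a third-party library); core 2 is illustrative. */
public class PinToCore {
    public static void main(String[] args) throws IOException, InterruptedException {
        long pid = ProcessHandle.current().pid();
        // -a applies the mask to all of the process's threads; -c takes a core list.
        new ProcessBuilder("taskset", "-a", "-cp", "2", Long.toString(pid))
                .inheritIO()
                .start()
                .waitFor();
        // ... run the I/O-bound workload here ...
    }
}
```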

Initializing an application’s energy parameters can involve running it in separate passes, with each subsequent pass refining the distribution of resources to DVFS-configured CPU cores. The settings derived from these passes can then be used for runtime execution. An example of this technique, as applied to Java applications, is presented in the research paper titled “Vincent: Green Hot Methods in the JVM” [11]. The authors’ approach, implemented in their tool called VINCENT, achieves 14.9% energy savings compared to the built-in power management in Linux. The paper provides a high-level overview of the four execution passes as follows:

  • Phase 1: Hot Method Selection — VINCENT obtains a list of hot methods.
  • Phase 2: Energy Profiling — VINCENT profiles the energy consumption of hot methods under the default ONDEMAND CPU governor, ranks their energy consumption, and reports a list of the top energy-consuming methods as output.
  • Phase 3: Frequency Selection — For each top energy-consuming method, VINCENT observes the energy consumption and execution time of the application at different CPU frequencies (configurations). It ranks the efficiency of different configurations based on energy metrics and selects the most efficient one for each method (a conceptual sketch follows this list).
  • Phase 4: Energy Optimization — VINCENT runs the application with each top energy-consuming method scaled to the CPU frequency determined in the Frequency Selection phase.
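The frequency-selection idea of Phase 3 can be sketched as a simple search. To be clear, this is not VINCENT's implementation, only an illustration of the concept: the candidate frequencies are invented, and measureEnergyJoules() is a hypothetical stand-in for a real measurement harness such as RAPL.

```java
import java.util.List;

/** Conceptual sketch of Phase-3-style frequency selection; NOT the
    VINCENT implementation. Frequencies are illustrative, and
    measureEnergyJoules() is a hypothetical measurement stand-in. */
public class FrequencySelection {
    // Candidate core frequencies in kHz (illustrative values).
    static final List<Long> FREQUENCIES = List.of(800_000L, 1_600_000L, 2_400_000L);

    static long selectMostEfficient(Runnable hotMethod) {
        long best = FREQUENCIES.get(0);
        double bestEnergy = Double.MAX_VALUE;
        for (long freq : FREQUENCIES) {
            setCoreFrequency(freq);                   // e.g., via sysfs (Section 3.1)
            double joules = measureEnergyJoules(hotMethod);
            if (joules < bestEnergy) {
                bestEnergy = joules;
                best = freq;
            }
        }
        return best; // frequency with the lowest measured energy for this method
    }

    // Platform-specific stubs, left unimplemented in this sketch.
    static void setCoreFrequency(long khz) { /* write scaling_setspeed */ }
    static double measureEnergyJoules(Runnable r) { r.run(); return 0.0; }
}
```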

4) Data and Communications Efficiency


In a research paper by Roskilde University [1], the authors state:

“Energy can be saved by minimizing data movement. This can be achieved by writing software that reduces data movement by using appropriate data structures, by understanding and exploiting the underlying system’s memory hierarchy and by designing multi-threaded code that reduces the cost of communication among threads.

“For example, the size of blocks read and written to memory and external storage can have a major impact on energy efficiency, while memory layout of compound data structures should match the intended usage in the algorithm, so that consecutively referenced data items are stored adjacently if possible. In multi-threaded code, consolidating all read-writes to or from disk to a single thread can reduce disk contention and consequent disk-head thrashing. Furthermore, knowledge of the relative communication distances for inter-core communication can be used to place frequently communicating threads close to each other thus reducing communication energy costs.”
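The memory-layout advice can be illustrated with a minimal sketch. Java 2D arrays are arrays of row arrays, so row-major traversal touches adjacent addresses while column-major traversal strides across rows, causing more cache misses and more data movement for the same result.

```java
/** Minimal sketch: matching traversal order to memory layout. Both
    methods compute the same sum, but row-major traversal follows the
    layout of Java's row arrays, while column-major traversal strides
    across rows and incurs more cache misses. */
public class TraversalOrder {
    static long sumRowMajor(int[][] m) {
        long total = 0;
        for (int i = 0; i < m.length; i++)
            for (int j = 0; j < m[i].length; j++)
                total += m[i][j];          // consecutive addresses
        return total;
    }

    static long sumColumnMajor(int[][] m) {
        long total = 0;
        for (int j = 0; j < m[0].length; j++)
            for (int i = 0; i < m.length; i++)
                total += m[i][j];          // strided access across rows
        return total;
    }

    public static void main(String[] args) {
        int[][] m = new int[2048][2048];
        // Same result, different cache behavior and energy cost.
        System.out.println(sumRowMajor(m) == sumColumnMajor(m));
    }
}
```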

Conclusion

Efficient software code optimization is crucial for achieving energy efficiency. By focusing on computational efficiency, low-level or intermediate code optimization, parallelism, and data and communications efficiency, developers can reduce energy consumption.

Efficient algorithm design considers the intended context and follows the path of least resistance. It’s important to note that the fastest or smallest algorithms may not always be the most energy-efficient.

Low-level code optimization involves selecting energy-efficient programming languages and utilizing static analysis tools. Intermediate code optimization improves code readability and reduces computational overhead.

Parallelism, especially in multi-threaded applications, maximizes computing resources. Techniques like Dynamic Voltage and Frequency Scaling (DVFS) enable manual control of core frequencies and thread assignment.

Improving data and communications efficiency reduces energy consumption. Minimizing data movement, optimizing memory layout, and consolidating disk read-writes in multi-threaded code are effective strategies.

By incorporating these code optimization techniques, developers contribute to energy efficiency and promote a greener computing environment. Efficient code reduces energy consumption and improves overall performance.

Articles in this series

The following articles are part of this comprehensive series that explores the energy-efficiency tactics in software architecture and implementation. The first article shown below contains a diagram that is described in the remaining four articles.

References

[1]: Roskilde University, University of Bristol, IMDEA Software Institute, and XMOS Limited. 2016. ENTRA 318337 — Whole-Systems ENergy TRAnsparency: Energy-Aware Software Development Methods and Tools. http://entraproject.ruc.dk/wp-content/uploads/2016/03/deliv_1.2.pdf. Accessed 2022-06-27.

[2]: ScienceDaily. “Energy-efficient AI hardware technology via brain-inspired stashing system?” Published 2022-05-17, sourced from the Korea Advanced Institute of Science and Technology (KAIST). https://www.sciencedaily.com/releases/2022/05/220517210435.htm. Accessed 2022-07-05.

[3]: Mohit Kumar. 2019. “Improving Energy Consumption of Java Programs.” Wayne State University Dissertations, 2325.

[4]: Amazon Web Services. Compression Encodings. Amazon Redshift Database Developer Guide. https://docs.aws.amazon.com/redshift/latest/dg/c_Compression_encodings.html. Accessed 2022-05-07.

[5]: Amazon Web Services. Amazon Redshift Engineering’s Advanced Table Design Playbook: Compression Encodings. https://aws.amazon.com/blogs/big-data/amazon-redshift-engineerings-advanced-table-design-playbook-compression-encodings/. Accessed 2022-06-26.

[6]: Calculators.tech. Boolean Algebra Calculator. https://www.calculators.tech/boolean-algebra-calculator. Accessed 2022-06-26.

[7]: Gustavo Pinto, Kenan Liu, Fernando Castor, and Yu David Liu. 2016. “A Comprehensive Study on the Energy Efficiency of Java’s Thread-Safe Collections.”

[8]: Mohit Kumar, Youhuizi Li, and Weisong Shi. 2017. “Energy Consumption in Java: An Early Experience.” 2017 IGSC Track on Contemporary Issues on Sustainable Computing.

[9]: Rafael J. Wysocki. 2017. CPU Performance Scaling. Intel Corporation. https://www.kernel.org/doc/html/v4.12/admin-guide/pm/cpufreq.html. Accessed 2022-06-22.

[10]: Rafael J. Wysocki. 2017. intel_pstate CPU Performance Scaling Driver. https://www.kernel.org/doc/html/v4.12/admin-guide/pm/intel_pstate.html. Accessed 2022-06-25.

[11]: Kenan Liu, Khaled Mahmoud, Joonhwan Yoo, and Yu David Liu. 2022. “Vincent: Green Hot Methods in the JVM.” 29 pages.

[12]: Kenan Liu, Gustavo Pinto, and Yu David Liu. 2015. “Data-Oriented Characterization of Application-Level Energy Optimization.” 12 pages.

[13]: Timur Babakol, Anthony Canino, Khaled Mahmoud, Rachit Saxena, and Yu David Liu. 2020. “Calm Energy Accounting for Multithreaded Java Applications.” 8 pages.

[14]: Rui Pereira, Marco Couto, Francisco Ribeiro, Rui Rua, Jácome Cunha, João Paulo Fernandes, and João Saraiva. 2017. “Energy Efficiency across Programming Languages.” In Proceedings of SLE’17, Vancouver, BC, Canada, October 23–24, 2017. 12 pages. https://doi.org/10.1145/3136014.3136031. Accessed 2022-05-07.
