**Computing Like A Human**

**Approximate Computing**

Until recently, the goal of computing was to be the fastest. We increased the speed of compute by squeezing out more operations per second. This charge to maximize megahertz hit the ‘power wall’[1] in the mid-2000s. The industry initially made some headway in optimizing for speed *and* power but soon succumbed to increasing workloads. Now, we have to explore a third dimension for optimization: error tolerance, trading-off accuracy to push the envelope of performance and efficiency[2]. This is called Approximate Computing.

#### Today’s Computers Do Not Compute Like Humans.

“If I asked you to divide 500 by 21 and I asked you whether the answer is greater than one, you would say yes right away. You are doing division but not to the full accuracy. If I asked you whether it is greater than 30, you would probably take a little longer, but if I ask you if it’s greater than 23, you might have to think even harder. The application context dictates different levels of effort, and humans are capable of this scalable approach, but computer software and hardware are not like that. They often compute to the same level of accuracy all the time.” — Dr. A. Raghunathan, Purdue University [5].

We’re excited about Approximate Computing because it brings us closer to human-like computational efficiency. Our neurons don’t do exact arithmetic when computing or deterministically fire electric pulses when transmitting information. Rather, our neurons’ judicious tolerance to inexact calculation and communication by our brain cells result in an outsized improvement in speed and energy utilization. Similarly, approximate computing exploits the gap between the level of accuracy required by applications and estimates provided by the computing system to garner disproportionate performance and efficiency gains.

Another aspect of computing like humans is in learning by generating richer context with very little training data [14]. This includes learning visual concepts such as recognizing handwritten characters or learning inferences by observing an expert play a video game. This discussion on ‘learning like humans’ is beyond the scope of this article.

**Numerous Applications of Approximate Computing**

Google in their announcement of the deep learning chip [TPU] this year said: “…because of [approximate computing], we can squeeze more operations per second, use more sophisticated and powerful machine learning models and apply these models more quickly, so users get more intelligent results more rapidly.”

There are several domains applying approximate computing at scale, where the application can tolerate a lack of exactness while seeing significant savings.

- Embedded computing, where battery life or other resources are constrained and applications can live with inexactness. This is true where the input is analog. For example, speech recognition solution turns analog input signals into a sentence, and navigation software turns maps and location estimates from a GPS into driving directions. This is also true where the output is analog such as in audio, image and video processing, since human perception is inherently inaccurate.
- Saving energy when computing in data centers on huge workloads is a significant opportunity. An example of such an application is search, which turns queries over large sets of data into digestible information.
- The notion of approximating a solution through domain knowledge is increasingly evident in applications such as machine learning because these problems typically don’t have exact solutions. The key idea is if there is a task that is repeated
*n*times and each repetition improves the quality of the output, then fewer iterations mean lower quality and cost. That is, cost can be traded for quality. Google’s*PageRank*[6] scheme is a good example because the quality of search, even though inexact, improves with the increased knowledge generated from data. - We can speed up convergent optimization methods such as Gradient Descent by applying approximation strategies. For example, with Gradient Descent, one starts by simply picking an arbitrary point
*X0*that is within a function’s range and takes small steps towards the direction of greatest slope changes — the direction of the gradient — and eventually, after many iterations, we find the function’s minimum. The number of iterations and the time per iteration can be significantly reduced by approximating the step size or the error tolerance after each iteration.

These and hundreds of other applications have already adopted the principles of approximate computing.

**Approximate Computing Requires Innovation Across The Full Computing Stack**

Approximation must be applied across the full computing stack to reap maximum benefits while providing estimated guarantees on the correctness of the computed results at every step. It requires:

- Software engineers to design their applications to take unpredictable and volatile operational conditions, and noisy data into account;
- Algorithm experts to design innovative ways to expose and programmatically dial up (or down) accuracy;
- Database management systems (DBMS) designers to create data models and transactions (store, delete, search, sort, retrieve, etc.) that are imprecise, but even so meet the intended SLAs of the applications built on top of DBMS;
- System solution providers to develop programming languages, libraries and compilers to abstract, apply and test imprecise computing models;
- Hardware engineers to design transistor-level devices and circuits that trade accuracy for computational (and storage) efficiency and energy savings; and
- Test specialists to verify correctness of every component across the stack while being aware of the ‘designed’ range of inexactness.

Today, developers of approximate computing applications and solutions have to manually reason about accuracy, energy consumption and timely execution. Design and implementation of these applications is ad-hoc; software, systems and hardware are all independently developed and integration is non trivial. Mainstream adoption of approximate computing, however, requires a deep understanding of the inherent application resilience and of the full computing stack. This dictates tools and methods that can help programmers systematically explore and reason about the scope and behavior of approximations in applications [3]. For example, Qualcomm’s Zeroth processors [8], IBM’s TrueNorth [9], Intel’s Minerva ALU design [12][13], Lyric Semiconductor’s belief propagation accelerator [10] (acquired by Analog Devices), University of Washington’s REACT [11] and others either apply or model approximate computing principles.

The general approach in approximate computing today is to devise methodical approaches for automating abstraction, development and test of approximate software that runs on current commodity hardware, or will run on new and upcoming approximate hardware.

*This is the first of a 3-part article. **Part-II presents multiple examples of Approximate Computing** adopted in industry and research. Part III will introduce adjacent branches of computing that both employ and advance Approximate Computing.*

**Footnotes & References**

[1] Power Wall refers to power supply limitations or thermal dissipation limitations (or both) — which impose a hard constraint on the total amount of power that processors can consume in a system.

[2] http://rebootingcomputing-ieee.blogspot.com/2014/04/approximate-computing-path-beyond.html

[3] http://drops.dagstuhl.de/opus/volltexte/2016/5800/pdf/dagrep_v005_i011_p151_s15491.pdf

[6]S. Brin. and L. Page (1998) *The anatomy of a large-scale hypertextual Web search engine*. *Computer Networks and ISDN Systems*. 30: 107–117

[7]J. von Neumann (1945) *First Draft of a Report on the EDVAC**.* Archived from the original on March 14, 2013

[9] http://research.ibm.com/cognitive-computing/neurosynaptic-chips.shtml#fbid=M8xHPtHZqLJ

[11] https://homes.cs.washington.edu/~wysem/publications/wysem-msreport.pdf

[12] H. Kaul, M. Anders, S. Mathew, S. Hsu, A. Agarwal, F. Sheikh, R. Krishnamurthy, and S. Borkar (2012) *A 1.45ghz 52-to-162gflops/w variable precision floating-point fused multiply-add unit with certainty tracking in 32nm CMOS.*

[13] https://pdfs.semanticscholar.org/c30f/086b26714adf477b3eee03112ed23d7813d4.pdf

[14] http://www.mit.edu/~tomeru/papers/machines_that_think.pdf