The eScience of Battery Research

How digitalization matters and which data enable it

Developing better batteries is an incredibly hard puzzle, full of unanticipated setbacks (Bullis 2015). Fortunately, it is not only the amount of research that is changing, but also how new discoveries are made and commercialized. One improvement is to intensify learning across multiple disciplines, academia and industry (a topic for another post). This post analyzes how battery research benefits from data-driven discovery and engineering.

The idea of designing batteries on the computer is simple: it’s cheaper and faster than iterating through all possible designs in the real world. The need to reap cost and time savings across the entire scale with predictive models has been summarized by Pesaran et al. (2011, see Fig. 1).

Fig. 1. Pesaran et al. (2011) on Computer-Aided Engineering for Electric Drive Vehicle Batteries (CAEBAT).

Simulation tools are available from various sources and for various scopes and purposes: material discovery, lifetime prediction, and thermal, cost and abuse modeling. They are also coveted because of the aggregate knowledge and elegant application of theory they embody. For example, around 1998 a techno-economic battery model was developed at a national lab, and now “GM uses it, everyone is using it” but “they don’t tell you what they’ve done” (Crabtree 2015).

What matters for all modeling purposes is the need to validate models against, or build them from, real-world data. Certain types of data are an essential ingredient for progress, and they often have to come from other organizations and researchers. Therefore, how data is produced and exchanged along the value-chain warrants a detailed look.

Ab initio calculation, or simulation from first principles, has become good practice for screening large arrays of candidate materials and selecting only the most promising for further real-world experiments. Allocations at XSEDE for computational studies list around eight to ten discovery projects per year specifically for better batteries. A powerful way to make computational, and to some extent experimental, results available to the research community is through databases with well-defined access protocols (Jain et al. 2016). Such databases are grounded in the larger field of materials science and have been adopted by industry and academics. As for concrete results, in early 2015 the Materials Project registered around 8 million downloaded records and 12,000 users, with 71% from academia and 12.6% from industry (Persson, LBL Industry Day 2015). One year later, the number of users had climbed to over 17,000. The resounding enthusiasm from the battery and automotive industries evidences the value of such large-scale repositories (e.g. being “incredibly happy” about the effort, appreciating the project’s “free and easy access” and reducing weeks of work to 15 minutes, ibid.). The Materials Project also offers workflows to accept user contributions, thereby laying the foundation for scalable collaboration through data-exchange procedures (Qu et al. 2016). Moreover, integrated “Apps” such as the battery explorer, part of the Materials Project, help researchers reduce the cost of discovering beyond-lithium-ion designs (Chao 2016).
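The filter-and-rank step of such a screening can be sketched in a few lines. This is only an illustration under stated assumptions: real screening relies on first-principles properties pulled from a database, whereas the sketch below uses the textbook formula for theoretical gravimetric capacity, Q = n·F / (3.6·M), on a small hand-typed list. The candidate list, the function names and the 160 mAh/g threshold are hypothetical and chosen for illustration.

```python
FARADAY_C_PER_MOL = 96485.33212  # Faraday constant

def theoretical_capacity_mah_per_g(molar_mass_g_per_mol, electrons=1):
    """Theoretical gravimetric capacity Q = n * F / (3.6 * M) in mAh/g."""
    return electrons * FARADAY_C_PER_MOL / (3.6 * molar_mass_g_per_mol)

# Hypothetical screening input: (label, molar mass in g/mol, electrons per formula unit).
candidates = [
    ("LiCoO2", 97.87, 1),
    ("LiFePO4", 157.76, 1),
    ("LiMn2O4", 180.81, 1),
]

def screen(materials, min_capacity_mah_per_g):
    """Rank materials by theoretical capacity and keep those above the threshold."""
    ranked = sorted(
        ((name, theoretical_capacity_mah_per_g(mass, n)) for name, mass, n in materials),
        key=lambda item: item[1],
        reverse=True,
    )
    return [(name, q) for name, q in ranked if q >= min_capacity_mah_per_g]

# Illustrative threshold, not a recommendation.
shortlist = screen(candidates, 160.0)
for name, capacity in shortlist:
    print(f"{name}: {capacity:.1f} mAh/g")
```

In practice the same pattern would run over thousands of database records and combine several computed properties (voltage, stability, volume change) rather than a single figure of merit.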

Yet systematic errors due to approximate computations or incomplete theories must be resolved with knowledge from the real world (Nonengo 2016). An example where experimental data and simulation work in lockstep to advance theory and practice is the understanding of novel composite anode materials. Adding silicon to the mix increases the capacity to store ions per given volume. However, this volume also shrinks and expands with the state of charge to a greater extent, fractures its surroundings and ultimately shortens a battery’s lifetime. To pin down this problem, Higa & Srinivasan (2015) used empirical volume expansion data as a basis and were able to reliably quantify and qualify the stress a binder has to accommodate between anode and collector.

Even if computation can guide the selection of new materials and predict their required properties, the problem of how to synthesize them remains. Sharing knowledge about the synthesis of materials increases reproducibility for experimentalists and would also unlock this domain for computer-aided discovery. One way to gain structured knowledge is to reverse engineer tacit but unstructured knowledge. For example, leading scientists are “trying to develop machine-learning algorithms to extract rules from known manufacturing processes to guide the synthesis of compounds” (Nonengo 2016). A purely data-driven machine learning model has already outperformed the intuition of experts in a narrow domain: predicting the success of synthesizing vanadium compounds (Raccuglia et al. 2016). Structured data on materials and their chemical reactions are thus a viable source of innovation and a fundamental link towards software-designed batteries.

Cells of non-flow batteries are made by sandwiching layers of materials between conductive sheets and optionally adding safety mechanisms. A virtual design at this scale makes it possible to find better geometries and material compositions while avoiding drawbacks from mechanical or thermal stress. The CAEBAT project, mentioned at the beginning, matured over the years and included funding for competing modeling solutions at (or around) this scale. These solutions were co-created by three industrial partners and academia, are in use by over 60 licensees and feature an open source variant. However, “electrochemical/thermal parameter identification is an intrinsically under-determined problem” (Smith et al. 2015). In other words, the right characterization data to parameterize such models is in high demand (see Fig. 2).
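What “under-determined” means can be shown with a deliberately tiny example. The sketch below is not the model used by Smith et al.; it assumes a hypothetical cell whose terminal voltage depends only on the sum of two internal resistances. All names and numbers are made up for illustration. Two different parameter sets then reproduce the same measured voltages, so terminal data alone cannot tell them apart.

```python
def terminal_voltage(current_a, ocv_v, r_contact_ohm, r_electrolyte_ohm):
    """Toy model: voltage drop depends only on the sum of both resistances."""
    return ocv_v - current_a * (r_contact_ohm + r_electrolyte_ohm)

# Hypothetical discharge currents at which the cell is "measured".
currents_a = [0.5, 1.0, 2.0, 4.0]

# Two different parameter sets with the same total resistance of 0.040 ohm.
set_a = {"ocv_v": 3.7, "r_contact_ohm": 0.010, "r_electrolyte_ohm": 0.030}
set_b = {"ocv_v": 3.7, "r_contact_ohm": 0.025, "r_electrolyte_ohm": 0.015}

curve_a = [terminal_voltage(i, **set_a) for i in currents_a]
curve_b = [terminal_voltage(i, **set_b) for i in currents_a]

# Largest disagreement between the two predicted voltage curves.
max_gap_v = max(abs(a - b) for a, b in zip(curve_a, curve_b))
print(f"largest voltage difference: {max_gap_v:.2e} V")
```

Only an additional, independent measurement (for example of one resistance in isolation) would separate the two sets, which is one way to read the demand for the right characterization data.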

Fig. 2. Smith et al. (2015) on Parameter Identification of multi scale models.

Even without explicit software that translates data into predictive models, observable data from charging and discharging cells is coveted (e.g. time series data of temperature, voltage, impedance, current and gaseous emissions). Jeff Dahn pioneered characterization methods that make it possible to draw conclusions about a cell’s long-term degradation within a short period of time. Industry embraced this innovation and sent cells with novel additive formulations for the electrolyte to be tested, without disclosing what the formulations are (Dahn 2013). Specifically, the combination of additive formulas and ageing data is compact and valuable. This value increases once many variants are contrasted, because that makes it possible to home in on the complex but usually synergistic performance effects of additives (Wang et al. 2014). Wildcat Discovery Technologies perfected the process of data-driven optimization of chemistries. Not unlike high-throughput assays for biomarkers, ideal combinations are found by rapidly combining test results from prototype cells with hypotheses that are partly formulated through the use of software.
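To make the value of such time series concrete, here is a minimal sketch of one common summary metric, coulombic efficiency (charge delivered on discharge divided by charge put in). It is an assumption that this metric is a fair stand-in for the characterization methods mentioned above; the post does not spell them out. The sample data and function names are hypothetical.

```python
def integrate_charge_ah(times_s, currents_a):
    """Trapezoidal integral of current magnitude over time, returned in Ah."""
    total_as = 0.0
    for k in range(1, len(times_s)):
        dt_s = times_s[k] - times_s[k - 1]
        total_as += 0.5 * (abs(currents_a[k]) + abs(currents_a[k - 1])) * dt_s
    return total_as / 3600.0

def coulombic_efficiency(charge_segment, discharge_segment):
    """Ratio of discharged to charged capacity for one cycle."""
    q_charge = integrate_charge_ah(*charge_segment)
    q_discharge = integrate_charge_ah(*discharge_segment)
    return q_discharge / q_charge

# Hypothetical cycle: 1 A charge for 3600 s, then 1 A discharge for 3582 s.
charge = ([0.0, 1800.0, 3600.0], [1.0, 1.0, 1.0])
discharge = ([0.0, 1791.0, 3582.0], [-1.0, -1.0, -1.0])

efficiency = coulombic_efficiency(charge, discharge)
print(f"coulombic efficiency: {efficiency:.4f}")
```

Small deviations from 1.0 only become meaningful with very accurate current and time measurement, which is why the quality of the underlying time series matters so much.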

So far, model-led R&D has been seen as a rarely adopted method for developing new chemistries. However, usage by consortia and private companies alike signals an uptake in that direction (e.g. Franco 2013, ExZellTUM, Ricardo). Continuous improvements in modeling the electrochemical state of cells through observable proxy data also generate returns within shorter time-frames than rolling out entirely new chemistries. SEI formation is one of the last steps that influence the performance of a cell before it is shipped from the manufacturer. The formation process can be gauged by total exo- and endothermic effects. Incorporating the net heat balance makes it possible to devise generalized protocols for ideal SEI formation in lithium-ion cells at scale (Xu 2016).

An accurate estimate of the internal state of battery cells (also summarized as state of function or SoF) is vital to utilize the full capacity of packs without damaging cells unknowingly. However, once cells are strung in series and tightly packed into systems, less data, constrained compute resources and greater variance in exogenous factors make this estimate harder. Less data per cell is available than in the lab or after manufacturing because the hardware to collect it is expensive. Factors that vary in more unforeseen ways are, for example, load, ambient temperature and, in the case of electric vehicles, vibration (Bruen et al. 2016).
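As a point of reference, the simplest form of state estimation is coulomb counting, which tracks state of charge (one ingredient of SoF) by integrating the measured current. The sketch below is a minimal illustration under assumptions, not how any particular BMS works: the function name, the sign convention (positive current means discharge) and the sample values are hypothetical.

```python
def coulomb_counting_soc(soc_start, capacity_ah, samples):
    """Open-loop state-of-charge estimate.

    samples: iterable of (duration_s, current_a), positive current = discharge.
    Returns the SoC after each sample, clamped to the range [0, 1].
    """
    soc = soc_start
    trace = []
    for duration_s, current_a in samples:
        soc -= current_a * duration_s / 3600.0 / capacity_ah
        soc = min(1.0, max(0.0, soc))
        trace.append(soc)
    return trace

# Hypothetical 50 Ah cell starting at 80 % SoC.
load_profile = [
    (600.0, 25.0),    # 10 min discharge at 25 A
    (600.0, -10.0),   # 10 min charge at 10 A (e.g. regenerative braking)
    (1200.0, 50.0),   # 20 min discharge at 50 A
]

soc_trace = coulomb_counting_soc(0.8, 50.0, load_profile)
print([round(s, 4) for s in soc_trace])
```

Because the estimate is open-loop, any bias in the current sensor or in the assumed capacity accumulates over time, which is one reason why sparse data and varying operating conditions make the problem hard in packs.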

To partly compensate for the higher uncertainty in estimating the true SoF of cells, long-term data needs to be fed back into the models of battery management systems (BMS). The National Renewable Energy Laboratory tackles this problem by integrating various partners into a joint program. One automotive partner, Ford, supplied end-of-life packs and requested a validated prediction model in return, while other long-term data is produced with pack manufacturers. This makes it possible to validate assumptions in the operating strategy and account for variations between cells, and it reduces the risk in guaranteeing performance. More long-term data is still needed (Smith et al. 2015). One particular target for data-driven optimization is constraining lithium plating while charging (Jossen & Keil 2016). Lithium plating also depends on depth of charge and temperature, and it is only partly reversible over time. Because it is complex to manage the various degradation pathways accurately, BMSs mix and match physics-based models with empirical models (Wu et al. 2016). One problem is to maximize the representativeness of synthetic tests for a given use case. This gap is reduced by using realistic load cycles to benchmark cells (e.g. like is hosting them from drive cycles).

Even if BMSs are perfected over time for well-known chemistries, four reasons ensure that optimizing control strategies remains a moving target. First, new chemistries such as lithium-sulfur require significantly different state estimation and charging protocols (Fotouhi et al. 2016). Second, an increase in processing power and streamlined algorithms makes it possible to model closer to physical principles and at higher dimensions (Northrop et al. 2014). Third, new types of data can be incorporated into online SoF estimation with new sensors. In several longshot R&D initiatives of the AMPED program, new sensors such as pressure sensors have been explored. Fourth, mass production of fixation systems with integrated sensors makes gathering granular data feasible (e.g. temperature per cell, see Fig. 3).

Fig. 3. Monitoring Laminated Busbar by Mersen (reported by Yole Development 2016)

The idea of an IoT-like connected battery is far from new. What has changed, though, are the economics: more expensive batteries deployed means more capital expenditure and thus greater leverage from incorporating operational data and in-depth measurements from the lab. Furthermore, the acquisition and storage of select data is facilitated by the need to evidence warranty claims.

Digitalization in battery R&D occurs on two important trajectories: knowledge dissemination and knowledge creation through predictive modeling. The capability to disseminate knowledge with low overhead builds upon existing or emerging data standards. Predictive models are built and parameterized on such codified data, which can then be used to prototype new designs or operate systems more effectively (see Fig. 4 for an overview).

Fig. 4. Digitalization of battery R&D across the value-chain: influential simulation software and data repositories. Illustrations from Cheng et al. (2015), Franco (2013) and Faguy (2015).

A need to exchange data arises under two conditions that are prevalent across the value-chain. First, data is exchanged where pooling resources to obtain it makes sense but its utilization happens downstream. This is, for example, the case for simulations on high performance computers, for specialized labs that can leverage economies of scale, or where scarce talent is needed to craft software solutions. Second, data is exchanged if it becomes more powerful in aggregate than in isolation. This is, for example, the case if a few out of many new candidate materials must be compared and selected, if better theory can be deduced from aggregate data, if cells from different vendors must be compared effectively, or if large numbers of cells must be sampled to gauge their variance.

It’s exciting to see the principle of eScience come to fruition in battery research. Moreover, structured data and computation are not only utilized in basic science but are part of the entire value-chain: from material properties to the synthesis of such materials and the characterization of novel and established cell chemistries under real-life conditions. For certain constellations, exchanging data is a win-win situation. Our mission is to assist the research community with ideal rules for data-driven collaboration in consortia. Specifically, we look into license and attribution rules that make commercial sense while removing some of the frictions associated with pooling intellectual property.

Stay tuned for more on this topic. If you have questions or remarks, don’t hesitate to send an e-mail or use the comment section.

Alexander Hirner

OpenBatt’s mission is to simplify data-sharing for battery research collaborations. #OpenInnovation #eScience
