Comparison of Open Datasets for Lithium-ion Battery Testing
This story is contributed by Abolfazl Shahrooei.
- Testing of Li-ion batteries is costly and time-consuming, so publicly available battery datasets are a valuable resource for comparison and further analysis.
- Fourteen publicly available datasets are reviewed in this article and cell types, testing conditions, charge/discharge profiles, recorded variables, dates of experiments, and links to the datasets are provided.
- A Google spreadsheet of the open datasets is provided here as a resource to be updated continuously as a comprehensive table of open datasets.
Lithium-ion (Li-ion) batteries are widely used in many aspects of our lives, including consumer electronics, transportation, and the electrical grid. However, many challenges remain before they can be used reliably to promote a sustainable future through electrification. A vast amount of research has been conducted to address these challenges, including work on battery design, modeling, state estimation, and lifetime diagnosis and prognosis. Validating the results of this research requires a significant amount of experimental testing, and conducting such tests is resource-intensive and time-consuming. Not only is specialized equipment needed, such as multi-channel cyclers, potentiostats, and thermal chambers, but a typical reliability test for battery degradation may require more than six months of uninterrupted cycling. Several battery research groups have made their Li-ion datasets publicly available for further analysis and comparison by the wider community. This article introduces several of the most well-known open datasets for battery testing.
The Prognostic Center of Excellence (PCoE) at NASA Ames publishes the Prognostic Data Repository. This data repository is intended for developing prognostic algorithms and includes the following four battery datasets:
- PCoE Battery Dataset
- Randomized Battery Usage Data Set
- HIRF Battery Data Set
- Small Satellite Power Simulation Data Set
These datasets can be accessed at NASA Datasets. The first two datasets provide cycling data for commercial cells and are described in further detail below. The last two datasets contain data from testing of battery packs for a small aircraft and a small satellite.
PCoE Battery Dataset
In this dataset, 34 cells in 18650 format with 2 Ah capacity were cycled to 70% or 80% of initial capacity at different temperatures using a custom-built battery tester. Cycling consisted of three operational profiles: charging, discharging, and electrochemical impedance spectroscopy (EIS). All experiments used a CC-CV charging profile at 1.5 A to 4.2 V with a cutoff current of 20 mA. However, different discharging profiles were applied to induce degradation based on more realistic usage. EIS was conducted with a frequency sweep from 0.1 Hz to 5 kHz.
The table below summarizes this experimental dataset, which consists of six groups of cells.
This dataset has been widely used, and among the experimental groups, Group 6 is the most commonly used. However, because of continuous quality and technology improvements, today’s batteries typically have longer lifetimes than the cells in this dataset.
Randomized Battery Usage Dataset
This dataset presents data for 28 LCO cells in 18650 format with 2.1 Ah capacity, divided into seven groups. Five groups of cells were cycled at room temperature and the other two groups were cycled at 40 °C. The dataset was recorded in 2014.
Cycling for each cell consisted of two types of charge/discharge cycles: Random Walk (RW) cycling and pulsed load characterization cycling. For each cell, after every set of 50 random walk cycles, a pulsed load characterization cycle was performed. In RW cycling, the current profile for both discharge and charge was changed every five minutes to a value randomly selected from the set of 0.5, 1, 1.5, 2, 2.5, 3, 3.5, or 4 A. In pulsed load cycling, the discharge profile consisted of a rest period of 20 minutes followed by a loading period of 10 minutes at 1 A. Randomized loads were applied during discharge in an effort to simulate more realistic usage.
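The random walk load described above can be sketched in a few lines of Python. This is only an illustration of the sampling rule stated in the text (a new current drawn from a fixed set every five minutes); the function name `random_walk_profile` is hypothetical, and the sketch ignores voltage limits, rest periods, and the sign convention used in the actual dataset files.

```python
import random

# Current magnitudes (in amperes) listed in the text for random walk cycling.
RW_CURRENTS_A = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
STEP_DURATION_S = 5 * 60  # the current changes every five minutes


def random_walk_profile(n_steps, seed=None):
    """Return a list of (start_time_s, current_a) tuples for one RW segment.

    Hypothetical helper: it only reproduces the random selection rule,
    not the voltage cutoffs applied by the real test equipment.
    """
    rng = random.Random(seed)
    return [(i * STEP_DURATION_S, rng.choice(RW_CURRENTS_A)) for i in range(n_steps)]


profile = random_walk_profile(n_steps=12, seed=0)  # one hour of load steps
for start_s, current_a in profile:
    print(f"t = {start_s:5d} s -> {current_a:.1f} A")
```

Passing a fixed `seed` makes the generated profile repeatable, which helps when comparing algorithms on the same synthetic load.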
The dataset was first used by Bole, Kulkarni, and Daigle to adapt a battery model to account for degradation under random loads.
CALCE CS2 Dataset
The battery research group at the Center for Advanced Life Cycle Engineering (CALCE) at the University of Maryland published a battery dataset that is widely used for SOH estimation.
In the CALCE CS2 dataset, 15 prismatic LCO cells with a nominal capacity of 1.1 Ah were cycled at room temperature. All cells used a standard charging profile of CC-CV at 0.5C to 4.2 V with a current cutoff of 0.05 A. The cells were divided into six types based on the discharging profiles. For type 1 and type 2 experiments, the cells underwent a constant current discharge at 0.5C and 1C respectively. Cells in type 3 experiments were also discharged at constant current, but the discharge rate for each cycle was switched among six predetermined rates. For cells in experiments type 4–6, different cutoff voltages were used to simulate real usage conditions such as high and low voltage partial charge/discharge cycles.
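Because the datasets in this article mix C-rates and absolute currents, a small conversion helper can make protocols easier to compare. The sketch below uses the usual definition (1C is the current that would nominally deliver the rated capacity in one hour); the function names are hypothetical, and the numbers are the CS2 values quoted above.

```python
def c_rate_to_amps(c_rate, capacity_ah):
    """Convert a C-rate to a current in amperes for a cell of given capacity."""
    return c_rate * capacity_ah


def amps_to_c_rate(current_a, capacity_ah):
    """Convert a current in amperes to a C-rate for a cell of given capacity."""
    return current_a / capacity_ah


CS2_CAPACITY_AH = 1.1  # nominal capacity quoted in the text

charge_current_a = c_rate_to_amps(0.5, CS2_CAPACITY_AH)   # 0.5C charging current
cutoff_c_rate = amps_to_c_rate(0.05, CS2_CAPACITY_AH)     # 0.05 A cutoff as a C-rate

print(f"0.5C charge current: {charge_current_a:.2f} A")
print(f"0.05 A cutoff: {cutoff_c_rate:.4f}C (about C/{1 / cutoff_c_rate:.0f})")
```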
The CS2 dataset was recorded from 2010 to 2013. This dataset can be accessed at CALCE Datasets along with other datasets provided by the same group.
Stanford Fast-Charging Datasets
Researchers from Stanford and MIT have published two relatively large datasets containing cycling data for commercial 1.1 Ah 18650 LFP/graphite cells. These datasets are particularly useful for applying machine learning methods. The Cycle Life Prediction Dataset includes 135 cells cycled to their end of life and was used to develop an accurate model that predicts cycle life from data for the first 100 cycles. In the Fast-Charging Optimization Dataset, cells were cycled 100–120 times with 224 different charging profiles. This data, together with the prediction model based on the Cycle Life Prediction Dataset, was used to optimize the charging profile for lifetime.
Cycle Life Prediction Dataset
In this dataset, 135 cells were cycled to 80% of initial capacity in a temperature-controlled chamber at 30 °C. The dataset consists of three batches. Batch 1 with 46 cells and batch 2 with 48 cells were recorded in 2017. Five of the cells from batch 1 were carried over into batch 2 and cycled until they reached 80% of initial capacity. Batch 3 with 46 cells was recorded in 2018. Six cells did not reach their end-of-life capacity. This dataset was introduced by Severson et al., who used data from 124 cells for early cycle life prediction. It can be accessed at the Cycle Life Prediction Dataset.
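The end-of-life criterion used here (capacity falling to 80% of a reference capacity) is easy to apply to any per-cycle capacity series. The following is a minimal sketch, not the authors' processing code: the function name `cycle_life` is hypothetical and the capacity values are made-up illustrative numbers, not taken from the dataset.

```python
def cycle_life(capacities_ah, reference_ah, threshold=0.8):
    """Return the first 1-based cycle whose capacity is below threshold * reference.

    Returns None when the cell never crosses the limit, as happens for
    cells that did not reach end of life during testing.
    """
    limit = threshold * reference_ah
    for cycle_number, capacity in enumerate(capacities_ah, start=1):
        if capacity < limit:
            return cycle_number
    return None


# Made-up discharge capacities (Ah) for illustration only.
example_capacities = [1.10, 1.05, 0.95, 0.87, 0.90]
print(cycle_life(example_capacities, reference_ah=1.1))
print(cycle_life([1.10, 1.00], reference_ah=1.1))
```

Returning the first crossing, rather than the last, avoids counting a later capacity recovery as a longer life; whether that is the right convention depends on the study.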
Voltage, current, and cell can temperature were continuously recorded for the duration of these tests. Internal resistance was measured with 10 pulses of ±3.6C with a pulse width of 30 or 33 ms applied each charging cycle at 80% SOC.
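A common way to turn current pulses into an internal resistance estimate is to divide the voltage step by the current step for each pulse and average the results. The sketch below illustrates that idea only; it is an assumption about the general technique, not the dataset authors' exact procedure, and the pulse readings are made-up numbers (3.96 A corresponds to 3.6C on a 1.1 Ah cell).

```python
from statistics import mean


def pulse_resistance(v_before, v_during, i_before, i_during):
    """Estimate resistance (ohms) from one pulse as delta-V over delta-I."""
    return (v_during - v_before) / (i_during - i_before)


def average_pulse_resistance(pulses):
    """Average the per-pulse estimates over an iterable of 4-tuples."""
    return mean(pulse_resistance(*pulse) for pulse in pulses)


# Made-up readings: (V before, V during, I before, I during).
# Positive and negative pulses are both handled by the ratio.
example_pulses = [
    (3.400, 3.467, 0.0, 3.96),
    (3.400, 3.333, 0.0, -3.96),
]
print(f"{average_pulse_resistance(example_pulses) * 1000:.1f} mOhm")
```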
All cells were discharged through a CC-CV at 4C to 2 V with a current cutoff of C/50. However, for charging, 72 different fast charging profiles were applied. Each charging profile consisted of two stages; the first stage was one of 72 one-step or two-step candidate charging profiles from 0 to 80% SOC. The second stage, from 80% to 100% SOC, was a 1C CC-CV charging step with a current cutoff of C/50 for all cells.
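To get a feel for how a one-step or two-step profile translates into charging time, the constant-current time to reach 80% SOC can be estimated from the C-rates alone. This is a rough sketch under stated assumptions: it assumes a two-step profile is "charge at one C-rate up to a switch SOC, then at a second C-rate up to 80%", it ignores any constant-voltage phase and rest time, and the example rates are hypothetical rather than taken from the dataset.

```python
def minutes_to_80_soc(c1, switch_soc_pct, c2):
    """Idealized constant-current time (minutes) to charge from 0% to 80% SOC.

    c1 applies from 0% to switch_soc_pct, c2 from switch_soc_pct to 80%.
    For a one-step profile, pass switch_soc_pct=80 (c2 is then unused).
    """
    hours = (switch_soc_pct / 100) / c1 + ((80 - switch_soc_pct) / 100) / c2
    return 60 * hours


# Hypothetical two-step profile: 6C up to 40% SOC, then 4C up to 80% SOC.
print(f"two-step: {minutes_to_80_soc(6, 40, 4):.1f} min")
# Hypothetical one-step profile: 4C all the way to 80% SOC.
print(f"one-step: {minutes_to_80_soc(4, 80, 4):.1f} min")
```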
No EIS data was provided in this dataset. This dataset focuses only on charging profiles, but the research group should consider looking at different discharge profiles as well for comparison.
Fast-Charging Optimization Dataset
The Fast-Charging Optimization Dataset, recorded in 2018 and 2019, consists of five batches of approximately 48 cells each. In four batches the cells were cycled 100–120 times, and the data was used by Attia et al. for testing closed-loop fast-charging optimization algorithms. In the final batch, cells were cycled to end of life (80% of nominal capacity) and the data was used to validate the best fast-charging protocol predicted by the fast-charging optimization algorithm.
As in the Cycle Life Prediction Dataset, all cells in this dataset were discharged with a CC-CV profile at 4C to 2 V, in this case with a current cutoff of C/20. However, 224 different six-step, 10-minute fast-charging protocols were used for charging. All experiments were carried out in a thermal chamber at 30 °C.
Internal resistance was recorded for all cells by averaging ten 3.6C, 33 ms pulses at 80% SOC during charging. In this dataset, temperature was only recorded for the validation batch. This dataset is available at Fast-Charging Optimization Dataset.
Synthetic Training Datasets
Two large synthetic battery datasets, the Graphite//LFP synthetic training diagnosis dataset and the Graphite//LFP synthetic training prognosis dataset, were recently published with the goal of providing benchmark datasets for comparison of different diagnosis and prognosis algorithms. As the term synthetic implies, these datasets were not collected from empirical experiments but computationally generated using the so-called mechanistic approach. Instead of using electrochemical models for simulating degradation under a wide range of conditions, the loss of lithium inventory, loss of active material at the positive electrode, and loss of active material at the negative electrode (LLI, LAMPE, LAMNE) were used to simulate path-dependent degradation from experimental half-cell data as input. The basis for this approach is the concept that sweeping through all combinations of these three loss parameters (LLI, LAMPE, LAMNE) would cover every possible degradation path.
In the diagnosis dataset, 5,000 different degradation paths were obtained by sweeping through each of the loss parameters with a 0.01 step size and charge profiles with a C/25 step size. Each path was simulated at 100 points until 85% degradation, generating 500,000 voltage-capacity curves. The resulting diagnosis dataset is 465 MB in size and can be accessed at Graphite//LFP synthetic training diagnosis dataset.
For synthesizing the prognosis dataset, the evolution of the three degradation modes was modeled using eight parameters. Sweeping these parameters yielded 130,000 different duty cycles, with one voltage curve for every 100 duty cycles. The resulting prognosis dataset contains 3,000,000 voltage-capacity curves. It is 2.7 GB in size and can be accessed at The Graphite//LFP synthetic training prognosis dataset.
Half-cell data from four commercial cells was used to generate these synthetic datasets. The diagnosis dataset can be used for training and testing of SOH algorithms based on machine learning methods, although the generalizability to full cells may be somewhat limited. Another usage of this dataset is sensitivity analysis for different features and understanding the representativeness of different health indicators. The prognosis dataset can be used to evaluate different algorithms for early prediction of cycle life and to validate diagnostic methods.
Sandia National Laboratories Datasets
Three Li-ion battery datasets published by Sandia National Laboratories contain data for cycling commercial 18650 cells over a wide range of conditions. The main focus of these datasets below is a comparison of performance between different battery chemistries.
Short-Term Cycling Performance Dataset
In this dataset, 24 cells in 18650 format with four different chemistries (LCO, LFP, NCA, and NMC) were tested at different temperatures and discharge rates at Sandia National Laboratories in 2017.
Two types of tests, namely cycling and abuse testing, were performed on three cells from each chemistry. In the cycling tests, cells were first discharged to 0% SOC. The cells then rested for 12 hours to reach the desired temperature. After this period, EIS was performed over a range of 0.1 Hz to 100 kHz with a 0.01 V perturbation. The cells were then cycled at different rates, and EIS was performed again. At the end, a standard 1C charge/discharge cycle at 25 °C was used to measure the final capacity. This sequence was repeated at temperatures of 5 °C, 15 °C, 25 °C, 35 °C, and 45 °C. In the abuse tests, no EIS was performed and the experiment was allowed to continue even when operating conditions, such as cell temperature, went beyond the manufacturer’s recommendations.
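EIS sweeps over several decades are normally sampled at logarithmically spaced frequencies. The sketch below generates such a grid for the 0.1 Hz to 100 kHz range quoted above; the choice of 10 points per decade is an assumption for illustration, since the text does not state the sampling density, and `log_sweep` is a hypothetical helper.

```python
import math


def log_sweep(f_start_hz, f_stop_hz, points_per_decade):
    """Return logarithmically spaced frequencies from f_start_hz to f_stop_hz."""
    decades = math.log10(f_stop_hz / f_start_hz)
    n_points = round(decades * points_per_decade) + 1
    return [f_start_hz * 10 ** (i / points_per_decade) for i in range(n_points)]


# Assumed density of 10 points per decade over the six decades in the text.
freqs_hz = log_sweep(0.1, 100e3, points_per_decade=10)
print(len(freqs_hz), "frequencies from", freqs_hz[0], "Hz to", round(freqs_hz[-1]), "Hz")
```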
The published data includes voltage, current, capacity, temperature, and EIS data. This dataset can be used to compare the electrochemical behavior of different chemistries and is accessible at Sandia National Laboratories Dataset.
The Battery Cell Calorimetry Data Archive provides calorimetry and thermogravimetric data from commercial LCO, LFP, and NCA cells. The dataset focuses on the effect of individual cell components on thermal runaway and is accessible at Sandia National Laboratories Dataset.
Long-Term Degradation Dataset
This dataset is a continuation of the Short-Term Cycling Performance Dataset. In this dataset, 86 commercial 18650 cells with NCA, NMC, and LFP chemistries were cycled to evaluate the effects of temperature, depth of discharge, and discharge rate on long-term degradation.
The experiments started with a one-day rest in a thermal chamber to equilibrate the cells to the specified cycling temperature. After the cells were discharged, EIS was performed. Then, a round of cycling was performed under the temperature, C-rate, and DOD conditions specified for each cell. At the beginning and end of each round, a capacity check was recorded, and EIS was repeated for every 3% of capacity fade. Depending on the degradation rate of each cell and its specific test condition, a round of cycling could range from 125 to 1000 cycles. Although the study by Preger et al. based on this dataset only looked at the cycling data until the cells reached 80% of their initial capacity, the experiments were continued after the cells fell below 80% of their initial capacity.
This dataset was published via batteryarchive.org and is accessible at Long-Term Degradation Dataset.
The Hawaii Natural Energy Institute (HNEI) dataset, recorded in 2013 and 2014, investigates the intrinsic cell-to-cell variability of degradation in Li-ion batteries. The experiment tested 51 NMC-LCO 18650 cells with a nominal capacity of 2.8 Ah, commonly used for notebook battery applications.
At the beginning, a set of conditioning tests was applied to the cells. Cell weights and as-shipped open-circuit voltages (OCV) were measured. Then a number of C/2 formation cycles were performed until the cell capacity had stabilized. After that, each cell underwent a reference performance test (RPT), which consisted of successive constant-current cycles. From these conditioning tests, a number of cell performance factors were obtained, including thermodynamic capacity, capacity ratio, pseudo-OCV curve, internal series resistance, and rate capability.
After the conditioning tests, 15 of the 51 cells were cycled 1000 times at 25 °C with a CC-CV charge rate of C/2 and a discharge rate of 1.5C. An RPT was performed after every 100 aging cycles.
The HNEI dataset was published on batteryarchive.org. The published data includes the 15 cells that participated in all the experiments and can be accessed at HNEI Dataset.
Oxford Battery Degradation Dataset
Eight 740 mAh pouch cells were cycled to end of life (about 30% capacity fade) in a chamber at 40 °C. The dataset was recorded by the Howey Research Group at the University of Oxford in 2015. Two types of tests were performed: drive cycle tests and characterization tests. In the drive cycle tests, cells were charged using a CC-CV profile and discharged with a load based on an Urban Artemis driving profile. After every 100 drive cycles, characterization tests were performed, which included a 1C cycle and a C/18 pseudo-OCV cycle.
The published dataset does not include data from the drive cycles but does present voltage, current, charge, temperature, and time from the characterization cycles. This dataset can be accessed at Oxford Battery Degradation Dataset 1.
Panasonic 18650PF Li-ion Battery Dataset
In the dataset published by Kollmeyer, a 2.9 Ah Panasonic NCA 18650PF cell was tested in a thermal chamber under varying conditions at the University of Wisconsin-Madison.
The tests included ten cycles at 1C, a C/20 cycle, a five-pulse discharge HPPC test, EIS, a series of nine drive cycle tests, and another ten-cycle step. This sequence was repeated at 25 °C, 10 °C, 0 °C, -10 °C, and -20 °C. The drive cycles used one or a mix of US06, HWFET, UDDS, LA92, and a custom Neural Network drive cycle. For tests with temperatures below 10 °C, the regenerative braking portions of the drive cycle were not applied. A CC-CV charging profile at 1C to 4.2 V with a cutoff current of 0.05 A was applied after each test. During the tests, the cell underwent about 110 cycles and its capacity decreased to 2.3 Ah at the end of the experiment.
This dataset can be used for SOC algorithms and battery modeling. It was recorded in 2018 and is available at Panasonic 18650PF Li-ion Battery Data.
Automotive Li-ion Cell Usage Dataset
In this dataset, a 15 Ah NMC cell (ePLB C020) was tested with an emulated EV usage profile. The training and test sets were recorded on separate trips. The training set includes battery data for a 12-hour, 277 km trip, while the test set covers a 7-hour, 163 km trip. To simulate realistic usage, both trips combined urban, extra-urban, and highway driving cycles from the Federal Test Procedure repository with rest and charging periods.
Both sets include voltage, current, conducted charge, SOC, and time, and the driving cycles are also provided along with their time duration. This dataset is available at Automotive Li-ion Cell Usage Dataset.
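Since this dataset provides both current and SOC, it lends itself to checking simple SOC estimators. One baseline is Coulomb counting, which integrates current over time and scales by capacity. The sketch below is an illustration of that technique only: the function name, the sign convention (positive current means discharge), and the sample values are assumptions, not properties of the dataset files.

```python
def coulomb_count_soc(times_s, currents_a, capacity_ah, soc0=1.0):
    """Estimate SOC by trapezoidal integration of current over time.

    Assumed convention: positive current is discharge, so SOC decreases.
    Returns one SOC value (as a fraction) per time sample.
    """
    soc = [soc0]
    discharged_ah = 0.0
    for k in range(1, len(times_s)):
        dt_h = (times_s[k] - times_s[k - 1]) / 3600
        discharged_ah += 0.5 * (currents_a[k] + currents_a[k - 1]) * dt_h
        soc.append(soc0 - discharged_ah / capacity_ah)
    return soc


# Made-up example: a 15 Ah cell discharged at a constant 15 A (1C) for one hour.
soc_trace = coulomb_count_soc([0, 1800, 3600], [15.0, 15.0, 15.0], capacity_ah=15.0)
print([round(s, 3) for s in soc_trace])
```

Coulomb counting drifts with current sensor bias and depends on the initial SOC, which is why the recorded SOC column is useful as a reference when testing more elaborate estimators.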
With the ever-increasing presence of Li-ion batteries, the efficient establishment, management, and use of battery testing data is of paramount importance. Since these experiments tend to be costly and time-consuming, publicly available datasets are valuable because they allow different methods and algorithms to be benchmarked against common data. As a first step, we have provided a brief overview of some of the most commonly used open Li-ion datasets. The summary table describing these datasets is available here, and we invite you to help us keep it up to date by contributing other open Li-ion battery testing datasets and sharing your thoughts on these datasets in the spreadsheet.
Abolfazl Shahrooei is a control engineer with more than five years of experience in design and implementation of control systems. He is currently working on design of battery modules and packs, and battery testing. He is also doing research on battery lifetime prognosis and SOC estimation.
 B. Saha and K. Goebel (2007). “Battery Data Set”, NASA Ames Prognostics Data Repository (http://ti.arc.nasa.gov/project/prognostic-data-repository), NASA Ames Research Center, Moffett Field, CA
 “Randomized Battery Usage Data Set”, NASA Ames Prognostics Data Repository (http://ti.arc.nasa.gov/project/prognostic-data-repository), NASA Ames Research Center, Moffett Field, CA
 B. Bole, C. Kulkarni, and M. Daigle, ‘Adaptation of an Electrochemistry-based Li-Ion Battery Model to Account for Deterioration Observed Under Randomized Use’, Annual Conference of the Prognostics and Health Management Society, 2014
 “CALCE CS2 Battery Dataset”, Center for Advanced Life Cycle Engineering (CALCE), University of Maryland
 Severson, K.A., Attia, P.M., Jin, N. et al. Data-driven prediction of battery cycle life before capacity degradation. Nat Energy 4, 383–391 (2019). https://doi.org/10.1038/s41560-019-0356-8
 Attia, Peter M., et al. “Closed-loop optimization of fast-charging protocols for batteries with machine learning.” Nature 578.7795 (2020): 397–402.
 Dubarry, Matthieu, and David Beck. “Benchmark Synthetic Training Data for Artificial Intelligence-based Li-ion Diagnosis and Prognosis 2.” (2020).
 Dubarry, Matthieu, Cyril Truchot, and Bor Yann Liaw. “Synthesize battery degradation modes via a diagnostic and prognostic model.” Journal of power sources 219 (2012): 204–216.
 Barkholtz, Heather M., et al. “A database for comparative electrochemical performance of commercial 18650-format lithium-ion cells.” Journal of The Electrochemical Society 164.12 (2017): A2697.
 Sandia National Laboratories, “Battery Cell Calorimetry Data Archive,” 2019
 Preger, Yuliya, et al. “Degradation of Commercial Lithium-Ion Cells as a Function of Chemistry and Cycling Conditions.” Journal of The Electrochemical Society 167.12 (2020): 120532.
 Devie, Arnaud, George Baure, and Matthieu Dubarry. “Intrinsic variability in the degradation of a batch of commercial 18650 lithium-ion cells.” Energies 11.5 (2018): 1031.
 Birkl, Christoph. “Oxford battery degradation dataset 1.” (2017).
 Kollmeyer, Phillip (2018), “Panasonic 18650PF Li-ion Battery Data”, Mendeley Data, V1, doi: 10.17632/wykht8y7tg.1
 Massimiliano Luzi, September 7, 2018, “Automotive Li-ion Cell Usage Data Set”, IEEE Dataport, doi: https://dx.doi.org/10.21227/ce9q-jr19.