Why is battery data hard?

An AmpLabs Community Contribution

Talk to any seasoned battery engineer, and you’ll probably hear them say batteries are hard. With electrodes, electrolytes, and interfaces constantly changing during cycling, it’s hard to predict what will happen with any given cell chemistry and how to tweak it to improve performance. Personally, I like to think of a battery as having a mind of its own. Batteries are black boxes: we can’t directly see what’s going on in there. The only way to know what a battery is doing is to measure the performance data generated by the tests we run on it. So we want to identify the areas of the battery R&D pipeline that cause friction, and develop coping mechanisms that both accelerate battery R&D and make our lives easier as battery engineers. One source of friction is how we process battery data - it’s hard and can be time-consuming.

Battery data is hard simply because it’s messy. When I was a kid, my parents wouldn’t let me hang out with friends or do fun things unless I cleaned my room first - which I hated. It turns out I face nearly the same problem as a battery engineer - though my parents are no longer yelling at me. Want to perform basic analysis to drive insight? You have to clean your data first. Want to build super cool prediction models now that ChatGPT can assist you? You need to clean your data first.

Let’s first understand why the data is messy - because, unlike my desk from grad school, the data isn’t covered in crumbs and jelly from the PB&J that I accidentally ate 2 hours before lunch. We collect data from either cyclers or a BMS - each producing its own data format, with its own data columns, time steps, etc. The output differs from one brand of cycler to the next, so you cannot compare Arbin data to Neware data out of the box. In some cases, you cannot even compare different software versions from the same brand. Adding an auxiliary signal from an impedance device or the like makes things even more complex. Consider a time when you elected to record one data column at a different frequency than another - this can create out-of-band data, which complicates things when you want to make a basic time-series plot.
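To make those two headaches concrete, here is a minimal sketch in plain Python of what the cleaning step involves. Everything named in it is an assumption for illustration: "vendor_a" and "vendor_b", their column headers, their units, and the temp_c auxiliary signal are all made up, and real cycler exports will look different. It is not how AmpLabs does it. The first function renames columns to a shared schema and converts units; the second attaches a slower auxiliary signal to a faster one by taking the most recent reading at or before each timestamp.

```python
from bisect import bisect_right

# Hypothetical header names for two cycler exports; real exports will differ.
COLUMN_MAP = {
    "vendor_a": {"Test_Time(s)": "time_s", "Voltage(V)": "voltage_v", "Current(A)": "current_a"},
    "vendor_b": {"Time(h)": "time_s", "Volt(mV)": "voltage_v", "Cur(mA)": "current_a"},
}

# Multipliers that convert each (assumed) vendor unit to seconds, volts, and amps.
UNIT_SCALE = {
    "vendor_a": {"time_s": 1.0, "voltage_v": 1.0, "current_a": 1.0},
    "vendor_b": {"time_s": 3600.0, "voltage_v": 1e-3, "current_a": 1e-3},
}


def normalize(rows, vendor):
    """Rename columns to the shared schema and convert values to common units."""
    mapping = COLUMN_MAP[vendor]
    scale = UNIT_SCALE[vendor]
    return [
        {new: float(row[old]) * scale[new] for old, new in mapping.items()}
        for row in rows
    ]


def align_latest(base, aux, key="time_s", field="temp_c"):
    """Give each base row the most recent aux reading at or before its time.

    Rows that occur before the first aux reading get None.
    """
    aux = sorted(aux, key=lambda r: r[key])
    aux_times = [r[key] for r in aux]
    merged = []
    for row in base:
        i = bisect_right(aux_times, row[key]) - 1  # index of latest aux row <= row time
        value = aux[i][field] if i >= 0 else None
        merged.append({**row, field: value})
    return merged


# Usage: a 1 Hz voltage log plus a temperature probe sampled every 2 s.
voltage_log = normalize(
    [
        {"Test_Time(s)": "0", "Voltage(V)": "3.70", "Current(A)": "1.0"},
        {"Test_Time(s)": "1", "Voltage(V)": "3.71", "Current(A)": "1.0"},
        {"Test_Time(s)": "2", "Voltage(V)": "3.72", "Current(A)": "1.0"},
    ],
    "vendor_a",
)
temperature_log = [{"time_s": 0.0, "temp_c": 25.0}, {"time_s": 2.0, "temp_c": 25.4}]

for row in align_latest(voltage_log, temperature_log):
    print(row)
```

Carrying the last reading forward is only one way to line up mismatched sample rates; interpolating or resampling both signals onto a common grid are equally reasonable choices, depending on the signal. Either way, this is bookkeeping that has to happen before a simple plot is possible.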

A while ago, I came across a LinkedIn post that mentioned data scientists spend 90% of their time cleaning data... not exactly a citable statistic, but let’s assume it’s accurate. That’s a lot of time spent merely in preparation for the main task, so what is a battery engineer to do? As I mentioned earlier, I always hated cleaning my room - and I hate cleaning data too. So, just as you might hire a cleaning service for a chore you loathe, you can use AmpLabs to clean your data for you. This means that you can import any data format you fancy - even if it’s from a homegrown cycler - and AmpLabs will clean and label the data for you, generating time-series and cycle-series KPIs. The AmpLabs cleaning regimen lets you make an apples-to-apples comparison of data from various cyclers or BMSs.

While data cleaning is a major problem plaguing battery data analysis, other issues also exist. Battery research generates enormous datasets. For example, a lifetime study can take several months to years to generate data for just one cell. Now scale that to an automotive OEM with thousands of cells in the lab and several fleets’ worth of telematics data that they have been collecting since the Carter Administration. Tracking these vast datasets can be troublesome - I can’t even find my meeting notes from last week. Fortunately, AmpLabs can save the day here too. AmpLabs allows you to archive all of your battery data in the AmpLabs platform. Depending on your individual or organizational needs, you can store this data in the cloud, locally, or on a private cluster. From the platform, you can conveniently search for the data you want to analyze - similar to how your university’s online library database works. This ensures that data is always available and easily searchable, even when employees leave the organization and take their personal map of where the data lives with them.

Large datasets present another problem when we go to build models with them or run other types of scripted analysis on them. Running a script on a large dataset can make it sluggish. Luckily, AmpLabs founder Gabe Hege comes from a cloud & big data background and has designed the platform with scale in mind. With AmpLabs, battery R&D, both small and large scale, is a breeze. And now cleaning my room isn’t so bad.


Join AmpLabs and Observe Your Batteries

We provide engineers with the necessary tools and resources to enable Battery Observability, driving progress towards a cleaner and more sustainable future.
