benjamin.computer

Blocking Big Battery Booms! (with bootloaders)

21-10-2020

As part of my PhD program I'm supposed to undertake 3 months of work experience outside of my research. It's a good idea in theory. For me, it felt a bit odd; I've had full-time jobs before now and I'm somewhat older than my cohort of students, but rules are rules so I dutifully looked for a placement. My brother-in-law Phil runs a company called Voltsport, so I asked him whether or not he fancied a programmer for 3 months, free of charge. He said he'd love one and had a project in mind - writing the software for a battery management system. His company deals in electric motors, batteries and control systems and fitting them to various machines, such as cars, boats, cranes - you name it! I figured I could see my niece and nephew at the same time - efficient!

Of course, all this was planned during the Before-Plague (BP) times. We decided to go ahead remotely, as most of the work could be done with little need for lots of specialist equipment and one-to-one chats. At least, this is what we thought at the beginning and maybe for-the-most-part it was true.

Batteries and boards

Modern battery cells are strange things. We've all heard the stories of fires, explosions and the like, but there are many other things that can go wrong that are less dramatic. Batteries need to be managed. Temperatures and voltages need to be constantly monitored, and any cell that steps out of line needs to be brought back into balance. This is where a battery management board comes in.
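To make "stepping out of line" a little more concrete, here is a minimal sketch of the sort of check a management board might run over its cell readings. The thresholds and function names are hypothetical and chosen only for illustration; real limits depend on the cell chemistry, and this is not the code that ran on Phil's board.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical limits, for illustration only. */
#define CELL_MIN_MV      2500u  /* below this a cell counts as over-discharged */
#define CELL_MAX_MV      4200u  /* above this a cell counts as over-charged */
#define BALANCE_DELTA_MV 20u    /* allowed spread above the lowest cell */

/* Returns a bitmask (bit i = cell i) of the cells sitting more than
 * BALANCE_DELTA_MV above the lowest cell in the pack. */
uint32_t cells_needing_balance(const uint16_t *mv, size_t count)
{
    assert(mv != NULL && count > 0 && count <= 32);

    uint16_t lowest = mv[0];
    for (size_t i = 1; i < count; i++) {
        if (mv[i] < lowest) {
            lowest = mv[i];
        }
    }

    uint32_t mask = 0;
    for (size_t i = 0; i < count; i++) {
        if ((uint32_t)(mv[i] - lowest) > BALANCE_DELTA_MV) {
            mask |= (uint32_t)1 << i;
        }
    }
    return mask;
}

/* Returns 1 if any cell is outside the safe voltage window, 0 otherwise. */
int pack_out_of_range(const uint16_t *mv, size_t count)
{
    assert(mv != NULL);

    for (size_t i = 0; i < count; i++) {
        if (mv[i] < CELL_MIN_MV || mv[i] > CELL_MAX_MV) {
            return 1;
        }
    }
    return 0;
}
```

Comparing against the lowest cell is only one possible policy. It is used here because bleeding charge off the highest cells is the simplest kind of balancing to picture.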

The board Phil designed consists of several components, chief of which are the ARM Cortex-M3 chip, the battery management chip and a Bluetooth module. My job was to link all of these together so that I could check the status of a set of cells from my phone. I've never worked on an embedded software project for a real product before but I've always been keen to try.

Embedded software and bootloaders

I've done a fair bit of C in my time, though recently I've been in the land of Python. The programming language that is C tends to be the go-to for systems development, even after all this time. It's reliable, you can get to the lowest levels, add in some assembly, all that stuff. It doesn't have all the extra layers of something like embedded Python and given modern tools, it's probably no more faff to use. It does have a few flaws but at least it isn't having the entire kitchen sink thrown at it - I'm looking at you C++.

Voltsport is a Windows and Microsoft shop. I'm predominantly a Linux person these days. I decided that rather than change things, I'd set up a VM and run the tools that way. I managed to set up Keil uVision - the official (I guess?) ARM development IDE - and set to work.

My first challenge was to create a bootloader. The idea is that the bootloader is always there, running away on the final board. When the board is turned on, the bootloader is the first thing that runs. It has the following job:

  1. Check to see if there is a program in flash memory.
  2. If there is a program, run a CRC check over it to make sure it's not corrupted.
  3. Run the program if the CRC matches.
  4. If it doesn't match, enter a loop and wait for CANbus messages carrying a new program.
  5. If CANbus messages are received, start writing the new program into flash.
  6. Once the new program is received, decrypt it using AES.
  7. Once decrypted, goto step 2 and continue as before.
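The first four steps can be sketched on a desktop machine by treating flash as a plain byte array. This is a minimal sketch under stated assumptions, not the bootloader itself: the 8-byte header (little-endian length, then CRC-32, then the program) is a hypothetical layout invented for the example, the names boot_decide and crc32_calc are made up, and it relies on erased flash commonly reading back as 0xFF. The CANbus receive and AES steps are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEADER_SIZE 8u  /* hypothetical: 4-byte length + 4-byte CRC-32 */

typedef enum { BOOT_RUN_APP, BOOT_WAIT_FOR_UPDATE } boot_action_t;

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320): slow but tiny. */
uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
    }
    return ~crc;
}

/* Reads a 32-bit little-endian value from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decides what the bootloader should do with the contents of flash. */
boot_action_t boot_decide(const uint8_t *flash, size_t flash_size)
{
    assert(flash != NULL);

    if (flash_size < HEADER_SIZE) {
        return BOOT_WAIT_FOR_UPDATE;
    }

    uint32_t length = read_le32(flash);
    uint32_t stored_crc = read_le32(flash + 4);

    /* Step 1: an erased header gives an implausible length (0xFFFFFFFF),
     * which is treated as "no program present". */
    if (length == 0 || length > flash_size - HEADER_SIZE) {
        return BOOT_WAIT_FOR_UPDATE;
    }

    /* Steps 2 to 4: run only if the CRC over the program matches. */
    if (crc32_calc(flash + HEADER_SIZE, length) != stored_crc) {
        return BOOT_WAIT_FOR_UPDATE;
    }
    return BOOT_RUN_APP;
}
```

Keeping the decision in a pure function like this means it can be tested on a laptop before it ever goes near real flash, which matters when the hardware has not arrived yet.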

Sounds fairly simple, right? Well, there are a few little gotchas here and there. Where do you keep the keys? Where is the CRC checksum kept inside the program to be run? What about the vector table and handing over all the interrupt handling to the new program? At these lower levels, you have to think about things like interrupt vector tables, the perils and peculiarities of flash memory and making sure your compiled code actually fits inside the flash available to you, to mention but a few. I was programming for an ARM Cortex-M3 which has about 64K of memory available to it, running at 72MHz. You have many more hardware constraints to program to - something I suspect is not often encountered by the majority of programmers these days.
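On the vector table question: on a Cortex-M, the first word of a program's vector table is its initial stack pointer and the second is the address of its reset handler, with the lowest bit set to mark Thumb code. A bootloader can sanity-check those two words before handing over. The sketch below does only that check, on a byte buffer, so it runs on a desktop. The memory map constants are hypothetical placeholders, not the real map of the chip used here. The actual hand-over (pointing the vector table offset register at the new table, loading the stack pointer and jumping) is hardware-specific and is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory map, for illustration only. */
#define APP_BASE  0x00004000u  /* first flash address after the bootloader */
#define FLASH_END 0x00010000u  /* one past the last flash address (64K) */
#define RAM_START 0x10000000u
#define RAM_SIZE  0x00002000u

/* Reads a 32-bit little-endian value from a byte buffer. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 if the first two vector table entries look plausible. */
int app_vectors_look_valid(const uint8_t *image, size_t len)
{
    if (image == NULL || len < 8) {
        return 0;
    }

    uint32_t initial_sp = read_le32(image);
    uint32_t reset_handler = read_le32(image + 4);

    /* The stack grows downwards, so the top of RAM is a valid start value. */
    if (initial_sp <= RAM_START || initial_sp > RAM_START + RAM_SIZE) {
        return 0;
    }
    if ((initial_sp & 0x3u) != 0) {
        return 0;  /* stack pointer must be word aligned */
    }

    /* Bit 0 of the reset handler address must be set (Thumb code). */
    if ((reset_handler & 1u) == 0) {
        return 0;
    }
    uint32_t addr = reset_handler & ~(uint32_t)1;
    if (addr < APP_BASE || addr >= FLASH_END) {
        return 0;
    }
    return 1;
}
```

A check like this is cheap insurance on top of the CRC: erased flash fails it immediately, because 0xFFFFFFFF is not a sensible stack pointer.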

Development boards

The battery board consists of 3 major parts - the ARM Cortex-M3, the battery cell balance chip and the Bluetooth Low Energy module. The board would talk to the outside world via CANbus (which I'm told is one of the major standards in the automotive and battery world). Fairly straight-forward you'd think - until I realised we didn't have the boards yet.

Working with a prototype is different to working with a battle-hardened test piece of hardware. Iterating, modifying and occasionally letting out the magic smoke are part-and-parcel of the process and it's not something I'm used to. The boards were still being fabricated so in order to get up to speed, I started instead with an LPCXpresso development board. This way, I could focus on the bootloader and the encryption side of things first. I also picked up a Bluetooth development board from Cypress in order to play with the phone end of things.

Computer Comms...

One of the hardest elements of the project is the communication between the individual components. Talking to the Bluetooth module required UART. Talking to the battery chip required I2C. The Android phone requires Bluetooth Low Energy (BLE), leaving CANbus for the bootloader and battery management application! Four different communication protocols, none of which I'd tried before. Thankfully, there is plenty of help available.

A collection of dev boards, all wired up to send a message from one end to the other.

The Android side of things is perhaps the most complicated. BLE has quite a few hurdles to jump through. I decided to program in Kotlin as that seems like the language of choice at the moment in that sphere. It seems alright and not such a departure from Java. There is a set of BLE Android examples that give you most of the parts you need, though some copying from the source on GitHub is still required. It turns out Cypress also has a Bluetooth test application you can download for your phone, which is an even quicker way to test if your module is working.

Cypress have their own IDE called PSoC Creator. This seems a bit more mature than uVision and includes a sort of drag and drop, flow chart diagram affair for generating the code. You draw out the circuit you want, linking the various components together and the IDE creates the code for you. I'm not always a fan of this sort of thing but it seemed to work reasonably well. The BLE specification is quite considerable; there are many things you can do with a Bluetooth setup - heart rate monitors, remote speakers, sports equipment. The list goes on! Many of these use cases are already configured in the IDE - you just need to pick one. Thanks to the dev board and the speedy IDE, I had some basic communication working reasonably quickly.

What didn't go quickly was the UART. I had a lot of trouble getting this to work, which is very strange as UART is one of the simplest communication protocols around. The reason for this was the port allocation on the development board. Finding documentation on how pins are allocated on this particular board was almost impossible! As UART is a common protocol, it's baked right into the ROM of this particular Cortex-M3. You can access it via the LPCOpen source code. Finding out how this works, which pins are dedicated to UART and how to turn them on correctly took far too long. The LPCXpresso documentation could do with some improvement. I can certainly see the value in documenting things properly.

CANbus was somewhat easier. I'd figured out the problems with the pin-muxing so working with the existing code was straight-forward. Our chip comes with a set of functions for CANbus in the ROM, just like UART and I2C. Simply set a pointer to the right memory location and call it. Once we had a little dictionary for the CANbus codes and data, we were all set.
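A "little dictionary" of this kind might look something like the sketch below. To be clear, the message IDs, the can_frame_t struct and the choice of four cell voltages per frame are all hypothetical, picked for the example and not the codes we actually agreed on. The one hard fact it leans on is that a classic CAN frame carries at most 8 data bytes, which is why four 16-bit millivolt readings fit neatly into one frame.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message dictionary, for illustration only. */
enum {
    CAN_ID_CELL_VOLTAGES_BASE = 0x100, /* +group: cells 4*group .. 4*group+3 */
    CAN_ID_PACK_STATUS        = 0x120,
    CAN_ID_FIRMWARE_CHUNK     = 0x200
};

#define CELL_GROUPS 4u /* 16 cells, 4 per frame */

/* Minimal stand-in for a classic CAN frame (11-bit ID, up to 8 bytes). */
typedef struct {
    uint16_t id;
    uint8_t  dlc;
    uint8_t  data[8];
} can_frame_t;

/* Packs four cell voltages (millivolts) into one frame, little-endian. */
can_frame_t pack_cell_voltages(uint8_t group, const uint16_t mv[4])
{
    assert(group < CELL_GROUPS && mv != NULL);

    can_frame_t frame;
    frame.id = (uint16_t)(CAN_ID_CELL_VOLTAGES_BASE + group);
    frame.dlc = 8;
    for (size_t i = 0; i < 4; i++) {
        frame.data[2 * i]     = (uint8_t)(mv[i] & 0xFFu);
        frame.data[2 * i + 1] = (uint8_t)(mv[i] >> 8);
    }
    return frame;
}

/* Unpacks a cell voltage frame. Returns 1 on success, 0 if the frame
 * is not a well-formed cell voltage message. */
int unpack_cell_voltages(const can_frame_t *frame, uint8_t *group, uint16_t mv[4])
{
    assert(frame != NULL && group != NULL && mv != NULL);

    if (frame->id < CAN_ID_CELL_VOLTAGES_BASE ||
        frame->id >= CAN_ID_CELL_VOLTAGES_BASE + CELL_GROUPS ||
        frame->dlc != 8) {
        return 0;
    }

    *group = (uint8_t)(frame->id - CAN_ID_CELL_VOLTAGES_BASE);
    for (size_t i = 0; i < 4; i++) {
        mv[i] = (uint16_t)(frame->data[2 * i] |
                           ((uint16_t)frame->data[2 * i + 1] << 8));
    }
    return 1;
}
```

Encoding the cell group in the message ID keeps all 8 bytes free for data, and it means the laptop end can tell which cells a frame describes without parsing the payload first.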

Real hardware

By the time I'd wrangled with the bootloader, Bluetooth and some basic communications, the actual hardware arrived. This made things much simpler as we now had the actual cell balancing chip - the BQ76952 from Texas Instruments. This chip can monitor up to 16 cells - their voltages and temperatures - giving us all the information we need to finish the management software.

Logic Analyser and development board.
Debugging some I2C with the Saleae on the BMS board.

Communication with this chip takes place over I2C. It was at this point I became very grateful for having a Saleae Logic Analyser on hand. This little chap enables me to link probes to any of the wires or ports on the board and decode whatever signal passes along them. I can verify whether or not I2C is being sent at all, what the timings are and if there are any replies. This is the kind of thing you can't really do very well with just a debugger.

Our working prototype. The 16 cells at the back are being monitored; their voltages being sent to the laptop via CANbus.

One of the problems of working at the cutting edge is that you can't take everything for granted. Despite trying every command I could, the balance chip simply wouldn't, well, balance! The more we looked at the data-sheet, the less it made sense until we figured out that some early samples of the chip had faulty or disabled balance functions! Incredible!

In the end however, we managed to get a working prototype together! I could read the voltages of the different cells on my little Android phone. Not too bad for a short internship. It's always very satisfying to see the final result working on real hardware!

In conclusion...

If I were doing this again, I'd spend more time getting my various tools in order. The virtual machine I was using would slow down over the course of use, as Windows 7 and Keil strained under the weight of constant compiling. I probably could have made it all work with ARM GCC or similar, which would have let me use my existing editors and environment.

I've become even more grateful for good documentation, code organisation and working examples. The Bluetooth side of things, while complicated, was up and running pretty quickly. Cypress, and to a similar extent Google, have done a good job in making BLE easy enough to start with. I imagine down the road, things might be quite tricky, but fortunately that was out of scope for this part of the project.

Finally, the pandemic really did a number on my productivity. Once the lockdown restrictions were over, I had the chance to work on site for a week. During that time, I managed to get an awful lot done; Phil and I could iterate quickly. All things considered, I've been pretty lucky on that score.

Embedded development is pretty tough, but also rewarding. In many ways, it has a lot in common with the demoscene - getting the most out of the hardware, coming up with interesting and creative solutions. There are things I'd do differently - certainly my approach and way of working would change, but given the circumstances, I'm pleased with how it came out.

