Published on June 13, 2023.

Disclaimer: A lot of things can cause this error. If you aren’t organizing program memory manually, then this probably won’t fix your problem. You should check your board, port, and programmer settings instead, and maybe try programming over SPI or burning the bootloader. But keep reading, just in case.

Uploading a large program to an ATmega2560

For the past year or so, I’ve been working on a small RPG that runs on a single ATmega2560 microcontroller. More specifically, I’m running it on an Arduino Mega 2560, because that’s convenient. At this point, the game is essentially finished and requires over 235 kilobytes of flash memory. Let’s upload it to the microcontroller.

$ avrdude -p atmega2560 -c wiring -P /dev/ttyACM0 -b 115200 -D -U flash:w:bin/main.hex:i

avrdude: AVR device initialized and ready to accept instructions

Reading | ################################################## | 100% 0.01s

avrdude: Device signature = 0x1e9801 (probably m2560)
avrdude: reading input file "bin/main.hex"
avrdude: writing flash (235758 bytes):

Writing | ################################################## | 100% 27.97s

avrdude: 235758 bytes of flash written
avrdude: verifying flash memory against bin/main.hex:
avrdude: load data flash data from input file bin/main.hex:
avrdude: input file bin/main.hex contains 235758 bytes
avrdude: reading on-chip flash data:

Reading | ################################################## | 100% 21.51s

avrdude: verifying ...
avrdude: verification error, first mismatch at byte 0xff00
         0xff != 0x00
avrdude: verification error; content mismatch

avrdude: safemode: Fuses OK (E:FD, H:D8, L:FF)

avrdude done.  Thank you.

Huh! Re-uploading a few times results in exactly the same error. The game runs, but with corrupted assets, which dispels hope that the verification error is a false alarm.

What’s going wrong?

This is a subtle bug in the Arduino STK500v2 bootloader. Here are the relevant sections, shortened for simplicity.

// from arduino-1.8.18/hardware/arduino/avr/bootloaders/stk500v2/stk500boot.c
case CMD_LOAD_ADDRESS:
#if defined(RAMPZ)
    address         =   ( ((address_t)(msgBuffer[1])<<24)|((address_t)(msgBuffer[2])<<16)|((address_t)(msgBuffer[3])<<8)|(msgBuffer[4]) )<<1;
#else
    address         =   ( ((msgBuffer[3])<<8)|(msgBuffer[4]) )<<1;  // convert word to byte address
#endif
    msgLength       =   2;
    msgBuffer[1]    =   STATUS_CMD_OK;
    break;
case CMD_PROGRAM_FLASH_ISP:
    {
        unsigned int    size        =   ((msgBuffer[1])<<8) | msgBuffer[2];
        unsigned char   *p          =   msgBuffer+10;
        unsigned int    data;
        unsigned char   highByte, lowByte;
        address_t       tempaddress =   address;

        // erase only main section (bootloader protection)
        if (eraseAddress < APP_END )
        {
            boot_page_erase(eraseAddress);  // Perform page erase
            boot_spm_busy_wait();           // Wait until the memory is erased.
            eraseAddress += SPM_PAGESIZE;   // point to next page to be erase
        }

        /* Write FLASH */
        do {
            lowByte     =   *p++;
            highByte    =   *p++;

            data        =   (highByte << 8) | lowByte;
            boot_page_fill(address,data);

            address     =    address + 2;   // Select next word in memory
            size       -=    2;             // Reduce number of bytes to write by two
        } while (size);                     // Loop until all bytes written

        boot_page_write(tempaddress);
        boot_spm_busy_wait();
        boot_rww_enable();                  // Re-enable the RWW section

        msgLength       =   2;
        msgBuffer[1]    =   STATUS_CMD_OK;
    }
    break;

Initially, I was certain the problem had something to do with the boot_page_write call (actually a macro). To write flash memory, you first populate an internal page-sized buffer with boot_page_fill calls. Then, to actually write the buffer to flash, you call boot_page_write. Internally, this places the page address in the register pair r31:r30 (the Z pointer), sets various bits in SPMCSR, and runs the spm instruction. After that, you should wait until the write completes before running spm again.

However, the ATmega2560 and ATmega1280 microcontrollers have over 64 kilobytes of flash, which requires more than 16 bits to address. To access the full address space, you also have to put the higher-order bits in another register, RAMPZ. My suspicion was that the boot_page_write macro was failing to do this. But examining the compiled assembly, as well as dumping the microcontroller’s flash memory, proved that this was not the case. Much to the contrary, the bootloader takes trouble to ensure that the flash address is correct.
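To make the addressing concrete, here is a minimal host-side sketch of how a word address from CMD_LOAD_ADDRESS becomes a byte address, and how that byte address splits between the Z pointer and RAMPZ. The function names are my own, not from the bootloader, and the sketch assumes address_t is a 32-bit type.

```c
#include <stdint.h>

/* Rebuild the word address from message bytes 1..4 and convert it to a
   byte address, mirroring the "<<1" in the bootloader excerpt above.
   Byte 0 is the command ID and is ignored here. */
static uint32_t word_to_byte_address(const uint8_t msg[5]) {
    uint32_t word = ((uint32_t)msg[1] << 24) | ((uint32_t)msg[2] << 16)
                  | ((uint32_t)msg[3] << 8)  |  (uint32_t)msg[4];
    return word << 1;
}

/* The low 16 bits of a byte address fit in the Z pointer (r31:r30). */
static uint16_t z_pointer_part(uint32_t byte_address) {
    return (uint16_t)(byte_address & 0xFFFFu);
}

/* Everything above bit 15 has to go in RAMPZ. */
static uint8_t rampz_part(uint32_t byte_address) {
    return (uint8_t)(byte_address >> 16);
}
```

For example, word address 0x7f80 maps to byte address 0xff00, which still fits in Z alone, while word address 0x10000 maps to byte address 0x20000, which needs RAMPZ set to 2.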

So what’s really happening? One thing I skipped in the previous paragraphs is that writing to flash can only clear bits. To set a bit, you first have to erase the entire page, which fills it with 0xff, and then write your data. This is what the first section of the code above is supposed to do.
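The erase-then-write rule can be modelled on a desktop machine in a few lines. This is only a sketch of the semantics described above, not bootloader code; the names are hypothetical, and the 256-byte page size is the ATmega2560’s.

```c
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE 256u  /* bytes per flash page on the ATmega2560 */

/* Erasing is the only way to set bits: the whole page becomes 0xff. */
static void model_erase_page(uint8_t *page) {
    memset(page, 0xFF, MODEL_PAGE_SIZE);
}

/* Programming can only clear bits, so it behaves like a bitwise AND
   between the old contents and the new value. */
static void model_program_byte(uint8_t *cell, uint8_t value) {
    *cell &= value;
}
```

Programming 0x0f and then 0xf0 into the same erased cell leaves 0x00, not 0xf0. That is why writing over stale data without erasing first produces garbage.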

        if (eraseAddress < APP_END )
        {
            boot_page_erase(eraseAddress);  // Perform page erase
            boot_spm_busy_wait();           // Wait until the memory is erased.
            eraseAddress += SPM_PAGESIZE;   // point to next page to be erase
        }

And here’s the problem! eraseAddress is initially set to 0, and thereafter only used and updated in the above snippet. So it moves, page by page, through the flash memory, erasing as it goes. Meanwhile, address also moves, page by page, through the flash memory, writing as it goes. As long as these two pointers are synchronized, everything is fine. However, the address pointer can actually skip large sections if the bootloader receives a CMD_LOAD_ADDRESS command. In this case, the eraseAddress pointer will fall behind. Some sections of memory will be erased after they have been written; others will be written without first being erased.

This explains why the verification failed at byte 0xff00. In my game, the first gap in data ends at byte 0xff00. So, starting at that address, the flash was written and then, once eraseAddress caught up, erased back to 0xff.
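The desynchronization is easy to reproduce in a toy model on a desktop machine. The sketch below is not the bootloader; it is a simplified simulation with a made-up 4-byte page and a 32-byte flash, and all names are hypothetical. It uploads two segments separated by a gap and reports the first byte that fails verification.

```c
#include <stdint.h>
#include <string.h>

#define PAGE  4u   /* toy page size; the real SPM_PAGESIZE is much larger */
#define FLASH 32u  /* toy application area, standing in for APP_END */

typedef struct {
    uint8_t  flash[FLASH];
    uint32_t address;        /* write pointer, moved by CMD_LOAD_ADDRESS */
    uint32_t erase_address;  /* erase pointer, only ever incremented */
} boot_model_t;

static void model_reset(boot_model_t *b) {
    memset(b->flash, 0xFF, FLASH);
    b->address = 0;
    b->erase_address = 0;
}

static void model_load_address(boot_model_t *b, uint32_t byte_address) {
    b->address = byte_address;  /* erase_address is NOT updated */
}

static void model_program_page(boot_model_t *b, const uint8_t *data) {
    /* Same shape as the bootloader: erase at erase_address ... */
    if (b->erase_address < FLASH) {
        memset(b->flash + b->erase_address, 0xFF, PAGE);
        b->erase_address += PAGE;
    }
    /* ... but write at address. Programming can only clear bits. */
    for (uint32_t i = 0; i < PAGE; i++) {
        b->flash[b->address + i] &= data[i];
    }
    b->address += PAGE;
}

/* Upload [0, seg1_end) and [seg2_start, seg2_end) from image, then verify.
   Returns the first mismatching address, or -1 if everything matches.
   All bounds are assumed to be page-aligned and within FLASH. */
static int32_t model_upload(boot_model_t *b, const uint8_t *image,
                            uint32_t seg1_end, uint32_t seg2_start,
                            uint32_t seg2_end) {
    uint32_t a;
    model_reset(b);
    model_load_address(b, 0);
    for (a = 0; a < seg1_end; a += PAGE) model_program_page(b, image + a);
    model_load_address(b, seg2_start);
    for (a = seg2_start; a < seg2_end; a += PAGE) model_program_page(b, image + a);
    for (a = 0; a < seg2_end; a++) {
        int in_gap = (a >= seg1_end && a < seg2_start);
        if (!in_gap && b->flash[a] != image[a]) return (int32_t)a;
    }
    return -1;
}
```

With an all-zero image and segments at [0, 8) and [16, 28), model_upload returns 16: the first mismatch is at the end of the gap, where the model reads back 0xff instead of 0x00. If the second segment starts at 8 instead, so there is no gap, it returns -1.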

Workarounds

I suspect you could fix the bug pretty easily by removing the eraseAddress variable and using address in its place. This would prevent partial-page writes, but these aren’t particularly useful and already aren’t supported. However, you’ll have to recompile and burn the bootloader.
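As a rough illustration of that idea, here is a toy desktop model in which the erase always targets the page about to be written. I haven’t tested this change in the real bootloader; the 4-byte page, the 32-byte flash, and all names are invented for the sketch, and it assumes address is page-aligned.

```c
#include <stdint.h>
#include <string.h>

#define PAGE  4u   /* toy page size */
#define FLASH 32u  /* toy application area, standing in for APP_END */

typedef struct {
    uint8_t  flash[FLASH];
    uint32_t address;  /* the only pointer: used for both erase and write */
} fixed_model_t;

static void fixed_program_page(fixed_model_t *b, const uint8_t *data) {
    /* Simplification: the model ignores writes outside the toy
       application area instead of touching a bootloader section. */
    if (b->address < FLASH) {
        /* Erase the page that is about to be written ... */
        memset(b->flash + b->address, 0xFF, PAGE);
        /* ... then program it. Programming can only clear bits. */
        for (uint32_t i = 0; i < PAGE; i++) {
            b->flash[b->address + i] &= data[i];
        }
    }
    b->address += PAGE;
}
```

Because the erase follows address, a jump caused by CMD_LOAD_ADDRESS can no longer leave the two operations pointing at different pages.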

Alternatively, you could skip the bootloader entirely and program the microcontroller over SPI. This can be done with a second Arduino or with a dedicated programmer. These approaches work, but mean that the Arduino’s USB port and associated circuitry go pretty much unused. Not very satisfying.

The easiest workaround is simply to ensure that your HEX files contain only a single, contiguous segment. This will generally be the case if you’re compiling with avr-gcc or the Arduino IDE. If you’re writing assembly by hand, as I was, and have organized things neatly into 64-kilobyte partitions, the resulting HEX files will likely contain several segments. Since HEX files aren’t really human-editable, I wrote a tool called flhex that merges these segments into a single, larger, contiguous segment.
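The core of such a merge can be sketched as follows. This is not flhex’s actual code; it is a minimal illustration that assumes the segments have already been parsed out of the HEX file, that the image starts at address 0, and that gaps are padded with a caller-chosen fill byte (0xff, the erased state, seems a natural choice). The type and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t       start;   /* byte address of the segment */
    const uint8_t *data;    /* segment contents */
    uint32_t       length;  /* number of bytes in data */
} segment_t;

/* Copy every segment into one contiguous image starting at address 0,
   padding the gaps with fill. Returns the image length (the highest end
   address), or 0 if a segment does not fit in out. */
static uint32_t flatten_segments(const segment_t *segs, size_t count,
                                 uint8_t *out, uint32_t out_size,
                                 uint8_t fill) {
    uint32_t end = 0;
    size_t i;
    for (i = 0; i < count; i++) {
        if (segs[i].start > out_size ||
            segs[i].length > out_size - segs[i].start) {
            return 0;
        }
        if (segs[i].start + segs[i].length > end) {
            end = segs[i].start + segs[i].length;
        }
    }
    memset(out, fill, end);
    for (i = 0; i < count; i++) {
        memcpy(out + segs[i].start, segs[i].data, segs[i].length);
    }
    return end;
}
```

Writing the flattened image back out as a single run of HEX records means the bootloader never sees an address jump, so the erase and write pointers stay in step.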

$ flhex bin/main.hex -o bin/main.hex
$ avrdude -p atmega2560 -c wiring -P /dev/ttyACM0 -b 115200 -D -U flash:w:bin/main.hex:i

avrdude: AVR device initialized and ready to accept instructions

Reading | ################################################## | 100% 0.01s

avrdude: Device signature = 0x1e9801 (probably m2560)
avrdude: reading input file "bin/main.hex"
avrdude: writing flash (235758 bytes):

Writing | ################################################## | 100% 37.72s

avrdude: 235758 bytes of flash written
avrdude: verifying flash memory against bin/main.hex:
avrdude: load data flash data from input file bin/main.hex:
avrdude: input file bin/main.hex contains 235758 bytes
avrdude: reading on-chip flash data:

Reading | ################################################## | 100% 29.53s

avrdude: verifying ...
avrdude: 235758 bytes of flash verified

avrdude: safemode: Fuses OK (E:FD, H:D8, L:FF)

avrdude done.  Thank you.

The disadvantage of using something like flhex is that the resulting HEX files will be somewhat larger and slower to upload. It also adds another dependency and another step to the build pipeline.

Thoughts

I first ran into this bug over a year ago, not long after starting work on an Arduino-based game. It caused me to question flash wear, the USB-to-serial chip (actually a programmable microcontroller now!), avrdude, and the hardware. I’m very glad to have finally solved this tricky problem. However, it’s frustrating that a program as fundamental and well-used as an Arduino bootloader has a bug like this. To be fair, most people will be using avr-gcc or the Arduino IDE, and so are unlikely to run into this problem.

Anyway, this has been an interesting debugging journey. I hope this writeup will spare someone a few hours of trouble. If your verification woes persist, you have my sincere sympathy.