Description
I initially intended to have the title "g++ incorrectly merges PROGMEM and non-PROGMEM data", but opted to focus on the consequences since I think this is a really bad show stopper bug.
The TL;DR version is that because library file RTClib.cpp
contains (using PROGMEM
)
const uint8_t daysInMonth [] PROGMEM = { 31,28,31,30,31,30,31,31,30,31,30,31 };
and library file Time.cpp
contains (not using PROGMEM
)
static const uint8_t monthDays[]={31,28,31,30,31,30,31,31,30,31,30,31}; // API starts months from 1, this array starts from 0
the compiler incorrectly merges them, but due to the micro-controller's modified Harvard architecture they are stored in different memory types and cannot be accessed the same way. Attempts to read one of them will result in garbage data.
Initially I debugged the problem by adding some Serial.print
to Time.cpp
and RTClib.cpp
and discovered that in Time.cpp
the number of days in a month were garbage values comming from its monthDays
array.
I managed to reproduce similar behaviour in a standalone sketch, without requiering modifications of libraries.
#include <Arduino.h>
#include <avr/pgmspace.h>
// #include <Time.h>
// OK to merge
const uint8_t one_to_twelve1[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
const uint8_t one_to_twelve2[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12};
// OK to merge
const char abcd1[] PROGMEM = {'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd'};
const char abcd2[] PROGMEM = {'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd', 'a', 'b', 'c', 'd'};
// NOT OK to merge
const uint8_t month_days1[] PROGMEM = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
const uint8_t month_days2[] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
// NOT OK to merge
const uint8_t primes1[] = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37};
const uint8_t primes2[] PROGMEM = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37};
#define PRINT(var) \
do { \
Serial.print(#var); \
Serial.print(" = "); \
Serial.println(var); \
} while (0)
static void print_vars() {
PRINT(one_to_twelve1[0]);
PRINT(one_to_twelve2[0]);
PRINT(pgm_read_byte(&abcd1[0]));
PRINT(pgm_read_byte(&abcd2[0]));
PRINT(pgm_read_byte(&month_days1[0]));
PRINT(month_days2[0]);
PRINT(primes1[0]);
PRINT(pgm_read_byte(&primes2[0]));
}
const uint8_t monthDays[] = {31, 28, 31, 30, 31, 99, 31, 31, 30, 31, 30, 31};
const uint8_t monthDays2[] PROGMEM = {
// 'a', 'b', 'c', 'd' // OK
// 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30 // OK
// 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 // Fails
// 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31, 32 // OK
31, 28, 31, 30, 31, 99, 31, 31, 30, 31, 30, 31 // Fails also when changing Time.cpp to:
// static const uint8_t monthDays[]={31,28,31,30,31,99,31,31,30,31,30,31};
};
void dump_monthDays_content() {
Serial.print("monthDays: ");
for (int i = 0; i < 12; i++) {
Serial.print(monthDays[i]);
Serial.print(", ");
}
Serial.println("");
}
void setup() {
Serial.begin(9600);
dump_monthDays_content();
Serial.print("days ");
uint8_t days = 0;
#if 1
days = pgm_read_byte(&monthDays2[0]);
#endif
#if 0
days = (__extension__({
uint16_t __addr16 = (uint16_t)(&monthDays2[0]);
uint8_t __result;
__asm__ __volatile__(
"lpm"
"\n\t"
"mov %0, r0"
"\n\t"
: "=r"(__result)
: "z"(__addr16)
: "r0");
__result;
}));
#endif
Serial.println(days);
print_vars();
}
void loop() {
dump_monthDays_content();
delay(1000);
}
When compiled it produces warning
warning: uninitialized variable 'monthDays2' put into program memory area [-Wuninitialized]
const uint8_t monthDays2[] PROGMEM = {
^
but I do not understand why it does not also complain about month_days1/2
and primes1/2
, they should have the same issue.
Initially I thought that this might have something to do with the actual access to the variable through the specific assembly instruction (hence also inlining the macro just to see if it made any difference (which it did not)), but it turns out that with the #if
s disabled the variable is just optimized away. Just referencing the variable like
void *p = &monthDays2[0];
Serial.print((int)p);
will keep it in the translation unit and produce the same issue. Running the above code on an Arduino Uno gives the following, notice the garbage value for days
below:
monthDays: 31, 28, 31, 30, 31, 99, 31, 31, 30, 31, 30, 31,
days 236
one_to_twelve1[0] = 1
one_to_twelve2[0] = 1
pgm_read_byte(&abcd1[0]) = 97
pgm_read_byte(&abcd2[0]) = 97
pgm_read_byte(&month_days1[0]) = 31
month_days2[0] = 31
primes1[0] = 2
pgm_read_byte(&primes2[0]) = 2
monthDays: 31, 28, 31, 30, 31, 99, 31, 31, 30, 31, 30, 31,
monthDays: 31, 28, 31, 30, 31, 99, 31, 31, 30, 31, 30, 31,
monthDays: 31, 28, 31, 30, 31, 99, 31, 31, 30, 31, 30, 31,
monthDays: 31, 28, 31, 30, 31, 99, 31, 31, 30, 31, 30, 31,
monthDays: 31, 28, 31, 30, 31, 99, 31, 31, 30, 31, 30, 31,
I guess this problem is related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80462.
I have tested compiling and flashing both with Arduino studio 1.8.8 & nightly (and with vscode, although it is just reusing the toolchain from the Arduino IDE). I have tested on two different computers. And I have tested with two different arduino boards.
$HOME/.arduino15/packages/arduino/tools/avr-gcc/5.4.0-atmel3.6.1-arduino2/bin/avr-gcc --version | head -2
avr-gcc (GCC) 5.4.0
Copyright (C) 2015 Free Software Foundation, Inc.
The nightly build also uses 5.4.0-atmel3.6.1-arduino2, so no difference.