ESP32 Device Time Drift

Autodog · April 17, 2021, 4:13am

Greetings,

I’ve noticed that the local timestamps on my ESP32 fleet are drifting quite a bit (tens of minutes) vs. actual UTC.

In the past, the mDash device drivers would periodically sync to a network time source to correct for local drift. This no longer appears to be happening, as evidenced by the timestamps in mdash.log and by the absence of the network time synchronization events that used to appear in these same logs.

Is there something on the mdash backend that may have broken here?

Thanks!
-AD

lsm · April 27, 2021, 4:12pm

Actually, what happened is the following:

Every time a device connects to the mDash cloud, mDash sends the current timestamp in a handshake, and the device applies it.
Before, for various reasons (unstable devices, mDash updates/restarts), devices reconnected often and therefore synced their time often.
Now mDash is far more stable, so time sync happens rarely. The ESP32’s own clock drifts a lot without a dedicated RTC.

Bottom line - the time drift is actually a result of a stability improvement.

To fix the issue, the firmware should install a timer, hourly or daily, that syncs with NTP. The bad, brute-force alternative is a nightly reboot.
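
As a rough illustration of the timer approach, here is a minimal sketch of the bookkeeping in portable C. Every name in it (clock_sync, clock_sync_due, clock_sync_apply, clock_sync_now) is hypothetical and not part of mDash or ESP-IDF, and it assumes the firmware already has some way to obtain UTC seconds from an NTP server.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping for a periodic resync; not mDash or ESP-IDF API. */
struct clock_sync {
  int64_t offset_s;     /* UTC minus local uptime, taken at the last sync */
  int64_t last_sync_s;  /* local uptime at the last successful sync */
  int64_t interval_s;   /* 3600 for hourly, 86400 for daily */
  bool synced;          /* false until the first successful sync */
};

/* True when no sync has succeeded yet or the interval has elapsed. */
static bool clock_sync_due(const struct clock_sync *cs, int64_t uptime_s) {
  assert(cs != NULL);
  return !cs->synced || uptime_s - cs->last_sync_s >= cs->interval_s;
}

/* Record a fresh UTC reading (seconds since 1970) taken at uptime_s. */
static void clock_sync_apply(struct clock_sync *cs, int64_t uptime_s,
                             int64_t utc_s) {
  assert(cs != NULL);
  cs->offset_s = utc_s - uptime_s;
  cs->last_sync_s = uptime_s;
  cs->synced = true;
}

/* Corrected wall-clock time for a given local uptime. */
static int64_t clock_sync_now(const struct clock_sync *cs, int64_t uptime_s) {
  assert(cs != NULL);
  return uptime_s + cs->offset_s;
}
```

A timer callback would call clock_sync_due() on every tick, query NTP only when it returns true, and pass the result to clock_sync_apply(). If the query fails, nothing is recorded, so the next tick simply retries.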

Autodog · April 27, 2021, 4:18pm

OK, makes sense. Agreed that the ESP32 free-running clock definitely drifts a lot on its own.

NOTE:
There definitely is some inconsistent mDash time sync behavior when ESP32 devices reboot. If a device power-cycles, the initial NTP time sync is fine. If, however, you initiate a reboot via the mDash console, the device local time is significantly skewed, by roughly 25 minutes. This is repeatable across numerous devices in our fleet.

Would you happen to have an example function call of an NTP sync? That seems like a must-have right about now.

Thanks for your help!
-AD
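
On the request for an example NTP sync: the wire format itself is small, so here is a hedged sketch of the two protocol steps in portable C. The helper names (ntp_build_request, ntp_parse_reply) are made up for illustration and are not mDash or ESP-IDF calls. The sketch assumes the firmware sends the 48-byte request over UDP to port 123 of an NTP server and receives the reply itself. It uses whole seconds only and does not compensate for network round-trip delay.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NTP_PACKET_LEN 48
/* Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01). */
#define NTP_UNIX_DELTA INT64_C(2208988800)

/* Fill a 48-byte SNTP client request: LI=0, VN=4, Mode=3 (client). */
static void ntp_build_request(uint8_t pkt[NTP_PACKET_LEN]) {
  assert(pkt != NULL);
  memset(pkt, 0, NTP_PACKET_LEN);
  pkt[0] = (uint8_t) ((0 << 6) | (4 << 3) | 3);
}

/* Return the server transmit time as Unix seconds, or -1 if the reply is
 * too short, is not a server reply, has stratum 0, or carries a zero
 * timestamp. Ignores the NTP era rollover in 2036. */
static int64_t ntp_parse_reply(const uint8_t *pkt, size_t len) {
  assert(pkt != NULL);
  if (len < NTP_PACKET_LEN) return -1;
  if ((pkt[0] & 0x07) != 4) return -1;  /* mode 4 = server */
  if (pkt[1] == 0) return -1;           /* stratum 0 = unusable reply */
  /* Transmit timestamp, integer part: bytes 40..43, big-endian. */
  uint32_t secs = ((uint32_t) pkt[40] << 24) | ((uint32_t) pkt[41] << 16) |
                  ((uint32_t) pkt[42] << 8) | (uint32_t) pkt[43];
  if (secs == 0) return -1;
  return (int64_t) secs - NTP_UNIX_DELTA;
}
```

The value returned by ntp_parse_reply() is what the firmware would then hand to its clock-setting routine, for example settimeofday() where the platform provides it.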

lsm · May 6, 2021, 7:40am

Thank you.

It should be noted here that this question is not, strictly speaking, mDash related.
mDash has nothing to do with ESP32 clock drift.
Alleviating this issue is the firmware’s task, not mDash’s.

When a device reconnects to mDash, mDash sends a Sys.Info request. That RPC request accepts an optional parameter: utc_time. You could send that request manually. Thus, triggering device reconnect (not necessarily a reboot) should re-sync the clock.

The mDash library does not currently perform periodic NTP syncs.
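
For anyone trying the manual route, a request frame could be assembled roughly as follows. Only the method name Sys.Info and the utc_time parameter come from the post above. The JSON-RPC-style envelope, the id field, the assumption that utc_time is in seconds since 1970, and the helper name build_sys_info_request are all guesses that should be checked against the mDash documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes a Sys.Info request carrying utc_time into buf.
 * The envelope layout is an assumption, not confirmed mDash wire format.
 * Returns the number of characters written, or -1 if buf is too small. */
static int build_sys_info_request(char *buf, size_t len, int id,
                                  int64_t utc_time) {
  assert(buf != NULL);
  int n = snprintf(buf, len,
                   "{\"id\":%d,\"method\":\"Sys.Info\","
                   "\"params\":{\"utc_time\":%lld}}",
                   id, (long long) utc_time);
  return (n < 0 || (size_t) n >= len) ? -1 : n;
}
```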

lsm · May 6, 2021, 9:50am

Please try https://github.com/cesanta/mDash/releases/tag/1.2.14 which has an hourly automatic NTP sync.

Autodog · May 13, 2021, 3:33am

Thank you.

The NTP time sync does not seem to kick in after reset (there can still be a large time offset), but once the hourly sync kicks in the time looks good.

-AD

Autodog · September 19, 2021, 2:36am

Greetings,

It seems the NTP sync updates implemented in mDash 1.2.14 are no longer working as they once were. We have observed that all of our devices in the field have an offset of 30-32 minutes vs. UTC, which causes some headaches.

Can Cesanta look at this once again and see if something with NTP sync has broken?

Thanks!
-AD

Autodog · October 30, 2021, 11:47pm

An update on the local “time-drift” topic.

After extensive testing, we can definitely confirm the issue is related to the mDash 1.2.14 update. If you use this version, then every 2 hours or so mDash will apply a timing offset to the local clock on your device. Even if you periodically initiate an NTP time sync, mDash will override it and apply a timing offset. The exact mechanism of how this happens I will need to leave to Cesanta, but the effect is very deterministic.

If you revert to mDash 1.2.13, this will not happen. For those of you who care about the accuracy of your local clock, I’d recommend reverting, as we have, and periodically calling an NTP sync to account for slow ESP32 timing drift (we do this every 24 hours). This method works well and we’ve successfully implemented it on our fleet.

@lsm - Recommend that you back out the changes you applied when you moved from 1.2.13 to 1.2.14.

-AD