For a long time now, all machines in Apollo-NG's infrastructure have used chrony as a replacement for the usual ntpd package. chrony compares to ntpd much like nginx compares to Apache: a newer, much more lightweight approach with some very nice additional features. But while nginx has replaced most of the Apache installations these days, chrony still hasn't been adopted as a good alternative by most people.
During the ntpd reflection attacks and the mysql/kernel/ntpd leap second bug, it was nice to see how chrony saved a lot of time and grief by not being affected. It had been working here, and in many other scenarios, absolutely flawlessly. Until today.
While setting up Gentoo on an Odroid C1 (quad-core ARMv7) as a replacement embedded system for picoprint, chronyd died as soon as it tried to sync the kernel time, like this:
2015-02-07T16:29:09Z chronyd version 1.31 starting
2015-02-07T16:31:20Z Selected source 129.70.132.35
2015-02-07T16:31:20Z Fatal error : adjtimex failed for set_frequency, freq_ppm=-1.6047e+00 required_freq=1.6047e+00 required_tick=10000
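To check whether a kernel rejects frequency adjustments independently of chronyd, the failing operation can be approximated with a direct adjtimex(2) call. The following is only a minimal sketch, not chronyd's actual code path: the helper names are made up for illustration, and it assumes a Linux libc that provides `<sys/timex.h>`. Setting the frequency needs root (CAP_SYS_TIME) and really does change the clock frequency, so use it on a test box only.

```c
#include <errno.h>
#include <string.h>
#include <sys/timex.h>

/* Hypothetical helper: convert ppm to the unit used by timex.freq,
 * which is ppm with a 16-bit fractional part (steps of 2^-16 ppm).
 * Rounds to the nearest step, away from zero on ties. */
long ppm_to_timex_freq(double ppm)
{
    double scaled = ppm * 65536.0;
    return (long)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
}

/* Hypothetical helper: read the kernel's current frequency offset.
 * modes = 0 is a read-only query and needs no privileges.
 * Returns 0 on success, otherwise the errno value. */
int read_kernel_freq(long *freq_out)
{
    struct timex tx;

    memset(&tx, 0, sizeof tx);
    tx.modes = 0;
    if (adjtimex(&tx) == -1)
        return errno;
    *freq_out = tx.freq;
    return 0;
}

/* Hypothetical helper: ask the kernel to set a new frequency offset.
 * Returns 0 on success, otherwise the errno value (EPERM without
 * CAP_SYS_TIME, EINVAL if the kernel's validation rejects the value). */
int try_set_kernel_freq(long freq)
{
    struct timex tx;

    memset(&tx, 0, sizeof tx);
    tx.modes = ADJ_FREQUENCY;
    tx.freq = freq;
    if (adjtimex(&tx) == -1)
        return errno;
    return 0;
}
```

Calling `try_set_kernel_freq(ppm_to_timex_freq(-1.6047))` as root should return `EINVAL` on a kernel that rejects the value the way the log above shows, and 0 on a kernel that accepts it. Reading the current value first with `read_kernel_freq()` makes it possible to restore it afterwards.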
Deploying 2.0pre1 didn't fix the issue, and baking new kernels with different dyn_ticks settings didn't help either. But finally there was a break: https://bugzilla.redhat.com/show_bug.cgi?id=1188074#c3.
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -634,9 +634,9 @@ int ntp_validate_timex(struct timex *txc)
 		return -EPERM;
 	if (txc->modes & ADJ_FREQUENCY) {
-		if (LONG_MIN / PPM_SCALE > txc->freq)
+		if (-MAXFREQ_SCALED / PPM_SCALE > txc->freq)
 			return -EINVAL;
-		if (LONG_MAX / PPM_SCALE < txc->freq)
+		if (MAXFREQ_SCALED / PPM_SCALE < txc->freq)
 			return -EINVAL;
 	}
After patching the Odroid kernel sources with the recommended patch from the bug tracker, chronyd is working like a charm again.
Update
After linking the issue on GitHub, kukabu delivered the link to the upstream kernel bug:
--- a/kernel/time/ntp.c
+++ b/kernel/time/ntp.c
@@ -634,9 +634,9 @@ int ntp_validate_timex(struct timex *txc)
 		return -EPERM;
 	if (txc->modes & ADJ_FREQUENCY) {
-		if (LONG_MIN / PPM_SCALE > txc->freq)
+		if (LLONG_MIN / PPM_SCALE > txc->freq)
 			return -EINVAL;
-		if (LONG_MAX / PPM_SCALE < txc->freq)
+		if (LLONG_MAX / PPM_SCALE < txc->freq)
 			return -EINVAL;
 	}
This patch was tested and solves chrony's sync problems as well. So if you're experiencing NTP issues with newer kernels that use dynamic or idle tickless clocks (at least from an embedded perspective), you might want to check your kernel and patch it with this one.
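Why the original check bites on a 32-bit ARM board can be seen by redoing its arithmetic in user space. The sketch below is not kernel code. It assumes that `PPM_SCALE` is `1000 << 16` (as in the kernel's timex header, to the best of our knowledge) and models a 32-bit `long` with `INT32_MIN`/`INT32_MAX`; the function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: same value as the kernel's PPM_SCALE,
 * i.e. NSEC_PER_USEC << (NTP_SCALE_SHIFT - SHIFT_USEC) = 1000 << 16. */
#define PPM_SCALE_SKETCH ((int64_t)1000 << 16)

/* Same shape as the check in ntp_validate_timex(): freq (in units of
 * 2^-16 ppm) is accepted only if lo/PPM_SCALE <= freq <= hi/PPM_SCALE. */
bool freq_in_range(int64_t freq, int64_t lo, int64_t hi)
{
    if (lo / PPM_SCALE_SKETCH > freq)
        return false;
    if (hi / PPM_SCALE_SKETCH < freq)
        return false;
    return true;
}

/* Original check with a 32-bit long (LONG_MIN/LONG_MAX on ARMv7). */
bool accepted_with_32bit_long(int64_t freq)
{
    return freq_in_range(freq, INT32_MIN, INT32_MAX);
}

/* Upstream variant of the check using LLONG_MIN/LLONG_MAX. */
bool accepted_with_long_long(int64_t freq)
{
    return freq_in_range(freq, INT64_MIN, INT64_MAX);
}
```

With a 32-bit `long` the accepted range collapses to ±32 units, which is about ±0.0005 ppm. The 1.6047 ppm from the log corresponds to roughly 105166 units, so `accepted_with_32bit_long(-105166)` is false while `accepted_with_long_long(-105166)` is true. On 64-bit systems `long` is already 64 bits wide, which would explain why the problem only shows up on 32-bit hardware.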
The Odroid C1 also came with another, not so obvious pitfall: the RTC power supply. The optional RTC battery, which you can usually buy wherever you can buy the Odroid itself, is the only power supply for the RTC. Meaning: without this battery, the RTC is not powered at all (not even while the system is powered and running). So in order to get the RTC working, you need to feed power to the RTC header. A quick glance at the schematics revealed the secret:
U15 (XC6215B0927R-G) is a highly precise, low-noise, positive-voltage LDO regulator that supplies the VDD_RTC rail, and it is only connected to the RTC battery pin header. So what to do when no battery is available? Just a quick hack to solve it:
Since using a battery seemed kinda backwards and there was a 5.5V 1F goldcap flying around here, this hack (second from the left, minus the resistor) was used: 3.3V from pin 1 of the Odroid C1's 40-pin header goes through a Schottky diode with a 0.4V drop (to get to about 2.9V, which emulates a CR2032 battery closely enough) and is then fed to the goldcap, charging it and supplying the RTC power header of the Odroid C1 board while running. GND is the pin towards the UART, where it says RTC on the silkscreen; it is connected to (-) of the goldcap and also to GND of the 40-pin header. The VDD_RTC pin is the one closer to the IR receiver and is directly connected to the goldcap's (+).
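The voltage arithmetic of this hack, plus a rough way to estimate how long the goldcap can keep the RTC alive once the board is powered off, can be written down as follows. The hold-up estimate assumes a constant load current $I_{\mathrm{RTC}}$ and a minimum usable input voltage $V_{\mathrm{min}}$. Neither value is given here, so both are placeholders to be taken from the datasheets or measured.

```latex
\begin{align*}
V_{\mathrm{cap}} &\approx 3.3\,\mathrm{V} - 0.4\,\mathrm{V} = 2.9\,\mathrm{V} \\
t_{\mathrm{hold}} &\approx \frac{C \,\bigl(V_{\mathrm{cap}} - V_{\mathrm{min}}\bigr)}{I_{\mathrm{RTC}}},
\qquad C = 1\,\mathrm{F}
\end{align*}
```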