f145a6
From patchwork Fri May 11 02:27:50 2018
f145a6
Content-Type: text/plain; charset="utf-8"
f145a6
MIME-Version: 1.0
f145a6
Content-Transfer-Encoding: 8bit
f145a6
Subject: [1/2] arm64: arch_timer: Workaround for Allwinner A64 timer
f145a6
 instability
f145a6
From: Samuel Holland <samuel@sholland.org>
f145a6
X-Patchwork-Id: 10392891
f145a6
Message-Id: <20180511022751.9096-2-samuel@sholland.org>
f145a6
To: Maxime Ripard <maxime.ripard@bootlin.com>, Chen-Yu Tsai <wens@csie.org>, 
f145a6
 Catalin Marinas <catalin.marinas@arm.com>,
f145a6
 Will Deacon <will.deacon@arm.com>,
f145a6
 Daniel Lezcano <daniel.lezcano@linaro.org>,
f145a6
 Thomas Gleixner <tglx@linutronix.de>, Marc Zyngier <marc.zyngier@arm.com>
f145a6
Cc: linux-sunxi@googlegroups.com, linux-kernel@vger.kernel.org,
f145a6
 linux-arm-kernel@lists.infradead.org, Samuel Holland <samuel@sholland.org>
f145a6
Date: Thu, 10 May 2018 21:27:50 -0500
f145a6
f145a6
The Allwinner A64 SoC is known [1] to have an unstable architectural
f145a6
timer, which manifests itself most obviously in the time jumping forward
f145a6
a multiple of 95 years [2][3]. This coincides with 2^56 cycles at a
f145a6
timer frequency of 24 MHz, implying that the time went slightly backward
f145a6
(and this was interpreted by the kernel as it jumping forward and
f145a6
wrapping around past the epoch).
f145a6
f145a6
Further investigation revealed instability in the low bits of CNTVCT at
f145a6
the point a high bit rolls over. This leads to power-of-two cycle
f145a6
forward and backward jumps. (Testing shows that forward jumps are about
f145a6
twice as likely as backward jumps.)
f145a6
f145a6
Without trapping reads to CNTVCT, a userspace program is able to read it
f145a6
in a loop faster than it changes. A test program running on all 4 CPU
f145a6
cores that reported jumps larger than 100 ms was run for 13.6 hours and
f145a6
reported the following:
f145a6
f145a6
 Count | Event
f145a6
-------+---------------------------
f145a6
  9940 | jumped backward      699ms
f145a6
   268 | jumped backward     1398ms
f145a6
     1 | jumped backward     2097ms
f145a6
 16020 | jumped forward       175ms
f145a6
  6443 | jumped forward       699ms
f145a6
  2976 | jumped forward      1398ms
f145a6
     9 | jumped forward    356516ms
f145a6
     9 | jumped forward    357215ms
f145a6
     4 | jumped forward    714430ms
f145a6
     1 | jumped forward   3578440ms
f145a6
f145a6
This works out to a jump larger than 100 ms about every 5.5 seconds on
f145a6
each CPU core.
f145a6
f145a6
The largest jump (almost an hour!) was the following sequence of reads:
f145a6
      0x0000007fffffffff → 0x00000093feffffff → 0x0000008000000000
f145a6
f145a6
Note that the middle bits don't necessarily all read as all zeroes or
f145a6
all ones during the anomalous behavior; however the low 11 bits checked
f145a6
by the function in this patch have never been observed with any other
f145a6
value.
f145a6
f145a6
Also note that smaller jumps are much more common, with the smallest
f145a6
backward jumps of 2048 cycles observed over 400 times per second on each
f145a6
core. (Of course, this is partially due to lower bits rolling over more
f145a6
frequently.) Any one of these could have caused the 95 year time skip.
f145a6
f145a6
Similar anomalies were observed while reading CNTPCT (after patching the
f145a6
kernel to allow reads from userspace). However, the jumps are much less
f145a6
frequent, and only small jumps were observed. The same program as before
f145a6
(except now reading CNTPCT) observed after 72 hours:
f145a6
f145a6
 Count | Event
f145a6
-------+---------------------------
f145a6
    17 | jumped backward      699ms
f145a6
    52 | jumped forward       175ms
f145a6
  2831 | jumped forward       699ms
f145a6
     5 | jumped forward      1398ms
f145a6
Acked-by: Maxime Ripard <maxime.ripard@bootlin.com>
f145a6
Tested-by: Andre Przywara <andre.przywara@arm.com>
f145a6
f145a6
========================================================================
f145a6
f145a6
Because the CPU can read the CNTPCT/CNTVCT registers faster than they
f145a6
change, performing two reads of the register and comparing the high bits
f145a6
(like other workarounds) is not a workable solution. And because the
f145a6
timer can jump both forward and backward, no pair of reads can
f145a6
distinguish a good value from a bad one. The only way to guarantee a
f145a6
good value from consecutive reads would be to read _three_ times, and
f145a6
take the middle value iff the three values are 1) individually unique
f145a6
and 2) increasing. This takes at minimum 3 cycles (125 ns), or more if
f145a6
an anomaly is detected.
f145a6
f145a6
However, since there is a distinct pattern to the bad values, we can
f145a6
optimize the common case (2046/2048 of the time) to a single read by
f145a6
simply ignoring values that match the pattern. This still takes no more
f145a6
than 3 cycles in the worst case, and requires much less code.
f145a6
f145a6
[1]: https://github.com/armbian/build/commit/a08cd6fe7ae9
f145a6
[2]: https://forum.armbian.com/topic/3458-a64-datetime-clock-issue/
f145a6
[3]: https://irclog.whitequark.org/linux-sunxi/2018-01-26
f145a6
f145a6
Signed-off-by: Samuel Holland <samuel@sholland.org>
f145a6
---
f145a6
 drivers/clocksource/Kconfig          | 11 ++++++++++
f145a6
 drivers/clocksource/arm_arch_timer.c | 39 ++++++++++++++++++++++++++++++++++++
f145a6
 2 files changed, 50 insertions(+)
f145a6
f145a6
diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
f145a6
index 8e8a09755d10..7a5d434dd30b 100644
f145a6
--- a/drivers/clocksource/Kconfig
f145a6
+++ b/drivers/clocksource/Kconfig
f145a6
@@ -364,6 +364,17 @@ config ARM64_ERRATUM_858921
f145a6
 	  The workaround will be dynamically enabled when an affected
f145a6
 	  core is detected.
f145a6
 
f145a6
+config SUN50I_A64_UNSTABLE_TIMER
f145a6
+	bool "Workaround for Allwinner A64 timer instability"
f145a6
+	default y
f145a6
+	depends on ARM_ARCH_TIMER && ARM64 && ARCH_SUNXI
f145a6
+	select ARM_ARCH_TIMER_OOL_WORKAROUND
f145a6
+	help
f145a6
+	  This option enables a workaround for instability in the timer on
f145a6
+	  the Allwinner A64 SoC. The workaround will only be active if the
f145a6
+	  allwinner,sun50i-a64-unstable-timer property is found in the
f145a6
+	  timer node.
f145a6
+
f145a6
 config ARM_GLOBAL_TIMER
f145a6
 	bool "Support for the ARM global timer" if COMPILE_TEST
f145a6
 	select TIMER_OF if OF
f145a6
diff --git a/drivers/clocksource/arm_arch_timer.c b/drivers/clocksource/arm_arch_timer.c
f145a6
index 57cb2f00fc07..66ce13578c52 100644
f145a6
--- a/drivers/clocksource/arm_arch_timer.c
f145a6
+++ b/drivers/clocksource/arm_arch_timer.c
f145a6
@@ -319,6 +319,36 @@ static u64 notrace arm64_858921_read_cntvct_el0(void)
f145a6
 }
f145a6
 #endif
f145a6
 
f145a6
+#ifdef CONFIG_SUN50I_A64_UNSTABLE_TIMER
f145a6
+/*
f145a6
+ * The low bits of each register can transiently read as all ones or all zeroes
f145a6
+ * when bit 11 or greater rolls over. Since the value can jump both backward
f145a6
+ * (7ff -> 000 -> 800) and forward (7ff -> fff -> 800), it is simplest to just
f145a6
+ * ignore register values with all ones or zeros in the low bits.
f145a6
+ */
f145a6
+static u64 notrace sun50i_a64_read_cntpct_el0(void)
f145a6
+{
f145a6
+	u64 val;
f145a6
+
f145a6
+	do {
f145a6
+		val = read_sysreg(cntpct_el0);
f145a6
+	} while (((val + 1) & GENMASK(10, 0)) <= 1);
f145a6
+
f145a6
+	return val;
f145a6
+}
f145a6
+
f145a6
+static u64 notrace sun50i_a64_read_cntvct_el0(void)
f145a6
+{
f145a6
+	u64 val;
f145a6
+
f145a6
+	do {
f145a6
+		val = read_sysreg(cntvct_el0);
f145a6
+	} while (((val + 1) & GENMASK(10, 0)) <= 1);
f145a6
+
f145a6
+	return val;
f145a6
+}
f145a6
+#endif
f145a6
+
f145a6
 #ifdef CONFIG_ARM_ARCH_TIMER_OOL_WORKAROUND
f145a6
 DEFINE_PER_CPU(const struct arch_timer_erratum_workaround *, timer_unstable_counter_workaround);
f145a6
 EXPORT_SYMBOL_GPL(timer_unstable_counter_workaround);
f145a6
@@ -408,6 +438,15 @@ static const struct arch_timer_erratum_workaround ool_workarounds[] = {
f145a6
 		.read_cntvct_el0 = arm64_858921_read_cntvct_el0,
f145a6
 	},
f145a6
 #endif
f145a6
+#ifdef CONFIG_SUN50I_A64_UNSTABLE_TIMER
f145a6
+	{
f145a6
+		.match_type = ate_match_dt,
f145a6
+		.id = "allwinner,sun50i-a64-unstable-timer",
f145a6
+		.desc = "Allwinner A64 timer instability",
f145a6
+		.read_cntpct_el0 = sun50i_a64_read_cntpct_el0,
f145a6
+		.read_cntvct_el0 = sun50i_a64_read_cntvct_el0,
f145a6
+	},
f145a6
+#endif
f145a6
 };
f145a6
 
f145a6
 typedef bool (*ate_match_fn_t)(const struct arch_timer_erratum_workaround *,