From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Jeremy Cline <jcline@redhat.com>
Date: Tue, 23 Jul 2019 15:24:30 +0000
Subject: [PATCH] kdump: add support for crashkernel=auto

Rebased for v5.3-rc1 because the documentation has moved.

    Message-id: <20180604013831.574215750@redhat.com>
    Patchwork-id: 8166
    O-Subject: [kernel team] [PATCH RHEL8.0 V2 2/2] kdump: add support for crashkernel=auto
    Bugzilla: 1507353
    RH-Acked-by: Don Zickus <dzickus@redhat.com>
    RH-Acked-by: Baoquan He <bhe@redhat.com>
    RH-Acked-by: Pingfan Liu <piliu@redhat.com>

    Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1507353
    Build: https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=16534135
    Tested: ppc64le, x86_64 with several memory sizes.
            kdump qe tested 160M on various x86 machines in lab.

    We continue to provide crashkernel=auto as we did in RHEL6 and
    RHEL7. This simplifies kdump deployment for the common use cases
    where kdump just works with the automatically reserved values.
    It is still a best-effort estimate: we cannot know the exact memory
    requirement because it depends on many different factors.

    The implementation of crashkernel=auto is simplified to a wrapper
    around the following kernel cmdline values:
    x86_64: crashkernel=1G-64G:160M,64G-1T:256M,1T-:512M
    s390x:  crashkernel=4G-64G:160M,64G-1T:256M,1T-:512M
    arm64:  crashkernel=2G-:512M
    ppc64:  crashkernel=2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G

    The difference between this and the old implementation in RHEL6/7
    is that we no longer scale the crash reserved memory size according
    to the system memory size.

    The latest effort to move this upstream is in the thread below:
    https://lkml.org/lkml/2018/5/20/262
    Unfortunately it is still unlikely to be accepted, so we will
    continue to use a RHEL-only patch in RHEL8.

    The old patch description, copied below, explains the history:
    '''
        Non-upstream explanations:
        Besides the "crashkernel=X@Y" format, upstream also has the advanced
        "crashkernel=range1:size1[,range2:size2,...][@offset]" and
        "crashkernel=X,high{low}" formats, but they need more careful
        manual configuration, and have different values for different
        architectures.

        Most of the distributions use the standard "crashkernel=X@Y"
        upstream format, and use crashkernel range format for advanced
        scenarios, heavily relying on the user's involvement.

        "crashkernel=auto", by contrast, is a Red Hat specific feature. It
        has been used as the default boot cmdline since 2008 (RHEL6).
        It does not require users to figure out how much crash memory
        their systems need, and it has proved to work pretty well for
        common scenarios.

        "crashkernel=auto" was tested on and based on RHEL-related products,
        where we have stable kernel configurations and therefore more or
        less stable memory consumption. In 2014 we tried to post it
        upstream again, but it was NACKed because people considered it not
        general and unnecessary: users can specify their own values or do
        that with scripts. However, our customers insist on having it in RHEL.

        Also see one previous discussion related to this backport to Pegas:
        On 10/17/2016 at 10:15 PM, Don Zickus wrote:
        > On Fri, Oct 14, 2016 at 10:57:41AM +0800, Dave Young wrote:
        >> Don, agree with you we should evaluate them instead of just inherit
        >> them blindly. Below is what I think about kdump auto memory:
        >> There are two issues for crashkernel=auto in upstream:
        >> 1) It will be seen as a policy which should not go to kernel
        >> 2) It is hard to get a good number for the crash reserved size,
        >> considering various different kernel config options one can setups.
        >> In RHEL we are easier because our supported Kconfig is limited.
        >> I digged the upstream mail archive, but I'm not sure I got all the
        >> information, at least Michael Ellerman was objecting the series for
        >> 1).
        > Yes, I know.  Vivek and I have argued about this for years.  :-)
        >
        > I had hoped all the changes internally to the makedumpfile would allow
        > the memory configuration to stabilize at a number like 192M or 128M and
        > only in the rare cases extend beyond that.
        >
        > So I always treated that as a temporary hack until things were better.
        > With the hope of every new RHEL release we get smarter and better. :-)
        > Ideally it would be great if we could get the number down to 64M for most
        > cases and just turn it on in Fedora.  Maybe someday.... ;-)
        >
        > We can have this conversation when the patch gets reposted/refreshed
        > for upstream on rhkl?
        >
        > Cheers,
        > Don

        We had proposed to drop the historic crashkernel=auto code, move
        to the crashkernel=range:size format, and pass the values in anaconda.

        The initial reasoning was that crashkernel=range:size works just fine
        because we no longer need a complex algorithm to scale the crashkernel
        reserved size.  The old linear scaling was mainly there for old
        makedumpfile requirements and is no longer necessary.

        But with the new approach, backward compatibility is potentially at risk.
        For example, consider the following cases:
        1) When we upgrade from an older distribution like rhel-alt-7.4 (which
        uses crashkernel=auto) to rhel-alt-7.5 (which uses the crashkernel=xY
        format)
        In this case we can use anaconda scripts to check for
        'crashkernel=auto' in the kernel spec and update it to the new
        'crashkernel=range:size' format.
        2) When we upgrade from rhel-alt-7.5 (which uses crashkernel=xY format)
        to rhel-alt-7.6 (which uses crashkernel=xY format), but the x and/or Y
        values are changed in rhel-alt-7.6.
        For example from crashkernel=2G-:160M to crashkernel=2G-:192M, then we have
        no way to determine if the X and/or Y values were distribution
        provided or user specified ones.
        Since it is recommended to give precedence to user-specified values,
        we cannot do an upgrade in such a case.

        We therefore went back to resolving it in the kernel, and added a
        simpler version which simply maps "auto" onto the range:size style
        in code, making the RHEL-only code easy to maintain.
    '''

    Signed-off-by: Dave Young <dyoung@redhat.com>
    Signed-off-by: Herton R. Krzesinski <herton@redhat.com>

Upstream Status: RHEL only
Signed-off-by: Jeremy Cline <jcline@redhat.com>
---
 Documentation/admin-guide/kdump/kdump.rst | 11 +++++++++++
 kernel/crash_core.c                       | 14 ++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
index 2da65fef2a1c..d53a524f80f0 100644
--- a/Documentation/admin-guide/kdump/kdump.rst
+++ b/Documentation/admin-guide/kdump/kdump.rst
@@ -285,6 +285,17 @@ This would mean:
     2) if the RAM size is between 512M and 2G (exclusive), then reserve 64M
     3) if the RAM size is larger than 2G, then reserve 128M

+Or you can use crashkernel=auto if you have enough memory.  The threshold
+is 1G on x86_64, 2G on arm64, ppc64 and ppc64le, and 4G on s390x.
+If your system memory is less than the threshold, crashkernel=auto will not
+reserve memory.
+
+The automatically reserved memory size varies based on architecture.
+The size changes according to system memory size like below:
+    x86_64: 1G-64G:160M,64G-1T:256M,1T-:512M
+    s390x:  4G-64G:160M,64G-1T:256M,1T-:512M
+    arm64:  2G-:512M
+    ppc64:  2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G


 Boot into System Kernel
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index e4dfe2a05a31..8c6f59932247 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -258,6 +258,20 @@ static int __init __parse_crashkernel(char *cmdline,
 	if (suffix)
 		return parse_crashkernel_suffix(ck_cmdline, crash_size,
 				suffix);
+
+	if (strncmp(ck_cmdline, "auto", 4) == 0) {
+#ifdef CONFIG_X86_64
+		ck_cmdline = "1G-64G:160M,64G-1T:256M,1T-:512M";
+#elif defined(CONFIG_S390)
+		ck_cmdline = "4G-64G:160M,64G-1T:256M,1T-:512M";
+#elif defined(CONFIG_ARM64)
+		ck_cmdline = "2G-:512M";
+#elif defined(CONFIG_PPC64)
+		ck_cmdline = "2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G";
+#endif
+		pr_info("Using crashkernel=auto, the size chosen is a best effort estimation.\n");
+	}
+
 	/*
 	 * if the commandline contains a ':', then that's the extended
 	 * syntax -- if not, it must be the classic syntax
-- 
2.28.0