539770f
From patchwork Fri Jul  8 15:37:35 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: drm/amdgpu: Disable RPM helpers while reprobing connectors on resume
From: cpaul@redhat.com
X-Patchwork-Id: 97837
Message-Id: <1467992256-23832-1-git-send-email-cpaul@redhat.com>
To: amd-gfx@lists.freedesktop.org
Cc: Tom St Denis <tom.stdenis@amd.com>, Jammy Zhou <Jammy.Zhou@amd.com>,
 open list <linux-kernel@vger.kernel.org>, stable@vger.kernel.org,
 "open list:RADEON and AMDGPU DRM DRIVERS"
 <dri-devel@lists.freedesktop.org>,
 Alex Deucher <alexander.deucher@amd.com>, Lyude <cpaul@redhat.com>,
 Flora Cui <Flora.Cui@amd.com>,
 =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@amd.com>,
 Monk Liu <Monk.Liu@amd.com>
Date: Fri,  8 Jul 2016 11:37:35 -0400

Just about all of amdgpu's connector probing functions try to acquire
runtime PM refs. If we try to do this in the context of
amdgpu_resume_kms by calling drm_helper_hpd_irq_event(), we end up
deadlocking the system.

Since we're guaranteed to be holding the spinlock for RPM in
amdgpu_resume_kms, and we already know the GPU is in working order, we
need to prevent the RPM helpers from trying to run during the initial
connector reprobe on resume.

There are a couple of solutions I've explored for fixing this, but this
one by far seems to be the simplest and most reliable (plus I'm pretty
sure that's what disable_depth is there for anyway).

Reproduction recipe:
  - Get any laptop with dual GPUs using PRIME
  - Make sure runtime PM is enabled for amdgpu
  - Boot the machine
  - If the machine managed to boot without hanging, switch out of X to
    another VT. This should definitely cause X to hang indefinitely.

Cc: stable@vger.kernel.org
Signed-off-by: Lyude <cpaul@redhat.com>
---
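The effect of the disable_depth guard can be sketched in userspace C.
This is a toy model only: struct toy_dev and every toy_* function are
hypothetical stand-ins, not kernel APIs, and the blocking wait that a
real runtime PM "get" would perform from inside a resume callback is
modelled by returning -EDEADLK instead of hanging. The one behaviour
borrowed from the kernel is that the runtime PM helpers bail out with
-EACCES while disable_depth is non-zero.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the runtime PM state kept in struct device. */
struct toy_dev {
	int disable_depth;   /* > 0: runtime PM helpers are disabled */
	int in_rpm_callback; /* 1 while a runtime resume callback runs */
	int usage_count;     /* outstanding runtime PM references */
};

/*
 * Toy "get" helper. It bails out with -EACCES while the helpers are
 * disabled. If called from inside a resume callback with the helpers
 * enabled, the real helper would wait for a resume that can never
 * finish; the model reports that case as -EDEADLK.
 */
static int toy_pm_get_sync(struct toy_dev *d)
{
	d->usage_count++;
	if (d->disable_depth > 0)
		return -EACCES;
	if (d->in_rpm_callback)
		return -EDEADLK;
	return 0;
}

static void toy_pm_put(struct toy_dev *d)
{
	d->usage_count--;
}

/* Toy connector probe: take a PM reference, then drop it again. */
static int toy_probe_connector(struct toy_dev *d)
{
	int ret = toy_pm_get_sync(d);

	toy_pm_put(d);
	return ret;
}

/* Toy resume path; guard != 0 applies the bracket used in the patch. */
static int toy_resume(struct toy_dev *d, int guard)
{
	int ret;

	d->in_rpm_callback = 1;
	if (guard)
		d->disable_depth++;
	ret = toy_probe_connector(d);
	if (guard)
		d->disable_depth--;
	d->in_rpm_callback = 0;
	return ret;
}
```

With a zero-initialised struct toy_dev, toy_resume(&d, 0) reports the
modelled deadlock, while toy_resume(&d, 1) returns -EACCES straight
away and leaves disable_depth balanced at zero. How a probe function
should treat that -EACCES return is outside the scope of this sketch.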
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 6e92008..46c1fee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1841,7 +1841,19 @@ int amdgpu_resume_kms(struct drm_device *dev, bool resume, bool fbcon)
 	}
 
 	drm_kms_helper_poll_enable(dev);
+
+	/*
+	 * Most of the connector probing functions try to acquire runtime pm
+	 * refs to ensure that the GPU is powered on when connector polling is
+	 * performed. Since we're calling this from a runtime PM callback,
+	 * trying to acquire rpm refs will cause us to deadlock.
+	 *
+	 * Since we're guaranteed to be holding the rpm lock, it's safe to
+	 * temporarily disable the rpm helpers so this doesn't deadlock us.
+	 */
+	dev->dev->power.disable_depth++;
 	drm_helper_hpd_irq_event(dev);
+	dev->dev->power.disable_depth--;
 
 	if (fbcon) {
 		amdgpu_fbdev_set_suspend(adev, 0);