37cc0db
From patchwork Fri Jul  8 15:37:35 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: drm/amdgpu: Disable RPM helpers while reprobing connectors on resume
From: cpaul@redhat.com
X-Patchwork-Id: 97837
Message-Id: <1467992256-23832-1-git-send-email-cpaul@redhat.com>
To: amd-gfx@lists.freedesktop.org
Cc: Tom St Denis <tom.stdenis@amd.com>, Jammy Zhou <Jammy.Zhou@amd.com>,
 open list <linux-kernel@vger.kernel.org>, stable@vger.kernel.org,
 "open list:RADEON and AMDGPU DRM DRIVERS"
 <dri-devel@lists.freedesktop.org>, 
 Alex Deucher <alexander.deucher@amd.com>, Lyude <cpaul@redhat.com>,
 Flora Cui <Flora.Cui@amd.com>,
 =?UTF-8?q?Christian=20K=C3=B6nig?= <christian.koenig@amd.com>,
 Monk Liu <Monk.Liu@amd.com>
Date: Fri,  8 Jul 2016 11:37:35 -0400

Just about all of amdgpu's connector probing functions try to acquire
runtime PM refs. If we try to do this in the context of
amdgpu_resume_kms by calling drm_helper_hpd_irq_event(), we end up
deadlocking the system.

Since we're guaranteed to be holding the spinlock for RPM in
amdgpu_resume_kms, and we already know the GPU is in working order, we
need to prevent the RPM helpers from trying to run during the initial
connector reprobe on resume.

There are a couple of solutions I've explored for fixing this, but this
one by far seems to be the simplest and most reliable (plus I'm pretty
sure that's what disable_depth is there for anyway).

Reproduction recipe:
  - Get any laptop with dual GPUs using PRIME
  - Make sure runtime PM is enabled for amdgpu
  - Boot the machine
  - If the machine managed to boot without hanging, switch out of X to
    another VT. This should definitely cause X to hang indefinitely.

Cc: stable@vger.kernel.org
Signed-off-by: Lyude <cpaul@redhat.com>
---
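For anyone who wants to see the idea in isolation, here is a minimal
userspace sketch of the bump/probe/drop pattern used in this patch. It
is not kernel code: struct mock_dev_pm and every mock_* function are
hypothetical stand-ins, and the -EACCES return for a disabled device is
an assumption of the model, not a statement about what the real runtime
PM helpers do.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the runtime PM state of a device. */
struct mock_dev_pm {
	unsigned int disable_depth;	/* > 0: runtime PM helpers are disabled */
	int usage_count;		/* outstanding runtime PM references */
	int resume_calls;		/* times the (possibly blocking) resume path ran */
};

/*
 * Hypothetical "get" helper: always takes a reference, but refuses to
 * enter the resume path while disable_depth is non-zero. The -EACCES
 * value is an assumption made for this sketch.
 */
static int mock_rpm_get_sync(struct mock_dev_pm *pm)
{
	pm->usage_count++;
	if (pm->disable_depth > 0)
		return -EACCES;
	pm->resume_calls++;	/* a real resume path could block here */
	return 0;
}

/* Hypothetical "put" helper: drops the reference taken by the get. */
static void mock_rpm_put(struct mock_dev_pm *pm)
{
	assert(pm->usage_count > 0);
	pm->usage_count--;
}

/* Stand-in for a connector probe that brackets its work with get/put. */
static int mock_probe_connector(struct mock_dev_pm *pm)
{
	int ret = mock_rpm_get_sync(pm);

	mock_rpm_put(pm);
	return ret;
}

/* Same shape as the hunk below: bump disable_depth, probe, drop it. */
static int mock_reprobe_on_resume(struct mock_dev_pm *pm)
{
	int ret;

	pm->disable_depth++;
	ret = mock_probe_connector(pm);
	pm->disable_depth--;
	return ret;
}
```

In this model a plain mock_probe_connector() call enters the resume
path, while the same probe made through mock_reprobe_on_resume() never
does, and both the reference count and disable_depth end up balanced.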
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 6e92008..46c1fee 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -1841,7 +1841,19 @@ int amdgpu_resume_kms(struct drm_device *dev, bool resume, bool fbcon)
 	}
 
 	drm_kms_helper_poll_enable(dev);
+
+	/*
+	 * Most of the connector probing functions try to acquire runtime pm
+	 * refs to ensure that the GPU is powered on when connector polling is
+	 * performed. Since we're calling this from a runtime PM callback,
+	 * trying to acquire rpm refs will cause us to deadlock.
+	 *
+	 * Since we're guaranteed to be holding the rpm lock, it's safe to
+	 * temporarily disable the rpm helpers so this doesn't deadlock us.
+	 */
+	dev->dev->power.disable_depth++;
 	drm_helper_hpd_irq_event(dev);
+	dev->dev->power.disable_depth--;
 
 	if (fbcon) {
 		amdgpu_fbdev_set_suspend(adev, 0);