Delivered-To: jwboyer@gmail.com
Received: by 10.229.175.203 with SMTP id bb11csp66243qcb;
        Fri, 8 Jun 2012 15:08:27 -0700 (PDT)
Received: by 10.68.222.133 with SMTP id qm5mr23412736pbc.113.1339193307132;
        Fri, 08 Jun 2012 15:08:27 -0700 (PDT)
Return-Path: <stable-owner@vger.kernel.org>
Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67])
        by mx.google.com with ESMTP id ku9si12482578pbc.355.2012.06.08.15.08.24;
        Fri, 08 Jun 2012 15:08:25 -0700 (PDT)
Received-SPF: pass (google.com: best guess record for domain of stable-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67;
Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of stable-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mail=stable-owner@vger.kernel.org
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S964992Ab2FHWIW (ORCPT <rfc822;bigsmallbd@gmail.com> + 21 others);
	Fri, 8 Jun 2012 18:08:22 -0400
Received: from mail-bk0-f74.google.com ([209.85.214.74]:41783 "EHLO
	mail-bk0-f74.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S964922Ab2FHWIV (ORCPT
	<rfc822;stable@vger.kernel.org>); Fri, 8 Jun 2012 18:08:21 -0400
Received: by bkty5 with SMTP id y5so128736bkt.1
        for <stable@vger.kernel.org>; Fri, 08 Jun 2012 15:08:20 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20120113;
        h=subject:to:cc:from:date:message-id:x-gm-message-state;
        bh=RSdNZSZcXg/enKaYIM+JR4+Bd890ieO+blY9bsk9giI=;
        b=NwTZEmRSdqDAiTV/EW91GXpM/yrRd7CNzfPif0JcF0iFgxGAo4lB7W1I05vmrnPcCQ
         Va+P6xXLWle2rAVQLsPooKdtb3u2wnNRDEGvBPZl2alje+qzhKGlQcVgnI5+KCM6GaS+
         YWoE+2gv5UFmF6JlelThyecGTyZ0D93K5aVYewSxg0H7KZ6BgvMnB/qJKFdScatv1uDH
         g39MFwJzmD+DmNMn149jeUWYOLLTeMZJkymtJCLgxS8eJzQxXA0nes2Wz/pXCBdxXF2z
         mft6LyzKtoEUDeTtalgm9zxkT4XJ+6bsAMEXBFgkcyNq0Ic8P79AP0ynlET2L/Ql3ARP
         C5Sg==
Received: by 10.14.101.2 with SMTP id a2mr2823176eeg.6.1339193299969;
        Fri, 08 Jun 2012 15:08:19 -0700 (PDT)
Received: from hpza10.eem.corp.google.com ([74.125.121.33])
        by gmr-mx.google.com with ESMTPS id d52si7345113eei.1.2012.06.08.15.08.19
        (version=TLSv1/SSLv3 cipher=AES128-SHA);
        Fri, 08 Jun 2012 15:08:19 -0700 (PDT)
Received: from akpm.mtv.corp.google.com (akpm.mtv.corp.google.com [172.18.96.75])
	by hpza10.eem.corp.google.com (Postfix) with ESMTP id 9D09620004E;
	Fri,  8 Jun 2012 15:08:19 -0700 (PDT)
Received: from localhost.localdomain (localhost [127.0.0.1])
	by akpm.mtv.corp.google.com (Postfix) with ESMTP id D5FACA0329;
	Fri,  8 Jun 2012 15:08:18 -0700 (PDT)
Subject: + thp-avoid-atomic64_read-in-pmd_read_atomic-for-32bit-pae.patch added to -mm tree
To:	mm-commits@vger.kernel.org
Cc:	aarcange@redhat.com, hughd@google.com, jbeulich@suse.com,
	jrnieder@gmail.com, kosaki.motohiro@gmail.com, lwoodman@redhat.com,
	mgorman@suse.de, pmatouse@redhat.com, riel@redhat.com,
	stable@vger.kernel.org, uobergfe@redhat.com
From:	akpm@linux-foundation.org
Date:	Fri, 08 Jun 2012 15:08:18 -0700
Message-Id: <20120608220818.D5FACA0329@akpm.mtv.corp.google.com>
X-Gm-Message-State: ALoCoQnqC0C+2OVVfC5Yi43jUu5vH03b/RBncPoI4SpE4HFSgaRrM+gM2J8rR6MMoba3nM/OmDAU
Sender:	stable-owner@vger.kernel.org
Precedence: bulk
List-ID: <stable.vger.kernel.org>
X-Mailing-List:	stable@vger.kernel.org


The patch titled
     Subject: thp: avoid atomic64_read in pmd_read_atomic for 32bit PAE
has been added to the -mm tree.  Its filename is
     thp-avoid-atomic64_read-in-pmd_read_atomic-for-32bit-pae.patch

Before you just go and hit "reply", please:
   a) Consider who else should be cc'ed
   b) Prefer to cc a suitable mailing list as well
   c) Ideally: find the original patch on the mailing list and do a
      reply-to-all to that, adding suitable additional cc's

*** Remember to use Documentation/SubmitChecklist when testing your code ***

The -mm tree is included in linux-next and is updated
there every 3-4 working days.

------------------------------------------------------
From: Andrea Arcangeli <aarcange@redhat.com>
Subject: thp: avoid atomic64_read in pmd_read_atomic for 32bit PAE

In the x86 32bit PAE CONFIG_TRANSPARENT_HUGEPAGE=y case while holding the
mmap_sem for reading, cmpxchg8b cannot be used to read pmd contents under
Xen.

So instead of dealing only with "consistent" pmdvals in
pmd_none_or_trans_huge_or_clear_bad() (which would be conceptually
simpler) we let pmd_none_or_trans_huge_or_clear_bad() deal with pmdvals
where the low 32bit and high 32bit could be inconsistent (to avoid having
to use cmpxchg8b).

The only guarantee we get from pmd_read_atomic is that if the low part of
the pmd was found null, the high part will be null too (so the pmd will be
considered unstable).  And if the low part of the pmd is found "stable"
later, then it means the whole pmd was read atomically (because after a
pmd is stable, neither MADV_DONTNEED nor page faults can alter it anymore,
and we read the high part after the low part).

In the 32bit PAE x86 case, it is enough to read the low part of the pmdval
atomically to declare the pmd as "stable", and that's true both with and
without THP.  Furthermore, in the THP case we also have a barrier() that
will prevent any inconsistent pmdvals from being cached by a later re-read
of the *pmd.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Jonathan Nieder <jrnieder@gmail.com>
Cc: Ulrich Obergfell <uobergfe@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 arch/x86/include/asm/pgtable-3level.h |   30 +++++++++++++-----------
 include/asm-generic/pgtable.h         |   10 ++++++++
 2 files changed, 27 insertions(+), 13 deletions(-)

diff -puN arch/x86/include/asm/pgtable-3level.h~thp-avoid-atomic64_read-in-pmd_read_atomic-for-32bit-pae arch/x86/include/asm/pgtable-3level.h
--- a/arch/x86/include/asm/pgtable-3level.h~thp-avoid-atomic64_read-in-pmd_read_atomic-for-32bit-pae
+++ a/arch/x86/include/asm/pgtable-3level.h
@@ -47,16 +47,26 @@ static inline void native_set_pte(pte_t 
  * they can run pmd_offset_map_lock or pmd_trans_huge or other pmd
  * operations.
  *
- * Without THP if the mmap_sem is hold for reading, the
- * pmd can only transition from null to not null while pmd_read_atomic runs.
- * So there's no need of literally reading it atomically.
+ * Without THP if the mmap_sem is hold for reading, the pmd can only
+ * transition from null to not null while pmd_read_atomic runs. So
+ * we can always return atomic pmd values with this function.
  *
  * With THP if the mmap_sem is hold for reading, the pmd can become
- * THP or null or point to a pte (and in turn become "stable") at any
- * time under pmd_read_atomic, so it's mandatory to read it atomically
- * with cmpxchg8b.
+ * trans_huge or none or point to a pte (and in turn become "stable")
+ * at any time under pmd_read_atomic. We could read it really
+ * atomically here with a atomic64_read for the THP enabled case (and
+ * it would be a whole lot simpler), but to avoid using cmpxchg8b we
+ * only return an atomic pmdval if the low part of the pmdval is later
+ * found stable (i.e. pointing to a pte). And we're returning a none
+ * pmdval if the low part of the pmd is none. In some cases the high
+ * and low part of the pmdval returned may not be consistent if THP is
+ * enabled (the low part may point to previously mapped hugepage,
+ * while the high part may point to a more recently mapped hugepage),
+ * but pmd_none_or_trans_huge_or_clear_bad() only needs the low part
+ * of the pmd to be read atomically to decide if the pmd is unstable
+ * or not, with the only exception of when the low part of the pmd is
+ * zero in which case we return a none pmd.
  */
-#ifndef CONFIG_TRANSPARENT_HUGEPAGE
 static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
 {
 	pmdval_t ret;
@@ -74,12 +84,6 @@ static inline pmd_t pmd_read_atomic(pmd_
 
 	return (pmd_t) { ret };
 }
-#else /* CONFIG_TRANSPARENT_HUGEPAGE */
-static inline pmd_t pmd_read_atomic(pmd_t *pmdp)
-{
-	return (pmd_t) { atomic64_read((atomic64_t *)pmdp) };
-}
-#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
 static inline void native_set_pte_atomic(pte_t *ptep, pte_t pte)
 {
diff -puN include/asm-generic/pgtable.h~thp-avoid-atomic64_read-in-pmd_read_atomic-for-32bit-pae include/asm-generic/pgtable.h
--- a/include/asm-generic/pgtable.h~thp-avoid-atomic64_read-in-pmd_read_atomic-for-32bit-pae
+++ a/include/asm-generic/pgtable.h
@@ -484,6 +484,16 @@ static inline int pmd_none_or_trans_huge
 	/*
 	 * The barrier will stabilize the pmdval in a register or on
 	 * the stack so that it will stop changing under the code.
+	 *
+	 * When CONFIG_TRANSPARENT_HUGEPAGE=y on x86 32bit PAE,
+	 * pmd_read_atomic is allowed to return a not atomic pmdval
+	 * (for example pointing to an hugepage that has never been
+	 * mapped in the pmd). The below checks will only care about
+	 * the low part of the pmd with 32bit PAE x86 anyway, with the
+	 * exception of pmd_none(). So the important thing is that if
+	 * the low part of the pmd is found null, the high part will
+	 * be also null or the pmd_none() check below would be
+	 * confused.
 	 */
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	barrier();
_
Subject: thp: avoid atomic64_read in pmd_read_atomic for 32bit PAE

Patches currently in -mm which might be from aarcange@redhat.com are

origin.patch
linux-next.patch
mm-fix-slab-page-_count-corruption-when-using-slub.patch
thp-avoid-atomic64_read-in-pmd_read_atomic-for-32bit-pae.patch
hugetlb-rename-max_hstate-to-hugetlb_max_hstate.patch
hugetlbfs-dont-use-err_ptr-with-vm_fault-values.patch
hugetlbfs-add-an-inline-helper-for-finding-hstate-index.patch
hugetlbfs-add-an-inline-helper-for-finding-hstate-index-fix.patch
hugetlb-use-mmu_gather-instead-of-a-temporary-linked-list-for-accumulating-pages.patch
hugetlb-use-mmu_gather-instead-of-a-temporary-linked-list-for-accumulating-pages-fix.patch
hugetlb-use-mmu_gather-instead-of-a-temporary-linked-list-for-accumulating-pages-fix-fix.patch
hugetlb-avoid-taking-i_mmap_mutex-in-unmap_single_vma-for-hugetlb.patch
hugetlb-simplify-migrate_huge_page.patch
hugetlb-simplify-migrate_huge_page-fix.patch
memcg-add-hugetlb-extension.patch
memcg-add-hugetlb-extension-fix.patch
memcg-add-hugetlb-extension-fix-fix.patch
hugetlb-add-charge-uncharge-calls-for-hugetlb-alloc-free.patch
memcg-track-resource-index-in-cftype-private.patch
hugetlbfs-add-memcg-control-files-for-hugetlbfs.patch
hugetlbfs-add-memcg-control-files-for-hugetlbfs-use-scnprintf-instead-of-sprintf.patch
hugetlbfs-add-memcg-control-files-for-hugetlbfs-use-scnprintf-instead-of-sprintf-fix.patch
hugetlbfs-add-a-list-for-tracking-in-use-hugetlb-pages.patch
memcg-move-hugetlb-resource-count-to-parent-cgroup-on-memcg-removal.patch
memcg-move-hugetlb-resource-count-to-parent-cgroup-on-memcg-removal-fix.patch
memcg-move-hugetlb-resource-count-to-parent-cgroup-on-memcg-removal-fix-fix.patch
hugetlb-migrate-memcg-info-from-oldpage-to-new-page-during-migration.patch
memcg-add-memory-controller-documentation-for-hugetlb-management.patch

--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html