From: Jan Beulich <jbeulich@suse.com>
Subject: x86/mm: add explicit preemption checks to L3 (un)validation

When recursive page tables are used at the L3 level, unvalidation of a
single L4 table may incur unvalidation of two levels of L3 tables, i.e.
a maximum iteration count of 512^3 for unvalidating an L4 table. The
preemption check in free_l2_table() as well as the one in
_put_page_type() may never be reached, so explicit checking is needed in
free_l3_table().
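
To make the restart pattern concrete, here is a minimal standalone sketch
of a countdown loop that gives up with -EINTR once a stubbed preemption
check fires, plus a caller that re-enters until entry 0 has been handled.
Only the loop shape is meant to mirror free_l3_table() after this patch;
the names free_table_sketch(), run_sketch(), fake_preempt_check() and the
per-call budget are hypothetical scaffolding, not Xen code.

```c
#include <assert.h>
#include <errno.h>

#define ENTRIES 512              /* entries per x86-64 page table */

/* Hypothetical: number of entries that may be handled per invocation. */
static unsigned int budget;

/* Stand-in for hypercall_preempt_check(): fires once the budget is spent. */
static int fake_preempt_check(void)
{
    return budget == 0;
}

/*
 * Handle entries from *idx down to 0. Returns 0 once entry 0 is done and
 * -EINTR when preempted, in which case *idx is the next entry to handle.
 */
static int free_table_sketch(unsigned int *idx, unsigned long *work_done)
{
    unsigned int i = *idx;

    assert(i < ENTRIES);

    for ( ; ; )
    {
        ++*work_done;            /* stands in for put_page_from_l3e() */
        if ( budget )
            --budget;

        if ( !i-- )
            return 0;            /* entry 0 handled, nothing left to do */

        /* Checked only when more work remains, as in the patch. */
        if ( fake_preempt_check() )
        {
            *idx = i;
            return -EINTR;
        }
    }
}

/* Re-enter until complete; returns how many invocations were needed. */
static unsigned int run_sketch(unsigned int per_call, unsigned long *work_done)
{
    unsigned int idx = ENTRIES - 1, calls = 0;
    int rc;

    *work_done = 0;
    do {
        budget = per_call;
        rc = free_table_sketch(&idx, work_done);
        ++calls;
    } while ( rc == -EINTR );

    return calls;
}
```

With a budget of 100 entries per invocation, run_sketch(100, &done)
handles all 512 entries across six invocations. Testing "!i--" before the
preemption check means a table whose last entry was just handled is never
reported as interrupted.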

When recursive page tables are used at the L4 level, the iteration count
at L4 alone is capped at 512^2. As soon as a present L3 entry is hit
which itself needs unvalidation (and hence requires another nested loop
with 512 iterations), the preemption checks added here kick in, so no
further preemption checking is needed at L4 (until we decide to permit
5-level paging for PV guests).

The validation side additions are done just for symmetry.
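
For the validation direction, the following standalone sketch models a
forward loop that records its progress and returns -ERESTART when
preempted. The "i > nr_validated" guard skips the check on the first
iteration of each invocation, which appears to be what guarantees forward
progress even if a preemption is permanently pending. struct table_sketch,
alloc_table_sketch() and always_preempt() are hypothetical names, and the
final progress update is an assumption of the sketch rather than something
shown in the alloc_l3_table() hunk.

```c
#include <assert.h>
#include <errno.h>

#ifndef ERESTART
#define ERESTART 85              /* placeholder where <errno.h> lacks it */
#endif

#define ENTRIES 512              /* entries per x86-64 page table */

/* Hypothetical stand-in for the part of struct page_info used here. */
struct table_sketch {
    unsigned int nr_validated;   /* progress marker, like nr_validated_ptes */
};

/* Worst case for the caller: a preemption request is always pending. */
static int always_preempt(void)
{
    return 1;
}

static int alloc_table_sketch(struct table_sketch *t, int (*preempt)(void))
{
    unsigned int i;

    assert(t->nr_validated <= ENTRIES);

    for ( i = t->nr_validated; i < ENTRIES; i++ )
    {
        /* Never preempt before the first entry of this invocation. */
        if ( i > t->nr_validated && preempt() )
        {
            t->nr_validated = i; /* resume from entry i next time */
            return -ERESTART;
        }

        /* Validation of entry i would happen here. */
    }

    t->nr_validated = ENTRIES;   /* assumption: record completion */
    return 0;
}

/* Count the invocations needed when every check reports preemption. */
static unsigned int count_calls_sketch(void)
{
    struct table_sketch t = { 0 };
    unsigned int calls = 1;

    while ( alloc_table_sketch(&t, always_preempt) == -ERESTART )
        ++calls;

    return calls;
}
```

Even with always_preempt(), each invocation validates exactly one entry,
so count_calls_sketch() terminates after 512 invocations instead of
looping forever.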

This is part of XSA-290.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -1612,6 +1612,13 @@ static int alloc_l3_table(struct page_in
     for ( i = page->nr_validated_ptes; i < L3_PAGETABLE_ENTRIES;
           i++, partial = 0 )
     {
+        if ( i > page->nr_validated_ptes && hypercall_preempt_check() )
+        {
+            page->nr_validated_ptes = i;
+            rc = -ERESTART;
+            break;
+        }
+
         if ( is_pv_32bit_domain(d) && (i == 3) )
         {
             if ( !(l3e_get_flags(pl3e[i]) & _PAGE_PRESENT) ||
@@ -1913,15 +1920,25 @@ static int free_l3_table(struct page_inf
 
     pl3e = map_domain_page(_mfn(pfn));
 
-    do {
+    for ( ; ; )
+    {
         rc = put_page_from_l3e(pl3e[i], pfn, partial, 0);
         if ( rc < 0 )
             break;
+
         partial = 0;
-        if ( rc > 0 )
-            continue;
-        pl3e[i] = unadjust_guest_l3e(pl3e[i], d);
-    } while ( i-- );
+        if ( rc == 0 )
+            pl3e[i] = unadjust_guest_l3e(pl3e[i], d);
+
+        if ( !i-- )
+            break;
+
+        if ( hypercall_preempt_check() )
+        {
+            rc = -EINTR;
+            break;
+        }
+    }
 
     unmap_domain_page(pl3e);