Danilo C. L. de Paula cb4ea43
From 517a99c5fba163bf684978fe3d9476b619481391 Mon Sep 17 00:00:00 2001
Danilo C. L. de Paula cb4ea43
From: Juan Quintela <quintela@redhat.com>
Danilo C. L. de Paula cb4ea43
Date: Tue, 3 Mar 2020 14:51:42 +0000
Danilo C. L. de Paula cb4ea43
Subject: [PATCH 10/18] migration/multifd: fix nullptr access in
Danilo C. L. de Paula cb4ea43
 multifd_send_terminate_threads
Danilo C. L. de Paula cb4ea43
Danilo C. L. de Paula cb4ea43
RH-Author: Juan Quintela <quintela@redhat.com>
Danilo C. L. de Paula cb4ea43
Message-id: <20200303145143.149290-10-quintela@redhat.com>
Danilo C. L. de Paula cb4ea43
Patchwork-id: 94117
Danilo C. L. de Paula cb4ea43
O-Subject: [RHEL-AV-8.2.0 qemu-kvm PATCH v2 09/10] migration/multifd: fix nullptr access in multifd_send_terminate_threads
Danilo C. L. de Paula cb4ea43
Bugzilla: 1738451
Danilo C. L. de Paula cb4ea43
RH-Acked-by: Laurent Vivier <lvivier@redhat.com>
Danilo C. L. de Paula cb4ea43
RH-Acked-by: Peter Xu <peterx@redhat.com>
Danilo C. L. de Paula cb4ea43
RH-Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Danilo C. L. de Paula cb4ea43
Danilo C. L. de Paula cb4ea43
From: Zhimin Feng <fengzhimin1@huawei.com>
Danilo C. L. de Paula cb4ea43
Danilo C. L. de Paula cb4ea43
If the multifd_send_threads is not created when migration is failed,
Danilo C. L. de Paula cb4ea43
multifd_save_cleanup would be called twice. In this senario, the
Danilo C. L. de Paula cb4ea43
multifd_send_state is accessed after it has been released, the result
Danilo C. L. de Paula cb4ea43
is that the source VM is crashing down.
Danilo C. L. de Paula cb4ea43
Danilo C. L. de Paula cb4ea43
Here is the coredump stack:
Danilo C. L. de Paula cb4ea43
    Program received signal SIGSEGV, Segmentation fault.
Danilo C. L. de Paula cb4ea43
    0x00005629333a78ef in multifd_send_terminate_threads (err=err@entry=0x0) at migration/ram.c:1012
Danilo C. L. de Paula cb4ea43
    1012            MultiFDSendParams *p = &multifd_send_state->params[i];
Danilo C. L. de Paula cb4ea43
    #0  0x00005629333a78ef in multifd_send_terminate_threads (err=err@entry=0x0) at migration/ram.c:1012
Danilo C. L. de Paula cb4ea43
    #1  0x00005629333ab8a9 in multifd_save_cleanup () at migration/ram.c:1028
Danilo C. L. de Paula cb4ea43
    #2  0x00005629333abaea in multifd_new_send_channel_async (task=0x562935450e70, opaque=<optimized out>) at migration/ram.c:1202
Danilo C. L. de Paula cb4ea43
    #3  0x000056293373a562 in qio_task_complete (task=task@entry=0x562935450e70) at io/task.c:196
Danilo C. L. de Paula cb4ea43
    #4  0x000056293373a6e0 in qio_task_thread_result (opaque=0x562935450e70) at io/task.c:111
Danilo C. L. de Paula cb4ea43
    #5  0x00007f475d4d75a7 in g_idle_dispatch () from /usr/lib64/libglib-2.0.so.0
Danilo C. L. de Paula cb4ea43
    #6  0x00007f475d4da9a9 in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0
Danilo C. L. de Paula cb4ea43
    #7  0x0000562933785b33 in glib_pollfds_poll () at util/main-loop.c:219
Danilo C. L. de Paula cb4ea43
    #8  os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:242
Danilo C. L. de Paula cb4ea43
    #9  main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:518
Danilo C. L. de Paula cb4ea43
    #10 0x00005629334c5acf in main_loop () at vl.c:1810
Danilo C. L. de Paula cb4ea43
    #11 0x000056293334d7bb in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4471
Danilo C. L. de Paula cb4ea43
Danilo C. L. de Paula cb4ea43
If the multifd_send_threads is not created when migration is failed.
Danilo C. L. de Paula cb4ea43
In this senario, we don't call multifd_save_cleanup in multifd_new_send_channel_async.
Danilo C. L. de Paula cb4ea43
Danilo C. L. de Paula cb4ea43
Signed-off-by: Zhimin Feng <fengzhimin1@huawei.com>
Danilo C. L. de Paula cb4ea43
Reviewed-by: Juan Quintela <quintela@redhat.com>
Danilo C. L. de Paula cb4ea43
Signed-off-by: Juan Quintela <quintela@redhat.com>
Danilo C. L. de Paula cb4ea43
(cherry picked from commit 9c4d333c092e9c26d38f740ff3616deb42f21681)
Danilo C. L. de Paula cb4ea43
Signed-off-by: Danilo C. L. de Paula <ddepaula@redhat.com>
Danilo C. L. de Paula cb4ea43
---
Danilo C. L. de Paula cb4ea43
 migration/ram.c | 10 +++++++++-
Danilo C. L. de Paula cb4ea43
 1 file changed, 9 insertions(+), 1 deletion(-)
Danilo C. L. de Paula cb4ea43
Danilo C. L. de Paula cb4ea43
diff --git a/migration/ram.c b/migration/ram.c
Danilo C. L. de Paula cb4ea43
index 902c56c..3891eff 100644
Danilo C. L. de Paula cb4ea43
--- a/migration/ram.c
Danilo C. L. de Paula cb4ea43
+++ b/migration/ram.c
Danilo C. L. de Paula cb4ea43
@@ -1229,7 +1229,15 @@ static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
Danilo C. L. de Paula cb4ea43
     trace_multifd_new_send_channel_async(p->id);
Danilo C. L. de Paula cb4ea43
     if (qio_task_propagate_error(task, &local_err)) {
Danilo C. L. de Paula cb4ea43
         migrate_set_error(migrate_get_current(), local_err);
Danilo C. L. de Paula cb4ea43
-        multifd_save_cleanup();
Danilo C. L. de Paula cb4ea43
+        /* Error happen, we need to tell who pay attention to me */
Danilo C. L. de Paula cb4ea43
+        qemu_sem_post(&multifd_send_state->channels_ready);
Danilo C. L. de Paula cb4ea43
+        qemu_sem_post(&p->sem_sync);
Danilo C. L. de Paula cb4ea43
+        /*
Danilo C. L. de Paula cb4ea43
+         * Although multifd_send_thread is not created, but main migration
Danilo C. L. de Paula cb4ea43
+         * thread neet to judge whether it is running, so we need to mark
Danilo C. L. de Paula cb4ea43
+         * its status.
Danilo C. L. de Paula cb4ea43
+         */
Danilo C. L. de Paula cb4ea43
+        p->quit = true;
Danilo C. L. de Paula cb4ea43
     } else {
Danilo C. L. de Paula cb4ea43
         p->c = QIO_CHANNEL(sioc);
Danilo C. L. de Paula cb4ea43
         qio_channel_set_delay(p->c, false);
Danilo C. L. de Paula cb4ea43
-- 
Danilo C. L. de Paula cb4ea43
1.8.3.1
Danilo C. L. de Paula cb4ea43