Kyle McMartin 003654d
From 2876b1571839c25ce5e7485ead8417506d720c73 Mon Sep 17 00:00:00 2001
From: Oleg Nesterov <oleg@redhat.com>
Date: Fri, 5 Nov 2010 16:53:42 +0100
Subject: posix-cpu-timers: workaround to suppress the problems with mt exec

posix-cpu-timers.c correctly assumes that the dying process does
posix_cpu_timers_exit_group() and removes all !CPUCLOCK_PERTHREAD
timers from signal->cpu_timers list.

But, it also assumes that timer->it.cpu.task is always the group
leader, and thus the dead ->task means the dead thread group.

This is obviously not true after de_thread() changes the leader.
After that almost every posix_cpu_timer_ method has problems.

It is not simple to fix this bug correctly. First of all, I think
that timer->it.cpu should use struct pid instead of task_struct.
Also, the locking should be reworked completely. In particular,
tasklist_lock should not be used at all. This all needs a lot of
nontrivial and hard-to-test changes.

Change __exit_signal() to do posix_cpu_timers_exit_group() when
the old leader dies during exec. This is not the fix, just the
temporary hack to hide the problem for 2.6.37 and stable. IOW,
this is obviously wrong but this is what we currently have anyway:
cpu timers do not work after mt exec.

In theory this change adds another race. The exiting leader can
detach the timers which were attached to the new leader. However,
the window between de_thread() and release_task() is small, we
can pretend that sys_timer_create() was called before de_thread().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 kernel/exit.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index ac90425..85daf1d 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -95,6 +95,14 @@ static void __exit_signal(struct task_struct *tsk)
 		sig->tty = NULL;
 	} else {
 		/*
+		 * This can only happen if the caller is de_thread().
+		 * FIXME: this is the temporary hack, we should teach
+		 * posix-cpu-timers to handle this case correctly.
+		 */
+		if (unlikely(has_group_leader_pid(tsk)))
+			posix_cpu_timers_exit_group(tsk);
+
+		/*
 		 * If there is any task waiting for the group exit
 		 * then notify it:
 		 */
-- 
1.7.3.2