utrace core

This adds the utrace facility, a new modular interface in the kernel for
implementing user thread tracing and debugging.  This fits on top of the
tracehook_* layer, so the new code is well-isolated.

The new interface is in <linux/utrace.h> and the DocBook utrace book
describes it.  It allows for multiple separate tracing engines to work in
parallel without interfering with each other.  Higher-level tracing
facilities can be implemented as loadable kernel modules using this layer.

The new facility is made optional under CONFIG_UTRACE.
When this is not enabled, no new code is added.
It can only be enabled on machines that have all the
prerequisites and select CONFIG_HAVE_ARCH_TRACEHOOK.

In this initial version, utrace and ptrace do not play together at all.
If ptrace is attached to a thread, the attach calls in the utrace kernel
API return -EBUSY.  If utrace is attached to a thread, a PTRACE_ATTACH
or PTRACE_TRACEME request will return EBUSY to userland.  The old ptrace
code is otherwise unchanged, and nothing using ptrace should be affected
by this patch as long as utrace is not used at the same time.  In the
future we can clean up the ptrace implementation and rework it to use
the utrace API.

Signed-off-by: Roland McGrath <roland@redhat.com>
---
 Documentation/DocBook/Makefile    |    2 +-
 Documentation/DocBook/utrace.tmpl |  589 +++++++++
 fs/proc/array.c                   |    3 +
 include/linux/sched.h             |    5 +
 include/linux/tracehook.h         |   87 ++-
 include/linux/utrace.h            |  692 +++++++++++
 init/Kconfig                      |    9 +
 kernel/Makefile                   |    1 +
 kernel/fork.c                     |    3 +
 kernel/ptrace.c                   |   14 +
 kernel/utrace.c                   | 2434 +++++++++++++++++++++++++++++++++++++
 11 files changed, 3837 insertions(+), 2 deletions(-)

diff --git a/Documentation/DocBook/Makefile b/Documentation/DocBook/Makefile
index 34929f2..884c36b 100644
--- a/Documentation/DocBook/Makefile
+++ b/Documentation/DocBook/Makefile
@@ -14,7 +14,7 @@ DOCBOOKS := z8530book.xml mcabook.xml de
 	    genericirq.xml s390-drivers.xml uio-howto.xml scsi.xml \
 	    mac80211.xml debugobjects.xml sh.xml regulator.xml \
 	    alsa-driver-api.xml writing-an-alsa-driver.xml \
-	    tracepoint.xml media.xml drm.xml
+	    tracepoint.xml utrace.xml media.xml drm.xml
 
 ###
 # The build process is as follows (targets):
diff --git a/Documentation/DocBook/utrace.tmpl b/Documentation/DocBook/utrace.tmpl
new file mode 100644
index ...0c40add 100644
--- /dev/null
+++ b/Documentation/DocBook/utrace.tmpl
@@ -0,0 +1,589 @@
+<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
+"http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" []>
+
+<book id="utrace">
+  <bookinfo>
+    <title>The utrace User Debugging Infrastructure</title>
+  </bookinfo>
+
+  <toc></toc>
+
+  <chapter id="concepts"><title>utrace concepts</title>
+
+  <sect1 id="intro"><title>Introduction</title>
+
+  <para>
+    <application>utrace</application> is infrastructure code for tracing
+    and controlling user threads.  This is the foundation for writing
+    tracing engines, which can be loadable kernel modules.
+  </para>
+
+  <para>
+    The basic actors in <application>utrace</application> are the thread
+    and the tracing engine.  A tracing engine is some body of code that
+    calls into the <filename>&lt;linux/utrace.h&gt;</filename>
+    interfaces, represented by a <structname>struct
+    utrace_engine_ops</structname>.  (Usually it's a kernel module,
+    though the legacy <function>ptrace</function> support is a tracing
+    engine that is not in a kernel module.)  The interface operates on
+    individual threads (<structname>struct task_struct</structname>).
+    If an engine wants to treat several threads as a group, that is up
+    to its higher-level code.
+  </para>
+
+  <para>
+    Tracing begins by attaching an engine to a thread, using
+    <function>utrace_attach_task</function> or
+    <function>utrace_attach_pid</function>.  If successful, it returns a
+    pointer that is the handle used in all other calls.
+  </para>
+
+  </sect1>
+
+  <sect1 id="callbacks"><title>Events and Callbacks</title>
+
+  <para>
+    An attached engine does nothing by default.  An engine makes something
+    happen by requesting callbacks via <function>utrace_set_events</function>
+    and poking the thread with <function>utrace_control</function>.
+    The synchronization issues related to these two calls
+    are discussed further below in <xref linkend="teardown"/>.
+  </para>
+
+  <para>
+    Events are specified using the macro
+    <constant>UTRACE_EVENT(<replaceable>type</replaceable>)</constant>.
+    Each event type is associated with a callback in <structname>struct
+    utrace_engine_ops</structname>.  A tracing engine can leave unused
+    callbacks <constant>NULL</constant>.  The only callbacks required
+    are those used by the event flags it sets.
+  </para>
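+
+  <para>
+    To make this concrete, here is a minimal sketch of attaching an
+    engine and requesting <constant>QUIESCE</constant> reports.  (This
+    is illustrative only: error handling is abbreviated and
+    <function>my_report_quiesce</function> is a placeholder name for an
+    engine's own callback.)
+  </para>
+
+  <programlisting>
+	static const struct utrace_engine_ops my_ops = {
+		.report_quiesce = my_report_quiesce,
+	};
+
+	struct utrace_engine *engine;
+	int ret;
+
+	engine = utrace_attach_task(task, UTRACE_ATTACH_CREATE,
+				    &amp;my_ops, NULL);
+	if (IS_ERR(engine))
+		return PTR_ERR(engine);	/* e.g. -EBUSY while ptrace is attached */
+
+	ret = utrace_set_events(task, engine, UTRACE_EVENT(QUIESCE));
+  </programlisting>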
+
+  <para>
+    Many engines can be attached to each thread.  When a thread has an
+    event, each engine gets a callback if it has set the event flag for
+    that event type.  For most events, engines are called in the order they
+    attached.  Engines that attach after the event has occurred do not get
+    callbacks for that event.  This includes any new engines just attached
+    by an existing engine's callback function.  Once the sequence of
+    callbacks for that one event has completed, such new engines are then
+    eligible in the next sequence that starts when there is another event.
+  </para>
+
+  <para>
+    Event reporting callbacks have details particular to the event type,
+    but are all called in similar environments and have the same
+    constraints.  Callbacks are made from safe points, where no locks
+    are held, no special resources are pinned (usually), and the
+    user-mode state of the thread is accessible.  So, callback code has
+    a pretty free hand.  But to be a good citizen, callback code should
+    never block for long periods.  It is fine to block in
+    <function>kmalloc</function> and the like, but never wait for i/o or
+    for user mode to do something.  If you need the thread to wait, use
+    <constant>UTRACE_STOP</constant> and return from the callback
+    quickly.  When your i/o finishes or whatever, you can use
+    <function>utrace_control</function> to resume the thread.
+  </para>
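+
+  <para>
+    As an illustrative sketch (the exact callback prototype is the one
+    declared in <structname>struct utrace_engine_ops</structname> in
+    <filename>&lt;linux/utrace.h&gt;</filename>), a callback that parks
+    the thread instead of blocking might look like:
+  </para>
+
+  <programlisting>
+	static u32 my_report_quiesce(u32 action, struct utrace_engine *engine,
+				     struct task_struct *task,
+				     unsigned long event)
+	{
+		/* Kick off our own slow work elsewhere, then park the thread. */
+		return UTRACE_STOP;	/* resume later via utrace_control */
+	}
+  </programlisting>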
+
+  <para>
+    The <constant>UTRACE_EVENT(SYSCALL_ENTRY)</constant> event is a special
+    case.  While other events happen in the kernel when it will return to
+    user mode soon, this event happens when entering the kernel before it
+    will proceed with the work requested from user mode.  Because of this
+    difference, the <function>report_syscall_entry</function> callback is
+    special in two ways.  For this event, engines are called in reverse of
+    the normal order (this includes the <function>report_quiesce</function>
+    call that precedes a <function>report_syscall_entry</function> call).
+    This preserves the semantics that the last engine to attach is called
+    "closest to user mode"--the engine that is first to see a thread's user
+    state when it enters the kernel is also the last to see that state when
+    the thread returns to user mode.  For the same reason, if these
+    callbacks use <constant>UTRACE_STOP</constant> (see the next section),
+    the thread stops immediately after callbacks rather than only when it's
+    ready to return to user mode; when allowed to resume, it will actually
+    attempt the system call indicated by the register values at that time.
+  </para>
+
+  </sect1>
+
+  <sect1 id="safely"><title>Stopping Safely</title>
+
+  <sect2 id="well-behaved"><title>Writing well-behaved callbacks</title>
+
+  <para>
+    Well-behaved callbacks are important to maintain two essential
+    properties of the interface.  The first of these is that unrelated
+    tracing engines should not interfere with each other.  If your engine's
+    event callback does not return quickly, then another engine won't get
+    the event notification in a timely manner.  The second important
+    property is that tracing should be as noninvasive as possible to the
+    normal operation of the system overall and of the traced thread in
+    particular.  That is, attached tracing engines should not perturb a
+    thread's behavior, except to the extent that changing its user-visible
+    state is explicitly what you want to do.  (Obviously some perturbation
+    is unavoidable, primarily timing changes, ranging from small delays due
+    to the overhead of tracing, to arbitrary pauses in user code execution
+    when a user stops a thread with a debugger for examination.)  Even when
+    you explicitly want the perturbation of making the traced thread block,
+    just blocking directly in your callback has more unwanted effects.  For
+    example, the <constant>CLONE</constant> event callbacks are called when
+    the new child thread has been created but not yet started running; the
+    child can never be scheduled until the <constant>CLONE</constant>
+    tracing callbacks return.  (This allows engines tracing the parent to
+    attach to the child.)  If a <constant>CLONE</constant> event callback
+    blocks the parent thread, it also prevents the child thread from
+    running (even to process a <constant>SIGKILL</constant>).  If what you
+    want is to make both the parent and child block, then use
+    <function>utrace_attach_task</function> on the child and then use
+    <constant>UTRACE_STOP</constant> on both threads.  A more crucial
+    problem with blocking in callbacks is that it can prevent
+    <constant>SIGKILL</constant> from working.  A thread that is blocking
+    due to <constant>UTRACE_STOP</constant> will still wake up and die
+    immediately when sent a <constant>SIGKILL</constant>, as all threads
+    should.  Relying on the <application>utrace</application>
+    infrastructure rather than on private synchronization calls in event
+    callbacks is an important way to help keep tracing robustly
+    noninvasive.
+  </para>
+
+  </sect2>
+
+  <sect2 id="UTRACE_STOP"><title>Using <constant>UTRACE_STOP</constant></title>
+
+  <para>
+    To control another thread and access its state, the thread must be
+    stopped with <constant>UTRACE_STOP</constant>.  This means that it is
+    stopped and won't start running again while we access it.  When a
+    thread is not already stopped, <function>utrace_control</function>
+    returns <constant>-EINPROGRESS</constant> and an engine must wait
+    for an event callback when the thread is ready to stop.  The thread
+    may be running on another CPU or may be blocked.  When it is ready
+    to be examined, it will make callbacks to engines that set the
+    <constant>UTRACE_EVENT(QUIESCE)</constant> event bit.  To wake up an
+    interruptible wait, use <constant>UTRACE_INTERRUPT</constant>.
+  </para>
+
+  <para>
+    As long as some engine has used <constant>UTRACE_STOP</constant> and
+    not called <function>utrace_control</function> to resume the thread,
+    then the thread will remain stopped.  <constant>SIGKILL</constant>
+    will wake it up, but it will not run user code.  When the stop is
+    cleared with <function>utrace_control</function> or a callback
+    return value, the thread starts running again.
+    (See also <xref linkend="teardown"/>.)
+  </para>
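+
+  <para>
+    A typical stop-and-examine sequence can be sketched as follows
+    (illustrative only; <function>examine_thread</function> is a
+    hypothetical helper standing in for whatever the engine wants to do
+    with the stopped thread):
+  </para>
+
+  <programlisting>
+	ret = utrace_control(task, engine, UTRACE_STOP);
+	if (ret == -EINPROGRESS) {
+		/*
+		 * Not stopped yet; our report_quiesce callback will run
+		 * when the thread reaches a safe point and stops.
+		 */
+	} else if (!ret) {
+		examine_thread(task);		/* hypothetical */
+		ret = utrace_control(task, engine, UTRACE_RESUME);
+	}
+  </programlisting>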
+
+  </sect2>
+
+  </sect1>
+
+  <sect1 id="teardown"><title>Tear-down Races</title>
+
+  <sect2 id="SIGKILL"><title>Primacy of <constant>SIGKILL</constant></title>
+  <para>
+    Ordinarily synchronization issues for tracing engines are kept fairly
+    straightforward by using <constant>UTRACE_STOP</constant>.  You ask a
+    thread to stop, and then once it makes the
+    <function>report_quiesce</function> callback it cannot do anything else
+    that would result in another callback, until you let it with a
+    <function>utrace_control</function> call.  This simple arrangement
+    avoids complex and error-prone code in each one of a tracing engine's
+    event callbacks to keep them serialized with the engine's other
+    operations done on that thread from another thread of control.
+    However, giving tracing engines complete power to keep a traced thread
+    stuck in place runs afoul of a more important kind of simplicity that
+    the kernel overall guarantees: nothing can prevent or delay
+    <constant>SIGKILL</constant> from making a thread die and release its
+    resources.  To preserve this important property,
+    <constant>SIGKILL</constant> as a special case can break
+    <constant>UTRACE_STOP</constant> like nothing else normally can.  This
+    includes both explicit <constant>SIGKILL</constant> signals and the
+    implicit <constant>SIGKILL</constant> sent to each other thread in the
+    same thread group by a thread doing an exec, or processing a fatal
+    signal, or making an <function>exit_group</function> system call.  A
+    tracing engine can prevent a thread from beginning the exit or exec or
+    dying by signal (other than <constant>SIGKILL</constant>) if it is
+    attached to that thread, but once the operation begins, no tracing
+    engine can prevent or delay the other threads in the same thread group
+    from dying.
+  </para>
+  </sect2>
+
+  <sect2 id="reap"><title>Final callbacks</title>
+  <para>
+    The <function>report_reap</function> callback is always the final event
+    in the life cycle of a traced thread.  Tracing engines can use this as
+    the trigger to clean up their own data structures.  The
+    <function>report_death</function> callback is always the penultimate
+    event a tracing engine might see; it's seen unless the thread was
+    already in the midst of dying when the engine attached.  Many tracing
+    engines will have no interest in when a parent reaps a dead process,
+    and nothing they want to do with a zombie thread once it dies; for
+    them, the <function>report_death</function> callback is the natural
+    place to clean up data structures and detach.  To facilitate writing
+    such engines robustly, given the asynchrony of
+    <constant>SIGKILL</constant>, and without error-prone manual
+    implementation of synchronization schemes, the
+    <application>utrace</application> infrastructure provides some special
+    guarantees about the <function>report_death</function> and
+    <function>report_reap</function> callbacks.  It still takes some care
+    to be sure your tracing engine is robust to tear-down races, but these
+    rules make it reasonably straightforward and concise to handle a lot of
+    corner cases correctly.
+  </para>
+  </sect2>
+
+  <sect2 id="refcount"><title>Engine and task pointers</title>
+  <para>
+    The first sort of guarantee concerns the core data structures
+    themselves.  <structname>struct utrace_engine</structname> is
+    a reference-counted data structure.  While you hold a reference, an
+    engine pointer will always stay valid so that you can safely pass it to
+    any <application>utrace</application> call.  Each call to
+    <function>utrace_attach_task</function> or
+    <function>utrace_attach_pid</function> returns an engine pointer with a
+    reference belonging to the caller.  You own that reference until you
+    drop it using <function>utrace_engine_put</function>.  There is an
+    implicit reference on the engine while it is attached.  So if you drop
+    your only reference, and then use
+    <function>utrace_attach_task</function> without
+    <constant>UTRACE_ATTACH_CREATE</constant> to look up that same engine,
+    you will get the same pointer with a new reference to replace the one
+    you dropped, just like calling <function>utrace_engine_get</function>.
+    When an engine has been detached, either explicitly with
+    <constant>UTRACE_DETACH</constant> or implicitly after
+    <function>report_reap</function>, then any references you hold are all
+    that keep the old engine pointer alive.
+  </para>
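+
+  <para>
+    For example (an illustrative sketch, assuming the
+    <constant>UTRACE_ATTACH_MATCH_OPS</constant> lookup flag and an
+    engine's own <structname>my_ops</structname> table):
+  </para>
+
+  <programlisting>
+	/* Drop our reference; the attachment itself keeps the engine alive. */
+	utrace_engine_put(engine);
+
+	/* Later: look up the attached engine again, taking a new reference. */
+	engine = utrace_attach_task(task, UTRACE_ATTACH_MATCH_OPS,
+				    &amp;my_ops, NULL);
+	if (!IS_ERR(engine)) {
+		/* ... use engine ... */
+		utrace_engine_put(engine);
+	}
+  </programlisting>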
+
+  <para>
+    There is nothing a kernel module can do to keep a <structname>struct
+    task_struct</structname> alive outside of
+    <function>rcu_read_lock</function>.  When the task dies and is reaped
+    by its parent (or itself), that structure can be freed so that any
+    dangling pointers you have stored become invalid.
+    <application>utrace</application> will not prevent this, but it can
+    help you detect it safely.  By definition, a task that has been reaped
+    has had all its engines detached.  All
+    <application>utrace</application> calls can be safely called on a
+    detached engine if the caller holds a reference on that engine pointer,
+    even if the task pointer passed in the call is invalid.  All calls
+    return <constant>-ESRCH</constant> for a detached engine, which tells
+    you that the task pointer you passed could be invalid now.  Since
+    <function>utrace_control</function> and
+    <function>utrace_set_events</function> do not block, you can call those
+    inside a <function>rcu_read_lock</function> section and be sure after
+    they don't return <constant>-ESRCH</constant> that the task pointer is
+    still valid until <function>rcu_read_unlock</function>.  The
+    infrastructure never holds task references of its own.  Though neither
+    <function>rcu_read_lock</function> nor any other lock is held while
+    making a callback, it's always guaranteed that the <structname>struct
+    task_struct</structname> and the <structname>struct
+    utrace_engine</structname> passed as arguments remain valid
+    until the callback function returns.
+  </para>
+
+  <para>
+    The common means for safely holding task pointers that is available to
+    kernel modules is to use <structname>struct pid</structname>, which
+    permits <function>put_pid</function> from kernel modules.  When using
+    that, the calls <function>utrace_attach_pid</function>,
+    <function>utrace_control_pid</function>,
+    <function>utrace_set_events_pid</function>, and
+    <function>utrace_barrier_pid</function> are available.
+  </para>
+  </sect2>
+
+  <sect2 id="reap-after-death">
+    <title>
+      Serialization of <constant>DEATH</constant> and <constant>REAP</constant>
+    </title>
+    <para>
+      The second guarantee is the serialization of
+      <constant>DEATH</constant> and <constant>REAP</constant> event
+      callbacks for a given thread.  The actual reaping by the parent
+      (<function>release_task</function> call) can occur simultaneously
+      while the thread is still doing the final steps of dying, including
+      the <function>report_death</function> callback.  If a tracing engine
+      has requested both <constant>DEATH</constant> and
+      <constant>REAP</constant> event reports, it's guaranteed that the
+      <function>report_reap</function> callback will not be made until
+      after the <function>report_death</function> callback has returned.
+      If the <function>report_death</function> callback itself detaches
+      from the thread, then the <function>report_reap</function> callback
+      will never be made.  Thus it is safe for a
+      <function>report_death</function> callback to clean up data
+      structures and detach.
+    </para>
+  </sect2>
+
+  <sect2 id="interlock"><title>Interlock with final callbacks</title>
+  <para>
+    The final sort of guarantee is that a tracing engine will know for sure
+    whether or not the <function>report_death</function> and/or
+    <function>report_reap</function> callbacks will be made for a certain
+    thread.  These tear-down races are disambiguated by the error return
+    values of <function>utrace_set_events</function> and
+    <function>utrace_control</function>.  Normally
+    <function>utrace_control</function> called with
+    <constant>UTRACE_DETACH</constant> returns zero, and this means that no
+    more callbacks will be made.  If the thread is in the midst of dying,
+    it returns <constant>-EALREADY</constant> to indicate that the
+    <function>report_death</function> callback may already be in progress;
+    when you get this error, you know that any cleanup your
+    <function>report_death</function> callback does is about to happen or
+    has just happened--note that if the <function>report_death</function>
+    callback does not detach, the engine remains attached until the thread
+    gets reaped.  If the thread is in the midst of being reaped,
+    <function>utrace_control</function> returns <constant>-ESRCH</constant>
+    to indicate that the <function>report_reap</function> callback may
+    already be in progress; this means the engine is implicitly detached
+    when the callback completes.  This makes it possible for a tracing
+    engine that has decided asynchronously to detach from a thread to
+    safely clean up its data structures, knowing that no
+    <function>report_death</function> or <function>report_reap</function>
+    callback will try to do the same.  <function>utrace_detach</function>
+    returns <constant>-ESRCH</constant> when the <structname>struct
+    utrace_engine</structname> has already been detached, but is
+    still a valid pointer because of its reference count.  A tracing engine
+    can use this to safely synchronize its own independent multiple threads
+    of control with each other and with its event callbacks that detach.
+  </para>
+
+  <para>
+    In the same vein, <function>utrace_set_events</function> normally
+    returns zero; if the target thread was stopped before the call, then
+    after a successful call, no event callbacks not requested in the new
+    flags will be made.  It fails with <constant>-EALREADY</constant> if
+    you try to clear <constant>UTRACE_EVENT(DEATH)</constant> when the
+    <function>report_death</function> callback may already have begun, or if
+    you try to newly set <constant>UTRACE_EVENT(DEATH)</constant> or
+    <constant>UTRACE_EVENT(QUIESCE)</constant> when the target is already
+    dead or dying.  Like <function>utrace_control</function>, it returns
+    <constant>-ESRCH</constant> when the <function>report_reap</function>
+    callback may already have begun, or the thread has already been detached
+    (including forcible detach on reaping).  This lets the tracing engine
+    know for sure which event callbacks it will or won't see after
+    <function>utrace_set_events</function> has returned.  By checking for
+    errors, it can know whether to clean up its data structures immediately
+    or to let its callbacks do the work.
+  </para>
Jesse Keating 7a32965
+  </sect2>
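The engine-side decision described above can be sketched in code.  Since utrace never ran outside a patched kernel, this is a plain userspace model: utrace_set_events_stub() is a scripted stand-in whose return values are the ones documented here, and only the clean-up-now versus let-the-callbacks-do-it decision is meant to be illustrative.

```c
#include <errno.h>

/*
 * Scripted stand-in for utrace_set_events(); the return values modeled
 * here are the documented ones, keyed by a fake target state.
 */
enum target_state { T_STOPPED, T_DYING, T_DETACHED };

static int utrace_set_events_stub(enum target_state state)
{
	switch (state) {
	case T_STOPPED:  return 0;          /* new event mask is in force */
	case T_DYING:    return -EALREADY;  /* report_death may have begun */
	case T_DETACHED: return -ESRCH;     /* detached or report_reap begun */
	}
	return -EINVAL;
}

/*
 * Engine-side policy: on success the engine knows exactly which
 * callbacks can still arrive and may tear down its data immediately;
 * on error it must leave teardown to the remaining callbacks.
 */
static const char *after_set_events(int err)
{
	if (err == 0)
		return "clean up now";
	return "let the callbacks clean up";
}
```

The point is only that the error code, not some separate state query, is what tells the engine which of the two teardown paths is safe.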
+
+  <sect2 id="barrier"><title>Using <function>utrace_barrier</function></title>
+  <para>
+    When a thread is safely stopped, calling
+    <function>utrace_control</function> with <constant>UTRACE_DETACH</constant>
+    or calling <function>utrace_set_events</function> to disable some events
+    ensures synchronously that your engine won't get any more of the callbacks
+    that have been disabled (none at all when detaching).  But these can also
+    be used while the thread is not stopped, when it might be simultaneously
+    making a callback to your engine.  For this situation, these calls return
+    <constant>-EINPROGRESS</constant> when it's possible a callback is in
+    progress.  If you are not prepared to have your old callbacks still run,
+    then you can synchronize to be sure all the old callbacks are finished,
+    using <function>utrace_barrier</function>.  This is necessary if the
+    kernel module containing your callback code is going to be unloaded.
+  </para>
+  <para>
+    After using <constant>UTRACE_DETACH</constant> once, further calls to
+    <function>utrace_control</function> with the same engine pointer will
+    return <constant>-ESRCH</constant>.  In contrast, after getting
+    <constant>-EINPROGRESS</constant> from
+    <function>utrace_set_events</function>, you can call
+    <function>utrace_set_events</function> again later; if it returns zero,
+    you know the old callbacks have finished.
+  </para>
+  <para>
+    Unlike all other calls, <function>utrace_barrier</function> (and
+    <function>utrace_barrier_pid</function>) will accept any engine pointer you
+    hold a reference on, even if <constant>UTRACE_DETACH</constant> has already
+    been used.  After any <function>utrace_control</function> or
+    <function>utrace_set_events</function> call (these do not block), you can
+    call <function>utrace_barrier</function> to block until callbacks have
+    finished.  This returns <constant>-ESRCH</constant> only if the engine is
+    completely detached (finished all callbacks).  Otherwise it waits
+    until the thread is definitely not in the midst of a callback to this
+    engine and then returns zero, but can return
+    <constant>-ERESTARTSYS</constant> if its wait is interrupted.
+  </para>
+  </sect2>
+
+</sect1>
+
+</chapter>
+
+<chapter id="core"><title>utrace core API</title>
+
+<para>
+  The utrace API is declared in <filename>&lt;linux/utrace.h&gt;</filename>.
+</para>
+
+!Iinclude/linux/utrace.h
+!Ekernel/utrace.c
+
+</chapter>
+
+<chapter id="machine"><title>Machine State</title>
+
+<para>
+  The <function>task_current_syscall</function> function can be used on any
+  valid <structname>struct task_struct</structname> at any time, and does
+  not even require that <function>utrace_attach_task</function> was used at all.
+</para>
+
+<para>
+  The other ways to access the registers and other machine-dependent state of
+  a task can only be used on a task that is at a known safe point.  The safe
+  points are all the places where <function>utrace_set_events</function> can
+  request callbacks (except for the <constant>DEATH</constant> and
+  <constant>REAP</constant> events).  So at any event callback, it is safe to
+  examine <varname>current</varname>.
+</para>
+
+<para>
+  One task can examine another only after a callback in the target task that
+  returns <constant>UTRACE_STOP</constant>, so that the task will not return to
+  user mode after the safe point.  This guarantees that the task will not resume
+  until the same engine uses <function>utrace_control</function>, unless the
+  task dies suddenly.  To examine safely, one must use a pair of calls to
+  <function>utrace_prepare_examine</function> and
+  <function>utrace_finish_examine</function> surrounding the calls to
+  <structname>struct user_regset</structname> functions or direct examination
+  of task data structures.  <function>utrace_prepare_examine</function> returns
+  an error if the task is not properly stopped, or is dead.  After a
+  successful examination, the paired <function>utrace_finish_examine</function>
+  call returns an error if the task ever woke up during the examination.  If
+  so, any data gathered may be scrambled and should be discarded.  This means
+  there was a spurious wake-up (which should not happen), or a sudden death.
+</para>
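The prepare/finish pairing can be sketched as a discard-and-retry loop.  This is a userspace model: the stubs script one invalidated examination followed by a clean one, and fake_regs is an invented register block; only the bracket-copy-validate discipline mirrors the documentation above.

```c
#include <errno.h>
#include <string.h>

struct fake_regs { unsigned long ip, sp; };

/*
 * Scripted stand-ins: the first examination window is invalidated by
 * a "wake-up" (finish fails), the second completes cleanly.
 */
static int finish_calls;
static int utrace_prepare_examine_stub(void) { return 0; }
static int utrace_finish_examine_stub(void)
{
	return ++finish_calls == 1 ? -EAGAIN : 0;
}

static void copy_regs(struct fake_regs *dst)
{
	static const struct fake_regs live = { 0x1000, 0x7fff0000 };
	*dst = live;	/* stands in for user_regset copying */
}

/*
 * Documented pattern: bracket the register copy with the
 * prepare/finish pair; if finish reports that the target woke up,
 * throw the data away and try again.
 */
static int examine_regs(struct fake_regs *out)
{
	for (;;) {
		int ret = utrace_prepare_examine_stub();
		if (ret)
			return ret;	/* not properly stopped, or dead */
		copy_regs(out);
		if (utrace_finish_examine_stub() == 0)
			return 0;	/* data is trustworthy */
		/* woke up mid-examination: discard and retry */
		memset(out, 0, sizeof(*out));
	}
}
```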
+
+<sect1 id="regset"><title><structname>struct user_regset</structname></title>
+
+<para>
+  The <structname>struct user_regset</structname> API
+  is declared in <filename>&lt;linux/regset.h&gt;</filename>.
+</para>
+
+!Finclude/linux/regset.h
+
+</sect1>
+
+<sect1 id="task_current_syscall">
+  <title>System Call Information</title>
+
+<para>
+  This function is declared in <filename>&lt;linux/ptrace.h&gt;</filename>.
+</para>
+
+!Elib/syscall.c
+
+</sect1>
+
+<sect1 id="syscall"><title>System Call Tracing</title>
+
+<para>
+  The arch API for system call information is declared in
+  <filename>&lt;asm/syscall.h&gt;</filename>.
+  Each of these calls can be used only during system call entry tracing,
+  or only at system call exit and the subsequent safe points
+  before returning to user mode.
+  Being at system call entry tracing means either being inside a
+  <structfield>report_syscall_entry</structfield> callback,
+  or any time after that callback has returned <constant>UTRACE_STOP</constant>.
+</para>
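For illustration, here is a userspace model of the kind of accessors asm/syscall.h provides.  toy_regs and its x86-64-flavored register layout are invented for the example; only the shape of the number/argument accessors follows the real interface, and a report_syscall_entry callback would use the real ones against the task's pt_regs.

```c
/*
 * A toy pt_regs with an x86-64-like argument layout.  The accessors
 * below are local models, not the kernel's syscall_get_nr() and
 * syscall_get_arguments(), but they have the same shape.
 */
struct toy_regs {
	long orig_ax;			/* system call number */
	unsigned long di, si, dx;	/* first three arguments */
};

static int toy_syscall_get_nr(const struct toy_regs *regs)
{
	return (int)regs->orig_ax;
}

static void toy_syscall_get_arguments(const struct toy_regs *regs,
				      unsigned int n, unsigned long *args)
{
	const unsigned long all[3] = { regs->di, regs->si, regs->dx };
	unsigned int i;

	for (i = 0; i < n && i < 3; i++)
		args[i] = all[i];
}
```

Reading the number and arguments is only meaningful at the entry-tracing safe point described above; at exit, the analogous return-value accessors apply instead.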
+
+!Finclude/asm-generic/syscall.h
+
+</sect1>
+
+</chapter>
+
+<chapter id="internals"><title>Kernel Internals</title>
+
+<para>
+  This chapter covers the interface to the tracing infrastructure
+  from the core of the kernel and the architecture-specific code.
+  This is for maintainers of the kernel and arch code, and not relevant
+  to using the tracing facilities described in preceding chapters.
+</para>
+
+<sect1 id="tracehook"><title>Core Calls In</title>
+
+<para>
+  These calls are declared in <filename>&lt;linux/tracehook.h&gt;</filename>.
+  The core kernel calls these functions at various important places.
+</para>
+
+!Finclude/linux/tracehook.h
+
+</sect1>
+
+<sect1 id="arch"><title>Architecture Calls Out</title>
+
+<para>
+  An arch that has done all these things sets
+  <constant>CONFIG_HAVE_ARCH_TRACEHOOK</constant>.
+  This is required to enable the <application>utrace</application> code.
+</para>
+
+<sect2 id="arch-ptrace"><title><filename>&lt;asm/ptrace.h&gt;</filename></title>
+
+<para>
+  An arch defines these in <filename>&lt;asm/ptrace.h&gt;</filename>
+  if it supports hardware single-step or block-step features.
+</para>
+
+!Finclude/linux/ptrace.h arch_has_single_step arch_has_block_step
+!Finclude/linux/ptrace.h user_enable_single_step user_enable_block_step
+!Finclude/linux/ptrace.h user_disable_single_step
+
+</sect2>
+
+<sect2 id="arch-syscall">
+  <title><filename>&lt;asm/syscall.h&gt;</filename></title>
+
+  <para>
+    An arch provides <filename>&lt;asm/syscall.h&gt;</filename> that
+    defines these as inlines, or declares them as exported functions.
+    These interfaces are described in <xref linkend="syscall"/>.
+  </para>
+
+</sect2>
+
+<sect2 id="arch-tracehook">
+  <title><filename>&lt;linux/tracehook.h&gt;</filename></title>
+
+  <para>
+    An arch must define <constant>TIF_NOTIFY_RESUME</constant>
+    and <constant>TIF_SYSCALL_TRACE</constant>
+    in its <filename>&lt;asm/thread_info.h&gt;</filename>.
+    The arch code must call the following functions, all declared
+    in <filename>&lt;linux/tracehook.h&gt;</filename> and
+    described in <xref linkend="tracehook"/>:
+
+    <itemizedlist>
+      <listitem>
+	<para><function>tracehook_notify_resume</function></para>
+      </listitem>
+      <listitem>
+	<para><function>tracehook_report_syscall_entry</function></para>
+      </listitem>
+      <listitem>
+	<para><function>tracehook_report_syscall_exit</function></para>
+      </listitem>
+      <listitem>
+	<para><function>tracehook_signal_handler</function></para>
+      </listitem>
+    </itemizedlist>
+
+  </para>
+
+</sect2>
+
+</sect1>
+
+</chapter>
+
+</book>
diff --git a/fs/proc/array.c b/fs/proc/array.c
index fff6572..a67bd83 100644  
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -81,6 +81,7 @@
 #include <linux/pid_namespace.h>
 #include <linux/ptrace.h>
 #include <linux/tracehook.h>
+#include <linux/utrace.h>
 
 #include <asm/pgtable.h>
 #include <asm/processor.h>
@@ -192,6 +193,8 @@ static inline void task_state(struct seq
 		cred->uid, cred->euid, cred->suid, cred->fsuid,
 		cred->gid, cred->egid, cred->sgid, cred->fsgid);
 
+	task_utrace_proc_status(m, p);
+
 	task_lock(p);
 	if (p->files)
 		fdt = files_fdtable(p->files);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5e7cc95..66a1ec8 100644  
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1339,6 +1339,11 @@ struct task_struct {
 #endif
 	seccomp_t seccomp;
 
+#ifdef CONFIG_UTRACE
+	struct utrace *utrace;
+	unsigned long utrace_flags;
+#endif
+
 /* Thread group tracking */
    	u32 parent_exec_id;
    	u32 self_exec_id;
diff --git a/include/linux/tracehook.h b/include/linux/tracehook.h
index c78b2f4..71fa250 100644  
--- a/include/linux/tracehook.h
+++ b/include/linux/tracehook.h
@@ -49,6 +49,7 @@
 #include <linux/sched.h>
 #include <linux/ptrace.h>
 #include <linux/security.h>
+#include <linux/utrace.h>
 struct linux_binprm;
 
 /**
@@ -63,6 +64,8 @@ struct linux_binprm;
  */
 static inline int tracehook_expect_breakpoints(struct task_struct *task)
 {
+	if (unlikely(task_utrace_flags(task) & UTRACE_EVENT(SIGNAL_CORE)))
+		return 1;
 	return (task_ptrace(task) & PT_PTRACED) != 0;
 }
 
@@ -111,6 +114,9 @@ static inline void ptrace_report_syscall
 static inline __must_check int tracehook_report_syscall_entry(
 	struct pt_regs *regs)
 {
+	if ((task_utrace_flags(current) & UTRACE_EVENT(SYSCALL_ENTRY)) &&
+	    utrace_report_syscall_entry(regs))
+		return 1;
 	ptrace_report_syscall(regs);
 	return 0;
 }
@@ -134,6 +140,9 @@ static inline __must_check int tracehook
  */
 static inline void tracehook_report_syscall_exit(struct pt_regs *regs, int step)
 {
+	if (task_utrace_flags(current) & UTRACE_EVENT(SYSCALL_EXIT))
+		utrace_report_syscall_exit(regs);
+
 	if (step && (task_ptrace(current) & PT_PTRACED)) {
 		siginfo_t info;
 		user_single_step_siginfo(current, regs, &info);
@@ -201,6 +210,8 @@ static inline void tracehook_report_exec
 					 struct linux_binprm *bprm,
 					 struct pt_regs *regs)
 {
+	if (unlikely(task_utrace_flags(current) & UTRACE_EVENT(EXEC)))
+		utrace_report_exec(fmt, bprm, regs);
 	if (!ptrace_event(PT_TRACE_EXEC, PTRACE_EVENT_EXEC, 0) &&
 	    unlikely(task_ptrace(current) & PT_PTRACED))
 		send_sig(SIGTRAP, current, 0);
@@ -218,10 +229,37 @@ static inline void tracehook_report_exec
  */
 static inline void tracehook_report_exit(long *exit_code)
 {
+	if (unlikely(task_utrace_flags(current) & UTRACE_EVENT(EXIT)))
+		utrace_report_exit(exit_code);
 	ptrace_event(PT_TRACE_EXIT, PTRACE_EVENT_EXIT, *exit_code);
 }
 
 /**
+ * tracehook_init_task - task_struct has just been copied
+ * @task:		new &struct task_struct just copied from parent
+ *
+ * Called from do_fork() when @task has just been duplicated.
+ * After this, @task will be passed to tracehook_free_task()
+ * even if the rest of its setup fails before it is fully created.
+ */
+static inline void tracehook_init_task(struct task_struct *task)
+{
+	utrace_init_task(task);
+}
+
+/**
+ * tracehook_free_task - task_struct is being freed
+ * @task:		dead &struct task_struct being freed
+ *
+ * Called from free_task() when @task is no longer in use.
+ */
+static inline void tracehook_free_task(struct task_struct *task)
+{
+	if (task_utrace_struct(task))
+		utrace_free_task(task);
+}
+
+/**
  * tracehook_prepare_clone - prepare for new child to be cloned
 * @clone_flags:	%CLONE_* flags from clone/fork/vfork system call
 *
@@ -285,6 +323,8 @@ static inline void tracehook_report_clon
 					  unsigned long clone_flags,
 					  pid_t pid, struct task_struct *child)
 {
+	if (unlikely(task_utrace_flags(current) & UTRACE_EVENT(CLONE)))
+		utrace_report_clone(clone_flags, child);
 	if (unlikely(task_ptrace(child))) {
 		/*
 		 * It doesn't matter who attached/attaching to this
@@ -317,6 +357,9 @@ static inline void tracehook_report_clon
 						   pid_t pid,
 						   struct task_struct *child)
 {
+	if (unlikely(task_utrace_flags(current) & UTRACE_EVENT(CLONE)) &&
+	    (clone_flags & CLONE_VFORK))
+		utrace_finish_vfork(current);
 	if (unlikely(trace))
 		ptrace_event(0, trace, pid);
 }
@@ -351,6 +394,10 @@ static inline void tracehook_report_vfor
  */
 static inline void tracehook_prepare_release_task(struct task_struct *task)
 {
+	/* see utrace_add_engine() about this barrier */
+	smp_mb();
+	if (task_utrace_flags(task))
+		utrace_maybe_reap(task, task_utrace_struct(task), true);
 }
 
 /**
@@ -365,6 +412,7 @@ static inline void tracehook_prepare_rel
 static inline void tracehook_finish_release_task(struct task_struct *task)
 {
 	ptrace_release_task(task);
+	BUG_ON(task->exit_state != EXIT_DEAD);
 }
 
 /**
@@ -386,6 +434,8 @@ static inline void tracehook_signal_hand
 					    const struct k_sigaction *ka,
 					    struct pt_regs *regs, int stepping)
 {
+	if (task_utrace_flags(current))
+		utrace_signal_handler(current, stepping);
 	if (stepping && (task_ptrace(current) & PT_PTRACED))
 		ptrace_notify(SIGTRAP);
 }
@@ -403,6 +453,8 @@ static inline void tracehook_signal_hand
 static inline int tracehook_consider_ignored_signal(struct task_struct *task,
 						    int sig)
 {
+	if (unlikely(task_utrace_flags(task) & UTRACE_EVENT(SIGNAL_IGN)))
+		return 1;
 	return (task_ptrace(task) & PT_PTRACED) != 0;
 }
 
@@ -422,6 +474,9 @@ static inline int tracehook_consider_ign
 static inline int tracehook_consider_fatal_signal(struct task_struct *task,
 						  int sig)
 {
+	if (unlikely(task_utrace_flags(task) & (UTRACE_EVENT(SIGNAL_TERM) |
+						UTRACE_EVENT(SIGNAL_CORE))))
+		return 1;
 	return (task_ptrace(task) & PT_PTRACED) != 0;
 }
 
@@ -436,6 +491,8 @@ static inline int tracehook_consider_fat
  */
 static inline int tracehook_force_sigpending(void)
 {
+	if (unlikely(task_utrace_flags(current)))
+		return utrace_interrupt_pending();
 	return 0;
 }
 
@@ -465,6 +522,8 @@ static inline int tracehook_get_signal(s
 				       siginfo_t *info,
 				       struct k_sigaction *return_ka)
 {
+	if (unlikely(task_utrace_flags(task)))
+		return utrace_get_signal(task, regs, info, return_ka);
 	return 0;
 }
 
@@ -492,6 +551,8 @@ static inline int tracehook_get_signal(s
  */
 static inline int tracehook_notify_jctl(int notify, int why)
 {
+	if (task_utrace_flags(current) & UTRACE_EVENT(JCTL))
+		utrace_report_jctl(notify, why);
 	return notify ?: task_ptrace(current) ? why : 0;
 }
 
@@ -502,6 +563,8 @@ static inline int tracehook_notify_jctl(
  */
 static inline void tracehook_finish_jctl(void)
 {
+	if (task_utrace_flags(current))
+		utrace_finish_stop();
 }
 
 #define DEATH_REAP			-1
@@ -524,6 +587,8 @@ static inline void tracehook_finish_jctl
 static inline int tracehook_notify_death(struct task_struct *task,
 					 void **death_cookie, int group_dead)
 {
+	*death_cookie = task_utrace_struct(task);
+
 	if (task_detached(task))
 		return task->ptrace ? SIGCHLD : DEATH_REAP;
 
@@ -560,6 +625,15 @@ static inline void tracehook_report_deat
 					  int signal, void *death_cookie,
 					  int group_dead)
 {
+	/*
+	 * If utrace_set_events() was just called to enable
+	 * UTRACE_EVENT(DEATH), then we are obliged to call
+	 * utrace_report_death() and not miss it.  utrace_set_events()
+	 * checks @task->exit_state under tasklist_lock to synchronize
+	 * with exit_notify(), the caller.
+	 */
+	if (task_utrace_flags(task) & _UTRACE_DEATH_EVENTS)
+		utrace_report_death(task, death_cookie, group_dead, signal);
 }
 
 #ifdef TIF_NOTIFY_RESUME
@@ -589,10 +663,21 @@ static inline void set_notify_resume(str
  * asynchronously, this will be called again before we return to
 * user mode.
 *
- * Called without locks.
+ * Called without locks.  However, on some machines this may be
+ * called with interrupts disabled.
  */
 static inline void tracehook_notify_resume(struct pt_regs *regs)
 {
+	struct task_struct *task = current;
+	/*
+	 * Prevent the following store/load from getting ahead of the
+	 * caller which clears TIF_NOTIFY_RESUME. This pairs with the
+	 * implicit mb() before setting TIF_NOTIFY_RESUME in
+	 * set_notify_resume().
+	 */
+	smp_mb();
+	if (task_utrace_flags(task))
+		utrace_resume(task, regs);
 }
 #endif	/* TIF_NOTIFY_RESUME */
 
diff --git a/include/linux/utrace.h b/include/linux/utrace.h
Jesse Keating 7a32965
new file mode 100644
Roland McGrath edee7cd
index ...f251efe 100644  
Jesse Keating 7a32965
--- /dev/null
Jesse Keating 7a32965
+++ b/include/linux/utrace.h
Jesse Keating 7a32965
@@ -0,0 +1,692 @@
Jesse Keating 7a32965
+/*
Jesse Keating 7a32965
+ * utrace infrastructure interface for debugging user processes
Jesse Keating 7a32965
+ *
Jesse Keating 7a32965
+ * Copyright (C) 2006-2009 Red Hat, Inc.  All rights reserved.
Jesse Keating 7a32965
+ *
Jesse Keating 7a32965
+ * This copyrighted material is made available to anyone wishing to use,
Jesse Keating 7a32965
+ * modify, copy, or redistribute it subject to the terms and conditions
Jesse Keating 7a32965
+ * of the GNU General Public License v.2.
Jesse Keating 7a32965
+ *
Jesse Keating 7a32965
+ * Red Hat Author: Roland McGrath.
Jesse Keating 7a32965
+ *
Jesse Keating 7a32965
+ * This interface allows for notification of interesting events in a
Jesse Keating 7a32965
+ * thread.  It also mediates access to thread state such as registers.
Jesse Keating 7a32965
+ * Multiple unrelated users can be associated with a single thread.
Jesse Keating 7a32965
+ * We call each of these a tracing engine.
Jesse Keating 7a32965
+ *
Jesse Keating 7a32965
+ * A tracing engine starts by calling utrace_attach_task() or
Jesse Keating 7a32965
+ * utrace_attach_pid() on the chosen thread, passing in a set of hooks
Jesse Keating 7a32965
+ * (&struct utrace_engine_ops), and some associated data.  This produces a
Jesse Keating 7a32965
+ * &struct utrace_engine, which is the handle used for all other
Jesse Keating 7a32965
+ * operations.  An attached engine has its ops vector, its data, and an
Jesse Keating 7a32965
+ * event mask controlled by utrace_set_events().
Jesse Keating 7a32965
+ *
Jesse Keating 7a32965
+ * For each event bit that is set, that engine will get the
Jesse Keating 7a32965
+ * appropriate ops->report_*() callback when the event occurs.  The
Jesse Keating 7a32965
+ * &struct utrace_engine_ops need not provide callbacks for an event
Jesse Keating 7a32965
+ * unless the engine sets one of the associated event bits.
Jesse Keating 7a32965
+ */
Jesse Keating 7a32965
+
Jesse Keating 7a32965
+#ifndef _LINUX_UTRACE_H
Jesse Keating 7a32965
+#define _LINUX_UTRACE_H	1
Jesse Keating 7a32965
+
Jesse Keating 7a32965
+#include <linux/list.h>
Jesse Keating 7a32965
+#include <linux/kref.h>
Jesse Keating 7a32965
+#include <linux/signal.h>
Jesse Keating 7a32965
+#include <linux/sched.h>
Jesse Keating 7a32965
+
Jesse Keating 7a32965
+struct linux_binprm;
Jesse Keating 7a32965
+struct pt_regs;
Jesse Keating 7a32965
+struct utrace;
Jesse Keating 7a32965
+struct user_regset;
Jesse Keating 7a32965
+struct user_regset_view;
Jesse Keating 7a32965
+
Jesse Keating 7a32965
+/*
Jesse Keating 7a32965
+ * Event bits passed to utrace_set_events().
Jesse Keating 7a32965
+ * These appear in &struct task_struct.@utrace_flags
Jesse Keating 7a32965
+ * and &struct utrace_engine.@flags.
Jesse Keating 7a32965
+ */
Jesse Keating 7a32965
+enum utrace_events {
Jesse Keating 7a32965
+	_UTRACE_EVENT_QUIESCE,	/* Thread is available for examination.  */
Jesse Keating 7a32965
+	_UTRACE_EVENT_REAP,  	/* Zombie reaped, no more tracing possible.  */
Jesse Keating 7a32965
+	_UTRACE_EVENT_CLONE,	/* Successful clone/fork/vfork just done.  */
Jesse Keating 7a32965
+	_UTRACE_EVENT_EXEC,	/* Successful execve just completed.  */
Jesse Keating 7a32965
+	_UTRACE_EVENT_EXIT,	/* Thread exit in progress.  */
Jesse Keating 7a32965
+	_UTRACE_EVENT_DEATH,	/* Thread has died.  */
Jesse Keating 7a32965
+	_UTRACE_EVENT_SYSCALL_ENTRY, /* User entered kernel for system call. */
Jesse Keating 7a32965
+	_UTRACE_EVENT_SYSCALL_EXIT, /* Returning to user after system call.  */
Jesse Keating 7a32965
+	_UTRACE_EVENT_SIGNAL,	/* Signal delivery will run a user handler.  */
Jesse Keating 7a32965
+	_UTRACE_EVENT_SIGNAL_IGN, /* No-op signal to be delivered.  */
Jesse Keating 7a32965
+	_UTRACE_EVENT_SIGNAL_STOP, /* Signal delivery will suspend.  */
Jesse Keating 7a32965
+	_UTRACE_EVENT_SIGNAL_TERM, /* Signal delivery will terminate.  */
Jesse Keating 7a32965
+	_UTRACE_EVENT_SIGNAL_CORE, /* Signal delivery will dump core.  */
Jesse Keating 7a32965
+	_UTRACE_EVENT_JCTL,	/* Job control stop or continue completed.  */
Jesse Keating 7a32965
+	_UTRACE_NEVENTS
Jesse Keating 7a32965
+};
Jesse Keating 7a32965
+#define UTRACE_EVENT(type)	(1UL << _UTRACE_EVENT_##type)
Jesse Keating 7a32965
+
+/*
+ * All the kinds of signal events.
+ * These all use the @report_signal() callback.
+ */
+#define UTRACE_EVENT_SIGNAL_ALL	(UTRACE_EVENT(SIGNAL) \
+				 | UTRACE_EVENT(SIGNAL_IGN) \
+				 | UTRACE_EVENT(SIGNAL_STOP) \
+				 | UTRACE_EVENT(SIGNAL_TERM) \
+				 | UTRACE_EVENT(SIGNAL_CORE))
+/*
+ * Both kinds of syscall events; these call the @report_syscall_entry()
+ * and @report_syscall_exit() callbacks, respectively.
+ */
+#define UTRACE_EVENT_SYSCALL	\
+	(UTRACE_EVENT(SYSCALL_ENTRY) | UTRACE_EVENT(SYSCALL_EXIT))
+
+/*
+ * The event reports triggered synchronously by task death.
+ */
+#define _UTRACE_DEATH_EVENTS (UTRACE_EVENT(DEATH) | UTRACE_EVENT(QUIESCE))
+
+/*
+ * Hooks in <linux/tracehook.h> call these entry points to the utrace dispatch.
+ */
+void utrace_free_task(struct task_struct *);
+bool utrace_interrupt_pending(void);
+void utrace_resume(struct task_struct *, struct pt_regs *);
+void utrace_finish_stop(void);
+void utrace_maybe_reap(struct task_struct *, struct utrace *, bool);
+int utrace_get_signal(struct task_struct *, struct pt_regs *,
+		      siginfo_t *, struct k_sigaction *);
+void utrace_report_clone(unsigned long, struct task_struct *);
+void utrace_finish_vfork(struct task_struct *);
+void utrace_report_exit(long *exit_code);
+void utrace_report_death(struct task_struct *, struct utrace *, bool, int);
+void utrace_report_jctl(int notify, int type);
+void utrace_report_exec(struct linux_binfmt *, struct linux_binprm *,
+			struct pt_regs *regs);
+bool utrace_report_syscall_entry(struct pt_regs *);
+void utrace_report_syscall_exit(struct pt_regs *);
+void utrace_signal_handler(struct task_struct *, int);
+
+#ifndef CONFIG_UTRACE
+
+/*
+ * <linux/tracehook.h> uses these accessors to avoid #ifdef CONFIG_UTRACE.
+ */
+static inline unsigned long task_utrace_flags(struct task_struct *task)
+{
+	return 0;
+}
+static inline struct utrace *task_utrace_struct(struct task_struct *task)
+{
+	return NULL;
+}
+static inline void utrace_init_task(struct task_struct *child)
+{
+}
+
+static inline void task_utrace_proc_status(struct seq_file *m,
+					   struct task_struct *p)
+{
+}
+
+#else  /* CONFIG_UTRACE */
+
+static inline unsigned long task_utrace_flags(struct task_struct *task)
+{
+	return task->utrace_flags;
+}
+
+static inline struct utrace *task_utrace_struct(struct task_struct *task)
+{
+	struct utrace *utrace;
+
+	/*
+	 * This barrier ensures that any prior load of task->utrace_flags
+	 * is ordered before this load of task->utrace.  We use those
+	 * utrace_flags checks in the hot path to decide to call into
+	 * the utrace code.  The first attach installs task->utrace before
+	 * setting task->utrace_flags nonzero, with an implicit barrier in
+	 * between; see utrace_add_engine().
+	 */
+	smp_rmb();
+	utrace = task->utrace;
+
+	smp_read_barrier_depends(); /* See utrace_task_alloc().  */
+	return utrace;
+}
+
+static inline void utrace_init_task(struct task_struct *task)
+{
+	task->utrace_flags = 0;
+	task->utrace = NULL;
+}
+
+void task_utrace_proc_status(struct seq_file *m, struct task_struct *p);
+
+
+/*
+ * Version number of the API defined in this file.  This will change
+ * whenever a tracing engine's code would need some updates to keep
+ * working.  We maintain this here for the benefit of tracing engine code
+ * that is developed concurrently with utrace API improvements before they
+ * are merged into the kernel, making LINUX_VERSION_CODE checks unwieldy.
+ */
+#define UTRACE_API_VERSION	20091216
+
+/**
+ * enum utrace_resume_action - engine's choice of action for a traced task
+ * @UTRACE_STOP:		Stay quiescent after callbacks.
+ * @UTRACE_INTERRUPT:		Make @report_signal() callback soon.
+ * @UTRACE_REPORT:		Make some callback soon.
+ * @UTRACE_SINGLESTEP:		Resume in user mode for one instruction.
+ * @UTRACE_BLOCKSTEP:		Resume in user mode until next branch.
+ * @UTRACE_RESUME:		Resume normally in user mode.
+ * @UTRACE_DETACH:		Detach my engine (implies %UTRACE_RESUME).
+ *
+ * See utrace_control() for detailed descriptions of each action.  This is
+ * encoded in the @action argument and the return value for every callback
+ * with a &u32 return value.
+ *
+ * The order of these is important.  When there is more than one engine,
+ * each supplies its choice and the smallest value prevails.
+ */
+enum utrace_resume_action {
+	UTRACE_STOP,
+	UTRACE_INTERRUPT,
+	UTRACE_REPORT,
+	UTRACE_SINGLESTEP,
+	UTRACE_BLOCKSTEP,
+	UTRACE_RESUME,
+	UTRACE_DETACH,
+	UTRACE_RESUME_MAX
+};
+#define UTRACE_RESUME_BITS	(ilog2(UTRACE_RESUME_MAX) + 1)
+#define UTRACE_RESUME_MASK	((1 << UTRACE_RESUME_BITS) - 1)
+
+/**
+ * utrace_resume_action - &enum utrace_resume_action from callback action
+ * @action:		&u32 callback @action argument or return value
+ *
+ * This extracts the &enum utrace_resume_action from @action,
+ * which is the @action argument to a &struct utrace_engine_ops
+ * callback or the return value from one.
+ */
+static inline enum utrace_resume_action utrace_resume_action(u32 action)
+{
+	return action & UTRACE_RESUME_MASK;
+}
+
+/**
+ * enum utrace_signal_action - disposition of signal
+ * @UTRACE_SIGNAL_DELIVER:	Deliver according to sigaction.
+ * @UTRACE_SIGNAL_IGN:		Ignore the signal.
+ * @UTRACE_SIGNAL_TERM:		Terminate the process.
+ * @UTRACE_SIGNAL_CORE:		Terminate with core dump.
+ * @UTRACE_SIGNAL_STOP:		Deliver as absolute stop.
+ * @UTRACE_SIGNAL_TSTP:		Deliver as job control stop.
+ * @UTRACE_SIGNAL_REPORT:	Reporting before pending signals.
+ * @UTRACE_SIGNAL_HANDLER:	Reporting after signal handler setup.
+ *
+ * This is encoded in the @action argument and the return value for
+ * a @report_signal() callback.  It says what will happen to the
+ * signal described by the &siginfo_t parameter to the callback.
+ *
+ * The %UTRACE_SIGNAL_REPORT value is used in an @action argument when
+ * a tracing report is being made before dequeuing any pending signal.
+ * If this is immediately after a signal handler has been set up, then
+ * %UTRACE_SIGNAL_HANDLER is used instead.  A @report_signal callback
+ * that uses %UTRACE_SIGNAL_DELIVER|%UTRACE_SINGLESTEP will ensure
+ * it sees a %UTRACE_SIGNAL_HANDLER report.
+ */
+enum utrace_signal_action {
+	UTRACE_SIGNAL_DELIVER	= 0x00,
+	UTRACE_SIGNAL_IGN	= 0x10,
+	UTRACE_SIGNAL_TERM	= 0x20,
+	UTRACE_SIGNAL_CORE	= 0x30,
+	UTRACE_SIGNAL_STOP	= 0x40,
+	UTRACE_SIGNAL_TSTP	= 0x50,
+	UTRACE_SIGNAL_REPORT	= 0x60,
+	UTRACE_SIGNAL_HANDLER	= 0x70
+};
+#define	UTRACE_SIGNAL_MASK	0xf0
+#define UTRACE_SIGNAL_HOLD	0x100 /* Flag, push signal back on queue.  */
+
+/**
+ * utrace_signal_action - &enum utrace_signal_action from callback action
+ * @action:		@report_signal callback @action argument or return value
+ *
+ * This extracts the &enum utrace_signal_action from @action, which
+ * is the @action argument to a @report_signal callback or the
+ * return value from one.
+ */
+static inline enum utrace_signal_action utrace_signal_action(u32 action)
+{
+	return action & UTRACE_SIGNAL_MASK;
+}
+
+/**
+ * enum utrace_syscall_action - disposition of system call attempt
+ * @UTRACE_SYSCALL_RUN:		Run the system call.
+ * @UTRACE_SYSCALL_ABORT:	Don't run the system call.
+ *
+ * This is encoded in the @action argument and the return value for
+ * a @report_syscall_entry callback.
+ */
+enum utrace_syscall_action {
+	UTRACE_SYSCALL_RUN	= 0x00,
+	UTRACE_SYSCALL_ABORT	= 0x10
+};
+#define	UTRACE_SYSCALL_MASK	0xf0
+#define	UTRACE_SYSCALL_RESUMED	0x100 /* Flag, report_syscall_entry() repeats */
+
+/**
+ * utrace_syscall_action - &enum utrace_syscall_action from callback action
+ * @action:		@report_syscall_entry callback @action or return value
+ *
+ * This extracts the &enum utrace_syscall_action from @action, which
+ * is the @action argument to a @report_syscall_entry callback or the
+ * return value from one.
+ */
+static inline enum utrace_syscall_action utrace_syscall_action(u32 action)
+{
+	return action & UTRACE_SYSCALL_MASK;
+}
+
+/*
+ * Flags for utrace_attach_task() and utrace_attach_pid().
+ */
+#define UTRACE_ATTACH_MATCH_OPS		0x0001 /* Match engines on ops.  */
+#define UTRACE_ATTACH_MATCH_DATA	0x0002 /* Match engines on data.  */
+#define UTRACE_ATTACH_MATCH_MASK	0x000f
+#define UTRACE_ATTACH_CREATE		0x0010 /* Attach a new engine.  */
+#define UTRACE_ATTACH_EXCLUSIVE		0x0020 /* Refuse if existing match.  */
+
+/**
+ * struct utrace_engine - per-engine structure
+ * @ops:	&struct utrace_engine_ops pointer passed to utrace_attach_task()
+ * @data:	engine-private &void * passed to utrace_attach_task()
+ * @flags:	event mask set by utrace_set_events() plus internal flag bits
+ *
+ * The task itself never has to worry about engines detaching while
+ * it's doing event callbacks.  These structures are removed from the
+ * task's active list only when it's stopped, or by the task itself.
+ *
+ * utrace_engine_get() and utrace_engine_put() maintain a reference count.
+ * When it drops to zero, the structure is freed.  One reference is held
+ * implicitly while the engine is attached to its task.
+ */
+struct utrace_engine {
+/* private: */
+	struct kref kref;
+	void (*release)(void *);
+	struct list_head entry;
+
+/* public: */
+	const struct utrace_engine_ops *ops;
+	void *data;
+
+	unsigned long flags;
+};
+
+/**
+ * utrace_engine_get - acquire a reference on a &struct utrace_engine
+ * @engine:	&struct utrace_engine pointer
+ *
+ * You must hold a reference on @engine, and you get another.
+ */
+static inline void utrace_engine_get(struct utrace_engine *engine)
+{
+	kref_get(&engine->kref);
+}
+
+void __utrace_engine_release(struct kref *);
+
+/**
+ * utrace_engine_put - release a reference on a &struct utrace_engine
+ * @engine:	&struct utrace_engine pointer
+ *
+ * You must hold a reference on @engine, and you lose that reference.
+ * If it was the last one, @engine becomes an invalid pointer.
+ */
+static inline void utrace_engine_put(struct utrace_engine *engine)
+{
+	kref_put(&engine->kref, __utrace_engine_release);
+}
+
+/**
+ * struct utrace_engine_ops - tracing engine callbacks
+ *
+ * Each @report_*() callback corresponds to an %UTRACE_EVENT(*) bit.
+ * utrace_set_events() calls on @engine choose which callbacks will
+ * be made to @engine from @task.
+ *
+ * Most callbacks take an @action argument, giving the resume action
+ * chosen by other tracing engines.  All callbacks take an @engine
+ * argument.  The @report_reap callback takes a @task argument that
+ * might or might not be @current.  All other @report_* callbacks
+ * report an event in the @current task.
+ *
+ * For some calls, @action also includes bits specific to that event
+ * and utrace_resume_action() is used to extract the resume action.
+ * This shows what would happen if @engine wasn't there, or will if
+ * the callback's return value uses %UTRACE_RESUME.  This always
+ * starts as %UTRACE_RESUME when no other tracing is being done on
+ * this task.
+ *
+ * All return values contain &enum utrace_resume_action bits.  For
+ * some calls, other bits specific to that kind of event are added to
+ * the resume action bits with OR.  These are the same bits used in
+ * the @action argument.  The resume action returned by a callback
+ * does not override previous engines' choices, it only says what
+ * @engine wants done.  What @current actually does is the action that's
+ * most constrained among the choices made by all attached engines.
+ * See utrace_control() for more information on the actions.
+ *
+ * When %UTRACE_STOP is used in @report_syscall_entry, then @current
+ * stops before attempting the system call.  In this case, another
+ * @report_syscall_entry callback will follow after @current resumes if
+ * %UTRACE_REPORT or %UTRACE_INTERRUPT was returned by some callback
+ * or passed to utrace_control().  In a second or later callback,
+ * %UTRACE_SYSCALL_RESUMED is set in the @action argument to indicate
+ * a repeat callback still waiting to attempt the same system call
+ * invocation.  This repeat callback gives each engine an opportunity
+ * to reexamine registers another engine might have changed while
+ * @current was held in %UTRACE_STOP.
+ *
+ * In other cases, the resume action does not take effect until @current
+ * is ready to check for signals and return to user mode.  If there
+ * are more callbacks to be made, the last round of calls determines
+ * the final action.  A @report_quiesce callback with @event zero, or
+ * a @report_signal callback, will always be the last one made before
+ * @current resumes.  Only %UTRACE_STOP is "sticky"--if @engine returned
+ * %UTRACE_STOP then @current stays stopped unless @engine returns
+ * something different from a following callback.
+ *
+ * The report_death() and report_reap() callbacks do not take @action
+ * arguments, and only %UTRACE_DETACH is meaningful in the return value
+ * from a report_death() callback.  None of the resume actions applies
+ * to a dead thread.
+ *
+ * All @report_*() hooks are called with no locks held, in a generally
+ * safe environment when we will be returning to user mode soon (or just
+ * entered the kernel).  It is fine to block for memory allocation and
+ * the like, but all hooks are asynchronous and must not block on
+ * external events!  If you want the thread to block, use %UTRACE_STOP
+ * in your hook's return value; then later wake it up with utrace_control().
+ *
+ * @report_quiesce:
+ *	Requested by %UTRACE_EVENT(%QUIESCE).
+ *	This does not indicate any event, but just that @current is in a
+ *	safe place for examination.  This call is made before each specific
+ *	event callback, except for @report_reap.  The @event argument gives
+ *	the %UTRACE_EVENT(@which) value for the event occurring.  This
+ *	callback might be made for events @engine has not requested, if
+ *	some other engine is tracing the event; a utrace_set_events() call
+ *	here can request the immediate callback for this occurrence of
+ *	@event.  @event is zero when there is no other event, @current is
+ *	now ready to check for signals and return to user mode, and some
+ *	engine has used %UTRACE_REPORT or %UTRACE_INTERRUPT to request this
+ *	callback.  For this case, if @report_signal is not %NULL, the
+ *	@report_quiesce callback may be replaced with a @report_signal
+ *	callback passing %UTRACE_SIGNAL_REPORT in its @action argument,
+ *	whenever @current is entering the signal-check path anyway.
+ *
+ * @report_signal:
+ *	Requested by %UTRACE_EVENT(%SIGNAL_*) or %UTRACE_EVENT(%QUIESCE).
+ *	Use utrace_signal_action() and utrace_resume_action() on @action.
+ *	The signal action is %UTRACE_SIGNAL_REPORT when some engine has
+ *	used %UTRACE_REPORT or %UTRACE_INTERRUPT; the callback can choose
+ *	to stop or to deliver an artificial signal, before pending signals.
+ *	It's %UTRACE_SIGNAL_HANDLER instead when signal handler setup just
+ *	finished (after a previous %UTRACE_SIGNAL_DELIVER return); this
+ *	serves in lieu of any %UTRACE_SIGNAL_REPORT callback requested by
+ *	%UTRACE_REPORT or %UTRACE_INTERRUPT, and is also implicitly
+ *	requested by %UTRACE_SINGLESTEP or %UTRACE_BLOCKSTEP into the
+ *	signal delivery.  The other signal actions indicate a signal about
+ *	to be delivered; the previous engine's return value sets the signal
+ *	action seen by the following engine's callback.  The @info data
+ *	can be changed at will, including @info->si_signo.  The settings in
+ *	@return_ka determine what %UTRACE_SIGNAL_DELIVER does.  @orig_ka
+ *	is what was in force before other tracing engines intervened, and
+ *	it's %NULL when this report began as %UTRACE_SIGNAL_REPORT or
+ *	%UTRACE_SIGNAL_HANDLER.  For a report without a new signal, @info
+ *	is left uninitialized and must be set completely by an engine that
+ *	chooses to deliver a signal; if there was a previous @report_signal
+ *	callback ending in %UTRACE_STOP and it was just resumed using
+ *	%UTRACE_REPORT or %UTRACE_INTERRUPT, then @info is left unchanged
+ *	from the previous callback.  In this way, the original signal can
+ *	be left in @info while returning %UTRACE_STOP|%UTRACE_SIGNAL_IGN
+ *	and then found again when resuming with %UTRACE_INTERRUPT.
+ *	The %UTRACE_SIGNAL_HOLD flag bit can be OR'd into the return value,
+ *	and might be in @action if the previous engine returned it.  This
+ *	flag asks that the signal in @info be pushed back on @current's queue
+ *	so that it will be seen again after whatever action is taken now.
+ *
+ * @report_clone:
+ *	Requested by %UTRACE_EVENT(%CLONE).
+ *	Event reported for parent, before the new task @child might run.
+ *	@clone_flags gives the flags used in the clone system call, or
+ *	equivalent flags for a fork() or vfork() system call.  This
+ *	function can use utrace_attach_task() on @child.  Then passing
+ *	%UTRACE_STOP to utrace_control() on @child here keeps the child
+ *	stopped before it ever runs in user mode; %UTRACE_REPORT or
+ *	%UTRACE_INTERRUPT ensures a callback from @child before it
+ *	starts in user mode.
+ *
+ * @report_jctl:
+ *	Requested by %UTRACE_EVENT(%JCTL).
+ *	Job control event; @type is %CLD_STOPPED or %CLD_CONTINUED,
+ *	indicating whether we are stopping or resuming now.  If @notify
+ *	is nonzero, @current is the last thread to stop and so will send
+ *	%SIGCHLD to its parent after this callback; @notify reflects
+ *	what the parent's %SIGCHLD has in @si_code, which can sometimes
+ *	be %CLD_STOPPED even when @type is %CLD_CONTINUED.
+ *
+ * @report_exec:
+ *	Requested by %UTRACE_EVENT(%EXEC).
+ *	An execve system call has succeeded and the new program is about to
+ *	start running.  The initial user register state is handy to be tweaked
+ *	directly in @regs.  @fmt and @bprm give the details of this exec.
+ *
+ * @report_syscall_entry:
+ *	Requested by %UTRACE_EVENT(%SYSCALL_ENTRY).
+ *	Thread has entered the kernel to request a system call.
+ *	The user register state is handy to be tweaked directly in @regs.
+ *	The @action argument contains an &enum utrace_syscall_action;
+ *	use utrace_syscall_action() to extract it.  The return value
+ *	overrides the last engine's action for the system call.
+ *	If the final action is %UTRACE_SYSCALL_ABORT, no system call
+ *	is made.  The details of the system call being attempted can
+ *	be fetched here with syscall_get_nr() and syscall_get_arguments().
+ *	The parameter registers can be changed with syscall_set_arguments().
+ *	See above about the %UTRACE_SYSCALL_RESUMED flag in @action.
+ *	Use %UTRACE_REPORT in the return value to guarantee you get
+ *	another callback (with %UTRACE_SYSCALL_RESUMED flag) in case
+ *	@current stops with %UTRACE_STOP before attempting the system call.
+ *
+ * @report_syscall_exit:
+ *	Requested by %UTRACE_EVENT(%SYSCALL_EXIT).
+ *	Thread is about to leave the kernel after a system call request.
+ *	The user register state is handy to be tweaked directly in @regs.
+ *	The results of the system call attempt can be examined here using
+ *	syscall_get_error() and syscall_get_return_value().  It is safe
+ *	here to call syscall_set_return_value() or syscall_rollback().
+ *
+ * @report_exit:
+ *	Requested by %UTRACE_EVENT(%EXIT).
+ *	Thread is exiting and cannot be prevented from doing so,
+ *	but all its state is still live.  The @code value will be
+ *	the wait result seen by the parent, and can be changed by
+ *	this engine or others.  The @orig_code value is the real
+ *	status, not changed by any tracing engine.  Returning %UTRACE_STOP
+ *	here keeps @current stopped before it cleans up its state and dies,
+ *	so it can be examined by other processes.  When @current is allowed
+ *	to run, it will die and get to the @report_death callback.
+ *
+ * @report_death:
+ *	Requested by %UTRACE_EVENT(%DEATH).
+ *	Thread is really dead now.  It might be reaped by its parent at
+ *	any time, or self-reap immediately.  Though the actual reaping
+ *	may happen in parallel, a report_reap() callback will always be
+ *	ordered after a report_death() callback.
+ *
+ * @report_reap:
+ *	Requested by %UTRACE_EVENT(%REAP).
+ *	Called when someone reaps the dead task (parent, init, or self).
+ *	This means the parent called wait, or else this was a detached
+ *	thread or a process whose parent ignores SIGCHLD.
+ *	No more callbacks are made after this one.
+ *	The engine is always detached.
+ *	There is nothing more a tracing engine can do about this thread.
+ *	After this callback, the @engine pointer will become invalid.
+ *	The @task pointer may become invalid if get_task_struct() hasn't
+ *	been used to keep it alive.
+ *	An engine should always request this callback if it stores the
+ *	@engine pointer or stores any pointer in @engine->data, so it
+ *	can clean up its data structures.
+ *	Unlike other callbacks, this can be called from the parent's context
+ *	rather than from the traced thread itself--it must not delay the
+ *	parent by blocking.
+ *
+ * @release:
+ *	If not %NULL, this is called after the last utrace_engine_put()
+ *	call for a &struct utrace_engine, which could be implicit after
+ *	a %UTRACE_DETACH return from another callback.  Its argument is
+ *	the engine's @data member.
+ */
+struct utrace_engine_ops {
+	u32 (*report_quiesce)(u32 action, struct utrace_engine *engine,
+			      unsigned long event);
+	u32 (*report_signal)(u32 action, struct utrace_engine *engine,
+			     struct pt_regs *regs,
+			     siginfo_t *info,
+			     const struct k_sigaction *orig_ka,
+			     struct k_sigaction *return_ka);
+	u32 (*report_clone)(u32 action, struct utrace_engine *engine,
+			    unsigned long clone_flags,
+			    struct task_struct *child);
+	u32 (*report_jctl)(u32 action, struct utrace_engine *engine,
+			   int type, int notify);
+	u32 (*report_exec)(u32 action, struct utrace_engine *engine,
+			   const struct linux_binfmt *fmt,
+			   const struct linux_binprm *bprm,
+			   struct pt_regs *regs);
+	u32 (*report_syscall_entry)(u32 action, struct utrace_engine *engine,
+				    struct pt_regs *regs);
+	u32 (*report_syscall_exit)(u32 action, struct utrace_engine *engine,
+				   struct pt_regs *regs);
+	u32 (*report_exit)(u32 action, struct utrace_engine *engine,
+			   long orig_code, long *code);
+	u32 (*report_death)(struct utrace_engine *engine,
+			    bool group_dead, int signal);
+	void (*report_reap)(struct utrace_engine *engine,
+			    struct task_struct *task);
+	void (*release)(void *data);
+};
+
Jesse Keating 7a32965
+/**
Jesse Keating 7a32965
+ * struct utrace_examiner - private state for using utrace_prepare_examine()
Jesse Keating 7a32965
+ *
Jesse Keating 7a32965
+ * The members of &struct utrace_examiner are private to the implementation.
Jesse Keating 7a32965
+ * This data type holds the state from a call to utrace_prepare_examine()
Jesse Keating 7a32965
+ * to be used by a call to utrace_finish_examine().
Jesse Keating 7a32965
+ */
Jesse Keating 7a32965
+struct utrace_examiner {
Jesse Keating 7a32965
+/* private: */
Jesse Keating 7a32965
+	long state;
Jesse Keating 7a32965
+	unsigned long ncsw;
Jesse Keating 7a32965
+};
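The prepare/finish pair that uses &struct utrace_examiner follows an optimistic-read shape: snapshot something the target changes whenever it runs (the kernel records the task's state and context-switch count), read without locking, then check the snapshot afterward to learn whether the read raced. A minimal userspace sketch of just that shape, with invented names and a plain generation counter standing in for the task state (this is an analogy, not the utrace implementation):

```c
#include <stdbool.h>

struct shared {
	unsigned long gen;	/* bumped by the writer around updates */
	int value;
};

struct examiner {
	unsigned long gen;	/* snapshot taken at prepare time */
};

/* Record the writer's generation before examining its data. */
static void prepare_examine(const struct shared *s, struct examiner *ex)
{
	ex->gen = s->gen;
}

/* True if the data could not have changed since prepare_examine(). */
static bool finish_examine(const struct shared *s, const struct examiner *ex)
{
	return s->gen == ex->gen;
}
```

A caller that sees finish_examine() return false discards what it read and retries, which is what a utrace engine does when utrace_finish_examine() fails.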
+
+/*
+ * These are the exported entry points for tracing engines to use.
+ * See kernel/utrace.c for their kerneldoc comments with interface details.
+ */
+struct utrace_engine *utrace_attach_task(struct task_struct *, int,
+					 const struct utrace_engine_ops *,
+					 void *);
+struct utrace_engine *utrace_attach_pid(struct pid *, int,
+					const struct utrace_engine_ops *,
+					void *);
+int __must_check utrace_control(struct task_struct *,
+				struct utrace_engine *,
+				enum utrace_resume_action);
+int __must_check utrace_set_events(struct task_struct *,
+				   struct utrace_engine *,
+				   unsigned long eventmask);
+int __must_check utrace_barrier(struct task_struct *,
+				struct utrace_engine *);
+int __must_check utrace_prepare_examine(struct task_struct *,
+					struct utrace_engine *,
+					struct utrace_examiner *);
+int __must_check utrace_finish_examine(struct task_struct *,
+				       struct utrace_engine *,
+				       struct utrace_examiner *);
+
+/**
+ * utrace_control_pid - control a thread being traced by a tracing engine
+ * @pid:		thread to affect
+ * @engine:		attached engine to affect
+ * @action:		&enum utrace_resume_action for thread to do
+ *
+ * This is the same as utrace_control(), but takes a &struct pid
+ * pointer rather than a &struct task_struct pointer.  The caller must
+ * hold a ref on @pid, but does not need to worry about the task
+ * staying valid.  If it's been reaped so that @pid points nowhere,
+ * then this call returns -%ESRCH.
+ */
+static inline __must_check int utrace_control_pid(
+	struct pid *pid, struct utrace_engine *engine,
+	enum utrace_resume_action action)
+{
+	/*
+	 * We don't bother with rcu_read_lock() here to protect the
+	 * task_struct pointer, because utrace_control will return
+	 * -ESRCH without looking at that pointer if the engine is
+	 * already detached.  A task_struct pointer can't die before
+	 * all the engines are detached in release_task() first.
+	 */
+	struct task_struct *task = pid_task(pid, PIDTYPE_PID);
+	return unlikely(!task) ? -ESRCH : utrace_control(task, engine, action);
+}
+
+/**
+ * utrace_set_events_pid - choose which event reports a tracing engine gets
+ * @pid:		thread to affect
+ * @engine:		attached engine to affect
+ * @eventmask:		new event mask
+ *
+ * This is the same as utrace_set_events(), but takes a &struct pid
+ * pointer rather than a &struct task_struct pointer.  The caller must
+ * hold a ref on @pid, but does not need to worry about the task
+ * staying valid.  If it's been reaped so that @pid points nowhere,
+ * then this call returns -%ESRCH.
+ */
+static inline __must_check int utrace_set_events_pid(
+	struct pid *pid, struct utrace_engine *engine, unsigned long eventmask)
+{
+	struct task_struct *task = pid_task(pid, PIDTYPE_PID);
+	return unlikely(!task) ? -ESRCH :
+		utrace_set_events(task, engine, eventmask);
+}
+
+/**
+ * utrace_barrier_pid - synchronize with simultaneous tracing callbacks
+ * @pid:		thread to affect
+ * @engine:		engine to affect (can be detached)
+ *
+ * This is the same as utrace_barrier(), but takes a &struct pid
+ * pointer rather than a &struct task_struct pointer.  The caller must
+ * hold a ref on @pid, but does not need to worry about the task
+ * staying valid.  If it's been reaped so that @pid points nowhere,
+ * then this call returns -%ESRCH.
+ */
+static inline __must_check int utrace_barrier_pid(struct pid *pid,
+						  struct utrace_engine *engine)
+{
+	struct task_struct *task = pid_task(pid, PIDTYPE_PID);
+	return unlikely(!task) ? -ESRCH : utrace_barrier(task, engine);
+}
+
+#endif	/* CONFIG_UTRACE */
+
+#endif	/* linux/utrace.h */
diff --git a/init/Kconfig b/init/Kconfig
index 2de5b1c..a283086 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -332,6 +332,15 @@ config AUDIT_TREE
 	depends on AUDITSYSCALL
 	select FSNOTIFY
 
+config UTRACE
+	bool "Infrastructure for tracing and debugging user processes"
+	depends on EXPERIMENTAL
+	depends on HAVE_ARCH_TRACEHOOK
+	help
+	  Enable the utrace process tracing interface.  This is an internal
+	  kernel interface exported to kernel modules, to track events in
+	  user threads, extract and change user thread state.
+
 menu "RCU Subsystem"
 
 choice
diff --git a/kernel/Makefile b/kernel/Makefile
index 0b72d1a..6004913 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -70,6 +70,7 @@ obj-$(CONFIG_IKCONFIG) += configs.o
 obj-$(CONFIG_RESOURCE_COUNTERS) += res_counter.o
 obj-$(CONFIG_SMP) += stop_machine.o
 obj-$(CONFIG_KPROBES_SANITY_TEST) += test_kprobes.o
+obj-$(CONFIG_UTRACE) += utrace.o
 obj-$(CONFIG_AUDIT) += audit.o auditfilter.o
 obj-$(CONFIG_AUDITSYSCALL) += auditsc.o
 obj-$(CONFIG_AUDIT_WATCH) += audit_watch.o
diff --git a/kernel/fork.c b/kernel/fork.c
index 98b4508..3ceff6f 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -161,6 +161,7 @@ void free_task(struct task_struct *tsk)
 	free_thread_info(tsk->stack);
 	rt_mutex_debug_task_free(tsk);
 	ftrace_graph_exit_task(tsk);
+	tracehook_free_task(tsk);
 	free_task_struct(tsk);
 }
 EXPORT_SYMBOL(free_task);
@@ -1008,6 +1009,8 @@ static struct task_struct *copy_process(
 	if (!p)
 		goto fork_out;
 
+	tracehook_init_task(p);
+
 	ftrace_graph_init_task(p);
 
 	rt_mutex_init_task(p);
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 8049cb5..23bde94 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -15,6 +15,7 @@
 #include <linux/highmem.h>
 #include <linux/pagemap.h>
 #include <linux/ptrace.h>
+#include <linux/utrace.h>
 #include <linux/security.h>
 #include <linux/signal.h>
 #include <linux/audit.h>
@@ -163,6 +164,14 @@ bool ptrace_may_access(struct task_struc
 	return !err;
 }
 
+/*
+ * For experimental use of utrace, exclude ptrace on the same task.
+ */
+static inline bool exclude_ptrace(struct task_struct *task)
+{
+	return unlikely(!!task_utrace_flags(task));
+}
+
 int ptrace_attach(struct task_struct *task)
 {
 	int retval;
@@ -186,6 +195,8 @@ int ptrace_attach(struct task_struct *ta
 
 	task_lock(task);
 	retval = __ptrace_may_access(task, PTRACE_MODE_ATTACH);
+	if (!retval && exclude_ptrace(task))
+		retval = -EBUSY;
 	task_unlock(task);
 	if (retval)
 		goto unlock_creds;
@@ -223,6 +234,9 @@ int ptrace_traceme(void)
 {
 	int ret = -EPERM;
 
+	if (exclude_ptrace(current)) /* XXX locking */
+		return -EBUSY;
+
 	write_lock_irq(&tasklist_lock);
 	/* Are we already being traced? */
 	if (!current->ptrace) {
diff --git a/kernel/utrace.c b/kernel/utrace.c
new file mode 100644
index ...43f38b7 100644
--- /dev/null
+++ b/kernel/utrace.c
@@ -0,0 +1,2434 @@
+/*
+ * utrace infrastructure interface for debugging user processes
+ *
+ * Copyright (C) 2006-2010 Red Hat, Inc.  All rights reserved.
+ *
+ * This copyrighted material is made available to anyone wishing to use,
+ * modify, copy, or redistribute it subject to the terms and conditions
+ * of the GNU General Public License v.2.
+ *
+ * Red Hat Author: Roland McGrath.
+ */
+
+#include <linux/utrace.h>
+#include <linux/tracehook.h>
+#include <linux/regset.h>
+#include <asm/syscall.h>
+#include <linux/ptrace.h>
+#include <linux/err.h>
+#include <linux/sched.h>
+#include <linux/freezer.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/seq_file.h>
+
+
+/*
+ * Per-thread structure private to utrace implementation.
+ * If task_struct.utrace_flags is nonzero, task_struct.utrace
+ * has always been allocated first.  Once allocated, it is
+ * never freed until free_task().
+ *
+ * The common event reporting loops are done by the task making the
+ * report without ever taking any locks.  To facilitate this, the two
+ * lists @attached and @attaching work together for smooth asynchronous
+ * attaching with low overhead.  Modifying either list requires @lock.
+ * The @attaching list can be modified any time while holding @lock.
+ * New engines being attached always go on this list.
+ *
+ * The @attached list is what the task itself uses for its reporting
+ * loops.  When the task itself is not quiescent, it can use the
+ * @attached list without taking any lock.  Nobody may modify the list
+ * when the task is not quiescent.  When it is quiescent, that means
+ * that it won't run again without taking @lock itself before using
+ * the list.
+ *
+ * At each place where we know the task is quiescent (or it's current),
+ * while holding @lock, we call splice_attaching(), below.  This moves
+ * the @attaching list members on to the end of the @attached list.
+ * Since this happens at the start of any reporting pass, any new
+ * engines attached asynchronously go on the stable @attached list
+ * in time to have their callbacks seen.
+ */
+struct utrace {
+	spinlock_t lock;
+	struct list_head attached, attaching;
+
+	struct task_struct *cloning;
+
+	struct utrace_engine *reporting;
+
+	enum utrace_resume_action resume:UTRACE_RESUME_BITS;
+	unsigned int signal_handler:1;
+	unsigned int vfork_stop:1; /* need utrace_stop() before vfork wait */
+	unsigned int death:1;	/* in utrace_report_death() now */
+	unsigned int reap:1;	/* release_task() has run */
+	unsigned int pending_attach:1; /* need splice_attaching() */
+};
+
+static struct kmem_cache *utrace_cachep;
+static struct kmem_cache *utrace_engine_cachep;
+static const struct utrace_engine_ops utrace_detached_ops; /* forward decl */
+
+static int __init utrace_init(void)
+{
+	utrace_cachep = KMEM_CACHE(utrace, SLAB_PANIC);
+	utrace_engine_cachep = KMEM_CACHE(utrace_engine, SLAB_PANIC);
+	return 0;
+}
+module_init(utrace_init);
Jesse Keating 7a32965
+
Kyle McMartin 07d3322
+/*
Kyle McMartin 07d3322
+ * Set up @task.utrace for the first time.  We can have races
Kyle McMartin 07d3322
+ * between two utrace_attach_task() calls here.  The task_lock()
Kyle McMartin 07d3322
+ * governs installing the new pointer.  If another one got in first,
Kyle McMartin 07d3322
+ * we just punt the new one we allocated.
Kyle McMartin 07d3322
+ *
Kyle McMartin 07d3322
+ * This returns false only in case of a memory allocation failure.
Kyle McMartin 07d3322
+ */
Kyle McMartin 07d3322
+static bool utrace_task_alloc(struct task_struct *task)
Jesse Keating 7a32965
+{
Kyle McMartin 07d3322
+	struct utrace *utrace = kmem_cache_zalloc(utrace_cachep, GFP_KERNEL);
Kyle McMartin 07d3322
+	if (unlikely(!utrace))
Kyle McMartin 07d3322
+		return false;
Kyle McMartin 07d3322
+	spin_lock_init(&utrace->lock);
Kyle McMartin 07d3322
+	INIT_LIST_HEAD(&utrace->attached);
Kyle McMartin 07d3322
+	INIT_LIST_HEAD(&utrace->attaching);
Kyle McMartin 07d3322
+	utrace->resume = UTRACE_RESUME;
Kyle McMartin 07d3322
+	task_lock(task);
Kyle McMartin 07d3322
+	if (likely(!task->utrace)) {
Kyle McMartin 07d3322
+		/*
Kyle McMartin 07d3322
+		 * This barrier makes sure the initialization of the struct
Kyle McMartin 07d3322
+		 * precedes the installation of the pointer.  This pairs
Kyle McMartin 07d3322
+		 * with smp_read_barrier_depends() in task_utrace_struct().
Kyle McMartin 07d3322
+		 */
Kyle McMartin 07d3322
+		smp_wmb();
Kyle McMartin 07d3322
+		task->utrace = utrace;
Kyle McMartin 07d3322
+	}
Kyle McMartin 07d3322
+	task_unlock(task);
Kyle McMartin 07d3322
+
Kyle McMartin 07d3322
+	if (unlikely(task->utrace != utrace))
Kyle McMartin 07d3322
+		kmem_cache_free(utrace_cachep, utrace);
Kyle McMartin 07d3322
+	return true;
Jesse Keating 7a32965
+}
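utrace_task_alloc() is an instance of the allocate-outside-the-lock, install-or-discard pattern: do the slow allocation with no lock held, take the lock only to install the pointer if nobody beat us to it, and free the spare copy when we lose the race. A minimal userspace sketch of just that pattern (a pthread mutex stands in for task_lock(), the names are invented, and the kernel's smp_wmb() pairing is omitted):

```c
#include <pthread.h>
#include <stdlib.h>
#include <stdbool.h>

struct state { int dummy; };

struct owner {
	pthread_mutex_t lock;
	struct state *state;	/* installed once, freed only at teardown */
};

/* Returns false only on allocation failure, like utrace_task_alloc(). */
static bool state_alloc(struct owner *o)
{
	struct state *st = calloc(1, sizeof(*st));
	if (!st)
		return false;

	pthread_mutex_lock(&o->lock);
	if (!o->state)		/* we won the race: install our copy */
		o->state = st;
	pthread_mutex_unlock(&o->lock);

	if (o->state != st)	/* we lost: punt the one we allocated */
		free(st);
	return true;
}
```

Either way the function returns with o->state valid, so concurrent callers never see a half-installed pointer and never leak the loser's allocation.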
+
+/*
+ * This is called via tracehook_free_task() from free_task()
+ * when @task is being deallocated.
+ */
+void utrace_free_task(struct task_struct *task)
+{
+	kmem_cache_free(utrace_cachep, task->utrace);
+}
+
+/*
+ * This is called when the task is safely quiescent, i.e. it won't consult
+ * utrace->attached without the lock.  Move any engines attached
+ * asynchronously from @utrace->attaching onto the @utrace->attached list.
+ */
+static void splice_attaching(struct utrace *utrace)
+{
+	lockdep_assert_held(&utrace->lock);
+	list_splice_tail_init(&utrace->attaching, &utrace->attached);
+	utrace->pending_attach = 0;
+}
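The two-list scheme around splice_attaching() can be shown in miniature: producers always append to a pending list under the lock, and the consumer splices the pending list onto its private stable list only at a known-quiescent point, so its normal traversal of the stable list needs no lock at all. A self-contained userspace sketch with a hand-rolled tail-pointer list (all names here are ours, not utrace's):

```c
#include <pthread.h>
#include <stddef.h>

struct node { int id; struct node *next; };

struct list { struct node *head, **tail; };

static void list_init(struct list *l)
{
	l->head = NULL;
	l->tail = &l->head;
}

static void list_add_tail(struct list *l, struct node *n)
{
	n->next = NULL;
	*l->tail = n;
	l->tail = &n->next;
}

/* Move everything from @src onto the end of @dst, leaving @src empty. */
static void list_splice_tail_init(struct list *dst, struct list *src)
{
	if (!src->head)
		return;
	*dst->tail = src->head;
	dst->tail = src->tail;
	list_init(src);
}

struct tracker {
	pthread_mutex_t lock;
	struct list attached;	/* traversed lock-free by its owner */
	struct list attaching;	/* modified only under @lock */
	int pending;		/* like utrace->pending_attach */
};

static void tracker_add(struct tracker *t, struct node *n)
{
	pthread_mutex_lock(&t->lock);
	list_add_tail(&t->attaching, n);
	t->pending = 1;
	pthread_mutex_unlock(&t->lock);
}

/* Called only at a quiescent point, like splice_attaching(). */
static void tracker_splice(struct tracker *t)
{
	pthread_mutex_lock(&t->lock);
	list_splice_tail_init(&t->attached, &t->attaching);
	t->pending = 0;
	pthread_mutex_unlock(&t->lock);
}
```

Because the splice happens at the start of a reporting pass, anything added asynchronously lands on the stable list in time to be seen, which is exactly the property the comment above describes.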
+
+/*
+ * This is the exported function used by the utrace_engine_put() inline.
+ */
+void __utrace_engine_release(struct kref *kref)
+{
+	struct utrace_engine *engine = container_of(kref, struct utrace_engine,
+						    kref);
+	BUG_ON(!list_empty(&engine->entry));
+	if (engine->release)
+		(*engine->release)(engine->data);
+	kmem_cache_free(utrace_engine_cachep, engine);
+}
+EXPORT_SYMBOL_GPL(__utrace_engine_release);
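The release path above is the standard embedded-refcount idiom: a count lives inside the larger object, container_of() recovers the object from the count, and an optional user-supplied hook runs before the memory is freed. A stripped-down userspace sketch of the same idiom (a plain int stands in for struct kref, and every name here is invented for illustration):

```c
#include <stdlib.h>
#include <stddef.h>

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct ref { int count; };

struct engine {
	struct ref kref;	/* embedded refcount, as in utrace_engine */
	void (*release)(void *data);
	void *data;
};

static int release_calls;	/* observable side effect for the demo */

static void count_release(void *data)
{
	(void)data;
	release_calls++;
}

/* Analogue of __utrace_engine_release(): hook first, then free. */
static void engine_release(struct ref *r)
{
	struct engine *e = container_of(r, struct engine, kref);
	if (e->release)
		e->release(e->data);
	free(e);
}

/* Drop one reference; the last put frees the object. */
static void engine_put(struct engine *e)
{
	if (--e->kref.count == 0)
		engine_release(&e->kref);
}
```

The real kref_put() does the decrement atomically; this single-threaded sketch keeps only the ownership logic.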
+
+static bool engine_matches(struct utrace_engine *engine, int flags,
+			   const struct utrace_engine_ops *ops, void *data)
+{
+	if ((flags & UTRACE_ATTACH_MATCH_OPS) && engine->ops != ops)
+		return false;
+	if ((flags & UTRACE_ATTACH_MATCH_DATA) && engine->data != data)
+		return false;
+	return engine->ops && engine->ops != &utrace_detached_ops;
+}
+
+static struct utrace_engine *find_matching_engine(
+	struct utrace *utrace, int flags,
+	const struct utrace_engine_ops *ops, void *data)
+{
+	struct utrace_engine *engine;
+	list_for_each_entry(engine, &utrace->attached, entry)
+		if (engine_matches(engine, flags, ops, data))
+			return engine;
+	list_for_each_entry(engine, &utrace->attaching, entry)
+		if (engine_matches(engine, flags, ops, data))
+			return engine;
+	return NULL;
+}
+
+/*
+ * Enqueue @engine, or maybe don't if UTRACE_ATTACH_EXCLUSIVE.
+ */
+static int utrace_add_engine(struct task_struct *target,
+			     struct utrace *utrace,
+			     struct utrace_engine *engine,
+			     int flags,
+			     const struct utrace_engine_ops *ops,
+			     void *data)
+{
+	int ret;
+
+	spin_lock(&utrace->lock);
+
+	ret = -EEXIST;
+	if ((flags & UTRACE_ATTACH_EXCLUSIVE) &&
+	     unlikely(find_matching_engine(utrace, flags, ops, data)))
+		goto unlock;
+
+	/*
+	 * In case we had no engines before, make sure that
+	 * utrace_flags is not zero. Since we did unlock+lock
+	 * at least once after utrace_task_alloc() installed
+	 * ->utrace, we have the necessary barrier which pairs
+	 * with rmb() in task_utrace_struct().
+	 */
+	ret = -ESRCH;
+	if (!target->utrace_flags) {
+		target->utrace_flags = UTRACE_EVENT(REAP);
+		/*
+		 * If we race with tracehook_prepare_release_task()
+		 * make sure that either it sees utrace_flags != 0
+		 * or we see exit_state == EXIT_DEAD.
+		 */
+		smp_mb();
+		if (unlikely(target->exit_state == EXIT_DEAD)) {
+			target->utrace_flags = 0;
+			goto unlock;
+		}
+	}
+
+	/*
+	 * Put the new engine on the pending ->attaching list.
+	 * Make sure it gets onto the ->attached list by the next
+	 * time it's examined.  Setting ->pending_attach ensures
+	 * that start_report() takes the lock and splices the lists
+	 * before the next new reporting pass.
+	 *
+	 * When target == current, it would be safe just to call
+	 * splice_attaching() right here.  But if we're inside a
+	 * callback, that would mean the new engine also gets
+	 * notified about the event that precipitated its own
+	 * creation.  This is not what the user wants.
+	 */
+	list_add_tail(&engine->entry, &utrace->attaching);
+	utrace->pending_attach = 1;
+	utrace_engine_get(engine);
+	ret = 0;
+unlock:
+	spin_unlock(&utrace->lock);
+
+	return ret;
+}
+
+/**
+ * utrace_attach_task - attach new engine, or look up an attached engine
+ * @target:	thread to attach to
+ * @flags:	flag bits combined with OR, see below
+ * @ops:	callback table for new engine
+ * @data:	engine private data pointer
+ *
+ * The caller must ensure that the @target thread does not get freed,
+ * i.e. hold a ref or be its parent.  It is always safe to call this
+ * on @current, or on the @child pointer in a @report_clone callback.
+ * For most other cases, it's easier to use utrace_attach_pid() instead.
+ *
+ * UTRACE_ATTACH_CREATE:
+ * Create a new engine.  If %UTRACE_ATTACH_CREATE is not specified, you
+ * only look up an existing engine already attached to the thread.
+ *
+ * UTRACE_ATTACH_EXCLUSIVE:
+ * Attempting to attach a second (matching) engine fails with -%EEXIST.
+ *
+ * UTRACE_ATTACH_MATCH_OPS: Only consider engines matching @ops.
+ * UTRACE_ATTACH_MATCH_DATA: Only consider engines matching @data.
+ *
+ * Calls with neither %UTRACE_ATTACH_MATCH_OPS nor %UTRACE_ATTACH_MATCH_DATA
+ * match the first among any engines attached to @target.  That means that
+ * %UTRACE_ATTACH_EXCLUSIVE in such a call fails with -%EEXIST if there
+ * are any engines on @target at all.
+ */
+struct utrace_engine *utrace_attach_task(
+	struct task_struct *target, int flags,
+	const struct utrace_engine_ops *ops, void *data)
+{
+	struct utrace *utrace = task_utrace_struct(target);
+	struct utrace_engine *engine;
+	int ret;
+
+	if (!(flags & UTRACE_ATTACH_CREATE)) {
+		if (unlikely(!utrace))
+			return ERR_PTR(-ENOENT);
+		spin_lock(&utrace->lock);
+		engine = find_matching_engine(utrace, flags, ops, data);
+		if (engine)
+			utrace_engine_get(engine);
+		spin_unlock(&utrace->lock);
+		return engine ?: ERR_PTR(-ENOENT);
+	}
+
+	if (unlikely(!ops) || unlikely(ops == &utrace_detached_ops))
+		return ERR_PTR(-EINVAL);
+
+	if (unlikely(target->flags & PF_KTHREAD))
+		/*
+		 * Silly kernel, utrace is for users!
+		 */
+		return ERR_PTR(-EPERM);
+
+	if (!utrace) {
+		if (unlikely(!utrace_task_alloc(target)))
+			return ERR_PTR(-ENOMEM);
+		utrace = task_utrace_struct(target);
+	}
+
+	engine = kmem_cache_alloc(utrace_engine_cachep, GFP_KERNEL);
+	if (unlikely(!engine))
+		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * Initialize the new engine structure.  It starts out with one ref
+	 * to return.  utrace_add_engine() adds another for being attached.
+	 */
+	kref_init(&engine->kref);
+	engine->flags = 0;
+	engine->ops = ops;
+	engine->data = data;
+	engine->release = ops->release;
+
+	ret = utrace_add_engine(target, utrace, engine, flags, ops, data);
+
+	if (unlikely(ret)) {
+		kmem_cache_free(utrace_engine_cachep, engine);
+		engine = ERR_PTR(ret);
+	}
+
+	return engine;
+}
+EXPORT_SYMBOL_GPL(utrace_attach_task);
+
+/**
+ * utrace_attach_pid - attach new engine, or look up an attached engine
+ * @pid:	&struct pid pointer representing thread to attach to
+ * @flags:	flag bits combined with OR, see utrace_attach_task()
+ * @ops:	callback table for new engine
+ * @data:	engine private data pointer
+ *
+ * This is the same as utrace_attach_task(), but takes a &struct pid
+ * pointer rather than a &struct task_struct pointer.  The caller must
+ * hold a ref on @pid, but does not need to worry about the task
+ * staying valid.  If it's been reaped so that @pid points nowhere,
+ * then this call returns -%ESRCH.
+ */
+struct utrace_engine *utrace_attach_pid(
+	struct pid *pid, int flags,
+	const struct utrace_engine_ops *ops, void *data)
+{
+	struct utrace_engine *engine = ERR_PTR(-ESRCH);
+	struct task_struct *task = get_pid_task(pid, PIDTYPE_PID);
+	if (task) {
+		engine = utrace_attach_task(task, flags, ops, data);
+		put_task_struct(task);
+	}
+	return engine;
+}
+EXPORT_SYMBOL_GPL(utrace_attach_pid);
Jesse Keating 7a32965
+
+/*
+ * When an engine is detached, the target thread may still see it and
+ * make callbacks until it quiesces.  We install a special ops vector
+ * with these two callbacks.  When the target thread quiesces, it can
+ * safely free the engine itself.  For any event we will always get
+ * the report_quiesce() callback first, so we only need this one
+ * pointer to be set.  The only exception is report_reap(), so we
+ * supply that callback too.
+ */
+static u32 utrace_detached_quiesce(u32 action, struct utrace_engine *engine,
+				   unsigned long event)
+{
+	return UTRACE_DETACH;
+}
+
+static void utrace_detached_reap(struct utrace_engine *engine,
+				 struct task_struct *task)
+{
+}
+
+static const struct utrace_engine_ops utrace_detached_ops = {
+	.report_quiesce = &utrace_detached_quiesce,
+	.report_reap = &utrace_detached_reap
+};
+
+/*
+ * The caller has to hold a ref on the engine.  If the attached flag is
+ * true (all but utrace_barrier() calls), the engine is supposed to be
+ * attached.  If the attached flag is false (utrace_barrier() only),
+ * then return -ERESTARTSYS for an engine marked for detach but not yet
+ * fully detached.  The task pointer can be invalid if the engine is
+ * detached.
+ *
+ * Get the utrace lock for the target task.
+ * Returns the struct if locked, or ERR_PTR(-errno).
+ *
+ * This has to be robust against races with:
+ *	utrace_control(target, UTRACE_DETACH) calls
+ *	UTRACE_DETACH after reports
+ *	utrace_report_death
+ *	utrace_release_task
+ */
+static struct utrace *get_utrace_lock(struct task_struct *target,
+				      struct utrace_engine *engine,
+				      bool attached)
+	__acquires(utrace->lock)
+{
+	struct utrace *utrace;
+
+	rcu_read_lock();
+
+	/*
+	 * If this engine was already detached, bail out before we look at
+	 * the task_struct pointer at all.  If it's detached after this
+	 * check, then RCU is still keeping this task_struct pointer valid.
+	 *
+	 * The ops pointer is NULL when the engine is fully detached.
+	 * It's &utrace_detached_ops when it's marked detached but still
+	 * on the list.  In the latter case, utrace_barrier() still works,
+	 * since the target might be in the middle of an old callback.
+	 */
+	if (unlikely(!engine->ops)) {
+		rcu_read_unlock();
+		return ERR_PTR(-ESRCH);
+	}
+
+	if (unlikely(engine->ops == &utrace_detached_ops)) {
+		rcu_read_unlock();
+		return attached ? ERR_PTR(-ESRCH) : ERR_PTR(-ERESTARTSYS);
+	}
+
+	utrace = task_utrace_struct(target);
+	spin_lock(&utrace->lock);
+	if (unlikely(utrace->reap) || unlikely(!engine->ops) ||
+	    unlikely(engine->ops == &utrace_detached_ops)) {
+		/*
+		 * By the time we got the utrace lock,
+		 * it had been reaped or detached already.
+		 */
+		spin_unlock(&utrace->lock);
+		utrace = ERR_PTR(-ESRCH);
+		if (!attached && engine->ops == &utrace_detached_ops)
+			utrace = ERR_PTR(-ERESTARTSYS);
+	}
+	rcu_read_unlock();
+
+	return utrace;
+}
+
+/*
+ * Now that we don't hold any locks, run through any
+ * detached engines and free their references.  Each
+ * engine had one implicit ref while it was attached.
+ */
+static void put_detached_list(struct list_head *list)
+{
+	struct utrace_engine *engine, *next;
+	list_for_each_entry_safe(engine, next, list, entry) {
+		list_del_init(&engine->entry);
+		utrace_engine_put(engine);
+	}
+}
+
+/*
+ * We use an extra bit in utrace_engine.flags past the event bits,
+ * to record whether the engine is keeping the target thread stopped.
+ *
+ * This bit is set in task_struct.utrace_flags whenever it is set in any
+ * engine's flags.  Only utrace_reset() resets it in utrace_flags.
+ */
+#define ENGINE_STOP		(1UL << _UTRACE_NEVENTS)
+
+static void mark_engine_wants_stop(struct task_struct *task,
+				   struct utrace_engine *engine)
+{
+	engine->flags |= ENGINE_STOP;
+	task->utrace_flags |= ENGINE_STOP;
+}
+
+static void clear_engine_wants_stop(struct utrace_engine *engine)
+{
+	engine->flags &= ~ENGINE_STOP;
+}
+
+static bool engine_wants_stop(struct utrace_engine *engine)
+{
+	return (engine->flags & ENGINE_STOP) != 0;
+}
+
+/**
+ * utrace_set_events - choose which event reports a tracing engine gets
+ * @target:		thread to affect
+ * @engine:		attached engine to affect
+ * @events:		new event mask
+ *
+ * This changes the set of events for which @engine wants callbacks made.
+ *
+ * This fails with -%EALREADY and does nothing if you try to clear
+ * %UTRACE_EVENT(%DEATH) when the @report_death callback may already have
+ * begun, or if you try to newly set %UTRACE_EVENT(%DEATH) or
+ * %UTRACE_EVENT(%QUIESCE) when @target is already dead or dying.
+ *
+ * This fails with -%ESRCH if you try to clear %UTRACE_EVENT(%REAP) when
+ * the @report_reap callback may already have begun, or when @target has
+ * already been detached, including forcible detach on reaping.
+ *
+ * If @target was stopped before the call, then after a successful call,
+ * no event callbacks not requested in @events will be made; if
+ * %UTRACE_EVENT(%QUIESCE) is included in @events, then a
+ * @report_quiesce callback will be made when @target resumes.
+ *
+ * If @target was not stopped and @events excludes some bits that were
+ * set before, this can return -%EINPROGRESS to indicate that @target
+ * may have been making some callback to @engine.  When this returns
+ * zero, you can be sure that no event callbacks you've disabled in
+ * @events can be made.  If @events only sets new bits that were not set
+ * before on @engine, then -%EINPROGRESS will never be returned.
+ *
+ * To synchronize after an -%EINPROGRESS return, see utrace_barrier().
+ *
+ * When @target is @current, -%EINPROGRESS is not returned.  But note
+ * that a newly-created engine will not receive any callbacks related to
+ * an event notification already in progress.  This call enables @events
+ * callbacks to be made as soon as @engine becomes eligible for any
+ * callbacks, see utrace_attach_task().
+ *
+ * These rules provide for coherent synchronization based on %UTRACE_STOP,
+ * even when %SIGKILL is breaking its normal simple rules.
+ */
+int utrace_set_events(struct task_struct *target,
+		      struct utrace_engine *engine,
+		      unsigned long events)
+{
+	struct utrace *utrace;
+	unsigned long old_flags, old_utrace_flags;
+	int ret = -EALREADY;
+
+	/*
+	 * We just ignore the internal bit, so callers can use
+	 * engine->flags to seed bitwise ops for our argument.
+	 */
+	events &= ~ENGINE_STOP;
+
+	utrace = get_utrace_lock(target, engine, true);
+	if (unlikely(IS_ERR(utrace)))
+		return PTR_ERR(utrace);
+
+	old_utrace_flags = target->utrace_flags;
+	old_flags = engine->flags & ~ENGINE_STOP;
+
+	/*
+	 * If utrace_report_death() is already in progress,
+	 * it's too late to clear the death event bits.
+	 */
+	if (((old_flags & ~events) & _UTRACE_DEATH_EVENTS) && utrace->death)
+		goto unlock;
+
+	/*
+	 * When setting these flags, it's essential that we really
+	 * synchronize with exit_notify().  They cannot be set after
+	 * exit_notify() takes the tasklist_lock.  By holding the read
+	 * lock here while setting the flags, we ensure that the calls
+	 * to tracehook_notify_death() and tracehook_report_death() will
+	 * see the new flags.  This ensures that utrace_release_task()
+	 * knows positively that utrace_report_death() will be called or
+	 * that it won't.
+	 */
+	if ((events & ~old_flags) & _UTRACE_DEATH_EVENTS) {
+		read_lock(&tasklist_lock);
+		if (unlikely(target->exit_state)) {
+			read_unlock(&tasklist_lock);
+			goto unlock;
+		}
+		target->utrace_flags |= events;
+		read_unlock(&tasklist_lock);
+	}
+
+	engine->flags = events | (engine->flags & ENGINE_STOP);
+	target->utrace_flags |= events;
+
+	if ((events & UTRACE_EVENT_SYSCALL) &&
+	    !(old_utrace_flags & UTRACE_EVENT_SYSCALL))
+		set_tsk_thread_flag(target, TIF_SYSCALL_TRACE);
+
+	ret = 0;
+	if ((old_flags & ~events) && target != current &&
+	    !task_is_stopped_or_traced(target) && !target->exit_state) {
+		/*
+		 * This barrier ensures that our engine->flags changes
+		 * have hit before we examine utrace->reporting,
+		 * pairing with the barrier in start_callback().  If
+		 * @target has not yet hit finish_callback() to clear
+		 * utrace->reporting, we might be in the middle of a
+		 * callback to @engine.
+		 */
+		smp_mb();
+		if (utrace->reporting == engine)
+			ret = -EINPROGRESS;
+	}
+unlock:
+	spin_unlock(&utrace->lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(utrace_set_events);
+
+/*
+ * Asynchronously mark an engine as being detached.
+ *
+ * This must work while the target thread races with us doing
+ * start_callback(), defined below.  It uses smp_rmb() between checking
+ * @engine->flags and using @engine->ops.  Here we change @engine->ops
+ * first, then use smp_wmb() before changing @engine->flags.  This ensures
+ * it can check the old flags before using the old ops, or check the old
+ * flags before using the new ops, or check the new flags before using the
+ * new ops, but can never check the new flags before using the old ops.
+ * Hence, utrace_detached_ops might be used with any old flags in place.
+ * It has report_quiesce() and report_reap() callbacks to handle all cases.
+ */
+static void mark_engine_detached(struct utrace_engine *engine)
+{
+	engine->ops = &utrace_detached_ops;
+	smp_wmb();
+	engine->flags = UTRACE_EVENT(QUIESCE);
+}
+
+/*
+ * Get @target to stop and return true if it is already stopped now.
+ * If we return false, it will make some event callback soonish.
+ * Called with @utrace locked.
+ */
+static bool utrace_do_stop(struct task_struct *target, struct utrace *utrace)
+{
+	if (task_is_stopped(target)) {
+		/*
+		 * Stopped is considered quiescent; when it wakes up, it will
+		 * go through utrace_finish_stop() before doing anything else.
+		 */
+		spin_lock_irq(&target->sighand->siglock);
+		if (likely(task_is_stopped(target)))
+			__set_task_state(target, TASK_TRACED);
+		spin_unlock_irq(&target->sighand->siglock);
+	} else if (utrace->resume > UTRACE_REPORT) {
+		utrace->resume = UTRACE_REPORT;
+		set_notify_resume(target);
+	}
+
+	return task_is_traced(target);
+}
+
+/*
+ * If the target is not dead it should not be in tracing
+ * stop any more.  Wake it unless it's in job control stop.
+ */
+static void utrace_wakeup(struct task_struct *target, struct utrace *utrace)
+{
+	lockdep_assert_held(&utrace->lock);
+	spin_lock_irq(&target->sighand->siglock);
+	if (target->signal->flags & SIGNAL_STOP_STOPPED ||
+	    target->signal->group_stop_count)
+		target->state = TASK_STOPPED;
+	else
+		wake_up_state(target, __TASK_TRACED);
+	spin_unlock_irq(&target->sighand->siglock);
+}
+
+/*
+ * This is called when there might be some detached engines on the list or
+ * some stale bits in @task->utrace_flags.  Clean them up and recompute the
+ * flags.  Returns true if we're now fully detached.
+ *
+ * Called with @utrace->lock held, returns with it released.
+ * After this returns, @utrace might be freed if everything detached.
+ */
+static bool utrace_reset(struct task_struct *task, struct utrace *utrace)
+	__releases(utrace->lock)
+{
+	struct utrace_engine *engine, *next;
+	unsigned long flags = 0;
+	LIST_HEAD(detached);
+
+	splice_attaching(utrace);
+
+	/*
+	 * Update the set of events of interest from the union
+	 * of the interests of the remaining tracing engines.
+	 * For any engine marked detached, remove it from the list.
+	 * We'll collect them on the detached list.
+	 */
+	list_for_each_entry_safe(engine, next, &utrace->attached, entry) {
+		if (engine->ops == &utrace_detached_ops) {
+			engine->ops = NULL;
+			list_move(&engine->entry, &detached);
+		} else {
+			flags |= engine->flags | UTRACE_EVENT(REAP);
+		}
+	}
+
+	if (task->exit_state) {
+		/*
+		 * Once it's already dead, we never install any flags
+		 * except REAP.  When ->exit_state is set and events
+		 * like DEATH are not set, then they never can be set.
+		 * This ensures that utrace_release_task() knows
+		 * positively that utrace_report_death() can never run.
+		 */
+		BUG_ON(utrace->death);
+		flags &= UTRACE_EVENT(REAP);
+	} else if (!(flags & UTRACE_EVENT_SYSCALL) &&
+		   test_tsk_thread_flag(task, TIF_SYSCALL_TRACE)) {
+		clear_tsk_thread_flag(task, TIF_SYSCALL_TRACE);
+	}
+
+	if (!flags) {
+		/*
+		 * No more engines; clear out the utrace state.
+		 */
+		utrace->resume = UTRACE_RESUME;
+		utrace->signal_handler = 0;
+	}
+
+	/*
+	 * If no more engines want it stopped, wake it up.
+	 */
+	if (task_is_traced(task) && !(flags & ENGINE_STOP))
+		utrace_wakeup(task, utrace);
+
+	/*
+	 * In theory spin_lock() doesn't imply rcu_read_lock().
+	 * Once we clear ->utrace_flags this task_struct can go away
+	 * because tracehook_prepare_release_task() path does not take
+	 * utrace->lock when ->utrace_flags == 0.
+	 */
+	rcu_read_lock();
+	task->utrace_flags = flags;
+	spin_unlock(&utrace->lock);
+	rcu_read_unlock();
+
+	put_detached_list(&detached);
+
+	return !flags;
+}
+
+void utrace_finish_stop(void)
+{
+	/*
+	 * If we were task_is_traced() and then SIGKILL'ed, make
+	 * sure we do nothing until the tracer drops utrace->lock.
+	 */
+	if (unlikely(__fatal_signal_pending(current))) {
+		struct utrace *utrace = task_utrace_struct(current);
+		spin_unlock_wait(&utrace->lock);
+	}
+}
+
+/*
+ * Perform %UTRACE_STOP, i.e. block in TASK_TRACED until woken up.
+ * @task == current, @utrace == current->utrace, which is not locked.
+ * We may wake up and return early if SIGKILL arrives, even though
+ * some utrace engine may still want us to stay stopped.
+ */
+static void utrace_stop(struct task_struct *task, struct utrace *utrace,
+			enum utrace_resume_action action)
+{
+relock:
+	spin_lock(&utrace->lock);
+
+	if (action < utrace->resume) {
+		/*
+		 * Ensure a reporting pass when we're resumed.
+		 */
+		utrace->resume = action;
+		if (action == UTRACE_INTERRUPT)
+			set_thread_flag(TIF_SIGPENDING);
+		else
+			set_thread_flag(TIF_NOTIFY_RESUME);
+	}
+
+	/*
+	 * If the ENGINE_STOP bit is clear in utrace_flags, that means
+	 * utrace_reset() ran after we processed some UTRACE_STOP return
+	 * values from callbacks to get here.  If all engines have detached
+	 * or resumed us, we don't stop.  This check doesn't require
+	 * siglock, but it should follow the interrupt/report bookkeeping
+	 * steps (this can matter for UTRACE_RESUME but not UTRACE_DETACH).
+	 */
+	if (unlikely(!(task->utrace_flags & ENGINE_STOP))) {
+		utrace_reset(task, utrace);
+		if (task->utrace_flags & ENGINE_STOP)
+			goto relock;
+		return;
+	}
+
+	/*
+	 * The siglock protects us against signals.  As well as SIGKILL
+	 * waking us up, we must synchronize with the signal bookkeeping
+	 * for stop signals and SIGCONT.
+	 */
+	spin_lock_irq(&task->sighand->siglock);
+
+	if (unlikely(__fatal_signal_pending(task))) {
+		spin_unlock_irq(&task->sighand->siglock);
+		spin_unlock(&utrace->lock);
+		return;
+	}
+
+	__set_current_state(TASK_TRACED);
+
+	/*
+	 * If there is a group stop in progress,
+	 * we must participate in the bookkeeping.
+	 */
+	if (unlikely(task->signal->group_stop_count) &&
+			!--task->signal->group_stop_count)
+		task->signal->flags = SIGNAL_STOP_STOPPED;
+
+	spin_unlock_irq(&task->sighand->siglock);
+	spin_unlock(&utrace->lock);
+
+	schedule();
+
+	utrace_finish_stop();
+
+	/*
+	 * While in TASK_TRACED, we were considered "frozen enough".
+	 * Now that we woke up, it's crucial if we're supposed to be
+	 * frozen that we freeze now before running anything substantial.
+	 */
+	try_to_freeze();
+
+	/*
+	 * While we were in TASK_TRACED, complete_signal() considered
+	 * us "uninterested" in signal wakeups.  Now make sure our
+	 * TIF_SIGPENDING state is correct for normal running.
+	 */
+	spin_lock_irq(&task->sighand->siglock);
+	recalc_sigpending();
+	spin_unlock_irq(&task->sighand->siglock);
+}
+
+/*
+ * Called by release_task() with @reap set to true.
+ * Called by utrace_report_death() with @reap set to false.
+ * On reap, make report_reap callbacks and clean out @utrace
+ * unless still making callbacks.  On death, update bookkeeping
+ * and handle the reap work if release_task() came in first.
+ */
+void utrace_maybe_reap(struct task_struct *target, struct utrace *utrace,
+		       bool reap)
+{
+	struct utrace_engine *engine, *next;
+	struct list_head attached;
+
+	spin_lock(&utrace->lock);
+
+	if (reap) {
+		/*
+		 * If the target will do some final callbacks but hasn't
+		 * finished them yet, we know because it clears these event
+		 * bits after it's done.  Instead of cleaning up here and
+		 * requiring utrace_report_death() to cope with it, we
+		 * delay the REAP report and the teardown until after the
+		 * target finishes its death reports.
+		 */
+		utrace->reap = 1;
+
+		if (target->utrace_flags & _UTRACE_DEATH_EVENTS) {
+			spin_unlock(&utrace->lock);
+			return;
+		}
+	} else {
+		/*
+		 * After we unlock with this flag clear, any competing
+		 * utrace_control/utrace_set_events calls know that we've
+		 * finished our callbacks and any detach bookkeeping.
+		 */
+		utrace->death = 0;
+
+		if (!utrace->reap) {
+			/*
+			 * We're just dead, not reaped yet.  This will
+			 * reset @target->utrace_flags so the later call
+			 * with @reap set won't hit the check above.
+			 */
+			utrace_reset(target, utrace);
+			return;
+		}
+	}
+
+	/*
+	 * utrace_add_engine() checks ->utrace_flags != 0.  Since
+	 * @utrace->reap is set, nobody can set or clear UTRACE_EVENT(REAP)
+	 * in @engine->flags or change @engine->ops and nobody can change
+	 * @utrace->attached after we drop the lock.
+	 */
+	target->utrace_flags = 0;
+
+	/*
+	 * We clear out @utrace->attached before we drop the lock so
+	 * that find_matching_engine() can't come across any old engine
+	 * while we are busy tearing it down.
+	 */
+	list_replace_init(&utrace->attached, &attached);
+	list_splice_tail_init(&utrace->attaching, &attached);
+
+	spin_unlock(&utrace->lock);
+
+	list_for_each_entry_safe(engine, next, &attached, entry) {
+		if (engine->flags & UTRACE_EVENT(REAP))
+			engine->ops->report_reap(engine, target);
+
+		engine->ops = NULL;
+		engine->flags = 0;
+		list_del_init(&engine->entry);
+
+		utrace_engine_put(engine);
+	}
+}
+
+/*
+ * You can't do anything to a dead task but detach it.
+ * If release_task() has been called, you can't do that.
+ *
+ * On the exit path, DEATH and QUIESCE event bits are set only
+ * before utrace_report_death() has taken the lock.  At that point,
+ * the death report will come soon, so disallow detach until it's
+ * done.  This prevents us from racing with it detaching itself.
+ *
+ * Called only when @target->exit_state is nonzero.
+ */
+static inline int utrace_control_dead(struct task_struct *target,
+				      struct utrace *utrace,
+				      enum utrace_resume_action action)
+{
+	lockdep_assert_held(&utrace->lock);
+
+	if (action != UTRACE_DETACH || unlikely(utrace->reap))
+		return -ESRCH;
+
+	if (unlikely(utrace->death))
+		/*
+		 * We have already started the death report.  We can't
+		 * prevent the report_death and report_reap callbacks,
+		 * so tell the caller they will happen.
+		 */
+		return -EALREADY;
+
+	return 0;
+}
+
Jesse Keating 7a32965
+/**
+ * utrace_control - control a thread being traced by a tracing engine
+ * @target:		thread to affect
+ * @engine:		attached engine to affect
+ * @action:		&enum utrace_resume_action for thread to do
+ *
+ * This is how a tracing engine asks a traced thread to do something.
+ * This call is controlled by the @action argument, which has the
+ * same meaning as the &enum utrace_resume_action value returned by
+ * event reporting callbacks.
+ *
+ * If @target is already dead (@target->exit_state nonzero),
+ * all actions except %UTRACE_DETACH fail with -%ESRCH.
+ *
+ * The following sections describe each option for the @action argument.
+ *
+ * UTRACE_DETACH:
+ *
+ * After this, the @engine data structure is no longer accessible,
+ * and the thread might be reaped.  The thread will start running
+ * again if it was stopped and no longer has any attached engines
+ * that want it stopped.
+ *
+ * If the @report_reap callback may already have begun, this fails
+ * with -%ESRCH.  If the @report_death callback may already have
+ * begun, this fails with -%EALREADY.
+ *
+ * If @target is not already stopped, then a callback to this engine
+ * might be in progress or about to start on another CPU.  If so,
+ * then this returns -%EINPROGRESS; the detach happens as soon as
+ * the pending callback is finished.  To synchronize after an
+ * -%EINPROGRESS return, see utrace_barrier().
+ *
+ * If @target is properly stopped before utrace_control() is called,
+ * then after successful return it's guaranteed that no more callbacks
+ * to the @engine->ops vector will be made.
+ *
+ * The only exception is %SIGKILL (and exec or group-exit by another
+ * thread in the group), which can cause asynchronous @report_death
+ * and/or @report_reap callbacks even when %UTRACE_STOP was used.
+ * (In that event, this fails with -%ESRCH or -%EALREADY, see above.)
+ *
+ * UTRACE_STOP:
+ *
+ * This asks that @target stop running.  This returns 0 only if
+ * @target is already stopped, either for tracing or for job
+ * control.  Then @target will remain stopped until another
+ * utrace_control() call is made on @engine; @target can be woken
+ * only by %SIGKILL (or equivalent, such as exec or termination by
+ * another thread in the same thread group).
+ *
+ * This returns -%EINPROGRESS if @target is not already stopped.
+ * Then the effect is like %UTRACE_REPORT.  A @report_quiesce or
+ * @report_signal callback will be made soon.  Your callback can
+ * then return %UTRACE_STOP to keep @target stopped.
+ *
+ * This does not interrupt system calls in progress, including ones
+ * that sleep for a long time.  For that, use %UTRACE_INTERRUPT.
+ * To interrupt system calls and then keep @target stopped, your
+ * @report_signal callback can return %UTRACE_STOP.
+ *
+ * UTRACE_RESUME:
+ *
+ * Just let @target continue running normally, reversing the effect
+ * of a previous %UTRACE_STOP.  If another engine is keeping @target
+ * stopped, then it remains stopped until all engines let it resume.
+ * If @target was not stopped, this has no effect.
+ *
+ * UTRACE_REPORT:
+ *
+ * This is like %UTRACE_RESUME, but also ensures that there will be
+ * a @report_quiesce or @report_signal callback made soon.  If
+ * @target had been stopped, then there will be a callback before it
+ * resumes running normally.  If another engine is keeping @target
+ * stopped, then there might be no callbacks until all engines let
+ * it resume.
+ *
+ * Since this is meaningless unless @report_quiesce callbacks will
+ * be made, it returns -%EINVAL if @engine lacks %UTRACE_EVENT(%QUIESCE).
+ *
+ * UTRACE_INTERRUPT:
+ *
+ * This is like %UTRACE_REPORT, but ensures that @target will make a
+ * @report_signal callback before it resumes or delivers signals.
+ * If @target was in a system call or about to enter one, work in
+ * progress will be interrupted as if by %SIGSTOP.  If another
+ * engine is keeping @target stopped, then there might be no
+ * callbacks until all engines let it resume.
+ *
+ * This gives @engine an opportunity to introduce a forced signal
+ * disposition via its @report_signal callback.
+ *
+ * UTRACE_SINGLESTEP:
+ *
+ * It's invalid to use this unless arch_has_single_step() returned true.
+ * This is like %UTRACE_RESUME, but resumes for one user instruction only.
+ *
+ * Note that passing %UTRACE_SINGLESTEP or %UTRACE_BLOCKSTEP to
+ * utrace_control() or returning it from an event callback alone does
+ * not necessarily ensure that stepping will be enabled.  If there are
+ * more callbacks made to any engine before returning to user mode,
+ * then the resume action is chosen only by the last set of callbacks.
+ * To be sure, enable %UTRACE_EVENT(%QUIESCE) and look for the
+ * @report_quiesce callback with a zero event mask, or the
+ * @report_signal callback with %UTRACE_SIGNAL_REPORT.
+ *
+ * Since this is not robust unless @report_quiesce callbacks will
+ * be made, it returns -%EINVAL if @engine lacks %UTRACE_EVENT(%QUIESCE).
+ *
+ * UTRACE_BLOCKSTEP:
+ *
+ * It's invalid to use this unless arch_has_block_step() returned true.
+ * This is like %UTRACE_SINGLESTEP, but resumes for one whole basic
+ * block of user instructions.
+ *
+ * Since this is not robust unless @report_quiesce callbacks will
+ * be made, it returns -%EINVAL if @engine lacks %UTRACE_EVENT(%QUIESCE).
+ *
+ * %UTRACE_BLOCKSTEP devolves to %UTRACE_SINGLESTEP when another
+ * tracing engine is using %UTRACE_SINGLESTEP at the same time.
+ */
Jesse Keating 7a32965
+int utrace_control(struct task_struct *target,
+		   struct utrace_engine *engine,
+		   enum utrace_resume_action action)
+{
+	struct utrace *utrace;
+	bool reset;
+	int ret;
+
+	if (unlikely(action >= UTRACE_RESUME_MAX)) {
+		WARN(1, "invalid action argument to utrace_control()!");
+		return -EINVAL;
+	}
+
+	/*
+	 * This is a sanity check for a programming error in the caller.
+	 * Their request can only work properly in all cases by relying on
+	 * a follow-up callback, but they didn't set one up!  This check
+	 * doesn't do locking, but it shouldn't matter.  The caller has to
+	 * be synchronously sure the callback is set up to be operating the
+	 * interface properly.
+	 */
+	if (action >= UTRACE_REPORT && action < UTRACE_RESUME &&
+	    unlikely(!(engine->flags & UTRACE_EVENT(QUIESCE)))) {
+		WARN(1, "utrace_control() with no QUIESCE callback in place!");
+		return -EINVAL;
+	}
+
+	utrace = get_utrace_lock(target, engine, true);
+	if (unlikely(IS_ERR(utrace)))
+		return PTR_ERR(utrace);
+
+	reset = task_is_traced(target);
+	ret = 0;
+
+	/*
+	 * ->exit_state can change under us, this doesn't matter.
+	 * We do not care about ->exit_state in fact, but we do
+	 * care about ->reap and ->death. If either flag is set,
+	 * we must also see ->exit_state != 0.
+	 */
+	if (unlikely(target->exit_state)) {
+		ret = utrace_control_dead(target, utrace, action);
+		if (ret) {
+			spin_unlock(&utrace->lock);
+			return ret;
+		}
+		reset = true;
+	}
+
+	switch (action) {
+	case UTRACE_STOP:
+		mark_engine_wants_stop(target, engine);
+		if (!reset && !utrace_do_stop(target, utrace))
+			ret = -EINPROGRESS;
+		reset = false;
+		break;
+
+	case UTRACE_DETACH:
+		if (engine_wants_stop(engine))
+			target->utrace_flags &= ~ENGINE_STOP;
+		mark_engine_detached(engine);
+		reset = reset || utrace_do_stop(target, utrace);
+		if (!reset) {
+			/*
+			 * As in utrace_set_events(), this barrier ensures
+			 * that our engine->flags changes have hit before we
+			 * examine utrace->reporting, pairing with the barrier
+			 * in start_callback().  If @target has not yet hit
+			 * finish_callback() to clear utrace->reporting, we
+			 * might be in the middle of a callback to @engine.
+			 */
+			smp_mb();
+			if (utrace->reporting == engine)
+				ret = -EINPROGRESS;
+		}
+		break;
+
+	case UTRACE_RESUME:
+		/*
+		 * This and all other cases imply resuming if stopped.
+		 * There might not be another report before it just
+		 * resumes, so make sure single-step is not left set.
+		 */
+		clear_engine_wants_stop(engine);
+		if (likely(reset))
+			user_disable_single_step(target);
+		break;
+
+	case UTRACE_BLOCKSTEP:
+		/*
+		 * Resume from stopped, step one block.
+		 * We fall through to treat it like UTRACE_SINGLESTEP.
+		 */
+		if (unlikely(!arch_has_block_step())) {
+			WARN(1, "UTRACE_BLOCKSTEP when !arch_has_block_step()");
+			action = UTRACE_SINGLESTEP;
+		}
+
+	case UTRACE_SINGLESTEP:
+		/*
+		 * Resume from stopped, step one instruction.
+		 * We fall through to the UTRACE_REPORT case.
+		 */
+		if (unlikely(!arch_has_single_step())) {
+			WARN(1,
+			     "UTRACE_SINGLESTEP when !arch_has_single_step()");
+			reset = false;
+			ret = -EOPNOTSUPP;
+			break;
+		}
+
+	case UTRACE_REPORT:
+		/*
+		 * Make the thread call tracehook_notify_resume() soon.
+		 * But don't bother if it's already been interrupted.
+		 * In that case, utrace_get_signal() will be reporting soon.
+		 */
+		clear_engine_wants_stop(engine);
+		if (action < utrace->resume) {
+			utrace->resume = action;
+			set_notify_resume(target);
+		}
+		break;
+
+	case UTRACE_INTERRUPT:
+		/*
+		 * Make the thread call tracehook_get_signal() soon.
+		 */
+		clear_engine_wants_stop(engine);
+		if (utrace->resume == UTRACE_INTERRUPT)
+			break;
+		utrace->resume = UTRACE_INTERRUPT;
+
+		/*
+		 * If it's not already stopped, interrupt it now.  We need
+		 * the siglock here in case it calls recalc_sigpending()
+		 * and clears its own TIF_SIGPENDING.  By taking the lock,
+		 * we've serialized any later recalc_sigpending() after our
+		 * setting of utrace->resume to force it on.
+		 */
+		if (reset) {
+			/*
+			 * This is really just to keep the invariant that
+			 * TIF_SIGPENDING is set with UTRACE_INTERRUPT.
+			 * When it's stopped, we know it's always going
+			 * through utrace_get_signal() and will recalculate.
+			 */
+			set_tsk_thread_flag(target, TIF_SIGPENDING);
+		} else {
+			struct sighand_struct *sighand;
+			unsigned long irqflags;
+			sighand = lock_task_sighand(target, &irqflags);
+			if (likely(sighand)) {
+				signal_wake_up(target, 0);
+				unlock_task_sighand(target, &irqflags);
+			}
+		}
+		break;
+
+	default:
+		BUG();		/* We checked it on entry.  */
+	}
+
+	/*
+	 * Let the thread resume running.  If it's not stopped now,
+	 * there is nothing more we need to do.
+	 */
+	if (reset)
+		utrace_reset(target, utrace);
+	else
+		spin_unlock(&utrace->lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(utrace_control);
+
Jesse Keating 7a32965
+/**
+ * utrace_barrier - synchronize with simultaneous tracing callbacks
+ * @target:		thread to affect
+ * @engine:		engine to affect (can be detached)
+ *
+ * This blocks while @target might be in the midst of making a callback to
+ * @engine.  It can be interrupted by signals and will return -%ERESTARTSYS.
+ * A return value of zero means no callback from @target to @engine was
+ * in progress.  Any effect of its return value (such as %UTRACE_STOP) has
+ * already been applied to @engine.
+ *
+ * It's not necessary to keep the @target pointer alive for this call.
+ * It's only necessary to hold a ref on @engine.  This will return
+ * safely even if @target has been reaped and has no task refs.
+ *
+ * A successful return from utrace_barrier() guarantees its ordering
+ * with respect to utrace_set_events() and utrace_control() calls.  If
+ * @target was not properly stopped, event callbacks just disabled might
+ * still be in progress; utrace_barrier() waits until there is no chance
+ * an unwanted callback can be in progress.
+ */
+int utrace_barrier(struct task_struct *target, struct utrace_engine *engine)
+{
+	struct utrace *utrace;
+	int ret = -ERESTARTSYS;
+
+	if (unlikely(target == current))
+		return 0;
+
+	do {
+		utrace = get_utrace_lock(target, engine, false);
+		if (unlikely(IS_ERR(utrace))) {
+			ret = PTR_ERR(utrace);
+			if (ret != -ERESTARTSYS)
+				break;
+		} else {
+			/*
+			 * All engine state changes are done while
+			 * holding the lock, i.e. before we get here.
+			 * Since we have the lock, we only need to
+			 * worry about @target making a callback.
+			 * When it has entered start_callback() but
+			 * not yet gotten to finish_callback(), we
+			 * will see utrace->reporting == @engine.
+			 * When @target doesn't take the lock, it uses
+			 * barriers to order setting utrace->reporting
+			 * before it examines the engine state.
+			 */
+			if (utrace->reporting != engine)
+				ret = 0;
+			spin_unlock(&utrace->lock);
+			if (!ret)
+				break;
+		}
+		schedule_timeout_interruptible(1);
+	} while (!signal_pending(current));
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(utrace_barrier);
+
Jesse Keating 7a32965
+/*
+ * This is local state used for reporting loops, perhaps optimized away.
+ */
+struct utrace_report {
+	u32 result;
+	enum utrace_resume_action action;
+	enum utrace_resume_action resume_action;
+	bool detaches;
+	bool spurious;
+};
+
+#define INIT_REPORT(var)			\
+	struct utrace_report var = {		\
+		.action = UTRACE_RESUME,	\
+		.resume_action = UTRACE_RESUME,	\
+		.spurious = true		\
+	}
+
+/*
+ * We are now making the report, so clear the flag saying we need one.
+ * When there is a new attach, ->pending_attach is set just so we will
+ * know to do splice_attaching() here before the callback loop.
+ */
+static enum utrace_resume_action start_report(struct utrace *utrace)
+{
+	enum utrace_resume_action resume = utrace->resume;
+	if (utrace->pending_attach ||
+	    (resume > UTRACE_INTERRUPT && resume < UTRACE_RESUME)) {
+		spin_lock(&utrace->lock);
+		splice_attaching(utrace);
+		resume = utrace->resume;
+		if (resume > UTRACE_INTERRUPT)
+			utrace->resume = UTRACE_RESUME;
+		spin_unlock(&utrace->lock);
+	}
+	return resume;
+}
+
+static inline void finish_report_reset(struct task_struct *task,
+				       struct utrace *utrace,
+				       struct utrace_report *report)
+{
+	if (unlikely(report->spurious || report->detaches)) {
+		spin_lock(&utrace->lock);
+		if (utrace_reset(task, utrace))
+			report->action = UTRACE_RESUME;
+	}
+}
+
+/*
+ * Complete a normal reporting pass, pairing with a start_report() call.
+ * This handles any UTRACE_DETACH or UTRACE_REPORT or UTRACE_INTERRUPT
+ * returns from engine callbacks.  If @will_not_stop is true and any
+ * engine's last callback used UTRACE_STOP, we do UTRACE_REPORT here to
+ * ensure we stop before user mode.  If there were no callbacks made, it
+ * will recompute @task->utrace_flags to avoid another false-positive.
+ */
+static void finish_report(struct task_struct *task, struct utrace *utrace,
+			  struct utrace_report *report, bool will_not_stop)
+{
+	enum utrace_resume_action resume = report->action;
+
+	if (resume == UTRACE_STOP)
+		resume = will_not_stop ? UTRACE_REPORT : UTRACE_RESUME;
+
+	if (resume < utrace->resume) {
+		spin_lock(&utrace->lock);
+		utrace->resume = resume;
+		if (resume == UTRACE_INTERRUPT)
+			set_tsk_thread_flag(task, TIF_SIGPENDING);
+		else
+			set_tsk_thread_flag(task, TIF_NOTIFY_RESUME);
+		spin_unlock(&utrace->lock);
+	}
+
+	finish_report_reset(task, utrace, report);
+}
+
Jesse Keating 7a32965
+static void finish_callback_report(struct task_struct *task,
+				   struct utrace *utrace,
+				   struct utrace_report *report,
+				   struct utrace_engine *engine,
+				   enum utrace_resume_action action)
+{
+	if (action == UTRACE_DETACH) {
+		/*
+		 * By holding the lock here, we make sure that
+		 * utrace_barrier() (really get_utrace_lock()) sees the
+		 * effect of this detach.  Otherwise utrace_barrier() could
+		 * return 0 after this callback had returned UTRACE_DETACH.
+		 * This way, a 0 return is an unambiguous indicator that any
+		 * callback returning UTRACE_DETACH has indeed caused detach.
+		 */
+		spin_lock(&utrace->lock);
+		engine->ops = &utrace_detached_ops;
+		spin_unlock(&utrace->lock);
+	}
+
+	/*
+	 * If utrace_control() was used, treat that like UTRACE_DETACH here.
+	 */
+	if (engine->ops == &utrace_detached_ops) {
+		report->detaches = true;
+		return;
+	}
+
+	if (action < report->action)
+		report->action = action;
+
+	if (action != UTRACE_STOP) {
+		if (action < report->resume_action)
+			report->resume_action = action;
+
+		if (engine_wants_stop(engine)) {
+			spin_lock(&utrace->lock);
+			clear_engine_wants_stop(engine);
+			spin_unlock(&utrace->lock);
+		}
+
+		return;
+	}
+
+	if (!engine_wants_stop(engine)) {
+		spin_lock(&utrace->lock);
+		/*
+		 * If utrace_control() came in and detached us
+		 * before we got the lock, we must not stop now.
+		 */
+		if (unlikely(engine->ops == &utrace_detached_ops))
+			report->detaches = true;
+		else
+			mark_engine_wants_stop(task, engine);
+		spin_unlock(&utrace->lock);
+	}
+}
+
Jesse Keating 7a32965
+
Jesse Keating 7a32965
+/*
Jesse Keating 7a32965
+ * Apply the return value of one engine callback to @report.
Jesse Keating 7a32965
+ * Returns true if @engine detached and should not get any more callbacks.
Jesse Keating 7a32965
+ */
Jesse Keating 7a32965
+static bool finish_callback(struct task_struct *task, struct utrace *utrace,
Jesse Keating 7a32965
+			    struct utrace_report *report,
Jesse Keating 7a32965
+			    struct utrace_engine *engine,
Jesse Keating 7a32965
+			    u32 ret)
Jesse Keating 7a32965
+{
Jesse Keating 7a32965
+	report->result = ret & ~UTRACE_RESUME_MASK;
Jesse Keating 7a32965
+	finish_callback_report(task, utrace, report, engine,
Jesse Keating 7a32965
+			       utrace_resume_action(ret));
Jesse Keating 7a32965
+
Jesse Keating 7a32965
+	/*
Jesse Keating 7a32965
+	 * Now that we have applied the effect of the return value,
Jesse Keating 7a32965
+	 * clear this so that utrace_barrier() can stop waiting.
Jesse Keating 7a32965
+	 * A subsequent utrace_control() can stop or resume @engine
Jesse Keating 7a32965
+	 * and know this was ordered after its callback's action.
Jesse Keating 7a32965
+	 *
Jesse Keating 7a32965
+	 * We don't need any barriers here because utrace_barrier()
Jesse Keating 7a32965
+	 * takes utrace->lock.  If we touched engine->flags above,
Jesse Keating 7a32965
+	 * the lock guaranteed this change was before utrace_barrier()
Jesse Keating 7a32965
+	 * examined utrace->reporting.
Jesse Keating 7a32965
+	 */
Jesse Keating 7a32965
+	utrace->reporting = NULL;
Jesse Keating 7a32965
+
Jesse Keating 7a32965
+	/*
Jesse Keating 7a32965
+	 * We've just done an engine callback.  These are allowed to sleep,
Jesse Keating 7a32965
+	 * though all well-behaved ones restrict that to blocking kalloc()
Jesse Keating 7a32965
+	 * or quickly-acquired mutex_lock() and the like.  This is a good
Jesse Keating 7a32965
+	 * place to make sure tracing engines don't introduce too much
Jesse Keating 7a32965
+	 * latency under voluntary preemption.
Jesse Keating 7a32965
+	 */
Jesse Keating 7a32965
+	might_sleep();
Jesse Keating 7a32965
+
Jesse Keating 7a32965
+	return engine->ops == &utrace_detached_ops;
Jesse Keating 7a32965
+}
Jesse Keating 7a32965
+
+/*
+ * Start the callbacks for @engine to consider @event (a bit mask).
+ * This makes the report_quiesce() callback first.  If @engine wants
+ * a specific callback for @event, we return the ops vector to use.
+ * If not, we return NULL.  The return value from the ops->callback
+ * function called should be passed to finish_callback().
+ */
+static const struct utrace_engine_ops *start_callback(
+	struct utrace *utrace, struct utrace_report *report,
+	struct utrace_engine *engine, struct task_struct *task,
+	unsigned long event)
+{
+	const struct utrace_engine_ops *ops;
+	unsigned long want;
+
+	/*
+	 * This barrier ensures that we've set utrace->reporting before
+	 * we examine engine->flags or engine->ops.  utrace_barrier()