keiths / rpms / gdb

Forked from rpms/gdb 5 months ago
Clone
d7bd836
http://sourceware.org/ml/gdb-patches/2009-12/msg00364.html
d7bd836
Subject: [patch] related_breakpoint stale ref crash fix
d7bd836
d7bd836
Hi,
d7bd836
d7bd836
getting occasional random:
d7bd836
 PASS: gdb.threads/local-watch-wrong-thread.exp: local watchpoint still triggers
d7bd836
 PASS: gdb.threads/local-watch-wrong-thread.exp: let thread_function0 return
d7bd836
 PASS: gdb.threads/local-watch-wrong-thread.exp: breakpoint on thread_function0's caller
d7bd836
-PASS: gdb.threads/local-watch-wrong-thread.exp: local watchpoint automatically deleted
d7bd836
+ERROR: Process no longer exists
d7bd836
+UNRESOLVED: gdb.threads/local-watch-wrong-thread.exp: local watchpoint automatically deleted
d7bd836
d7bd836
It is even reproducible on HEAD using "input" file below and:
d7bd836
	valgrind ../gdb -nx 
d7bd836
d7bd836
(gdb) (gdb) Breakpoint 6 at 0x400685: file ./gdb.threads/local-watch-wrong-thread.c, line 47.
d7bd836
(gdb) ==31759== Invalid write of size 4
d7bd836
==31759==    at 0x601A11: bpstat_check_breakpoint_conditions (breakpoint.c:3482)
d7bd836
==31759==    by 0x601C70: bpstat_stop_status (breakpoint.c:3596)
d7bd836
==31759==    by 0x65D228: handle_inferior_event (infrun.c:3589)
d7bd836
==31759==    by 0x65A563: wait_for_inferior (infrun.c:2281)
d7bd836
==31759==    by 0x659AA0: proceed (infrun.c:1883)
d7bd836
==31759==    by 0x65300B: continue_1 (infcmd.c:668)
d7bd836
==31759==    by 0x653282: continue_command (infcmd.c:760)
d7bd836
==31759==    by 0x5C51D8: do_cfunc (cli-decode.c:67)
d7bd836
==31759==    by 0x5C824F: cmd_func (cli-decode.c:1738)
d7bd836
==31759==    by 0x48335A: execute_command (top.c:450)
d7bd836
==31759==    by 0x67273E: command_handler (event-top.c:511)
d7bd836
==31759==    by 0x672E53: command_line_handler (event-top.c:736)
d7bd836
==31759==  Address 0xbbdc950 is 240 bytes inside a block of size 336 free'd
d7bd836
==31759==    at 0x4A04D72: free (vg_replace_malloc.c:325)
d7bd836
==31759==    by 0x486E4B: xfree (utils.c:1286)
d7bd836
==31759==    by 0x60BC35: delete_breakpoint (breakpoint.c:8708)
d7bd836
==31759==    by 0x60BDAF: delete_command (breakpoint.c:8765)
d7bd836
==31759==    by 0x5C51D8: do_cfunc (cli-decode.c:67)
d7bd836
==31759==    by 0x5C824F: cmd_func (cli-decode.c:1738)
d7bd836
==31759==    by 0x48335A: execute_command (top.c:450)
d7bd836
==31759==    by 0x67273E: command_handler (event-top.c:511)
d7bd836
==31759==    by 0x672E53: command_line_handler (event-top.c:736)
d7bd836
==31759==    by 0x672FCF: gdb_readline2 (event-top.c:817)
d7bd836
==31759==    by 0x6725F7: stdin_event_handler (event-top.c:433)
d7bd836
==31759==    by 0x670CDE: handle_file_event (event-loop.c:812)
d7bd836
==31759== 
d7bd836
d7bd836
Watchpoint 4 deleted because the program has left the block in
d7bd836
which its expression is valid.
d7bd836
d7bd836
d7bd836
There is already automatic deletion of RELATED_BREAKPOINT in
d7bd836
map_breakpoint_numbers but "delete breakpoints" (for all the breakpoints)
d7bd836
calls delete_breakpoint from delete_command directly without calling
d7bd836
map_breakpoint_numbers and it does not delete the associated
d7bd836
bp_watchpoint_scope.
d7bd836
d7bd836
I find the attached patch is right for delete_breakpoint itself as such
d7bd836
function should not leave stale references in the leftover data structures.
d7bd836
How well could be other code cleaned up with this patch in place I have not
d7bd836
targeted by this patch.
d7bd836
d7bd836
d7bd836
The existing code expects accessibility of freed memory and discusses the
d7bd836
current stale references problem:
d7bd836
d7bd836
void
d7bd836
delete_breakpoint (struct breakpoint *bpt)
d7bd836
[...]
d7bd836
  /* Has this bp already been deleted?  This can happen because multiple
d7bd836
     lists can hold pointers to bp's.  bpstat lists are especial culprits.
d7bd836
d7bd836
     One example of this happening is a watchpoint's scope bp.  When the
d7bd836
     scope bp triggers, we notice that the watchpoint is out of scope, and
d7bd836
     delete it.  We also delete its scope bp.  But the scope bp is marked
d7bd836
     "auto-deleting", and is already on a bpstat.  That bpstat is then
d7bd836
     checked for auto-deleting bp's, which are deleted.
d7bd836
d7bd836
     A real solution to this problem might involve reference counts in bp's,
d7bd836
     and/or giving them pointers back to their referencing bpstat's, and
d7bd836
     teaching delete_breakpoint to only free a bp's storage when no more
d7bd836
     references were extent.  A cheaper bandaid was chosen.  */
d7bd836
  if (bpt->type == bp_none)
d7bd836
    return;
d7bd836
[...]
d7bd836
  bpt->type = bp_none;
d7bd836
d7bd836
  xfree (bpt);
d7bd836
}
d7bd836
d7bd836
d7bd836
While fixing this part may be difficult I find the attached patch easy enough
d7bd836
fixing the IMO currently most common crash due to it.
d7bd836
d7bd836
No regressions on {x86_64,x86_64-m32,i686}-fedora12-linux-gnu.
d7bd836
d7bd836
d7bd836
Thanks,
d7bd836
Jan
d7bd836
d7bd836
d7bd836
"input":
d7bd836
set height 0
d7bd836
set width 0
d7bd836
set confirm no
d7bd836
file ../testsuite/gdb.threads/local-watch-wrong-thread
d7bd836
set can-use-hw-watchpoints 1
d7bd836
break main
d7bd836
run
d7bd836
break local-watch-wrong-thread.c:36
d7bd836
continue
d7bd836
delete breakpoints
d7bd836
watch *myp
d7bd836
continue
d7bd836
delete breakpoints
d7bd836
echo MAKE watch\n
d7bd836
watch *myp if trigger != 0
d7bd836
echo MAKE break\n
d7bd836
break local-watch-wrong-thread.c:60
d7bd836
info break
d7bd836
continue
d7bd836
echo DELETE five\n
d7bd836
delete 5
d7bd836
set trigger=1
d7bd836
continue
d7bd836
set *myp=0
d7bd836
break local-watch-wrong-thread.c:47
d7bd836
continue
d7bd836
d7bd836
d7bd836
d7bd836
2009-12-23  Jan Kratochvil  <jan.kratochvil@redhat.com>
d7bd836
d7bd836
	* breakpoint.c (delete_breakpoint <bpt->related_breakpoint != NULL>):
d7bd836
	New.
d7bd836
d7bd836
--- a/gdb/breakpoint.c
d7bd836
+++ b/gdb/breakpoint.c
d7bd836
@@ -8649,6 +8649,16 @@ delete_breakpoint (struct breakpoint *bpt)
d7bd836
   if (bpt->type == bp_none)
d7bd836
     return;
d7bd836
 
d7bd836
+  /* At least avoid this stale reference until the reference counting of
d7bd836
+     breakpoints gets resolved.  */
d7bd836
+  if (bpt->related_breakpoint != NULL)
d7bd836
+    {
d7bd836
+      gdb_assert (bpt->related_breakpoint->related_breakpoint == bpt);
d7bd836
+      bpt->related_breakpoint->disposition = disp_del_at_next_stop;
d7bd836
+      bpt->related_breakpoint->related_breakpoint = NULL;
d7bd836
+      bpt->related_breakpoint = NULL;
d7bd836
+    }
d7bd836
+
d7bd836
   observer_notify_breakpoint_deleted (bpt->number);
d7bd836
 
d7bd836
   if (breakpoint_chain == bpt)
d7bd836