pdb has been, is and probably always will be the bread and butter of Python programmers when they need to find the root cause of a problem in their applications — it’s a built-in and easy to use debugger. But there are cases when pdb can’t help you, e.g. if your app is stuck somewhere, and you want to attach to a running process to find out why, without restarting it. This is where gdb shines.
Why gdb? Link to heading
gdb is a general purpose debugger that is mostly used for debugging C and C++ applications (although it actually supports Ada, Objective-C, Pascal and many other languages).
There are different reasons why a Python programmer would be interested in gdb for debugging:
gdb allows attaching to a running process without starting the app in the debug mode or modifying the source code (e.g. adding something like
import rpdb; rpdb.set_trace()
)gdb allows taking a core dump of a process and analyzing it later. This is useful when you don’t want to stop the process while you are introspecting its state, as well as when you perform post-mortem debugging of a process that has already failed (e.g. crashed with a segmentation fault)
Most debuggers available for Python (notable exceptions are winpdb and pydevd) do not support switching between threads of the application being debugged. gdb allows that, as well as debugging threads created by non-Python code (e.g. in a native library)
Debugging of interpreted languages Link to heading
So what makes Python special when using gdb?
Unlike with programming languages like C or C++, Python code is not compiled into a native binary for a target platform. Instead, there is an interpreter (e.g. CPython, the reference implementation of Python) which executes compiled byte-code.
This effectively means that when you attach to a Python process with gdb, you will be debugging the interpreter instance and introspecting the process state at the interpreter rather than the application level, i.e. you will see functions and variables of the interpreter, not of your app.
To give you an example, let’s take a look at the gdb backtrace of a CPython process:
#0 0x00007fcce9b2faf3 in __epoll_wait_nocancel () at ../sysdeps/unix/syscall-template.S:81
#1 0x0000000000435ef8 in pyepoll_poll (self=0x7fccdf54f240, args=<optimized out>, kwds=<optimized out>) at ../Modules/selectmodule.c:1034
#2 0x000000000049968d in call_function (oparg=<optimized out>, pp_stack=0x7ffc20d7bfb0) at ../Python/ceval.c:4020
#3 PyEval_EvalFrameEx () at ../Python/ceval.c:2666
#4 0x0000000000499ef2 in fast_function () at ../Python/ceval.c:4106
#5 call_function () at ../Python/ceval.c:4041
#6 PyEval_EvalFrameEx () at ../Python/ceval.c:2666
and the one obtained by calling traceback.extract_stack()
in Python:
/usr/local/lib/python2.7/dist-packages/eventlet/greenpool.py:82 in _spawn_n_impl
`func(*args, **kwargs)`
/opt/stack/neutron/neutron/agent/l3/agent.py:461 in _process_router_update
`for rp, update in self._queue.each_update_to_next_router():`
/opt/stack/neutron/neutron/agent/l3/router_processing_queue.py:154 in each_update_to_next_router
`next_update = self._queue.get()`
/usr/local/lib/python2.7/dist-packages/eventlet/queue.py:313 in get
`return waiter.wait()`
/usr/local/lib/python2.7/dist-packages/eventlet/queue.py:141 in wait
`return get_hub().switch()`
/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py:294 in switch
`return self.greenlet.switch()`
As is, the former is of little help when you are trying to find a problem in your Python code, and all you see is the current state of the interpreter itself.
However, the PyEval_EvalFrameEx function looks interesting: in CPython it executes the bytecode of Python application level functions and, thus, has access to their state — the very state we are interested in.
gdb and Python Link to heading
Search results for "gdb debug python"
can be confusing. Starting from gdb version 7,
it’s been possible to extend it with Python code, e.g. to provide visualisations for C++
STL types — it is much easier to implement that in Python rather than in the built-in
macro language.
To be able to debug Python programs and introspect the application level state, CPython developers decided to extend gdb and wrote a script for that in… Python, of course!
So those are two different but related things:
- gdb versions 7+ are extendable with Python
- There’s a Python gdb extension for debugging CPython processes
Debugging Python with gdb 101 Link to heading
First of all, you need to install gdb:
sudo apt-get install gdb
or
sudo yum install gdb
depending on the Linux distro you are using.
The next step is to install debugging symbols for the CPython build you have:
sudo apt-get install python-dbg
or
sudo yum install python-debuginfo
Some Linux distros like CentOS or RHEL ship debugging symbols separately from all other packages and recommend to install those like:
sudo debuginfo-install python
The installed debugging symbols will be used by CPython’s script for gdb
to analyze PyEval_EvalFrameEx
frames (a frame is a function call and its
associated state in the form of local variable and CPU register values)
and map those to application level functions in your code.
Without debugging symbols it is much harder to do — gdb allows you to manipulate the process memory in any way you want, but you can’t easily understand what data structures reside in what memory areas.
After all the preparatory steps have been completed, you can finally give gdb a try. To attach to a running CPython process, do:
gdb /usr/bin/python -p $PID
At this point, you can get an application level backtrace for the current
thread (note, that some frames are “missing” — this is expected, as gdb
counts all the interpreter level frames and only some of those will correspond
to calls in the application code — the PyEval_EvalFrameEx
calls):
(gdb) py-bt
#4 Frame 0x1b7da60, for file /usr/lib/python2.7/sched.py, line 111, in run (self=<scheduler(timefunc=<built-in function time>, delayfunc=<built-in function sleep>, _queue=[<Event at remote 0x7fe1f8c74a10>]) at remote 0x7fe1fa086758>, q=[...], delayfunc=<built-in function sleep>, timefunc=<built-in function time>, pop=<built-in function heappop>, time=<float at remote 0x1a0a400>, priority=1, action=<function at remote 0x7fe1fa083aa0>, argument=(171657,), checked_event=<...>, now=<float at remote 0x1b8ec58>)
delayfunc(time - now)
#7 Frame 0x1b87e90, for file /usr/bin/dstat, line 2416, in main (interval=1, user='ubuntu', hostname='rpodolyaka-devstack', key='unit_hi', linewidth=150, plugin='page', mods=('page', 'page24'), mod='page', pluginfile='dstat_page', scheduler=<scheduler(timefunc=<built-in function time>, delayfunc=<built-in function sleep>, _queue=[<Event at remote 0x7fe1f8c74a10>]) at remote 0x7fe1fa086758>)
scheduler.run()
#11 Frame 0x7fe1fa0bc5c0, for file /usr/bin/dstat, line 2554, in <module> ()
main()
or find out what exact line of the application code is currently being executed:
(gdb) py-list
106 pop = heapq.heappop
107 while q:
108 time, priority, action, argument = checked_event = q[0]
109 now = timefunc()
110 if now < time:
>111 delayfunc(time - now)
112 else:
113 event = pop(q)
114 # Verify that the event was not removed or altered
115 # by another thread after we last looked at q[0].
116 if event is checked_event:
or look at the values of local variables:
(gdb) py-locals
self = <scheduler(timefunc=<built-in function time>, delayfunc=<built-in function sleep>, _queue=[<Event at remote 0x7fe1f8c74a10>]) at remote 0x7fe1fa086758>
q = [<Event at remote 0x7fe1f8c74a10>]
delayfunc = <built-in function sleep>
timefunc = <built-in function time>
pop = <built-in function heappop>
time = <float at remote 0x1a0a400>
priority = 1
action = <function at remote 0x7fe1fa083aa0>
argument = (171657,)
checked_event = <Event at remote 0x7fe1f8c74a10>
now = <float at remote 0x1b8ec58>
There are more py-
commands provided by the CPython script for gdb.
Check out the debugging guide for details.
Gotchas Link to heading
Although the described technique should work out-of-the-box, there are a few known gotchas.
python-dbg Link to heading
The python-dbg
package in Debian and Ubuntu will not only install the
debugging symbols for the python
binary (which are stripped at the package
build time to save disk space), but also provide an additional CPython binary called
python-dbg
.
The latter essentially is a separate build of CPython (with the --with-pydebug
flag
passed to ./configure
) that has additional run-time checks. Generally, you don’t want
to use python-dbg
in production, as it can be (much) slower than python
. For instance:
$ time python -c "print(sum(range(1, 1000000)))"
499999500000
real 0m0.096s
user 0m0.057s
sys 0m0.030s
$ time python-dbg -c "print(sum(range(1, 1000000)))"
499999500000
[18318 refs]
real 0m0.237s
user 0m0.197s
sys 0m0.016s
The good news is that you don’t need to: it’s still possible to debug the (normal)
python
executable by the means of gdb, as long as the corresponding debugging
symbols are installed. python-dbg
just adds a bit more confusion to the
CPython/gdb story, and you can safely ignore its existence.
Build flags Link to heading
Some Linux distros build CPython with either the -g0
, or the -g1
option passed
to gcc: the former produces a binary without debugging information at all, and the latter
does not allow gdb to get the information about local variables at runtime.
Both of those options break the described workflow of debugging CPython processes
by the means of gdb. The solution is to rebuild CPython with -g
(which is the same
as -g2
).
Fortunately, all current versions of the major Linux distros (Ubuntu Trusty/Xenial, Debian Jessie, CentOS/RHEL 7) ship the “correctly” built CPython.
Optimized out frames Link to heading
For introspection to work properly, it’s crucial that the information about
PyEval_EvalFrameEx
arguments is preserved for each call. Depending on the
optimization level used in gcc when building CPython, or the specific
compiler version used, it’s possible that this information will be lost at
runtime (especially with aggressive optimizations enabled by -O3
). In that
case, gdb will show you something like:
(gdb) bt
#0 0x00007fdf3ca31be3 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:84
#1 0x00000000005d1da4 in pysleep (secs=<optimized out>) at ../Modules/timemodule.c:1408
#2 time_sleep () at ../Modules/timemodule.c:231
#3 0x00000000004f5465 in call_function (oparg=<optimized out>, pp_stack=0x7fff62b184c0) at ../Python/ceval.c:4637
#4 PyEval_EvalFrameEx () at ../Python/ceval.c:3185
#5 0x00000000004f5194 in fast_function (nk=<optimized out>, na=<optimized out>, n=<optimized out>, pp_stack=0x7fff62b185c0,
func=<optimized out>) at ../Python/ceval.c:4750
#6 call_function (oparg=<optimized out>, pp_stack=0x7fff62b185c0) at ../Python/ceval.c:4677
#7 PyEval_EvalFrameEx () at ../Python/ceval.c:3185
#8 0x00000000004f5194 in fast_function (nk=<optimized out>, na=<optimized out>, n=<optimized out>, pp_stack=0x7fff62b186c0,
func=<optimized out>) at ../Python/ceval.c:4750
#9 call_function (oparg=<optimized out>, pp_stack=0x7fff62b186c0) at ../Python/ceval.c:4677
#10 PyEval_EvalFrameEx () at ../Python/ceval.c:3185
#11 0x00000000005c5da8 in _PyEval_EvalCodeWithName.lto_priv.1326 () at ../Python/ceval.c:3965
#12 0x00000000005e9d7f in PyEval_EvalCodeEx () at ../Python/ceval.c:3986
#13 PyEval_EvalCode (co=<optimized out>, globals=<optimized out>, locals=<optimized out>) at ../Python/ceval.c:777
#14 0x00000000005fe3d2 in run_mod () at ../Python/pythonrun.c:970
#15 0x000000000060057a in PyRun_FileExFlags () at ../Python/pythonrun.c:923
#16 0x000000000060075c in PyRun_SimpleFileExFlags () at ../Python/pythonrun.c:396
#17 0x000000000062b870 in run_file (p_cf=0x7fff62b18920, filename=0x1733260 L"test2.py", fp=0x1790190) at ../Modules/main.c:318
#18 Py_Main () at ../Modules/main.c:768
#19 0x00000000004cb8ef in main () at ../Programs/python.c:69
#20 0x00007fdf3c970610 in __libc_start_main (main=0x4cb810 <main>, argc=2, argv=0x7fff62b18b38, init=<optimized out>, fini=<optimized out>,
rtld_fini=<optimized out>, stack_end=0x7fff62b18b28) at libc-start.c:291
#21 0x00000000005c9df9 in _start ()
(gdb) py-bt
Traceback (most recent call first):
File "test2.py", line 9, in g
time.sleep(1000)
File "test2.py", line 5, in f
g()
(frame information optimized out)
i.e. some application level frames will be available, and some will not. There is little you can do at this point, except for rebuilding CPython with a lower optimization level, but that often is not an option for production (not to mention the fact you’ll be using a custom CPython build).
Update: actually, there is something you could do! This frame information optimized out
message simply tells you that gdb wasn’t able to figure out the location of the
PyFrameObject
data structure in a given stack frame (DWARF debugging symbols
allow gdb to calculate the addresses of local variables and function arguments). But
it has to be somewhere in memory — otherwise CPython would not be able to execute
your Python code.
On x86-64 machines the obvious place to check is the CPU registers: there are 16 general purpose CPU registers that compilers can use for storing the values of function call arguments and local variables.
The following command prints the values of all CPU registers in the selected stack frame:
(gdb) info registers
rax 0xfffffffffffffdfe -514
rbx 0x7ffff7fd7c20 140737353972768
rcx 0x7ffff7afaff7 140737348874231
rdx 0x0 0
rsi 0x0 0
rdi 0x0 0
rbp 0x7ffff7fd7d98 0x7ffff7fd7d98
rsp 0x7fffffffe3c0 0x7fffffffe3c0
r8 0x7fffffffe050 140737488347216
r9 0x0 0
r10 0x0 0
r11 0x246 582
r12 0x0 0
r13 0x7ffff7fae050 140737353801808
r14 0x7ffff7fae050 140737353801808
r15 0x0 0
rip 0x5555556468ca 0x5555556468ca <PyEval_EvalCodeEx+1754>
eflags 0x246 [ PF ZF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
But those are just integers. We need to tell gdb how to interpret them.
Note, that some of the numbers above clearly look like memory addresses. We can ask
gdb to interpret the value of a CPU register as a pointer to some data type. We know
that most of CPython runtime data structures are PyObject
’s which store information
about the actual type internally (e.g. the ->ob_type->tp_name
field contains the type
name encoded as a C-string).
So what we will do is try casting the value of each CPU register to PyObject*
and
see if we can find anything useful:
(gdb) p ((PyObject*) $rax)->ob_type->tp_name
Cannot access memory at address 0xfffffffffffffe06
If we give gdb a memory address that does not actually point to a PyObject
instance,
we will get an error on pointer dereference.
There are only so many CPU registers to check. And you can easily automate this search by the means of a helper gdb command similar to:
class LocatePyFrameObject(gdb.Command):
'Locate the CPU register that contains the value of PyFrameObject* in the selected stack frame'
REGISTERS = (
# x86-64 registers, that can be used for storing of local variables and function arguments
'rax', 'rbx', 'rcx', 'rdx',
'rsi', 'rdi',
'rbp', 'rsp',
'r8', 'r9', 'r10', 'r11', 'r12', 'r13', 'r14', 'r15',
)
def __init__(self):
super(LocatePyFrameObject, self).__init__(
'py-locate-frame',
gdb.COMMAND_DATA,
gdb.COMPLETE_NONE
)
def invoke(self, args, from_tty):
gdb_type = PyObjectPtr.get_gdb_type()
frame = gdb.selected_frame()
for register in self.REGISTERS:
try:
value = frame.read_register(register).cast(gdb_type)
if value['ob_type']['tp_name'].string() == 'frame':
print(register)
return
except gdb.MemoryError:
# if either cast or pointer dereference fails, then it's not a valid PyFrameObjectPtr*
continue
LocatePyFrameObject()
E.g., my CPython build puts the pointer to PyFrameObject
to the CPU register RBX:
(gdb) py-locate-frame
rbx
(gdb) p ((PyObject*) $rbx)->ob_type->tp_name
$28 = 0x5555557472ef "frame"
(gdb) p (PyFrameObject*) $rbx
$29 = Frame 0x7ffff7fd7c20, for file test2.py, line 12, in <module> ()
(gdb) p (PyObject*) $rbx
$30 = Frame 0x7ffff7fd7c20, for file test2.py, line 12, in <module> ()
Note, that libpython-gdb.py
enables pretty-printing of the PyFrameObject
structure, as well it is able to figure out a specific type of a given PyObject
automatically. So even if high-level commands like py-bt
do not work on such stack
frames, you will be able to get the very same information by pointing gdb to the
location of PyFrameObject
manually.
Of course, manually poking CPU registers and memory addresses is not pretty, but it can be the only way of debugging “optimized out” frames.
Virtual environments and custom CPython builds Link to heading
When a virtual environment is used, it may appear that the extension does not work:
(gdb) bt
#0 0x00007ff2df3d0be3 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:84
#1 0x0000000000588c4a in ?? ()
#2 0x00000000004bad9a in PyEval_EvalFrameEx ()
#3 0x00000000004bfd1f in PyEval_EvalFrameEx ()
#4 0x00000000004bfd1f in PyEval_EvalFrameEx ()
#5 0x00000000004b8556 in PyEval_EvalCodeEx ()
#6 0x00000000004e91ef in ?? ()
#7 0x00000000004e3d92 in PyRun_FileExFlags ()
#8 0x00000000004e2646 in PyRun_SimpleFileExFlags ()
#9 0x0000000000491c23 in Py_Main ()
#10 0x00007ff2df30f610 in __libc_start_main (main=0x491670 <main>, argc=2, argv=0x7ffc36f11cf8, init=<optimized out>, fini=<optimized out>,
rtld_fini=<optimized out>, stack_end=0x7ffc36f11ce8) at libc-start.c:291
#11 0x000000000049159b in _start ()
(gdb) py-bt
Undefined command: "py-bt". Try "help".
gdb can still follow CPython frames, but the information about PyEval_EvalCodeEx
calls is not available.
If you scroll up the gdb output a bit, you’ll see that gdb failed to find
debugging symbols for the python
executable:
$ gdb -p 2975
GNU gdb (Debian 7.10-1+b1) 7.10
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 2975
Reading symbols from /home/rpodolyaka/workspace/venvs/default/bin/python2...(no debugging symbols found)...done.
How is a virtual environment any different? Why did not gdb find the debugging symbols?
First and foremost, the path to the python
executable is different. Note, that I only
specified the id of the process to attach to. In this case, gdb will take the executable
file of the process (i.e. /proc/$PID/exe
on Linux).
One of the ways to separate debugging symbols is to put those into a well-known
directory (default is /usr/lib/debug/
, although it’s configured using the
debug-file-directory
option in gdb). In our case, gdb tried to load
debugging symbols from /usr/lib/debug/home/rpodolyaka/workspace/venvs/default/bin/python2
and,
obviously, did not find anything there.
The solution is simple – explicitly pass the path to the executable when running gdb:
$ gdb /usr/bin/python2.7 -p $PID
Thus, gdb will look for debugging symbols in the “right” place -
/usr/lib/debug/usr/bin/python2.7
.
It’s also worth mentioning that it’s possible that debugging symbols for a
particular executable are identified by a unique build-id
value stored
in ELF executable headers. E.g. CPython on my Debian machine:
$ objdump -s -j .note.gnu.build-id /usr/bin/python2.7
/usr/bin/python2.7: file format elf64-x86-64
Contents of section .note.gnu.build-id:
400274 04000000 14000000 03000000 474e5500 ............GNU.
400284 8d04a3ae 38521cb7 c7928e4a 7c8b1ed3 ....8R.....J|...
400294 85e763e4
In that case, gdb will look for debugging symbols using the build-id
value:
$ gdb /usr/bin/python2.7
GNU gdb (Debian 7.10-1+b1) 7.10
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/bin/python2.7...Reading symbols from /usr/lib/debug/.build-id/8d/04a3ae38521cb7c7928e4a7c8b1ed385e763e4.debug...done.
done.
This means that it no longer matters how the executable is called:
virtualenv
just creates a copy of the specified interpreter executable, thus,
both executables - the one in /usr/bin/
and the one in your virtual environment
will use the very same debugging symbols:
$ gdb -p 11150
GNU gdb (ebian 7.10-1+b1) 7.10
Copyright () 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "how copying"
and "how warranty" for details.
This GDB was configured as "86_64-linux-gnu".
Type "how configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "elp".
Type "propos word" to search for commands related to "ord".
Attaching to process 11150
Reading symbols from /home/rpodolyaka/sandbox/testvenv/bin/python2.7...Reading symbols from
/usr/lib/debug/.build-id/8d/04a3ae38521cb7c7928e4a7c8b1ed385e763e4.debug...done.
$ ls -la /proc/11150/exe
lrwxrwxrwx 1 rpodolyaka rpodolyaka 0 Apr 10 15:18 /proc/11150/exe -> /home/rpodolyaka/sandbox/testvenv/bin/python2.7
The first problem is solved, bt
output now looks much nicer, but py-bt
command is still
undefined:
(gdb) bt
#0 0x00007f3e95083be3 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:84
#1 0x0000000000594a59 in floatsleep (secs=<optimized out>) at ../Modules/timemodule.c:948
#2 time_sleep.lto_priv () at ../Modules/timemodule.c:206
#3 0x00000000004c524a in call_function (oparg=<optimized out>, pp_stack=0x7ffefb5045b0) at ../Python/ceval.c:4350
#4 PyEval_EvalFrameEx () at ../Python/ceval.c:2987
#5 0x00000000004ca95f in fast_function (nk=<optimized out>, na=<optimized out>, n=<optimized out>, pp_stack=0x7ffefb504700,
func=0x7f3e95f78c80) at ../Python/ceval.c:4435
#6 call_function (oparg=<optimized out>, pp_stack=0x7ffefb504700) at ../Python/ceval.c:4370
#7 PyEval_EvalFrameEx () at ../Python/ceval.c:2987
#8 0x00000000004ca95f in fast_function (nk=<optimized out>, na=<optimized out>, n=<optimized out>, pp_stack=0x7ffefb504850,
func=0x7f3e95f78c08) at ../Python/ceval.c:4435
#9 call_function (oparg=<optimized out>, pp_stack=0x7ffefb504850) at ../Python/ceval.c:4370
#10 PyEval_EvalFrameEx () at ../Python/ceval.c:2987
#11 0x00000000004c32e5 in PyEval_EvalCodeEx () at ../Python/ceval.c:3582
#12 0x00000000004c3089 in PyEval_EvalCode (co=<optimized out>, globals=<optimized out>, locals=<optimized out>) at ../Python/ceval.c:669
#13 0x00000000004f263f in run_mod.lto_priv () at ../Python/pythonrun.c:1376
#14 0x00000000004ecf52 in PyRun_FileExFlags () at ../Python/pythonrun.c:1362
#15 0x00000000004eb6d1 in PyRun_SimpleFileExFlags () at ../Python/pythonrun.c:948
#16 0x000000000049e2d8 in Py_Main () at ../Modules/main.c:640
#17 0x00007f3e94fc2610 in __libc_start_main (main=0x49dc00 <main>, argc=2, argv=0x7ffefb504c98, init=<optimized out>, fini=<optimized out>,
rtld_fini=<optimized out>, stack_end=0x7ffefb504c88) at libc-start.c:291
#18 0x000000000049db29 in _start ()
(gdb) py-bt
Undefined command: "py-bt". Try "help".
Once again, this is caused by the fact that the python
binary in a virtual
environment has a different path. By default, gdb will try to auto-load
Python extensions for a particular object file under debug, if they exist.
Specifically, gdb will look for ${objfile}-gdb.py
and try to source
it on
start:
(gdb) info auto-load
gdb-scripts: No auto-load scripts.
libthread-db: No auto-loaded libthread-db.
local-gdbinit: Local .gdbinit file was not found.
python-scripts:
Loaded Script
Yes /usr/share/gdb/auto-load/usr/bin/python2.7-gdb.py
If this has not been done, you can always do it manually:
(gdb) source /usr/share/gdb/auto-load/usr/bin/python2.7-gdb.py
e.g. if you want to test a new version of the gdb extension shipped with CPython.
PyPy, Jython, etc Link to heading
The described debugging technique is only feasible for the CPython interpreter,
as the gdb extension is specifically written to introspect the state
of CPython internals (e.g. PyEval_EvalFrameEx
calls).
For PyPy there is an open issue on Bitbucket, where it was proposed to provide integration with gdb, but looks like the attached patches have not been merged yet and the person who wrote those lost interest in this.
For Jython you could probably use standard tools for debugging of JVM applications, e.g. VisualVM.
Conclusion Link to heading
gdb is a powerful tool that allows one to debug complex problems with crashing or hanging CPython processes, as well as Python code that does calls to native libraries. On modern Linux distros debugging CPython processes with gdb must be as simple as installing of debugging symbols for the concrete interpreter build, although there are a few known gotchas, especially when virtual environments are used.