Discussion:
[issue18748] libgcc_s.so.1 must be installed for pthread_cancel to work
Maries Ionel Cristian
2013-08-15 13:10:42 UTC
New submission from Maries Ionel Cristian:

Running the file a couple of times makes the interpreter fail with: libgcc_s.so.1 must be installed for pthread_cancel to work
From what I've seen it is triggered from PyThread_delete_key (tries to load libgcc_s.so at that time).
How does it happen?

The main thread closes fd 4 (because the reference count of the fh object drops to 0) at exactly the same time as libpthread tries to open and read libgcc_s.so through the same descriptor (4).
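
The descriptor reuse behind this race can be illustrated with a minimal single-threaded sketch. This is not the attached bug.py: the helper name `demo_fd_reuse` and the use of `os.devnull` are assumptions made for illustration. It relies on the POSIX rule that open() returns the lowest free descriptor number, so a stale number can end up referring to a file opened later by someone else.

```python
import os

def demo_fd_reuse():
    """Return (number_was_reused, stale_close_failed) for a close/reopen cycle."""
    first = os.open(os.devnull, os.O_RDONLY)
    os.close(first)                            # `first` is now a stale number
    second = os.open(os.devnull, os.O_RDONLY)  # lowest free number is handed out
    number_was_reused = (second == first)
    os.close(second)                           # a stale os.close(first) would have the same effect
    try:
        os.close(first)                        # nothing is open under this number any more
        stale_close_failed = False
    except OSError:                            # EBADF
        stale_close_failed = True
    return number_was_reused, stale_close_failed
```

In the scenario reported here, the second open() would be the one issued on behalf of glibc in another thread, and the stale close would come from the main thread.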

It's fairly obvious that the file handling in bug.py is a bug, but the interpreter should not crash like that!

This doesn't happen on python2.7. Also, python2.7 appears to be linked against libgcc_s.so.1 directly while 3.x is not (I've tried 3.2 from the Ubuntu repos, and built 3.3 and 3.4 myself on Ubuntu 12.04.2); at least that's what ldd indicates.
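
ldd only reports link-time dependencies. To see whether libgcc_s is actually mapped into a running interpreter, something like the following sketch could be used on Linux. The function name `loaded_libraries` is hypothetical, and the code assumes the usual `/proc/self/maps` layout (six whitespace-separated fields, the last being the path).

```python
def loaded_libraries(pattern="libgcc_s"):
    """Return sorted paths of mapped files whose path contains `pattern` (Linux only)."""
    paths = set()
    try:
        with open("/proc/self/maps") as maps:
            for line in maps:
                # address perms offset dev inode [pathname]
                fields = line.split(None, 5)
                if len(fields) == 6 and pattern in fields[5]:
                    paths.add(fields[5].strip())
    except OSError:
        pass  # no /proc (non-Linux): report nothing rather than fail
    return sorted(paths)
```

An empty list at startup that becomes non-empty after a thread has exited would be consistent with libgcc being loaded lazily.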

----------
components: Build, Extension Modules
files: bug.py
messages: 195253
nosy: ionel.mc
priority: normal
severity: normal
status: open
title: libgcc_s.so.1 must be installed for pthread_cancel to work
type: crash
versions: Python 3.2, Python 3.3, Python 3.4
Added file: http://bugs.python.org/file31301/bug.py

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue18748>
_______________________________________
Charles-François Natali
2013-08-16 12:32:07 UTC
Charles-François Natali added the comment:

Unfortunately, there's not much we can do about it: if dlsym() fails (which is the case here, either because read() fails with EBADF or because the file descriptor now points to another stream, i.e. not libgcc), the libc aborts (here upon pthread_exit(), not PyThread_delete_key()):
"""
libgcc_s.so.1 must be installed for pthread_cancel to work

Program received signal SIGABRT, Aborted.
[Switching to Thread 0xb7b0eb70 (LWP 17152)]
0xb7fe1424 in __kernel_vsyscall ()
(gdb) bt
#0 0xb7fe1424 in __kernel_vsyscall ()
#1 0xb7e4e941 in *__GI_raise (sig=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#2 0xb7e51d72 in *__GI_abort () at abort.c:92
#3 0xb7e8ae15 in __libc_message (do_abort=1, fmt=0xb7f606f5 "%s") at ../sysdeps/unix/sysv/linux/libc_fatal.c:189
#4 0xb7e8af44 in *__GI___libc_fatal (message=0xb7fc75ec "libgcc_s.so.1 must be installed for pthread_cancel to work\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:200
#5 0xb7fc4ffa in pthread_cancel_init () at ../nptl/sysdeps/pthread/unwind-forcedunwind.c:65
#6 0xb7fc509d in _Unwind_ForcedUnwind (exc=0xb7b0edc0, stop=0xb7fc2bf0 <unwind_stop>, stop_argument=0xb7b0e454) at ../nptl/sysdeps/pthread/unwind-forcedunwind.c:126
#7 0xb7fc2b98 in *__GI___pthread_unwind (buf=<optimized out>) at unwind.c:130
#8 0xb7fbcce0 in __do_cancel () at pthreadP.h:265
#9 __pthread_exit (value=0x0) at pthread_exit.c:30
#10 0x08132ced in PyThread_exit_thread () at Python/thread_pthread.h:266
#11 0x08137c37 in t_bootstrap (boot_raw=0x8318aa8) at ./Modules/_threadmodule.c:1023
#12 0xb7fbbc39 in start_thread (arg=0xb7b0eb70) at pthread_create.c:304
#13 0xb7ef978e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130
"""

So if you're unlucky and end up closing the FD referring to libgcc used by dlopen/dlsym, you're pretty much screwed, and there's no way around this.

Note that your specific problem (upon PyThread_delete_key()) doesn't occur for Python 2.7 because it uses an ad-hoc TLS implementation, whereas Python 3 uses the platform-native TLS.
But as noted above, you can very well get an abort on pthread_exit(), and likely in many other places (pretty much everywhere libgcc can be dlopen'ed).

Unfortunately, we can't do much against this, so I'm tempted to close this as "wont fix".

----------
nosy: +neologix
stage: -> committed/rejected
status: open -> pending

Antoine Pitrou
2013-08-16 12:53:23 UTC
Antoine Pitrou added the comment:

Perhaps the only thing we could do would be to try to "preload" libgcc by calling one of those APIs at startup? But I'm not sure it's a good idea to add such a platform-specific hack (for what is arguably an obscure and rare issue).

----------
nosy: +pitrou
status: pending -> open

STINNER Victor
2013-08-16 13:00:43 UTC
Changes by STINNER Victor <victor.stinner at gmail.com>:


----------
nosy: +haypo

STINNER Victor
2013-08-16 13:11:18 UTC
STINNER Victor added the comment:

Maries Ionel Cristian> ubuntu 12.04.2

What is the version of your libc library? Try something like "dpkg -l libc6".

--

Could this issue be related to this glibc issue?
http://sourceware.org/bugzilla/show_bug.cgi?id=2644

I ran your script on Python 3.4 in a shell loop and was unable to reproduce the issue. I'm running Fedora 18 with glibc 2.16.

----------

STINNER Victor
2013-08-16 13:14:45 UTC
STINNER Victor added the comment:

Oh ok, I'm able to reproduce the issue with the system Python 3.3:

$ while true; do echo "loop"; python3.3 bug.py || break; done
loop
...
loop
libgcc_s.so.1 must be installed for pthread_cancel to work
Aborted (core dumped)
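
The same retry loop can be sketched in Python, which also exposes the exit status of the failing run. The names `run_until_failure` and `BUG_CMD` are hypothetical, and `BUG_CMD` assumes bug.py sits in the current directory.

```python
import subprocess
import sys

# Assumption: bug.py is in the current working directory.
BUG_CMD = [sys.executable, "bug.py"]

def run_until_failure(cmd, max_runs=1000):
    """Run `cmd` repeatedly; return (run_number, returncode) of the first failure, or None."""
    for run in range(1, max_runs + 1):
        result = subprocess.run(cmd)
        if result.returncode != 0:
            return run, result.returncode
    return None  # no failure observed within max_runs
```

Calling `run_until_failure(BUG_CMD)` repeats the script until it fails. On POSIX, a negative return code means the child was killed by a signal, so an abort would be expected to show up as -6 (SIGABRT).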

----------

Charles-François Natali
2013-08-16 15:47:44 UTC
Post by Antoine Pitrou
> Perhaps the only thing we could do would be try to "preload" libgcc by
> calling one of those APIs at startup?
Yeah, I was thinking about doing this in PyThread_init_thread() but...
Post by Antoine Pitrou
> But I'm not sure it's a good idea to
> add such a platform-specific hack (for what is arguably an obscure and rare
> issue).
then thought the exact same thing.

Also, this exact problem can possibly show up anywhere dlopen() is
used (a grep through glibc code shows at least an occurrence in
nsswitch-related code).

----------

Maries Ionel Cristian
2013-08-17 18:57:52 UTC
Post by STINNER Victor
> What is the version of your libc library? Try something like "dpkg -l
> libc6".
2.15-0ubuntu10.4

I don't think it's that obscure ... uwsgi has this issue
https://www.google.com/search?q=libgcc_s.so.1+must+be+installed+for+pthread_cancel+to+work+uwsgi+site:lists.unbit.it-
they probably cause it differently, but the point is that it's not
obscure.

----------

Maries Ionel Cristian
2013-08-17 18:59:05 UTC
Maries Ionel Cristian added the comment:

Correct link https://www.google.com/search?q=libgcc_s.so.1+must+be+installed+for+pthread_cancel+to+work+uwsgi+site:lists.unbit.it

----------

STINNER Victor
2013-08-17 21:59:09 UTC
Post by Maries Ionel Cristian
> I don't think it's that obscure ... uwsgi has this issue
> https://www.google.com/search?q=libgcc_s.so.1+must+be+installed+for+pthread_cancel+to+work+uwsgi+site:lists.unbit.it-
> they probably cause it differently, but the point is that it's not
> obscure.
The error message may be the same, but the reason is completely different:
http://lists.unbit.it/pipermail/uwsgi/2013-July/006213.html
"IT WAS A limit-as issue"

The root cause is an arbitrary limit on the size of the virtual memory.

Here the bug is that a thread A closes a file descriptor opened in a
thread B by the glibc.

----------

Maries Ionel Cristian
2013-08-17 22:22:21 UTC
Maries Ionel Cristian added the comment:

Well anyway, is there any way to preload libgcc? Because in python2.x it wasn't loaded at runtime.

----------

STINNER Victor
2013-08-17 22:44:04 UTC
Post by Maries Ionel Cristian
> Well anyway, is there any way to preload libgcc? Because in python2.x it wasn't loaded at runtime.
On Linux, you can try to set the LD_PRELOAD environment variable as a
workaround.

LD_PRELOAD=libgcc_s.so.1 python bug.py

You may need to specify the full path.

----------

Maries Ionel Cristian
2013-08-17 22:49:35 UTC
Maries Ionel Cristian added the comment:

Alright ... would it be a very big hack to preload libgcc in the thread module (at import time)? There is platform-specific code there anyway, so it wouldn't be such a big deal, would it?

----------

Charles-François Natali
2013-08-18 15:00:26 UTC
Post by STINNER Victor
> On Linux, you can try to set the LD_PRELOAD environment variable as a
> workaround.
> LD_PRELOAD=libgcc_s.so.1 python bug.py
> You may need to specify the full path.
I don't think that'll work.
Despite its name, using LD_PRELOAD won't "preload" the library. It
will only be loaded upon dlopen(). It just makes sure that symbols
will be looked for in this library first, even before the libc.
Post by Maries Ionel Cristian
> Because in python2.x it wasn't loaded at runtime.
Yes it was. As explained above, you can get the very same crash upon
pthread_exit().
Post by Maries Ionel Cristian
> Alright ... would it be a very big hack to preload libgcc in the thread module (at import
> time)?
IMO yes, but if someone writes a patch, I won't oppose it :-)

----------

Nikolay Bryskin
2014-02-26 21:49:25 UTC
Changes by Nikolay Bryskin <devel.niks at gmail.com>:


----------
nosy: +nikicat
