Discussion:
[issue25339] sys.stdout.errors is set to "surrogateescape"
Serhiy Storchaka
2015-10-08 06:51:28 UTC
New submission from Serhiy Storchaka:

The error handler of sys.stdout and sys.stdin is set to "surrogateescape" even for a non-ASCII encoding.

$ LANG= PYTHONIOENCODING=UTF-8 ./python -c 'import sys; print(sys.stdout.encoding, sys.stdout.errors)'
UTF-8 surrogateescape
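
The same check can be scripted. This is only a minimal sketch: the helper name stdout_settings is hypothetical, it assumes an empty LANG selects the POSIX locale (a set LC_ALL or LC_CTYPE would take priority), and the error handler it reports depends on the Python version, so no expected value is given for it.

```python
import os
import subprocess
import sys


def stdout_settings(env_overrides):
    """Run a child interpreter and report its sys.stdout encoding and errors.

    Hypothetical helper for illustration; not part of CPython.
    """
    env = dict(os.environ)
    env.update(env_overrides)
    out = subprocess.check_output(
        [sys.executable, "-c",
         "import sys; print(sys.stdout.encoding, sys.stdout.errors)"],
        env=env,
    )
    # The child prints two whitespace-separated ASCII tokens.
    encoding, errors = out.decode("ascii").split()
    return encoding, errors


# Same environment as the shell command in the report.
encoding, errors = stdout_settings({"LANG": "", "PYTHONIOENCODING": "UTF-8"})
print(encoding, errors)
```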

----------
components: IO
messages: 252515
nosy: haypo, ncoghlan, serhiy.storchaka
priority: normal
severity: normal
status: open
title: sys.stdout.errors is set to "surrogateescape"
type: behavior
versions: Python 3.5, Python 3.6

_______________________________________
Python tracker <***@bugs.python.org>
<http://bugs.python.org/issue25339>
_______________________________________
STINNER Victor
2015-10-14 16:35:15 UTC
STINNER Victor added the comment:

Sorry, I don't understand the issue. Do you consider that using surrogateescape is a bug?

Which behaviour do you expect?

Python 3.5 now uses surrogateescape by default for stdin and stdout when the locale is POSIX. I guess that you got the POSIX locale using "LANG=".

----------

Serhiy Storchaka
2015-10-14 17:02:57 UTC
Serhiy Storchaka added the comment:

I'm not sure this is a bug, but it looks at least unexpected, that surrogateescape is used with non-ASCII encoding. For example, my latest test for issue19058 fails under the POSIX locale in 3.5+, and it is not easy to make it work.

Maybe switch the error handler to surrogateescape only if PYTHONIOENCODING is not specified?

----------

STINNER Victor
2015-10-14 20:12:38 UTC
STINNER Victor added the comment:

"it looks at least unexpected, that surrogateescape is used with non-ASCII encoding"

What do you mean by non-ASCII encoding? In Python 3, surrogateescape is used with every encoding for all OS operations, like os.listdir(), even with UTF-8.
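
To illustrate what surrogateescape does with UTF-8, here is a small sketch using only the codec machinery. The byte string is an invented example and no file system is touched.

```python
raw = b"caf\xe9.txt"  # Latin-1 bytes; 0xE9 is not valid UTF-8 here

# The undecodable byte 0xE9 is mapped to the lone surrogate U+DCE9.
name = raw.decode("utf-8", "surrogateescape")
print(ascii(name))

# Encoding with the same handler restores the original bytes exactly.
assert name.encode("utf-8", "surrogateescape") == raw

# The strict handler refuses to encode the lone surrogate.
try:
    name.encode("utf-8", "strict")
except UnicodeEncodeError as exc:
    print("strict handler failed:", exc.reason)
```

This round trip is why the handler is useful for file names, and also why a stream using it can silently write bytes that are not valid in its own encoding.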

----------

Serhiy Storchaka
2015-11-10 20:50:50 UTC
Serhiy Storchaka added the comment:

The default encoding of sys.stdin and sys.stdout is determined by (in order of increasing precedence):

1. locale
2. PYTHONIOENCODING
3. Py_SetStandardStreamEncoding()

The default error handler before 3.5 was determined by:

1. 'strict'
2. PYTHONIOENCODING
3. Py_SetStandardStreamEncoding()

The default error handler since 3.5 (issue19977) is determined by:

1. PYTHONIOENCODING
2. locale
3. Py_SetStandardStreamEncoding()

Even if you explicitly specify the error handler in PYTHONIOENCODING, it has no effect in the POSIX locale. This doesn't look right to me. I think the order should be the same as for the encoding.

The proposed patch makes PYTHONIOENCODING override the locale default for the error handler.
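
The two orderings can be modelled in a few lines of Python. This is a toy model of the precedence lists above, not CPython's actual startup code: all function names are hypothetical, and the fallback to the locale default when PYTHONIOENCODING names only an encoding is an assumption, since the thread does not say how the patch treats that case.

```python
def parse_pythonioencoding(value):
    """Split an 'encoding:errors' value; a missing part becomes None."""
    if not value:
        return None, None
    encoding, _, errors = value.partition(":")
    return encoding or None, errors or None


def errors_py35(pythonioencoding, posix_locale, api_errors=None):
    """Model of the 3.5 order: PYTHONIOENCODING < locale < C API."""
    _, env_errors = parse_pythonioencoding(pythonioencoding)
    errors = "strict"
    if env_errors:
        errors = env_errors
    if posix_locale:
        errors = "surrogateescape"  # overrides the environment variable
    if api_errors:  # stands in for Py_SetStandardStreamEncoding()
        errors = api_errors
    return errors


def errors_proposed(pythonioencoding, posix_locale, api_errors=None):
    """Model of the proposed order: locale < PYTHONIOENCODING < C API."""
    _, env_errors = parse_pythonioencoding(pythonioencoding)
    errors = "surrogateescape" if posix_locale else "strict"
    if env_errors:
        errors = env_errors  # an explicit handler now wins over the locale
    if api_errors:
        errors = api_errors
    return errors


print(errors_py35("utf-8:strict", posix_locale=True))
print(errors_proposed("utf-8:strict", posix_locale=True))
```

With an explicit handler such as "utf-8:strict" in the POSIX locale, the first function returns "surrogateescape" and the second returns "strict", which is the difference the patch is meant to make.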

----------
keywords: +patch
stage: -> patch review
Added file: http://bugs.python.org/file41004/default_io_error_handle.patch

