Python 2.x: Encodings for sys.stdout/sys.stderr and the os module

os module

With Python 2.x on Windows, the functions of the os module seem to require byte strings in the encoding reported by sys.getfilesystemencoding() in order to output or use German umlauts (äÄüÜöÖ) and the “ess-zett” (ß) properly:

Probably/typically wrong:


import os
os.system("echo ä")

Output:


0

Probably/typically correct:


import os, sys
encoded_in_cp850 = "echo ä"
encoded_in_unicode = encoded_in_cp850.decode("cp850")
encoded_in_fs_encoding = encoded_in_unicode.encode(sys.getfilesystemencoding())
os.system(encoded_in_fs_encoding)

(“cp850” is the codepage on my German Windows 7.)

(In short: os.system("echo ä".decode("cp850").encode(sys.getfilesystemencoding())))

Output:

ä
0

The above probably applies to os.getcwd() and others as well.
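The decode/encode round trip above can be wrapped in a small helper. What follows is only a sketch: the name transcode and its parameters are made up for illustration, and it assumes the input is a byte string whose encoding is known. It uses byte literals with escapes so that the result does not depend on how the source file itself is saved.

```python
import sys


def transcode(raw, source_encoding, target_encoding=None):
    """Re-encode the byte string raw from source_encoding to target_encoding.

    transcode is a hypothetical helper, not part of the standard library.
    If target_encoding is None, sys.getfilesystemencoding() is used.
    """
    if target_encoding is None:
        target_encoding = sys.getfilesystemencoding()
    # Bytes -> unicode text -> bytes in the target encoding.
    return raw.decode(source_encoding).encode(target_encoding)


# "\x84" is "ä" in cp850; "\xe4" is "ä" in cp1252.
command = transcode(b"echo \x84", "cp850", "cp1252")
```

Under Python 2 on Windows, the result of transcode(b"echo \x84", "cp850") could then be passed to os.system() in place of the hand-written chain.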

sys.stdout/stderr

And with Python 2.x on Windows, the encoding used by sys.stdout and sys.stderr is available in the “encoding” attribute of each of these objects:

Possibly wrong when the source of “s” uses a different encoding than sys.stdout:


sys.stdout.write(s)

Probably right when the source of “s” uses a different encoding than sys.stdout:


s_unicode = s.decode(source_encoding)
s_stdout_encoding = s_unicode.encode(sys.stdout.encoding)
sys.stdout.write(s_stdout_encoding)

(source_encoding stands for the name of the encoding of “s”, e.g. "cp850".)

(In short: sys.stdout.write(s.decode(source_encoding).encode(sys.stdout.encoding)))
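One caveat with this approach: in Python 2, sys.stdout.encoding is None when the output is redirected to a file or a pipe, and encode() can raise UnicodeEncodeError for characters the target codepage lacks. A hedged sketch that guards against both is shown below; reencode_for_stream, its parameters and the UTF-8 fallback are assumptions made for illustration, not something the standard library provides.

```python
import sys


def reencode_for_stream(raw, source_encoding, stream, fallback="utf-8"):
    """Return the byte string raw re-encoded for the given stream.

    Hypothetical helper: uses stream.encoding when it is set, otherwise
    the fallback encoding. Characters that cannot be represented in the
    target encoding are replaced with "?" instead of raising an error.
    """
    target = getattr(stream, "encoding", None) or fallback
    return raw.decode(source_encoding).encode(target, "replace")


# Under Python 2 the result can be written directly, for example:
#   sys.stdout.write(reencode_for_stream(s, "cp850", sys.stdout))
```

Replacing unencodable characters is a design choice; passing "strict" instead of "replace" would keep the original behaviour of failing loudly.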
