Executive summary: This package consists of two modules, wstring and intl. The module wstring is similar to the module string, except that the strings consist of wide characters rather than bytes. The module intl is a wrapper around the POSIX/ANSI C internationalization and localization libraries. The package is now of historic interest only, as most of its functionality was integrated into Python 2. If you have questions or comments, please e-mail me at loewis@informatik.hu-berlin.de

Module description: wstring is built on top of a C module wstrop. A Python-only implementation is planned, but not yet finished. The C module uses the type wchar_t of the C library if available, or unsigned int otherwise. It supports encoding and decoding of UCS-2, UCS-4, UTF-7, UTF-8 and UTF-16, as well as ISO_8859-1. Other encodings to various character sets are installable; there is a module iso8859, which covers ISO_8859-x for x in range(2, 11). Where available, there are also functions that support locale-sensitive operations on wide strings (currently only strcoll). For details on individual functions and methods, check their __doc__ strings.

intl is a wrapper around the functions provided by locale.h and libintl.h. Both modules have been tested on Linux only, although the functionality is available on other systems as well. The locale support consists of intl.setlocale, which makes other Python functions locale-aware, and intl.localeconv, which returns a dictionary of locale information. The libintl support provides the *gettext family of functions, which allows writing multilingual programs.

The following examples should give an idea of possible applications:

    import wstring, iso8859, koi8
    tmp = wstring.decode("KOI-8R", data)
    out = tmp.encode("CYRILLIC")

(The input variable is named data here; "in" is a reserved word in Python and cannot be used as a variable name.) This converts between two character sets used for storing Russian text, KOI-8R and ISO_8859-5:1988 (aka CYRILLIC, aka ISO-IR-144). The translation mechanism supports alias character set names.
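Since the executive summary notes that this functionality was later integrated into Python itself, the same KOI-8R to ISO 8859-5 conversion can be sketched with the standard codec machinery of modern Python. The codec names "koi8_r" and "iso8859_5" are the standard-library spellings; the sample string is an illustration, not from the original:

```python
# Sketch: the wstring decode/encode round trip using Python's
# built-in codecs, which absorbed much of this package's role.
koi8_bytes = "Москва".encode("koi8_r")  # Russian text as KOI-8R bytes
tmp = koi8_bytes.decode("koi8_r")       # KOI-8R bytes -> Unicode string
out = tmp.encode("iso8859_5")           # Unicode string -> ISO 8859-5 bytes
```

The intermediate Unicode string plays the role of the wstring object: each 8-bit character set is mapped to and from Unicode, so any pair of installed codecs can be chained.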
Character set names are all upper case and US-ASCII. The mapping of 8-bit character sets to Unicode is done using dictionaries. For multibyte encodings, translation functions can be installed. I'd happily include support for the JIS character sets, if somebody can point me to libraries that convert from and to Unicode.

    result = []
    for l in s.readlines():
        uni = wstring.from_utf7(l, wstring.SKIP_INVALID)
        result.append(uni.encode("L1", wstring.SKIP_INVALID))

This converts UTF-7 (the Unicode encoding used for RFC 822 messages) to Latin-1, dealing gracefully with conversion errors both in the UTF-7 input and in the ISO_8859-1 output, rather than raising exceptions.

The following example demonstrates the internationalization support:

    import intl, sys
    _ = intl.gettext
    intl.textdomain("test1")
    intl.bindtextdomain("test1", ".")
    capitals = (_("Warsaw"), _("Moscow"), _("some more"))
    i = 1
    print _("The capital is %s.") % capitals[i]

If the environment variable LANG is set to "de", this program will print "Die Hauptstadt ist Moskau." ("The capital is Moscow.")