Base class that converts to and from Unicode.
#include <rw/i18n/RWUConverterBase.h>
◆ RWUConverterBase() [1/2]
RWUConverterBase::RWUConverterBase(const char* encoding) [protected]
Constructs a converter for the character encoding scheme given by encoding, the US-ASCII name or alias of a character encoding scheme. See RWUAvailableEncodingList and RWUEncodingAliasList for lists of the encodings and aliases recognized by the Internationalization Module.
Exceptions:
RWUException: Thrown to indicate that the converter could not be constructed. The exception carries one of the following status codes:
◆ RWUConverterBase() [2/2]
Constructs a converter that is a deep copy of another converter. The new converter uses the same character encoding scheme as the original converter, and possesses the same internal state as the original converter.
Exercise care when copying converters, especially those used for stateful or multibyte encodings. The resulting converter may be in a state that causes the converter to produce errors if used to convert a new chunk of text. Consider using RWUToUnicodeConverter::reset() or RWUFromUnicodeConverter::reset() to restore the converter to a known state before use.
Exceptions:
RWUException: Thrown to indicate that the construction could not be completed because memory could not be allocated for the underlying implementation object.
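The caution about copying a converter in mid-stream can be illustrated with a toy model. The sketch below does not use the RWU classes: ToyUtf8Decoder and copyNeedsReset are hypothetical names, and the decoder is a deliberately minimal UTF-8 state machine (it does not reject overlong forms). It only shows why a copy that inherits a partially consumed multibyte sequence can fail on new text until it is reset.

```cpp
#include <string>

// Toy stand-in for a stateful to-Unicode converter (NOT the RWU API).
// It remembers a partially received multibyte sequence between calls.
class ToyUtf8Decoder {
public:
    // Feeds one chunk of bytes; returns false on a byte that is malformed
    // given the current state.
    bool convert(const std::string& chunk, std::u32string& out) {
        for (unsigned char byte : chunk) {
            if (pending_ > 0) {
                // Inside a sequence: only continuation bytes (10xxxxxx) are valid.
                if ((byte & 0xC0) != 0x80) return false;
                codePoint_ = (codePoint_ << 6) | (byte & 0x3F);
                if (--pending_ == 0) out.push_back(codePoint_);
            } else if (byte < 0x80) {
                out.push_back(byte);                      // single-byte character
            } else if ((byte & 0xE0) == 0xC0) {
                codePoint_ = byte & 0x1F; pending_ = 1;   // 2-byte lead
            } else if ((byte & 0xF0) == 0xE0) {
                codePoint_ = byte & 0x0F; pending_ = 2;   // 3-byte lead
            } else if ((byte & 0xF8) == 0xF0) {
                codePoint_ = byte & 0x07; pending_ = 3;   // 4-byte lead
            } else {
                return false;  // stray continuation byte or invalid lead byte
            }
        }
        return true;
    }

    // Discards any partial sequence and returns to the initial state.
    void reset() { pending_ = 0; codePoint_ = 0; }

private:
    int pending_ = 0;         // continuation bytes still expected
    char32_t codePoint_ = 0;  // code point being assembled
};

// Returns true when a copy made mid-sequence fails on new text before
// reset() and succeeds after it.
inline bool copyNeedsReset() {
    ToyUtf8Decoder original;
    std::u32string out;
    original.convert("\xC3", out);      // first byte of a 2-byte sequence

    ToyUtf8Decoder copy(original);      // the copy inherits the pending state
    std::u32string fresh;
    bool failedBefore = !copy.convert("A", fresh);  // 'A' is not a continuation byte
    copy.reset();                       // back to a known state
    bool okAfter = copy.convert("A", fresh);
    return failedBefore && okAfter && fresh == U"A";
}
```

With the real classes, the equivalent step would be a call to RWUToUnicodeConverter::reset() or RWUFromUnicodeConverter::reset() on the copy before it is given a new chunk of text.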
◆ ~RWUConverterBase()
RWUConverterBase::~RWUConverterBase() [inline]
◆ getCanonicalName()
RWCString RWUConverterBase::getCanonicalName() const
Returns the canonical name of the external character encoding scheme associated with this converter.
◆ getCurrentLocaleEncodingName()
static RWCString RWUConverterBase::getCurrentLocaleEncodingName() [static]
Returns the name of the encoding associated with the current native locale. The name returned by this method may be used to define a conversion that translates narrow characters and strings that have been encoded for the native locale into Unicode, and vice versa.
Some locales do not specify an encoding, so the execution environment must use a default encoding. On most platforms, this default is a US-ASCII encoding. If the active locale does not name an encoding, this method returns ANSI_X3.4-1986, the standard encoding name for the US-ASCII character set.
On UNIX and Linux, this method returns an RWCString containing the charmap name associated with the current ISO/POSIX LC_CTYPE locale.
On Windows, this method returns an RWCString containing the name of the ANSI code page associated with the current thread locale. The name consists of the letters cp followed by the decimal number of the code page (for example, cp1252).
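As a small illustration of the Windows naming rule, the following sketch builds a name of that form from a code page number. codePageName is a hypothetical helper, not part of the Internationalization Module; it only mirrors the "cp plus decimal number" pattern described above.

```cpp
#include <string>

// Hypothetical helper (not an RWU function): formats a code page number
// in the "cp" + decimal form used for Windows encoding names.
inline std::string codePageName(unsigned int codePage) {
    return "cp" + std::to_string(codePage);
}
```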
Note: When used under UNIX or Linux, this method determines the current charmap by using popen() to launch a new shell in which to execute the command locale -k charmap. The command is executed with the same LC_CTYPE locale setting as the calling process. This action is much more expensive than a simple function call, so you may want to cache the value. If the shell cannot be created, or if the command fails with an error, the default value ANSI_X3.4-1986 is returned.
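One way to cache the value, as the note suggests, is a function-local static that performs the lookup on first use only. The sketch below is self-contained and does not call the library: queryLocaleEncodingName is a hypothetical stand-in for RWUConverterBase::getCurrentLocaleEncodingName(), lookupCount exists only to make the caching observable, and std::string stands in for RWCString.

```cpp
#include <string>

// Counts how often the expensive lookup runs (illustration only).
inline int& lookupCount() {
    static int count = 0;
    return count;
}

// Hypothetical stand-in for the expensive call; real code would call
// RWUConverterBase::getCurrentLocaleEncodingName() here.
inline std::string queryLocaleEncodingName() {
    ++lookupCount();
    return "ANSI_X3.4-1986";  // placeholder: the fallback value named in the text
}

// Runs the lookup once. Initialization of a function-local static is
// thread-safe since C++11.
inline const std::string& cachedLocaleEncodingName() {
    static const std::string name = queryLocaleEncodingName();
    return name;
}
```

A cache like this assumes the locale does not change after the first call; if the process switches its LC_CTYPE locale later, the cached name would be stale.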
◆ getDefaultEncodingName()
RWCString RWUConverterBase::getDefaultEncodingName() [inline, static]
Returns a string containing the name of the current default encoding. This string may be empty if the framework was unable to determine the default encoding from the process environment.
If a default encoding name has not been set using setDefaultEncodingName(), the internationalization framework first attempts to retrieve the default encoding name from the execution environment, and if unable to do so, uses the value returned from getCurrentLocaleEncodingName() as the default.
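The lookup order described above can be sketched as a simple fallback chain. This is not the library's implementation: EncodingSources and resolveDefaultEncoding are hypothetical names, and each field merely stands in for one of the three sources the text mentions, with an empty string meaning "not available".

```cpp
#include <string>

// Hypothetical inputs standing in for the three sources named in the text.
struct EncodingSources {
    std::string explicitlySet;    // name passed to setDefaultEncodingName(), or empty
    std::string fromEnvironment;  // name found in the execution environment, or empty
    std::string fromLocale;       // result of getCurrentLocaleEncodingName()
};

// Picks the first non-empty source in the documented order. The result can
// still be empty if no source yields a name.
inline std::string resolveDefaultEncoding(const EncodingSources& sources) {
    if (!sources.explicitlySet.empty()) return sources.explicitlySet;
    if (!sources.fromEnvironment.empty()) return sources.fromEnvironment;
    return sources.fromLocale;
}
```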
◆ getLocalizedName()
void RWUConverterBase::getLocalizedName(const RWULocale& locale, RWUString& result) const
Sets result to the localized name of the converter if one exists for the given locale; otherwise, sets it to the internal US-ASCII name of the converter.
◆ getMaxBytesPerChar()
size_t RWUConverterBase::getMaxBytesPerChar() const [inline]
Returns an integer value from 1 to 4 representing the maximum number of bytes required to describe a code point in the external character encoding scheme.
◆ getMinBytesPerChar()
size_t RWUConverterBase::getMinBytesPerChar() const [inline]
Returns an integer value of 1 or 2 representing the minimum number of bytes required to describe a code point in the external character encoding scheme.
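A typical use of the two per-character bounds is sizing buffers before a conversion. The helpers below are hypothetical (they are not RWU functions) and take the bounds as plain arguments, which in real code would come from getMaxBytesPerChar() and getMinBytesPerChar(). The sketch assumes these bounds are the only per-character cost and that minBytesPerChar is nonzero, as the documented range of 1 or 2 implies.

```cpp
#include <cstddef>

// Upper bound on the bytes needed to encode `codePoints` code points,
// given the converter's maximum bytes per character.
inline std::size_t worstCaseEncodedBytes(std::size_t codePoints,
                                         std::size_t maxBytesPerChar) {
    return codePoints * maxBytesPerChar;
}

// Upper bound on the code points that `byteCount` encoded bytes can hold,
// given the converter's minimum bytes per character (must be nonzero).
inline std::size_t worstCaseCodePoints(std::size_t byteCount,
                                       std::size_t minBytesPerChar) {
    return byteCount / minBytesPerChar;
}
```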
◆ operator!=()
Returns true if self provides a different conversion than rhs; otherwise, false.
Two converters are considered not equal if their names are aliases for two different conversions.
◆ operator=()
Assignment operator. Makes self a deep copy of rhs. Self uses the same character encoding scheme as rhs, and possesses the same internal state as rhs.
Exercise care when copying converters, especially those used for stateful or multibyte encodings. The resulting converter may be in a state that causes the converter to produce errors if used to convert a new chunk of text. Consider using RWUToUnicodeConverter::reset() or RWUFromUnicodeConverter::reset() to restore the converter to a known state before use.
Exceptions:
RWUException: Thrown to indicate that the assignment could not be completed because memory could not be allocated for the underlying implementation object.
◆ operator==()
Returns true if self provides the same conversion as rhs; otherwise, false.
The name of self does not need to match the name of the other converter; two converters are considered equal if their names are aliases for the same conversion.
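The alias-based notion of equality can be modeled by mapping each name to a canonical name and comparing the results. The sketch below is an illustration only: the alias table is a small hypothetical sample, not the list the Internationalization Module actually recognizes, and canonicalName and sameConversion are made-up helpers rather than library functions.

```cpp
#include <map>
#include <string>

// Hypothetical sample table mapping a name or alias to a canonical name.
inline const std::map<std::string, std::string>& aliasTable() {
    static const std::map<std::string, std::string> table = {
        {"US-ASCII", "US-ASCII"},
        {"ASCII", "US-ASCII"},
        {"ANSI_X3.4-1986", "US-ASCII"},
        {"UTF-8", "UTF-8"},
        {"utf8", "UTF-8"},
    };
    return table;
}

// Returns the canonical name for a known alias, or the name unchanged.
inline std::string canonicalName(const std::string& name) {
    auto it = aliasTable().find(name);
    return it == aliasTable().end() ? name : it->second;
}

// Two names denote the same conversion when their canonical names match.
inline bool sameConversion(const std::string& lhs, const std::string& rhs) {
    return canonicalName(lhs) == canonicalName(rhs);
}
```

Under this model, operator!=() corresponds to the negation of sameConversion.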
◆ setDefaultEncodingName()
static void RWUConverterBase::setDefaultEncodingName(const char* encoding) [static]
Sets the current default encoding name. The name must identify a valid encoding, but no attempt is made to validate the name.
Note: This method is not thread-safe. When the macro U_CHARSET_IS_UTF8 is equal to 1, this method cannot change the default encoding, which remains "UTF-8".
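Because the setter is not thread-safe, an application that must change the default while threads are running could route every access through one lock. The sketch below is self-contained and does not call the library: the sketch namespace, the storage function, and both wrapper names are hypothetical, and the comments mark where the real setDefaultEncodingName() and getDefaultEncodingName() calls would go.

```cpp
#include <mutex>
#include <string>

namespace sketch {

// Hypothetical stand-ins for the process-wide default and its guard.
inline std::string& defaultEncodingStorage() {
    static std::string storage;
    return storage;
}

inline std::mutex& defaultEncodingMutex() {
    static std::mutex mutex;
    return mutex;
}

// Serialized setter; real code would call
// RWUConverterBase::setDefaultEncodingName(encoding) under the lock.
inline void setDefaultEncodingNameLocked(const char* encoding) {
    if (encoding == nullptr) return;  // ignore null input in this sketch
    std::lock_guard<std::mutex> lock(defaultEncodingMutex());
    defaultEncodingStorage() = encoding;
}

// Serialized getter; real code would call
// RWUConverterBase::getDefaultEncodingName() under the lock.
inline std::string getDefaultEncodingNameLocked() {
    std::lock_guard<std::mutex> lock(defaultEncodingMutex());
    return defaultEncodingStorage();
}

}  // namespace sketch
```

A lock like this only helps if every caller goes through the wrappers. Code that reads the default without taking the lock is not protected, so setting the default once at startup, before any other thread is created, is likely the simpler choice.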