SourcePro® API Reference Guide

 
RWUTF8Helper Class Reference

Provides common functionality used to encode and decode UTF-8 sequences.

#include <rw/stream/RWUTF8Helper.h>

Public Types

enum  EncodingCategory {
  oneByte , twoBytes , threeBytes , fourBytes ,
  highSurrogate , missingLowSurrogate , lowSurrogateWithoutHighSurrogate , invalidUTF8Encoding
}
 

Static Public Member Functions

static EncodingCategory decodeFirstByte (RWByte b)
 
static EncodingCategory decodeFourBytesEncoding (RWByte firstByte, RWByte secondByte, RWByte thirdByte, RWByte fourthByte, RWUChar &highSurrogateValue, RWUChar &lowSurrogateValue)
 
static EncodingCategory decodeThreeBytesEncoding (RWByte firstByte, RWByte secondByte, RWByte thirdByte, RWUChar &res)
 
static EncodingCategory decodeTwoBytesEncoding (RWByte firstByte, RWByte secondByte, RWUChar &res)
 
static EncodingCategory encodeOneUChar (RWUChar uc, RWByte *res, RWUChar highSurrogateValue=0)
 

Detailed Description

The class RWUTF8Helper provides common functionality used to encode and decode UTF-8 sequences.

Member Enumeration Documentation

◆ EncodingCategory

 

Enumerator
oneByte 

One byte encoding form of UTF-8

twoBytes 

Two bytes encoding form of UTF-8

threeBytes 

Three bytes encoding form of UTF-8

fourBytes 

Four bytes encoding form of UTF-8

highSurrogate 

The character to be encoded is a high surrogate

missingLowSurrogate 

No low surrogate after a high surrogate

lowSurrogateWithoutHighSurrogate 

A low surrogate was not preceded by a high surrogate

invalidUTF8Encoding 

The encoding is not recognized as UTF-8

Member Function Documentation

◆ decodeFirstByte()

static EncodingCategory RWUTF8Helper::decodeFirstByte (RWByte b)

Takes the first byte of a UTF-8 byte sequence encoding a single UTF-16 character, and returns the encoding category to which it belongs. Throws no exceptions.

Parameters
b    The first byte of a UTF-8 byte sequence encoding a single UTF-16 character.
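The classification by first byte can be illustrated with the bit patterns that UTF-8 itself defines. The following is a minimal sketch, not the SourcePro implementation: `LeadKind` and `classifyLeadByte` are hypothetical names, and `std::uint8_t` stands in for `RWByte` as an assumption.

```cpp
#include <cstdint>

// Hypothetical stand-in for a subset of RWUTF8Helper::EncodingCategory.
enum class LeadKind { oneByte, twoBytes, threeBytes, fourBytes, invalid };

// Classifies a UTF-8 first byte by its high-order bit pattern.
// This follows the UTF-8 definition; it is an illustration only and does
// not reproduce the library's code.
constexpr LeadKind classifyLeadByte(std::uint8_t b) {
    if ((b & 0x80) == 0x00) return LeadKind::oneByte;    // 0xxxxxxx
    if ((b & 0xE0) == 0xC0) return LeadKind::twoBytes;   // 110xxxxx
    if ((b & 0xF0) == 0xE0) return LeadKind::threeBytes; // 1110xxxx
    if ((b & 0xF8) == 0xF0) return LeadKind::fourBytes;  // 11110xxx
    // Continuation bytes (10xxxxxx) and 0xF8..0xFF cannot start a sequence.
    return LeadKind::invalid;
}
```

This sketch checks the bit pattern only. Bytes such as 0xC0, 0xC1 and 0xF5 to 0xF7 pass the pattern test even though they can never begin a well-formed sequence, so a stricter classifier would reject them as well.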

◆ decodeFourBytesEncoding()

static EncodingCategory RWUTF8Helper::decodeFourBytesEncoding (RWByte firstByte,
    RWByte secondByte,
    RWByte thirdByte,
    RWByte fourthByte,
    RWUChar &highSurrogateValue,
    RWUChar &lowSurrogateValue)

Decodes a four-byte UTF-8 sequence into a UTF-16 surrogate pair. Returns invalidUTF8Encoding if the four bytes do not form a valid UTF-8 sequence. Throws no exceptions.

Parameters
firstByte             The first byte of the four-byte UTF-8 sequence.
secondByte            The second byte of the four-byte UTF-8 sequence.
thirdByte             The third byte of the four-byte UTF-8 sequence.
fourthByte            The fourth byte of the four-byte UTF-8 sequence.
highSurrogateValue    The UTF-16 high surrogate resulting from the decoding of the four-byte UTF-8 sequence.
lowSurrogateValue     The UTF-16 low surrogate resulting from the decoding of the four-byte UTF-8 sequence.
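To show how four UTF-8 bytes map to a high and a low surrogate, here is a minimal sketch based on the standard UTF-8 and UTF-16 definitions. It is not the library's code: `decodeFourBytesSketch` is a hypothetical name, it returns `bool` rather than an `EncodingCategory` for brevity, and `std::uint8_t` and `std::uint16_t` are assumed stand-ins for `RWByte` and `RWUChar`.

```cpp
#include <cstdint>

// Illustrative only: decodes a four-byte UTF-8 sequence into a UTF-16
// surrogate pair. Returns false for a malformed sequence.
bool decodeFourBytesSketch(std::uint8_t b1, std::uint8_t b2,
                           std::uint8_t b3, std::uint8_t b4,
                           std::uint16_t& high, std::uint16_t& low) {
    // The first byte must look like 11110xxx.
    if ((b1 & 0xF8) != 0xF0) return false;
    // Every following byte must look like 10xxxxxx.
    if ((b2 & 0xC0) != 0x80 || (b3 & 0xC0) != 0x80 || (b4 & 0xC0) != 0x80)
        return false;

    // Assemble the 21-bit code point from the payload bits.
    std::uint32_t cp = (static_cast<std::uint32_t>(b1 & 0x07) << 18) |
                       (static_cast<std::uint32_t>(b2 & 0x3F) << 12) |
                       (static_cast<std::uint32_t>(b3 & 0x3F) << 6) |
                       static_cast<std::uint32_t>(b4 & 0x3F);

    // Reject overlong forms and values beyond the Unicode range.
    if (cp < 0x10000 || cp > 0x10FFFF) return false;

    // Split the 20-bit offset into two 10-bit halves.
    cp -= 0x10000;
    high = static_cast<std::uint16_t>(0xD800 + (cp >> 10));
    low  = static_cast<std::uint16_t>(0xDC00 + (cp & 0x3FF));
    return true;
}
```

For example, the bytes F0 9F 98 80 (U+1F600) yield the surrogate pair D83D, DE00 under this sketch.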

◆ decodeThreeBytesEncoding()

static EncodingCategory RWUTF8Helper::decodeThreeBytesEncoding (RWByte firstByte,
    RWByte secondByte,
    RWByte thirdByte,
    RWUChar &res)

Decodes a three-byte UTF-8 sequence. Returns invalidUTF8Encoding if the three bytes do not form a valid UTF-8 sequence. Throws no exceptions.

Parameters
firstByte     The first byte of a UTF-8 three-byte sequence encoding a single UTF-16 character.
secondByte    The second byte of a UTF-8 three-byte sequence encoding a single UTF-16 character.
thirdByte     The third byte of a UTF-8 three-byte sequence encoding a single UTF-16 character.
res           The UTF-16 character resulting from the decoding of the three-byte UTF-8 sequence.
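The three-byte case can be sketched in the same way from the UTF-8 definition. This is an illustration under stated assumptions, not the SourcePro implementation: `decodeThreeBytesSketch` is a hypothetical name, it returns `bool` instead of an `EncodingCategory`, and the choice to reject encoded surrogates follows the Unicode standard rather than anything this reference states about the library.

```cpp
#include <cstdint>

// Illustrative only: decodes a three-byte UTF-8 sequence into one UTF-16
// code unit. Returns false for a malformed sequence.
bool decodeThreeBytesSketch(std::uint8_t b1, std::uint8_t b2, std::uint8_t b3,
                            std::uint16_t& res) {
    // The first byte must look like 1110xxxx.
    if ((b1 & 0xF0) != 0xE0) return false;
    // Both following bytes must look like 10xxxxxx.
    if ((b2 & 0xC0) != 0x80 || (b3 & 0xC0) != 0x80) return false;

    // 4 + 6 + 6 payload bits give a 16-bit value.
    const std::uint32_t cp = (static_cast<std::uint32_t>(b1 & 0x0F) << 12) |
                             (static_cast<std::uint32_t>(b2 & 0x3F) << 6) |
                             static_cast<std::uint32_t>(b3 & 0x3F);

    // Values below U+0800 fit in fewer bytes, so this form is overlong.
    if (cp < 0x0800) return false;
    // Surrogate code points are not valid in well-formed UTF-8.
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;

    res = static_cast<std::uint16_t>(cp);
    return true;
}
```

For example, the bytes E2 82 AC decode to 0x20AC (the euro sign) under this sketch.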

◆ decodeTwoBytesEncoding()

static EncodingCategory RWUTF8Helper::decodeTwoBytesEncoding (RWByte firstByte,
    RWByte secondByte,
    RWUChar &res)

Decodes a two-byte UTF-8 sequence. Returns invalidUTF8Encoding if the two bytes do not form a valid UTF-8 sequence. Throws no exceptions.

Parameters
firstByte     The first byte of a UTF-8 two-byte sequence encoding a single UTF-16 character.
secondByte    The second byte of a UTF-8 two-byte sequence encoding a single UTF-16 character.
res           The UTF-16 character resulting from the decoding of the two-byte UTF-8 sequence.
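A two-byte sequence carries 11 payload bits. The following minimal sketch shows the arithmetic according to the UTF-8 definition. It is not the library's code: `decodeTwoBytesSketch` is a hypothetical name, and the `bool` return and the fixed-width integer types are simplifying assumptions.

```cpp
#include <cstdint>

// Illustrative only: decodes a two-byte UTF-8 sequence into one UTF-16
// code unit. Returns false for a malformed sequence.
bool decodeTwoBytesSketch(std::uint8_t b1, std::uint8_t b2, std::uint16_t& res) {
    // The first byte must look like 110xxxxx.
    if ((b1 & 0xE0) != 0xC0) return false;
    // The second byte must look like 10xxxxxx.
    if ((b2 & 0xC0) != 0x80) return false;

    // 5 + 6 payload bits give a value in the range 0..0x7FF.
    const std::uint32_t cp = (static_cast<std::uint32_t>(b1 & 0x1F) << 6) |
                             static_cast<std::uint32_t>(b2 & 0x3F);

    // Values below U+0080 fit in one byte, so this form is overlong.
    if (cp < 0x80) return false;

    res = static_cast<std::uint16_t>(cp);
    return true;
}
```

For example, the bytes C3 A9 decode to 0x00E9 under this sketch.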

◆ encodeOneUChar()

static EncodingCategory RWUTF8Helper::encodeOneUChar (RWUChar uc,
    RWByte *res,
    RWUChar highSurrogateValue=0)

Encodes the UTF-16 character uc as UTF-8. Returns the UTF-8 encoding category used for the conversion, or an error category if the character could not be encoded. Throws no exceptions.

Parameters
uc                    The UTF-16 character to be transformed.
res                   A pointer to a byte array of at least four bytes that receives the transformation result.
highSurrogateValue    Used only when a high surrogate was previously encountered.
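This reference does not spell out exactly how the pending high surrogate interacts with the return categories, so the following sketch shows one plausible reading based on the enumerator descriptions and the standard UTF-16 to UTF-8 conversion. All names (`EncodeResult`, `encodeUnitSketch`, `pendingHigh`) are hypothetical, and the control flow is an assumption rather than the library's behavior.

```cpp
#include <cstdint>

// Hypothetical result type modeled on the enumerator names in this page.
enum class EncodeResult {
    oneByte, twoBytes, threeBytes, fourBytes,
    highSurrogate, missingLowSurrogate, lowSurrogateWithoutHighSurrogate
};

// Illustrative only: encodes one UTF-16 code unit into `out`, which must
// hold at least four bytes. `pendingHigh` is a previously seen high
// surrogate, or 0 if there is none.
EncodeResult encodeUnitSketch(std::uint16_t uc, std::uint8_t* out,
                              std::uint16_t pendingHigh = 0) {
    const bool isHigh = uc >= 0xD800 && uc <= 0xDBFF;
    const bool isLow  = uc >= 0xDC00 && uc <= 0xDFFF;

    if (pendingHigh != 0) {
        // A high surrogate must be completed by a low surrogate.
        if (!isLow) return EncodeResult::missingLowSurrogate;
        const std::uint32_t cp = 0x10000u +
            ((static_cast<std::uint32_t>(pendingHigh) - 0xD800u) << 10) +
            (static_cast<std::uint32_t>(uc) - 0xDC00u);
        out[0] = static_cast<std::uint8_t>(0xF0 | (cp >> 18));
        out[1] = static_cast<std::uint8_t>(0x80 | ((cp >> 12) & 0x3F));
        out[2] = static_cast<std::uint8_t>(0x80 | ((cp >> 6) & 0x3F));
        out[3] = static_cast<std::uint8_t>(0x80 | (cp & 0x3F));
        return EncodeResult::fourBytes;
    }
    // Nothing is written yet; the caller passes this unit back next time.
    if (isHigh) return EncodeResult::highSurrogate;
    if (isLow)  return EncodeResult::lowSurrogateWithoutHighSurrogate;

    if (uc < 0x80) {
        out[0] = static_cast<std::uint8_t>(uc);
        return EncodeResult::oneByte;
    }
    if (uc < 0x800) {
        out[0] = static_cast<std::uint8_t>(0xC0 | (uc >> 6));
        out[1] = static_cast<std::uint8_t>(0x80 | (uc & 0x3F));
        return EncodeResult::twoBytes;
    }
    out[0] = static_cast<std::uint8_t>(0xE0 | (uc >> 12));
    out[1] = static_cast<std::uint8_t>(0x80 | ((uc >> 6) & 0x3F));
    out[2] = static_cast<std::uint8_t>(0x80 | (uc & 0x3F));
    return EncodeResult::threeBytes;
}
```

In this reading, a caller that receives `highSurrogate` keeps the code unit and passes it as the third argument together with the next code unit. Sizing the output buffer at four bytes covers the longest form, which matches the requirement stated for res.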

Copyright © 2024 Rogue Wave Software, Inc., a Perforce company. All Rights Reserved.