HydraExpress™ C++ 2019 |
HydraExpress™ C++ API Reference Guide |
A simple XML pull-parser that implements reference semantics.
#include <rwsf/core/XmlReader.h>
Public Types | |
enum | NodeType { StartTag, EndTag, EmptyTag, Data, Unknown } |
Public Member Functions | |
XmlReader () | |
XmlReader (const char *buf, size_t length) | |
XmlReader (const unsigned char *buf, size_t length) | |
XmlReader (const std::string &document) | |
void | addNamespace (const rwsf::XmlNamespace &ns) |
bool | eof () |
rwsf::XmlReader | getElementReader (const rwsf::XmlReaderName &name=rwsf::XmlReaderName::Empty) |
std::string | getEncoding () const |
bool | getExpandAttributeReference () const |
bool | getExpandCommentReference () const |
bool | getExpandContentReference () const |
rwsf::XmlAttributeSet | getLastAttributes () const |
std::string | getLastContent () const |
rwsf::XmlName | getLastName () const |
NodeType | getLastNodeType () const |
std::string | getPrefixForURI (const std::string &uri) const |
std::string | getStandalone () const |
std::string | getURIForPrefix (const std::string &prefix) const |
std::string | getVersion () const |
bool | hasEncoding () const |
bool | hasStandalone () const |
bool | isElementNext (const rwsf::XmlName &name) |
bool | isElementNext (const std::string &name) |
std::string | readElement (const rwsf::XmlName &name=NullName) |
std::string | readElement (const std::string &name) |
void | readElementEnd (const rwsf::XmlName &name) |
void | readElementEnd () |
rwsf::XmlAttributeSet | readElementStart (const rwsf::XmlName &name) |
void | readElementStart () |
std::string | readElementValue () |
void | readNextNode () |
std::string | readWellFormedElement (const rwsf::XmlName &name=NullName) |
void | setExpandAttributeReference (bool expandReference) |
void | setExpandCommentReference (bool expandComment) |
void | setExpandContentReference (bool expandReference) |
Public Member Functions inherited from rwsf::HandleBase | |
bool | isValid (void) const |
bool | operator!= (const HandleBase &second) const |
bool | operator== (const HandleBase &second) const |
Static Public Attributes | |
static rwsf::XmlName | NullName |
Additional Inherited Members | |
Protected Member Functions inherited from rwsf::HandleBase | |
HandleBase (void) | |
HandleBase (StaticCtor) | |
HandleBase (BodyBase *body) | |
HandleBase (const HandleBase &second) | |
virtual | ~HandleBase (void) |
BodyBase & | body (void) const |
HandleBase & | operator= (const HandleBase &second) |
Class rwsf::XmlReader is a simple XML pull-parser. The XML document is typically parsed element by element using readElement(), or by iteratively calling readElementStart(), readElementValue(), and readElementEnd(). On each read, an XmlReader instance sets its internal state with information about the content it just read. Member functions getLastNodeType(), getLastName(), and getLastContent() can then be used to retrieve portions of the rwsf::XmlReader's state.
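The read-then-inspect pattern described above can be illustrated with a small stand-in. The sketch below is not the rwsf::XmlReader implementation: ToyPullReader and traceNodes are hypothetical names, and the toy handles only plain tags and text (no attributes, comments, entities, or namespaces). It borrows the member names readNextNode(), getLastNodeType(), getLastName(), and getLastContent() only to show how a caller drives a pull-parser.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a pull-parser; not the rwsf::XmlReader implementation.
class ToyPullReader {
public:
    enum NodeType { StartTag, EndTag, EmptyTag, Data, Unknown };

    explicit ToyPullReader(const std::string& doc) : doc_(doc) {}

    bool eof() const { return pos_ >= doc_.size(); }

    // Advances past exactly one node and records what was read.
    void readNextNode() {
        assert(!eof());
        name_.clear();
        content_.clear();
        if (doc_[pos_] != '<') {                  // character data up to the next tag
            std::size_t end = doc_.find('<', pos_);
            if (end == std::string::npos) end = doc_.size();
            content_ = doc_.substr(pos_, end - pos_);
            type_ = Data;
            pos_ = end;
            return;
        }
        std::size_t close = doc_.find('>', pos_);
        if (close == std::string::npos) {         // unterminated tag
            type_ = Unknown;
            pos_ = doc_.size();
            return;
        }
        std::string inner = doc_.substr(pos_ + 1, close - pos_ - 1);
        pos_ = close + 1;
        if (!inner.empty() && inner.front() == '/') {
            type_ = EndTag;
            name_ = inner.substr(1);
        } else if (!inner.empty() && inner.back() == '/') {
            type_ = EmptyTag;
            name_ = inner.substr(0, inner.size() - 1);
        } else {
            type_ = StartTag;
            name_ = inner;
        }
    }

    NodeType getLastNodeType() const { return type_; }
    const std::string& getLastName() const { return name_; }
    const std::string& getLastContent() const { return content_; }

private:
    std::string doc_;
    std::size_t pos_ = 0;
    NodeType type_ = Unknown;
    std::string name_;
    std::string content_;
};

// Walks a document node by node and returns a compact trace of what was read.
std::string traceNodes(const std::string& doc) {
    ToyPullReader reader(doc);
    std::string trace;
    while (!reader.eof()) {
        reader.readNextNode();
        switch (reader.getLastNodeType()) {
            case ToyPullReader::StartTag: trace += "S:" + reader.getLastName() + ";"; break;
            case ToyPullReader::EndTag:   trace += "E:" + reader.getLastName() + ";"; break;
            case ToyPullReader::EmptyTag: trace += "Z:" + reader.getLastName() + ";"; break;
            case ToyPullReader::Data:     trace += "D:" + reader.getLastContent() + ";"; break;
            default:                      trace += "?;"; break;
        }
    }
    return trace;
}
```

The caller, not the parser, decides when to advance; each call to readNextNode() replaces the previously stored state, which is why the getLast...() accessors must be consulted before the next read.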
rwsf::XmlReader throws an exception of type rwsf::XmlParseException when it encounters XML that is not well-formed. The rwsf::XmlParseException exception contains a description of the error and the line and column number of the source document where the error occurred.
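How a parser might derive the line and column it reports can be sketched as follows. The helper lineAndColumn is hypothetical and is not part of the rwsf API; it assumes 1-based numbering and counts bytes rather than decoded characters, neither of which is confirmed for rwsf::XmlParseException.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Converts a byte offset into a 1-based (line, column) pair.
// Hypothetical helper; counts bytes, not decoded characters.
std::pair<std::size_t, std::size_t> lineAndColumn(const std::string& doc,
                                                  std::size_t offset) {
    std::size_t line = 1, column = 1;
    for (std::size_t i = 0; i < offset && i < doc.size(); ++i) {
        if (doc[i] == '\n') { ++line; column = 1; }   // a newline starts a new line
        else { ++column; }
    }
    return {line, column};
}
```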
rwsf::XmlReader can parse documents in the encodings UTF-8, UTF-16(BE), UTF-16LE, US-ASCII, and ISO-8859-1. In addition, if the rwsf_icu library is present, rwsf::XmlReader also converts from any character encoding supported by ICU.
Please see the XML Binding Development Guide for further information on conversions and custom converters.
Currently, rwsf::XmlReader provides support only for reading elements and their content. No support for reading processing instructions, DOCTYPE declarations, or entity declarations is provided.
enum rwsf::XmlReader::NodeType |
Enumeration of different node types in XML.
rwsf::XmlReader::XmlReader | ( | ) |
Default constructor. Constructs an invalid reader.
rwsf::XmlReader::XmlReader | ( | const char * | buf, |
size_t | length | ||
) |
Constructs a reader from the document pointed to by buf, which is length bytes long. Parses the prolog of the document if found, and determines the document encoding both from the encoding= specifier in the optional XML declaration, and from a guess based on the first few bytes of the document. Upon construction, the reader is placed before the first tag in the document.
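One common way to guess an encoding from the first few bytes is to look for a byte-order mark (BOM). The sketch below shows that technique only; the function guessEncodingFromBom is hypothetical, and the heuristic rwsf::XmlReader actually applies is not documented here and may inspect more than the BOM.

```cpp
#include <cstddef>
#include <string>

// Guesses an encoding name from a byte-order mark, if one is present.
// Returns an empty string when no BOM is recognized.
std::string guessEncodingFromBom(const unsigned char* buf, std::size_t length) {
    if (length >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF) return "UTF-8";
    if (length >= 2 && buf[0] == 0xFE && buf[1] == 0xFF) return "UTF-16BE";
    if (length >= 2 && buf[0] == 0xFF && buf[1] == 0xFE) return "UTF-16LE";
    return "";
}
```

A document without a BOM yields an empty result in this sketch, in which case a parser would have to fall back on the encoding= specifier or a default.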
rwsf::XmlReader::XmlReader | ( | const unsigned char * | buf, |
size_t | length | ||
) |
Constructs a reader from the document pointed to by buf, which is length bytes long. Parses the prolog of the document if found, and determines the document encoding both from the encoding= specifier in the optional XML declaration, and from a guess based on the first few bytes of the document. Upon construction, the reader is placed before the first tag in the document.
rwsf::XmlReader::XmlReader | ( | const std::string & | document | ) |
Convenience constructor for converting from an std::string. Constructs a reader from the XML document in document. Parses the prolog of the document if found, and determines the encoding used by document, both from the encoding= specifier in the optional XML declaration, and from a guess based on the first few bytes of the document. Upon construction, the reader is placed before the first tag in the document.
void rwsf::XmlReader::addNamespace | ( | const rwsf::XmlNamespace & | ns | ) |
Adds ns to the list of namespaces known by the reader. This method is useful when parsing document fragments where namespaces are declared outside the scope of the fragment.
bool rwsf::XmlReader::eof | ( | ) |
Returns true if at the end of the current document; false otherwise.
rwsf::XmlReader rwsf::XmlReader::getElementReader | ( | const rwsf::XmlReaderName & | name = rwsf::XmlReaderName::Empty | ) |
Returns a new rwsf::XmlReader instance for the current element, as if the current element in its entirety were this new document's root. This new reader copies the state of the parent reader, but its internal cursor is set to the beginning of the element, so that functions like readElementStart(), readElementValue(), etc. return the current element's information. The parent reader will have its cursor advanced past the element, so any of the parent reader's read() functions return the next element's information instead.
rwsf::XmlParseException | The current element's name is not the provided name. |
std::string rwsf::XmlReader::getEncoding | ( | ) | const |
Returns the name of the encoding of the original source document, either from the XML declaration's "encoding=" declaration, or as automatically sensed from the first few bytes of the XML document.
bool rwsf::XmlReader::getExpandAttributeReference | ( | ) | const |
Returns true if the reader expands entity references in attributes, false otherwise. See setExpandAttributeReference() for an example of usage.
bool rwsf::XmlReader::getExpandCommentReference | ( | ) | const |
Returns true if the reader expands comments found in XML content, false otherwise. See setExpandCommentReference() for an example of usage.
bool rwsf::XmlReader::getExpandContentReference | ( | ) | const |
Returns true if the reader expands XML references in content, false otherwise. See setExpandContentReference() for an example of usage.
rwsf::XmlAttributeSet rwsf::XmlReader::getLastAttributes | ( | ) | const |
Returns the set of attributes associated with the last node read of type rwsf::XmlReader::StartTag.
std::string rwsf::XmlReader::getLastContent | ( | ) | const |
Returns the last content read, for nodes of type rwsf::XmlReader::Data. This value is undefined if the last node read was not of type rwsf::XmlReader::Data.
rwsf::XmlName rwsf::XmlReader::getLastName | ( | ) | const |
Returns the name of the last node read. This value is undefined if the last node read was of type rwsf::XmlReader::Data.
NodeType rwsf::XmlReader::getLastNodeType | ( | ) | const |
Returns the type of the last node read. See NodeType for more information on the NodeType enumeration.
std::string rwsf::XmlReader::getPrefixForURI | ( | const std::string & | uri | ) | const |
Looks up the provided uri in the current list of namespaces and returns the corresponding prefix. If the current list of namespaces does not contain the uri, returns the empty string.
std::string rwsf::XmlReader::getStandalone | ( | ) | const |
Returns the value of the source document's "standalone=" declaration if it exists, the empty string otherwise.
std::string rwsf::XmlReader::getURIForPrefix | ( | const std::string & | prefix | ) | const |
Looks up the provided prefix in the current list of namespaces and returns the corresponding URI. If the current list of namespaces does not contain the prefix, returns the empty string.
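The two lookups, getPrefixForURI() and getURIForPrefix(), behave like searches over a list of (prefix, URI) pairs that return the empty string on a miss. The sketch below models that behavior with standard containers; NamespaceList, uriForPrefix, and prefixForUri are hypothetical names and do not reflect how the reader stores its namespaces internally.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the reader's namespace list: (prefix, URI) pairs.
using NamespaceList = std::vector<std::pair<std::string, std::string>>;

// Returns the URI bound to prefix, or the empty string if prefix is unknown.
std::string uriForPrefix(const NamespaceList& list, const std::string& prefix) {
    for (const auto& ns : list)
        if (ns.first == prefix) return ns.second;
    return "";
}

// Returns the first prefix bound to uri, or the empty string if uri is unknown.
std::string prefixForUri(const NamespaceList& list, const std::string& uri) {
    for (const auto& ns : list)
        if (ns.second == uri) return ns.first;
    return "";
}
```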
std::string rwsf::XmlReader::getVersion | ( | ) | const |
Returns the value of the source document's "version=" declaration if it exists, the empty string otherwise.
bool rwsf::XmlReader::hasEncoding | ( | ) | const |
Returns true if the source XML document explicitly specifies an encoding. Returns false if the document does not specify an encoding, i.e. the encoding was automatically determined from the first few bytes of the XML document.
bool rwsf::XmlReader::hasStandalone | ( | ) | const |
Returns true if a "standalone=" declaration exists in the source document's XML declaration, false otherwise.
bool rwsf::XmlReader::isElementNext | ( | const rwsf::XmlName & | name | ) |
Returns true if the next element's name matches name, false otherwise.
bool rwsf::XmlReader::isElementNext | ( | const std::string & | name | ) |
Returns true if the next element's name matches name, false otherwise.
std::string rwsf::XmlReader::readElement | ( | const rwsf::XmlName & | name = NullName | ) |
Reads in the next element from the current document and returns the entire element. A name can be provided, in which case the element's name must match, or an exception is thrown.
This method returns the entire XML for the element, rooted at the element (in other words, the element's start and end tag will be a part of the resulting string). Also returned is all content and child tags with their content. In effect, the method grabs the element wholesale and gives it to you in string form.
If a qualified name is required for name, name must be an instance of XmlName. Any element or type name used in an std::string is considered an unqualified local name, even if it contains a namespace prefix and/or URI.
rwsf::XmlParseException | The current element's name is not the provided name. |
rwsf::XmlParseException | The element's XML is invalid or malformed. |
std::string rwsf::XmlReader::readElement | ( | const std::string & | name | ) |
Reads in the next element from the current document and returns the entire element. A name can be provided, in which case the element's name must match, or an exception is thrown.
This method returns the entire XML for the element, rooted at the element (in other words, the element's start and end tag will be a part of the resulting string). Also returned is all content and child tags with their content. In effect, the method grabs the element wholesale and gives it to you in string form.
If a qualified name is required for name, name must be an instance of XmlName. Any element or type name used in an std::string is considered an unqualified local name, even if it contains a namespace prefix and/or URI.
rwsf::XmlParseException | The current element's name is not the provided name. |
rwsf::XmlParseException | The element's XML is invalid or malformed. |
void rwsf::XmlReader::readElementEnd | ( | const rwsf::XmlName & | name | ) |
Reads the next node in the document. If the node is not an end tag matching name, throws an exception of type rwsf::XmlParseException.
rwsf::XmlParseException | The current element's name is not the provided name. |
rwsf::XmlParseException | The next tag is not an end tag. |
void rwsf::XmlReader::readElementEnd | ( | ) |
Reads the next node in the document. If the node is not an end tag, throws an exception.
rwsf::XmlParseException | The next tag is not an end tag. |
rwsf::XmlAttributeSet rwsf::XmlReader::readElementStart | ( | const rwsf::XmlName & | name | ) |
Reads the next node in the document. If the node is not a start tag, or the node's name does not match name, throws an exception of type rwsf::XmlParseException. Returns any attributes found inside the tag.
rwsf::XmlParseException | The current element's name is not the provided name. |
rwsf::XmlParseException | The next tag is not a start tag. |
void rwsf::XmlReader::readElementStart | ( | ) |
Reads the next node in the document. If the node is not a start tag, throws an exception.
rwsf::XmlParseException | The next tag is not a start tag. |
std::string rwsf::XmlReader::readElementValue | ( | ) |
Reads and returns the next element content from the document. The element's start and end tags are not included in the returned string. If getExpandCommentReference() returns false, comments will not be included in the output. If getExpandContentReference() returns false, the output will contain entity references (&lt;, &gt;, etc.). Otherwise, comments are included and entity references are expanded, respectively.
void rwsf::XmlReader::readNextNode | ( | ) |
Reads the next start tag, empty tag, end tag, or content from the document. Use getLastNodeType(), getLastName(), and getLastContent() to retrieve information on what was read. If a well-formedness error is encountered while reading the document, an exception of type rwsf::XmlParseException is thrown.
std::string rwsf::XmlReader::readWellFormedElement | ( | const rwsf::XmlName & | name = NullName | ) |
This method functions exactly like readElement(), except that it adds namespace declarations to the element's start tag to allow the element to be well formed. This includes namespaces declared on parent elements that are in use by this element or one of its children. You can expect that the element alone will be able to resolve its namespaces internally, even if they were declared external to this element.
rwsf::XmlParseException | The current element's name is not the provided name. |
rwsf::XmlParseException | The element's XML is invalid or malformed. |
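The effect readWellFormedElement() is described as having, adding inherited namespace declarations to the element's start tag, can be sketched with plain string handling. The helper declareNamespaces is hypothetical and much simpler than a real implementation: it assumes the input begins with a start tag, that none of the given prefixes is already declared on it, and that the URIs need no escaping.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Inserts an xmlns declaration for each inherited prefix into the start tag.
// Hypothetical helper; see the assumptions stated above.
std::string declareNamespaces(const std::string& element,
                              const std::map<std::string, std::string>& inherited) {
    // The element name ends at the first whitespace, '/' or '>'.
    std::size_t nameEnd = element.find_first_of(" \t\r\n/>");
    if (nameEnd == std::string::npos) return element;   // not a tag; leave untouched
    std::string decls;
    for (const auto& ns : inherited)
        decls += " xmlns:" + ns.first + "=\"" + ns.second + "\"";
    return element.substr(0, nameEnd) + decls + element.substr(nameEnd);
}
```

With the prefix p bound to urn:example on a parent element, the fragment `<p:item>1</p:item>` becomes `<p:item xmlns:p="urn:example">1</p:item>` in this sketch, so the fragment can be parsed on its own.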
void rwsf::XmlReader::setExpandAttributeReference | ( | bool | expandReference | ) |
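Sets whether the reader expands entity references in attributes. When expandReference is true (the default), the reader replaces each entity reference in an attribute value (for example, &lt;) with the character it represents; when expandReference is false, entity references are left unexpanded.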
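What expanding entity references means can be shown with a self-contained helper. The function expandPredefinedEntities below is hypothetical and is not the reader's implementation; it handles only the five predefined XML entities and leaves numeric character references and unknown entities untouched.

```cpp
#include <cstddef>
#include <string>

// Replaces the five predefined XML entity references with their characters.
// Unknown references are copied through unchanged.
std::string expandPredefinedEntities(const std::string& in) {
    static const struct { const char* ref; char ch; } table[] = {
        {"&lt;", '<'}, {"&gt;", '>'}, {"&amp;", '&'}, {"&quot;", '"'}, {"&apos;", '\''}
    };
    std::string out;
    std::size_t i = 0;
    while (i < in.size()) {
        bool matched = false;
        if (in[i] == '&') {
            for (const auto& e : table) {
                const std::string ref(e.ref);
                if (in.compare(i, ref.size(), ref) == 0) {
                    out += e.ch;          // emit the character the reference stands for
                    i += ref.size();
                    matched = true;
                    break;
                }
            }
        }
        if (!matched) out += in[i++];     // ordinary character or unknown reference
    }
    return out;
}
```

Under this sketch, an attribute value written as `a &lt; b` corresponds to `a < b` when expansion is on, and stays `a &lt; b` when it is off. The input is scanned once, so `&amp;lt;` yields `&lt;` rather than `<`.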
void rwsf::XmlReader::setExpandCommentReference | ( | bool | expandComment | ) |
Sets whether the reader expands comments found in XML content. The default is expandComment = false.
When expandComment is true, the reader keeps comments in the element value returned from readElement(). If expandComment is false (the default), comments are omitted from the returned value.
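The difference between the two settings can be modeled with a helper that removes comment spans from a string. The function stripComments is hypothetical and only approximates the documented effect of expandComment = false; the sample element in the note that follows is likewise an assumption, not taken from the product documentation.

```cpp
#include <cstddef>
#include <string>

// Removes every <!-- ... --> span from in. An unterminated comment is
// dropped through the end of the input.
std::string stripComments(const std::string& in) {
    std::string out;
    std::size_t pos = 0;
    while (pos < in.size()) {
        std::size_t open = in.find("<!--", pos);
        if (open == std::string::npos) {          // no more comments: copy the rest
            out.append(in, pos, std::string::npos);
            break;
        }
        out.append(in, pos, open - pos);          // copy text before the comment
        std::size_t close = in.find("-->", open + 4);
        if (close == std::string::npos) break;    // unterminated comment
        pos = close + 3;
    }
    return out;
}
```

For an element such as `<note>Hello<!-- draft --> world</note>`, this model gives `<note>Hello world</note>` when comments are not expanded, and the unchanged text when they are.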
void rwsf::XmlReader::setExpandContentReference | ( | bool | expandReference | ) |
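Sets whether the reader expands entity references in content. When expandReference is true (the default), the reader replaces each entity reference (for example, &amp;) in the element value returned from readElement() with the character it represents; when expandReference is false, entity references are left unexpanded.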
rwsf::XmlName rwsf::XmlReader::NullName | static |
Static constant rwsf::XmlName that contains an empty prefix and an empty namespace URI.
Copyright © 2019 Rogue Wave Software, Inc. All Rights Reserved. |