HTTP and International Documents
The HTTP package allows documents in different locales and character sets to be downloaded and processed in a C++ application. Combined with the Internationalization Module of SourcePro Core, it provides a complete solution for working with documents in various character sets.
Example 21 illustrates using HTTP with the Internationalization Module.
NOTE: The following example uses classes from the Internationalization Module of SourcePro Core. For more information on the Internationalization Module, refer to the Internationalization Module User’s Guide and the SourcePro API Reference Guide.
Example 21 – Using HTTP with the Internationalization Module
// Create a URL for the target web page.
RWURL url("http://www.amazon.co.jp/");
// Create a string to hold the charset, default to US-ASCII.
RWCString charset = "US-ASCII";
// Connect to the web server and retrieve the page specified.
RWHttpAgent agent;
RWHttpReply reply = agent.executeGet(url);
// Check and see if a Content-Type header is present.
RWHttpHeaderList headers = reply.getHeaders();
size_t index = headers.index("Content-Type");
if (index != RW_NPOS)
{
    // A Content-Type header is present, extract it.
    RWHttpContentTypeHeader ctHeader(headers[index]);
    // Check and see if a charset is present.
    RWCString tmp = ctHeader.getParameterValue("charset");
    if (!tmp.isNull())
    {
        // We found an alternate charset.
        charset = tmp;
    }
}
// Create converters from the original charset of the message
// to UTF-8.
RWUToUnicodeConverter fromMsgCharset(charset);
RWUFromUnicodeConverter toUtf8("UTF-8");
// Create a RWUString from the body of the message.
RWUString body(reply.getBody(), fromMsgCharset);
// Output the body of the message as UTF-8.
cout << body.toBytes(toUtf8) << endl;
return 0;
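The charset lookup in Example 21 relies on the Content-Type header carrying an optional charset parameter, for example "text/html; charset=Shift_JIS". The following is a minimal sketch of that lookup in standard C++ only, to show what the step amounts to. It is not SourcePro code: the names extractCharset, trim, toLower, and checkExtractCharset are hypothetical helpers introduced for illustration, and the sketch assumes parameter values do not contain a ';' inside quotes.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: strip ASCII whitespace from both ends of s.
static std::string trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos)
        return std::string();
    const std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Hypothetical helper: ASCII lowercase copy, since parameter names are
// compared without regard to case.
static std::string toLower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Hypothetical helper: return the charset parameter of a Content-Type
// header value, or the fallback when no charset parameter is present.
// Assumption: no ';' appears inside a quoted parameter value.
std::string extractCharset(const std::string& contentType,
                           const std::string& fallback = "US-ASCII")
{
    // Parameters follow the media type, each introduced by ';'.
    std::string::size_type pos = contentType.find(';');
    while (pos != std::string::npos)
    {
        const std::string::size_type next = contentType.find(';', pos + 1);
        const std::string param = (next == std::string::npos)
            ? contentType.substr(pos + 1)
            : contentType.substr(pos + 1, next - pos - 1);

        const std::string::size_type eq = param.find('=');
        if (eq != std::string::npos &&
            toLower(trim(param.substr(0, eq))) == "charset")
        {
            std::string value = trim(param.substr(eq + 1));
            // Remove surrounding double quotes, if any.
            if (value.size() >= 2 && value.front() == '"' && value.back() == '"')
                value = value.substr(1, value.size() - 2);
            if (!value.empty())
                return value;
        }
        pos = next;
    }
    return fallback;
}

// Small self-check showing the intended behavior.
void checkExtractCharset()
{
    assert(extractCharset("text/html; charset=Shift_JIS") == "Shift_JIS");
    assert(extractCharset("text/html") == "US-ASCII");
}
```

As in Example 21, the fallback is US-ASCII when the server sends no charset. The returned name would then be handed to a converter such as RWUToUnicodeConverter, which is responsible for recognizing the encoding.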