Oracle® XML Developer's Kit Programmer's Guide 10g Release 1 (10.1) Part Number B10794-01
Oracle XDK C/C++ components are built on W3C recommendations. The standards supported in release 10.1 are:
XML 1.0 (Second Edition)
DOM Level 2.0 Specifications
  DOM Level 2.0 Core
  DOM Level 2.0 Traversal and Range
SAX 2.0 and SAX Extensions
XSLT/XPath Specifications
  XSL Transformations (XSLT) 1.0
  XML Path Language (XPath) 1.0
XML Schema Specifications
  XML Schema Part 0: Primer
  XML Schema Part 1: Structures
  XML Schema Part 2: Datatypes
XDK C components are the basic building blocks for reading, manipulating, transforming, and validating XML documents. Oracle XDK C components consist of the following:
XML Parser for C: checks whether an XML document is well-formed, and optionally validates it against a DTD. The parser either constructs an object tree that can be accessed through a DOM interface, or operates serially through a SAX interface.
XSLT Processor for C: transforms an XML document according to a stylesheet. The processor is bundled with the parser.
XVM: high-performance XSLT transformation engine.
XML Schema Processor for C: supports parsing and validating XML files against an XML Schema definition file.
If you have installed the Oracle database or iAS (Application Server), you already have the XDK C components installed. You can also download the latest versions of XDK C components from OTN by following these steps:
Go to the URL:
http://otn.oracle.com/tech/xml/content.html
Click the Software link in the right-hand bar.
Log on with your OTN username and password (registration is free if you don't already have an account).
Select the Windows or UNIX version to download.
Accept all conditions in the licensing agreement.
Click the appropriate *.tar.gz or *.zip file.
Extract the files in the distribution:
Choose a directory under which you would like the xdk directory and subdirectories to go.
Change to that directory; then extract the XDK download archive file using:
UNIX: tar xvfz xdk_xxx.tar.gz
Windows: use the WinZip visual archive extraction tool
After installing the UNIX version of XDK, the directory structure is:
-$XDK_HOME
   | - bin: executable files
   | - lib: library files
   | - nls/data: Globalization Support data files (*.nlb)
   | - xdk
   |    | - demo/c: demonstration code
   |    | - doc/c: documentation
   |    | - public: header files
   |    | - mesg: message files (*.msb)
Here are all the libraries that come with the UNIX version of XDK C components:
Table 13-1 XDK C Components Libraries
Component | Library | Notes |
---|---|---|
XML Parser, XSLT Processor, XML Schema Processor | libxml10.a | XML Parser for C, which includes DOM, SAX, and XSLT APIs; XML Schema Processor for C |
The XDK C components (UNIX) depend on the Oracle CORE and Globalization Support libraries in the following table:
The parser may be called as an executable by invoking bin/xml, which has the following options:
Table 13-3 Parser Command Line Options
Option | Meaning |
---|---|
-c | Conformance check only, no validation |
-e encoding | Specify default input file encoding ("incoding") |
-E encoding | Specify DOM/SAX encoding ("outcoding") |
-f file | File - Interpret as filespec, not URI |
-h | Help - show usage help and full list of flags |
-i n | Number of times to iterate the XSLT processing |
-l language | Language for error reporting |
-n | Traverse DOM and report number of elements |
-o XSLoutfile | Specify output file of XSLT processor |
-p | Print document after parsing |
-r | Do not ignore <xsl:output> instruction in XSLT processing |
-s stylesheet | Style sheet - specifies the XSL style sheet |
-v | Version - display parser version and then exit |
-V var value | To test top level variables in CXSLT |
-w | Whitespace - preserve all whitespace |
-W | Warning - stop parsing after a warning |
-x | SAX - exercise SAX interface and print document |
Check that the environment variable ORA_NLS10 is set to point to the location of the Globalization Support data files. If the Oracle database is installed, you can set it as follows:
setenv ORA_NLS10 ${ORACLE_HOME}/nls/data
If no Oracle database is installed, you can use the Globalization Support data files that come with the XDK release by setting:
setenv ORA_NLS10 ${XDK_HOME}/nls/data
Error message files are provided in the mesg subdirectory. Files ending in .msb are machine-readable and needed at runtime; files ending in .msg are human-readable and contain cause and action descriptions for each error. The message files also exist in the $ORACLE_HOME/xdk/mesg directory.
If you do not have an ORACLE_HOME, check that the environment variable ORA_XML_MESG is set to point to the absolute path of the mesg directory. If the Oracle database is installed, you can set ORA_XML_MESG, although this is not required:
setenv ORA_XML_MESG ${ORACLE_HOME}/xdk/mesg
If no Oracle database is installed, you must set the environment variable ORA_XML_MESG to point to the absolute path of the mesg subdirectory:
setenv ORA_XML_MESG ${XDK_HOME}/xdk/mesg
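The environment rules above can also be checked from an application before it starts using the XDK. The following is a minimal sketch, not part of the XDK: `resolve_xdk_dir` is a hypothetical helper that returns the value of an environment variable if it is set, and otherwise falls back to a subdirectory of ORACLE_HOME. It only mirrors the rules stated in the text and makes no claim about how the XDK libraries themselves locate their files.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not an XDK API): resolve a directory the XDK needs.
 * Prefers the named environment variable; otherwise falls back to
 * $ORACLE_HOME/<home_subdir>. Returns 0 and fills buf on success, or -1 if
 * neither variable is set or buf is too small.
 */
int resolve_xdk_dir(const char *var, const char *home_subdir,
                    char *buf, size_t size)
{
    const char *value = getenv(var);
    const char *home;
    int n;

    if (value != NULL && value[0] != '\0') {
        /* The variable is set explicitly: use it as-is. */
        n = snprintf(buf, size, "%s", value);
    } else {
        /* Fall back to the Oracle installation, if there is one. */
        home = getenv("ORACLE_HOME");
        if (home == NULL || home[0] == '\0')
            return -1;
        n = snprintf(buf, size, "%s/%s", home, home_subdir);
    }
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

For example, `resolve_xdk_dir("ORA_NLS10", "nls/data", buf, sizeof buf)` and `resolve_xdk_dir("ORA_XML_MESG", "xdk/mesg", buf, sizeof buf)` cover the two variables. A return value of -1 corresponds to the case where no Oracle database is installed and the variable must be set explicitly.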
The parser may also be invoked by writing code to use the supplied APIs. The code must be compiled using the headers in the include subdirectory and linked against the libraries in the lib subdirectory. See Makefile in the demo subdirectory for full details of how to build your program.
To get the XDK version you are using on UNIX:
strings libxml10.a | grep -i Version
These are the Windows libraries that come with the XDK C components:
Table 13-4 XDK C Components Libraries on Windows
Component | Library | Notes |
---|---|---|
XML Parser, XSLT Processor, XML Schema Processor | oraxml10.lib, oraxml10.dll | XML Parser for C, which includes DOM, SAX, and XSLT APIs; XML Schema Processor for C |
The XDK C components (Windows) depend on the Oracle CORE and Globalization Support libraries in the following table:
Table 13-5 Dependent Libraries of XDK C Components on Windows
Component | Library | Notes |
---|---|---|
CORE Library | oracore10.dll | Oracle CORE library |
Globalization Support Library | oranls10.dll | Oracle Globalization Support common library |
Globalization Support Library | oraunls10.dll | Oracle Globalization Support library for Unicode support |
For the parser and schema validator options, see Table 13-3, "Parser Command Line Options".
Check that the environment variable ORA_NLS10 is set to point to the location of the Globalization Support encoding definition files. You can set it this way:
set ORA_NLS10=%ORACLE_HOME%\nls\data
If no Oracle database is installed, you can use the Globalization Support encoding definition files that come with the XDK release (a subset of which are in the Oracle database):
set ORA_NLS10=%XDK_HOME%\nls\data
Error message files are provided in the mesg subdirectory. Files ending in .msb are machine-readable and needed at runtime; files ending in .msg are human-readable and include cause and action descriptions for each error. The message files also exist in the %ORACLE_HOME%\xdk\mesg directory.
If an Oracle database is installed, you can set ORA_XML_MESG, although this is not required:
set ORA_XML_MESG=%ORACLE_HOME%\xdk\mesg
If no Oracle database is installed, you must set the environment variable ORA_XML_MESG to point to the absolute path of the mesg subdirectory:
set ORA_XML_MESG=%XDK_HOME%\xdk\mesg
To compile the sample code, you must set the path for the cl compiler.
Go to the Start menu and select Settings > Control Panel. In the Control Panel window, double-click the System icon to open the System Properties window. Select the Environment tab and add the path of cl.exe to the PATH variable, as shown in Figure 13-1, "Setting the Path for the cl Compiler in Windows".
Update Make.bat by adding the paths of the libraries and the header files to the compile and link commands, as shown in the following example of a Make.bat file:
:COMPILE
set filename=%1
cl -c -Fo%filename%.obj %opt_flg% /DCRTAPI1=_cdecl /DCRTAPI2=_cdecl /nologo /Zl /Gy /DWIN32 /D_WIN32 /DWIN_NT /DWIN32COMMON /D_DLL /D_MT /D_X86_=1 /Doratext=OraText -I. -I..\..\..\include -ID:\Progra~1\Micros~1\VC98\Include %filename%.c
goto :EOF

:LINK
set filename=%1
link %link_dbg% /out:..\..\..\..\bin\%filename%.exe /libpath:%ORACLE_HOME%\lib /libpath:D:\Progra~1\Micros~1\VC98\lib /libpath:..\..\..\..\lib %filename%.obj oraxml10.lib oracore10.lib oranls10.lib oraunls10.lib user32.lib kernel32.lib msvcrt.lib ADVAPI32.lib oldnames.lib winmm.lib
:EOF
where D:\Progra~1\Micros~1\VC98\Include is the path for header files and D:\Progra~1\Micros~1\VC98\lib is the path for library files.
If you are using the Microsoft Visual C++ compiler:
Check that the environment variable ORA_NLS10 is set to point to the location of the Globalization Support data files.
For Visual C++ to see the environment variable, you must define it through the Windows system settings.
Go to the Start menu and select Settings > Control Panel. In the Control Panel window, double-click the System icon to open the System Properties window. Select the Environment tab and enter ORA_NLS10 with its value, for example d:\xdk\nls\data, as shown in Figure 13-2:
Figure 13-2 Setting Up the ORA_NLS10 Environment Variable
Check that the environment variable ORA_XML_MESG is set to point to the absolute path of the mesg directory.
For Visual C++ to see the environment variable, you must define it through the Windows system settings.
Go to the Start menu and select Settings > Control Panel. In the Control Panel window, double-click the System icon to open the System Properties window. Select the Environment tab and enter ORA_XML_MESG and its value, as in Figure 13-3 (the illustrations show screens for a previous release).
Figure 13-4 shows the setup of the PATH for DLLs:
After you open a workspace in Visual C++ and include the *.c files for your project, you must set the path for the project. Go to the Tools menu and select Options. In the window that opens, select the Directory tab and set your include path as shown in Figure 13-5:
Then set your library path as shown in Figure 13-6:
After setting the paths for the static libraries in %XDK_HOME%\lib, you also need to set the library names in the Visual C++ build environment.
Go to the Project menu in the menu bar and select Settings. In the window that opens, select the Link tab and, in the Object/Library Modules field, enter the names of the XDK C components libraries, as shown in Figure 13-7:
Figure 13-7 Setting Up the Static Libraries in Visual C++ Project
Optionally, compile and run the demo programs. You can then start using the XDK C components.
The parser supports over 300 IANA character sets. These character sets include the following:
UTF-8, UTF-16, UTF16-BE, UTF16-LE, US-ASCII, ISO-10646-UCS-2, ISO-8859-{1-9, 13-15}, EUC-JP, SHIFT_JIS, BIG5, GB2312, GB_2312-80, HZ-GB-2312, KOI8-R, KSC5601, EUC-KR, ISO-2022-CN, ISO-2022-JP, ISO-2022-KR, WINDOWS-{1250-1258}, EBCDIC-CP-{US,CA,NL,WT,DK,NO,FI,SE,IT,ES,GB,FR,HE,BE,CH,ROECE,YU,IS,AR1}, IBM{037,273,277,278,280,284,285,297,420,424,437,500,775,850,852,855,857,00858, 860,861,863,865,866,869,870,871,1026,01140,01141,01142,01143,01144,01145,01146, 01147,01148}
Any alias of these character sets that is listed in the IANA character set registry may also be used. In addition, any character set specified in Appendix A, Character Sets, of the Oracle Database Globalization Support Guide can be used, with the exception of IW7IS960.
However, it is recommended that you use IANA character set names for interoperability with other XML parsers. Also note that XML parsers are only required to support UTF-8 and UTF-16, so those character sets should be preferred.
To use these encodings, the ORACLE_HOME environment variable should be set and point to the location of your Oracle installation. This enables the use of the Globalization Support data files, which contain data for all supported encodings. On UNIX systems, they are usually in $ORACLE_HOME/nls/data. On Windows, they are usually in %ORACLE_HOME%\nls\data. C and C++ XDK releases that are downloaded from OTN contain an nls/data subdirectory. If you do not have an Oracle installation, you must set the environment variable ORA_NLS10 to the absolute path of the nls/data subdirectory.
The default input encoding ("incoding") is UTF-8. If an input document's encoding is not self-evident (from the HTTP character set, Byte Order Mark, XMLDecl, and so on), then the default input encoding is assumed. It is recommended that you set the default encoding explicitly if you use only single-byte character sets (such as US-ASCII or any of the ISO-8859 character sets), since single-byte performance is by far the fastest. The flag XML_FLAG_FORCE_INCODING specifies that the default input encoding should always be applied to input documents, ignoring any BOM or XMLDecl. However, a protocol declaration (such as an HTTP character set) is always honored.
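The fallback rule for the input encoding can be illustrated with a small, self-contained sketch. It is not the parser's detection code: `guess_incoding` is a hypothetical function that looks only at UTF-8 and UTF-16 Byte Order Marks, returns the caller's default when none is present, and skips the BOM check when `force_default` is nonzero, which loosely mirrors the described effect of XML_FLAG_FORCE_INCODING. The XMLDecl and protocol-level declarations mentioned above are deliberately left out.

```c
#include <stddef.h>

/*
 * Illustrative only (not XDK code): choose an input encoding from a Byte
 * Order Mark, or fall back to default_incoding when the buffer has no BOM.
 * A nonzero force_default ignores the BOM and always returns the default.
 */
const char *guess_incoding(const unsigned char *data, size_t len,
                           const char *default_incoding, int force_default)
{
    if (!force_default) {
        /* UTF-8 BOM: EF BB BF */
        if (len >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
            return "UTF-8";
        /* UTF-16 big-endian BOM: FE FF */
        if (len >= 2 && data[0] == 0xFE && data[1] == 0xFF)
            return "UTF-16BE";
        /* UTF-16 little-endian BOM: FF FE */
        if (len >= 2 && data[0] == 0xFF && data[1] == 0xFE)
            return "UTF-16LE";
    }
    /* Encoding is not self-evident: assume the default input encoding. */
    return default_incoding;
}
```

A document without a BOM passed with a default of "US-ASCII" is therefore treated as US-ASCII, which is the single-byte case the text recommends setting explicitly.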
The data encoding for DOM and SAX ("outcoding") should be chosen carefully. Single-byte encodings are the fastest, but can represent only a very limited set of characters. Next fastest is Unicode (UTF-16), and slowest are the multibyte encodings such as UTF-8. If input data cannot be converted to the outcoding without loss, an error occurs. So for maximum utility, a Unicode-based outcoding should be used, since Unicode can represent any character. If outcoding is not specified, it defaults to the incoding of the first document parsed.
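To make the "conversion without loss" condition concrete, the sketch below checks whether UTF-8 input could be stored in a single-byte outcoding, using ISO-8859-1 as the example. It is an illustration under stated assumptions, not XDK code: `fits_iso_8859_1` is a hypothetical function, and it relies on the fact that code points U+0000 to U+00FF are exactly the UTF-8 sequences that are either one ASCII byte or a two-byte sequence whose lead byte is 0xC2 or 0xC3.

```c
#include <stddef.h>

/*
 * Illustrative only (not XDK code): return 1 if every character in a UTF-8
 * buffer can be stored in ISO-8859-1 (code points U+0000..U+00FF), else 0.
 * Malformed input also yields 0.
 */
int fits_iso_8859_1(const unsigned char *utf8, size_t len)
{
    size_t i = 0;

    while (i < len) {
        if (utf8[i] < 0x80) {
            i += 1;                 /* ASCII: one byte */
        } else if ((utf8[i] == 0xC2 || utf8[i] == 0xC3) &&
                   i + 1 < len && (utf8[i + 1] & 0xC0) == 0x80) {
            i += 2;                 /* U+0080..U+00FF: two bytes */
        } else {
            return 0;               /* above U+00FF, or malformed */
        }
    }
    return 1;
}
```

Input for which such a check fails is the kind of data that would trigger the conversion error described above, which is why a Unicode-based outcoding is the safer choice when the input character repertoire is not known in advance.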