Oracle® XML DB Developer's Guide 10g Release 1 (10.1) Part Number B10790-01
This chapter describes how to access Oracle XML DB repository data using the FTP, HTTP, and WebDAV protocols.
As described in Chapter 2, "Getting Started with Oracle XML DB" and Chapter 18, "Accessing Oracle XML DB Repository Data", Oracle XML DB repository provides a hierarchical data repository in the database modeled on XML. Oracle XML DB repository maps path names (or URLs) onto database objects of XMLType and provides management facilities for these objects.
Oracle XML DB also provides the Oracle XML DB protocol server. This supports standard Internet protocols, FTP, WebDAV, and HTTP, for accessing its hierarchical repository or file system. Because XML documents reference each other using URLs, typically HTTP URLs, Oracle XML DB repository and its protocol support are important Oracle XML DB components. These protocols can provide direct access to Oracle XML DB to many users without having to install additional software. The user names and passwords to be used with the protocols are the same as those for SQL*Plus. Enterprise users are also supported.
Note: Oracle XML DB protocols are not supported on EBCDIC platforms.
Oracle XML DB protocol server maintains a shared pool of sessions. Each protocol connection is associated with one session from this pool. After a connection is closed the session is put back into the shared pool and can be used to serve later connections.
Session pooling improves HTTP performance by avoiding the cost of re-creating session state, especially with HTTP 1.0, which creates a new connection for each request. For example, a couple of small files can be retrieved over an existing HTTP/1.1 connection in the time necessary to create a database session. You can tune the number of sessions in the pool by setting session-pool-size in the Oracle XML DB configuration file, xdbconfig.xml, or disable pooling by setting the pool size to zero.
Session pooling can affect users writing Java servlets, because a request can see session state that was initialized by an earlier request for a different user. Hence, servlet writers should use session memory, such as Java static variables, only to hold data for the entire application rather than for a particular user. State for each user must be stored in the database or in a look-up table, rather than assuming that a session will only exist for a single user.
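The effect of pooling on per-user state can be illustrated with a small model. The sketch below is written in Python for brevity and is not Oracle code: the Session and SessionPool classes and the current_user key are hypothetical names invented for this illustration.

```python
class Session:
    """A pooled session; 'state' stands in for session memory (e.g. Java statics)."""

    def __init__(self):
        self.state = {}


class SessionPool:
    """Hypothetical model of a shared pool: sessions are reused, not reset."""

    def __init__(self, size):
        self.size = size
        self.idle = []

    def acquire(self):
        # Reuse an idle session when one exists; otherwise create a new one.
        return self.idle.pop() if self.idle else Session()

    def release(self, session):
        # Keep at most 'size' idle sessions; a size of zero disables pooling.
        if len(self.idle) < self.size:
            self.idle.append(session)


pool = SessionPool(size=1)

# A request for user A stores per-user data in session memory (the mistake).
s1 = pool.acquire()
s1.state["current_user"] = "A"
pool.release(s1)

# A later request for user B is served by the same session and sees A's data.
s2 = pool.acquire()
leaked = s2.state.get("current_user")
pool.release(s2)
```

In this model the second request finds the first user's value, which is why per-user state belongs in the database or in a look-up table keyed by user.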
Figure 24-1 illustrates the Oracle XML DB protocol server components and how they are used to access files in Oracle XML DB repository and other data. Only the relevant components of the repository are shown.
Figure 24-1 Oracle XML DB Architecture: Protocol Server
Oracle XML DB protocol server uses configuration parameters stored in /xdbconfig.xml to initialize its startup state and manage session-level configuration. The following section describes the protocol-specific configuration parameters that you can configure in the Oracle XML DB configuration file. The session pool size and timeout parameters cannot be changed dynamically; that is, you must restart the database for changes to them to take effect.
Table 24-1 shows the parameters common to all protocols. All parameter names in this table, except those starting with /xdbconfig, are relative to the following XPath in the Oracle XML DB configuration schema:
/xdbconfig/sysconfig/protocolconfig/common
FTP-specific parameters. Table 24-2 shows the FTP-specific parameters. These are relative to the following XPath in the Oracle XML DB configuration schema:
/xdbconfig/sysconfig/protocolconfig/ftpconfig
HTTP/WebDAV specific parameters except servlet-related parameters. Table 24-3 shows the HTTP/WebDAV-specific parameters. These parameters are relative to the following XPath in the Oracle XML DB configuration schema:
/xdbconfig/sysconfig/protocolconfig/httpconfig
See Also: For examples of the usage of these parameters, see the configuration file /xdbconfig.xml.
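As a rough illustration of how the XPaths above address individual parameters, the following Python sketch reads values from a simplified configuration fragment. The fragment is an assumption made for the example: its nesting follows the XPaths given in this section, but a real /xdbconfig.xml contains many more elements and may declare XML namespaces, and the values marked as placeholders are not defaults taken from this guide. The get_param helper is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Simplified, illustrative fragment. Element nesting follows the XPaths in the
# text; values marked "placeholder" are not defaults taken from the guide.
CONFIG = """
<xdbconfig>
  <sysconfig>
    <call-timeout>300</call-timeout> <!-- placeholder value -->
    <protocolconfig>
      <common>
        <session-pool-size>50</session-pool-size> <!-- placeholder value -->
      </common>
      <ftpconfig>
        <ftp-port>2100</ftp-port>
        <ftp-protocol>tcp</ftp-protocol>
      </ftpconfig>
      <httpconfig>
        <http-port>8080</http-port>
        <http-protocol>tcp</http-protocol>
      </httpconfig>
    </protocolconfig>
  </sysconfig>
</xdbconfig>
"""


def get_param(root, xpath):
    """Return the text at an absolute /xdbconfig/... path, or None if absent."""
    prefix = "/xdbconfig/"
    if not xpath.startswith(prefix):
        raise ValueError("expected a path starting with /xdbconfig/")
    # ElementTree paths are relative to the root element, so drop the prefix.
    node = root.find(xpath[len(prefix):])
    return None if node is None else node.text


root = ET.fromstring(CONFIG)
ftp_port = get_param(root, "/xdbconfig/sysconfig/protocolconfig/ftpconfig/ftp-port")
http_port = get_param(root, "/xdbconfig/sysconfig/protocolconfig/httpconfig/http-port")
call_timeout = get_param(root, "/xdbconfig/sysconfig/call-timeout")
```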
Table 24-1 Common Protocol Configuration Parameters
Parameter | Description |
---|---|
extension-mappings/mime-mappings | Specifies the mapping of file extensions to MIME types. When a resource is stored in the Oracle XML DB repository and its MIME type is not specified, this list of mappings is used to set its MIME type. |
extension-mappings/lang-mappings | Specifies the mapping of file extensions to languages. When a resource is stored in the Oracle XML DB repository and its language is not specified, this list of mappings is used to set its language. |
extension-mappings/encoding-mappings | Specifies the mapping of file extensions to encodings. When a resource is stored in the Oracle XML DB repository and its encoding is not specified, this list of mappings is used to set its encoding. |
xml-extensions | Specifies the list of filename extensions that are treated as XML content by Oracle XML DB. |
session-pool-size | Maximum number of sessions that are kept in the protocol server session pool. |
/xdbconfig/sysconfig/call-timeout | If a connection is idle for this time (in hundredths of a second), then the shared server serving the connection is freed up to serve other connections. |
session-timeout | Time (in hundredths of a second) after which a session (and consequently the corresponding connection) is terminated by the protocol server if the connection has been idle for that time. This parameter is used only if the protocol-specific session timeout is not present in the configuration. |
schemaLocation-mappings | Specifies the default schema location for a given namespace. This is used if the instance XML document does not contain an explicit xsi:schemaLocation attribute. |
/xdbconfig/sysconfig/default-lock-timeout | Time after which a WebDAV lock on a resource becomes invalid. This can be overridden by a Timeout specified by the client that locks the resource. |
Table 24-2 Configuration Parameters Specific to FTP
Parameter | Description |
---|---|
ftp-port | Port on which the FTP server listens. By default this is 2100. |
ftp-protocol | Protocol over which the FTP server runs. By default this is tcp. |
session-timeout | Time (in hundredths of a second) after which an FTP session (and consequently the corresponding connection) is terminated by the protocol server if the connection has been idle for that time. |
Table 24-3 Configuration Parameters Specific to HTTP/WebDAV (Except Servlet Parameters)
Parameter | Description |
---|---|
http-port | Port on which the HTTP/WebDAV server listens. |
http-protocol | Protocol over which the HTTP/WebDAV server runs. By default this is tcp. |
session-timeout | Time (in hundredths of a second) after which an HTTP session (and consequently the corresponding connection) is terminated by the protocol server if the connection has been idle for that time. |
max-header-size | Maximum size (in bytes) of an HTTP header. |
max-request-body | Maximum size (in bytes) of an HTTP request body. |
webappconfig/welcome-file-list | List of filenames that are considered welcome files. When an HTTP GET request for a container is received, the server first checks whether the container holds a resource with any of these names. If so, then the contents of that file are sent instead of a list of resources in the container. |
default-url-charset | The character set in which the HTTP protocol server assumes an incoming URL is encoded when the URL is encoded neither in UTF-8 nor in the character set given by the charset parameter of the request's Content-Type field. |
The protocol specifications, RFC 959 (FTP), RFC 2616 (HTTP), and RFC 2518 (WebDAV) implicitly assume an abstract, hierarchical file system on the server side. This is mapped to the Oracle XML DB hierarchical repository. Oracle XML DB repository provides features such as:
Name resolution
Access control list (ACL)-based security. An ACL is a list of access control entries that determine which principals have access to a given resource or resources. See also Chapter 23, "Oracle XML DB Resource Security".
The ability to store and retrieve any content. Oracle XML DB repository can store both binary data input through FTP and XML schema-based documents.
Oracle XML DB protocol server enhances the protocols by always checking if XML documents being inserted are based on XML schemas registered in the repository.
If the incoming XML document specifies an XML schema, then the Oracle XML DB storage to use is decided by that XML schema. This functionality comes in handy when you must store XML documents object-relationally in the database, using simple protocols like FTP or WebDAV instead of having to write SQL statements.
If the incoming XML document is not XML schema-based, then it is stored as a binary document.
In certain cases, it may be useful to log the requests received and responses sent by a protocol server. This can be achieved by setting event number 31098 to level 2. To set this event, add the following line to your init.ora file and restart the database:
event="31098 trace name context forever, level 2"
The following sections describe FTP features supported by Oracle XML DB.
File Transfer Protocol (FTP) is one of the oldest and most popular protocols on the Internet. FTP is specified in RFC 959 and provides access to heterogeneous file systems in a uniform manner. FTP works by providing well-defined commands for communication between the client and the server. Commands and their return status are transferred on a single connection, whereas a new connection is opened between the client and the server for data transfer. In HTTP, by contrast, commands and data are transferred on a single connection.
FTP is implemented by dedicated clients at the operating system level, by file system explorer clients, and by browsers. FTP is typically session-oriented, in that a user session is created through an explicit logon, a number of files or directories are downloaded and browsed, and then the connection is closed.
Oracle XML DB implements FTP, as defined by RFC 959, with the exception of the following optional features:
Record-oriented files. Only the FILE structure of the STRU command is supported. This is the most widely used structure for file transfer and is also the default in the specification. Structure mount is not supported.
Append.
Allocate. This pre-allocates space before file transfer.
Account. This uses the insecure Telnet protocol.
Abort.
The FTP server can be configured, through the Oracle XML DB configuration file /xdbconfig.xml, to listen on an arbitrary port. FTP ships listening on a non-standard, non-protected port. To use FTP on the standard port (21), your DBA has to chown the TNS listener to setuid ROOT rather than setuid ORACLE.
The protocol server also provides session management for this protocol. After a short wait for a new command, FTP returns to the protocol layer and the shared server is freed up to serve other connections. The duration of this short wait is configurable by changing the call-timeout parameter in the Oracle XML DB configuration file. For high-traffic sites, call-timeout should be shorter so that more connections can be served. When new data arrives on the connection, the FTP server is re-invoked with the fresh data. The long-running nature of FTP therefore does not affect the number of connections that can be made to the protocol server.
Oracle Database supports two FTP quote commands to control character sets for different purposes: set_nls_locale and set_charset.
set_nls_locale
quote set_nls_locale {<charset_name> | NULL}
This command controls the encoding of the file and directory names that users specify in FTP commands. It also controls the encoding of the file and directory names in the responses returned to users. Only IANA character set names can be specified for this parameter. If nls_locale is set to NULL or is not set, then it defaults to the database character set.
set_charset
quote set_charset {<charset_name> | NULL}
This command specifies the character set of the data to be sent to the server. This parameter, if defined, overrides the Byte Order Mark (BOM) and the encoding declaration inside the document. The keyword NULL is used to unset the charset parameter. The BOM, defined in the Unicode Standard, is a signature that indicates the byte order of the stream that follows it.
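As a hedged illustration, the two quote commands can be issued from a script as well as from an interactive FTP client. The sketch below uses Python's standard ftplib module; sendcmd sends a raw command line, which is what an interactive client's quote does. The function names, host, credentials, and file names are hypothetical, and the server's replies to these commands are not described in this guide, so the sketch does not inspect them.

```python
from ftplib import FTP


def charset_commands(nls_locale=None, charset=None):
    """Build the raw command lines that 'quote ...' would send."""
    return [
        "set_nls_locale " + (nls_locale or "NULL"),
        "set_charset " + (charset or "NULL"),
    ]


def upload(host, user, password, local_path, remote_name,
           port=2100, nls_locale="UTF-8", charset="UTF-8"):
    """Upload one file after setting both character-set parameters."""
    ftp = FTP()
    ftp.connect(host, port)        # 2100 is the default port listed in Table 24-2
    ftp.login(user, password)      # same user name and password as for SQL*Plus
    try:
        for command in charset_commands(nls_locale, charset):
            ftp.sendcmd(command)   # equivalent of typing 'quote <command>'
        with open(local_path, "rb") as handle:
            ftp.storbinary("STOR " + remote_name, handle)
    finally:
        ftp.quit()
```

A call such as upload("dbhost", "jdoe", "secret", "po.xml", "po.xml") would need a reachable Oracle XML DB FTP listener; all of those argument values are placeholders.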
The algorithm used to determine the character encoding of incoming data is as follows:
The charset parameter value, if it is not NULL, determines the character set.
If the charset parameter value is NULL, the MIME type of the data is evaluated.
If the MIME type is */xml, then the character set is determined by the presence of the BOM and the encoding declaration inside the XML document. Otherwise, the database character set is used.
Text documents are assumed to be in the database character set if the set_charset command has not been issued. This parameter does not apply to binary (non-text) files.
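The decision steps above can be sketched as a small function. This is an illustration of the described order, not Oracle's implementation. The function and parameter names are hypothetical, and the text does not say which character set applies to an XML document that has neither a BOM nor an encoding declaration; the sketch falls back to the database character set in that case, which is an assumption.

```python
import codecs
import re


def xml_declared_encoding(data):
    """Return the encoding implied by a BOM or XML declaration, or None."""
    if data.startswith(codecs.BOM_UTF8):
        return "UTF-8"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "UTF-16"
    match = re.match(rb"""<\?xml[^>]*encoding=["']([A-Za-z0-9._-]+)["']""", data)
    return match.group(1).decode("ascii") if match else None


def ftp_charset(charset_param, mime_type, data, db_charset):
    """Pick the character set for data uploaded through FTP."""
    # Step 1: an explicit set_charset value overrides everything else.
    if charset_param is not None:
        return charset_param
    # Steps 2 and 3: for */xml, consult the BOM and the encoding declaration.
    if mime_type.endswith("/xml"):
        detected = xml_declared_encoding(data)
        if detected is not None:
            return detected
        # Assumption: with neither marker present, use the database charset.
    return db_charset
```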
If you are frequently disconnected from the server and have to reconnect and traverse the entire directory before doing the next operation, you may need to modify the default timeout value for FTP sessions. If a session is idle for more than this period, it is disconnected. You can increase the timeout value (default = 6000 centiseconds) by modifying the FTP session-timeout parameter in the configuration document and then restarting the database.
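A minimal sketch of such a change follows, assuming a simplified ftpconfig fragment whose element names are taken from Table 24-2. A real configuration document contains more elements and may declare XML namespaces, and writing the edited document back to the database is not shown here. The helper name set_ftp_session_timeout is hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified fragment; element names come from Table 24-2.
FRAGMENT = """
<ftpconfig>
  <ftp-port>2100</ftp-port>
  <ftp-protocol>tcp</ftp-protocol>
  <session-timeout>6000</session-timeout>
</ftpconfig>
"""


def set_ftp_session_timeout(ftpconfig, centiseconds):
    """Set session-timeout (hundredths of a second), creating it if missing."""
    if centiseconds <= 0:
        raise ValueError("timeout must be positive")
    node = ftpconfig.find("session-timeout")
    if node is None:
        node = ET.SubElement(ftpconfig, "session-timeout")
    node.text = str(centiseconds)
    return ftpconfig


ftpconfig = ET.fromstring(FRAGMENT)
set_ftp_session_timeout(ftpconfig, 10 * 60 * 100)   # ten minutes in centiseconds
updated_xml = ET.tostring(ftpconfig, encoding="unicode")
```

Because the value is expressed in hundredths of a second, ten minutes is 60000, not 600.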
Oracle XML DB implements Hypertext Transfer Protocol (HTTP) version 1.1, as defined in RFC 2616.
The Oracle XML DB HTTP component in the Oracle XML DB protocol server implements RFC 2616 with the exception of the following optional features:
gzip and compress transfer encodings
byte-range headers
The TRACE method (used for proxy error debugging)
Cache-Control directives (these require you to specify expiration dates for content and are not generally used)
TE, Trailer, Vary, and Warning headers
Weak entity tags
Web common log format
Multi-homed Web server
Digest Authentication (RFC 2617) is not supported. In this release, Oracle XML DB supports Basic Authentication, where a client sends the user name and password in clear text in the Authorization header.
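To make the clear-text nature of Basic Authentication concrete, the sketch below builds the Authorization header in Python. The function name and the credentials are placeholders chosen for this illustration.

```python
import base64


def basic_auth_header(user, password):
    """Build the HTTP Basic Authentication header for the given credentials."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return {"Authorization": "Basic " + token}


header = basic_auth_header("jdoe", "secret")

# Base64 is an encoding, not encryption: anyone who sees the header can
# recover the user name and password.
recovered = base64.b64decode(header["Authorization"].split(" ", 1)[1]).decode("utf-8")
```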
HTTP ships listening on a non-standard, non-protected port (8080). To use HTTP on the standard port (80), your DBA must chown the TNS listener to setuid ROOT rather than setuid ORACLE, and configure the port number in the Oracle XML DB configuration file /xdbconfig.xml.
Oracle XML DB supports Java servlets. To use a servlet, you must register it with a unique name in the Oracle XML DB configuration file, along with parameters to customize its action. The servlet must also be compiled and loaded into the database. Finally, the servlet name must be associated with a pattern, which can be an extension such as *.jsp or a path name such as /a/b/c or /sys/*, as described in Java servlet application program interface (API) version 2.2.
While processing an HTTP request, the path name for the request is matched with the registered patterns. If there is a match, then the protocol server invokes the corresponding servlet with the appropriate initialization parameters. For Java servlets, the existing Java Virtual Machine (JVM) infrastructure is used. This starts the JVM if need be, which in turn runs a Java method to initialize the servlet, create the response and request objects, pass these on to the servlet, and run it.
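The matching step can be sketched as follows. The precedence used here (exact path, then longest path prefix, then extension) is the one commonly described for servlet URL patterns; whether the Oracle XML DB protocol server applies exactly this order is an assumption, and the servlet names are hypothetical.

```python
def match_servlet(path, mappings):
    """Return the servlet name registered for 'path', or None.

    'mappings' maps a pattern ("/a/b/c", "/sys/*", "*.jsp") to a servlet name.
    """
    # 1. Exact path match.
    if path in mappings:
        return mappings[path]
    # 2. Longest matching path prefix (patterns ending in "/*").
    best = None
    for pattern in mappings:
        if pattern.endswith("/*"):
            prefix = pattern[:-2]
            if path == prefix or path.startswith(prefix + "/"):
                if best is None or len(pattern) > len(best):
                    best = pattern
    if best is not None:
        return mappings[best]
    # 3. Extension match (patterns of the form "*.ext").
    last_segment = path.rsplit("/", 1)[-1]
    if "." in last_segment:
        ext_pattern = "*." + last_segment.rsplit(".", 1)[-1]
        if ext_pattern in mappings:
            return mappings[ext_pattern]
    return None


mappings = {"/a/b/c": "ExactServlet", "/sys/*": "SysServlet", "*.jsp": "JspServlet"}
```

With these mappings, a request for /sys/page.jsp resolves to SysServlet in this sketch, because the path prefix is tried before the extension.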
When a client sends multibyte data in a URL, RFC 2718 specifies that the client should send the URL using the %HH format where HH is the hexadecimal notation of the byte value in UTF-8 encoding. The following are URL examples that can be sent to XML DB in an HTTP or WebDAV context:
http://urltest/xyz%E3%81%82%E3%82%A2
http://%E3%81%82%E3%82%A2
http://%E3%81%82%E3%82%A2/abc%E3%81%86%E3%83%8F.xml
XML DB processes the requested URL, any URLs within an IF header, any URLs within the DESTINATION header, and any URLs in the REFERRED header that contains multibyte data.
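The %HH form used in the example URLs can be reproduced with Python's standard urllib.parse module, which percent-encodes the UTF-8 bytes of each non-ASCII character. The helper name encode_path_segment is hypothetical; the Japanese characters are written as Unicode escapes so that the source stays ASCII.

```python
from urllib.parse import quote, unquote


def encode_path_segment(text):
    """Percent-encode one path segment using its UTF-8 bytes."""
    return quote(text, safe="")


# U+3042 and U+30A2 are the two characters in the first example URL.
encoded = encode_path_segment("xyz\u3042\u30a2")
# encoded == "xyz%E3%81%82%E3%82%A2"

# U+3046 and U+30CF are the two characters in the third example URL.
encoded_file = encode_path_segment("abc\u3046\u30cf.xml")
# encoded_file == "abc%E3%81%86%E3%83%8F.xml"

# Decoding reverses the process.
decoded = unquote(encoded)
```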
The default-url-charset configuration parameter can be used to accept requests from clients that encode non-ASCII characters in URLs in other, non-conformant ways. If a request with non-ASCII characters fails, try setting this value to the native character set of the client environment. The character set used in such URL fields must be specified with an IANA charset name.
default-url-charset controls the encoding for non-conforming URLs. It does not need to be set unless you use a non-conforming client that does not send the Content-Type charset.
Non-ASCII characters appearing in URLs passed to an HTTP server should be converted to UTF-8 and escaped in the %HH format, where HH is the hexadecimal notation of the byte value. For flexibility, the XML DB protocol server interprets an incoming URL by testing whether it is encoded in one of the following character sets, in the order presented here:
UTF-8
Charset parameter of the Content-Type field of the request if specified
Character set if specified in the default-url-charset configuration parameter
Character set of the database server
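The order of tests above can be sketched as a chain of decoding attempts. This is an illustration only: the function and parameter names are hypothetical, and it assumes that the character set names passed in are ones that Python's codec registry recognizes.

```python
from urllib.parse import unquote_to_bytes


def decode_url_path(raw_path, content_type_charset, default_url_charset, db_charset):
    """Decode a %HH-escaped path, trying character sets in the documented order.

    Returns a (decoded_text, charset_used) pair.
    """
    data = unquote_to_bytes(raw_path)
    candidates = ["utf-8", content_type_charset, default_url_charset, db_charset]
    for charset in candidates:
        if charset is None:
            continue                      # this source was not specified
        try:
            return data.decode(charset), charset
        except (UnicodeDecodeError, LookupError):
            continue                      # not valid in this charset; try the next
    raise ValueError("URL is not valid in any candidate character set")


# %E9 is not valid UTF-8 on its own, so the next specified charset is used.
text, used = decode_url_path("/caf%E9", None, "iso-8859-1", "utf-8")
```

Single-byte character sets such as ISO-8859-1 accept any byte sequence, so in a chain like this the first such candidate always ends the search.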
The following sections describe how character sets are controlled for data transferred using HTTP.
The character set of the HTTP request body is determined with the following algorithm:
The Content-Type header is evaluated. If the Content-Type header specifies a charset value, the specified charset is used.
The MIME type of the document is evaluated as follows:
If the MIME type is "*/xml", the character set is determined as follows:
- If a BOM is present, UTF-16 is used.
- If an encoding declaration is present, the specified encoding is used.
- If neither a BOM nor an encoding declaration is present, UTF-8 is used.
If the MIME type is text, ISO8859-1 is used.
If the MIME type is neither "*/xml" nor text, the database character set is used.
Note that there is a difference between HTTP and SQL or FTP. For text documents the default is ISO8859-1 as specified by the IETF.org RFC 2616: HTTP 1.1 Protocol Specification.
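The algorithm for the HTTP request body can be sketched as one function. It follows the steps as written in this section and is not Oracle's implementation; the function and parameter names are hypothetical, and treating only UTF-16 byte order marks as "a BOM" is an assumption.

```python
import codecs
import re


def http_body_charset(content_type, body, db_charset):
    """Pick the character set of an HTTP request body."""
    # Split "type/subtype; charset=..." into the MIME type and its parameters.
    mime_type, _, params = content_type.partition(";")
    mime_type = mime_type.strip().lower()

    # Step 1: an explicit charset in the Content-Type header wins.
    match = re.search(r'charset\s*=\s*"?([^\s;"]+)', params, re.IGNORECASE)
    if match:
        return match.group(1)

    # Step 2: otherwise the MIME type decides.
    if mime_type.endswith("/xml"):
        if body.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
            return "UTF-16"
        decl = re.match(rb"""<\?xml[^>]*encoding=["']([A-Za-z0-9._-]+)["']""", body)
        if decl:
            return decl.group(1).decode("ascii")
        return "UTF-8"
    if mime_type.startswith("text/"):
        return "ISO8859-1"
    return db_charset
```

The */xml test comes before the text test so that text/xml is handled as XML, matching the order of the steps above.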
The response generated by the XML DB HTTP/WebDAV server is in a character set specified in the Accept-Charset field of the request. Accept-Charset can list several character sets, each with a q-value. Oracle XML DB chooses one that does not require conversion, which is not necessarily the charset with the highest q-value. If no listed charset avoids conversion, then Oracle XML DB converts to the charset with the highest q-value.
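The selection rule can be sketched as follows. The sketch assumes that "does not require conversion" means the requested charset equals the charset in which the content is stored (the stored_charset parameter, a hypothetical name), ignores the "*" wildcard, and returns the stored charset when the header lists nothing usable; all three are simplifying assumptions rather than documented behavior.

```python
def parse_accept_charset(header):
    """Return a list of (charset, q) pairs; q defaults to 1.0."""
    result = []
    for item in header.split(","):
        name, _, params = item.strip().partition(";")
        q = 1.0
        params = params.strip()
        if params.lower().startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0                    # treat a malformed q-value as unusable
        if name.strip():
            result.append((name.strip().lower(), q))
    return result


def choose_response_charset(header, stored_charset):
    """Prefer an acceptable charset that needs no conversion, else the highest q."""
    accepted = [(name, q) for name, q in parse_accept_charset(header) if q > 0]
    if not accepted:
        return stored_charset.lower()      # assumption: nothing usable was requested
    for name, _ in accepted:
        if name == stored_charset.lower():
            return name                    # no conversion needed
    return max(accepted, key=lambda pair: pair[1])[0]
```

For example, with content stored in UTF-8, the header "iso-8859-1;q=0.9, utf-8;q=0.5" yields utf-8 in this sketch even though its q-value is lower.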
Web Distributed Authoring and Versioning (WebDAV) is a standard protocol used to provide users with a file system interface to Oracle XML DB repository over the Internet. The most popular way of accessing a WebDAV server folder is through WebFolders on Microsoft Windows 2000 or Microsoft Windows NT.
WebDAV is an extension to the HTTP 1.1 protocol. It allows clients to perform remote web content authoring through a coherent set of methods, headers, request body formats, and response body formats. WebDAV provides operations to store and retrieve resources, create and list the contents of resource collections, lock resources for concurrent access in a coordinated manner, and set and retrieve resource properties.
Oracle XML DB supports the following WebDAV features:
Foldering, specified by RFC 2518
Access Control
WebDAV is a set of extensions to the HTTP protocol that allow you to edit or manage your files on remote Web servers. WebDAV can also be used, for example, to:
Share documents over the Internet
Edit content over the Internet
Oracle XML DB supports the contents of RFC 2518, with the following exceptions:
Lock-NULL resources create actual zero-length resources in the file system, and cannot be converted to folders.
The COPY, MOVE, and DELETE methods comply with section 2 of the Internet Draft titled 'Binding Extensions to WebDAV'.
Depth-infinity locks
Only Basic Authentication is supported.
To create a WebFolder in Windows 2000, follow these steps:
From your desktop, select My Network Places.
Double click Add Network Place.
Type the location of the folder, for example:
http://Oracle_server_name:HTTP_port_number
See Figure 24-2.
Click Next.
Enter any name to identify this WebFolder.
Click Finish.
You can now access Oracle XML DB repository just like you access any Windows folder.