Oracle® Database Advanced Replication 10g Release 1 (10.1) Part Number B10732-01 |
|
|
View PDF |
This chapter explains the basic concepts and terminology related to Advanced Replication.
This chapter contains these topics:
Replication is the process of copying and maintaining database objects, such as tables, in multiple databases that make up a distributed database system. Changes applied at one site are captured and stored locally before being forwarded and applied at each of the remote locations. Advanced Replication is a fully integrated feature of the Oracle server; it is not a separate server.
Replication uses distributed database technology to share data between multiple sites, but a replicated database and a distributed database are not the same. In a distributed database, data is available at many locations, but a particular table resides at only one location. For example, the employees
table resides at only the ny.world
database in a distributed database system that also includes the hk.world
and la.world
databases. Replication means that the same data is available at multiple locations. For example, the employees
table is available at ny.world
, hk.world
, and la.world
.
Some of the most common reasons for using replication are described as follows:
Replication provides fast, local access to shared data because it balances activity over multiple sites. Some users can access one server while other users access different servers, thereby reducing the load at all servers. Also, users can access data from the replication site that has the lowest access cost, which is typically the site that is geographically closest to them.
Replication provides fast, local access to shared data because it balances activity over multiple sites. Some users can access one server while other users access different servers, thereby reducing the load at all servers. Also, users can access data from the replication site that has the lowest access cost, which is typically the site that is geographically closest to them.
A materialized view is a complete or partial copy (replica) of a target table from a single point in time. Materialized views enable users to work on a subset of a database while disconnected from the central database server. Later, when a connection is established, users can synchronize (refresh) materialized views on demand. When users refresh materialized views, they update the central database with all of their changes, and they receive any changes that may have happened while they were disconnected.
Replication can be used to distribute data over multiple regional locations. Then, applications can access various regional servers instead of accessing one central server. This configuration can reduce network load dramatically.
Replication can be used to distribute data over multiple regional locations. Then, applications can access various regional servers instead of accessing one central server. This configuration can reduce network load dramatically.
You can find more detailed descriptions of the uses of replication in later chapters.
Note: The Advanced Replication feature is automatically installed and upgraded in every Oracle Database 10g database. |
See Also:
Oracle Database Administrator's Guide for more information about distributed databases |
Replication supports a variety of applications that often have different requirements. Some applications allow for relatively autonomous individual materialized view sites. For example, sales force automation, field service, retail, and other mass deployment applications typically require data to be periodically synchronized between central database systems and a large number of small, remote sites, which are often disconnected from the central database. Members of a sales force must be able to complete transactions, regardless of whether they are connected to the central database. In this case, remote sites must be autonomous.
On the other hand, applications such as call centers and Internet systems require data on multiple servers to be synchronized in a continuous, nearly instantaneous manner to ensure that the service provided is available and equivalent at all times. For example, a retail Web site on the Internet must ensure that customers see the same information in the online catalog at each site. Here, data consistency is more important than site autonomy.
Advanced Replication can be used for each of the types of applications described in the previous paragraphs, and for systems that combine aspects of both types of applications. In fact, Advanced Replication can support both mass deployment and server-to-server replication, enabling integration into a single coherent environment. In such an environment, for example, sales force automation and customer service call centers can share data.
Advanced Replication can replicate data in environments that use different releases of Oracle and in environments that run Oracle on different operating systems. Therefore, applications that use data in such an environment can use Advanced Replication.
The following sections explain the basic components of a replication system, including replication objects, replication groups, and replication sites.
A replication object is a database object existing on multiple servers in a distributed database system. In a replication environment, any updates made to a replication object at one site are applied to the copies at all other sites. Advanced Replication enables you to replicate the following types of objects:
Regarding tables, replication supports advanced features such as partitioned tables, index-organized tables, tables containing columns that are based on user-defined types, and object tables.
In a replication environment, Oracle manages replication objects using replication groups. A replication group is a collection of replication objects that are logically related.
By organizing related database objects within a replication group, it is easier to administer many objects together. Typically, you create and use a replication group to organize the schema objects necessary to support a particular database application. However, replication groups and schemas do not need to correspond with one another. A replication group can contain objects from multiple schemas, and a single schema can have objects in multiple replication groups. However, each replication object can be a member of only one replication group.
A replication group can exist at multiple replication sites. Replication environments support two basic types of sites: master sites and materialized view sites. One site can be both a master site for one replication group and a materialized view site for a different replication group. However, one site cannot be both the master site and the materialized view site for the same replication group.
The differences between master sites and materialized view sites are the following:
hr_repg
master group contains the tables employees
and departments
, then all of the master sites participating in a master group must maintain a complete copy of employees
and departments
. However, one materialized view site might contain only a materialized view of the employees
table, while another materialized view site might contain materialized views of both the employees
and departments
tables.Advanced Replication supports the following types of replication environments:
Multimaster replication (also called peer-to-peer or n-way replication) enables multiple sites, acting as equal peers, to manage groups of replicated database objects. Each site in a multimaster replication environment is a master site, and each site communicates with the other master sites.
Applications can update any replicated table at any site in a multimaster configuration. Oracle database servers operating as master sites in a multimaster environment automatically work to converge the data of all table replicas and to ensure global transaction consistency and data integrity.
Asynchronous replication is the most common way to implement multimaster replication. Other ways include synchronous replication and procedural replication, which are discussed later in this chapter. When you use asynchronous replication, information about a data manipulation language (DML) change on a table is stored in the deferred transactions queue at the master site where the change occurred. These changes are called deferred transactions. The deferred transactions are pushed (or propagated) to the other participating master sites at regular intervals. You can control the amount of time in an interval.
Using asynchronous replication means that data conflicts are possible because the same row value might be updated at two different master sites at nearly the same time. However, you can use techniques to avoid conflicts and, if conflicts occur, Oracle provides prebuilt mechanisms that can be implemented to resolve them. Information about unresolved conflicts is stored in an error log.
Text description of the illustration repln00a.gif
At times, you must stop all replication activity for a master group so that you can perform certain administrative tasks on the master group. For example, you must stop all replication activity for a master group to add a new master group object. Stopping all replication activity for a master group is called quiescing the group. When a master group is quiesced, users cannot issue DML statements on any of the objects in the master group. Also, all deferred transactions must be propagated before you can quiesce a master group. Users can continue to query the tables in a quiesced master group.
A materialized view contains a complete or partial copy of a target master from a single point in time. The target master can be either a master table at a master site or a master materialized view at a materialized view site. A master materialized view is a materialized view that functions as a master for another materialized view. A multitier materialized view is one that is based on another materialized view, instead of on a master table.
Materialized views provide the following benefits:
A materialized view may be read-only, updatable, or writeable, and these types of materialized views provide benefits in addition to those listed previously.
In a basic configuration, materialized views can provide read-only access to the table data that originates from a master site or master materialized view site. Applications can query data from read-only materialized views to avoid network access to the master site, regardless of network availability. However, applications throughout the system must access data at the master site to perform data manipulation language changes (DML). Figure 1-2 illustrates basic, read-only replication. The master tables and master materialized views of read-only materialized views do not need to belong to a replication group.
Read-only materialized views provide the following benefits:
CONNECT
BY
clause.
See Also:
"Available Materialized Views" for more information about complex materialized views |
Text description of the illustration repln053.gif
In a more advanced configuration, you can create an updatable materialized view that allows users to insert, update, and delete rows of the target master table or master materialized view by performing these operations on the materialized view. An updatable materialized view may also contain a subset of the data in the target master. Figure 1-3 illustrates a replication environment using updatable materialized views.
Updatable materialized views are based on tables or other materialized views that have been set up to support replication. In fact, updatable materialized views must be part of a materialized view group that is based on another replication group. For changes made to an updatable materialized view to be pushed back to the master during refresh, the updatable materialized view must belong to a materialized view group.
Text description of the illustration repln083.gif
Updatable materialized views have the following properties.
Updatable materialized views provide the following benefits:
You can create a materialized view using the FOR
UPDATE
clause during creation but then never add the materialized view to a materialized view group. In this case, users can perform data manipulation language (DML) changes on the materialized view, but these changes cannot be pushed back to the master and are lost if the materialized view refreshes. Such materialized views are called writeable materialized views.
Both row and column subsetting enable you to create materialized views that contain a partial copy of the data at a master table or master materialized view. Such materialized views can be helpful for regional offices or sales forces that do not require the complete data set.
Row subsetting enables you to include only the rows that are needed from the masters in the materialized views by using a WHERE
clause. Column subsetting enables you to include only the columns that are needed from the masters in the materialized views. You do this by specifying particular columns in the SELECT
statement during materialized view creation.
See Also:
|
To ensure that a materialized view is consistent with its master table or master materialized view, you need to refresh the materialized view periodically. Oracle provides the following three methods to refresh materialized views:
When it is important for materialized views to be transactionally consistent with each other, you can organize them into refresh groups. By refreshing the refresh group, you can ensure that the data in all of the materialized views in the refresh group correspond to the same transactionally consistent point in time. Both read-only and updatable materialized views can be included in a refresh group. A materialized view in a refresh group still can be refreshed individually, but doing so nullifies the benefits of the refresh group because refreshing the materialized view individually does not refresh the other materialized views in the refresh group.
A materialized view log is a table at the materialized view's master site or master materialized view site that records all of the DML changes to the master table or master materialized view. A materialized view log is associated with a single master table or master materialized view, and each of those has only one materialized view log, regardless of how many materialized views refresh from the master. A fast refresh of a materialized view is possible only if the materialized view's master has a materialized view log. When a materialized view is fast refreshed, entries in the materialized view's associated materialized view log that have appeared since the materialized view was last refreshed are applied to the materialized view.
Deployment templates simplify the task of deploying and maintaining many remote materialized view sites. Using deployment templates, you can define a collection of materialized view definitions at a master site, and you can use parameters in the definitions so that the materialized views can be customized for individual users or types of users.
For example, you might create one template for the sales force and another template for field service representatives. In this case, a parameter value might be the sales territory or the customer support level. When a remote user connects to a master site, the user can query a list of available templates. When the user instantiates a template, the materialized views are created and populated at the remote site. The parameter values can either be supplied by the remote user or taken from a table maintained at the master site.
When a user instantiates a template at a materialized view site, the object DDL (for example, CREATE
MATERIALIZED
VIEW
or CREATE
TABLE
statements) is executed to create the schema objects at the materialized view site, and the objects are populated with the appropriate data. Users can instantiate templates while connected to the master site over a network (online instantiation), or while disconnected from the master site (offline instantiation).
Offline instantiation is often used to decrease server loads during peak usage periods and to reduce remote connection times. To instantiate a template offline, you package the template and required data on some type of storage media, such as tape, CD-ROM, and so on. Then, instead of pulling the data from the master site, users pull the data from the storage media containing the template and data.
Multimaster replication and materialized views can be combined in hybrid or "mixed" configurations to meet different application requirements. Hybrid configurations can have any number of master sites and multiple materialized view sites for each master.
For example, as shown in Figure 1-4, multimaster (or n-way) replication between two masters can support full-table replication between the databases that support two geographic regions. Materialized views can be defined on the masters to replicate full tables or table subsets to sites within each region.
Text description of the illustration repln002.gif
Key differences between materialized views and replicated master tables include the following:
Both master sites and materialized view sites use scheduled links to propagate data changes to other sites. A scheduled link is a database link with a user-defined schedule to push deferred transactions. A scheduled link determines how a master site propagates its deferred transaction queue to another master site, or how a materialized view site propagates its deferred transaction queue to its master site. When you create a scheduled link, Oracle creates a job in the local job queue to push the deferred transaction queue to another site in the system.
Several tools are available for configuring, administering, and monitoring your replication environment. The Replication Management tool in the Oracle Enterprise Manager Console provides a powerful GUI interface to help you manage your environment, while the replication management application programming interface (API) provides you with a familiar API to build customized scripts for replication administration. Additionally, the replication catalog keeps you informed about your replication environment.
To help configure and administer replication environments, Oracle provides a sophisticated Replication Management tool in the Oracle Enterprise Manager Console. Other sections in this book include information and examples for using this tool, but the Replication Management tool online help is the primary documentation source for this tool.
Text description of the illustration repover3.gif
See Also:
Chapter 7, "Introduction to the Replication Management Tool" for an introduction to the Replication Management tool, and the Replication Management tool online help for complete instructions on using the tool. |
The replication management API is a set of PL/SQL packages that encapsulate procedures and functions that you can use to configure an Advanced Replication environment. The replication management API is a command-line alternative to the Replication Management tool. In fact, the Replication Management tool uses the procedures and functions of the replication management API to perform its work. For example, when you use the Replication Management tool to create a new master group, the tool completes the task by making a call to the CREATE_MASTER_REPGROUP
procedure in the DBMS_REPCAT
package. The replication management API makes it easy for you to create custom scripts to manage your replication environment.
See Also:
Oracle Database Advanced Replication Management API Reference for more information about using the replication management API |
Every master site and materialized view site in a replication environment has a replication catalog. A replication catalog for a site is a distinct set of data dictionary tables and views that maintain administrative information about replication objects and replication groups at the site. Every server participating in a replication environment can automate the replication of objects in replication groups using the information in its replication catalog.
See Also:
Oracle Database Advanced Replication Management API Reference for more information about the replication catalog |
In a replication environment, all DDL statements must be issued using either the Replication Management tool in the Oracle Enterprise Manager Console or the DBMS_REPCAT
package in the replication management API. Specifically, if you use the DBMS_REPCAT
package, then use the CREATE_MASTER_REPOBJECT
procedure to add objects to a master group, and use ALTER_MASTER_REPOBJECT
to modify replicated objects. You can also use the EXECUTE_DDL
procedure.
When you use either the Replication Management tool or the DBMS_REPCAT
package, all DDL statements are replicated to all of the sites participating in the replication environment. In some cases, you can also use export/import to create replicated objects.
Asynchronous multimaster and updatable materialized view replication environments must address the possibility of replication conflicts that may occur when, for example, two transactions originating from different sites update the same row at nearly the same time. When data conflicts occur, you need a mechanism to ensure that the conflict is resolved in accordance with your business rules and to ensure that the data converges correctly at all sites.
In addition to logging any conflicts that may occur in your replication environment, Advanced Replication offers a variety of prebuilt conflict resolution methods that enable you to define a conflict resolution system for your database that resolves conflicts in accordance with your business rules. If you have a unique situation that Oracle's prebuilt conflict resolution methods cannot resolve, then you have the option of building and using your own conflict resolution methods.
See Also:
|
Asynchronous replication is the most common way to implement multimaster replication. However, you have two other options: synchronous replication and procedural replication.
A multimaster replication environment can use either asynchronous or synchronous replication to copy data. With asynchronous replication, changes made at one master site occur at a later time at all other participating master sites. With synchronous replication, changes made at one master site occur immediately at all other participating master sites.
When you use synchronous replication, an update of a table results in the immediate replication of the update at all participating master sites. In fact, each transaction includes all master sites. Therefore, if one master site cannot process a transaction for any reason, then the transaction is rolled back at all master sites.
Although you avoid the possibility of conflicts when you use synchronous replication, it requires a very stable environment to operate smoothly. If communication to one master site is not possible because of a network problem, for example, then users can still query replicated tables, but no transactions can be completed until communication is reestablished. Also, it is possible to configure asynchronous replication so that it simulates synchronous replication.
See Also:
"Scheduling Continuous Pushes" for information about simulating synchronous replication in an asynchronous replication environment |
Batch processing applications can change large amounts of data within a single transaction. In such cases, typical row-level replication might load a network with many data changes. To avoid such problems, a batch processing application operating in a replication environment can use Oracle's procedural replication to replicate simple stored procedure calls to converge data replicas. Procedural replication replicates only the call to a stored procedure that an application uses to update a table. It does not replicate the data modifications themselves.
To use procedural replication, you must replicate the packages that modify data in the system to all sites. After replicating a package, you must generate a wrapper for the package at each site. When an application calls a packaged procedure at the local site to modify data, the wrapper ensures that the call is ultimately made to the same packaged procedure at all other sites in the replication environment. Procedural replication can occur asynchronously or synchronously.
When a replicating data using procedural replication, the procedures that replicate data are responsible for ensuring the integrity of the replicated data. That is, you must design such procedures to either avoid or detect replication conflicts and to resolve them appropriately. Consequently, procedural replication is most typically used when databases are modified only with large batch operations. In such situations, replication conflicts are unlikely because numerous transactions are not contending for the same data.
See Also:
Oracle Database Advanced Replication Management API Reference for more information about procedural replication |