1
Introduction to Advanced Replication

This chapter explains the basic concepts and terminology related to Advanced Replication.

This chapter contains these topics:

Overview of Replication
Applications That Use Replication
Replication Objects, Groups, and Sites
Types of Replication Environments
Administration Tools for a Replication Environment
Replication Conflicts
Other Options for Multimaster Replication

Note:
If you are using Trusted Oracle, then see your documentation for Oracle security-related products for information about using replication in that environment.

Overview of Replication

Replication is the process of copying and maintaining database objects, such as tables, in multiple databases that make up a distributed database system. Changes applied at one site are captured and stored locally before being forwarded and applied at each of the remote locations. Advanced Replication is a fully integrated feature of the Oracle server; it is not a separate server.

Replication uses distributed database technology to share data between multiple sites, but a replicated database and a distributed database are not the same. In a distributed database, data is available at many locations, but a particular table resides at only one location. For example, the employees table resides at only the ny.world database in a distributed database system that also includes the hk.world and la.world databases. Replication means that the same data is available at multiple locations. For example, the employees table is available at ny.world, hk.world, and la.world.

Some of the most common reasons for using replication are described as follows:

Availability

Replication provides fast, local access to shared data because it balances activity over multiple sites. Some users can access one server while other users access different servers, thereby reducing the load at all servers. Also, users can access data from the replication site that has the lowest access cost, which is typically the site that is geographically closest to them.

Performance

Disconnected computing

A materialized view is a complete or partial copy (replica) of a target table from a single point in time. Materialized views enable users to work on a subset of a database while disconnected from the central database server. Later, when a connection is established, users can synchronize (refresh) materialized views on demand. When users refresh materialized views, they update the central database with all of their changes, and they receive any changes that may have happened while they were disconnected.

Network load reduction

Replication can be used to distribute data over multiple regional locations. Then, applications can access various regional servers instead of accessing one central server. This configuration can reduce network load dramatically.

Mass deployment

You can find more detailed descriptions of the uses of replication in later chapters.

Note:

The Advanced Replication feature is automatically installed and upgraded in every Oracle Database 10g database.

See Also:

Oracle Database Administrator's Guide for more information about distributed databases

Applications That Use Replication

Replication supports a variety of applications that often have different requirements. Some applications allow for relatively autonomous individual materialized view sites. For example, sales force automation, field service, retail, and other mass deployment applications typically require data to be periodically synchronized between central database systems and a large number of small, remote sites, which are often disconnected from the central database. Members of a sales force must be able to complete transactions, regardless of whether they are connected to the central database. In this case, remote sites must be autonomous.

On the other hand, applications such as call centers and Internet systems require data on multiple servers to be synchronized in a continuous, nearly instantaneous manner to ensure that the service provided is available and equivalent at all times. For example, a retail Web site on the Internet must ensure that customers see the same information in the online catalog at each site. Here, data consistency is more important than site autonomy.

Advanced Replication can be used for each of the types of applications described in the previous paragraphs, and for systems that combine aspects of both types of applications. In fact, Advanced Replication can support both mass deployment and server-to-server replication, enabling integration into a single coherent environment. In such an environment, for example, sales force automation and customer service call centers can share data.

Advanced Replication can replicate data in environments that use different releases of Oracle and in environments that run Oracle on different operating systems. Therefore, applications that use data in such an environment can use Advanced Replication.

Replication Objects, Groups, and Sites

The following sections explain the basic components of a replication system, including replication objects, replication groups, and replication sites.

Replication Objects

A replication object is a database object existing on multiple servers in a distributed database system. In a replication environment, any updates made to a replication object at one site are applied to the copies at all other sites. Advanced Replication enables you to replicate the following types of objects:

Tables
Indexes
Views and Object Views
Packages and Package Bodies
Procedures and Functions
User-Defined Types and Type Bodies
Triggers
Synonyms
Indextypes
User-Defined Operators

Regarding tables, replication supports advanced features such as partitioned tables, index-organized tables, tables containing columns that are based on user-defined types, and object tables.

Replication Groups

In a replication environment, Oracle manages replication objects using replication groups. A replication group is a collection of replication objects that are logically related.

By organizing related database objects within a replication group, it is easier to administer many objects together. Typically, you create and use a replication group to organize the schema objects necessary to support a particular database application. However, replication groups and schemas do not need to correspond with one another. A replication group can contain objects from multiple schemas, and a single schema can have objects in multiple replication groups. However, each replication object can be a member of only one replication group.

Replication Sites

A replication group can exist at multiple replication sites. Replication environments support two basic types of sites: master sites and materialized view sites. One site can be both a master site for one replication group and a materialized view site for a different replication group. However, one site cannot be both the master site and the materialized view site for the same replication group.

The differences between master sites and materialized view sites are the following:

A replication group at a master site is more specifically referred to as a master group. A replication group at a materialized view site is based on a master group and is more specifically referred to as a materialized view group. Additionally, every master group has exactly one master definition site. A replication group's master definition site is a master site serving as the control center for managing the replication group and the objects in the group.
A master site maintains a complete copy of all objects in a replication group, while materialized views at a materialized view site can contain all or a subset of the table data within a master group. For example, if the hr_repg master group contains the tables employees and departments, then all of the master sites participating in a master group must maintain a complete copy of employees and departments. However, one materialized view site might contain only a materialized view of the employees table, while another materialized view site might contain materialized views of both the employees and departments tables.
All master sites in a multimaster replication environment communicate directly with one another to continually propagate data changes in the replication group. Materialized view sites contain an image, or materialized view, of the table data from a certain point in time. Typically, a materialized view is refreshed periodically to synchronize it with its master site. You can organize materialized views into refresh groups. Materialized views in a refresh group can belong to one or more materialized view groups, and they are refreshed at the same time to ensure that the data in all materialized views in the refresh group correspond to the same transactionally consistent point in time.

Types of Replication Environments

Advanced Replication supports the following types of replication environments:

Multimaster Replication

Multimaster replication (also called peer-to-peer or n-way replication) enables multiple sites, acting as equal peers, to manage groups of replicated database objects. Each site in a multimaster replication environment is a master site, and each site communicates with the other master sites.

Applications can update any replicated table at any site in a multimaster configuration. Oracle database servers operating as master sites in a multimaster environment automatically work to converge the data of all table replicas and to ensure global transaction consistency and data integrity.

Asynchronous replication is the most common way to implement multimaster replication. Other ways include synchronous replication and procedural replication, which are discussed later in this chapter. When you use asynchronous replication, information about a data manipulation language (DML) change on a table is stored in the deferred transactions queue at the master site where the change occurred. These changes are called deferred transactions. The deferred transactions are pushed (or propagated) to the other participating master sites at regular intervals. You can control the amount of time in an interval.

Using asynchronous replication means that data conflicts are possible because the same row value might be updated at two different master sites at nearly the same time. However, you can use techniques to avoid conflicts and, if conflicts occur, Oracle provides prebuilt mechanisms that can be implemented to resolve them. Information about unresolved conflicts is stored in an error log.

Figure 1-1 Multimaster Replication

Text description of repln00a.gif follows

Text description of the illustration repln00a.gif

Master Group Quiesce

At times, you must stop all replication activity for a master group so that you can perform certain administrative tasks on the master group. For example, you must stop all replication activity for a master group to add a new master group object. Stopping all replication activity for a master group is called quiescing the group. When a master group is quiesced, users cannot issue DML statements on any of the objects in the master group. Also, all deferred transactions must be propagated before you can quiesce a master group. Users can continue to query the tables in a quiesced master group.

Materialized View Replication

A materialized view contains a complete or partial copy of a target master from a single point in time. The target master can be either a master table at a master site or a master materialized view at a materialized view site. A master materialized view is a materialized view that functions as a master for another materialized view. A multitier materialized view is one that is based on another materialized view, instead of on a master table.

Materialized views provide the following benefits:

Enable local access, which improves response times and availability
Offload queries from the master site or master materialized view site, because users can query the local materialized view instead
Increase data security by allowing you to replicate only a selected subset of the target master's data set

A materialized view may be read-only, updatable, or writeable, and these types of materialized views provide benefits in addition to those listed previously.

Read-Only Materialized Views

In a basic configuration, materialized views can provide read-only access to the table data that originates from a master site or master materialized view site. Applications can query data from read-only materialized views to avoid network access to the master site, regardless of network availability. However, applications throughout the system must access data at the master site to perform data manipulation language changes (DML). Figure 1-2 illustrates basic, read-only replication. The master tables and master materialized views of read-only materialized views do not need to belong to a replication group.

Read-only materialized views provide the following benefits:

Eliminate the possibility of conflicts because they cannot be updated.
Support complex materialized views. Examples of complex materialized views are materialized views that contain set operations or a CONNECT BY clause.

See Also:
"Available Materialized Views" for more information about complex materialized views

Figure 1-2 Read-Only Materialized View Replication

Text description of repln053.gif follows

Text description of the illustration repln053.gif

Updatable Materialized Views

In a more advanced configuration, you can create an updatable materialized view that allows users to insert, update, and delete rows of the target master table or master materialized view by performing these operations on the materialized view. An updatable materialized view may also contain a subset of the data in the target master. Figure 1-3 illustrates a replication environment using updatable materialized views.

Updatable materialized views are based on tables or other materialized views that have been set up to support replication. In fact, updatable materialized views must be part of a materialized view group that is based on another replication group. For changes made to an updatable materialized view to be pushed back to the master during refresh, the updatable materialized view must belong to a materialized view group.

Figure 1-3 Updatable Materialized View Replication

Text description of repln083.gif follows

Text description of the illustration repln083.gif

Updatable materialized views have the following properties.

They are always based on a single table, although multiple tables can be referenced in a subquery.
They can be incrementally (or fast) refreshed.
Oracle propagates the changes made to an updatable materialized view to the materialized view's remote master table or master materialized view. The updates to the master then cascade to all other replication sites.

Updatable materialized views provide the following benefits:

Allow users to query and update a local replicated data set even when disconnected from the master site or master materialized view site.
Require fewer resources than multimaster replication, while still supporting data updates. Materialized views can reduce the amount of stress placed on network resources because materialized views can be refreshed on demand, while multimaster replication propagates changes at regular intervals. In addition, because materialized views can reside in a database that contains far less data, the disk space and memory requirements for materialized view clients can be less than the requirements for an Oracle server containing master sites.

Writeable Materialized Views

You can create a materialized view using the FOR UPDATE clause during creation but then never add the materialized view to a materialized view group. In this case, users can perform data manipulation language (DML) changes on the materialized view, but these changes cannot be pushed back to the master and are lost if the materialized view refreshes. Such materialized views are called writeable materialized views.

Row and Column Subsetting with Materialized Views

Both row and column subsetting enable you to create materialized views that contain a partial copy of the data at a master table or master materialized view. Such materialized views can be helpful for regional offices or sales forces that do not require the complete data set.

Row subsetting enables you to include only the rows that are needed from the masters in the materialized views by using a WHERE clause. Column subsetting enables you to include only the columns that are needed from the masters in the materialized views. You do this by specifying particular columns in the SELECT statement during materialized view creation.

Note:

Column subsetting of updatable materialized views is supported only through the use of deployment templates and the Replication Management tool. This restriction does not apply to column subsetting of read-only materialized views.

See Also:

"Data Subsetting with Materialized Views"
"Column Subsetting with Deployment Templates" for more information about column subsetting with deployment templates

Materialized View Refresh

To ensure that a materialized view is consistent with its master table or master materialized view, you need to refresh the materialized view periodically. Oracle provides the following three methods to refresh materialized views:

Fast refresh uses materialized view logs to update only the rows that have changed since the last refresh.
Complete refresh updates the entire materialized view.
Force refresh performs a fast refresh when possible. When a fast refresh is not possible, force refresh performs a complete refresh.

Refresh Groups

When it is important for materialized views to be transactionally consistent with each other, you can organize them into refresh groups. By refreshing the refresh group, you can ensure that the data in all of the materialized views in the refresh group correspond to the same transactionally consistent point in time. Both read-only and updatable materialized views can be included in a refresh group. A materialized view in a refresh group still can be refreshed individually, but doing so nullifies the benefits of the refresh group because refreshing the materialized view individually does not refresh the other materialized views in the refresh group.

Materialized View Log

A materialized view log is a table at the materialized view's master site or master materialized view site that records all of the DML changes to the master table or master materialized view. A materialized view log is associated with a single master table or master materialized view, and each of those has only one materialized view log, regardless of how many materialized views refresh from the master. A fast refresh of a materialized view is possible only if the materialized view's master has a materialized view log. When a materialized view is fast refreshed, entries in the materialized view's associated materialized view log that have appeared since the materialized view was last refreshed are applied to the materialized view.

Deployment Templates

Deployment templates simplify the task of deploying and maintaining many remote materialized view sites. Using deployment templates, you can define a collection of materialized view definitions at a master site, and you can use parameters in the definitions so that the materialized views can be customized for individual users or types of users.

For example, you might create one template for the sales force and another template for field service representatives. In this case, a parameter value might be the sales territory or the customer support level. When a remote user connects to a master site, the user can query a list of available templates. When the user instantiates a template, the materialized views are created and populated at the remote site. The parameter values can either be supplied by the remote user or taken from a table maintained at the master site.

Online and Offline Instantiation

When a user instantiates a template at a materialized view site, the object DDL (for example, CREATE MATERIALIZED VIEW or CREATE TABLE statements) is executed to create the schema objects at the materialized view site, and the objects are populated with the appropriate data. Users can instantiate templates while connected to the master site over a network (online instantiation), or while disconnected from the master site (offline instantiation).

Offline instantiation is often used to decrease server loads during peak usage periods and to reduce remote connection times. To instantiate a template offline, you package the template and required data on some type of storage media, such as tape, CD-ROM, and so on. Then, instead of pulling the data from the master site, users pull the data from the storage media containing the template and data.

Multimaster and Materialized View Hybrid Configurations

Multimaster replication and materialized views can be combined in hybrid or "mixed" configurations to meet different application requirements. Hybrid configurations can have any number of master sites and multiple materialized view sites for each master.

For example, as shown in Figure 1-4, multimaster (or n-way) replication between two masters can support full-table replication between the databases that support two geographic regions. Materialized views can be defined on the masters to replicate full tables or table subsets to sites within each region.

Figure 1-4 Hybrid Configuration

Text description of repln002.gif follows

Text description of the illustration repln002.gif

Key differences between materialized views and replicated master tables include the following:

Replicated master tables must contain data for the full table being replicated, whereas materialized views can replicate subsets of master table data.
Multimaster replication enables you to replicate changes for each transaction as the changes occur. Materialized view refreshes are set oriented, propagating changes from multiple transactions in a more efficient, batch-oriented operation, but at less frequent intervals.
If conflicts occur because of changes made to multiple copies of the same data, then detection and resolution of conflicts always occurs at a master site or a master materialized view site.

Scheduled Links

Both master sites and materialized view sites use scheduled links to propagate data changes to other sites. A scheduled link is a database link with a user-defined schedule to push deferred transactions. A scheduled link determines how a master site propagates its deferred transaction queue to another master site, or how a materialized view site propagates its deferred transaction queue to its master site. When you create a scheduled link, Oracle creates a job in the local job queue to push the deferred transaction queue to another site in the system.

Administration Tools for a Replication Environment

Several tools are available for configuring, administering, and monitoring your replication environment. The Replication Management tool in the Oracle Enterprise Manager Console provides a powerful GUI interface to help you manage your environment, while the replication management application programming interface (API) provides you with a familiar API to build customized scripts for replication administration. Additionally, the replication catalog keeps you informed about your replication environment.

Replication Management Tool in the Oracle Enterprise Manager Console

To help configure and administer replication environments, Oracle provides a sophisticated Replication Management tool in the Oracle Enterprise Manager Console. Other sections in this book include information and examples for using this tool, but the Replication Management tool online help is the primary documentation source for this tool.

Figure 1-5 The Replication Management Tool in Oracle Enterprise Manager Console

Text description of repover3.gif follows.

Text description of the illustration repover3.gif

See Also:

Chapter 7, "Introduction to the Replication Management Tool" for an introduction to the Replication Management tool, and the Replication Management tool online help for complete instructions on using the tool.

Replication Management API

The replication management API is a set of PL/SQL packages that encapsulate procedures and functions that you can use to configure an Advanced Replication environment. The replication management API is a command-line alternative to the Replication Management tool. In fact, the Replication Management tool uses the procedures and functions of the replication management API to perform its work. For example, when you use the Replication Management tool to create a new master group, the tool completes the task by making a call to the CREATE_MASTER_REPGROUP procedure in the DBMS_REPCAT package. The replication management API makes it easy for you to create custom scripts to manage your replication environment.

See Also:

Oracle Database Advanced Replication Management API Reference for more information about using the replication management API

Replication Catalog

Every master site and materialized view site in a replication environment has a replication catalog. A replication catalog for a site is a distinct set of data dictionary tables and views that maintain administrative information about replication objects and replication groups at the site. Every server participating in a replication environment can automate the replication of objects in replication groups using the information in its replication catalog.

See Also:

Oracle Database Advanced Replication Management API Reference for more information about the replication catalog

Distributed Schema Management

In a replication environment, all DDL statements must be issued using either the Replication Management tool in the Oracle Enterprise Manager Console or the DBMS_REPCAT package in the replication management API. Specifically, if you use the DBMS_REPCAT package, then use the CREATE_MASTER_REPOBJECT procedure to add objects to a master group, and use ALTER_MASTER_REPOBJECT to modify replicated objects. You can also use the EXECUTE_DDL procedure.

When you use either the Replication Management tool or the DBMS_REPCAT package, all DDL statements are replicated to all of the sites participating in the replication environment. In some cases, you can also use export/import to create replicated objects.

Note:

Any DDL statements issued directly using a tool such as SQL*Plus are not replicated to other sites.
You can use either Data Pump export/import or original export/import to perform exports and imports in an Advanced Replication environment. General references to export/import in this document refer to both Data Pump and original export/import.

Replication Conflicts

Asynchronous multimaster and updatable materialized view replication environments must address the possibility of replication conflicts that may occur when, for example, two transactions originating from different sites update the same row at nearly the same time. When data conflicts occur, you need a mechanism to ensure that the conflict is resolved in accordance with your business rules and to ensure that the data converges correctly at all sites.

In addition to logging any conflicts that may occur in your replication environment, Advanced Replication offers a variety of prebuilt conflict resolution methods that enable you to define a conflict resolution system for your database that resolves conflicts in accordance with your business rules. If you have a unique situation that Oracle's prebuilt conflict resolution methods cannot resolve, then you have the option of building and using your own conflict resolution methods.

See Also:

Chapter 5, "Conflict Resolution Concepts and Architecture" for information about how to design your database to avoid data conflicts and how to build conflict resolution methods that resolve such conflicts when they occur
The Replication Management tool online help for instructions on using the tool to configure conflict resolution methods
Oracle Database Advanced Replication Management API Reference for a description of how to build conflict resolution methods using the replication management API

Other Options for Multimaster Replication

Asynchronous replication is the most common way to implement multimaster replication. However, you have two other options: synchronous replication and procedural replication.

Synchronous Replication

A multimaster replication environment can use either asynchronous or synchronous replication to copy data. With asynchronous replication, changes made at one master site occur at a later time at all other participating master sites. With synchronous replication, changes made at one master site occur immediately at all other participating master sites.

When you use synchronous replication, an update of a table results in the immediate replication of the update at all participating master sites. In fact, each transaction includes all master sites. Therefore, if one master site cannot process a transaction for any reason, then the transaction is rolled back at all master sites.

Although you avoid the possibility of conflicts when you use synchronous replication, it requires a very stable environment to operate smoothly. If communication to one master site is not possible because of a network problem, for example, then users can still query replicated tables, but no transactions can be completed until communication is reestablished. Also, it is possible to configure asynchronous replication so that it simulates synchronous replication.

See Also:

"Scheduling Continuous Pushes" for information about simulating synchronous replication in an asynchronous replication environment

Procedural Replication

Batch processing applications can change large amounts of data within a single transaction. In such cases, typical row-level replication might load a network with many data changes. To avoid such problems, a batch processing application operating in a replication environment can use Oracle's procedural replication to replicate simple stored procedure calls to converge data replicas. Procedural replication replicates only the call to a stored procedure that an application uses to update a table. It does not replicate the data modifications themselves.

To use procedural replication, you must replicate the packages that modify data in the system to all sites. After replicating a package, you must generate a wrapper for the package at each site. When an application calls a packaged procedure at the local site to modify data, the wrapper ensures that the call is ultimately made to the same packaged procedure at all other sites in the replication environment. Procedural replication can occur asynchronously or synchronously.

Conflict Detection and Procedural Replication

When a replicating data using procedural replication, the procedures that replicate data are responsible for ensuring the integrity of the replicated data. That is, you must design such procedures to either avoid or detect replication conflicts and to resolve them appropriately. Consequently, procedural replication is most typically used when databases are modified only with large batch operations. In such situations, replication conflicts are unlikely because numerous transactions are not contending for the same data.

See Also:

Oracle Database Advanced Replication Management API Reference for more information about procedural replication

1 Introduction to Advanced Replication

Overview of Replication

Availability

Performance

Disconnected computing

Network load reduction

Mass deployment

Applications That Use Replication

Replication Objects, Groups, and Sites

Replication Objects

Replication Groups

Replication Sites

Types of Replication Environments

Multimaster Replication

Figure 1-1 Multimaster Replication

Master Group Quiesce

Materialized View Replication

Read-Only Materialized Views

Figure 1-2 Read-Only Materialized View Replication

Updatable Materialized Views

Figure 1-3 Updatable Materialized View Replication

Writeable Materialized Views

Row and Column Subsetting with Materialized Views

Materialized View Refresh

Refresh Groups

Materialized View Log

Deployment Templates

Online and Offline Instantiation

Multimaster and Materialized View Hybrid Configurations

Figure 1-4 Hybrid Configuration

Scheduled Links

Administration Tools for a Replication Environment

Replication Management Tool in the Oracle Enterprise Manager Console

Figure 1-5 The Replication Management Tool in Oracle Enterprise Manager Console

Replication Management API

Replication Catalog

Distributed Schema Management

Replication Conflicts

Other Options for Multimaster Replication

Synchronous Replication

Procedural Replication

Conflict Detection and Procedural Replication

1
Introduction to Advanced Replication