Oracle® Warehouse Builder User's Guide 11g Release 1 (11.1) Part Number B31278-01 |
|
|
View PDF |
Scenario
Your company records all its transactions as they occur, resulting in inserts, updates, and deletes, in a flat file called record.csv.
These transactions must be processed in the exact order they were stored. For example, if an order was first placed, then updated, then canceled and re-entered, this transaction must be processed exactly in the same order.
An example data set of the source file record.csv
is defined as:
Action,DateTime,Key,Name,Desc I,71520031200,ABC,ProdABC,Product ABC I,71520031201,CDE,ProdCDE,Product CDE I,71520031202,XYZ,ProdXYZ,Product XYZ U,71620031200,ABC,ProdABC,Product ABC with option D,71620032300,ABC,ProdABC,Product ABC with option I,71720031200,ABC,ProdABC,Former ProdABC reintroduced U,71720031201,XYZ,ProdXYZ,Rename XYZ
You want to load the data into a target table such as the following:
SRC_TIMESTA KEY NAME DESCRIPTION ----------- --- ------- --------------------------- 71520031201 CDE ProdCDE Product CDE 71720031201 XYZ ProdXYZ Rename XYZ 71720031200 ABC ProdABC Former ProdABC reintroduced
Solution
Warehouse Builder enables you to design ETL logic and load the data in the exact temporal order in which the transactions were stored at source. To achieve this result, you design a mapping that orders and conditionally splits the data before loading it into the target. Then, you configure the mapping to generate code in row-based operating mode. In row-based operating mode, Warehouse Builder generates code to process the data row by row using if-then-else constructions, as shown in the following example.
CURSOR SELECT "DATETIME$1" FROM "JOURNAL_EXT" ORDER BY "JOURNAL_EXT"."DATETIME" ASC LOOP IF "ACTION" = 'I' THEN INSERT this row ELSE IF "ACTION" = 'U' THEN UPDATE this row ELSE DELETE FROM "TARGET_FOR_JOURNAL_EXT" END LOOP;
This ensures that all consecutive actions are implemented in sequential order and the data is loaded in the order in which the transaction was recorded.
Case Study
This case study shows you how to create ETL logic to load transaction data in a particular order using Warehouse Builder.
Step 1: Import and Sample the Source Flat File, record.csv
In this example, the flat file record.csv
stores all transaction records and a timestamp. Import this flat file from your source system using the Warehouse Builder Import Metadata Wizard. Proceed to define the metadata for the flat file in Warehouse Builder using the Flat File Sample Wizard.
Note:
You can replace this flat file with a regular table if your system is sourced from a table. In this case, skip to Step 3.Step 2: Create an External Table
To simplify the use of sampled flat file object in a mapping, create an external table (JOURNAL_EXT
) using the Create External Table Wizard, based on the flat file imported and sampled in Step 1.
The advantage of using an external table instead of a flat file is that it provides you direct SQL access to the data in your flat file. Hence, there is no need to stage the data.
Step 3: Design the Mapping
In this mapping, you move the transaction data from an external source, through an operator that orders the data, followed by an operator that conditionally splits the data before loading it into the target table. Figure 14-1 shows you how the source is ordered and split.
The Sorter operator enables you to order the data and process the transactions in the exact order in which they were recorded at source. The Splitter operator enables you to conditionally handle all the inserts, updates, and deletes recorded in the source data by defining a split condition that acts as the if-then-else constraint in the generated code. The data is conditionally split and loaded into the target table. In this mapping, the same target table is used three times to demonstrate this conditional loading. The mapping tables TARGET 1, TARGET 2, and TARGET 3 are all bound to the same workspace table TARGET
. All the data goes into a single target table.
The following steps show you how to build this mapping.
Step 4: Create the Mapping
Create a mapping called LOAD_JOURNAL_EXT
using the Create Mapping dialog box. Warehouse Builder then opens the Mapping Editor where you can build your mapping.
Step 5: Add an External Table Operator
Drag and drop a mapping external table operator onto the mapping editor and bind it to the external table JOURNAL_EXT
.
Step 6: Order the Data
Add the Sorter operator to define an order-by clause that specifies the order in which the transaction data must be loaded into the target. Figure 14–2 shows you how to order the table based on the timestamp of the transaction data in ascending order.
Step 7: Define a Split Condition
Add the Splitter operator to conditionally split the inserts, updates, and deletes stored in the transaction data. This split condition acts as the if-then-else constraint in the generated code. Figure 14-3 shows how to join the SOURCE operator with the ORDERBY operator which is linked to the SPLITTER operator.
Define the split condition for each type of transaction. For outgroup INSERT_DATA, define the split condition as INGRP1.ACTION = 'I'
. For UPDATE_DATA, define the split condition as INGRP1.ACTION = 'U'
. In Warehouse Builder, the Splitter operator contains a default group called REMAINING_ROWS that automatically handles all Delete ('D') records.
Step 8: Define the Target Tables
Use the same workspace target table three times for each type of transaction: one for INSERT_DATA, one for UPDATE_DATA, and one for REMAINING_ROWS.
Step 9: Configure the Mapping LOAD_JOURNAL_EXT
After you define the mapping, you must configure the mapping to generate code. Because the objective of this example is to process the data strictly in the order it was stored, you must select row-based as the default operating mode. In this mode, the data is processed row by row and the insert, update, and delete actions on the target tables take place in the exact order in which the transaction was recorded at source.
Do not select set-based mode as Warehouse Builder then generates code that creates one statement for all insert transactions, one statement for all update transactions, and a third one for all delete operations. The code then calls these procedures one after the other, completing one action completely before following up with the next action. For example, it first handles all inserts, then all updates, and then all deletes.
To configure a mapping for loading transaction data:
From the Project Explorer, right-click the LOAD_JOURNAL_EXT
mapping and select Configure.
Expand the Runtime parameters node and set the Default Operating Mode parameter to Row based.
In this example, accept the default value for all other parameters. Validate the mapping before generating the code.
Step 10: Generate Code
After you generate a mapping, Warehouse Builder displays the results in the Generation Results window.
When you inspect the code, you will see that Warehouse Builder implements all consecutive actions in row-based mode. This means that the data is processed row by row and Warehouse Builder evaluates all conditions in sequential order using if-then-else constructions, as shown in Figure 14-1. The resulting target table thus maintains the sequential integrity of the transactions recorded at source.