Oracle9i OLAP Developer's Guide to the OLAP DML Release 2 (9.2) Part Number A95298-01 |
|
Reading Data from Files, 6 of 9
The records in a data file often contain dimension values, which are used to identify the cell in which the data values should be stored. When all of the dimension values in the file already exist in your analytic workspace, you can use the default attribute MATCH
in the dimension field description. MATCH
accepts only dimension values that already are in the analytic workspace.
When FILEREAD
finds an unrecognized value, the command signals an error that warns you about the bad data. Your data-reading program can handle the error by skipping the data and continuing processing, or by halting the processing and letting you check the validity of the data file.
The following example shows a data file that contains 6-character values for the dimension productid
, names for each product, and the number of units sold.
1234AA00CHOCOLATE CHIP COOKIES 123 1099BB00OATMEAL COOKIES 145 2344CC00SUGAR COOKIES 223 3222DD00BROWNIES 432 5553EE00GINGER SNAP COOKIES 233
The following workspace objects are used by the example program.
DEFINE productid DIMENSION ID DEFINE productname VARIABLE TEXT <productid> DEFINE units.sold VARIABLE INTEGER <month productid>
The dr.prog
program reads the file. The values of productid
with the associated product name are already part of the analytic workspace, so the program uses the productid
values only to set status and assign the units data to the right cells of the units.sold
variable.
The MATCH
attribute is left out of the field description because it is the default. When the program finds a value for productid
that is not in the analytic workspace, it branches to the trap label. If the user interrupts the program (that is, the error name is attn
) or the data file cannot be opened, then the program ends. Otherwise, the program resets the error trap and branches back to FILEREAD
to continue with the next record.
The example program, named dr.prog
, has the following definition.
DEFINE dr.prog PROGRAM LD Reads a file with existing dimension values PROGRAM VARIABLE funit INTEGER TRAP ON error PUSHLEVEL 'save' PUSH month productid LIMIT month TO FIRST 1 funit = FILEOPEN('olapfiles/dr.dat' READ) next: FILEREAD funit - COLUMN 1 WIDTH 6 productid - COLUMN 39 WIDTH 3 units.sold FILECLOSE funit POPLEVEL 'save' RETURN error: "Skip current record and continue processing IF funit NE na and ERRORNAME NE 'ATTN' THEN DO TRAP ON error GOTO next DOEND "Close the file IF funit NE na THEN FILECLOSE funit POPLEVEL 'save' END
When your data file contains a mixture of existing and new dimension values, you can add the new values and all the associated data to the analytic workspace by using the APPEND
attribute in the field description.
The first FILEREAD
command in the dr.prog2
program uses APPEND
to add any new productid
values to the analytic workspace. The second FILEREAD
command includes a field to read the product name so the new data will be complete.
The dimension maintenance performed by APPEND
might be done in the same FILEREAD
command that reads the data, but that would cause inefficient handling of the data. The data is handled more efficiently when the dimension maintenance and data reading are performed in two separate passes over the file.
The error processing in this version is shorter because there is no need to skip nonexistent product values and branch back. If there is an error, then the program closes the file, restores any pushed values, and terminates.
The program, named dr.prog2
, has the following definition.
DEFINE dr.prog2 PROGRAM LD Reads a file with new dimension values PROGRAM VARIABLE funit INTEGER TRAP ON error PUSHLEVEL 'save' PUSH month productid LIMIT month TO FIRST 1 funit = FILEOPEN('olapfiles/dr.dat' READ) FILEREAD funit COLUMN 1 APPEND WIDTH 6 productid FILECLOSE funit funit = FILEOPEN('olapfiles/dr.dat' READ) FILEREAD funit - COLUMN 1 WIDTH 6 productid - COLUMN 9 WIDTH 30 productname - COLUMN 39 WIDTH 3 units.sold FILECLOSE funit POPLEVEL 'save' RETURN error: IF funit NE na THEN FILECLOSE funit POPLEVEL 'save' END
If the target dimension has a data type of TEXT
, NTEXT
, or ID
and the input field in the file contains dimension position numbers (rather than dimension values), then you must specify a conversion type of INTEGER
in the field description. The conversion type specifies how input data should be converted to values of the target dimension.
Suppose the target dimension is month
, then you can use the following command to read input values that represent positions within the default status of month
.
FILEREAD unit COLUMN 1 WIDTH 8 INTEGER month
When the input field contains position numbers, you cannot use the APPEND
keyword to add new values to a target dimension.
Composites are automatically maintained. The way in which you define and use composites can dramatically improve or hinder performance. The more you know about analytic workspace design, especially in regard to the applications that will be used with an analytic workspace, the more effective your use of composites will be.
When you have conjoint dimensions in your analytic workspace, you can set the status of those dimensions while reading a file with the FILEREAD
command. Typically, the records in the data file will have a separate field for each base dimension of your conjoint dimension. For example, a file might have a market name in the first field, a product name in the second, and then one or more fields containing sales data.
To read the sales data into a variable dimensioned by a conjoint dimension, for example markprod
, you can use a FILEREAD
command as follows.
FILEREAD funit markprod - = <W 8 market W 8 product> W 10 sales
This command will read a value of the market
dimension from the first 8-character field of the record and a value of the product
dimension from the next 8-character field.
The command will then use the results to set the status of markprod
, which is a conjoint dimension defined as follows.
DEFINE markprod DIMENSION <market product>
The command then reads the last field and assigns the value to the variable sales
, which is dimensioned by markprod
.
By including the APPEND
keyword in the field description, you can add new values to market
, product
, and markprod
, when the FILEREAD
command encounters values in the file that do not match existing dimension values.
FILEREAD funit APPEND markprod - = <W 8 APPEND market W 8 APPEND product> W 10 sales
The fields containing dimension information in your data file might have values that are not identical to the dimension values in your analytic workspace. The file values might be abbreviated or otherwise encoded. The way you translate a coded dimension value varies depending on whether the code is merely an abbreviation (for example, "P" for product
) or if the code is more complicated.
When the file contains an abbreviated code, you can sometimes complete the value by using the RSET
or LSET
attribute to add text to the right or left of the value in the file.
For example, products in the file might be identified by all-numeric product numbers, while in your analytic workspace, the values of the product
dimension might be these same product numbers preceded by the letter P
. In this case, you can use the LSET
attribute to add the letter P
to the values in the file.
FILEREAD funit COLUMN 1 WIDTH 6 LSET 'P' product
The letter P
is added when the value is read from the file; it is not added when the modified value is matched with or assigned to the product
dimension.
To correctly read values that have less straightforward codes, you can set up another dimension containing the coded values found in the data file, along with a relation to the real dimension. FILEREAD
can then use the relation to determine the actual dimension value. Or you can use any OLAP DML function to alter or manipulate the coded value to make it match a value in your analytic workspace.
When reading coded data that must be manipulated in some way before being stored in the target, use an assignment statement (shown below) in the field description.
target = expression
The expression
argument specifies the processing or calculation to be performed. If you want to include the value just read from the file as part of the expression
, then use the VALUE
keyword.
Both of the following field descriptions function identically.
COLUMN n WIDTH n target target = COLUMN n WIDTH n VALUE
This example illustrates the use of an expression for translating codes into dimension values.
The following example shows the data file, which has 3-character codes for months, and 2-character codes for districts and products.
SEP BO CH 113945 115 OCT BO CH 118934 115 SEP BO CO 92013 119 OCT BO CO 95820 119 SEP BO WI 83201 110 OCT BO WI 82986 110 SEP DA CH 111792 115 OCT DA CH 136031 114 SEP DA CO 91641 121 OCT DA CO 96347 120 SEP DA WI 89734 109 OCT DA WI 88264 109
The following OLAP DML objects are used by the example program.
DEFINE distcode DIMENSION ID DEFINE district.dcode RELATION district <distcode> DEFINE prodcode DIMENSION ID DEFINE Product.pcode RELATION product <prodcode>
The example program, named dr.prog3
, has the following definition.
DEFINE dr.prog3 PROGRAM LD Translates coded values into valid dimension values PROGRAM VARIABLE funit INT funit = FILEOPEN('olapfiles/dr3.dat' READ) FILEREAD funit - COLUMN 1 WIDTH 3 APPEND RSET '96' month FILECLOSE funit funit = FILEOPEN('olapfiles/dr3.dat' READ) FILEREAD funit - COLUMN 1 WIDTH 3 RSET '96' month - COLUMN 5 WIDTH 2 district = district.dcode - (distcode VALUE) - COLUMN 8 WIDTH 2 product = product.pcode - (prodcode VALUE) - COLUMN 11 WIDTH 6 STRIP units - COLUMN 18 WIDTH 3 SCALE 2 price FILECLOSE funit END
The program translates the 2-character codes for districts and products into values of a district
dimension and a product
dimension. The program also appends a 2-digit year to the months.
In the first FILEREAD
command, the APPEND
keyword is used so that new months are added to the MONTH dimension.
FILEREAD fileunit COLUMN 1 WIDTH 3 APPEND RSET '96' month
For the district and product fields, the program reads the value from the data file and finds the corresponding dimension value using the relations district.dcode
and product.pcode
.
COLUMN 5 WIDTH 2 district = district.dcode distcode VALUE) COLUMN 8 WIDTH 2 product = product.pcode (prodcode VALUE)
The program uses a QDR with the keyword VALUE
representing the code read from the data file. For the districts, the distcode VALUE
QDR modifies the relation district.dcode
, which holds district names. It specifies the district that corresponds to the value of distcode
just read from the data file. The QDR for product
works the same way.
The program assumes the product
and district
dimension values are already in the analytic workspace, along with the distcode
and prodcode
dimensions and the relations connecting them to district
and product
. Once the coded values have been processed, the resulting values of district
and product
are used to limit the dimension status so that the data is put in the right cells of the units
and price
variables.
Finally, you can see in the data file that the price data, which starts in column 18, does not have a decimal point. The SCALE
attribute on the last line of the FILEREAD
command puts two decimal places in each price value.
|
Copyright © 2001, 2002 Oracle Corporation. All Rights Reserved. |
|