Oracle® OLAP DML Reference 10g Release 1 (10.1) Part Number B10339-02 |
|
|
View PDF |
The AGGREGATE command calculates summary data from detail data. Use the AGGREGATE command to pre-calculate data and store it in an Oracle OLAP analytic workspace. Use the AGGREGATE function to calculate data at runtime. In either case, the aggregation is limited to the base values that are currently in status.
Syntax
AGGREGATE|AGGR var... [USING aggmap] [FROM fromspec|FROMVAR textvar]
[FUNCDATA] [COUNTVAR intvar...]
Arguments
One or more variables whose data values are to be calculated. Every variable in a single AGGREGATE command must have exactly the same dimensions in exactly the same order. The variables are often numeric, but can also be TEXT
, DATETIME
, or DATE
when the aggregation operation is FIRST, LAST, MIN, or MAX, as specified in the aggmap.
This keyword indicates that the aggregation is performed using the specified aggmap. When you do not include this phrase, the command uses the default aggmap for the variable as previously specified using the AGGMAP command or the $AGGMAP property.
The name of a previously-defined aggmap that specifies how the data will be aggregated. For information about aggmaps, see the DEFINE AGGMAP command.
This keyword indicates that the detail data is obtained from a different object.
A FROM clause is only one way in which you can specify the variable from which detail data should be obtained when performing aggregation. See "Ways of Specifying Where to Obtain Detail Data for Aggregation ".
An arbitrarily dimensioned variable, formula, or relation from which the detail data for the aggregation is obtained.
This keyword indicates that the detail data is obtained from different objects to perform a capstone aggregation. (For an example of using the FROMVAR clause, see Example 7-7, "Capstone Aggregation".)
A FROMVAR clause is only one way in which you can specify the variable from which detail data should be obtained when performing aggregation. See "Ways of Specifying Where to Obtain Detail Data for Aggregation ".
An arbitrarily dimensioned variable used to resolve any leaf nodes. Specify NA
to indicate that a node does not need detail data to calculate the value.
Compiles the aggregation specification for future use by the AGGREGATE function. When you use FUNCDATA, you do not have to recompile the aggmap before using the AGGREGATE function, unless afterward you make changes to the aggmap, the relation hierarchies, or a composite.
When the variables have composite dimensions, the indexes (composite tuples) are created and saved for use by the AGGREGATE function. Otherwise, the indexes are re-created each time the AGGREGATE function is called. Refer to AGGINDEX for more information about composite indexes.
Indicates that the number of leaf nodes that contributed to an aggregate value are counted. Leaf nodes that are NA
are not included in the tally. You must include a COUNTVAR phrase when the aggmap contains a RELATION (for aggregation) statement that uses the AVERAGE operator.
A variable that you have defined with an INTEGER data type. The definition of intvar must have exactly the same dimensions in exactly the same order as the dimensions in var. When you aggregate several variables together, you must define an INTEGER variable for each one to record the results.
Notes
A dimension hierarchy is a tree structure in which the dimension values are the nodes. At the lowest level of the hierarchy are leaves or leaf nodes, and at the highest level is the root or root node. Nodes in a hierarchy have parent-child relationships. Leaf nodes have parents but no children; root nodes have children but no parents.
The current status only affects dimension values at the lowest level of the hierarchy, that is, the leaf nodes. Only leaf-node dimension values that are currently in status are aggregated. The parent values of leaf nodes in status are calculated, whether the parent values are in status or not (unless you exclude the dimension values in those levels with a RELATION PRECOMPUTE statement in the aggmap). Thus, when you want to aggregate all of the data specified in the aggmap, then be sure to set the status of the dimensions to ALL before performing the aggregation. AGGREGATE uses the parent relation to distinguish among dimension values at different levels of the hierarchy. Alternatively, you can perform a partial aggregation of the data by limiting status. However, this must be done carefully when some of the data will be aggregated at runtime by the AGGREGATE function. See the AGGREGATE function notes for more information.
For example, suppose you use the area
dimension and the area.area
child-parent relation that supports one hierarchy for a geography dimension as illustrated in Table 7-2, "Geography Hierarchy".
Table 7-2 Geography Hierarchy
Level | area Dimension | area.area Parent Relation |
---|---|---|
1 | TotalUS |
NA |
2 | East |
TotalUS |
2 | South |
TotalUS |
3 | Boston |
East |
3 | New York |
East |
3 | Atlanta |
South |
Now suppose you change the data value for New
York
. When you then use AGGREGATE with only New
York
, the calculation occurs without including the child value for South
(Atlanta
), but still includes level 2
as it goes from level 3
to level 1
(TotalUS
). When you want all the child values included in rolling up to TotalUS
, use a LIMIT TO ALL
statement before you execute the AGGREGATE command.
When the data has changed for some, but not all, of the child values in a hierarchy, you can set the status to calculate just the values that have changed. For example, when your embedded-total dimension is called d2
, and its parent relation is called reld2
, first limit d2
to the values that have changed.
To calculate the data for every hierarchy in a dimension, limit the dimension's hierarchy dimension to ALL
before you execute the AGGREGATE command.
You can control how much of the variable data is calculated by using the PRECOMPUTE keyword with the RELATION statement in the aggmap. Use the limit-clause (after the PRECOMPUTE keyword) to set the status of the dimension.
When users are able to change the data in a variable, then you should calculate aggregates on the fly using the AGGREGATE function, so that their changes are reflected in the aggregate data. See the AGGREGATE function for more information about runtime changes to the data.
AGGREGATE automatically distinguishes between generations in the parent relation, even to the extent of allowing generation-skipping hierarchies. For example, you can have a four-level hierarchy (for example, neighborhoods
, cities
, states
, and totalUS
) that has a three-level branch (for example, Boston
, Massachusetts
, and totalUS
).
AGGREGATE does not work on variables that have cell-by-cell permissions; it will immediately return an error. It also ignores the PERMITERROR option. However, AGGREGATE will operate on variables with object level or dimension level permission. See the PERMIT and PERMITERROR entries.
You can specify where to obtain detail data when aggregating data in the following ways:
Assign either a $AGGREGATE_FROM property or a $AGGREGATE_FROMVAR property to a variable.
Note: You can only assign one of these properties to a variable. A variable cannot have both the $AGGREGATE_FROM and $AGGREGATE_FROMVAR properties assigned to it. |
Include either a FROM or FROMVAR clause in the AGGREGATE command or AGGREGATE function that aggregates the data.
When performing an aggregation, Oracle OLAP determines where to obtain the detail data as follows:
When a location has been specified using a FROM or FROMVAR clause, Oracle OLAP uses the detail data at that location.
When a location has not been specified using a FROM or FROMVAR clause, Oracle OLAP checks to see if a location has been specified using aa $AGGREGATE_FROM property or a $AGGREGATE_FROMVAR property. When a location has been specified using one of these properties, Oracle OLAP uses the detail data at that location.
When a location has not been specified using either FROM or FROMVAR clause or a $AGGREGATE_FROM property or a $AGGREGATE_FROMVAR property, Oracle OLAP performs the aggregation using the detail data in the variable itself.
Examples
This section contains several examples of using the AGGREGATE command. For additional aggregation examples, see the examples in AGGMAP.
Example 7-2 Precalculating Data in a Batch Job
Frequently, you generate precalculated aggregates in a batch window as part of maintaining the data in your database. When you wish, you can use Job Manager to schedule batch jobs in Oracle Enterprise Manager, as described in the Oracle OLAP Application Developer's Guide.
To generate precalculated aggregates, you use the AGGREGATE command. The AGGREGATE command aggregates the data for one or more variables according to the specifications provided in the aggmap.
Your batch job should include statements like the following.
POUTFILEUNIT=FILEOPEN('userfiles/progress.txt' WRITE) AGGREGATE sales units USING gpct.aggmap UPDATE COMMIT FILECLOSE POUTFILEUNIT
Example 7-3 Aggregating One Variable
Suppose your analytic workspace contains a variable named actuals
, which has the following definition.
DEFINE actuals DECIMAL <time, SPARSE <product, customer, channel>>
The next step is to define an aggmap object, whose definition has the same dimensions in the same dimension order. Suppose you define an aggmap object named act.agg
using the DEFINE AGGMAP command.
DEFINE act.agg AGGMAP <time, SPARSE <product, customer, channel>>
Suppose that the name of the hierarchy for the time
dimension is time.r
, the name of the product
dimension is product.r
, and so on Next, you use the AGGMAP command to add the following text in the act.agg
aggmap.
AGGMAP RELATION time.r RELATION product.r RELATION customer.r RELATION channel.r END
The preceding text specifies the name of each dimension's hierarchy for which data should be rolled up. Assuming that the current status of every dimension is ALL
, data will be calculated for every dimension value of every dimension in the definition of actuals
. No data will be calculated on the fly.
Use the following statements to calculate the actuals
variable. (It is not necessary to compile the aggmap, because the compilation is included as part of the AGGREGATE command.)
AGGREGATE actuals USING act.agg
Example 7-4 Aggregating Multiple Variables
Suppose your analytic workspace contains a variable named actuals
and a variable named forecast
. As shown in the following variable definitions, these variables have the same dimensions in the same dimension order.
DEFINE actuals DECIMAL <time, SPARSE <product, customer, channel>> DEFINE forecast DECIMAL <time, SPARSE <product, customer, channel>>
The next step is to define an aggmap object, whose definition has the same dimensions in the same dimension order. Suppose you define the same aggmap object named act.agg
, as described in "Aggregating One Variable". As long as you want the data for each variable to be rolled up in exactly the same way, you can use the same aggmap to calculate both variables in a single command.
Use the following statements to calculate the actuals
and the forecast
variables.
AGGREGATE actuals forecast USING act.agg
Because the aggmap specifies that all data for every dimension value in each dimension should be rolled up, this command rolls up all of the data in actuals
and all of the data in forecast
.
Example 7-5 Using COUNTVAR with Multiple Variables
Suppose you plan to use one AGGREGATE command to aggregate the data for three variables: sales
, units
, and projected_sales
. Each variable has the following dimensionality.
<month product geography>
To tally the results with COUNTVAR, you must define three INTEGER variables that have the same dimensionality as sales
, units
, and projected_sales
.
DEFINE intsales INTEGER <month product geography> DEFINE intunits INTEGER <month product geography> DEFINE intprojsales INTEGER <month product geography>
You can then specify the INTEGER variables in the following command:
AGGREGATE sales units projected_sales USING sales.agg - COUNTVAR intsales intunits inprojsales
Example 7-6 Performing a Partial Aggregation
This example limits the time
dimension to the last two time periods, so that only newly loaded data is aggregated.
The tp2.agg
aggmap specifies preaggregation for all detail data currently in status.
DEFINE TP2.AGG AGGMAP LD Full preaggregation AGGMAP RELATION time.parentrel PRECOMPUTE (ALL) RELATION product.parentrel PRECOMPUTE (ALL) END
For the aggregation, time
is limited to the last two time periods and all product
values are in status.
LIMIT time TO LAST 2 STATUS time product The current status of TIME is: Apr02, May02 LIMIT product TO ALL
The AGGREGATE command calculates units
using the tp2.agg
aggmap.
AGGREGATE units USING tp2.agg
The results of this aggregation show that parent values are calculated, regardless of their own status, when their children are in status.
LIMIT time TO '2002' 'Q1.02' 'Q2.02' 'Jan02' to 'May02' REPORT DOWN time units
-----------------------------------------UNITS----------------------------------------- ----------------------------------------PRODUCT---------------------------------------- TIME FOOD SNACKS DRINKS POPCORN COOKIES CAKES SODA JUICE ------- -------- -------- -------- -------- -------- -------- -------- -------- 2002 38 24 14 6 9 9 9 5 Q1.02 NA NA NA NA NA NA NA NA Q2.02 38 24 14 6 9 9 9 5 Jan02 NA NA NA 8 2 4 5 8 Feb02 NA NA NA 5 3 2 2 5 Mar02 NA NA NA 3 4 4 2 4 Apr02 21 13 8 2 7 4 6 2 May02 17 11 6 4 2 5 3 3
Example 7-7 Capstone Aggregation
Assume that your analytic workspace has the two hierarchical TEXT
dimensions named geog.d
and time.d
with the following values.
GEOG.D -------------- Boston Medford San Diego Sunnydale Massachusetts California United States TIME.D -------------- Jan76 Feb76 Mar76 76Q1
Assume, also, that there are four variables with the following definitions
DEFINE sales_jan76 VARIABLE INTEGER <geog.d> DEFINE sales_feb76 VARIABLE INTEGER <geog.d> DEFINE sales_mar76 VARIABLE INTEGER <geog.d> DEFINE sales_capstone76 VARIABLE INTEGER <geog.d time.d>
Assume that you issue the following REPORT statements for the variables. The output of the reports show the detail data in the variables.
REPORT sales_jan76 sales_feb76 sales_mar76 REPORT DOWN geog.d sales_capstone76 GEOG.D SALES_JAN76 SALES_FEB76 SALES_MAR76 -------------- ------------ ------------ ------------ Boston 1,000 2,000 3,000 Medford 2,000 4,000 6,000 San Diego 3,000 6,000 9,000 Sunnydale 4,000 8,000 12,000 Massachusetts NA NA NA California NA NA NA United States NA NA NA -----------------SALES_CAPSTONE76------------------ ----------------------TIME.D----------------------- GEOG.D Jan76 Feb76 Mar76 76Q1 -------------- ------------ ------------ ------------ ------------ Boston NA NA NA NA Medford NA NA NA NA San Diego NA NA NA NA Sunnydale NA NA NA NA Massachusetts NA NA NA NA California NA NA NA NA United States NA NA NA NA
Define two aggmap objects with the following definitions.
DEFINE leaf_aggmap AGGMAP AGGMAP RELATION geog.parentrel OPERATOR SUM END DEFINE capstone_aggmap AGGMAP AGGMAP RELATION time.parentrel OPERATOR SUM END
Define a variable named capstone_source
with the following definition to use to aggregate the data.
DEFINE capstone_source VARIABLE TEXT <time.d>
As the following output of a REPORT statement illustrates, for each value of time.d
, you populate capstone_source
with the name of the variable that contains the corresponding sales data.
TIME.D CAPSTONE_SOURCE -------------- ---------------------- Jan76 sales_jan76 Feb76 sales_feb76 Mar76 sales_mar76 76Q1 NA
Issue the following statements to aggregate the variables.
AGGREGATE sales_jan76 sales_feb76 sales_mar76 USING leaf_aggmap AGGREGATE sales_capstone76 USING capstone_aggmap FROMVAR capstone_source
After aggregating the variables, when you issue the REPORT statements, the variables are populated with the calculated data.
REPORT sales_jan76 sales_feb76 sales_mar76 REPORT DOWN geog.d sales_capstone76 GEOG.D SALES_JAN76 SALES_FEB76 SALES_MAR76 -------------- ------------ ------------ ------------ Boston 1,000 2,000 3,000 Medford 2,000 4,000 6,000 San Diego 3,000 6,000 9,000 Sunnydale 4,000 8,000 12,000 Massachusetts 3,000 6,000 9,000 California 7,000 14,000 21,000 United States 10,000 20,000 30,000 -----------------SALES_CAPSTONE76------------------ ----------------------TIME.D----------------------- GEOG.D Jan76 Feb76 Mar76 76Q1 -------------- ------------ ------------ ------------ ------------ Boston 1,000 2,000 3,000 6,000 Medford 2,000 4,000 6,000 12,000 San Diego 3,000 6,000 9,000 18,000 Sunnydale 4,000 8,000 12,000 24,000 Massachusetts 3,000 6,000 9,000 18,000 California 7,000 14,000 21,000 42,000 United States 10,000 20,000 30,000 60,000
Example 7-8 Aggregating a Variable with External Partitions
Assume that you have the following objects defined in your analytic workspace.
DEFINE YEAR_2003 DIMENSION TEXT DEFINE YEAR_2002 DIMENSION TEXT DEFINE PRODUCT DIMENSION TEXT DEFINE SALES_2003 VARIABLE DECIMAL <YEAR_2003 PRODUCT> DEFINE SALES_2002 VARIABLE DECIMAL <YEAR_2002 PRODUCT> DEFINE TIME DIMENSION CONCAT (YEAR_2003 YEAR_2002) UNIQUE DEFINE TIME_PARENTREL RELATION TIME <TIME> DEFINE PART_TEMP_SALES_BY_YEAR PARTITION TEMPLATE <TIME PRODUCT> - PARTITION BY CONCAT (TIME) - (PARTITION PARTITION_2002 <YEAR_2002 PRODUCT> - PARTITION PARTITION_2003 <YEAR_2003 PRODUCT>) DEFINE SALES VARIABLE DECIMAL <PART_TEMP_SALES_BY_YEAR <TIME PRODUCT>> - (PARTITION PARTITION_2002 EXTERNAL SALES_2002 - PARTITION PARTITION_2003 EXTERNAL SALES_2003) DEFINE AGG_SALES AGGMAP AGGMAP RELATION time_parentrel OPERATOR SUM END
To aggregate sales
, you issue the following statement.
AGGREGATE sales USING agg_sales
When you issue REPORT statements on sales, you can see the aggregated values in sales.
--------SALES-------- -------PRODUCT------- TIME 00001 00002 ---------- ---------- ---------- 01Jan2003 10.00 15.21 31Jan2003 10.88 13.37 01Dec2003 NA NA 31Dec2003 NA NA Jan2003 20.88 28.58 Dec2003 NA NA 2003 20.88 28.58 01Jan2002 14.44 11.03 31Jan2002 15.55 12.20 01Dec2002 11.39 12.80 31Dec2002 10.53 13.77 Jan2002 29.98 23.23 Dec2002 21.92 26.57 2002 51.91 49.80
Since sales_2002
and sales_2003
are external partitions of sales
, aggregating sales
effectively means that you aggregated sales_2002
and sales_2003
. When you issue REPORT statements onsales_2002
and sales_2003
, you can see the aggregated values in those variables.
-----SALES_2002------ -------PRODUCT------- YEAR_2002 00001 00002 ---------- ---------- ---------- 01Jan2002 14.44 11.03 31Jan2002 15.55 12.20 01Dec2002 11.39 12.80 31Dec2002 10.53 13.77 Jan2002 29.98 23.23 Dec2002 21.92 26.57 2002 51.91 49.80 -----SALES_2003------ -------PRODUCT------- YEAR_2003 00001 00002 ---------- ---------- ---------- 01Jan2003 10.00 15.21 31Jan2003 10.88 13.37 01Dec2003 NA NA 31Dec2003 NA NA Jan2003 20.88 28.58 Dec2003 NA NA 2003 20.88 28.58