Oracle9i OLAP User's Guide Release 2 (9.2.0.2) Part Number A95295-02 |
|
|
View PDF |
The basic data model in a relational database is a table composed of one or more columns of data. All of the data is stored in columns. In contrast, the basic data model for multidimensional analysis is a cube, which is composed of Measures, Dimensions, and Attributes.
Within the OLAP Catalog, you identify whether the data will function as a measure, a dimension, or an attribute. Once these decisions are stored in the OLAP Catalog metadata, the OLAP API can access warehouse data without regard to its underlying storage format. Whether the data is stored in relational tables, analytic workspaces, or some combination of relational and multidimensional schemas, the OLAP Catalog presents the same logical model to applications that use the OLAP API.
The OLAP Catalog metadata informs applications about the data that is available within the database. The application can then define multidimensional objects to represent that data. When the application runs, it instantiates these objects and populates them with data.
Before you can create metadata, you must know what data users want to view and at what levels they want to view it. If you have already created a data warehouse, then you have already done most of this research. You only need to verify that the requirements haven't changed for the analytical applications that will be run using Oracle OLAP.
The objects comprising a data warehouse and Oracle OLAP metadata use different data structures. The data objects in your data warehouse are represented to the OLAP metadata catalog in the following relational objects, regardless of whether the data is actually stored in relational tables or workspace variables:
Oracle OLAP metadata catalog maps the data warehouse schema to these multidimensional data objects:
Measures are the same as facts. The term "fact" is typically used in relational databases, and the term "measure" is typically used in multidimensional applications.
Measures are thus located in fact tables. A fact table has columns that store measures (or facts) and foreign key columns that create the association with dimension tables.
Measures contain the data that you wish to analyze, such as Sales or Cost. OLAP Catalog metadata requires that a column have a numerical or a date data type to be identified as a measure. Most frequently, a measure is numerical and additive.
Dimensions identify and categorize your data. Dimension members are stored in a dimension table. Each column represents a particular level in a hierarchy. In a star schema, the columns are all in the same table; in a snowflake schema, the columns are in separate tables for each level.
Because measures are typically multidimensional, a single value in a measure must be qualified by a member of each dimension to be meaningful. For example, the unit_cost
measure has two dimensions: products_dim
and times_dim
. A value of unit_cost
(21.60
) is only meaningful when it is qualified by a specific product code (1575
) and a time period (28-jan-1998
).
If you use Oracle Enterprise Manager to create OLAP metadata, then defining a dimension in your data warehouse creates a database dimension object, in addition to creating metadata. A dimension object contains the details of the parent-child relationship between columns in a dimension table; it does not contain data.
Note: A dimension object is not created when you use the |
The database dimension object is used by the Summary Advisor and query rewrite to optimize your data warehouse.
OLAP metadata considers time dimensions to be distinct from other dimensions. When you specify a dimension in the OLAP metadata, you must identify whether it is a time dimension. A time dimension has special attributes that support both regular and irregular time periods.
Regular time periods, such as weeks, months, and years, are evident on standard calendars. Typically, they neither overlap nor have gaps between them.
Irregular time periods, such as promotional schedules and seasonal time periods, are not evident on standard calendars. They often overlap (even to the extent that one time period is a subset of another time period) or have gaps between them.
The time dimension table should contain the following columns to provide full time support:
WEEK_ENDDATE
, QUARTER_ENDDATE
, and YEAR_ENDDATE
. These columns must have a DATE
data type. Their values identify the last day in the time period.WEEK_TIMESPAN
, QUARTER_TIMESPAN
, and YEAR_TIMESPAN
. These columns must have a NUMBER
data type. Their values identify the number of days in the period.
The following table describes a dimension table in a star schema.
The following tables describe dimension tables in a snowflake schema. The first table defines weeks, which is the lowest level of time data.
A second table defines quarters.
A third table defines years.
A hierarchy is a way to organize data according to levels. Dimensions are structured hierarchically so that data at different levels of aggregation can be manipulated together efficiently for analysis and display. Dimension hierarchies enable users to recognize trends at one level of aggregation, drill down to lower levels to identify reasons for these trends, and roll up to higher levels to see what affect these trends have on a larger sector of the business.
Each level represents a position in the hierarchy. Levels group the data for aggregation and are used internally for computation. Each level above the base (or lowest) level represents the aggregate total of the levels below it. For example, a time
dimension might have day
, week
, quarter
, and year
for the levels of a hierarchy. If data for the sales
measure is stored in days, then the higher levels of the time
dimension allow the sales
data to be aggregated correctly into weeks, quarters, and years. Days roll up into weeks, weeks into quarters, and quarters into years.
The members of a hierarchy at different levels have a one-to-many parent-child relationship. For example, qtr1
and qtr2
are the children of yr2001
, thus yr2001
is the parent of qtr1
and qtr2
.
Attributes provide descriptive information about the data and are typically used for display.
Level attributes provide supplementary information about the dimension members at a particular level of a dimension hierarchy. The dimension members themselves may be meaningless, such as a value of "1296" for a time period. These cryptic values for dimension members are used internally for selecting and sorting quickly, but are meaningless to users.
For example, you might have columns for employee number (ENUM
), last name (LAST_NAME
), first name (FIRST_NAME
), and telephone extension (TELNO
). ENUM
is the best choice for a level column, since it is a key column and its values uniquely identify the employees. ENUM
also has a NUMBER
data type, which makes it more efficient than a text column for the creation of indexes. LAST_NAME
, FIRST_NAME
, and TELNO
are attributes. Even though they are dimensioned by ENUM
, they do not make suitable measures because they are descriptive text rather than business measurements.
Dimension attributes specify groupings of level attributes for a specific dimension. Whereas level attributes map to specific data values, dimension attributes are purely logical metadata objects.
An example of a dimension attribute is end date
, which is required for time dimensions. If a time dimension has month, quarter, and year levels, end date
identifies the last date of each month, each quarter, and each year. Within a relational schema, the three level attributes that make up the end date
dimension attribute would be stored in columns with names like month_end_date
, quarter_end_date
, and year_end_date
.
Cubes are the metadata objects that associate measures with their dimensions. All the measures associated with a cube have the exact same dimensionality.
The edges of a cube are defined by its dimensions. Although there is no limit to the number of edges on a cube, data is often organized for display purposes along three edges, which are referred to as the row edge, column edge, and page edge. A single dimension or multiple dimensions can be placed on an edge. For example, sales data might be displayed with Product and Channel on the row edge, Time on the column edge, and Customer on the page edge.
Measures can be organized within measure folders, which facilitate the browsing of data by business area. Measure folders are also known as catalogs.
Whereas dimensions and measures are associated with the schemas that contain their source data, measure folders are schema independent. Each OLAP client can view all measure folders defined within the Oracle instance.