Oracle® Database 2 Day + Data Warehousing Guide 11g Release 1 (11.1) Part Number B28314-01 |
|
|
View PDF |
This section discusses how to identify and reduce performance issues, and includes the following sections:
An important aspect of ensuring that your system performs well is to eliminate performance problems. This section describes some methods of finding and eliminating these bottlenecks, and contains the following topics:
Optimizer statistics are a collection of data that describes more details about the database and the objects in the database. These statistics are stored in the data dictionary, and are used by the query optimizer to choose the best execution plan for each SQL statement. Optimizer statistics include the following:
Table statistics (number of rows, blocks, and the average row length)
Column statistics (number of distinct values in a column, number of null values in a column, and data distribution)
Index statistics (number of leaf blocks, levels, and clustering factor)
System statistics (CPU and I/O performance and utilization)
The optimizer statistics are stored in the data dictionary. They can be viewed using data dictionary views similar to the following:
SELECT * FROM DBA_SCHEDULER_JOBS WHERE JOB_NAME = 'GATHER_STATS_JOB';
Because the objects in a database can constantly change, statistics must be regularly updated so that they accurately describe these database objects. Statistics are maintained automatically by Oracle Database or you can maintain the optimizer statistics manually using the DBMS_STATS
package.
To execute a SQL statement, Oracle Database might need to perform many steps. Each of these steps either retrieves rows of data physically from the database or prepares them in some way for the user issuing the statement. The combination of the steps Oracle Database uses to execute a statement is called an execution plan. An execution plan includes an access path for each table that the statement accesses and an ordering of the tables (the join order) with the appropriate join method.
You can examine the execution plan chosen by the optimizer for a SQL statement by using the EXPLAIN
PLAN
statement. When the statement is issued, the optimizer chooses an execution plan and then inserts data describing the plan into a database table. Simply issue the EXPLAIN
PLAN
statement and then query the output table.
General guidelines for using the EXPLAIN
PLAN
statement are:
To use the SQL script UTLXPLAN.SQL
to create a sample output table called PLAN_TABLE
in your schema.
To include the EXPLAIN
PLAN
FOR
clause prior to the SQL statement.
After issuing the EXPLAIN
PLAN
statement, to use one of the scripts or packages provided by Oracle Database to display the most recent plan table output.
The execution order in EXPLAIN
PLAN
output begins with the line that is indented farthest to the right. If two lines are indented equally, then the top line is normally executed first.
The following statement illustrates the output of two EXPLAIN
PLAN
statements, one with dynamic pruning and one with static pruning.
To analyze EXPLAIN PLAN output:
EXPLAIN PLAN FOR SELECT p.prod_name , c.channel_desc , SUM(s.amount_sold) revenue FROM products p , channels c , sales s WHERE s.prod_id = p.prod_id AND s.channel_id = c.channel_id AND s.time_id BETWEEN '01-12-2001' AND '31-12-2001' GROUP BY p.prod_name , c.channel_desc; SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
WITHOUT TO_DATE --------------------------------------------------------------------------------------------------- | Id| Operation | Name |Rows|Bytes|Cost | Time |Pstart|Pstop| (%CPU) --------------------------------------------------------------------------------------------------- | 0|SELECT STATEMENT | | 252|15876|305(1)|00:00:06| | | | 1| HASH GROUP BY | | 252|15876|305(1)|00:00:06| | | | *2| FILTER | | | | | | | | | *3| HASH JOIN | |2255| 138K|304(1)|00:00:06| | | | 4| TABLE ACCESS FULL | PRODUCTS | 72| 2160| 2(0)|00:00:01| | | | 5| MERGE JOIN | |2286|75438|302(1)|00:00:06| | | | 6| TABLE ACCESS BY INDEX ROWID | CHANNELS | 5| 65| 2(0)|00:00:01| | | | 7| INDEX FULL SCAN | CHANNELS_PK | 5| | 1(0)|00:00:01| | | | *8| SORT JOIN | |2286|45720|299(1)|00:00:06| | | | 9| PARTITION RANGE ITERATOR | |2286|45720|298(0)|00:00:06| KEY| KEY| | 10| TABLE ACCESS BY LOCAL INDEX ROWID| SALES |2286|45720|298(0)|00:00:06| KEY| KEY| | 11| BITMAP CONVERSION TO ROWIDS | | | | | | | | |*12| BITMAP INDEX RANGE SCAN |SALES_TIME_BIX| | | | | KEY| KEY| --------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter(TO_DATE('01-12-2001')<=TO_DATE('31-12-2001')) 3 - access("S"."PROD_ID"="P"."PROD_ID") 8 - access("S"."CHANNEL_ID"="C"."CHANNEL_ID") filter("S"."CHANNEL_ID"="C"."CHANNEL_ID") 12 - access("S"."TIME_ID">='01-12-2001' AND "S"."TIME_ID"<='31-12-2001') Note the values of KEY KEY for Pstart and Pstop. WITH TO_DATE -------------------------------------------------------------------------------------------------- |Id| Operation | Name | Rows | Bytes |Cost(%CPU)| Time | Pstart| Pstop | -------------------------------------------------------------------------------------------------- | 0| SELECT STATEMENT | | 252 | 15876 | 31 (20)| 00:00:01 | | | | 1| HASH GROUP BY | | 252 | 15876 | 31 (20)| 00:00:01 | | | |*2| HASH JOIN | | 21717 | 1336K| 28 (11)| 00:00:01 | | | | 3| TABLE ACCESS FULL |PRODUCTS| 72 | 2160 | 2 (0)| 00:00:01 | | | |*4| HASH JOIN | | 21717 | 699K| 26 (12)| 00:00:01 | | | | 5| TABLE ACCESS FULL |CHANNELS| 5 | 65 | 3 (0)| 00:00:01 | | | | 6| PARTITION RANGE SINGLE| | 21717 | 424K| 22 (10)| 00:00:01 | 20 | 20 | |*7| TABLE ACCESS FULL |SALES | 21717 | 424K| 22 (10)| 00:00:01 | 20 | 20 | --------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("S"."PROD_ID"="P"."PROD_ID") 4 - access("S"."CHANNEL_ID"="C"."CHANNEL_ID") 7 - filter("S"."TIME_ID">=TO_DATE('2001-12-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') AND "S"."TIME_ID"<=TO_DATE('2001-12-31 00:00:00', 'yyyy-mm-dd hh24:mi:ss')) Note the values of 20 20 for Pstart and Pstop.
The first execution plan shows dynamic pruning, using the KEY
KEY
values for Pstart and Pstop respectively. Dynamic pruning means that the database will have to figure out at execution time which partition or partitions to access. In the case of static pruning, the database knows at parse time which partition or partitions to access, which leads to more efficient execution.
You can frequently improve the execution plan by using explicit date conversions. Using explicit date conversions is a best practice for optimal partition pruning and index usage.
Hints let you make decisions usually made by the optimizer. As an application developer, you might know information about your data that the optimizer does not know. Hints provide a mechanism to instruct the optimizer to choose a certain query execution plan based on the specific criteria.
For example, you might know that a certain index is more selective for certain queries. Based on this information, you might be able to choose a more efficient execution plan than the optimizer. In such a case, use hints to instruct the optimizer to use the optimal execution plan.
By default, Oracle Warehouse Builder includes hints to optimize a typical data load. See Oracle Warehouse Builder User's Guide for more information.
Suppose you want to very quickly run a summary across the sales table for last year while the system is otherwise idle. In this case, you could issue the following statement.
To use a hint to improve data warehouse performance:
SELECT /*+ PARALLEL(s,16) */ SUM(amount_sold) FROM sales s WHERE s.time_id BETWEEN TO_DATE('01-JAN-2005','DD-MON-YYYY') AND TO_DATE('31-DEC-2005','DD-MON-YYYY');
Another common use for hints in data warehouses is to ensure that records are efficiently loaded using compression. For this, you use the APPEND
hint, as shown in the following SQL:
... INSERT /* +APPEND */ INTO my_materialized_view ...
Using the SQL Tuning Advisor and SQL Access Advisor, you can invoke the query optimizer in advisory mode to examine a given SQL statement, or set of SQL statements, and provide recommendations to improve their efficiency. The SQL Tuning Advisor and SQL Access Advisor can make various types of recommendations, such as creating SQL profiles, restructuring SQL statements, creating additional indexes or materialized views, and refreshing optimizer statistics. Additionally, Oracle Enterprise Manager enables you to accept and implement many of these recommendations in very few steps.
The SQL Access Advisor is primarily responsible for making schema modification recommendations, such as adding or dropping indexes and materialized views. It also recommends a partitioning strategy. The SQL Tuning Advisor makes other types of recommendations, such as creating SQL profiles and restructuring SQL statements. In some cases where significant performance improvements can be gained by creating a new index, the SQL Tuning Advisor may recommend doing so. However, such recommendations should be verified by running the SQL Access Advisor with a SQL workload that contains a set of representative SQL statements.
Example: Using the SQL Tuning Advisor to Verify SQL Performance
You can use the SQL Tuning Advisor to tune a single or multiple SQL statements. When tuning multiple SQL statements, keep in the mind that the SQL Tuning Advisor does not recognize interdependencies between the SQL statements. Instead, it is just meant to be a convenient way for you to run the SQL Tuning Advisor for a large number of SQL statements.
To run the SQL Tuning Advisor to verify SQL performance:
Go to the Advisor Central page, then click SQL Advisors.
The SQL Advisors page is displayed.
Click Schedule SQL Tuning Advisor.
The Schedule SQL Tuning Advisor page is displayed. A suggested name will be in the Name field, which you can modify. Then select Comprehensive to have a comprehensive analysis performed. Select Immediately for the Schedule. Choose a proper SQL Tuning Set, and then click OK.
The Processing page is displayed. Then the Recommendations page shows the recommendations for improving performance. Click View Recommendations.
The Recommendations page is displayed.
The recommendation is to create an index, which you can implement by clicking Implement. Alternatively, you may want to run the SQL Access Advisor as well.
You can minimize resource consumption, and thus improve your data warehouse's performance through the use of the following capabilities:
Data warehouses often contain large tables and require techniques both for managing these large tables and for providing good query performance across these large tables. This section discusses partitioning, a key method for addressing these requirements. Two capabilities relevant for query performance in a data warehouse are partition pruning and partitionwise joins.
Partition pruning is an essential performance feature for data warehouses. In partition pruning, the optimizer analyzes FROM
and WHERE
clauses in SQL statements to eliminate unneeded partitions when building the partition access list. This enables Oracle Database to perform operations only on those partitions that are relevant to the SQL statement. Oracle Database prunes partitions when you use range, LIKE
, equality, and IN
-list predicates on the range or list partitioning columns, and when you use equality and IN
-list predicates on the hash partitioning columns.
Partition pruning dramatically reduces the amount of data retrieved from disk and shortens the use of processing time, thus improving query performance and resource utilization. If you partition the index and table on different columns (with a global partitioned index), partition pruning also eliminates index partitions even when the partitions of the underlying table cannot be eliminated.
Depending upon the actual SQL statement, Oracle Database may use static or dynamic pruning. Static pruning occurs at compiletime, with the information about the partitions accessed beforehand while dynamic pruning occurs at runtime, meaning that the exact partitions to be accessed by a statement are not known beforehand. A sample scenario for static pruning would be a SQL statement containing a WHERE
condition with a constant literal on the partition key column. An example of dynamic pruning is the use of operators or functions in the WHERE
condition.
Partition pruning affects the statistics of the objects where pruning will occur and will therefore also affect the execution plan of a statement.
Partitionwise joins reduce query response time by minimizing the amount of data exchanged among parallel execution servers when joins execute in parallel. This significantly reduces response time and improves the use of both CPU and memory resources. In Oracle Real Application Clusters environments, partitionwise joins also avoid or at least limit the data traffic over the interconnection, which is the key to achieving good scalability for massive join operations.
Partitionwise joins can be full or partial. Oracle Database decides which type of join to use.
You should always consider partitioning in data warehousing environments.
In the Advisor Central page, click SQL Advisors.
The SQL Advisors page is displayed.
Click SQL Access Advisor.
The SQL Access Advisor page is displayed.
From the Initial Options, select Use Default Options and click Continue.
From the Workload Sources, select Current and Recent SQL Activity and click Next.
The Recommendation Options page is displayed.
Select Partitioning and then Comprehensive Mode, then click Next.
The Schedule page is displayed.
Enter SQLACCESStest1
into the Task Name field and click Next
The Review page is displayed. Click Submit.
Click Submit.
The Confirmation page is displayed.
Select your task and click View Result. The Results for Task page is displayed, illustrating possible improvements as a result of partitioning.
In data warehouses, you can use materialized views to compute and store aggregated data such as the sum of sales. You can also use them to compute joins with or without aggregations, and they are very useful for frequently executed expensive joins between large tables as well as expensive calculations. A materialized view eliminates the overhead associated with expensive joins and aggregations for a large or important class of queries because it computes and stores summarized data before processing large joins or queries. Materialized views in these environments are often referred to as summaries.
One of the major benefits of creating and maintaining materialized views is the ability to take advantage of the query rewrite feature, which transforms a SQL statement expressed in terms of tables or views into a statement accessing one or more materialized views that are defined on the detail tables. The transformation is transparent to the user or application, requiring no intervention and no reference to the materialized view in the SQL statement. Because the query rewrite feature is transparent, materialized views can be added or dropped just like indexes without invalidating the SQL in the application code.
When underlying tables contain large amount of data, it is an expensive and time-consuming process to compute the required aggregates or to compute joins between these tables. In such cases, queries can take minutes or even hours to return the answer. Because materialized views contain already computed aggregates and joins, Oracle Database uses the powerful query rewrite process to quickly answer the query using materialized views.
Bitmap indexes are widely used in data warehousing environments. The environments typically have large amounts of data and ad hoc queries, but a low level of concurrent DML transactions. Fully indexing a large table with a traditional B-tree index can be prohibitively expensive in terms of disk space because the indexes can be several times larger than the data in the table. Bitmap indexes are typically only a fraction of the size of the indexed data in the table. For such applications, bitmap indexing provides the following:
Reduced response time for large classes of ad hoc queries
Reduced storage requirements compared to other indexing techniques
Dramatic performance gains even on hardware with a relatively small number of CPUs or a small amount of memory
Efficient maintenance during parallel DML and loads
You can maximize how resources are used in your system by ensuring that operations run in parallel whenever possible. Any database operation would run faster if it were not constrained by a resource at any point in time. The operation may be constrained by CPU resources, I/O capacity, memory, or interconnection traffic (in a cluster). To improve the performance of database operations, you focus on the performance problem and try to eliminate it (so that the problem might shift to another resource). Oracle Database provides functions to optimize the use of available resources, but also to avoid using unnecessary resources.
Parallel execution dramatically reduces response time for data-intensive operations on large databases typically associated with a decision support system (DSS) and data warehouses. You can also implement parallel execution on certain types of online transaction processing (OLTP) and hybrid systems. Parallel execution is sometimes called parallelism. Simply expressed, parallelism is the idea of breaking down a task so that, instead of one process doing all the work in a query, many processes do part of the work at the same time. An example of this is when four processes handle four different quarters in a year instead of one process handling all four quarters by itself. The improvement in performance can be quite high. Parallel execution improves processing for the following:
Queries requiring large table scans, joins, or partitioned index scans
Creation of large indexes
Creation of large tables (including materialized views)
Bulk insert, update, merge, and delete operations
You can also use parallel execution to access object types within an Oracle database. For example, you can use parallel execution to access large objects (LOBs).
Parallel execution benefits systems with all of the following characteristics:
Symmetric multiprocessors (SMPs), clusters, or massively parallel systems
Sufficient I/O bandwidth
Underutilized or intermittently used CPUs (for example, systems where CPU usage is typically less than 30 percent)
Sufficient memory to support additional memory-intensive processes, such as sorts, hashing, and I/O buffers
If your system lacks any of these characteristics, parallel execution might not significantly improve performance. In fact, parallel execution might reduce system performance on overutilized systems or systems with small I/O bandwidth.
Parallel execution divides the task of executing a SQL statement into multiple small units, each of which is executed by a separate process. Also, the incoming data (tables, indexes, partitions) can be divided into parts called granules. The user shadow process that wants to execute a query in parallel takes on the role as parallel execution coordinator or query coordinator. The query coordinator does the following:
Parses the query and determines the degree of parallelism
Allocates one or two sets of slaves (threads or processes)
Controls the query and sends instructions to the parallel query slaves
Determines which tables or indexes need to be scanned by the parallel query slaves
Produces the final output to the user
The parallel execution coordinator may enlist two or more of the instance's parallel execution servers to process a SQL statement. The number of parallel execution servers associated with a single operation is known as the degree of parallelism.
A single operation is a part of a SQL statement, such as an ORDER
BY
operation or a full table scan to perform a join on a nonindexed column table.
The degree of parallelism (DOP) is specified in the following ways:
At the statement level with PARALLEL
hints
At the session level by issuing the ALTER
SESSION
FORCE
PARALLEL
statement
At the table level in the table's definition
At the index level in the index's definition
Wait events are statistics that are incremented by a server process to indicate that the server process had to wait for an event to complete before being able to continue processing. A session could wait for a variety of reasons, including waiting for more input, waiting for the operating system to complete a service such as a disk write operation, or it could wait for a lock or latch.
When a session is waiting for resources, it is not doing any useful work. A large number of wait events is a source of concern. Wait event data reveals various symptoms of problems that might be affecting performance, such as latch contention, buffer contention, and I/O contention.