Oracle® Database Performance Tuning Guide 10g Release 1 (10.1) Part Number B10752-01 |
|
|
View PDF |
After the initial configuration of a database, tuning an instance is important to eliminate any performance bottlenecks. This chapter discusses the tuning process based on the Oracle performance views.
This chapter contains the following sections:
These are the main steps in the Oracle performance method for instance tuning:
Get candid feedback from users about the scope of the performance problem.
Propose changes to be made and the expected result of implementing the changes. Then, implement the changes and measure application performance.
See Also:
"The Oracle Performance Improvement Method" for a theoretical description of this performance method and a list of common errors |
The remainder of this chapter discusses instance tuning using the Oracle dynamic performance views. However, Oracle recommends using the Automatic Workload Repository and Automatic Database Diagnostic Monitor for statistics gathering, monitoring, and tuning due to the extended feature list. See "Automatic Workload Repository" and "Automatic Database Diagnostic Monitor".
Note: If your site does not have the Automatic Workload Repository and Automatic Database Diagnostic Monitor features, then Statspack can be used to gather Oracle instance statistics. |
It is vital to develop a good understanding of the purpose of the tuning exercise and the nature of the problem before attempting to implement a solution. Without this understanding, it is virtually impossible to implement effective changes. The data gathered during this stage helps determine the next step to take and what evidence to examine.
Gather the following data:
What is the measure of acceptable performance? How many transactions an hour, or seconds, response time will meet the required performance level?
What is affected by the slowdown? For example, is the whole instance slow? Is it a particular application, program, specific operation, or a single user?
Is the problem only evident during peak hours? Does performance deteriorate over the course of the day? Was the slowdown gradual (over the space of months or weeks) or sudden?
This helps identify the extent of the problem and also acts as a measure for comparison when deciding whether changes implemented to fix the problem have actually made an improvement. Find a consistently reproducible measure of the response time or job run time. How much worse are the timings than when the program was running well?
Identify what has changed since performance was acceptable. This may narrow the potential cause quickly. For example, has the operating system software, hardware, application software, or Oracle release been upgraded? Has more data been loaded into the system, or has the data volume or user population grown?
At the end of this phase, you should have a good understanding of the symptoms. If the symptoms can be identified as local to a program or set of programs, then the problem is handled in a different manner than instance-wide performance issues.
See Also:
Chapter 12, "SQL Tuning Overview" for information on solving performance problems specific to an application or user |
Look at the load on the database server, as well as the database instance. Consider the operating system, the I/O subsystem, and network statistics, because examining these areas helps determine what might be worth further investigation. In multitier systems, also examine the application server middle-tier hosts.
Examining the host hardware often gives a strong indication of the bottleneck in the system. This determines which Oracle performance data could be useful for cross-reference and further diagnosis.
Data to examine includes the following:
If there is a significant amount of idle CPU, then there could be an I/O, application, or database bottleneck. Note that wait I/O should be considered as idle CPU.
If there is high CPU usage, then determine whether the CPU is being used effectively. Is the majority of CPU usage attributable to a small number of high-CPU using programs, or is the CPU consumed by an evenly distributed workload?
If the CPU is used by a small number of high-usage programs, then look at the programs to determine the cause. Check whether some processes alone consume the full power of one CPU. Depending on the process, this could be an indication of a CPU or process bound workload which can be tackled by dividing or parallelizing the process activity.
If the programs are not Oracle programs, then identify whether they are legitimately requiring that amount of CPU. If so, determine whether their execution be delayed to off-peak hours. Identifying these CPU intensive processes can also help narrowing what specific activity, such as I/O, network, and paging, is consuming resources and how can it be related to the Oracle workload.
If a small number of Oracle processes consumes most of the CPU resources, then use SQL_TRACE and TKPROF to identify the SQL or PL/SQL statements to see if a particular query or PL/SQL program unit can be tuned. For example, a SELECT statement could be CPU-intensive if its execution involves many reads of data in cache (logical reads) that could be avoided with better SQL optimization.
Oracle CPU statistics are available in several V$
views:
V$SYSSTAT
shows Oracle CPU usage for all sessions. The CPU
used
by
this
session
statistic shows the aggregate CPU used by all sessions. The parse
time
cpu
statistic shows the total CPU time used for parsing.V$SESSTAT
shows Oracle CPU usage for each session. Use this view to determine which particular session is using the most CPU.V$RSRC_CONSUMER_GROUP
shows CPU utilization statistics for each consumer group when the Oracle Database Resource Manager is running.It is important to recognize that CPU time and real time are distinct. With eight CPUs, for any given minute in real time, there are eight minutes of CPU time available. On Windows and UNIX, this can be either user time or system time (privileged mode on Windows). Thus, average CPU time utilized by all processes (threads) on the system could be greater than one minute for every one minute real time interval.
At any given moment, you know how much time Oracle has used on the system. So, if eight minutes are available and Oracle uses four minutes of that time, then you know that 50% of all CPU time is used by Oracle. If your process is not consuming that time, then some other process is. Identify the processes that are using CPU time, figure out why, and then attempt to tune them. See Chapter 20, "Using Application Tracing Tools".
If the CPU usage is evenly distributed over many Oracle server processes, examine the V$SYS_TIME_MODEL
view to help get a precise understanding of where most time is spent. See Table 10-1, " Wait Events and Potential Causes".
An overly active I/O system can be evidenced by disk queue lengths greater than two, or disk service times that are over 20-30ms. If the I/O system is overly active, then check for potential hot spots that could benefit from distributing the I/O across more disks. Also identify whether the load can be reduced by lowering the resource requirements of the programs using those resources.
Use operating system monitoring tools to determine what processes are running on the system as a whole and to monitor disk access to all files. Remember that disks holding datafiles and redo log files can also hold files that are not related to Oracle. Reduce any heavy access to disks that contain database files. Access to non-Oracle files can be monitored only through operating system facilities, rather than through the V$
views.
Utilities, such as sar
-d
(or iostat
) on many UNIX systems and the administrative performance monitoring tool on Windows systems, examine I/O statistics for the entire system.
Check the Oracle wait event data in V$SYSTEM_EVENT
to see whether the top wait events are I/O related. I/O related events include db
file
sequential
read
, db
file
scattered
read
, db
file
single
write
, and db
file
parallel
write
, and log
file
parallel
write
. These are all events corresponding to I/Os performed against datafiles and log files. If any of these wait events correspond to high average time, then investigate the I/O contention.
Cross reference the host I/O system data with the I/O sections in the Automatic Repository report to identify hot datafiles and tablespaces. Also compare the I/O times reported by the operating system with the times reported by Oracle to see if they are consistent.
An I/O problem can also manifest itself with non-I/O related wait events. For example, the difficulty in finding a free buffer in the buffer cache or high wait times for log to be flushed to disk can also be symptoms of an I/O problem. Before investigating whether the I/O system should be reconfigured, determine if the load on the I/O system can be reduced. To reduce Oracle I/O load, look at SQL statements that perform many physical reads by querying the V$SQLAREA
view or by reviewing the 'SQL ordered by Reads' section of the Automatic Workload Repository report. Examine these statements to see how they can be tuned to reduce the number of I/Os.
If there are Oracle-related I/O problems caused by SQL statements, then tune them. If the Oracle server is not consuming the available I/O resources, then identify the process that is using up the I/O. Determine why the process is using up the I/O, and then tune this process.
See Also:
|
Using operating system utilities, look at the network round-trip ping time and the number of collisions. If the network is causing large delays in response time, then investigate possible causes.
Network load can be reduced by scheduling large data transfers to off-peak times, or by coding applications to batch requests to remote hosts, rather than accessing remote hosts once (or more) for one request.
Oracle statistics should be examined and cross-referenced with operating system statistics to ensure a consistent diagnosis of the problem. operating-system statistics can indicate a good place to begin tuning. However, if the goal is to tune the Oracle instance, then look at the Oracle statistics to identify the resource bottleneck from an Oracle perspective before implementing corrective action. See "Interpreting Oracle Statistics".
The following sections discuss the common Oracle data sources used while tuning.
Oracle provides the initialization parameter STATISTICS_LEVEL
, which controls all major statistics collections or advisories in the database. This parameter sets the statistics collection level for the database.
Depending on the setting of STATISTICS_LEVEL
, certain advisories or statistics are collected, as follows:
BASIC
: No advisories or statistics are collected. Monitoring and many automatic features are disabled. Oracle does not recommend this setting because it disables important Oracle features.TYPICAL
: This is the default value and ensures collection for all major statistics while providing best overall database performance. This setting should be adequate for most environments.ALL
: All of the advisories or statistics that are collected with the TYPICAL
setting are included, plus timed operating system statistics and row source execution statistics.
See Also:
|
This view lists the status of the statistics or advisories controlled by STATISTICS_LEVEL
.
See Also:
Oracle Database Reference for information about the dynamic performance |
Wait events are statistics that are incremented by a server process or thread to indicate that it had to wait for an event to complete before being able to continue processing. Wait event data reveals various symptoms of problems that might be impacting performance, such as latch contention, buffer contention, and I/O contention. Remember that these are only symptoms of problems, not the actual causes.
Wait events are grouped into classes. The wait event classes include: Administrative, Application, Cluster, Commit, Concurrency, Configuration, Idle, Network, Other, Scheduler, System I/O, and User I/O.
A server process can wait for the following:
See Also:
Oracle Database Reference for more information about Oracle wait events |
Wait event statistics include the number of times an event was waited for and the time waited for the event to complete. If the initialization parameter TIMED_STATISTICS
is set to true
, then you can also see how long each resource was waited for.
To minimize user response time, reduce the time spent by server processes waiting for event completion. Not all wait events have the same wait time. Therefore, it is more important to examine events with the most total time waited rather than wait events with a high number of occurrences. Usually, it is best to set the dynamic parameter TIMED_STATISTICS
to true
at least while monitoring performance. See "Setting the Level of Statistics Collection" for information about STATISTICS_LEVEL
settings.
These dynamic performance views can be queried for wait event statistics:
V$ACTIVE_SESSION_HISTORY
The V$ACTIVE_SESSION_HISTORY
view displays active database session activity, sampled once every second. See "Active Session History (ASH)".
V$SESS_TIME_MODEL
and V$SYS_TIME_MODEL
The V$SESS_TIME_MODEL
and V$SYS_TIME_MODEL
views contain time model statistics, including DB
time
which is the total time spent in database calls
V$SESSION_WAIT
The V$SESSION_WAIT
view displays the resources or events for which active sessions are waiting.
V$SESSION
The V$SESSION
view contains the same wait statistics that are contained in the V$SESSION_WAIT
view. If applicable, this view also contains detailed information on the object that the session is currently waiting for (object number, block number, file number, and row number), plus the blocking session responsible for the current wait.
V$SESSION_EVENT
The V$SESSION_EVENT
view provides summary of all the events the session has waited for since it started.
V$SESSION_WAIT_CLASS
The V$SESSION_WAIT_CLASS
view provides the number of waits and the time spent in each class of wait events for each session.
V$SESSION_WAIT_HISTORY
The V$SESSION_WAIT_HISTORY
view provides the last ten wait events for each active session.
V$SYSTEM_EVENT
The V$SYSTEM_EVENT
view provides a summary of all the event waits on the instance since it started.
V$EVENT_HISTOGRAM
The V$EVENT_HISTOGRAM
view displays a histogram of the number of waits, the maximum wait, and total wait time on a per-child cursor basis.
V$FILE_HISTOGRAM
The V$FILE_HISTOGRAM
view displays a histogram of times waited during single block reads for each file.
V$SYSTEM_WAIT_CLASS
The V$SYSTEM_WAIT_CLASS
view provides the instance wide time totals for the number of waits and the time spent in each class of wait events. This view also shows the object number for which the session is waiting.
V$TEMP_HISTOGRAM
The V$TEMP_HISTOGRAM
view displays a histogram of times waited during single block reads for each temporary file.
See Also:
Oracle Database Reference for information about the dynamic performance views |
Investigate wait events and related timing data when performing reactive performance tuning. The events with the most time listed against them are often strong indications of the performance bottleneck. For example, by looking at V$SYSTEM_EVENT
, you might notice lots of buffer
busy
waits
. It might be that many processes are inserting into the same block and must wait for each other before they can insert. The solution could be to use automatic segment space management or partitioning for the object in question. See "Wait Events Statistics" for a description of the differences between the views V$SESSION_WAIT
, V$SESSION_EVENT
, and V$SYSTEM_EVENT
.
System statistics are typically used in conjunction with wait event data to find further evidence of the cause of a performance problem.
For example, if V$SYSTEM_EVENT
indicates that the largest wait event (in terms of wait time) is the event buffer
busy
waits
, then look at the specific buffer wait statistics available in the view V$WAITSTAT
to see which block type has the highest wait count and the highest wait time.
After the block type has been identified, also look at V$SESSION
real-time while the problem is occurring or V$ACTIVE_SESSION_HISTORY
and DBA_HIST_ACTIVE_SESS_HISTORY
views after the problem has been experienced to identify the contended-for objects using the object number indicated. The combination of this data indicates the appropriate corrective action.
Statistics are available in many V$
views. Some common views include the following:
This view displays active database session activity, sampled once every second. See "Active Session History (ASH)".
This contains overall statistics for many different parts of Oracle, including rollback, logical and physical I/O, and parse data. Data from V$SYSSTAT
is used to compute ratios, such as the buffer cache hit ratio.
This contains detailed file I/O statistics for each file, including the number of I/Os for each file and the average read time.
This contains detailed rollback and undo segment statistics for each segment.
This contains detailed enqueue statistics for each enqueue, including the number of times an enqueue was requested and the number of times an enqueue was waited for, and the wait time.
This contains detailed latch usage statistics for each latch, including the number of times each latch was requested and the number of times the latch was waited for.
See Also:
Oracle Database Reference for information about dynamic performance views |
You can gather segment-level statistics to help you spot performance problems associated with individual segments. Collecting and viewing segment-level statistics is a good way to effectively identify hot tables or indexes in an instance.
After viewing wait events and system statistics to identify the performance problem, you can use segment-level statistics to find specific tables or indexes that are causing the problem. Consider, for example, that V$SYSTEM_EVENT
indicates that buffer busy waits cause a fair amount of wait time. You can select from V$SEGMENT_STATISTICS
the top segments that cause the buffer busy waits. Then you can focus your effort on eliminating the problem in those segments.
You can query segment-level statistics through the following dynamic performance views:
V$SEGSTAT_NAME
This view lists the segment statistics being collected, as well as the properties of each statistic (for instance, if it is a sampled statistic).V$SEGSTAT
This is a highly efficient, real-time monitoring view that shows the statistic value, statistic name, and other basic information.V$SEGMENT_STATISTICS
This is a user-friendly view of statistic values. In addition to all the columns of V$SEGSTAT
, it has information about such things as the segment owner and table space name. It makes the statistics easy to understand, but it is more costly.
See Also:
Oracle Database Reference for information about dynamic performance views |
Often at the end of a tuning exercise, it is possible to identify two or three changes that could potentially alleviate the problem. To identify which change provides the most benefit, it is recommended that only one change be implemented at a time. The effect of the change should be measured against the baseline data measurements found in the problem definition phase.
Typically, most sites with dire performance problems implement a number of overlapping changes at once, and thus cannot identify which changes provided any benefit. Although this is not immediately an issue, this becomes a significant hindrance if similar problems subsequently appear, because it is not possible to know which of the changes provided the most benefit and which efforts to prioritize.
If it is not possible to implement changes separately, then try to measure the effects of dissimilar changes. For example, measure the effect of making an initialization change to optimize redo generation separately from the effect of creating a new index to improve the performance of a modified query. It is impossible to measure the benefit of performing an operating system upgrade if SQL is tuned, the operating system disk layout is changed, and the initialization parameters are also changed at the same time.
Performance tuning is an iterative process. It is unlikely to find a 'silver bullet' that solves an instance-wide performance problem. In most cases, excellent performance requires iteration through the performance tuning phases, because solving one bottleneck often uncovers another (sometimes worse) problem.
Knowing when to stop tuning is also important. The best measure of performance is user perception, rather than how close the statistic is to an ideal value.
Gather statistics that cover the time when the instance had the performance problem. If you previously captured baseline data for comparison, then you can compare the current data to the data from the baseline that most represents the problem workload.
When comparing two reports, ensure that the two reports are from times where the system was running comparable workloads.
Usually, wait events are the first data examined. However, if you have a baseline report, then check to see if the load has changed. Regardless of whether you have a baseline, it is useful to see whether the resource usage rates are high.
Load-related statistics to examine include redo
size
, session
logical
reads
, db
block
changes
, physical
reads
, physical
writes
, parse
count
(total
), parse
count
(hard
), and user
calls
. This data is queried from V$SYSSTAT
. It is best to normalize this data over seconds and over transactions.
In the Automatic Workload Repository report, look at the Load Profile section. The data has been normalized over transactions and over seconds.
The load profile statistics over seconds show the changes in throughput (that is, whether the instance is performing more work each second). The statistics over transactions identify changes in the application characteristics by comparing these to the corresponding statistics from the baseline report.
Examine the statistics normalized over seconds to identify whether the rates of activity are very high. It is difficult to make blanket recommendations on high values, because the thresholds are different on each site and are contingent on the application characteristics, the number and speed of CPUs, the operating system, the I/O system, and the Oracle release.
The following are some generalized examples (acceptable values vary at each site):
DB
time
found in V$SYSSTAT
. If so, examine the SQL
ordered
by
Parse
Calls
section of the Automatic Workload Repository report.Whenever an Oracle process waits for something, it records the wait using one of a set of predefined wait events. These wait events are grouped in wait classes. The Idle wait class groups all events that a process waits for when it does not have work to do and is waiting for more work to perform. Non-idle events indicate nonproductive time spent waiting for a resource or action to complete.
Note: Not all symptoms can be evidenced by wait events. See "Additional Statistics" for the statistics that can be checked. |
The most effective way to use wait event data is to order the events by the wait time. This is only possible if TIMED_STATISTICS
is set to true
. Otherwise, the wait events can only be ranked by the number of times waited, which is often not the ordering that best represents the problem.
See Also:
|
To get an indication of where time is spent, follow these steps:
V$SYSTEM_EVENT
. The events of interest should be ranked by wait time.
Identify the wait events that have the most significant percentage of wait time. To determine the percentage of wait time, add the total wait time for all wait events, excluding idle events, such as Null
event
, SQL*Net
message
from
client
, SQL*Net
message
to
client
, and SQL*Net
more
data
to
client
. Calculate the relative percentage of the five most prominent events by dividing each event's wait time by the total time waited for all events.
See Also:
|
Alternatively, look at the Top 5 Timed Events section at the beginning of the Automatic Workload Repository report. This section automatically orders the wait events (omitting idle events), and calculates the relative percentage:
Top 5 Timed Events ~~~~~~~~~~~~~~~~~~ % Total Event Waits Time (s) Call Time -------------------------------------- ------------ ----------- --------- CPU time 559 88.80 log file parallel write 2,181 28 4.42 SQL*Net more data from client 516,611 27 4.24 db file parallel write 13,383 13 2.04 db file sequential read 563 2 .27
In some situations, there might be a few events with similar percentages. This can provide extra evidence if all the events are related to the same type of resource request (for example, all I/O related events).
Avg Total Wait wait Waits Event Waits Timeouts Time (s) (ms) /txn --------------------------- --------- --------- ---------- ------ --------- log file parallel write 2,181 0 28 13 41.2 SQL*Net more data from clie 516,611 0 27 0 9,747.4 db file parallel write 13,383 0 13 1 252.5
V$SYSSTAT
, operating system statistics, and so on. Perform cross-checks with other data to confirm or refute the developing theory.Table 10-1 links wait events to possible causes and gives an overview of the Oracle data that could be most useful to review next.
You may also want to review the following Oracle Metalink notices on buffer
busy
waits
(34405.1) and free
buffer
waits
(62172.1):
You can also access these notices and related notices by searching for "busy buffer waits" and "free buffer waits" at:
http://metalink.oracle.com
See Also:
|
There are a number of statistics that can indicate performance problems that do not have corresponding wait events.
The V$SYSSTAT
statistic redo
log
space
requests
indicates how many times a server process had to wait for space in the online redo log, not for space in the redo log buffer. A significant value for this statistic and the wait events should be used as an indication that checkpoints, DBWR, or archiver activity should be tuned, not LGWR. Increasing the size of log buffer does not help.
Your system might spend excessive time rolling back changes to blocks in order to maintain a consistent view. Consider the following scenarios:
V$SYSSTAT
statistics to determine whether this is happening:
consistent
changes
statistic indicates the number of times a database block has rollback entries applied to perform a consistent read on the block. Workloads that produce a great deal of consistent
changes
can consume a great deal of resources.consistent
gets
statistic counts the number of logical reads in consistent mode.The ratio of the following V$SYSSTAT
statistics should be close to 1:
ratio = transaction tables consistent reads - undo records applied / transaction tables consistent read rollbacks
The recommended solution is to use automatic undo management.
WAITS
to the number of GETS
in V$ROLLSTAT
; the proportion of WAITS
to GETS
should be small.V$WAITSTAT
to see whether there are many WAITS
for buffers of CLASS
'undo
header
'.The recommended solution is to use automatic undo management.
You can detect migrated or chained rows by checking the number of table
fetch
continued
row
statistic in V$SYSSTAT
. A small number of chained rows (less than 1%) is unlikely to impact system performance. However, a large percentage of chained rows can affect performance.
Chaining on rows larger than the block size is inevitable. You might want to consider using tablespace with larger block size for such data.
However, for smaller rows, you can avoid chaining by using sensible space parameters and good application design. For example, do not insert a row with key values filled in and nulls in most other columns, then update that row with the real data, causing the row to grow in size. Rather, insert rows filled with data from the start.
If an UPDATE
statement increases the amount of data in a row so that the row no longer fits in its data block, then Oracle tries to find another block with enough free space to hold the entire row. If such a block is available, then Oracle moves the entire row to the new block. This is called migrating a row. If the row is too large to fit into any available block, then Oracle splits the row into multiple pieces and stores each piece in a separate block. This is called chaining a row. Rows can also be chained when they are inserted.
Migration and chaining are especially detrimental to performance with the following:
UPDATE
statements that cause migration and chaining to perform poorlyThe definition of a sample output table named CHAINED_ROWS
appears in a SQL script available on your distribution medium. The common name of this script is UTLCHN1
.SQL
, although its exact name and location varies depending on your platform. Your output table must have the same column names, datatypes, and sizes as the CHAINED_ROWS
table.
Increasing PCTFREE
can help to avoid migrated rows. If you leave more free space available in the block, then the row has room to grow. You can also reorganize or re-create tables and indexes that have high deletion rates. If tables frequently have rows deleted, then data blocks can have partially free space in them. If rows are inserted and later expanded, then the inserted rows might land in blocks with deleted rows but still not have enough room to expand. Reorganizing the table ensures that the main free space is totally empty blocks.
See Also:
|
The more your application parses, the more potential for contention exists, and the more time your system spends waiting. If parse
time
CPU
represents a large percentage of the CPU time, then time is being spent parsing instead of executing statements. If this is the case, then it is likely that the application is using literal SQL and so SQL cannot be shared, or the shared pool is poorly configured.
There are a number of statistics available to identify the extent of time spent parsing by Oracle. Query the parse related statistics from V$SYSSTAT
. For example:
SELECT NAME, VALUE FROM V$SYSSTAT WHERE NAME IN ( 'parse time cpu', 'parse time elapsed', 'parse count (hard)', 'CPU used by this session' );
There are various ratios that can be computed to assist in determining whether parsing may be a problem:
This ratio indicates how much of the time spent parsing was due to the parse operation itself, rather than waiting for resources, such as latches. A ratio of one is good, indicating that the elapsed time was not spent waiting for highly contended resources.
This ratio indicates how much of the total CPU used by Oracle server processes was spent on parse-related operations. A ratio closer to zero is good, indicating that the majority of CPU is not spent on parsing.
The V$SESSION
, V$SESSION_WAIT
, V$SESSION_EVENT
, and V$SYSTEM_EVENT
views provide information on what resources were waited for, and, if the configuration parameter TIMED_STATISTICS
is set to true
, how long each resource was waited for.
See Also:
|
Investigate wait events and related timing data when performing reactive performance tuning. The events with the most time listed against them are often strong indications of the performance bottleneck.
The following views contain related, but different, views of the same data:
V$SESSION
lists session information for each current session. It lists either the event currently being waited for or the event last waited for on each session. This view also contains information on blocking sessions.V$SESSION_WAIT
is a current state view. It lists either the event currently being waited for or the event last waited for on each sessionV$SESSION_EVENT
lists the cumulative history of events waited for on each session. After a session exits, the wait event statistics for that session are removed from this view.V$SYSTEM_EVENT
lists the events and times waited for by the whole instance (that is, all session wait events data rolled up) since instance startup.Because V$SESSION_WAIT
is a current state view, it also contains a finer-granularity of information than V$SESSION_EVENT
or V$SYSTEM_EVENT
. It includes additional identifying data for the current event in three parameter columns: P1
, P2
, and P3
.
For example, V$SESSION_EVENT
can show that session 124 (SID=124) had many waits on the db
file
scattered
read
, but it does not show which file and block number. However, V$SESSION_WAIT
shows the file number in P1
, the block number read in P2
, and the number of blocks read in P3
(P1
and P2
let you determine for which segments the wait event is occurring).
This chapter concentrates on examples using V$SESSION_WAIT
. However, Oracle recommends capturing performance data over an interval and keeping this data for performance and capacity analysis. This form of rollup data is queried from the V$SYSTEM_EVENT
view by Automatic Workload Repository. See "Automatic Workload Repository".
Most commonly encountered events are described in this chapter, listed in case-sensitive alphabetical order. Other event-related data to examine is also included. The case used for each event name is that which appears in the V$SYSTEM_EVENT
view.
See Also:
Oracle Database Reference for a description of the |
The following events signify that the database process is waiting for acknowledgment from a database link or a client process:
If these waits constitute a significant portion of the wait time on the system or for a user experiencing response time issues, then the network or the middle-tier could be a bottleneck.
Events that are client-related should be diagnosed as described for the event SQL*Net
message
from
client
. Events that are dblink-related should be diagnosed as described for the event SQL*Net
message
from
dblink
.
Although this is an idle event, it is important to explain when this event can be used to diagnose what is not the problem. This event indicates that a server process is waiting for work from the client process. However, there are several situations where this event could accrue most of the wait time for a user experiencing poor response time. The cause could be either a network bottleneck or a resource bottleneck on the client process.
A network bottleneck can occur if the application causes a lot of traffic between server and client and the network latency (time for a round-trip) is high. Symptoms include the following:
To alleviate network bottlenecks, try the following:
VSAT
links).If the client process is using most of the resources, then there is nothing that can be done in the database. Symptoms include the following:
In some cases, you can see the wait time for a waiting user tracking closely with the amount of CPU used by the client process. The term client here refers to any process other than the database process (middle-tier, desktop client) in the n-tier architecture.
This event signifies that the session has sent a message to the remote node and is waiting for a response from the database link. This time could go up because of the following:
For information, see "SQL*Net message from client".
It is useful to see the SQL being run on the remote node. Login to the remote database, find the session created by the database link, and examine the SQL statement being run by it.
Each message between the session and the remote node adds latency time and processing overhead. To reduce the number of messages exchanged, use array fetches and array inserts.
The server process is sending more data or messages to the client. The previous operation to the client was also a send.
See Also:
Oracle Net Services Administrator's Guide for a detailed discussion on network optimization |
This wait indicates that there are some buffers in the buffer cache that multiple processes are attempting to access concurrently. Query V$WAITSTAT
for the wait statistics for each class of buffer. Common buffer classes that have buffer busy waits include data
block
, segment
header
, undo
header
, and undo
block
.
Check the following V$SESSION_WAIT
parameter columns:
To determine the possible causes, first query V$SESSION
to identify the value of ROW_WAIT_OBJ#
when the session waits for buffer
busy
waits
. For example:
SELECT row_wait_obj# FROM V$SESSION WHERE EVENT = 'buffer busy waits';
To identify the object and object type contended for, query DBA_OBJECTS
using the value for ROW_WAIT_OBJ#
that is returned from V$SESSION
. For example:
SELECT owner, object_name, subobject_name, object_type FROM DBA_OBJECTS WHERE data_object_id = &row_wait_obj;
The action required depends on the class of block contended for and the actual segment.
If the contention is on the segment header, then this is most likely free list contention.
Automatic segment-space management in locally managed tablespaces eliminates the need to specify the PCTUSED
, FREELISTS
, and FREELIST
GROUPS
parameters. If possible, switch from manual space management to automatic segment-space management (ASSM).
The following information is relevant if you are unable to use automatic segment-space management (for example, because the tablespace uses dictionary space management).
A free list is a list of free data blocks that usually includes blocks existing in a number of different extents within the segment. Free lists are composed of blocks in which free space has not yet reached PCTFREE or used space has shrunk below PCTUSED. Specify the number of process free lists with the FREELISTS
parameter. The default value of FREELISTS
is one. The maximum value depends on the data block size.
To find the current setting for free lists for that segment, run the following:
SELECT SEGMENT_NAME, FREELISTS FROM DBA_SEGMENTS WHERE SEGMENT_NAME = segment name AND SEGMENT_TYPE = segment type;
Set free lists, or increase the number of free lists. If adding more free lists does not alleviate the problem, then use free list groups (even in single instance this can make a difference). If using Oracle Real Application Clusters, then ensure that each instance has its own free list group(s).
See Also:
Oracle Database Concepts for information on automatic segment-space management, free lists, |
If the contention is on tables or indexes (not the segment header):
For contention on rollback segment header:
For contention on rollback segment block:
This event signifies that the user process is reading buffers into the SGA buffer cache and is waiting for a physical I/O call to return. A db
file
scattered
read
issues a scattered read to read the data into multiple discontinuous memory locations. A scattered read is usually a multiblock read. It can occur for a fast full scan (of an index) in addition to a full table scan.
The db
file
scattered
read
wait event identifies that a full scan is occurring. When performing a full scan into the buffer cache, the blocks read are read into memory locations that are not physically adjacent to each other. Such reads are called scattered read calls, because the blocks are scattered throughout memory. This is why the corresponding wait event is called 'db file scattered read'. Multiblock (up to DB_FILE_MULTIBLOCK_READ_COUNT
blocks) reads due to full scans into the buffer cache show up as waits for 'db file scattered read'.
Check the following V$SESSION_WAIT
parameter columns:
P1
- The absolute file numberP2
- The block being readP3
- The number of blocks (should be greater than 1)On a healthy system, physical read waits should be the biggest waits after the idle waits. However, also consider whether there are direct read waits (signifying full table scans with parallel query) or db
file
scattered
read
waits on an operational (OLTP) system that should be doing small indexed accesses.
Other things that could indicate excessive I/O load on the system include the following:
There are several ways to handle excessive I/O waits. In the order of effectiveness, these are as follows:
DBMS_STATS
package, allowing the query optimizer to accurately cost possible access paths that use full scansThe first course of action should be to find opportunities to reduce I/O. Examine the SQL statements being run by sessions waiting for these events, as well as statements causing high physical I/Os from V$SQLAREA
. Factors that can adversely affect the execution plans causing excessive I/O include the following:
DB_FILE_MULTIBLOCK_READ_COUNT
initialization parameter too high which favors full scansBesides reducing I/O, also examine the I/O distribution of files across the disks. Is I/O distributed uniformly across the disks, or are there hot spots on some disks? Are the number of disks sufficient to meet the I/O needs of the database?
See the total I/O operations (reads and writes) by the database, and compare those with the number of disks used. Remember to include the I/O activity of LGWR and ARCH processes.
Use the following query to determine, at a point in time, which sessions are waiting for I/O:
SELECT SQL_ADDRESS, SQL_HASH_VALUE FROM V$SESSION WHERE EVENT LIKE 'db file%read';
To determine the possible causes, first query V$SESSION
to identify the value of ROW_WAIT_OBJ#
when the session waits for db
file
scattered
read
. For example:
SELECT row_wait_obj# FROM V$SESSION WHERE EVENT = 'db file scattered read';
To identify the object and object type contended for, query DBA_OBJECTS
using the value for ROW_WAIT_OBJ#
that is returned from V$SESSION
. For example:
SELECT owner, object_name, subobject_name, object_type FROM DBA_OBJECTS WHERE data_object_id = &row_wait_obj;
This event signifies that the user process is reading a buffer into the SGA buffer cache and is waiting for a physical I/O call to return. A sequential read is a single-block read.
Single block I/Os are usually the result of using indexes. Rarely, full table scan calls could get truncated to a single block call due to extent boundaries, or buffers already present in the buffer cache. These waits would also show up as 'db file sequential read'.
Check the following V$SESSION_WAIT
parameter columns:
P1
- The absolute file numberP2
- The block being readP3
- The number of blocks (should be 1)
See Also:
"db file scattered read" for information on managing excessive I/O, inadequate I/O distribution, and finding the SQL causing the I/O and the segment the I/O is performed on |
On a healthy system, physical read waits should be the biggest waits after the idle waits. However, also consider whether there are db
file
sequential
reads
on a large data warehouse that should be seeing mostly full table scans with parallel query.
Figure 10-1 depicts the differences between the following wait events:
db
file
sequential
read
(single block read into one SGA buffer)db
file
scattered
read
(multiblock read into many discontinuous SGA buffers)direct
read
(single or multiblock read into the PGA, bypassing the SGA)Text description of the illustration pfgrf210.gif
When a session is reading buffers from disk directly into the PGA (opposed to the buffer cache in SGA), it waits on this event. If the I/O subsystem does not support asynchronous I/Os, then each wait corresponds to a physical read request.
If the I/O subsystem supports asynchronous I/O, then the process is able to overlap issuing read requests with processing the blocks already existing in the PGA. When the process attempts to access a block in the PGA that has not yet been read from disk, it then issues a wait call and updates the statistics for this event. Hence, the number of waits is not necessarily the same as the number of read requests (unlike db
file
scattered
read
and db
file
sequential
read
).
Check the following V$SESSION_WAIT
parameter columns:
P1
- File_id for the read callP2
- Start block_id for the read callP3
- Number of blocks in the read callThis happens in the following situations:
The file_id
shows if the reads are for an object in TEMP
tablespace (sorts to disk) or full table scans by parallel slaves. This is the biggest wait for large data warehouse sites. However, if the workload is not a DSS workload, then examine why this is happening.
Examine the SQL statement currently being run by the session experiencing waits to see what is causing the sorts. Query V$TEMPSEG_USAGE
to find the SQL statement that is generating the sort. Also query the statistics from V$SESSTAT
for the session to determine the size of the sort. See if it is possible to reduce the sorting by tuning the SQL statement. If WORKAREA_SIZE_POLICY
is MANUAL
, then consider increasing the SORT_AREA_SIZE
for the system (if the sorts are not too big) or for individual processes. If WORKAREA_SIZE_POLICY
is AUTO
, then investigate whether to increase PGA_AGGREGATE_TARGET
. See "PGA Memory Management".
If tables are defined with a high degree of parallelism, then this could skew the optimizer to use full table scans with parallel slaves. Check the object being read into using the direct path reads. If the full table scans are a valid part of the workload, then ensure that the I/O subsystem is configured adequately for the degree of parallelism. Consider using disk striping if you are not already using it or Automatic Storage Management (ASM).
For query plans that call for a hash join, excessive I/O could result from having HASH_AREA_SIZE
too small. If WORKAREA_SIZE_POLICY
is MANUAL
, then consider increasing the HASH_AREA_SIZE
for the system or for individual processes. If WORKAREA_SIZE_POLICY
is AUTO
, then investigate whether to increase PGA_AGGREGATE_TARGET
.
When a process is writing buffers directly from PGA (as opposed to the DBWR writing them from the buffer cache), the process waits on this event for the write call to complete. Operations that could perform direct path writes include when a sort goes to disk, during parallel DML operations, direct-path INSERT
s, parallel create table as select, and some LOB operations.
Like direct path reads, the number of waits is not the same as number of write calls issued if the I/O subsystem supports asynchronous writes. The session waits if it has processed all buffers in the PGA and is unable to continue work until an I/O request completes.
See Also:
Oracle Database Administrator's Guide for information on direct-path inserts |
Check the following V$SESSION_WAIT
parameter columns:
P1
- File_id for the write callP2
- Start block_id for the write callP3
- Number of blocks in the write callThis happens in the following situations:
For large sorts see "Sorts to Disk".
For parallel DML, check the I/O distribution across disks and make sure that the I/O subsystem is adequately configured for the degree of parallelism.
Enqueues are locks that coordinate access to database resources. This event indicates that the session is waiting for a lock that is held by another session.
The name of the enqueue is included as part of the wait event name, in the form enq:
enqueue
_type
-
related
_details
. In some cases, the same enqueue type can be held for different purposes, such as the following related TX
types:
enq:
TX
-
allocate
ITL
entry
enq:
TX
-
contention
enq:
TX
-
index
contention
enq:
TX
-
row
lock
contention
The V$EVENT_NAME
view provides a complete list of all the enq:
wait events.
You can check the following V$SESSION_WAIT
parameter columns for additional information:
P1
- Lock TYPE
(or name) and MODE
P2
- Resource identifier ID1 for the lockP3
- Resource identifier ID2 for the lock
See Also:
Oracle Database Reference for information about Oracle enqueues |
Query V$LOCK
to find the sessions holding the lock. For every session waiting for the event enqueue, there is a row in V$LOCK
with REQUEST
<> 0
. Use one of the following two queries to find the sessions holding the locks and waiting for the locks.
If there are enqueue waits, you can see these using the following statement:
SELECT * FROM V$LOCK WHERE request > 0;
To show only holders and waiters for locks being waited on, use the following:
SELECT DECODE(request,0,'Holder: ','Waiter: ') || sid sess, id1, id2, lmode, request, type FROM V$LOCK WHERE (id1, id2, type) IN (SELECT id1, id2, type FROM V$LOCK WHERE request > 0) ORDER BY id1, request;
The appropriate action depends on the type of enqueue.
If the contended-for enqueue is the ST enqueue, then the problem is most likely to be dynamic space allocation. Oracle dynamically allocates an extent to a segment when there is no more free space available in the segment. This enqueue is only used for dictionary managed tablespaces.
To solve contention on this resource:
TEMPFILES
. If not, then switch to using TEMPFILES
.See Also:
Oracle Database Concepts for detailed information on |
EXTENTS
column of the DBA_SEGMENTS
view for all SEGMENT_NAMEs
. See Oracle Database Administrator's Guide for information about displaying information about space usage.ALTER
TABLE
ALLOCATE
EXTENT
SQL statement.The HW enqueue is used to serialize the allocation of space beyond the high water mark of a segment.
V$SESSION_WAIT.P2
/ V$LOCK.ID1
is the tablespace number.V$SESSION_WAIT.P3
/ V$LOCK.ID2
is the relative dba of segment header of the object for which space is being allocated.If this is a point of contention for an object, then manual allocation of extents solves the problem.
The most common reason for waits on TM locks tend to involve foreign key constraints where the constrained columns are not indexed. Index the foreign key columns to avoid this problem.
These are acquired exclusive when a transaction initiates its first change and held until the transaction does a COMMIT
or ROLLBACK
.
enq:
TX
-
row
lock
contention
.
The solution is to have the first session already holding the lock perform a COMMIT
or ROLLBACK
.
enq:
TX
-
allocate
ITL
entry
.
The solution is to increase the number of ITLs available, either by changing the INITRANS
or MAXTRANS
for the table (either by using an ALTER
statement, or by re-creating the table with the higher values).
UNIQUE
index. If two sessions try to insert the same key value the second session has to wait to see if an ORA-0001
should be raised or not. This type of TX enqueue wait corresponds to the wait event enq:
TX
-
row
lock
contention
.
The solution is to have the first session already holding the lock perform a COMMIT
or ROLLBACK
.
COMMIT
or ROLLBACK
by waiting for the TX lock in mode 4. This type of TX enqueue wait corresponds to the wait event enq:
TX
-
row
lock
contention
.PREPARED
transaction.enq:
TX
-
index
contention
.
See Also:
Oracle Database Application Developer's Guide - Fundamentals for more information about referential integrity and locking data explicitly |
This wait event indicates that a server process was unable to find a free buffer and has posted the database writer to make free buffers by writing out dirty buffers. A dirty buffer is a buffer whose contents have been modified. Dirty buffers are freed for reuse when DBWR has written the blocks to disk.
DBWR may not be keeping up with writing dirty buffers in the following situations:
If this event occurs frequently, then examine the session waits for DBWR to see whether there is anything delaying DBWR.
If it is waiting for writes, then determine what is delaying the writes and fix it. Check the following:
V$FILESTAT
to see where most of the writes are happening.If I/O is slow:
It is possible DBWR is very active because the cache is too small. Investigate whether this is a probable cause by looking to see if the buffer cache hit ratio is low. Also use the V$DB_CACHE_ADVICE
view to determine whether a larger cache size would be advantageous. See "Sizing the Buffer Cache".
If the cache size is adequate and the I/O is already evenly spread, then you can potentially modify the behavior of DBWR by using asynchronous I/O or by using multiple database writers.
Configuring multiple database writer processes, or using I/O slaves, is useful when the transaction rates are high or when the buffer cache size is so large that a single DBWn process cannot keep up with the load.
The DB_WRITER_PROCESSES
initialization parameter lets you configure multiple database writer processes (from DBW0 to DBW9 and from DBWa to DBWj). Configuring multiple DBWR processes distributes the work required to identify buffers to be written, and it also distributes the I/O load over these processes. Multiple db writer processes are highly recommended for systems with multiple CPUs (at least one db writer for every 8 CPUs) or multiple processor groups (at least as many db writers as processor groups).
Based upon the number of CPUs and the number of processor groups, Oracle either selects an appropriate default setting for DB_WRITER_PROCESSES
or adjusts a user-specified setting.
If it is not practical to use multiple DBWR processes, then Oracle provides a facility whereby the I/O load can be distributed over multiple slave processes. The DBWR process is the only process that scans the buffer cache LRU list for blocks to be written out. However, the I/O for those blocks is performed by the I/O slaves. The number of I/O slaves is determined by the parameter DBWR_IO_SLAVES
.
DBWR_IO_SLAVES
is intended for scenarios where you cannot use multiple DB_WRITER_PROCESSES
(for example, where you have a single CPU). I/O slaves are also useful when asynchronous I/O is not available, because the multiple I/O slaves simulate nonblocking, asynchronous requests by freeing DBWR to continue identifying blocks in the cache to be written. Asynchronous I/O at the operating system level, if you have it, is generally preferred.
DBWR I/O slaves are allocated immediately following database open when the first I/O request is made. The DBWR continues to perform all of the DBWR-related work, apart from performing I/O. I/O slaves simply perform the I/O on behalf of DBWR. The writing of the batch is parallelized between the I/O slaves.
Configuring multiple DBWR processes benefits performance when a single DBWR process is unable to keep up with the required workload. However, before configuring multiple DBWR processes, check whether asynchronous I/O is available and configured on the system. If the system supports asynchronous I/O but it is not currently used, then enable asynchronous I/O to see if this alleviates the problem. If the system does not support asynchronous I/O, or if asynchronous I/O is already configured and there is still a DBWR bottleneck, then configure multiple DBWR processes.
Note: If asynchronous I/O is not available on your platform, then asynchronous I/O can be disabled by setting the |
Using multiple DBWRs parallelizes the gathering and writing of buffers. Therefore, multiple DBWn processes should deliver more throughput than one DBWR process with the same number of I/O slaves. For this reason, the use of I/O slaves has been deprecated in favor of multiple DBWR processes. I/O slaves should only be used if multiple DBWR processes cannot be configured.
A latch is a low-level internal lock used by Oracle to protect memory structures. The latch free event is updated when a server process attempts to get a latch, and the latch is unavailable on the first attempt.
There is a dedicated latch-related wait event for the more popular latches that often generate significant contention. For those events, the name of the latch appears in the name of the wait event, such as latch:
library
cache
or latch:
cache
buffers
chains
. This enables you to quickly figure out if a particular type of latch is responsible for most of the latch-related contention. Waits for all other latches are grouped in the generic latch
free
wait event.
See Also:
Oracle Database Concepts for more information on latches and internal locks |
This event should only be a concern if latch waits are a significant portion of the wait time on the system as a whole, or for individual users experiencing problems.
Check the following V$SESSION_WAIT
parameter columns:
P1
- Address of the latchP2
- Latch numberP3
- Number of times process has already slept, waiting for the latchSELECT EVENT, SUM(P3) SLEEPS, SUM(SECONDS_IN_WAIT) SECONDS_IN_WAIT FROM V$SESSION_WAIT WHERE EVENT LIKE 'latch%' GROUP BY EVENT;
A problem with the previous query is that it tells more about session tuning or instant instance tuning than instance or long-duration instance tuning.
The following query provides more information about long duration instance tuning, showing whether the latch waits are significant in the overall database time.
SELECT EVENT, TIME_WAITED_MICRO, ROUND(TIME_WAITED_MICRO*100/S.DBTIME,1) PCT_DB_TIME FROM V$SYSTEM_EVENT, (SELECT VALUE DBTIME FROM V$SYS_TIME_MODEL WHERE STAT_NAME = 'DB time') S WHERE EVENT LIKE 'latch%' ORDER BY PCT_DB_TIME ASC;
A more general query that is not specific to latch waits is the following:
SELECT EVENT, WAIT_CLASS, TIME_WAITED_MICRO,ROUND(TIME_WAITED_MICRO*100/S.DBTIME,1) PCT_DB_TIME FROM V$SYSTEM_EVENT E, V$EVENT_NAME N, (SELECT VALUE DBTIME FROM V$SYS_TIME_MODEL WHERE STAT_NAME = 'DB time') S WHERE E.EVENT_ID = N.EVENT_ID AND N.WAIT_CLASS NOT IN ('Idle', 'System I/O') ORDER BY PCT_DB_TIME ASC;
A main cause of shared pool or library cache latch contention is parsing. There are a number of techniques that can be used to identify unnecessary parsing and a number of types of unnecessary parsing:
This method identifies similar SQL statements that could be shared if literals were replaced with bind variables. The idea is to either:
SELECT SQL_TEXT FROM V$SQLAREA WHERE EXECUTIONS < 4 ORDER BY SQL_TEXT;
SELECT SUBSTR(SQL_TEXT,1, 60), COUNT(*) FROM V$SQLAREA WHERE EXECUTIONS < 4 GROUP BY SUBSTR(SQL_TEXT, 1, 60) HAVING COUNT(*) > 1;
SELECT SQL_TEXT FROM V$SQL WHERE PLAN_HASH_VALUE IN (SELECT PLAN_HASH_VALUE FROM V$SQL GROUP BY PLAN_HASH_VALUE HAVING COUNT(*) > 4) ORDER BY PLAN_HASH_VALUE;
check the V$SQLAREA
view. Enter the following query:
SELECT SQL_TEXT, PARSE_CALLS, EXECUTIONS FROM V$SQLAREA ORDER BY PARSE_CALLS;
When the PARSE_CALLS
value is close to the EXECUTIONS
value for a given statement, you might be continually reparsing that statement. Tune the statements with the higher numbers of parse calls.
Identify unnecessary parse calls by identifying the session in which they occur. It might be that particular batch programs or certain types of applications do most of the reparsing. To do this, run the following query:
SELECT pa.SID, pa.VALUE "Hard Parses", ex.VALUE "Execute Count" FROM V$SESSTAT pa, V$SESSTAT ex WHERE pa.SID = ex.SID AND pa.STATISTIC#=(SELECT STATISTIC# FROM V$STATNAME WHERE NAME = 'parse count (hard)') AND ex.STATISTIC#=(SELECT STATISTIC# FROM V$STATNAME WHERE NAME = 'execute count') AND pa.VALUE > 0;
The result is a list of all sessions and the amount of reparsing they do. For each session identifier (SID), go to V$SESSION
to find the name of the program that causes the reparsing.
The output is similar to the following:
SID Hard Parses Execute Count ------ ----------- ------------- 7 1 20 8 3 12690 6 26 325 11 84 1619
The cache
buffers
lru
chain
latches protect the lists of buffers in the cache. When adding, moving, or removing a buffer from a list, a latch must be obtained.
For symmetric multiprocessor (SMP) systems, Oracle automatically sets the number of LRU latches to a value equal to one half the number of CPUs on the system. For non-SMP systems, one LRU latch is sufficient.
Contention for the LRU latch can impede performance on SMP machines with a large number of CPUs. LRU latch contention is detected by querying V$LATCH
, V$SESSION_EVENT
, and V$SYSTEM_EVENT
. To avoid contention, consider tuning the application, bypassing the buffer cache for DSS jobs, or redesigning the application.
The cache
buffers
chains
latches are used to protect a buffer list in the buffer cache. These latches are used when searching for, adding, or removing a buffer from the buffer cache. Contention on this latch usually means that there is a block that is greatly contended for (known as a hot block).
To identify the heavily accessed buffer chain, and hence the contended for block, look at latch statistics for the cache
buffers
chains
latches using the view V$LATCH_CHILDREN
. If there is a specific cache
buffers
chains
child latch that has many more GETS
, MISSES
, and SLEEPS
when compared with the other child latches, then this is the contended for child latch.
This latch has a memory address, identified by the ADDR
column. Use the value in the ADDR
column joined with the X$BH
table to identify the blocks protected by this latch. For example, given the address (V$LATCH_CHILDREN
.ADDR
) of a heavily contended latch, this queries the file and block numbers:
SELECT OBJ data_object_id, FILE#, DBABLK,CLASS, STATE, TCH FROM X$BH WHERE HLADDR = 'address of latch' ORDER BY TCH;
X$BH.TCH
is a touch count for the buffer. A high value for X$BH.TCH indicates a hot block.
Many blocks are protected by each latch. One of these buffers will probably be the hot block. Any block with a high TCH
value is a potential hot block. Perform this query a number of times, and identify the block that consistently appears in the output. After you have identified the hot block, query DBA_EXTENTS
using the file number and block number, to identify the segment.
After you have identified the hot block, you can identify the segment it belongs to with the following query:
SELECT OBJECT_NAME, SUBOBJECT_NAME FROM DBA_OBJECTS WHERE DATA_OBJECT_ID = &obj;
In the query, &obj
is the value of the OBJ
column in the previous query on X$BH
.
The row
cache
objects
latches protect the data dictionary.
This event involves writing redo records to the redo log files from the log buffer.
This event manages library cache concurrency. Pinning an object causes the heaps to be loaded into memory. If a client wants to modify or examine the object, the client must acquire a pin after the lock.
This event controls the concurrency between clients of the library cache. It acquires a lock on the object handle so that either:
This lock is also obtained to locate an object in the library cache.
This event occurs when server processes are waiting for free space in the log buffer, because all the redo is generated faster than LGWR can write it out.
Modify the redo log buffer size. If the size of the log buffer is already reasonable, then ensure that the disks on which the online redo logs reside do not suffer from I/O contention. The log
buffer
space
wait event could be indicative of either disk I/O contention on the disks where the redo logs reside, or of a too-small log buffer. Check the I/O profile of the disks containing the redo logs to investigate whether the I/O system is the bottleneck. If the I/O system is not a problem, then the redo log buffer could be too small. Increase the size of the redo log buffer until this event is no longer significant.
There are two wait events commonly encountered:
In both of the events, the LGWR is unable to switch into the next online redo log, and all the commit requests wait for this event.
For the log
file
switch
(archiving
needed
) event, examine why the archiver is unable to archive the logs in a timely fashion. It could be due to the following:
Depending on the nature of bottleneck, you might need to redistribute I/O or add more space to the archive destination to alleviate the problem. For the log
file
switch
(checkpoint
incomplete
) event:
When a user session commits (or rolls back), the session's redo information must be flushed to the redo logfile by LGWR. The server process performing the COMMIT
or ROLLBACK
waits under this event for the write to the redo log to complete.
If this event's waits constitute a significant wait on the system or a significant amount of time waited by a user experiencing response time issues or on a system, then examine the average time waited.
If the average time waited is low, but the number of waits are high, then the application might be committing after every INSERT
, rather than batching COMMIT
s. Applications can reduce the wait by committing after 50 rows, rather than every row.
If the average time waited is high, then examine the session waits for the log writer and see what it is spending most of its time doing and waiting for. If the waits are because of slow I/O, then try the following:
COMMIT
s by committing every N rows, rather than every row, so that fewer log file syncs are needed.This event is used to wait for a reply from one of the background processes.
These events belong to Idle wait class and indicate that the server process is waiting because it has no work. This usually implies that if there is a bottleneck, then the bottleneck is not for database resources. The majority of the idle events should be ignored when tuning, because they do not indicate the nature of the performance bottleneck. Some idle events can be useful in indicating what the bottleneck is not. An example of this type of event is the most commonly encountered idle wait-event SQL Net message from client
. This and other idle events (and their categories) are listed in Table 10-3.
See Also:
Oracle Database Reference for explanations of each idle wait event |