
Tuesday, October 31, 2017

Step by step - What is a restore point, and how to create and use it, with an example

Use named points in time to roll your database back by using flashback technology
Jane, the lead DBA at Acme Bank, has three visitors today. The first, Paul, is the head of Quality Assurance. Paul's team creates a variety of test scenarios for the applications. For each scenario, they put together the test data, and after they run each test, they need to modify the data to bring back the pretest values.
The second visitor is Tom, the operations manager. Tom is responsible for processing batch financial transaction files from each branch of the bank. If the file from a branch produces an error, the entire process is aborted and must be started from the beginning.
The third visitor is Harry, a business manager of an application developed by a third-party vendor. Harry gets updates from the vendor to modify the database structure, modify data, and so on, as part of the application upgrade process. Most upgrades go smoothly; however, when an upgrade fails, things get messy. Harry's team spends considerable time devising ways to undo failed changes.
Paul, Tom, and Harry are asking Jane to help make their processes more efficient. Jane assures them that she has a solution. Fortunately, Acme is using Oracle Database 10g Release 2, and it is possible to "rewind" the database to a named point in time.

Setup

Jane sits her visitors down and reminds them all of how it's possible to turn back the hands of time and reinstate the database to a certain point in time using a simple command: flashback database. In Oracle Database 10g Release 2, Jane says, the functionality is now enhanced significantly by the ability to name a specific point in time, called a restore point. Using this, Paul, Tom, or Harry (or a DBA acting on their behalf) can mark and flash back the database to a logical point in time.
Jane starts a demonstration on a test database. She notes that the database must be running in archivelog mode and with flashback logging enabled. She first shuts down the database and then brings it up in mounted mode. 
shutdown immediate;
startup mount;

Then she converts the database to run in archivelog mode. 
alter database archivelog;

To enable flashback, Jane first configures two parameters in the database. 
alter system set db_recovery_file_dest_size = 2G;
alter system set db_recovery_file_dest = '/u02/flashbackarea/acmeprd';

In flashback mode, the database creates flashback log files, which record the old images of the data after a change is made. These files are kept in the location specified by the db_recovery_file_dest parameter, up to the size specified by the db_recovery_file_dest_size parameter, which in this case is set to 2GB.
Jane then enables flashback logging: 
alter database flashback on;

She opens the database: 
alter database open;

She checks the status of the archive log mode and flashback: 
select flashback_on, log_mode
from v$database;

FLASHBACK_ON  LOG_MODE
------------  ----------------
YES           ARCHIVELOG

This confirms that the database is indeed in flashback mode.

Restore Points

Jane proceeds to demonstrate how to use restore points, starting with an example of how Paul's QA team can benefit from this technique. Jane creates a restore point named qa_gold. 
create restore point qa_gold;

This command, Jane reminds them, is new in Oracle Database 10g Release 2. It creates a named restore point, which is an alias to the system change number (SCN) of the database at that time.
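To see exactly which SCN a restore point maps to, you can compare it with the database's current SCN (a quick check, assuming access to V$RESTORE_POINT and V$DATABASE): 
select name, scn from v$restore_point where name = 'QA_GOLD';
select current_scn from v$database;
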
Jane runs one of the QA team's tests, altering the test data. To flash back the database to the restore point she created, Jane shuts down the database, restarts it in mounted mode, issues the flashback database command, and then opens the database with RESETLOGS. 
shutdown immediate;
startup mount;
flashback database to restore point qa_gold;
alter database open resetlogs;

That's it; the database is now "rewound" to the restore point named qa_gold. There was no need for Jane to back up the database and perform a point-in-time recovery. Paul couldn't be happier.
For Tom, Jane demonstrates a slightly different approach. Since Tom runs the batch process on one file at a time, Jane suggests creating a restore point after processing each file, using a predetermined naming convention such as after_branch_n, where n is the BRANCH_ID (see the sketch after the steps below).
To keep track of the files being processed, Tom has a table—PROC—with only one column—BRANCH_ID, which stores the id of the branch whose file has been processed. Jane runs through the following process as an example of a typical batch run using restore points:
1. She creates a restore point named start_batch to mark the start of the process. 
create restore point start_batch;

2. She updates the PROC table to specify the branch being processed. 
update proc set branch_id = 1;
commit;

3. She processes the file from branch 1.
4. After the branch 1 file is processed, she creates a new restore point. 
create restore point after_branch_1;

The process is repeated until the files from all branches are processed.
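Because CREATE RESTORE POINT is a DDL statement, the after_branch_n name can be generated and run dynamically from the batch driver. Here is a minimal PL/SQL sketch; the l_branch_id variable is hypothetical and would be supplied by the batch process: 
declare
  l_branch_id number := 1;  -- supplied by the batch process
begin
  execute immediate 'create restore point after_branch_' || l_branch_id;
end;
/
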
Jane demonstrates the restore process to use if a file from branch 23 has an error. When the file from branch 23 is picked up for processing, the BRANCH_ID value in the PROC table will be 23. 
SQL> select branch_id from proc;
 
BRANCH_ID
---------
       23

If the processing fails for the file from branch 23, Jane rolls back the changes to the after_branch_22 restore point. 
shutdown immediate;
startup mount;
flashback database to restore point 
after_branch_22;
alter database open resetlogs;

To confirm that the flashback succeeded, she checks the PROC table again. 
SQL> select branch_id from proc;
 
BRANCH_ID
---------
       22

The value of the column is 22, reflecting the last branch file processed before the after_branch_22 restore point was created. All changes made to the database after the creation of this restore point are undone.
Sometimes, the file from a branch fails but that is not known until much later. For instance, the branch 23 file processing may have failed, but that is not discovered until the processing of branch 29. Jane assures Tom that whether he's processing the branch 23 file, the branch 29 file, or any file in between, he can easily roll back to the after_branch_22 restore point.
In response to Harry's application update issue, Jane suggests a solution very similar to Paul's. Just prior to the database update, Harry or a DBA would create a restore point named pre_change. If the application update is not successful, all that Harry or the DBA needs to do is to flash back the database to that restore point using the flashback commands she demonstrated earlier.
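Put together, a sketch of Harry's upgrade window might look like this (the RESETLOGS open is required after a flashback): 
create restore point pre_change;
-- run the vendor's upgrade scripts; if the upgrade fails:
shutdown immediate;
startup mount;
flashback database to restore point pre_change;
alter database open resetlogs;
-- if the upgrade succeeds, simply drop the restore point:
drop restore point pre_change;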

Guaranteed Restore Point

Paul, Tom, and Harry leave Jane's office and go back to their respective departments to test their restore-point solutions.
A few hours later, Tom returns to Jane's office with an error message. When he tried to flash back to a restore point, he got this error:

ORA-38729: Not enough flashback 
database log data to do FLASHBACK.

As the error shows, there are not enough flashback logs to flash back the database to the restore point. Jane's explanation is simple: the flashback logs are kept only up to the time specified by the db_flashback_retention_target database parameter, whose default value is 1,440 minutes (one day).
Older logs are not deleted immediately; however, the maximum size of the flash recovery area is determined by the db_recovery_file_dest_size database parameter, which is 2GB in this case. When Tom's flash recovery area fills up to 2GB, Oracle Database must remove some logs older than the retention target to make room for the new ones. So, when Tom wanted to perform the flashback, the logs needed for the selected restore point had already aged out, and that caused the error.
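To see how close the flash recovery area is to its limit before this error strikes, Jane can check the space usage views (a monitoring sketch using the 10g Release 2 dynamic performance views): 
select name, space_limit, space_used, space_reclaimable, number_of_files
from v$recovery_file_dest;

select file_type, percent_space_used, percent_space_reclaimable
from v$flash_recovery_area_usage;
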
Tom asks whether, given that flashback logs age out, there is any way to guarantee that he will be able to flash back to a particular restore point.
Jane assures him that he can guarantee that the logs he needs for a particular restore point are available by creating a guaranteed restore point. Jane shows Tom how to create a guaranteed restore point: 
create restore point after_branch_22 
guarantee flashback database;

This ensures that it is definitely possible to flash back the database to the after_branch_22 restore point. Oracle Database does not age out the old flashback logs required for this restore point.

Administration

After successfully addressing Tom's question, Jane calls in her DBAs to make sure they understand how to administer the restore points. First she shows them the query to find out how many restore points have been created: 
select * from v$restore_point 
order by scn;

The output is shown in Listing 1. Noting the output, Jane describes the columns. NAME is the name of the restore point; SCN and TIME are the SCN and the time stamp (in extended format) when the restore point was created, respectively; GUARANTEE_FLASHBACK_DATABASE (shown partially as "GUA") indicates whether it's a guaranteed restore point; STORAGE_SIZE indicates the storage used by this restore point (it is nonzero only for a guaranteed restore point). Finally, DATABASE_INCARNATION# indicates the incarnation of the database when this restore point was created. If the database is flashed back and then opened with resetlogs, a new incarnation of the database is created.
Code Listing 1: Output of V$RESTORE_POINT 
SCN      DATABASE_INCARNATION# GUA  STORAGE_SIZE TIME                            NAME
-------- --------------------- ---- ------------ ------------------------------- ------------
14390197 6                     NO   0            01-AUG-06 01.12.27.000000000 PM QA_GOLD
24390219 6                     NO   0            01-AUG-06 02.13.16.000000000 PM AFTER_BRANCH_1
34390232 6                     NO   0            01-AUG-06 03.13.34.000000000 PM AFTER_BRANCH_2
44390243 6                     NO   0            01-AUG-06 04.13.49.000000000 PM AFTER_BRANCH_3
54394187 7                     YES  3981312      02-AUG-06 05.35.19.000000000 AM AFTER_BRANCH_4

When restore points are no longer needed, Jane continues, the DBAs can drop them. She issues the following command to drop the after_branch_1 restore point: 
drop restore point after_branch_1;

Conclusion

Using restore points, DBAs can mark a point in time and later rewind the database to it. Although restore points are very helpful in recovering the database quickly from user errors, they have other excellent uses as well. Table 1 summarizes various scenarios and how to resolve them by using restore points.
Table 1: Opportunities for restore points
Scenario: Batch programs may inadvertently cause issues in the database, which might require point-in-time recovery of the database before other batch programs can run.
Solution: Create a restore point prior to each batch run. In case of issues, flash the database back to the restore point.

Scenario: Database schema upgrades and application deployments requiring extensive schema changes might fail, resulting in an inconsistent database that may need point-in-time recovery.
Solution: Create a restore point prior to the deployment. Flash the database back to that restore point in case the deployment fails.

Scenario: The QA team carefully creates test data, but after the test runs, the data is modified and needs to be re-created before the next test can run.
Solution: Create a restore point before the first test. Flash the database back to that restore point after each test.

Friday, October 27, 2017

What is the order of precedence for Baselines, SQL profiles, Outlines, Hints etc?

The first recommendation is to "keep it simple". In other words, try not to mix different methods if possible.

If you do have to use different methods, then:

Adding a hint changes the SQL text, so the statement gets a new cursor that won't match up with an existing baseline. Once you add a baseline to a query containing a hint, that baseline can override whatever the hint was intended to do.

If you have both a stored outline and a baseline for a statement, the stored outline takes precedence. If you have outlines and want to use SPM baselines, migrate the outlines into baselines. Once migrated, they are no longer considered outlines, so they won't take precedence over the baseline; a migrated stored outline is simply another plan in that baseline.
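
Migration is done with DBMS_SPM.MIGRATE_STORED_OUTLINE. A minimal sketch, assuming the outlines to migrate are in the DEFAULT category (adjust attribute_name and attribute_value for your own outlines):

SQL> var report clob;
SQL> exec :report := DBMS_SPM.MIGRATE_STORED_OUTLINE( attribute_name => 'CATEGORY', attribute_value => 'DEFAULT' );
SQL> print :report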

If a SQL Plan Baseline exists, then a plan generated by a SQL profile will be added to that baseline. Within a baseline, the plan generated by the SQL profile has the same status options as other baseline plans (ENABLED, ACCEPTED, FIXED, REPRODUCED etc.).


If a SQL Profile plan is generated and added to a baseline that already contains a different accepted plan, the plan from the profile will not automatically become one of the accepted plans in the baseline. It will be added as an unaccepted plan to the plan history, alongside the other accepted and unaccepted candidates for the baseline. 

The DBA then has a choice of which plans to enable and accept. If the profile plan was the first plan added to the baseline, it would be automatically accepted, as with any other initial plan.
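
To see where each plan in a baseline came from and whether it has been accepted, you can query DBA_SQL_PLAN_BASELINES; the ORIGIN column shows how the plan was loaded (a quick check):

select sql_handle, plan_name, origin, enabled, accepted
from dba_sql_plan_baselines
order by sql_handle, created;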

How to Evolve a SQL Plan Baseline and Adjust the Acceptance Threshold

Non-accepted plans can be verified by executing the evolve_sql_plan_baseline function. This function will execute the non-accepted plan and compare its performance to the best accepted plan. The execution is performed using peeked bind values from the non-accepted plan.

If the non-accepted plan's performance is sufficiently better than that of the best accepted plan (by a default threshold of 1.5x), the plan will be accepted and added to the SQL plan baseline.

Example

Consider the following example for SQL_HANDLE SQL_0f8c8f355212ed9a. This query has a single Non-accepted plan with plan_name: SQL_PLAN_0z34g6p915vcu22368d7d

SQL> select sql_handle,plan_name,enabled,accepted from dba_sql_plan_baselines;

SQL_HANDLE                     PLAN_NAME                      ENA ACC
------------------------------ ------------------------------ --- ---
SQL_0f8c8f355212ed9a           SQL_PLAN_0z34g6p915vcu22368d7d YES NO
SQL_0f8c8f355212ed9a           SQL_PLAN_0z34g6p915vcu6ddd367b YES YES

To evolve this non-accepted plan, use DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE:

SQL> var report clob; 
SQL> exec :report := DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE( sql_handle=> 'SQL_0f8c8f355212ed9a', verify=>'YES', commit=>'YES' );

PL/SQL procedure successfully completed

We can see a report of the activity as follows:

SQL> print :report
-------------------------------------------------------------------------------
                        Evolve SQL Plan Baseline Report
-------------------------------------------------------------------------------

Inputs:
-------
  SQL_HANDLE = SQL_0f8c8f355212ed9a
  PLAN_NAME  = SQL_PLAN_0z34g6p915vcu22368d7d
  TIME_LIMIT = DBMS_SPM.AUTO_LIMIT
  VERIFY     = YES
  COMMIT     = YES

Plan: SQL_PLAN_0z34g6p915vcu22368d7d
-----------------------------------
  Plan was verified: Time used .06 seconds.
  Plan failed performance criterion: 1.29 times better than baseline plan.
 
                                   Baseline Plan        Test Plan           Stats Ratio
                                   -------------      ---------            -------------
  Execution Status:                  COMPLETE          COMPLETE
  Rows Processed:                      10                  10
  Elapsed Time(ms):                    3.434             .101                   34
  CPU Time(ms):                        3.555             .1111                  32.03
  Buffer Gets:                         19                  15                   1.27
  Disk Reads:                          0                    0
  Direct Writes:                       0                    0
  Fetches:                             0                    0
  Executions:                          1                    1

-------------------------------------------------------------------------------
                                 Report Summary
-------------------------------------------------------------------------------
Number of SQL plan baselines verified: 1
Number of plans accepted: 0

The new plan is 1.29 times better than the baseline plan (the 1.29 ratio is computed as a function of buffer gets and CPU time). The plan was not accepted because it did not meet the 1.5x (150%) improvement threshold; that is, a plan has to be at least 1.5 times better for SPM to accept it. This threshold is controlled by the hidden parameter "_plan_verify_improvement_margin", whose default value is 150.

If we set this parameter to 120:

SQL> ALTER SYSTEM SET "_plan_verify_improvement_margin"=120;

and evolve the plan again:

SQL> var report clob; 
SQL> exec :report := DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE( sql_handle=> 'SQL_0f8c8f355212ed9a', verify=>'YES', commit=>'YES' );

PL/SQL procedure successfully completed

The report shows:

SQL> print :report
-------------------------------------------------------------------------------
                        Evolve SQL Plan Baseline Report
-------------------------------------------------------------------------------

Inputs:
-------
  SQL_HANDLE = SQL_0f8c8f355212ed9a
  PLAN_NAME  = SQL_PLAN_0z34g6p915vcu22368d7d
  TIME_LIMIT = DBMS_SPM.AUTO_LIMIT
  VERIFY     = YES
  COMMIT     = YES

Plan: SQL_PLAN_0z34g6p915vcu22368d7d
-----------------------------------
  Plan was verified: Time used .06 seconds.
  Plan passed performance criterion: 1.29 times better than baseline plan.
  Plan was changed to an accepted plan.
 
                       Baseline Plan      Test Plan     Stats Ratio
                      -------------      ---------     -------------
  Execution Status:        COMPLETE       COMPLETE
  Rows Processed:               10            10
  Elapsed Time(ms):             3.434         .101            34
  CPU Time(ms):                 3.555         .1111           32.03
  Buffer Gets:                  19            15              1.27
  Disk Reads:                   0             0
  Direct Writes:                0             0
  Fetches:                      0             0
  Executions:                   1             1

-------------------------------------------------------------------------------
                                 Report Summary
-------------------------------------------------------------------------------
Number of SQL plan baselines verified: 1
Number of plans accepted: 1

SPM evolution now automatically accepts SQL_PLAN_0z34g6p915vcu22368d7d because we have adjusted the acceptance threshold to 1.2x.

Thursday, October 26, 2017

Step by step - What is a SQL plan baseline, and how to create and use it


In Oracle Database 11g, using the new SQL plan management feature, we can now examine how execution plans change over time, have the database verify new plans by executing them before using them, and gradually evolve better plans in a controlled manner.

SQL Plan Management
When SQL plan management is enabled, the optimizer stores generated execution plans in a special repository, the SQL management base. All stored plans for a specific SQL statement are said to be part of a plan history for that SQL statement.
Some of the plans in the history can be marked as “accepted.” When the SQL statement is reparsed, the optimizer considers only the accepted plans in the history. This set of accepted plans for that SQL statement is called a SQL plan baseline, or baseline for short.
The optimizer still tries to generate a better plan, however. If the optimizer does generate a new plan, it adds it to the plan history but does not consider it while reparsing the SQL, unless the new plan is better than all the accepted plans in the baseline. Therefore, with SQL plan management enabled, SQL statements will never suddenly have a less efficient plan that results in worse performance.
With SQL plan management, we can examine all the available plans in the plan history for a SQL statement, compare them to see their relative efficiency, promote a specific plan to accepted status, and even make a plan the permanent (fixed) one.
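To examine and compare the plans once they have been captured, we can display them with DBMS_XPLAN.DISPLAY_SQL_PLAN_BASELINE (a sketch; the sql_handle value is a placeholder for the handle of the statement of interest):
select *
from table(dbms_xplan.display_sql_plan_baseline(
       sql_handle => 'SYS_SQL_7b76323ad90440b9',
       format     => 'basic'));
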
Capture
The capture function of SQL plan management captures the various optimizer plans used by SQL statements. By default, capture is disabled—that is, SQL plan management does not capture the history for the SQL statements being parsed or reparsed.
First, we enable the baseline capture in the session:
alter session
set optimizer_capture_sql_plan_baselines = true;

Now all the SQL statements executed in this session will be captured, along with their optimization plans, in the SQL management base. Every time the plan changes for a SQL statement, it is stored in the plan history.
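To confirm what has been captured, we can query the DBA_SQL_PLAN_BASELINES view (a quick check; with automatic capture, the first plan captured for a statement is marked as accepted and later plans are not):
select sql_handle, plan_name, origin, enabled, accepted, created
from dba_sql_plan_baselines
order by created desc;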

Using Baselines
With the SQL plan baselines captured, we now enable the optimizer to use them:
alter session set
optimizer_use_sql_plan_baselines = true;

With baseline use enabled, when the optimizer reparses a SQL statement, it examines the plans stored in the baseline for that SQL statement and chooses the best among them. This is where the most important benefit of baselines comes in. The optimizer also still reparses the SQL statements (the presence of a baseline does not prevent that), and if the newly generated plan is not found in the plan history of the SQL, it will be added, but not as “accepted.” So, if the newly generated plan is worse, the performance of the SQL will not be affected, because the plan is not used.
However, in some cases, we may decide that the new plan is actually better, based on our knowledge of the data distribution or the application logic. For instance, suppose the plan was captured when the table was practically empty, making the optimizer quite appropriately choose an index scan, but we know that the application populates the table before invoking the SQL statement later and that a full table scan will actually be better in the long run. In such a case, we can examine the new plan later and, if it is better, accept it, after which the optimizer will consider it. That’s why we get the best of both worlds: a good plan is always used, but if the optimizer generates a better one, it will be available for comparison.
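If we do decide that a non-accepted plan is the better one, one way to accept it without a performance verification run is to evolve it with VERIFY => 'NO' (a sketch; the handle and plan name are placeholders for the statement and plan in question):
var report clob;
exec :report := DBMS_SPM.EVOLVE_SQL_PLAN_BASELINE( sql_handle => 'SYS_SQL_7b76323ad90440b9', plan_name => '<non-accepted plan name>', verify => 'NO', commit => 'YES' );
print :report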

If we don’t want to use plans in the baseline for a SQL statement, we can use the following statement in the session prior to calling the SQL statement to disable the use of baselines:
alter session set
optimizer_use_sql_plan_baselines = false;

Evolving SQL Plan Baselines
Evolving a SQL plan baseline is the process by which the optimizer determines if non-accepted plans in the baseline should be accepted. As mentioned previously, manually loaded plans are automatically marked as accepted, so manual loading forces the evolving process. When plans are loaded automatically, the baselines are evolved using the EVOLVE_SQL_PLAN_BASELINE function, which returns a CLOB reporting its results.

SET LONG 10000
SELECT DBMS_SPM.evolve_sql_plan_baseline(sql_handle => 'SYS_SQL_7b76323ad90440b9')
FROM   dual;

Altering Plan Baselines
The ALTER_SQL_PLAN_BASELINE function allows the following attributes of a specific plan, or of all plans within a baseline, to be altered (see the sketch after this list):
  • enabled (YES/NO) : If YES, the plan is available for the optimizer if it is also marked as accepted.
  • fixed (YES/NO) : If YES, the SQL plan baseline will not evolve over time. Fixed plans are used in preference to non-fixed plans.
  • autopurge (YES/NO) : If YES, the SQL plan baseline is purged automatically if it is not used for a period of time.
  • plan_name : Used to amend the SQL plan name, up to a maximum of 30 characters.
  • description : Used to amend the SQL plan description, up to a maximum of 500 characters.
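
For example, here is a sketch that marks every plan in a baseline as fixed (the sql_handle value is a placeholder; pass plan_name instead to alter a single plan):

SET SERVEROUTPUT ON
DECLARE
  l_plans_altered PLS_INTEGER;
BEGIN
  -- 'fixed' is one of the alterable attributes listed above
  l_plans_altered := DBMS_SPM.ALTER_SQL_PLAN_BASELINE(
    sql_handle      => 'SYS_SQL_7b76323ad90440b9',
    attribute_name  => 'fixed',
    attribute_value => 'YES');
  DBMS_OUTPUT.PUT_LINE('Plans altered: ' || l_plans_altered);
END;
/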

Saturday, October 14, 2017

How to resolve session performance issues in Oracle Database

It’s the middle of the night, and you get a frantic call from someone complaining that the database is slow. The caller demands to know why—and what you’re going to do about it. Sound familiar? If it does, you are not alone. High performance is a common expectation of database system users: they get very unhappy when they don’t get it, and they are usually not shy about letting you know. What should you do next? In this article, you will learn some techniques for troubleshooting Oracle Database performance issues.

To use the scripts in this article, you need to create some tables in a test schema and access some dynamic performance views. The database user SYS has all privileges to access the views, so you need the password for the SYS user. The script for setting up the example tables is provided in the “Setup” section below.


Setup

To set up the test cases for this article, execute the SQL in this “Setup” section. The SQL assumes that you have access to the SYS user, you can create a user called ARUP (which means you don’t have a user with the same name), and you have a tablespace called USERS with at least 64 KB of free space.

Connect as SYS, and execute the following:

connect sys/<password> as sysdba

create user arup identified by arup
default tablespace users
/

alter user arup quota unlimited on users
/ 

-- now connect as arup
connect arup/arup

create table t1
(
 col1 number,
 col2 varchar2(1)
)
/
insert into t1 values
(1,'z')

/
commit
/

Session State

Before you start troubleshooting why a database is slow, you have to first understand that the database itself is never slow or fast—it has a constant speed. The sessions connected to the database, however, slow down when they hit a bump in the road. To resolve a session performance issue, you need to identify the bump and remove it. Fortunately, it’s very easy to do this in most cases. 

The first step to resolving a session performance issue is to ascertain what that database session is doing now. An Oracle Database session is always in one of three states.
  1. Idle. Not doing anything—just waiting to be given some work.

  2. Processing. Doing something useful—running on the CPU.

  3. Waiting. Waiting for something, such as a block to come from disk or a lock to be released.

If a session is waiting to be given work (idle), it’s really not slow at all—it just has nothing to do. If a session is waiting for some resource, such as a block or a lock, it has stopped processing. Until it gets that resource, the session will continue to wait. When it gets that resource, it does some processing and then moves on to the next resource it needs, waits for that to be available, and starts processing . . . and the cycle goes on until the session has nothing else to do. If it waits for resources often, the session will appear slow. But it’s not really slow—it’s just following a pattern of go, stop, go again, stop again, and so on. Your mission is to find and eliminate the “stop” issues in the session.
How difficult is it to get information about what’s causing the session to stop? It’s actually very easy: Oracle Database is instrumented to talk about what the database sessions are doing. All you need to do is to listen attentively or, more precisely, look for that information in the right place, and that place is a view called V$SESSION. Everything you need for your analysis is in this view.
To explain how to use the V$SESSION view, I will use a very common scenario—row locking—as an example. To follow along, first set up the tables as described in the “Setup” section above. Then connect as user ARUP from two different sessions. From the first session, issue the following SQL statement:
update t1
set col2 = 'x' where col1 = 1;

The output will show “1 row updated,” indicating that the row was updated. Do not issue a COMMIT after the statement. By not committing, you will force the session to get and hold a lock on the first row of the T1 table. Now go to the second session and issue the following SQL statement:

update t1
set col2 = 'y' 
where col1 = 1;

This statement will hang. Why? The answer is simple: the first session holds a lock on the row, which causes the second session to hang and the user to complain that the session is slow. To know what the second session is doing, the first thing you need to check is the STATE column in V$SESSION:

select sid, state
from v$session
where username = 'ARUP'; 

SID   STATE
————  ——————————————————
3346  WAITING
2832  WAITED KNOWN TIME

Study the output carefully. Session 3346 (in the SID column) indicates that it is waiting for something—and therefore not working. That should be your first clue that the session is experiencing one of those performance bumps in the road. But before you can determine what the session is waiting for, let’s study the state of session 2832 in the output, which shows that it waited for some known amount of time earlier. The important point is that session 2832 is not waiting for anything right now, meaning that it’s working productively.
Next, let’s see what the second session (3346) is waiting for. That information is readily available in the EVENT column in the same V$SESSION view. The EVENT column not only shows an event a session is waiting for currently, but also shows an event a session has waited for earlier. The query against V$SESSION in Listing 1 displays information from the EVENT column for both sessions.
Code Listing 1: Query for displaying sessions, session state, and events

select sid, state, event
from v$session
where username = 'ARUP'; 

SID   STATE              EVENT
————— —————————————————  ————————————————————————————
2832  WAITED KNOWN TIME  SQL*Net message from client
3346  WAITING            enq: TX - row lock contention

The output in Listing 1 shows that session 3346 is waiting right now for an event: “enq: TX – row lock contention”—short for “enqueue for transaction-level lock on row” or, in plain English, a row-level lock. The session is waiting because it wants to lock one or more rows, but another session has already placed locks on the row or rows. Unless that other session commits or rolls back its transaction, session 3346 will not get the lock it needs and will have no choice but to wait. On the other hand, the state of session 2832, “WAITED KNOWN TIME,” means that it is working—not waiting—right now. It was, however, waiting earlier for an event called “SQL*Net message from client” (I will discuss this specific event later). There is one very important lesson in these results: you cannot look at the EVENT column alone to find out what the session is waiting for. You must look at the STATE column first to determine whether the session is waiting or working and then inspect the EVENT column.
After you determine that a session is waiting for something, the next thing you need to find out is how long the session has been waiting. A very long wait usually indicates some sort of bottleneck. Where can you get information on the length of the waiting period? The answer is right there in the V$SESSION view, in the SECONDS_IN_WAIT column.
Getting the amount of time a session has been waiting makes sense for sessions that are waiting right now, but what about the sessions that are working now? Recall that the EVENT column shows not only the event a session is experiencing now but also the last wait event the session has experienced. Another column—WAIT_TIME—in the same V$SESSION view shows how long that wait lasted. (Note that WAIT_TIME is shown in centiseconds [hundredths of a second].)
Now that you know how to get information on the sessions waiting and working, let’s put all the information together in a single query, shown in Listing 2. It clearly shows the state of the sessions: whether they are working or waiting; if they are working, what they were waiting for earlier and for how long; and if they are waiting, what for and for how long.
Code Listing 2: Query for displaying sessions, session state, and wait details
col "Description" format a50
select sid,
        decode(state, 'WAITING','Waiting',
                'Working') state,
        decode(state,
                'WAITING',
                'So far '||seconds_in_wait,
                'Last waited '||
                wait_time/100)||
        ' secs for '||event
        "Description"
from v$session
where username = 'ARUP';
Output:
SID   STATE       Description
————— ——————————  ———————————————————————————————————————————————————————
2832  Working     Last waited 2029 secs for SQL*Net message from client
3346  Waiting     So far 743 secs for enq: TX - row lock contention
4208  Waiting     So far 5498 secs for SQL*Net message from client

Idle Event

Note the details of session 4208 in Listing 2; it has been waiting 5,498 seconds so far for a “SQL*Net message from client” event. Recall from the previous section that an Oracle Database session can be in one of three states: working, waiting for a resource, or waiting for work. But how can a session determine whether it is idle? It will expect to be given work by clients via SQL*Net, but there is no way for it to know in advance if any work is coming from the clients. All it can do is wait for some instruction coming through SQL*Net. Until then, it will have nothing else to do but eagerly stare at the SQL*Net interface, and this condition is reported as “SQL*Net message from client” in the V$SESSION view’s EVENT column, which is practically the same thing as just being idle.
You can disregard another EVENT column value, “rdbms ipc message,” because it is an event status for sessions that are idle. Note that an idle session does not show IDLE as the STATE column value; it still shows “Waiting.” You have to check the EVENT column to determine whether the session is truly idle.
You may be tempted to modify the query in Listing 2 to filter out sessions that show the “SQL*Net message from client” and “rdbms ipc message” idle events. Although you can do that, I highly discourage it, for multiple reasons. First, not all instances of the “SQL*Net message from client” event indicate that the session is idle. Consider the possibility that the network might be truly slow, in which case the session will also wait on these events. Remember, the session doesn’t have the ability to determine whether the client is truly idle or is sending instructions that are slow or stuck in the network. All it can do is wait, and it will wait with the “SQL*Net message from client” event. Second, idle events may provide some clues to Oracle Support about what else is going on inside a session. So I recommend displaying these “idle” EVENT values.
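If you want to see at a glance which events Oracle itself classifies as idle, without filtering them out, you can add the WAIT_CLASS column of V$SESSION (available in Oracle Database 10g and later) to the query:
select sid, state, wait_class, event
from v$session
where username = 'ARUP';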

Diagnosis of Locking

The output of Listing 2 provides enough information to enable you to make a diagnosis about the performance of these three sessions. Session 4208 is idle, so any complaints that session 4208 is slow just aren’t related to the database. Any performance issues related to this session could be related to a bug in the code that’s going through an infinite loop or high CPU consumption on the application server. You can redirect the performance troubleshooting focus toward the application client.
The story of session 3346 is different. This session is truly a bottleneck to the application. Now that you know why this session appears slow—it is waiting for a row lock—the next logical question is which session holds that lock. The answer is also found in—I hope you guessed it—the V$SESSION view, in, more specifically, the BLOCKING_SESSION column. (Note that in an Oracle Real Application Clusters [Oracle RAC] environment, the blocking session may exist in a different instance. In such a case, the blocking instance is displayed in the V$SESSION view’s BLOCKING_INSTANCE column.)
You can find out the blocking session and instance by issuing the following SQL statement:
select 
  blocking_session B_SID,
  blocking_instance B_Inst
from v$session
where sid = 3346;

B_SID   B_INST
——————  ———————
 2832      1
The output shows clearly that SID 2832 is holding the lock that SID 3346 is waiting for. Now you can follow a cause/effect relationship between the session in which an update to a row is being blocked and the session that holds the lock on that row.
You can find the specific row that is locked by first finding the table containing that row. To find that table, use the same V$SESSION view; in this case, the information is in the ROW_WAIT_OBJ# column, which shows the object number of the table whose row is being locked. You can then get the name of the table from the DBA_OBJECTS view, using this object number, as shown in Listing 3.
Code Listing 3: Getting row lock information
select row_wait_obj#,
       row_wait_file#,
       row_wait_block#,
       row_wait_row#
from v$session 
where sid = 3346;

ROW_WAIT_OBJ#  ROW_WAIT_FILE#  ROW_WAIT_BLOCK#  ROW_WAIT_ROW#
—————————————  ——————————————  ———————————————— ——————————————
241876         1024            2307623          0
To get the object information:
select owner, object_type, object_name, data_object_id
from dba_objects
where object_id = 241876;

OWNER  OBJECT_TYPE  OBJECT_NAME   DATA_OBJECT_ID
—————  ———————————— ————————————  ——————————————
ARUP   TABLE        T1                    241877
The output shows that some row in the T1 table is the point of the row lock contention. But which specific row is locked? That data is available in three V$SESSION view columns—ROW_WAIT_FILE#, ROW_WAIT_BLOCK#, and ROW_WAIT_ROW#—which show the relative file ID, the block ID in that file, and the row’s slot number inside that block, respectively, for that specific row. Using this information, you can identify the ROWID of the row. The ROWID, the physical address of every row in an Oracle Database instance, can be used to uniquely identify a row.
Listing 4 shows a SQL script that enables you to select the specific blocking row from the table with the information gathered so far. Save this script in a file named rowinfo.sql. The script expects the input in the following order: owner, table name, object#, file#, block#, and row#. You can call this script and pass all the requested parameters by copying and pasting the corresponding output from Listing 3.
Code Listing 4: Finding the row information
REM Filename: rowinfo.sql
REM This shows the row from the table when the
REM components of ROWID are passed. Pass the
REM following in this exact order
REM  1. owner 
REM  2. table name 
REM  3. data_object_id
REM  4. relative file ID 
REM  5. block ID
REM  6. row Number 
REM
select *
from &1..&2
where rowid =
        dbms_rowid.rowid_create (
                rowid_type      =>  1, 
                object_number   => &3,
                relative_fno    => &4,
                block_number    => &5,
                row_number      => &6
        )
/

SQL> @rowinfo ARUP T1 241877 1024 2307623 0

COL1  C
————— —
  1   x
The output in Listing 4 shows the specific row on which a lock is being requested but that is locked by another session. So far you have identified not only the source session of the locking but the specific row being locked as well.
Is it possible that the session holding the lock (SID 2832) is somehow disconnected from the client? That can occur in connection pools or when users access the database with thick-client tools such as Oracle SQL Developer. After you identify the session holding the lock, you may want to wait until it commits or rolls back the transaction. Either action releases the lock.
In the case of a dead connection, you may alternatively decide to kill the session, which will force a rollback releasing the locks held by the blocking session and enabling the waiting sessions to continue. Occasionally the problem can be pretty simple: for instance, someone issued an UPDATE statement from a thick-client tool but forgot to commit and thus caused every session to wait for those updated rows. Identifying that blocking session enables you to send a gentle reminder to rectify that situation immediately.
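If you do decide to kill a dead blocking session, you need its serial number along with its SID. Here is a sketch for SID 2832; the <serial#> placeholder stands for the value returned by the first query:
select serial# from v$session where sid = 2832;

alter system kill session '2832,<serial#>';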

More on the Session

In many troubleshooting situations, just knowing the SID of each session is not enough. You may need to know other details, such as the client machine the session is connecting from, the user (of both the database and the operating system), and the service name. All of this information is also readily available in the same V$SESSION view you have been using. Let’s briefly examine the columns that provide that information, by running the script shown in Listing 5.
Code Listing 5: Sessions from a specific user
select SID, osuser, machine, terminal, service_name, 
       logon_time, last_call_et
from v$session
where username = 'ARUP'; 

SID   OSUSER  MACHINE   TERMINAL  SERVICE_NAME  LOGON_TIME LAST_CALL_ET
————— ——————  ———————   ————————  ————————————  —————————— ————————————
3346  oradb   prodb1    pts/5     SYS$USERS     05-FEB-12          6848
2832  oradb   prodb1    pts/6     SERV1         05-FEB-12          7616
4408  ANANDA  ANLAP     ANLAP     ADHOC         05-FEB-12             0
OSUSER. The operating system user as which the client is connected. The output indicates that session 4408 is connected from the ANLAP machine, where a Windows user, ANANDA, has logged in.
MACHINE. The name of the machine where the client is running. This could be the database server itself. For two of the sessions, the machine name shows up as “prodb1.” Session 4408 runs on a different machine—ANLAP—presumably a laptop.
TERMINAL. If the session is connected from a UNIX server, this is the terminal where it runs.
LOGON_TIME. This shows when the session was first connected to the Oracle Database instance.
Using the columns shown in Listing 5, you can get very detailed information on a user’s sessions.
Suppose you receive a complaint that the applications running on the application server named appsvr1 are experiencing performance issues. Listing 6 shows a query against the V$SESSION view—including columns you’ve used in previous queries in this article—for the sessions connected from that machine and the output.
Code Listing 6: Session waits for a specific machine
col username format a5
col program format a10
col state format a10
col last_call_et head 'Called|secs ago' format 999999
col seconds_in_wait head 'Waiting|for secs' format 999999
col event format a50
select sid, username, program,
        decode(state, 'WAITING', 'Waiting',
                'Working') state,
last_call_et, seconds_in_wait, event
from v$session
where machine = 'appsvr1'
/
                                       Called      Waiting
SID   USERNAME  PROGRAM       STATE    secs ago    for secs   EVENT 
————— ———————   ———————————   ———————  —————————   ————————   ——————————————————
2832  ARUP      sqlplus.exe   Waiting       152         151   SQL*Net message 
                                                              from client
3089  ARUP      sqlplus.exe   Waiting       146         146   enq: TX - row lock 
                                                              contention
3346  ARUP      sqlplus.exe   Working        18          49   SQL*Net message 
                                                              from client
From the output, you can easily see that three sessions are connected from the appsvr1 application server. All of them are running SQL*Plus (as shown in the PROGRAM column). SID 3346 is the only one that is working (indicated by “Working” in the STATE column). Because it’s working, the EVENT column shows the last time the session waited. The wait time in this case is meaningless, because the session is not waiting but actually working. The “Called secs ago” column (representing the “last_call_et” column in V$SESSION) displays 18, which means that the session made a SQL call 18 seconds ago.
The other sessions are waiting. SID 3089 is waiting for a row lock. From the output, you can see that the session has been waiting for 146 seconds and that it also made its last SQL call 146 seconds ago. This indicates that the session has been waiting for that particular lock ever since it made that SQL call.
Finally, session 2832 is also waiting; in this case, it is waiting with a “SQL*Net message from client” event, which means it is idle, waiting to be given some work. The session issued its last SQL statement 152 seconds ago and has been idle for 151 seconds.
Armed with this information, you can diagnose performance issues very accurately. You can tell the complaining user that of the three sessions connected from the appsvr1 application server, one session is idle, one is working, and one is waiting for a lock. The user is probably referring to the slowness of this last session. Now you know the reason and how you can rectify it.

Getting the SQL

Another key piece of performance tuning information is the SQL statement a session is executing, which will provide more insights into the workings of the session. The same V$SESSION view also shows the SQL statement information. The SQL_ID column in the V$SESSION view shows the ID of the last SQL statement executed. You can get the text of that SQL statement from the V$SQL view, using the SQL_ID value. Here is an example of how I have identified the SQL statement executed by the session that appears slow to the user.
select sql_id
from v$session
where sid = 3089;

SQL_ID
—————————————————
g0uubmuvk4uax

set long 99999
select sql_fulltext
from v$sql
where sql_id = 'g0uubmuvk4uax';
SQL_FULLTEXT
————————————————————————————————————————
update t1 set col2 = 'y' where col1 = 1
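
You can also combine the two steps into a single query by joining V$SESSION to V$SQL; the SQL_CHILD_NUMBER column narrows the join to the exact child cursor the session is executing (a sketch):
select s.sid, q.sql_fulltext
from v$session s, v$sql q
where q.sql_id = s.sql_id
and q.child_number = s.sql_child_number
and s.sid = 3089;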

Data Access Issues

I have used row-level locking as the cause of a slowdown in this article. Although locking-related contention is a very common cause, it is not the only cause of performance problems. Another major cause of contention is disk I/O. When a session retrieves data from the database data files on disk to the buffer cache, it has to wait until the disk sends the data. This wait shows up for that session as “db file sequential read” (for index scans) or “db file scattered read” (for full-table scans) in the EVENT column, as shown below:
select event
from v$session
where sid = 3011;

EVENT
—————————————————————————
db file sequential read
When you see this event, you know that the session is waiting for I/O from the disk to complete. To make the session go faster, you have to reduce that waiting period. There are several ways to reduce the wait:
  1. Reduce the number of blocks retrieved by the SQL statement. Examine the SQL statement to see if it is doing a full-table scan when it should be using an index, if it is using a wrong index, or if it can be rewritten to reduce the amount of data it retrieves (see the sketch after this list).
  2. Place the tables used in the SQL statement on a faster part of the disk.
  3. Consider increasing the buffer cache to see if the expanded size will accommodate the additional blocks, therefore reducing the I/O and the wait.
  4. Tune the I/O subsystem to return data faster.
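
For the first technique, a quick way to check whether the statement is doing a full-table scan is to display its current execution plan with DBMS_XPLAN.DISPLAY_CURSOR (a sketch, using the SQL_ID found in the previous section):
select *
from table(dbms_xplan.display_cursor('g0uubmuvk4uax', null, 'TYPICAL'));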

There are other options as well, but the preceding ones are the most common remediation techniques. The exact activity you undertake depends on your specific situation, but the first technique—reducing the number of blocks retrieved by a SQL statement—almost always works.  When you think about tuning to reduce the number of blocks, you can look at the SQL statement to see which table is being selected from. But what if you see two or more tables in the statement? How do you determine which table is causing the wait?
To find the table causing a wait, you will again use the V$SESSION view. The view’s P1 and P2 columns provide information about the segment the session is waiting for. Listing 7 shows a query of P1 and P2, and the output.
Code Listing 7: Checking data access waits
select SID, state, event, p1, p2
from v$session
where username = 'ARUP'; 

SID  STATE     EVENT                   P1 P2
———— ———————   ——————————————————————— —— ————
2201 WAITING   db file sequential read  5 3011
The P1 column shows the file ID, and the P2 column shows the block ID. From that information in the result in Listing 7, you can get the segment name from the extent information in DBA_EXTENTS, as shown below:
select owner, segment_name
from dba_extents
where file_id = 5
and 3011 between block_id 
and block_id + blocks - 1;

OWNER  SEGMENT_NAME
—————— —————————————
ARUP   T1
This shows that the T1 table, owned by ARUP, is the segment being read from disk by this session. You should direct your attention to this table for tuning. You can move the table to a high-speed disk for faster I/O or, alternatively, focus on making I/O against this table more efficient by making changes that affect it, such as creating new indexes, creating materialized views, or building a result cache.