SQL Trace is an old database feature and in my opinion one of the best tool to analyze performance problems (it does not replace Automatic Workload Repository (AWR) or Statspack but can give much more precise information at database session level even than Active Session History (ASH)).
However in this article I don't what to show how you can use it to analyze performance issue but instead I want to show that it can also be used to debug database application code in some cases.
Let's assume you have an application using an Oracle Database and some appplication executable reports something like "ERROR" but that's all: the application does not give any other information and the application error is not documented; there is also no way to enable some application tracing because application code is not instrumented.
Let's also assume that this is third party application software and that the company providing related support is as clueless as you.
In this case (does this sound familiar to you ?), you can use SQL trace to check if this error is linked to some SQL statement.
For this article I have used Oracle Entreprise Edition 12.1.0.2 on Oracle Linux 6 but this technique should also work for Oracle 11, 10 and even 9 and Unix or Windows.
This article is using raw SQL trace data: raw SQL trace file format is not documented in official Oracle Database documentation and as a consequence it is not really supported to work directly on this data (usually you only generate SQL raw trace file with SQL trace enabling and you use TKPROF to generate the documented output). Raw SQL trace format is only documented in some My Oracle Support (MOS) documents and on some non-Oracle sites like Charles Hooper blog or on Method R site.
Ideally you should be able to reproduce the problem on a non-production database and I recommend to:
The usage of a logon trigger can have very bad side effects: if there are errors in the trigger new connections will fail and in the worst case you might not able to logon to the database instance any more! Do no try not use use a logon trigger on a production database unless you are very confident about your logon trigger. Use it at your own risk.
For example, assuming that OPS$ORACLE account is having DBA role, you can use following SQL*Plus script to enable SQL trace on all new database sessions:
connect / as sysdba grant execute on dbms_monitor to ops$oracle; connect / select name from v$database; show user; -- create trigger trace_trigger after logon on database begin dbms_monitor.session_trace_enable(binds => true, waits => true); end; / show errors --
Note that the trigger code runs with the privilege of the trigger owner thanks to definer's rights: this is why you only need to grant execute privilege on DBMS_MONITOR to trigger owner and not to the user account used by the application that will run the trigger when logging on to database instance.
If you know the Oracle user account used by application code, you can also try to enable SQL tracing only for this specific account:
connect / as sysdba grant execute on dbms_monitor to ops$oracle; connect / select name from v$database; show user; -- create trigger trace_trigger after logon on database begin if (user = 'APPLI') then dbms_monitor.session_trace_enable(binds => true, waits => true); end if; end; / show errors --
I recommend to enable tracing on all new database session because identifying the right database session used by the application can be (very) difficult especially with application servers using database connection pools.
Now you should reproduce the issue.
As soon as the issue has been reproduced you should disable tracing with:
connect / drop trigger trace_trigger;
OPS$ORACLE@DB12> select value from v$diag_info where name = 'Default Trace File'; VALUE -------------------------------------------------------------------------------- /u01/app/oracle/diag/rdbms/db12/DB12/trace/DB12_ora_9198.trc OPS$ORACLE@DB12>
$ cd /u01/app/oracle/diag/rdbms/db12/DB12/trace/ $ ls -rtl *.trc | tail -n 20 -rw-r----- 1 oracle dba 50323 Aug 22 16:00 DB12_ora_8694.trc -rw-r----- 1 oracle dba 76342 Aug 22 16:00 DB12_j000_8678.trc -rw-r----- 1 oracle dba 182798 Aug 22 16:00 DB12_j001_8680.trc -rw-r----- 1 oracle dba 40338 Aug 22 16:00 DB12_ora_8734.trc -rw-r----- 1 oracle dba 49922 Aug 22 16:00 DB12_ora_8735.trc -rw-r----- 1 oracle dba 50307 Aug 22 16:00 DB12_ora_8737.trc -rw-r----- 1 oracle dba 39276 Aug 22 16:00 DB12_ora_8789.trc -rw-r----- 1 oracle dba 49927 Aug 22 16:00 DB12_ora_8790.trc -rw-r----- 1 oracle dba 633243 Aug 22 16:00 DB12_ora_8792.trc -rw-r----- 1 oracle dba 19857 Aug 22 16:00 DB12_ora_8799.trc -rw-r----- 1 oracle dba 24208 Aug 22 16:00 DB12_ora_8800.trc -rw-r----- 1 oracle dba 123930 Aug 22 16:00 DB12_ora_8801.trc -rw-r----- 1 oracle dba 296616 Aug 22 16:00 DB12_j000_8808.trc -rw-r----- 1 oracle dba 13863 Aug 22 16:01 DB12_ora_8861.trc -rw-r----- 1 oracle dba 43076 Aug 22 16:01 DB12_ora_8862.trc -rw-r----- 1 oracle dba 106896 Aug 22 16:05 DB12_j000_8924.trc -rw-r----- 1 oracle dba 39292 Aug 22 16:10 DB12_j000_9113.trc -rw-r----- 1 oracle dba 4044 Aug 22 16:11 DB12_mmon_2165.trc -rw-r----- 1 oracle dba 20876 Aug 22 16:12 DB12_ora_9175.trc -rw-r----- 1 oracle dba 37882 Aug 22 16:12 DB12_ora_9198.trc
$ find . -newer DB12_ora_8694.trc -exec grep err= {} \; -print
./DB12_ora_8862.trc
ERROR #140067846071864:err=1031 tim=7096445569
Note that the number following "ERROR #" in a internal cursor number that you must use for next step.
The string "err=" is followed by the Oracle error number. In my case it is 1031: it means Oracle error ORA-01031.
It is possible to find different matching lines. In this case you should find the one just before the line found above.
I have found:
PARSING IN CURSOR #140067846071864 len=23 dep=0 uid=143 oct=17 lid=143 tim=7096444559 hv=2726857479 ad='80ecc920' sqlid='2u3y9cqj8j2s7' grant alter user to adm END OF STMT PARSE #140067846071864:c=1000,e=898,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,plh=0,tim=7096444556 CLOSE #140067845968192:c=0,e=8,dep=1,type=1,tim=7096444670
The raw trace files gives the SQL statement that has caused ORA-01031 error just before "END OF STMT":
grant alter user to adm
That's it.
You can also run TKPROF on the trace file to check what is written for this SQL statement.
In my case I have:
******************************************************************************** SQL ID: 2u3y9cqj8j2s7 Plan Hash: 0 grant alter user to adm call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 0 0.00 0.00 0 0 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 2 0.00 0.00 0 0 0 0 Misses in library cache during parse: 1 Optimizer mode: ALL_ROWS Parsing user id: 143 Elapsed times include waiting on following events: Event waited on Times Max. Wait Total Waited ---------------------------------------- Waited ---------- ------------ SQL*Net break/reset to client 2 0.00 0.00 SQL*Net message to client 1 0.00 0.00 SQL*Net message from client 1 0.00 0.00 ********************************************************************************
You can see that TKPROF:
Full raw SQL trace file is here and full TKPROF output is here
Using SQL trace and searching for "err=" or "ERROR #" can help to find database code triggering database error in "black box" software.
However using logon trigger to enable SQL trace is really risky and you cannot be 100% sure that you will also find the Oracle error code number in the raw SQL trace file.