Showing posts with label RLS. Show all posts

Monday, October 17, 2011

Volatile Data, Dynamic Sampling And Shared Cursors

For the next couple of weeks I'll be picking up various random notes I've made during the sessions that I've attended at OOW. This particular topic was also a problem discussed recently at one of my clients, so it's certainly worth publishing here.

In one of the optimizer related sessions it was mentioned that for highly volatile data - for example often found in Global Temporary Tables (GTT) - it's recommended to use Dynamic Sampling rather than attempting to gather statistics. In particular for GTTs gathering statistics is problematic because the statistics are used globally and shared across all sessions. But GTTs could have a completely different data volume and distribution per session so sharing the statistics doesn't make sense in such scenarios.

So using Dynamic Sampling sounds like reasonable advice, and it probably is in many such cases.

However, there is still a potential problem even when resorting to Dynamic Sampling. If the cursors based on Dynamic Sampling get shared between sessions then they won't be re-optimized, even if the GTT in one session is completely different from the one in the session that originally created the shared cursor.

So you can still end up with shared cursors and execution plans that are inappropriate for some of the sessions sharing them. Dynamic Sampling doesn't address this issue; it only helps if the cursors do not get shared in the first place, for example because they use literals and the differing literals lead to different cursors based on text matching.

Here is a simple test case that demonstrates the problem:


drop view v_gtt_dyn;

drop table gtt_dyn;

-- Create a Global Temporary Table with an index on it
create global temporary table gtt_dyn (
  id number not null
, vc1 varchar2(100)
, filler varchar2(255)
)
on commit preserve rows
;

create index gtt_dyn_idx on gtt_dyn (id);

-- Create a simple view - it will become obvious later
-- why this has been used
create or replace view v_gtt_dyn as select * from gtt_dyn;

-- Run in Session 1
set echo on timing on

-- Unique value in ID column
insert into gtt_dyn
select
rownum as id
, rpad('x', 100) as vc1
, rpad('y', 255) as filler
from
dual
connect by level <= 10000
;

commit;

alter session set statistics_level = all;

alter session set optimizer_dynamic_sampling = 2;

select * from v_gtt_dyn where id = 10 and rownum > 1;

set linesize 200 pagesize 0 trimspool on tab off

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

-- Run in Session 2
set echo on timing on

-- Single value in ID column
insert into gtt_dyn
select
10 as id
, rpad('x', 100) as vc1
, rpad('y', 255) as filler
from
dual
connect by level <= 10000
;

commit;

alter session set statistics_level = all;

alter session set optimizer_dynamic_sampling = 2;

select * from v_gtt_dyn where id = 10 and rownum > 1;

set linesize 200 pagesize 0 trimspool on tab off

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));


Now this is probably an extreme case of data distribution differences, but if you run it you'll see it makes the point obvious: In the second session the data distribution of the GTT is completely different, and although there are no statistics gathered on the GTT and hence Dynamic Sampling gets used to arrive at an execution plan, the plan gets shared in the second session (there is only a child number 0). But the plan is completely inappropriate for the data distribution of the GTT in that session - you just need to look at the E-Rows and A-Rows columns of the runtime profile:


SQL> select * from v_gtt_dyn where id = 10 and rownum > 1;

no rows selected

Elapsed: 00:00:00.07
SQL>
SQL> set linesize 200 pagesize 0 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID bjax3mksw1uza, child number 0
-------------------------------------
select * from v_gtt_dyn where id = 10 and rownum > 1

Plan hash value: 471827990

-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.08 | 1117 |
| 1 | COUNT | | 1 | | 0 |00:00:00.08 | 1117 |
|* 2 | FILTER | | 1 | | 0 |00:00:00.08 | 1117 |
| 3 | TABLE ACCESS BY INDEX ROWID| GTT_DYN | 1 | 1 | 10000 |00:00:00.06 | 1117 |
|* 4 | INDEX RANGE SCAN | GTT_DYN_IDX | 1 | 1 | 10000 |00:00:00.02 | 63 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter(ROWNUM>1)
4 - access("ID"=10)

Note
-----
- dynamic sampling used for this statement (level=2)



Imagine a more complex plan with joins and a larger data volume and this is a recipe for disaster.
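A quick way to confirm that the second session really picked up the shared cursor rather than re-optimizing is to check the number of child cursors in V$SQL. A minimal sketch - the SQL_ID is taken from the example output above and will differ on your system:


-- How many child cursors exist for the statement?
select
  sql_id
, child_number
, plan_hash_value
, executions
from
  v$sql
where
  sql_id = 'bjax3mksw1uza';


In the problematic case this returns a single row with child number 0, although two sessions with completely different data distributions executed the statement.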

If this problem cannot be addressed from the application side by helping the database to generate different cursors for the different data distributions (for example, by simply adding different predicates that don't change the result, like 1 = 1, 2 = 2 etc.), then you might be able to handle the issue by using Virtual Private Database (VPD, aka Row Level Security / RLS, or Fine Grained Access Control / FGAC). I've already demonstrated the general approach in the past here, but in this case a slightly more sophisticated approach could make sense.

By adding the following code and RLS policy I can make Oracle perform a re-optimization only in those cases where it is appropriate. This limits the damage to the Shared Pool compared to the general approach, which unconditionally generates potentially numerous child cursors.


create or replace package pkg_rls_force_hard_parse is
  function force_hard_parse (in_schema varchar2, in_object varchar2) return varchar2;
end pkg_rls_force_hard_parse;
/

create or replace package body pkg_rls_force_hard_parse is
  -- Cache the count in session state
  g_cnt number;

  function force_hard_parse (in_schema varchar2, in_object varchar2) return varchar2
  is
    s_predicate varchar2(100);
  begin
    -- Only execute query once in session
    -- Change if re-evaluation is desired
    if g_cnt is null then
      select
        count(*)
      into
        g_cnt
      from
        gtt_dyn
      where
        id = 10
      and rownum <= 10;
    end if;

    -- We end up with exactly two child cursors
    -- with the desired different plans
    -- These child cursors will be shared accordingly
    if g_cnt > 1 then
      s_predicate := '1 = 1';
    else
      s_predicate := '2 = 2';
    end if;

    return s_predicate;
  end force_hard_parse;
end pkg_rls_force_hard_parse;
/

-- CONTEXT_SENSITIVE avoids re-evaluation of policy function at execution time
-- Note however that it doesn't avoid re-evaluation at parse time
exec DBMS_RLS.ADD_POLICY (USER, 'v_gtt_dyn', 'hard_parse_policy', USER, 'pkg_rls_force_hard_parse.force_hard_parse', 'select', policy_type => DBMS_RLS.CONTEXT_SENSITIVE);


Now if you repeat the above exercise - ideally with SQL trace enabled to see the additional activity caused by the RLS policy - you'll notice that the different sessions end up with different child cursors and execution plans based on the check made.

Now the reason for the view might become obvious: an RLS policy on the base table would have led to an infinite recursive execution of the RLS policy function due to the query performed within the function. There are other options to deal with that - for example, storing the RLS policy function in a separate schema with the EXEMPT ACCESS POLICY privilege should also work.
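As a sketch of that alternative - note that the schema name used here is made up and the grant needs to be issued by a suitably privileged user - the policy function owner could be exempted from all RLS policies, which avoids the recursion without the helper view:


-- Queries issued by RLS_ADMIN bypass all RLS policies,
-- so a policy function owned by it can safely query the base table
grant exempt access policy to rls_admin;


Keep in mind that EXEMPT ACCESS POLICY is a far-reaching privilege - it exempts the grantee from every RLS policy in the database - so it should be handled with care.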

This is the result in the second session now:


SQL> select * from v_gtt_dyn where id = 10 and rownum > 1;

no rows selected

Elapsed: 00:00:00.12
SQL>
SQL> set linesize 200 pagesize 0 trimspool on tab off
SQL>
SQL> select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
SQL_ID bjax3mksw1uza, child number 1
-------------------------------------
select * from v_gtt_dyn where id = 10 and rownum > 1

Plan hash value: 424976618

-----------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 0 |00:00:00.04 | 1003 |
| 1 | COUNT | | 1 | | 0 |00:00:00.04 | 1003 |
|* 2 | FILTER | | 1 | | 0 |00:00:00.04 | 1003 |
|* 3 | TABLE ACCESS FULL| GTT_DYN | 1 | 9288 | 10000 |00:00:00.03 | 1003 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - filter(ROWNUM>1)
3 - filter("ID"=10)

Note
-----
- dynamic sampling used for this statement (level=2)


Notice how a second child cursor has been generated and that the cardinality estimate is now much closer to reality.

Adaptive Cursor Sharing / Cardinality Feedback

I was curious to see whether recent features like Adaptive Cursor Sharing or Cardinality Feedback would be able to solve the issue in the 11g releases.

Cardinality Feedback (introduced in 11.2) unfortunately doesn't get used in the scenario described here, because Dynamic Sampling disables Cardinality Feedback in the current implementation.

Note that the usage of bind variables also disables Cardinality Feedback for those parts of the plan affected by the bind variables - as described in the Optimizer blog post that can be found here.

So maybe Adaptive Cursor Sharing (ACS, introduced in 11.1) can come to the rescue in case bind variables get used.

Of course, the usage of bind variables increases the probability of cursor sharing in the above scenario. As already outlined in a previous note, ACS is a "reactive" and "non-persistent" feature, so it will only be able to correct things that have already gone wrong at least once. Furthermore, if the ACS information gets aged out of the Shared Pool, the mistakes will have to be repeated to be recognized by ACS again.

However, it is interesting to note that I wasn't able to get ACS working in a slightly modified scenario like this (without the RLS policy in place, of course):


-- Session 1
set echo on timing on

set linesize 200 pagesize 0 trimspool on tab off

insert into gtt_dyn
select
rownum as id
, rpad('x', 100) as vc1
, rpad('y', 255) as filler
from
dual
connect by level <= 10000
;

commit;

alter session set statistics_level = all;

variable n1 number

exec :n1 := 10

select * from v_gtt_dyn where id <= :n1 and rownum > 1;

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

-- Session 2
set echo on timing on

set linesize 200 pagesize 0 trimspool on tab off

insert into gtt_dyn
select
10 as id
, rpad('x', 100) as vc1
, rpad('y', 255) as filler
from
dual
connect by level <= 10000
;

commit;

alter session set statistics_level = all;

variable n1 number

exec :n1 := 10

select * from v_gtt_dyn where id <= :n1 and rownum > 1;

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

-- Second execution to allow ACS potentially kicking in
select * from v_gtt_dyn where id <= :n1 and rownum > 1;

select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));


There are some interesting points to notice when running this example:

1. A cursor that uses non-equality operators like the less-or-equal above together with bind variables usually gets marked as "bind-sensitive" and will be monitored by ACS. But in this case the cursor was not marked as bind-sensitive, and hence ACS didn't even bother to monitor it.

2. Consequently the two sessions share the single child cursor, and the problem is not addressed by ACS even in subsequent executions.

3. It looks like the usage of Dynamic Sampling again disables the feature, this time ACS.

Looking at the way ACS manages the cursor sharing criteria (check V$SQL_CS_SELECTIVITY, for example), I suspect that ACS wouldn't cope with the fact that the same value for the bind variable results in a completely different selectivity range.
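The ACS-related attributes of a cursor can be inspected via V$SQL and the V$SQL_CS_* views. A sketch - the &sql_id substitution variable stands for the actual SQL_ID of the bind variable version of the query:


-- Check whether ACS monitors the cursor at all
select
  sql_id
, child_number
, is_bind_sensitive
, is_bind_aware
, is_shareable
from
  v$sql
where
  sql_id = '&sql_id';

-- Selectivity ranges tracked by ACS - empty unless bind-aware
select * from v$sql_cs_selectivity where sql_id = '&sql_id';


In the scenario described above this should show the cursor as not bind-sensitive, consistent with the first observation.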

Maybe this explains why ACS is not activated for cursors that use Dynamic Sampling - ACS may only be able to cope with different bind value ranges that lead to different selectivities.

So even when using bind variables on 11g with ACS, it looks like only the RLS policy approach allows this issue to be addressed purely from the database side. Ideally the application should be "data-aware" in such cases and help the database arrive at reasonable execution plans by actively unsharing the cursors.

Friday, February 6, 2009

How to force a hard parse

It's not uncommon to be confronted with a badly behaving application that doesn't share its SQL by using bind variables but uses literals instead. This leads to well-known issues like increased CPU usage, library cache contention etc. due to the hard parse activity required.

On a recent OTN thread the question was raised what to do if you are confronted with the opposite: a badly behaving application that shares SQL using bind variables that should not be shared. Although this might sound a bit odd at first, in pre-11g databases this "over-use" of bind variables can lead to severe issues, in particular with long running queries, possibly together with skewed data. Sharing SQL in such scenarios can lead to poor performance if the optimizer re-uses unsuitable execution plans that would not be used if a hard parse were forced using literals.

Obviously fixing the application and using literals where appropriate is the correct approach, but what can be done if you can't change the code, e.g. in case it's a third party application?

One possible solution is to "mis-use" the row level security feature of Oracle (RLS, often also called Virtual Private Database / VPD, also known as Fine Grained Access Control / FGAC) to modify the queries on the fly, thereby forcing a hard parse every time a query uses one of the objects the RLS policy is applied to.

The following code snippet demonstrates three slightly different approaches that were mentioned in the thread, all using the same basic principle: add a non-filtering predicate to the query that makes the query unshareable resp. unique, e.g. "1 = 1". The third one was contributed by Jonathan Lewis, by the way.


-- Global context approach
create table hard_parse_test
as
select * from all_objects
where rownum <= 1000;

exec dbms_stats.gather_table_stats(null, 'hard_parse_test')

create or replace context ctx_force_hard_parse
using pkg_rls_force_hard_parse
accessed globally;

create or replace package pkg_rls_force_hard_parse is
  function force_hard_parse (in_schema varchar2, in_object varchar2) return varchar2;
  procedure clear_context;
end pkg_rls_force_hard_parse;
/

create or replace package body pkg_rls_force_hard_parse is
  g_counter pls_integer;

  procedure clear_context
  is
  begin
    dbms_session.clear_context('CTX_FORCE_HARD_PARSE', NULL, 'G_COUNTER');
  end clear_context;

  procedure set_context
  is
  begin
    dbms_session.set_context('ctx_force_hard_parse', 'G_COUNTER', to_char(g_counter, 'TM'));
  end set_context;

  procedure get_context
  is
  begin
    g_counter := to_number(SYS_CONTEXT('ctx_force_hard_parse', 'G_COUNTER'));
  end get_context;

  function force_hard_parse (in_schema varchar2, in_object varchar2) return varchar2
  is
    s_predicate varchar2(100);
  begin
    get_context;
    if g_counter is null then
      g_counter := 0;
    end if;
    g_counter := g_counter + 1;
    if g_counter > 1000000 then
      g_counter := 1;
    end if;
    set_context;
    -- s_predicate := '1 = 1';
    s_predicate := to_char(g_counter, 'TM') || ' = ' || to_char(g_counter, 'TM');
    -- s_predicate := 'object_type = ''TABLE''';
    return s_predicate;
  end force_hard_parse;
begin
  get_context;
  if g_counter is null then
    g_counter := 0;
    set_context;
  end if;
end pkg_rls_force_hard_parse;
/

exec DBMS_RLS.ADD_POLICY (USER, 'hard_parse_test', 'hard_parse_policy', USER, 'pkg_rls_force_hard_parse.force_hard_parse', 'select');

-- random approach
create table hard_parse_test_rand
as
select * from all_objects
where rownum <= 1000;

exec dbms_stats.gather_table_stats(null, 'hard_parse_test_rand')

create or replace package pkg_rls_force_hard_parse_rand is
  function force_hard_parse (in_schema varchar2, in_object varchar2) return varchar2;
end pkg_rls_force_hard_parse_rand;
/

create or replace package body pkg_rls_force_hard_parse_rand is
  function force_hard_parse (in_schema varchar2, in_object varchar2) return varchar2
  is
    s_predicate varchar2(100);
    n_random pls_integer;
  begin
    n_random := round(dbms_random.value(1, 1000000));
    -- s_predicate := '1 = 1';
    s_predicate := to_char(n_random, 'TM') || ' = ' || to_char(n_random, 'TM');
    -- s_predicate := 'object_type = ''TABLE''';
    return s_predicate;
  end force_hard_parse;
end pkg_rls_force_hard_parse_rand;
/

exec DBMS_RLS.ADD_POLICY (USER, 'hard_parse_test_rand', 'hard_parse_policy', USER, 'pkg_rls_force_hard_parse_rand.force_hard_parse', 'select');

-- sequence approach
create sequence hard_parse_test_seq maxvalue 1000000 cycle;

create table hard_parse_test_sequence
as
select * from all_objects
where rownum <= 1000;

exec dbms_stats.gather_table_stats(null, 'hard_parse_test_sequence')

create or replace package pkg_rls_force_hard_parse_seq is
  function force_hard_parse (in_schema varchar2, in_object varchar2) return varchar2;
end pkg_rls_force_hard_parse_seq;
/

create or replace package body pkg_rls_force_hard_parse_seq is
  function force_hard_parse (in_schema varchar2, in_object varchar2) return varchar2
  is
    s_predicate varchar2(100);
    n_seq pls_integer;
  begin
    select hard_parse_test_seq.nextval
    into n_seq
    from dual;
    -- s_predicate := '1 = 1';
    s_predicate := to_char(n_seq, 'TM') || ' = ' || to_char(n_seq, 'TM');
    -- s_predicate := 'object_type = ''TABLE''';
    return s_predicate;
  end force_hard_parse;
end pkg_rls_force_hard_parse_seq;
/

exec DBMS_RLS.ADD_POLICY (USER, 'hard_parse_test_sequence', 'hard_parse_policy', USER, 'pkg_rls_force_hard_parse_seq.force_hard_parse', 'select');

-- trace
alter session set tracefile_identifier = 'hard_parse_test_session_loop1';

exec dbms_monitor.session_trace_enable

declare
begin
  for i in 1..1000 loop
    for rec in (select * from hard_parse_test_sequence) loop
      exit;
    end loop;
  end loop;
end;
/

exec dbms_monitor.session_trace_disable


If you check the trace files you'll see that the query in the PL/SQL loop got hard parsed for each execution, so this looks promising. Stress testing the implementation by running the loop from multiple sessions simultaneously showed that it seems to work well and effectively prevents the sharing of the SQL.
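Besides checking the trace files, a simple sanity check is to compare the session's hard parse count before and after running the loop. A sketch based on V$MYSTAT:


-- Hard parses performed by the current session so far
select
  sn.name
, st.value
from
  v$mystat st
, v$statname sn
where
  st.statistic# = sn.statistic#
and sn.name = 'parse count (hard)';


Run this query before and after the PL/SQL loop: with the policy in place the value should increase by roughly the number of loop iterations, whereas without it only the first execution should trigger a hard parse.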

Using Tom Kyte's RUNSTATS package shows that there are not too many differences in resource consumption of the different approaches:


-- statistics
set timing on echo on serveroutput on trimspool on

spool hard_parse_force_rls_testcase.log

alter session set nls_language = 'AMERICAN';

alter system flush shared_pool;

exec runstats_pkg.rs_start

declare
begin
  for i in 1..1000 loop
    for rec in (select * from hard_parse_test) loop
      exit;
    end loop;
  end loop;
end;
/

exec runstats_pkg.rs_middle

declare
begin
  for i in 1..1000 loop
    for rec in (select * from hard_parse_test_rand) loop
      exit;
    end loop;
  end loop;
end;
/

exec runstats_pkg.rs_stop

spool off

-- statistics
set timing on echo on serveroutput on trimspool on

spool hard_parse_force_rls_testcase2.log

alter session set nls_language = 'AMERICAN';

alter system flush shared_pool;

exec runstats_pkg.rs_start

declare
begin
  for i in 1..1000 loop
    for rec in (select * from hard_parse_test_sequence) loop
      exit;
    end loop;
  end loop;
end;
/

exec runstats_pkg.rs_middle

declare
begin
  for i in 1..1000 loop
    for rec in (select * from hard_parse_test_rand) loop
      exit;
    end loop;
  end loop;
end;
/

exec runstats_pkg.rs_stop

spool off


Results for Global Context vs Random:


Run1 ran in 1941 hsecs
Run2 ran in 1869 hsecs
run 1 ran in 103,85% of the time

Name Run1 Run2 Diff
LATCH.list of block allocation 0 1 1
LATCH.transaction branch alloc 7 6 -1
STAT...session cursor cache co 1 0 -1
STAT...table scan blocks gotte 2,002 2,001 -1
STAT...redo entries 9 10 1
STAT...calls to kcmgcs 4 5 1
LATCH.user lock 0 1 1
LATCH.transaction allocation 0 1 1
LATCH.longop free list parent 7 6 -1
STAT...opened cursors current 1 2 1
LATCH.session timer 7 6 -1
LATCH.mostly latch-free SCN 7 6 -1
LATCH.threshold alerts latch 1 0 -1
LATCH.lgwr LWN SCN 7 6 -1
LATCH.ASM db client latch 14 12 -2
STAT...cursor authentications 3 5 2
LATCH.job_queue_processes para 0 2 2
LATCH.session idle bit 16 14 -2
LATCH.cache buffers lru chain 7 5 -2
LATCH.shared pool sim alloc 2 0 -2
LATCH.dml lock allocation 0 2 2
LATCH.undo global data 3 6 3
LATCH.parameter list 5 2 -3
STAT...table fetch continued r 3 0 -3
STAT...bytes received via SQL* 1,224 1,227 3
LATCH.redo allocation 16 19 3
LATCH.object queue header oper 50 46 -4
LATCH.In memory undo latch 8 12 4
LATCH.library cache lock alloc 12 16 4
LATCH.parallel query alloc buf 4 0 -4
STAT...consistent changes 37 32 -5
STAT...db block changes 47 42 -5
LATCH.session state list latch 6 0 -6
LATCH.active service list 40 33 -7
STAT...undo change vector size 2,132 2,140 8
STAT...parse time cpu 13 5 -8
LATCH.messages 124 116 -8
STAT...parse count (hard) 1,013 1,004 -9
STAT...parse time elapsed 15 5 -10
STAT...db block gets 39 28 -11
STAT...db block gets from cach 39 28 -11
LATCH.channel operations paren 98 84 -14
STAT...enqueue releases 1,020 1,005 -15
STAT...enqueue requests 1,021 1,005 -16
LATCH.checkpoint queue latch 184 168 -16
LATCH.library cache load lock 60 41 -19
STAT...redo size 2,884 2,904 20
STAT...workarea executions - o 38 14 -24
STAT...table scan rows gotten 178,060 178,030 -30
STAT...cluster key scans 47 8 -39
LATCH.kks stats 1,056 1,015 -41
STAT...rows fetched via callba 58 15 -43
STAT...cluster key scan block 51 8 -43
STAT...hot buffers moved to he 45 0 -45
STAT...workarea memory allocat 34 -12 -46
STAT...sorts (memory) 63 15 -48
STAT...free buffer inspected 49 0 -49
STAT...CPU used by this sessio 1,915 1,856 -59
STAT...parse count (total) 2,137 2,074 -63
STAT...CPU used when call star 1,921 1,857 -64
STAT...recursive cpu usage 1,918 1,853 -65
STAT...index scans kdiixs1 137 67 -70
STAT...Elapsed Time 1,944 1,873 -71
LATCH.SQL memory manager worka 489 416 -73
STAT...DB time 1,945 1,867 -78
STAT...index fetch by key 116 25 -91
STAT...session cursor cache hi 2,157 2,059 -98
STAT...opened cursors cumulati 2,172 2,071 -101
LATCH.enqueues 2,365 2,252 -113
STAT...execute count 3,208 3,094 -114
LATCH.enqueue hash chains 2,368 2,253 -115
STAT...calls to get snapshot s 3,210 3,091 -119
STAT...consistent gets - exami 377 131 -246
STAT...no work - consistent re 2,456 2,183 -273
STAT...buffer is not pinned co 617 230 -387
STAT...table fetch by rowid 555 103 -452
STAT...consistent gets 4,841 4,323 -518
STAT...consistent gets from ca 4,841 4,323 -518
STAT...session logical reads 4,880 4,351 -529
STAT...sorts (rows) 3,173 2,561 -612
STAT...buffer is pinned count 653 9 -644
LATCH.cache buffers chains 9,799 8,728 -1,071
LATCH.simulator hash latch 50 2,023 1,973
STAT...recursive calls 8,243 6,256 -1,987
LATCH.session allocation 7,735 2,065 -5,670
LATCH.global ctx hash table la 6,004 0 -6,004
LATCH.library cache lock 1,011,473 1,003,687 -7,786
LATCH.shared pool simulator 509,871 517,895 8,024
LATCH.library cache pin 34,233 22,204 -12,029
LATCH.library cache 1,061,947 1,040,025 -21,922
STAT...session pga memory max 310,652 196,608 -114,044
STAT...session uga memory 130,928 0 -130,928
LATCH.row cache objects 37,688,620 37,550,193 -138,427
STAT...session uga memory max 299,884 123,440 -176,444
STAT...session pga memory 310,652 0 -310,652
LATCH.shared pool 896,770 283,900 -612,870

Run1 latches total versus runs -- difference and pct
Run1 Run2 Diff Pct
41,235,708 40,439,510 -796,198 101.97%


Results for Sequence vs. Random:


Run1 ran in 1912 hsecs
Run2 ran in 1863 hsecs
run 1 ran in 102,63% of the time

Name Run1 Run2 Diff
LATCH.compile environment latc 1 0 -1
LATCH.job_queue_processes para 1 0 -1
LATCH.threshold alerts latch 0 1 1
LATCH.parameter list 3 2 -1
LATCH.resmgr:actses change gro 1 0 -1
LATCH.KMG MMAN ready and start 7 6 -1
LATCH.FOB s.o list latch 1 0 -1
LATCH.parameter table allocati 1 0 -1
STAT...cursor authentications 4 3 -1
STAT...session cursor cache co 1 0 -1
STAT...index crx upgrade (posi 2 1 -1
STAT...calls to kcmgcs 4 5 1
STAT...shared hash latch upgra 2 1 -1
LATCH.resmgr:active threads 2 0 -2
LATCH.resmgr:free threads list 2 0 -2
STAT...table fetch continued r 2 0 -2
LATCH.post/wait queue 2 0 -2
LATCH.user lock 2 0 -2
LATCH.dummy allocation 2 0 -2
LATCH.mostly latch-free SCN 9 6 -3
LATCH.lgwr LWN SCN 9 6 -3
LATCH.list of block allocation 3 0 -3
LATCH.transaction allocation 3 0 -3
STAT...change write time 3 0 -3
STAT...parse time elapsed 10 13 3
LATCH.active service list 44 41 -3
STAT...workarea memory allocat -18 -21 -3
LATCH.parallel query alloc buf 4 0 -4
LATCH.PL/SQL warning settings 2,010 2,006 -4
STAT...free buffer requested 8 12 4
STAT...parse time cpu 10 6 -4
STAT...consistent changes 37 32 -5
STAT...bytes received via SQL* 1,233 1,227 -6
LATCH.resmgr group change latc 6 0 -6
LATCH.channel operations paren 98 92 -6
STAT...physical read total IO 0 6 6
STAT...physical read IO reques 0 6 6
STAT...physical reads cache 0 6 6
STAT...physical reads 0 6 6
STAT...user I/O wait time 0 7 7
LATCH.library cache pin alloca 15 7 -8
LATCH.session state list latch 8 0 -8
LATCH.Consistent RBA 9 1 -8
LATCH.active checkpoint queue 15 6 -9
LATCH.simulator lru latch 10 0 -10
STAT...parse count (hard) 1,016 1,006 -10
LATCH.In memory undo latch 25 8 -17
LATCH.cache buffers lru chain 30 12 -18
LATCH.redo allocation 34 15 -19
LATCH.client/application info 24 0 -24
LATCH.shared pool sim alloc 0 24 24
STAT...workarea executions - o 38 13 -25
LATCH.messages 158 126 -32
LATCH.redo writing 54 22 -32
LATCH.library cache lock alloc 43 8 -35
STAT...rows fetched via callba 53 15 -38
STAT...hot buffers moved to he 41 2 -39
STAT...CPU used by this sessio 1,880 1,839 -41
LATCH.SQL memory manager worka 456 412 -44
STAT...recursive cpu usage 1,880 1,835 -45
STAT...cluster key scans 51 5 -46
STAT...CPU used when call star 1,885 1,839 -46
STAT...Elapsed Time 1,914 1,866 -48
LATCH.kks stats 1,084 1,036 -48
STAT...cluster key scan block 55 5 -50
LATCH.object queue header heap 90 36 -54
STAT...DB time 1,915 1,860 -55
STAT...sorts (memory) 68 12 -56
STAT...free buffer inspected 61 0 -61
STAT...index scans kdiixs1 133 66 -67
LATCH.object queue header oper 136 60 -76
STAT...commit cleanouts 101 0 -101
STAT...commit cleanouts succes 101 0 -101
STAT...calls to kcmgas 101 0 -101
STAT...deferred (CURRENT) bloc 101 0 -101
LATCH.session idle bit 123 14 -109
STAT...parse count (total) 2,214 2,071 -143
LATCH.library cache load lock 206 28 -178
LATCH.checkpoint queue latch 420 240 -180
STAT...index fetch by key 217 21 -196
STAT...session cursor cache hi 2,255 2,058 -197
STAT...redo entries 211 10 -201
LATCH.enqueues 2,543 2,334 -209
LATCH.dml lock allocation 214 0 -214
STAT...opened cursors cumulati 2,289 2,067 -222
STAT...no work - consistent re 2,450 2,182 -268
STAT...db block gets from cach 345 30 -315
STAT...db block gets 345 30 -315
STAT...enqueue releases 1,327 1,007 -320
STAT...enqueue requests 1,328 1,007 -321
LATCH.undo global data 322 1 -321
STAT...consistent gets - exami 465 123 -342
STAT...buffer is not pinned co 600 228 -372
STAT...db block changes 451 43 -408
STAT...table fetch by rowid 545 102 -443
STAT...consistent gets 4,924 4,314 -610
STAT...consistent gets from ca 4,924 4,314 -610
STAT...sorts (rows) 3,178 2,561 -617
STAT...buffer is pinned count 654 9 -645
LATCH.enqueue hash chains 3,074 2,336 -738
LATCH.library cache lock 1,008,133 1,007,327 -806
STAT...session logical reads 5,269 4,344 -925
LATCH.simulator hash latch 242 2,027 1,785
STAT...execute count 5,310 3,090 -2,220
STAT...calls to get snapshot s 5,413 3,087 -2,326
LATCH.library cache 1,040,087 1,043,596 3,509
LATCH.library cache pin 18,535 22,182 3,647
LATCH.row cache objects 37,693,219 37,688,109 -5,110
STAT...recursive calls 11,801 6,131 -5,670
LATCH.sequence cache 6,210 0 -6,210
LATCH.cache buffers chains 18,122 8,807 -9,315
LATCH.shared pool simulator 538,704 558,857 20,153
STAT...undo change vector size 22,736 2,148 -20,588
LATCH.session allocation 25,092 2,957 -22,135
STAT...physical read total byt 0 49,152 49,152
STAT...physical read bytes 0 49,152 49,152
STAT...redo size 65,780 2,928 -62,852
STAT...session uga memory 65,464 0 -65,464
STAT...session pga memory 131,072 65,536 -65,536
STAT...session pga memory max 262,144 131,072 -131,072
STAT...session uga memory max 315,052 65,464 -249,588
LATCH.shared pool 702,224 301,771 -400,453

Run1 latches total versus runs -- difference and pct
Run1 Run2 Diff Pct
41,062,061 40,644,708 -417,353 101.03%


In passing: the tests on 10.2.0.4 Win32bit revealed that the 10g RLS code seems to have a built-in check that ignores non-filtering predicates like "1 = 1" - at least the predicate doesn't show up in the DBMS_XPLAN.DISPLAY output, whereas an actual filter predicate like "OBJECT_TYPE = 'TABLE'" does show up. The hard parse is still forced, which is even better, since the query is effectively not modified.

It's worth mentioning that the RLS policy adds significant overhead. If you check the trace files you'll notice that the policy function gets called once per parse and twice per execution, so a single parse/execute of a query causes two hard parses and three calls to the policy function.

By the way, Tanel Poder mentioned in the OTN thread that he demonstrated a quite similar solution some time ago at HotSOS, which I wasn't aware of. His demo shows some enhanced versions of the solution, including the ability to check for particular SQLs so that not every SQL is affected. Thanks, Tanel, for the information and the link.