Skip to main content

The differences between deterministic and result cache features

 EVERY once in a while, a developer gets in touch with a question like this:

I am confused about the exact difference between deterministic and result_cache. Do they have different application use cases? I have used deterministic feature in many functions which retrieve data from some lookup tables. Is it essential to replace these 'deterministic' key words with 'result_cache'? 

So I thought I'd write a post about the differences between these two features. But first, let's make sure we all understand what it means for a function to be deterministic.

From Wikipedia:

In computer science, a deterministic algorithm is an algorithm which, given a particular input, will always produce the same output, with the underlying machine always passing through the same sequence of states. 

Another way of putting this is that a deterministic subprogram (procedure or function) has no side-effects. If you pass a certain set of arguments for the parameters, you will always get the same results - no matter when, where or how often you run it.

What, you may be wondering, would be a side effect in a PL/SQL function? Any of these (this list is not exhaustive):

  • Any and every SQL statement
  • Referencing an out-of-scope variable (aka, "global")
  • Invoking a non-deterministic subprogram

Next we'll take a look at how you use these two features in your code. Then I will delve into the differences. We will, by the way, focus on functions in this post. It is possible to declare a procedure as deterministic, but this has no known impact that I am aware of and is not commonly used. And you cannot declare a procedure as "result cached."

1. Defining a function as deterministic

FUNCTION betwnstr (
   string_in   IN   VARCHAR2
 , start_in    IN   INTEGER
 , end_in      IN   INTEGER
)
   RETURN VARCHAR2 DETERMINISTIC 
IS
BEGIN
   RETURN (SUBSTR (string_in, start_in, end_in - start_in + 1));
END;

This simple encapsulation of SUBSTR allows me to specify the start and end positions, rather than the start position and number of characters. I hope you will agree that this is deterministic

So that's all you have to do: add the DETERMINISTIC keyword to the header of the function or procedure.

What it does for you:

  • Makes it possible to use the function in a function-based index
  • Might improve performance by caching and reusing function return values
Here's a quick little demonstration of the caching impact of a deterministic function:
CREATE OR REPLACE FUNCTION pass_number (i NUMBER)
   RETURN NUMBER
   DETERMINISTIC
IS
BEGIN
   DBMS_OUTPUT.put_line ('pass_number executed');
   RETURN 0;
END;
/

DECLARE
   n   NUMBER := 0;
BEGIN
   FOR rec IN (SELECT pass_number (1)
                 FROM all_objects
                WHERE ROWNUM < 6)
   LOOP
      n := n + 1;
   END LOOP;

   DBMS_OUTPUT.put_line (n + 1);
END;
/

pass_number executed
6

Notice that even though the function was invoked 5 times, the body of the function was executed just once. Oracle Database created a tiny, little, short-lived cache, just for this function and just for the server call (PL/SQL block or SQL statement) in which it was invoked.

Rob van Wijk offers lots more details on the behavior and performance of deterministic functions here

2. Defining a function as "result cached"

Let's now change make that betwnstr function a result-cached function:

FUNCTION betwnstr (
   string_in   IN   VARCHAR2
 , start_in    IN   INTEGER
 , end_in      IN   INTEGER
)
   RETURN VARCHAR2 
   RESULT_CACHE
IS
BEGIN
   RETURN (SUBSTR (string_in, start_in, end_in - start_in + 1));
END;

Gee, that was (deceptively) easy. I just add the RESULT_CACHE keyword. Notice that I removed the DETERMINISTIC keyword, but I did that only for clarity's sake. A function can have both of these keywords in its header. It can, in other words, be deterministic and result-cached.

What it does for you:

  • Tells Oracle Database that you want to use some memory in the SGA or Shared Global Area to cache argument values and returned values.
  • From that point on, whenever the function is invoked by any session in the database instance, the body of the function will only be executed if it has not already been called with those same input values.
  • If there is a "hit" in the cache for that combination of arguments, the return value(s) will simply be grabbed from the cache and returned to the calling block.
  • If the function relies on (references) any database tables, when any user commits changes to that table, the cache for the function will be automatically wiped out.
Hopefully you can tell that there is a lot more to the result cache feature than to the deterministic one. And the above list just scratches the surface. 

RESULT_CACHE is waaaaaay more powerful and can also have much greater impact (good and bad) on the performance of your application than deterministic. If you want to make use of this feature, you definitely should read through the extensive documentation here. You also find several result cache scripts on Oracle LiveSQL here.

And now for a quick little demonstration of the caching impact of a result-cached function:
CREATE OR REPLACE FUNCTION pass_number (i NUMBER)
   RETURN NUMBER
   RESULT_CACHE
IS
BEGIN
   DBMS_OUTPUT.put_line ('pass_number executed for ' || i);
   RETURN 0;
END;
/

DECLARE
   n   NUMBER := 0;
BEGIN
   FOR rec IN (SELECT pass_number (100)
                 FROM all_objects
                WHERE ROWNUM < 6)
   LOOP
      n := n + 1;
   END LOOP;

   DBMS_OUTPUT.put_line ('All done ' || TO_CHAR (n + 1));
END;
/

BEGIN
   DBMS_OUTPUT.PUT_LINE ('Returned ' || pass_number (100));
   DBMS_OUTPUT.PUT_LINE ('Returned ' || pass_number (200));
   DBMS_OUTPUT.PUT_LINE ('Returned ' || pass_number (300));
   DBMS_OUTPUT.PUT_LINE ('Returned ' || pass_number (100));
   DBMS_OUTPUT.PUT_LINE ('Returned ' || pass_number (200));
   DBMS_OUTPUT.PUT_LINE ('Returned ' || pass_number (300));
END;
/

pass_number executed for 100
All done 6
Returned 0
pass_number executed for 200
Returned 0
pass_number executed for 300
Returned 0
Returned 0
Returned 0
Returned 0

I call pass_number a total of three times with an argument value of 100. Notice that the function body is executed just once, out of those three times - even though the calls came from two different blocks.

And I call pass_number twice each with argument values of 200 and 300, but the function body is executed just once for each of those. 

If I disconnect from my session and reconnect, then run those same two blocks (without recompiling the function), I will see only:

All done 6
Returned 0
Returned 0
Returned 0
Returned 0
Returned 0
Returned 0

The cache for a result-cached function persists across blocks, across sessions, across users. It is a complex feature that can have a ripple effect, both good and bad, across your application.

Which is to say: if you are not careful about how you use RESULT_CACHE, you could cause unintended negative consequences, some of which are explored in this article: How to write a safe result-cached function.

How these two features are similar

Both DETERMINISTIC and RESULT_CACHE cause caching of data that can improve performance by sidestepping the need to execute the body of a function.

How they are different

The caching that the DETERMINISTIC keyword leads to has a narrow scope (only your session) and short lifespan (the caching occurs for the duration of the SQL statement that executes the function). So the overall performance impact will likely not be too great.

Data cached by the result cache feature are available to all users of a database instance. And that data remains cached until the cache is invalidated or flushed. It has much greater potential for improving performance of your application - but it also presents more of a danger of having a negative impact as well.

When to use deterministic

You should add the DETERMINISTIC keyword to your function header whenever it actually is deterministic. By doing so, you give the PL/SQL and SQL engines more information that they can use to optimize performance and make the features available (such as a function-based index).

Even if it doesn't help now, it might in the future, and it can do no harm to put it there....but....

But you should never use that keyword with a function that is not truly deterministic. Sometimes this kind of "lying" will be caught by Oracle and rejected, sometimes it could cause problems in your application.

When to use result cache

This is trickier to answer. Adding the RESULT_CACHE keyword to your function has a ripple effect through the entire database instance and overall application performance.

You need to work closely with your DBA to ensure that your dev, test and production instances are configured properly (the SGA memory area for all the result caches needs to be properly sized; you need to do everything you can to avoid latch contention).

And you need to very carefully select which functions should have the RESULT_CACHE keyword added to them. Some basic criteria include:
  • It is invoked with the same argument values repeatedly? 
  • If the function relies on a table, is the data in the table or view static (example: materialized view)? That's a good candidate. 
  • If the function relies on a table, is the data queried much more frequently than it is updated? Also a good possibility then. Remember: the cache will be invalidated when changes to a dependent table are committed.
  • Does the function have any session-specific dependencies, such as reliance on NLS parameter settings? If so, then the cached values might not match the session-dependent values. A good example is using TO_CHAR in your function without a format mask. VPD and reliance on system context values is another. Generally the solution for this is to move all such dependencies to the parameter list. 
Just remember: any deterministic function is a good candidate for the RESULT_CACHE keyword, but not every result-cached function is deterministic.

Comments

  1. FUNCTION betwnstr (
    string_in IN VARCHAR2
    , start_in IN INTEGER
    , end_in IN INTEGER
    )
    RETURN VARCHAR2 DETERMINISTIC
    IS
    BEGIN
    RETURN (SUBSTR (string_in, start_in, end_in - start_in + 1));
    END;

    In this above function every time the string input varies the output will also vary, then how this function will be called deterministic. Please explain more. As per definition, for a given input the output will be same is what was mentioned as deterministic. Can you throw some more light into this.

    ReplyDelete
    Replies
    1. "For a given input"

      In other words, if I pass in the same three input values "abc" 1 and 2, I will ALWAYS get back "ab".

      If your input changes, you get different results, but those results are dependent only on the input values.

      The value returned by the function is 100% DETERMINED by the input values and NOTHING ELSE.

      Does that help?

      Delete
    2. Deterministic means that for the same input it has always the same result.
      With Steven's favorite function, betwnstr ('Steven Feuerstein', 8, 17) will give you Feuerstein; betwnstr ('Steven Feuerstein', 1, 6) will give you Steven.

      This actually means that the engine will, during the processing of your statement, store the parameters as key and the result as the value in an associative array like structure.

      In this statement:

      Select betwnstr ( str, 1, 6 )
      from (
      select 'Steven Feuerstein' str from dual
      union all
      select 'João Barreto' str from dual
      union all
      select 'Steven Feuerstein' str from dual
      union all
      select 'Steven Feuerstein' str from dual
      union all
      select 'Steven Feuerstein' str from dual
      union all
      select 'Steven Feuerstein' str from dual
      ) data

      the function will only be called twice even though Steven appears five times in the set.

      Delete
    3. João, how absolutely and totally certain are you that the body of the function will be executed precisely twice in the union-all query you show above?

      :-)

      Delete
    4. I tested using logger. I think that even if the engine rewrites the query it will not execute more than twice - unless some degree of parallelism applies?

      Delete
    5. What if some maniac does;

      TRUNCATE TABLE SYS.DUAL;

      >:)

      Separate note - In the 'caching impact of a result-cached function' example, I think this;

      DBMS_OUTPUT.put_line ('All done ' || n + 1);

      Might need to be;

      DBMS_OUTPUT.put_line ('All done ' ||TO_CHAR(n + 1));

      (just noticed as I got an error running it)

      Delete
    6. Thanks for catching that. Funnily enough, I hit the error too when running in LiveSQL, but forgot to go back to the post and fix that line. I will do so now.

      And if there is "some maniac" that has the ability to connect to SYS and truncate that table...well, you've got bigger problems than impact on function caching."

      :-)

      Delete
  2. A bit more details from me:
    1. Deterministic functions caching and scalar subquery caching mechanisms are based on hash functions.
    2. Deterministic caching depends on fetch size(arraysize) – results cached only within one fetch call, while SSC(scalar subquery caching) has no this limitation.
    3. Hash collisions depends on the single parameter “_query_execution_cache_max_size” for both mechanizms, but they are more frequent in SSC.
    4. Oracle doesn’t keep last result of deterministic functions in case of hash collisions as it does for scalar subquery caching
    5. Deterministic functions caching turns off after a certain number of attempts to get the value from the cache. But SSC always returns results from cache if values already cached.

    Full my articles with explanations and examples:
    https://jonathanlewis.wordpress.com/2015/02/08/functions-subqueries/
    http://orasql.org/2014/03/31/deterministic-functions-result_cache-and-operators/

    Best regards,
    Sayan Malakshinov

    ReplyDelete
    Replies
    1. Thanks for all this information, Sayan.

      I have passed it along to the developer who writes the C code behind this stuff. I am sure he will enjoy it greatly!

      Delete
  3. And one please from me: Steven, please tell people more often about operators and statement level consistency. I see too often user-defined functions which return SELECTs results and used in other queries, and then they suddenly get inconsistent results when somebody changes data at the same time. Still a lot of oracle developers do not know about user-defined operators...
    Short example from me: http://orasql.org/2019/12/30/pl-sql-functions-and-statement-level-consistency/

    Sayan Malakshinov

    ReplyDelete
    Replies
    1. Thanks for pointing that out, Sayan. I will tweet about that today to help get the word out.

      Delete
  4. Just to be strict about avoiding implicit conversion, your function should have the dbms_output line updated to convert the numeric to a string:

    DBMS_OUTPUT.put_line ('pass_number executed for ' || to_char(i));

    ReplyDelete
  5. Also Steven, your output shows the first line as "All done 5", when it really is "All done 6".

    Also the TO_CHAR function should be applied to the pass_number function output as it is NUMBER and thus it will be convert to a character, to then be proper for DBMS_OUTPUT so that Oracle doesn't have to implicitly convert it.

    ReplyDelete

Post a Comment

Popular posts from this blog

Running out of PGA memory with MULTISET ops? Watch out for DISTINCT!

A PL/SQL team inside Oracle made excellent use of nested tables and MULTISET operators in SQL, blending data in tables with procedurally-generated datasets (nested tables).  All was going well when they hit the dreaded: ORA-04030: out of process memory when trying to allocate 2032 bytes  They asked for my help.  The error occurred on this SELECT: SELECT  *    FROM header_tab trx    WHERE (generated_ntab1 SUBMULTISET OF trx.column_ntab)       AND ((trx.column_ntab MULTISET             EXCEPT DISTINCT generated_ntab2) IS EMPTY) The problem is clearly related to the use of those nested tables. Now, there was clearly sufficient PGA for the nested tables themselves. So the problem was in executing the MULTISET-related functionality. We talked for a bit about dropping the use of nested tables and instead doing everything in SQL, to avoid the PGA error. That would, however require lots of wo...

How to Pick the Limit for BULK COLLECT

This question rolled into my In Box today: In the case of using the LIMIT clause of BULK COLLECT, how do we decide what value to use for the limit? First I give the quick answer, then I provide support for that answer Quick Answer Start with 100. That's the default (and only) setting for cursor FOR loop optimizations. It offers a sweet spot of improved performance over row-by-row and not-too-much PGA memory consumption. Test to see if that's fast enough (likely will be for many cases). If not, try higher values until you reach the performance level you need - and you are not consuming too much PGA memory.  Don't hard-code the limit value: make it a parameter to your subprogram or a constant in a package specification. Don't put anything in the collection you don't need. [from Giulio Dottorini] Remember: each session that runs this code will use that amount of memory. Background When you use BULK COLLECT, you retrieve more than row with each fetch, ...

PL/SQL 101: Three ways to get error message/stack in PL/SQL

The PL/SQL Challenge quiz for 10 September - 16 September 2016 explored the different ways you can obtain the error message / stack in PL/SQL. Note: an error stack is a sequence of multiple error messages that can occur when an exception is propagated and re-raised through several layers of nested blocks. The three ways are: SQLERRM - The original, traditional and (oddly enough) not currently recommended function to get the current error message. Not recommended because the next two options avoid a problem which you are unlikely  to run into: the error stack will be truncated at 512 bytes, and you might lose some error information. DBMS_UTILITY.FORMAT_ERROR_STACK - Returns the error message / stack, and will not truncate your string like SQLERRM will. UTL_CALL_STACK API - Added in Oracle Database 12c, the UTL_CALL_STACK package offers a comprehensive API into the execution call stack, the error stack and the error backtrace.  Note: check out this LiveSQL script if...