EVERY once in a while, a developer gets in touch with a question like this:
I am confused about the exact difference between deterministic and result_cache. Do they have different application use cases? I have used deterministic feature in many functions which retrieve data from some lookup tables. Is it essential to replace these 'deterministic' key words with 'result_cache'?
So I thought I'd write a post about the differences between these two features. But first, let's make sure we all understand what it means for a function to be deterministic.
From Wikipedia:
In computer science, a deterministic algorithm is an algorithm which, given a particular input, will always produce the same output, with the underlying machine always passing through the same sequence of states.
Another way of putting this is that a deterministic subprogram (procedure or function) has no side-effects. If you pass a certain set of arguments for the parameters, you will always get the same results - no matter when, where or how often you run it.
What, you may be wondering, would be a side effect in a PL/SQL function? Any of these (this list is not exhaustive):
- Any and every SQL statement
- Referencing an out-of-scope variable (aka, "global")
- Invoking a non-deterministic subprogram
Next we'll take a look at how you use these two features in your code. Then I will delve into the differences. We will, by the way, focus on functions in this post. It is possible to declare a procedure as deterministic, but this has no known impact that I am aware of and is not commonly used. And you cannot declare a procedure as "result cached."
1. Defining a function as deterministic
FUNCTION betwnstr (
string_in IN VARCHAR2
, start_in IN INTEGER
, end_in IN INTEGER
)
RETURN VARCHAR2 DETERMINISTIC
IS
BEGIN
RETURN (SUBSTR (string_in, start_in, end_in - start_in + 1));
END;
This simple encapsulation of SUBSTR allows me to specify the start and end positions, rather than the start position and number of characters. I hope you will agree that this is deterministic.
So that's all you have to do: add the DETERMINISTIC keyword to the header of the function or procedure.
What it does for you:
- Makes it possible to use the function in a function-based index
- Might improve performance by caching and reusing function return values
CREATE OR REPLACE FUNCTION pass_number (i NUMBER)
RETURN NUMBER
DETERMINISTIC
IS
BEGIN
DBMS_OUTPUT.put_line ('pass_number executed');
RETURN 0;
END;
/
DECLARE
n NUMBER := 0;
BEGIN
FOR rec IN (SELECT pass_number (1)
FROM all_objects
WHERE ROWNUM < 6)
LOOP
n := n + 1;
END LOOP;
DBMS_OUTPUT.put_line (n + 1);
END;
/
pass_number executed
6
Notice that even though the function was invoked 5 times, the body of the function was executed just once. Oracle Database created a tiny, little, short-lived cache, just for this function and just for the server call (PL/SQL block or SQL statement) in which it was invoked.
Rob van Wijk offers lots more details on the behavior and performance of deterministic functions here.
2. Defining a function as "result cached"
Let's now change make that betwnstr function a result-cached function:
FUNCTION betwnstr (
string_in IN VARCHAR2
, start_in IN INTEGER
, end_in IN INTEGER
)
RETURN VARCHAR2
RESULT_CACHE
IS
BEGIN
RETURN (SUBSTR (string_in, start_in, end_in - start_in + 1));
END;
Gee, that was (deceptively) easy. I just add the RESULT_CACHE keyword. Notice that I removed the DETERMINISTIC keyword, but I did that only for clarity's sake. A function can have both of these keywords in its header. It can, in other words, be deterministic and result-cached.
What it does for you:
- Tells Oracle Database that you want to use some memory in the SGA or Shared Global Area to cache argument values and returned values.
- From that point on, whenever the function is invoked by any session in the database instance, the body of the function will only be executed if it has not already been called with those same input values.
- If there is a "hit" in the cache for that combination of arguments, the return value(s) will simply be grabbed from the cache and returned to the calling block.
- If the function relies on (references) any database tables, when any user commits changes to that table, the cache for the function will be automatically wiped out.
CREATE OR REPLACE FUNCTION pass_number (i NUMBER)
RETURN NUMBER
RESULT_CACHE
IS
BEGIN
DBMS_OUTPUT.put_line ('pass_number executed for ' || i);
RETURN 0;
END;
/
DECLARE
n NUMBER := 0;
BEGIN
FOR rec IN (SELECT pass_number (100)
FROM all_objects
WHERE ROWNUM < 6)
LOOP
n := n + 1;
END LOOP;
DBMS_OUTPUT.put_line ('All done ' || TO_CHAR (n + 1));
END;
/
BEGIN
DBMS_OUTPUT.PUT_LINE ('Returned ' || pass_number (100));
DBMS_OUTPUT.PUT_LINE ('Returned ' || pass_number (200));
DBMS_OUTPUT.PUT_LINE ('Returned ' || pass_number (300));
DBMS_OUTPUT.PUT_LINE ('Returned ' || pass_number (100));
DBMS_OUTPUT.PUT_LINE ('Returned ' || pass_number (200));
DBMS_OUTPUT.PUT_LINE ('Returned ' || pass_number (300));
END;
/
pass_number executed for 100
All done 6
Returned 0
pass_number executed for 200
Returned 0
pass_number executed for 300
Returned 0
Returned 0
Returned 0
Returned 0
I call pass_number a total of three times with an argument value of 100. Notice that the function body is executed just once, out of those three times - even though the calls came from two different blocks.
And I call pass_number twice each with argument values of 200 and 300, but the function body is executed just once for each of those.
If I disconnect from my session and reconnect, then run those same two blocks (without recompiling the function), I will see only:
All done 6
Returned 0
Returned 0
Returned 0
Returned 0
Returned 0
Returned 0
The cache for a result-cached function persists across blocks, across sessions, across users. It is a complex feature that can have a ripple effect, both good and bad, across your application.
Which is to say: if you are not careful about how you use RESULT_CACHE, you could cause unintended negative consequences, some of which are explored in this article: How to write a safe result-cached function.
How these two features are similar
How they are different
When to use deterministic
When to use result cache
- It is invoked with the same argument values repeatedly?
- If the function relies on a table, is the data in the table or view static (example: materialized view)? That's a good candidate.
- If the function relies on a table, is the data queried much more frequently than it is updated? Also a good possibility then. Remember: the cache will be invalidated when changes to a dependent table are committed.
- Does the function have any session-specific dependencies, such as reliance on NLS parameter settings? If so, then the cached values might not match the session-dependent values. A good example is using TO_CHAR in your function without a format mask. VPD and reliance on system context values is another. Generally the solution for this is to move all such dependencies to the parameter list.
FUNCTION betwnstr (
ReplyDeletestring_in IN VARCHAR2
, start_in IN INTEGER
, end_in IN INTEGER
)
RETURN VARCHAR2 DETERMINISTIC
IS
BEGIN
RETURN (SUBSTR (string_in, start_in, end_in - start_in + 1));
END;
In this above function every time the string input varies the output will also vary, then how this function will be called deterministic. Please explain more. As per definition, for a given input the output will be same is what was mentioned as deterministic. Can you throw some more light into this.
"For a given input"
DeleteIn other words, if I pass in the same three input values "abc" 1 and 2, I will ALWAYS get back "ab".
If your input changes, you get different results, but those results are dependent only on the input values.
The value returned by the function is 100% DETERMINED by the input values and NOTHING ELSE.
Does that help?
Deterministic means that for the same input it has always the same result.
DeleteWith Steven's favorite function, betwnstr ('Steven Feuerstein', 8, 17) will give you Feuerstein; betwnstr ('Steven Feuerstein', 1, 6) will give you Steven.
This actually means that the engine will, during the processing of your statement, store the parameters as key and the result as the value in an associative array like structure.
In this statement:
Select betwnstr ( str, 1, 6 )
from (
select 'Steven Feuerstein' str from dual
union all
select 'João Barreto' str from dual
union all
select 'Steven Feuerstein' str from dual
union all
select 'Steven Feuerstein' str from dual
union all
select 'Steven Feuerstein' str from dual
union all
select 'Steven Feuerstein' str from dual
) data
the function will only be called twice even though Steven appears five times in the set.
João, how absolutely and totally certain are you that the body of the function will be executed precisely twice in the union-all query you show above?
Delete:-)
I tested using logger. I think that even if the engine rewrites the query it will not execute more than twice - unless some degree of parallelism applies?
DeleteWhat if some maniac does;
DeleteTRUNCATE TABLE SYS.DUAL;
>:)
Separate note - In the 'caching impact of a result-cached function' example, I think this;
DBMS_OUTPUT.put_line ('All done ' || n + 1);
Might need to be;
DBMS_OUTPUT.put_line ('All done ' ||TO_CHAR(n + 1));
(just noticed as I got an error running it)
Thanks for catching that. Funnily enough, I hit the error too when running in LiveSQL, but forgot to go back to the post and fix that line. I will do so now.
DeleteAnd if there is "some maniac" that has the ability to connect to SYS and truncate that table...well, you've got bigger problems than impact on function caching."
:-)
A bit more details from me:
ReplyDelete1. Deterministic functions caching and scalar subquery caching mechanisms are based on hash functions.
2. Deterministic caching depends on fetch size(arraysize) – results cached only within one fetch call, while SSC(scalar subquery caching) has no this limitation.
3. Hash collisions depends on the single parameter “_query_execution_cache_max_size” for both mechanizms, but they are more frequent in SSC.
4. Oracle doesn’t keep last result of deterministic functions in case of hash collisions as it does for scalar subquery caching
5. Deterministic functions caching turns off after a certain number of attempts to get the value from the cache. But SSC always returns results from cache if values already cached.
Full my articles with explanations and examples:
https://jonathanlewis.wordpress.com/2015/02/08/functions-subqueries/
http://orasql.org/2014/03/31/deterministic-functions-result_cache-and-operators/
Best regards,
Sayan Malakshinov
Thanks for all this information, Sayan.
DeleteI have passed it along to the developer who writes the C code behind this stuff. I am sure he will enjoy it greatly!
And one please from me: Steven, please tell people more often about operators and statement level consistency. I see too often user-defined functions which return SELECTs results and used in other queries, and then they suddenly get inconsistent results when somebody changes data at the same time. Still a lot of oracle developers do not know about user-defined operators...
ReplyDeleteShort example from me: http://orasql.org/2019/12/30/pl-sql-functions-and-statement-level-consistency/
Sayan Malakshinov
Thanks for pointing that out, Sayan. I will tweet about that today to help get the word out.
DeleteJust to be strict about avoiding implicit conversion, your function should have the dbms_output line updated to convert the numeric to a string:
ReplyDeleteDBMS_OUTPUT.put_line ('pass_number executed for ' || to_char(i));
Absolutely correct, Jaramill. Thanks!
DeleteAlso Steven, your output shows the first line as "All done 5", when it really is "All done 6".
ReplyDeleteAlso the TO_CHAR function should be applied to the pass_number function output as it is NUMBER and thus it will be convert to a character, to then be proper for DBMS_OUTPUT so that Oracle doesn't have to implicitly convert it.