My post on Conditional Logic in PL/SQL generated some interesting discussion concerning whether to use an implicit cursor (SELECT-INTO) or an explicit cursor (OPEN-FETCH-CLOSE) when fetching a single row of data.
"Hmmm," you might be saying to yourself, "a conversation about SQL queries in a post on conditional logic? That's strange."
Ah, but maybe not so strange. Because as one person put it in the comments:
Yes, this is true. You can use exception handling as a kind of conditional branching logic. But should you? Generally, no. It's a good idea to keep application logic out of exception sections. Developers don't expect or look for business logic in those sections. They are, after all, for dealing with exceptions - errors. So if you put lots of code in EXCEPTION WHEN clauses, they can be easily overlooked, making your code harder to debug and enhance.
But that question - and the recommendation to avoid the exception section - then led to other comments that led to a question we circle back to over and over again in the world of Oracle Database programming:
First of all, for this blog post, I will work with the following table and data:
Here's a block of code using implicit cursors, trapping exceptions, and displaying the error:
So that's how implicit cursors work when it comes to an outcome that is anything but "got you your one row."
OK, back to the question of which to use: implicit or explicit?
As is so often, and frustratingly, the case, the answer is: it depends. :-(
And the "depends" has to do with both performance of the two approaches and specific context in which you are writing the single row fetch code. So first:
Which Is Faster?
In each of the sections below (again, available on LiveSQL for you to run yourselves), I check the performance of implicit and explicit cursors for both finding a single row successfully and not finding any rows.
Note: in the code below, I assign values to the my_n variable and then display its value at the end of the block to ensure that the optimizer doesn't "zero out" my code because it doesn't actually do anything. Smart optimizer. And yes the PL/SQL optimizer works on anonymous blocks as well as stored program units.
No Data Found, Implicit Cursor
One Row Found, Implicit Cursor
No Data Found, Explicit Cursor
One Row Found, Explicit Cursor
So the take-away from these relatively superficial but reliably consistent performance comparisons:
"Hmmm," you might be saying to yourself, "a conversation about SQL queries in a post on conditional logic? That's strange."
Ah, but maybe not so strange. Because as one person put it in the comments:
Yes, this is true. You can use exception handling as a kind of conditional branching logic. But should you? Generally, no. It's a good idea to keep application logic out of exception sections. Developers don't expect or look for business logic in those sections. They are, after all, for dealing with exceptions - errors. So if you put lots of code in EXCEPTION WHEN clauses, they can be easily overlooked, making your code harder to debug and enhance.
But that question - and the recommendation to avoid the exception section - then led to other comments that led to a question we circle back to over and over again in the world of Oracle Database programming:
If I am fetching (at most) a single row, should I use an implicit cursor (SELECT-INTO) or an explicit cursor (OPEN-FETCH-CLOSE)?How do those questions link up? Quite simply, if you use SELECT-INTO to fetch a single row, then if no row is found, the PL/SQL engine raise the NO_DATA_FOUND exception. If more than one row is found, TOO_MANY_ROWS is raised. Let's show you that before I continue (all code in this post available in LiveSQL).
First of all, for this blog post, I will work with the following table and data:
CREATE TABLE not_much_stuff (n NUMBER)
/
INSERT INTO not_much_stuff
SELECT LEVEL
FROM DUAL
CONNECT BY LEVEL < 11
/
Here's a block of code using implicit cursors, trapping exceptions, and displaying the error:
SQL> DECLARE
2 my_n not_much_stuff.n%TYPE;
3 BEGIN
4 DBMS_OUTPUT.put_line ('No rows found:');
5
6 BEGIN
7 SELECT n
8 INTO my_n
9 FROM not_much_stuff
10 WHERE n = -1;
11 EXCEPTION
12 WHEN NO_DATA_FOUND
13 THEN
14 DBMS_OUTPUT.put_line (DBMS_UTILITY.format_error_stack);
15 END;
16
17 DBMS_OUTPUT.put_line ('Too many rows found:');
18
19 BEGIN
20 SELECT n
21 INTO my_n
22 FROM not_much_stuff
23 WHERE n BETWEEN 1 AND 10;
24 EXCEPTION
25 WHEN TOO_MANY_ROWS
26 THEN
27 DBMS_OUTPUT.put_line (DBMS_UTILITY.format_error_stack);
28 END;
29 END;
30 /
No rows found:
ORA-01403: no data found
Too many rows found:
ORA-01422: exact fetch returns more than requested number of rows
So that's how implicit cursors work when it comes to an outcome that is anything but "got you your one row."
OK, back to the question of which to use: implicit or explicit?
As is so often, and frustratingly, the case, the answer is: it depends. :-(
And the "depends" has to do with both performance of the two approaches and specific context in which you are writing the single row fetch code. So first:
Which Is Faster?
In each of the sections below (again, available on LiveSQL for you to run yourselves), I check the performance of implicit and explicit cursors for both finding a single row successfully and not finding any rows.
Note: in the code below, I assign values to the my_n variable and then display its value at the end of the block to ensure that the optimizer doesn't "zero out" my code because it doesn't actually do anything. Smart optimizer. And yes the PL/SQL optimizer works on anonymous blocks as well as stored program units.
No Data Found, Implicit Cursor
SQL> DECLARE
2 my_n not_much_stuff.n%TYPE;
3 BEGIN
4 FOR indx IN 1 .. 100000
5 LOOP
6 BEGIN
7 SELECT n
8 INTO my_n
9 FROM not_much_stuff
10 WHERE N = -1;
11 my_n := 100;
12 EXCEPTION
13 WHEN NO_DATA_FOUND
14 THEN
15 my_n := 100;
16 END;
17 END LOOP;
18
19 DBMS_OUTPUT.put_line (my_n);
20 END;
21 /
Elapsed: 00:00:06.372
One Row Found, Implicit Cursor
SQL> DECLARE
2 my_n not_much_stuff.n%TYPE;
3 BEGIN
4 FOR indx IN 1 .. 100000
5 LOOP
6 BEGIN
7 SELECT n
8 INTO my_n
9 FROM not_much_stuff
10 WHERE N = 1;
11 my_n := 100;
12 EXCEPTION
13 WHEN NO_DATA_FOUND
14 THEN
15 my_n := 100;
16 END;
17 END LOOP;
18
19 DBMS_OUTPUT.put_line (my_n);
20 END;
21 /
Elapsed: 00:00:04.703
No Data Found, Explicit Cursor
SQL> DECLARE
2 my_n not_much_stuff.n%TYPE;
3
4 CURSOR stuff_cur
5 IS
6 SELECT n
7 FROM not_much_stuff
8 WHERE n = -1;
9 BEGIN
10 FOR indx IN 1 .. 100000
11 LOOP
12 OPEN stuff_cur;
13
14 FETCH stuff_cur INTO my_n;
15
16 IF stuff_cur%NOTFOUND
17 THEN
18 my_n := 100;
19 END IF;
20
21 CLOSE stuff_cur;
22 END LOOP;
23
24 DBMS_OUTPUT.put_line (my_n);
25 END;
26 /
Elapsed: 00:00:04.703
One Row Found, Explicit Cursor
SQL> DECLARE
2 my_n not_much_stuff.n%TYPE;
3
4 CURSOR stuff_cur
5 IS
6 SELECT n
7 FROM not_much_stuff
8 WHERE n = 1;
9 BEGIN
10 FOR indx IN 1 .. 100000
11 LOOP
12 OPEN stuff_cur;
13
14 FETCH stuff_cur INTO my_n;
15
16 IF stuff_cur%FOUND
17 THEN
18 my_n := 100;
19 END IF;
20
21 CLOSE stuff_cur;
22 END LOOP;
23
24 DBMS_OUTPUT.put_line (my_n);
25 END;
26 /
Elapsed: 00:00:05.209
So the take-away from these relatively superficial but reliably consistent performance comparisons:
- When the query successfully finds just one row, SELECT-INTO is the most efficient approach.
- When the query fails to find a row, the explicit cursor is more efficient.
The reason is simple: with the explicit cursor, you avoid the raising of an exception. And in PL/SQL, the raising and handling of an exception is relatively expensive. Combine that overhead, with the desire to avoid application logic in the exception, and you can see why developers would say on that same blog post:
And so now we come to the crux of the "it depends" answer to "Which should I use?"
Is your query usually going to successfully find a single row or not?
If you expect the query to find just one row most of the time, use SELECT-INTO but wrap it in its own function or nested block, trap NO_DATA_FOUND. In the handler, don't write application code; instead set a variable's value so that the enclosing block can determine the next appropriate step.
If you expect the query to fail to find a row a lot of the time, then consider using an explicit cursor and the %NOTFOUND cursor attribute to identify a "no data found" scenario. Then take appropriate action.
Here are some patterns based on the above recommendation and the standard HR employees table.
1. Implicit cursor inside a nested block
I narrow the propagation of the NO_DATA_FOUND exception, and then take appropriate action. Here, I simply ignore the problem and let the subsequence IF statement say "We're done if no row was found." You might call an error logging procedure if a row really should have been there - and then re-raise the exception.
2. Implicit cursor inside a nested subprogram
I move all that code into its own nested subprogram (or you could put it in a package so it could be used by more than one block). Again, you need to decide what you want to do about NO_DATA_FOUND.
3. Explicit cursor unconcerned with too many rows
With the explicit cursor, I fetch once and then proceed. I don't have to worry about NO_DATA_FOUND being raised, and the IF statement ensures that I do nothing if no (non-null) value was returned.
4. Explicit cursor that checks for too many rows
But perhaps I need to know if there were > 1 rows found. In this case, fetch a second time and then raise TOO_MANY_ROWS if a row was found. Else continue on.
As is so often the case in programming and life in the real world, a very simple (or simply phrased) question can lead to a complicated, nuanced answer. That is the case with fetching single rows. You need to understand the context and the patterns of data being evaluated by the cursor. From that you make your determination.
But I will close with this very simple piece of advice: Opt for the implicit cursor (SELECT-INTO) as your default choice, and then switch to the explicit cursor only when needed.
Ha! I lied. I am not done. A developer posted this comment on LinkedIn:
CFLs are designed (intended) for fetching multiple rows from a cursor. But you could use it to fetch just a single row. Should you?
First, from a performance standpoint, CFLs are very efficient:
Cursor FOR Loop No Rows Found
Cursor FOR Loop One Row Found
CFLS compare very favorably to the SELECT-INTO performance. And you don't have to worry about NO_DATA_FOUND or TOO_MANY_ROWS being raised. If the CFL finds no rows, it does not thing. If it finds more than one, it fetches each of them.
What, me worry? Well, maybe you should. See, that is also kind of the downside of CFLs. It "hides" those different states of your data. That's fine if, say:
If you expect the query to fail to find a row a lot of the time, then consider using an explicit cursor and the %NOTFOUND cursor attribute to identify a "no data found" scenario. Then take appropriate action.
Here are some patterns based on the above recommendation and the standard HR employees table.
1. Implicit cursor inside a nested block
I narrow the propagation of the NO_DATA_FOUND exception, and then take appropriate action. Here, I simply ignore the problem and let the subsequence IF statement say "We're done if no row was found." You might call an error logging procedure if a row really should have been there - and then re-raise the exception.
PROCEDURE do_stuff_with_employee (
employee_id_in IN employees.employee_id%TYPE)
IS
l_name employees.last_name%TYPE;
BEGIN
BEGIN
SELECT last_name
INTO l_name
FROM employees e
WHERE e.employee_id = do_stuff_with_employee.employee_id_in;
EXCEPTION
WHEN NO_DATA_FOUND
THEN
/* log the error if this really is an error or let it go... */
l_name := NULL;
END;
IF l_name IS NOT NULL
THEN
/* continue with application logic */
NULL;
END IF;
END;
2. Implicit cursor inside a nested subprogram
I move all that code into its own nested subprogram (or you could put it in a package so it could be used by more than one block). Again, you need to decide what you want to do about NO_DATA_FOUND.
PROCEDURE do_stuff_with_employee (
employee_id_in IN employees.employee_id%TYPE)
IS
l_name employees.last_name%TYPE;
FUNCTION emp_name (employee_id_in IN employees.employee_id%TYPE)
RETURN employees.last_name%TYPE
IS
l_name employees.last_name%TYPE;
BEGIN
SELECT last_name
INTO l_name
FROM employees
WHERE employee_id = employee_id_in;
RETURN l_name;
EXCEPTION
WHEN NO_DATA_FOUND
THEN
/* log the error if this really is an error or let it go... */
RETURN NULL;
END;
BEGIN
l_name := emp_name (employee_id_in);
IF l_name IS NOT NULL
THEN
/* continue with application logic */
NULL;
END IF;
END;
3. Explicit cursor unconcerned with too many rows
With the explicit cursor, I fetch once and then proceed. I don't have to worry about NO_DATA_FOUND being raised, and the IF statement ensures that I do nothing if no (non-null) value was returned.
PROCEDURE do_stuff_with_employee (
employee_id_in IN employees.employee_id%TYPE)
IS
l_name employees.last_name%TYPE;
CURSOR name_cur
IS
SELECT last_name
FROM employees e
WHERE e.employee_id = do_stuff_with_employee.employee_id_in;
BEGIN
OPEN name_cur;
FETCH name_cur INTO l_name;
CLOSE name_cur;
IF l_name IS NOT NULL
THEN
/* continue with application logic */
NULL;
END IF;
END;
4. Explicit cursor that checks for too many rows
But perhaps I need to know if there were > 1 rows found. In this case, fetch a second time and then raise TOO_MANY_ROWS if a row was found. Else continue on.
PROCEDURE do_stuff_with_employee (
employee_id_in IN employees.employee_id%TYPE)
IS
l_name employees.last_name%TYPE;
l_name2 employees.last_name%TYPE;
CURSOR name_cur
IS
SELECT last_name
FROM employees e
WHERE e.employee_id = do_stuff_with_employee.employee_id_in;
BEGIN
OPEN name_cur;
FETCH name_cur INTO l_name;
FETCH name_cur INTO l_name2;
IF name_cur%FOUND
THEN
CLOSE name_cur;
RAISE TOO_MANY_ROWS;
ELSE
CLOSE name_cur;
END IF;
IF l_name IS NOT NULL
THEN
/* continue with application logic */
NULL;
END IF;
END;
As is so often the case in programming and life in the real world, a very simple (or simply phrased) question can lead to a complicated, nuanced answer. That is the case with fetching single rows. You need to understand the context and the patterns of data being evaluated by the cursor. From that you make your determination.
But I will close with this very simple piece of advice: Opt for the implicit cursor (SELECT-INTO) as your default choice, and then switch to the explicit cursor only when needed.
Ha! I lied. I am not done. A developer posted this comment on LinkedIn:
I never, ever use OPEN/FETCH except for declared cursor types. Why don't you mention cursor loops? Cursor loops are more succinct, have a well-defined scope, and are much clearer to read.So let's take a look at using a cursor FOR loop (CFL) instead of SELECT-INTO or OPEN-FETCH-CLOSE. A CFL is another kind of implicit cursor, since you provide the SELECT (which could be defined as an explicit cursor, just to make things confusing) and Oracle implicitly opens, fetches from and closes the cursor. If you are not very familiar with cursor FOR loops, check out the doc.
CFLs are designed (intended) for fetching multiple rows from a cursor. But you could use it to fetch just a single row. Should you?
First, from a performance standpoint, CFLs are very efficient:
Cursor FOR Loop No Rows Found
SQL> DECLARE
2 my_n not_much_stuff.n%TYPE;
3 BEGIN
4
5 FOR indx IN 1 .. 100000
6 LOOP
7 FOR rec IN (SELECT n
8 FROM not_much_stuff
9 WHERE n = -1)
10 LOOP
11 my_n := rec.n;
12 END LOOP;
13 END LOOP;
14
15 DBMS_OUTPUT.put_line (my_n);
16 END;
17 /
Elapsed: 00:00:04.560
Cursor FOR Loop One Row Found
SQL> DECLARE
2 my_n not_much_stuff.n%TYPE;
3 BEGIN
4 FOR indx IN 1 .. 100000
5 LOOP
6 FOR rec IN (SELECT n
7 FROM not_much_stuff
8 WHERE n = 1)
9 LOOP
10 my_n := rec.n;
11 END LOOP;
12 END LOOP;
13
14 DBMS_OUTPUT.put_line (my_n);
15 END;
16 /
Elapsed: 00:00:04.685
CFLS compare very favorably to the SELECT-INTO performance. And you don't have to worry about NO_DATA_FOUND or TOO_MANY_ROWS being raised. If the CFL finds no rows, it does not thing. If it finds more than one, it fetches each of them.
What, me worry? Well, maybe you should. See, that is also kind of the downside of CFLs. It "hides" those different states of your data. That's fine if, say:
- You know with 100% certainty that there will always only be one row, and/or....
- You don't care if you happen to fetch more than one row, and/or....
- You don't care if you don't fetch any rows.
Sadly, there is so little we can point to in our lives, in our code, in our databases that lends itself to 100% certainty. And that's one of the reasons I do not use CFLs for single row fetches.
Another reason I avoid them is that I like my code to be, as much as possible, self-documenting (and believe me: I fail regularly to achieve this!), to tell a story that is easy to follow and easy to modify when user requirements change.
Use of a loop implies that you may - and are expecting to - iterate multiple times.
Use of a cursor FOR loop implies that you may - and are expecting to - fetch more than one row.
To use a CFL when you know - and when the SELECT statement clearly reflects - that at most one row will ever be fetched seems to me to invite confusion and possible problems for future developers working with this code.
Use of a loop implies that you may - and are expecting to - iterate multiple times.
Use of a cursor FOR loop implies that you may - and are expecting to - fetch more than one row.
To use a CFL when you know - and when the SELECT statement clearly reflects - that at most one row will ever be fetched seems to me to invite confusion and possible problems for future developers working with this code.
Here's my enhancement request:
ReplyDeleteSELECT n
INTO LAX my_n
FROM not_much_stuff
WHERE n = -1;
IF SQL%NOTFOUND THEN
my_n := 100;
END IF ;
Interesting idea, perhaps you should submit it as an enhancement idea on OTN?
Deletehttps://community.oracle.com/community/database/database-ideas
Would you also suppress raising TOO_MANY_ROWS?
If I'm expecting multiple rows then I'll use a FOR LOOP. If I'm not expecting multiple rows and then I'd welcome an exception.
DeleteMy preferred way to do a single row lookup is to (mis)use MIN or MAX.
DeleteSELECT MIN(n)
INTO my_n
FROM not_much_stuff
WHERE n = -1;
This way NO_DATA_FOUND results in a NULL value instead of an exception.
TOO_MANY_ROWS is also suppressed but a COUNT(*) can easily be added if this is a concern.
I normally use SELECT INTO because I find it makes the code simpler (no need to declare a cursor) and easier to understand.
ReplyDeleteI use exception handlers to check for NO_DATA_FOUND and TOO_MANY_ROWS and either set the variable value in there or set a flag (usually a Boolean variable) to indicate that an exception occurred. After the block I check the value of the variable or flag to determine what action to take.
I cannot find any flaw in your approach. :-)
DeleteI wander, what was the optimize level during benchmark? 0? I always use 0 to avoid any optimizations in the background.
ReplyDeleteIn my opinion, implicit vs explicit cursor vs CFL is more about code readability and usage convenience. Explicit cursor's performance overhead over implicit cursor is 1sec for 100000 iterations -- that's not a concern, I think.
I never use select...into, because of that no_data_found exception handling which, in my opinion, adds some clutter to the code. And of course I never know, if I find the record or no, so catching no_data_found is mandatory. I prefer open-fetch-close, which also allows using %rowtype.
Additionally, I suppose PL/SQL optimizer will try to replace my open-fetch-close with a select-into and exception handling during optimization process.
Using CFL for queries which return a single record is very misleading. CFL gives you a clue, that query returns several records. But when you look at the table structure and the query where clause, you see that a single record is always returned. That's weird.
I think the performance of CFL is good because PL/SQL optimizer replaces them with bulk collect, which is extremely fast.
Optimization level was 3.
DeleteTrue, performance diff between select-into and open-fetch-close is small. But it could accumulate to noticeable effect - both in terms of # of times a specific SELECT is run and also the # of times many different single row fetches are run.
The PL/SQL optimizer definitely does NOT try to replace my open-fetch-close with a select-into and exception handling. Cannot safely do that.
Thanks for reply. I'm always curious, why things are implemented one way or another. Interesting - why PL/SQL has select-into and open-fetch?
DeleteWhich one was the first one implemented and why the second command was created? It would be nice to hear.
Hello All,
ReplyDeleteAnd, just to show how real life is always more complicated than pure theory,
I remember a case where I knew from the beginning that a SELECT is likely to return several
rows, but wanted to accomplish BOTH of the following:
1. Returning column values from the first result row, as by an ORDER BY included
in the SELECT
2. Letting the user know that there was more than one row that satisfied the query,
a situation that usually invited an additional check for any possible data errors
and/or choosing a different row than the (first) one, chosen implicitly.
The specific table was a kind of setup table, where in rare cases might have contained
"too many rows" cases, which were still logically correct for some limit cases,
here involving validity dates.
The solution was, of course, an explicit cursor loop, where the first row fetched was used
to populate the output variables, and a second fetch for setting a "too many rows" flag.
This is preferable to the alternative of executing two separate SELECT-s,
one returning column values from a single row and the other one a COUNT(*).
Happy holidays and a Very Happy New Year 2018 for you all :)
Cheers & Best Regards,
Iudith
In such cases I use explicit cursor + analytic count(*) over().
DeleteHave you tried this way? Setup tables usually are small thus count(*) over all records is not a performance killer.
Pavel
If you need the number of rows in the result set it might be an option to include a COUNT (*) OVER ... No need for two separate SELECTs and if you have a lot of rows you can insert the information for log ops - "n of m rows processed"
DeleteHello All,
DeleteYes, of course you are right :)
But I am an "old wolf" ... and that solution was born much before the analytic functions ... still back in the Oracle7 days :)
At least, I suddenly remembered it while reading this thread :)
Best Regards,
Iudith
I have packaged procedures that perform a test for existence. So I might have something like this:
ReplyDeleteSELECT COUNT(*) cnt
INTO v_cnt
FROM t.
WHERE t.pkey = p_key
AND ROW=1;
RETURN(v_cnt);
Guaranteed to return a 0 or 1.
No WHEN NO_DATA_FOUND handler needed, since it's a simple aggregate.
Hello,
DeleteYou probably meant "ROWNUM = 1", which is indeed a commonly used way to limit the result to at most one row :)
Cheers & Best Regards,
Iudith
Hola todos, estuve haciendo unas pruebas con lo leÃdo, buscando una conclusión para mi experimento fue armar 3 escenarios con 1000.000 de interacciones.
ReplyDeleteNOFETCH
a) implÃcito - 31seg
b) explicito - 27seg
c) cfl - 17 seg
FETCH
a) implÃcito - 16seg
b) explicito - 19seg
c) cfl - 23 seg
Al final los tres métodos son igual de eficientes solo se deben aplicar de la manera correcta frente a cada situación.
Como referencia Steven cambiar select-into por open-fetch-close o cfl, no es la respuesta correcta hay que analizar bien la situación de cada caso.