Implicit vs Explicit Cursor: Which should you use?

My post on Conditional Logic in PL/SQL generated some interesting discussion concerning whether to use an implicit cursor (SELECT-INTO) or an explicit cursor (OPEN-FETCH-CLOSE) when fetching a single row of data.

"Hmmm," you might be saying to yourself, "a conversation about SQL queries in a post on conditional logic? That's strange."

Ah, but maybe not so strange. Because as one person put it in the comments:

Yes, this is true. You can use exception handling as a kind of conditional branching logic. But should you? Generally, no. It's a good idea to keep application logic out of exception sections. Developers don't expect or look for business logic in those sections. They are, after all, for dealing with exceptions - errors. So if you put lots of code in EXCEPTION WHEN clauses, they can be easily overlooked, making your code harder to debug and enhance.

But that question - and the recommendation to avoid the exception section - then led to other comments that led to a question we circle back to over and over again in the world of Oracle Database programming:

If I am fetching (at most) a single row, should I use an implicit cursor (SELECT-INTO) or an explicit cursor (OPEN-FETCH-CLOSE)?

How do those questions link up? Quite simply, if you use SELECT-INTO to fetch a single row, then if no row is found, the PL/SQL engine raise the NO_DATA_FOUND exception. If more than one row is found, TOO_MANY_ROWS is raised. Let's show you that before I continue (all code in this post available in LiveSQL).

First of all, for this blog post, I will work with the following table and data:

CREATE TABLE not_much_stuff (n NUMBER)
/

INSERT INTO not_much_stuff
       SELECT LEVEL
         FROM DUAL
   CONNECT BY LEVEL < 11
/

Here's a block of code using implicit cursors, trapping exceptions, and displaying the error:

SQL> DECLARE
  2     my_n   not_much_stuff.n%TYPE;
  3  BEGIN
  4     DBMS_OUTPUT.put_line ('No rows found:');
  5  
  6     BEGIN
  7        SELECT n
  8          INTO my_n
  9          FROM not_much_stuff
 10         WHERE n = -1;
 11     EXCEPTION
 12        WHEN NO_DATA_FOUND
 13        THEN
 14           DBMS_OUTPUT.put_line (DBMS_UTILITY.format_error_stack);
 15     END;
 16  
 17     DBMS_OUTPUT.put_line ('Too many rows found:');
 18  
 19     BEGIN
 20        SELECT n
 21          INTO my_n
 22          FROM not_much_stuff
 23         WHERE n BETWEEN 1 AND 10;
 24     EXCEPTION
 25        WHEN TOO_MANY_ROWS
 26        THEN
 27           DBMS_OUTPUT.put_line (DBMS_UTILITY.format_error_stack);
 28     END;
 29  END;
 30  /
No rows found:
ORA-01403: no data found

Too many rows found:
ORA-01422: exact fetch returns more than requested number of rows

So that's how implicit cursors work when it comes to an outcome that is anything but "got you your one row."

OK, back to the question of which to use: implicit or explicit?

As is so often, and frustratingly, the case, the answer is: it depends. :-(

And the "depends" has to do with both performance of the two approaches and specific context in which you are writing the single row fetch code. So first:

Which Is Faster?

In each of the sections below (again, available on LiveSQL for you to run yourselves), I check the performance of implicit and explicit cursors for both finding a single row successfully and not finding any rows.

Note: in the code below, I assign values to the my_n variable and then display its value at the end of the block to ensure that the optimizer doesn't "zero out" my code because it doesn't actually do anything. Smart optimizer. And yes the PL/SQL optimizer works on anonymous blocks as well as stored program units.

No Data Found, Implicit Cursor

SQL> DECLARE
  2     my_n   not_much_stuff.n%TYPE;
  3  BEGIN
  4     FOR indx IN 1 .. 100000
  5     LOOP
  6        BEGIN
  7           SELECT n
  8             INTO my_n
  9             FROM not_much_stuff
 10            WHERE N = -1;
 11           my_n := 100;
 12        EXCEPTION
 13           WHEN NO_DATA_FOUND
 14           THEN
 15              my_n := 100;
 16        END;
 17     END LOOP;
 18  
 19     DBMS_OUTPUT.put_line (my_n);
 20  END;
 21  /
Elapsed: 00:00:06.372

One Row Found, Implicit Cursor


SQL> DECLARE
  2     my_n   not_much_stuff.n%TYPE;
  3  BEGIN
  4     FOR indx IN 1 .. 100000
  5     LOOP
  6        BEGIN
  7           SELECT n
  8             INTO my_n
  9             FROM not_much_stuff
 10            WHERE N = 1;
 11           my_n := 100;
 12        EXCEPTION
 13           WHEN NO_DATA_FOUND
 14           THEN
 15              my_n := 100;
 16        END;
 17     END LOOP;
 18  
 19     DBMS_OUTPUT.put_line (my_n);
 20  END;
 21  /
Elapsed: 00:00:04.703

No Data Found, Explicit Cursor


SQL> DECLARE
  2     my_n   not_much_stuff.n%TYPE;
  3  
  4     CURSOR stuff_cur
  5     IS
  6        SELECT n
  7          FROM not_much_stuff
  8         WHERE n = -1;
  9  BEGIN
 10     FOR indx IN 1 .. 100000
 11     LOOP
 12        OPEN stuff_cur;
 13  
 14        FETCH stuff_cur INTO my_n;
 15  
 16        IF stuff_cur%NOTFOUND
 17        THEN
 18           my_n := 100;
 19        END IF;
 20  
 21        CLOSE stuff_cur;
 22     END LOOP;
 23  
 24     DBMS_OUTPUT.put_line (my_n);
 25  END;
 26  /

Elapsed: 00:00:04.703

One Row Found, Explicit Cursor


SQL> DECLARE
  2     my_n   not_much_stuff.n%TYPE;
  3  
  4     CURSOR stuff_cur
  5     IS
  6        SELECT n
  7          FROM not_much_stuff
  8         WHERE n = 1;
  9  BEGIN
 10     FOR indx IN 1 .. 100000
 11     LOOP
 12        OPEN stuff_cur;
 13  
 14        FETCH stuff_cur INTO my_n;
 15  
 16        IF stuff_cur%FOUND
 17        THEN
 18           my_n := 100;
 19        END IF;
 20  
 21        CLOSE stuff_cur;
 22     END LOOP;
 23  
 24     DBMS_OUTPUT.put_line (my_n);
 25  END;
 26  /
Elapsed: 00:00:05.209

So the take-away from these relatively superficial but reliably consistent performance comparisons:

When the query successfully finds just one row, SELECT-INTO is the most efficient approach.
When the query fails to find a row, the explicit cursor is more efficient.

The reason is simple: with the explicit cursor, you avoid the raising of an exception. And in PL/SQL, the raising and handling of an exception is relatively expensive. Combine that overhead, with the desire to avoid application logic in the exception, and you can see why developers would say on that same blog post:

And so now we come to the crux of the "it depends" answer to "Which should I use?"

Is your query usually going to successfully find a single row or not?

If you expect the query to find just one row most of the time, use SELECT-INTO but wrap it in its own function or nested block, trap NO_DATA_FOUND. In the handler, don't write application code; instead set a variable's value so that the enclosing block can determine the next appropriate step.

If you expect the query to fail to find a row a lot of the time, then consider using an explicit cursor and the %NOTFOUND cursor attribute to identify a "no data found" scenario. Then take appropriate action.

Here are some patterns based on the above recommendation and the standard HR employees table.

1. Implicit cursor inside a nested block

I narrow the propagation of the NO_DATA_FOUND exception, and then take appropriate action. Here, I simply ignore the problem and let the subsequence IF statement say "We're done if no row was found." You might call an error logging procedure if a row really should have been there - and then re-raise the exception.

PROCEDURE do_stuff_with_employee (
   employee_id_in   IN employees.employee_id%TYPE)
IS
   l_name   employees.last_name%TYPE;
BEGIN
   BEGIN
      SELECT last_name
        INTO l_name
        FROM employees e
       WHERE e.employee_id = do_stuff_with_employee.employee_id_in;
   EXCEPTION
      WHEN NO_DATA_FOUND
      THEN
         /* log the error if this really is an error or let it go... */
         l_name := NULL;
   END;

   IF l_name IS NOT NULL
   THEN
      /* continue with application logic */
      NULL;
   END IF;
END;

2. Implicit cursor inside a nested subprogram

I move all that code into its own nested subprogram (or you could put it in a package so it could be used by more than one block). Again, you need to decide what you want to do about NO_DATA_FOUND.

PROCEDURE do_stuff_with_employee (
   employee_id_in   IN employees.employee_id%TYPE)
IS
   l_name   employees.last_name%TYPE;

   FUNCTION emp_name (employee_id_in IN employees.employee_id%TYPE)
      RETURN employees.last_name%TYPE
   IS
      l_name   employees.last_name%TYPE;
   BEGIN
      SELECT last_name
        INTO l_name
        FROM employees
       WHERE employee_id = employee_id_in;

      RETURN l_name;
   EXCEPTION
      WHEN NO_DATA_FOUND
      THEN
         /* log the error if this really is an error or let it go... */
         RETURN NULL;
   END;
BEGIN
   l_name := emp_name (employee_id_in);

   IF l_name IS NOT NULL
   THEN
      /* continue with application logic */
      NULL;
   END IF;
END;

3. Explicit cursor unconcerned with too many rows

With the explicit cursor, I fetch once and then proceed. I don't have to worry about NO_DATA_FOUND being raised, and the IF statement ensures that I do nothing if no (non-null) value was returned.

PROCEDURE do_stuff_with_employee (
   employee_id_in   IN employees.employee_id%TYPE)
IS
   l_name   employees.last_name%TYPE;

   CURSOR name_cur
   IS
      SELECT last_name
        FROM employees e
       WHERE e.employee_id = do_stuff_with_employee.employee_id_in;
BEGIN
   OPEN name_cur;

   FETCH name_cur INTO l_name;

   CLOSE name_cur;

   IF l_name IS NOT NULL
   THEN
      /* continue with application logic */
      NULL;
   END IF;
END;

4. Explicit cursor that checks for too many rows

But perhaps I need to know if there were > 1 rows found. In this case, fetch a second time and then raise TOO_MANY_ROWS if a row was found. Else continue on.

PROCEDURE do_stuff_with_employee (
   employee_id_in   IN employees.employee_id%TYPE)
IS
   l_name    employees.last_name%TYPE;
   l_name2   employees.last_name%TYPE;

   CURSOR name_cur
   IS
      SELECT last_name
        FROM employees e
       WHERE e.employee_id = do_stuff_with_employee.employee_id_in;
BEGIN
   OPEN name_cur;

   FETCH name_cur INTO l_name;

   FETCH name_cur INTO l_name2;

   IF name_cur%FOUND
   THEN
      CLOSE name_cur;

      RAISE TOO_MANY_ROWS;
   ELSE
      CLOSE name_cur;
   END IF;

   IF l_name IS NOT NULL
   THEN
      /* continue with application logic */
      NULL;
   END IF;
END;

As is so often the case in programming and life in the real world, a very simple (or simply phrased) question can lead to a complicated, nuanced answer. That is the case with fetching single rows. You need to understand the context and the patterns of data being evaluated by the cursor. From that you make your determination.

But I will close with this very simple piece of advice: Opt for the implicit cursor (SELECT-INTO) as your default choice, and then switch to the explicit cursor only when needed.

Ha! I lied. I am not done. A developer posted this comment on LinkedIn:

I never, ever use OPEN/FETCH except for declared cursor types. Why don't you mention cursor loops? Cursor loops are more succinct, have a well-defined scope, and are much clearer to read.

So let's take a look at using a cursor FOR loop (CFL) instead of SELECT-INTO or OPEN-FETCH-CLOSE. A CFL is another kind of implicit cursor, since you provide the SELECT (which could be defined as an explicit cursor, just to make things confusing) and Oracle implicitly opens, fetches from and closes the cursor. If you are not very familiar with cursor FOR loops, check out the doc.

CFLs are designed (intended) for fetching multiple rows from a cursor. But you could use it to fetch just a single row. Should you?

First, from a performance standpoint, CFLs are very efficient:

Cursor FOR Loop No Rows Found

SQL> DECLARE 
  2     my_n   not_much_stuff.n%TYPE; 
  3  BEGIN 
  4  
  5     FOR indx IN 1 .. 100000
  6     LOOP 
  7        FOR rec IN (SELECT n 
  8                      FROM not_much_stuff 
  9                     WHERE n = -1) 
 10        LOOP 
 11           my_n := rec.n; 
 12        END LOOP; 
 13     END LOOP; 
 14   
 15     DBMS_OUTPUT.put_line (my_n); 
 16  END;
 17  /

Elapsed: 00:00:04.560

Cursor FOR Loop One Row Found

SQL> DECLARE 
  2     my_n   not_much_stuff.n%TYPE; 
  3  BEGIN 
  4     FOR indx IN 1 .. 100000
  5     LOOP 
  6        FOR rec IN (SELECT n 
  7                      FROM not_much_stuff 
  8                     WHERE n = 1) 
  9        LOOP 
 10           my_n := rec.n; 
 11        END LOOP; 
 12     END LOOP; 
 13   
 14     DBMS_OUTPUT.put_line (my_n); 
 15  END;
 16  /

Elapsed: 00:00:04.685

CFLS compare very favorably to the SELECT-INTO performance. And you don't have to worry about NO_DATA_FOUND or TOO_MANY_ROWS being raised. If the CFL finds no rows, it does not thing. If it finds more than one, it fetches each of them.

What, me worry? Well, maybe you should. See, that is also kind of the downside of CFLs. It "hides" those different states of your data. That's fine if, say:

You know with 100% certainty that there will always only be one row, and/or....
You don't care if you happen to fetch more than one row, and/or....
You don't care if you don't fetch any rows.

Sadly, there is so little we can point to in our lives, in our code, in our databases that lends itself to 100% certainty. And that's one of the reasons I do not use CFLs for single row fetches.

Another reason I avoid them is that I like my code to be, as much as possible, self-documenting (and believe me: I fail regularly to achieve this!), to tell a story that is easy to follow and easy to modify when user requirements change.

Use of a loop implies that you may - and are expecting to - iterate multiple times.

Use of a cursor FOR loop implies that you may - and are expecting to - fetch more than one row.

To use a CFL when you know - and when the SELECT statement clearly reflects - that at most one row will ever be fetched seems to me to invite confusion and possible problems for future developers working with this code.

Comments

Kevan GellingDecember 19, 2017 at 1:45 AM
Here's my enhancement request:

SELECT n
INTO LAX my_n
FROM not_much_stuff
WHERE n = -1;

IF SQL%NOTFOUND THEN
my_n := 100;
END IF ;
ReplyDelete
Replies
AnonymousDecember 19, 2017 at 11:45 AM
I normally use SELECT INTO because I find it makes the code simpler (no need to declare a cursor) and easier to understand.
I use exception handlers to check for NO_DATA_FOUND and TOO_MANY_ROWS and either set the variable value in there or set a flag (usually a Boolean variable) to indicate that an exception occurred. After the block I check the value of the variable or flag to determine what action to take.
ReplyDelete
Replies
AnonymousDecember 20, 2017 at 10:48 AM
I wander, what was the optimize level during benchmark? 0? I always use 0 to avoid any optimizations in the background.

In my opinion, implicit vs explicit cursor vs CFL is more about code readability and usage convenience. Explicit cursor's performance overhead over implicit cursor is 1sec for 100000 iterations -- that's not a concern, I think.

I never use select...into, because of that no_data_found exception handling which, in my opinion, adds some clutter to the code. And of course I never know, if I find the record or no, so catching no_data_found is mandatory. I prefer open-fetch-close, which also allows using %rowtype.
Additionally, I suppose PL/SQL optimizer will try to replace my open-fetch-close with a select-into and exception handling during optimization process.

Using CFL for queries which return a single record is very misleading. CFL gives you a clue, that query returns several records. But when you look at the table structure and the query where clause, you see that a single record is always returned. That's weird.
I think the performance of CFL is good because PL/SQL optimizer replaces them with bulk collect, which is extremely fast.
ReplyDelete
Replies
iudithDecember 23, 2017 at 5:14 AM
Hello All,

And, just to show how real life is always more complicated than pure theory,
I remember a case where I knew from the beginning that a SELECT is likely to return several
rows, but wanted to accomplish BOTH of the following:

1. Returning column values from the first result row, as by an ORDER BY included
in the SELECT
2. Letting the user know that there was more than one row that satisfied the query,
a situation that usually invited an additional check for any possible data errors
and/or choosing a different row than the (first) one, chosen implicitly.

The specific table was a kind of setup table, where in rare cases might have contained
"too many rows" cases, which were still logically correct for some limit cases,
here involving validity dates.

The solution was, of course, an explicit cursor loop, where the first row fetched was used
to populate the output variables, and a second fetch for setting a "too many rows" flag.

This is preferable to the alternative of executing two separate SELECT-s,
one returning column values from a single row and the other one a COUNT(*).

Happy holidays and a Very Happy New Year 2018 for you all :)

Cheers & Best Regards,
Iudith

ReplyDelete
Replies
UnknownDecember 28, 2017 at 9:16 AM
I have packaged procedures that perform a test for existence. So I might have something like this:
SELECT COUNT(*) cnt
INTO v_cnt
FROM t.
WHERE t.pkey = p_key
AND ROW=1;

RETURN(v_cnt);

Guaranteed to return a 0 or 1.
No WHEN NO_DATA_FOUND handler needed, since it's a simple aggregate.
ReplyDelete
Replies
Edgar CoronaApril 4, 2018 at 12:40 PM
Hola todos, estuve haciendo unas pruebas con lo leído, buscando una conclusión para mi experimento fue armar 3 escenarios con 1000.000 de interacciones.

NOFETCH
a) implícito - 31seg
b) explicito - 27seg
c) cfl - 17 seg

FETCH
a) implícito - 16seg
b) explicito - 19seg
c) cfl - 23 seg

Al final los tres métodos son igual de eficientes solo se deben aplicar de la manera correcta frente a cada situación.

Como referencia Steven cambiar select-into por open-fetch-close o cfl, no es la respuesta correcta hay que analizar bien la situación de cada caso.
ReplyDelete
Replies

Add comment

Obsessed with Oracle PL/SQL

Search This Blog

Implicit vs Explicit Cursor: Which should you use?

Labels

Comments

Post a Comment

Popular posts from this blog

Running out of PGA memory with MULTISET ops? Watch out for DISTINCT!

How to Pick the Limit for BULK COLLECT

Why DBMS_OUTPUT.PUT_LINE should not be in your application code