Implicit vs Explicit Cursor: Which should you use?

My post on Conditional Logic in PL/SQL generated some interesting discussion concerning whether to use an implicit cursor (SELECT-INTO) or an explicit cursor (OPEN-FETCH-CLOSE) when fetching a single row of data.

"Hmmm," you might be saying to yourself, "a conversation about SQL queries in a post on conditional logic? That's strange."

Ah, but maybe not so strange. Because as one person put it in the comments:


Yes, this is true. You can use exception handling as a kind of conditional branching logic. But should you? Generally, no. It's a good idea to keep application logic out of exception sections. Developers don't expect or look for business logic in those sections. They are, after all, for dealing with exceptions - errors. So if you put lots of code in EXCEPTION WHEN clauses, they can be easily overlooked, making your code harder to debug and enhance.

But that question - and the recommendation to avoid the exception section - then led to other comments that led to a question we circle back to over and over again in the world of Oracle Database programming:
If I am fetching (at most) a single row, should I use an implicit cursor (SELECT-INTO) or an explicit cursor (OPEN-FETCH-CLOSE)?
How do those questions link up? Quite simply, if you use SELECT-INTO to fetch a single row, then if no row is found, the PL/SQL engine raise the NO_DATA_FOUND exception. If more than one row is found, TOO_MANY_ROWS is raised. Let's show you that before I continue (all code in this post available in LiveSQL).

First of all, for this blog post, I will work with the following table and data:

CREATE TABLE not_much_stuff (n NUMBER)
/

INSERT INTO not_much_stuff
       SELECT LEVEL
         FROM DUAL
   CONNECT BY LEVEL < 11
/

Here's a block of code using implicit cursors, trapping exceptions, and displaying the error:

SQL> DECLARE
  2     my_n   not_much_stuff.n%TYPE;
  3  BEGIN
  4     DBMS_OUTPUT.put_line ('No rows found:');
  5  
  6     BEGIN
  7        SELECT n
  8          INTO my_n
  9          FROM not_much_stuff
 10         WHERE n = -1;
 11     EXCEPTION
 12        WHEN NO_DATA_FOUND
 13        THEN
 14           DBMS_OUTPUT.put_line (DBMS_UTILITY.format_error_stack);
 15     END;
 16  
 17     DBMS_OUTPUT.put_line ('Too many rows found:');
 18  
 19     BEGIN
 20        SELECT n
 21          INTO my_n
 22          FROM not_much_stuff
 23         WHERE n BETWEEN 1 AND 10;
 24     EXCEPTION
 25        WHEN TOO_MANY_ROWS
 26        THEN
 27           DBMS_OUTPUT.put_line (DBMS_UTILITY.format_error_stack);
 28     END;
 29  END;
 30  /
No rows found:
ORA-01403: no data found

Too many rows found:
ORA-01422: exact fetch returns more than requested number of rows

So that's how implicit cursors work when it comes to an outcome that is anything but "got you your one row."

OK, back to the question of which to use: implicit or explicit?

As is so often, and frustratingly, the case, the answer is: it depends. :-(

And the "depends" has to do with both performance of the two approaches and specific context in which you are writing the single row fetch code. So first:

Which Is Faster?

In each of the sections below (again, available on LiveSQL for you to run yourselves), I check the performance of implicit and explicit cursors for both finding a single row successfully and not finding any rows.

Note: in the code below, I assign values to the my_n variable and then display its value at the end of the block to ensure that the optimizer doesn't "zero out" my code because it doesn't actually do anything. Smart optimizer. And yes the PL/SQL optimizer works on anonymous blocks as well as stored program units.

No Data Found, Implicit Cursor

SQL> DECLARE
  2     my_n   not_much_stuff.n%TYPE;
  3  BEGIN
  4     FOR indx IN 1 .. 100000
  5     LOOP
  6        BEGIN
  7           SELECT n
  8             INTO my_n
  9             FROM not_much_stuff
 10            WHERE N = -1;
 11           my_n := 100;
 12        EXCEPTION
 13           WHEN NO_DATA_FOUND
 14           THEN
 15              my_n := 100;
 16        END;
 17     END LOOP;
 18  
 19     DBMS_OUTPUT.put_line (my_n);
 20  END;
 21  /
Elapsed: 00:00:06.372

One Row Found, Implicit Cursor


SQL> DECLARE
  2     my_n   not_much_stuff.n%TYPE;
  3  BEGIN
  4     FOR indx IN 1 .. 100000
  5     LOOP
  6        BEGIN
  7           SELECT n
  8             INTO my_n
  9             FROM not_much_stuff
 10            WHERE N = 1;
 11           my_n := 100;
 12        EXCEPTION
 13           WHEN NO_DATA_FOUND
 14           THEN
 15              my_n := 100;
 16        END;
 17     END LOOP;
 18  
 19     DBMS_OUTPUT.put_line (my_n);
 20  END;
 21  /
Elapsed: 00:00:04.703

No Data Found, Explicit Cursor


SQL> DECLARE
  2     my_n   not_much_stuff.n%TYPE;
  3  
  4     CURSOR stuff_cur
  5     IS
  6        SELECT n
  7          FROM not_much_stuff
  8         WHERE n = -1;
  9  BEGIN
 10     FOR indx IN 1 .. 100000
 11     LOOP
 12        OPEN stuff_cur;
 13  
 14        FETCH stuff_cur INTO my_n;
 15  
 16        IF stuff_cur%NOTFOUND
 17        THEN
 18           my_n := 100;
 19        END IF;
 20  
 21        CLOSE stuff_cur;
 22     END LOOP;
 23  
 24     DBMS_OUTPUT.put_line (my_n);
 25  END;
 26  /

Elapsed: 00:00:04.703

One Row Found, Explicit Cursor


SQL> DECLARE
  2     my_n   not_much_stuff.n%TYPE;
  3  
  4     CURSOR stuff_cur
  5     IS
  6        SELECT n
  7          FROM not_much_stuff
  8         WHERE n = 1;
  9  BEGIN
 10     FOR indx IN 1 .. 100000
 11     LOOP
 12        OPEN stuff_cur;
 13  
 14        FETCH stuff_cur INTO my_n;
 15  
 16        IF stuff_cur%FOUND
 17        THEN
 18           my_n := 100;
 19        END IF;
 20  
 21        CLOSE stuff_cur;
 22     END LOOP;
 23  
 24     DBMS_OUTPUT.put_line (my_n);
 25  END;
 26  /
Elapsed: 00:00:05.209

So the take-away from these relatively superficial but reliably consistent performance comparisons:
  • When the query successfully finds just one row, SELECT-INTO is the most efficient approach.
  • When the query fails to find a row, the explicit cursor is more efficient.
The reason is simple: with the explicit cursor, you avoid the raising of an exception. And in PL/SQL, the raising and handling of an exception is relatively expensive. Combine that overhead, with the desire to avoid application logic in the exception, and you can see why developers would say on that same blog post:
And so now we come to the crux of the "it depends" answer to "Which should I use?"
Is your query usually going to successfully find a single row or not?
If you expect the query to find just one row most of the time, use SELECT-INTO but wrap it in its own function or nested block, trap NO_DATA_FOUND. In the handler, don't write application code; instead set a variable's value so that the enclosing block can determine the next appropriate step.

If you expect the query to fail to find a row a lot of the time, then consider using an explicit cursor and the %NOTFOUND cursor attribute to identify a "no data found" scenario. Then take appropriate action.

Here are some patterns based on the above recommendation and the standard HR employees table.

1. Implicit cursor inside a nested block

I narrow the propagation of the NO_DATA_FOUND exception, and then take appropriate action. Here, I simply ignore the problem and let the subsequence IF statement say "We're done if no row was found." You might call an error logging procedure if a row really should have been there - and then re-raise the exception.

PROCEDURE do_stuff_with_employee (
   employee_id_in   IN employees.employee_id%TYPE)
IS
   l_name   employees.last_name%TYPE;
BEGIN
   BEGIN
      SELECT last_name
        INTO l_name
        FROM employees e
       WHERE e.employee_id = do_stuff_with_employee.employee_id_in;
   EXCEPTION
      WHEN NO_DATA_FOUND
      THEN
         /* log the error if this really is an error or let it go... */
         l_name := NULL;
   END;

   IF l_name IS NOT NULL
   THEN
      /* continue with application logic */
      NULL;
   END IF;
END;

2. Implicit cursor inside a nested subprogram

I move all that code into its own nested subprogram (or you could put it in a package so it could be used by more than one block). Again, you need to decide what you want to do about NO_DATA_FOUND.

PROCEDURE do_stuff_with_employee (
   employee_id_in   IN employees.employee_id%TYPE)
IS
   l_name   employees.last_name%TYPE;

   FUNCTION emp_name (employee_id_in IN employees.employee_id%TYPE)
      RETURN employees.last_name%TYPE
   IS
      l_name   employees.last_name%TYPE;
   BEGIN
      SELECT last_name
        INTO l_name
        FROM employees
       WHERE employee_id = employee_id_in;

      RETURN l_name;
   EXCEPTION
      WHEN NO_DATA_FOUND
      THEN
         /* log the error if this really is an error or let it go... */
         RETURN NULL;
   END;
BEGIN
   l_name := emp_name (employee_id_in);

   IF l_name IS NOT NULL
   THEN
      /* continue with application logic */
      NULL;
   END IF;
END;

3. Explicit cursor unconcerned with too many rows

With the explicit cursor, I fetch once and then proceed. I don't have to worry about NO_DATA_FOUND being raised, and the IF statement ensures that I do nothing if no (non-null) value was returned.

PROCEDURE do_stuff_with_employee (
   employee_id_in   IN employees.employee_id%TYPE)
IS
   l_name   employees.last_name%TYPE;

   CURSOR name_cur
   IS
      SELECT last_name
        FROM employees e
       WHERE e.employee_id = do_stuff_with_employee.employee_id_in;
BEGIN
   OPEN name_cur;

   FETCH name_cur INTO l_name;

   CLOSE name_cur;

   IF l_name IS NOT NULL
   THEN
      /* continue with application logic */
      NULL;
   END IF;
END;

4. Explicit cursor that checks for too many rows

But perhaps I need to know if there were > 1 rows found. In this case, fetch a second time and then raise TOO_MANY_ROWS if a row was found. Else continue on.

PROCEDURE do_stuff_with_employee (
   employee_id_in   IN employees.employee_id%TYPE)
IS
   l_name    employees.last_name%TYPE;
   l_name2   employees.last_name%TYPE;

   CURSOR name_cur
   IS
      SELECT last_name
        FROM employees e
       WHERE e.employee_id = do_stuff_with_employee.employee_id_in;
BEGIN
   OPEN name_cur;

   FETCH name_cur INTO l_name;

   FETCH name_cur INTO l_name2;

   IF name_cur%FOUND
   THEN
      CLOSE name_cur;

      RAISE TOO_MANY_ROWS;
   ELSE
      CLOSE name_cur;
   END IF;

   IF l_name IS NOT NULL
   THEN
      /* continue with application logic */
      NULL;
   END IF;
END;

As is so often the case in programming and life in the real world, a very simple (or simply phrased) question can lead to a complicated, nuanced answer. That is the case with fetching single rows. You need to understand the context and the patterns of data being evaluated by the cursor. From that you make your determination.

But I will close with this very simple piece of advice: Opt for the implicit cursor (SELECT-INTO) as your default choice, and then switch to the explicit cursor only when needed.

Ha! I lied. I am not done. A developer posted this comment on LinkedIn:
I never, ever use OPEN/FETCH except for declared cursor types. Why don't you mention cursor loops? Cursor loops are more succinct, have a well-defined scope, and are much clearer to read.
So let's take a look at using a cursor FOR loop (CFL) instead of SELECT-INTO or OPEN-FETCH-CLOSE. A CFL is another kind of implicit cursor, since you provide the SELECT (which could be defined as an explicit cursor, just to make things confusing) and Oracle implicitly opens, fetches from and closes the cursor. If you are not very familiar with cursor FOR loops, check out the doc.

CFLs are designed (intended) for fetching multiple rows from a cursor. But you could use it to fetch just a single row. Should you?

First, from a performance standpoint, CFLs are very efficient:

Cursor FOR Loop No Rows Found

SQL> DECLARE 
  2     my_n   not_much_stuff.n%TYPE; 
  3  BEGIN 
  4  
  5     FOR indx IN 1 .. 100000
  6     LOOP 
  7        FOR rec IN (SELECT n 
  8                      FROM not_much_stuff 
  9                     WHERE n = -1) 
 10        LOOP 
 11           my_n := rec.n; 
 12        END LOOP; 
 13     END LOOP; 
 14   
 15     DBMS_OUTPUT.put_line (my_n); 
 16  END;
 17  /

Elapsed: 00:00:04.560

Cursor FOR Loop One Row Found

SQL> DECLARE 
  2     my_n   not_much_stuff.n%TYPE; 
  3  BEGIN 
  4     FOR indx IN 1 .. 100000
  5     LOOP 
  6        FOR rec IN (SELECT n 
  7                      FROM not_much_stuff 
  8                     WHERE n = 1) 
  9        LOOP 
 10           my_n := rec.n; 
 11        END LOOP; 
 12     END LOOP; 
 13   
 14     DBMS_OUTPUT.put_line (my_n); 
 15  END;
 16  /

Elapsed: 00:00:04.685


CFLS compare very favorably to the SELECT-INTO performance. And you don't have to worry about NO_DATA_FOUND or TOO_MANY_ROWS being raised. If the CFL finds no rows, it does not thing. If it finds more than one, it fetches each of them.

What, me worry? Well, maybe you should. See, that is also kind of the downside of CFLs. It "hides" those different states of your data. That's fine if, say:

  • You know with 100% certainty that there will always only be one row, and/or....
  • You don't care if you happen to fetch more than one row, and/or....
  • You don't care if you don't fetch any rows.
Sadly, there is so little we can point to in our lives, in our code, in our databases that lends itself to 100% certainty. And that's one of the reasons I do not use CFLs for single row fetches.

Another reason I avoid them is that I like my code to be, as much as possible, self-documenting (and believe me: I fail regularly to achieve this!), to tell a story that is easy to follow and easy to modify when user requirements change.

Use of a loop implies that you may - and are expecting to - iterate multiple times.

Use of a cursor FOR loop implies that you may - and are expecting to - fetch more than one row.

To use a CFL when you know - and when the SELECT statement clearly reflects - that at most one row will ever be fetched seems to me to invite confusion and possible problems for future developers working with this code.




Comments

  1. Here's my enhancement request:

    SELECT n
    INTO LAX my_n
    FROM not_much_stuff
    WHERE n = -1;

    IF SQL%NOTFOUND THEN
    my_n := 100;
    END IF ;

    ReplyDelete
    Replies
    1. Interesting idea, perhaps you should submit it as an enhancement idea on OTN?

      https://community.oracle.com/community/database/database-ideas

      Would you also suppress raising TOO_MANY_ROWS?

      Delete
    2. If I'm expecting multiple rows then I'll use a FOR LOOP. If I'm not expecting multiple rows and then I'd welcome an exception.

      Delete
    3. My preferred way to do a single row lookup is to (mis)use MIN or MAX.

      SELECT MIN(n)
      INTO my_n
      FROM not_much_stuff
      WHERE n = -1;

      This way NO_DATA_FOUND results in a NULL value instead of an exception.

      TOO_MANY_ROWS is also suppressed but a COUNT(*) can easily be added if this is a concern.

      Delete
  2. I normally use SELECT INTO because I find it makes the code simpler (no need to declare a cursor) and easier to understand.
    I use exception handlers to check for NO_DATA_FOUND and TOO_MANY_ROWS and either set the variable value in there or set a flag (usually a Boolean variable) to indicate that an exception occurred. After the block I check the value of the variable or flag to determine what action to take.

    ReplyDelete
    Replies
    1. I cannot find any flaw in your approach. :-)

      Delete
  3. I wander, what was the optimize level during benchmark? 0? I always use 0 to avoid any optimizations in the background.

    In my opinion, implicit vs explicit cursor vs CFL is more about code readability and usage convenience. Explicit cursor's performance overhead over implicit cursor is 1sec for 100000 iterations -- that's not a concern, I think.

    I never use select...into, because of that no_data_found exception handling which, in my opinion, adds some clutter to the code. And of course I never know, if I find the record or no, so catching no_data_found is mandatory. I prefer open-fetch-close, which also allows using %rowtype.
    Additionally, I suppose PL/SQL optimizer will try to replace my open-fetch-close with a select-into and exception handling during optimization process.

    Using CFL for queries which return a single record is very misleading. CFL gives you a clue, that query returns several records. But when you look at the table structure and the query where clause, you see that a single record is always returned. That's weird.
    I think the performance of CFL is good because PL/SQL optimizer replaces them with bulk collect, which is extremely fast.

    ReplyDelete
    Replies
    1. Optimization level was 3.

      True, performance diff between select-into and open-fetch-close is small. But it could accumulate to noticeable effect - both in terms of # of times a specific SELECT is run and also the # of times many different single row fetches are run.

      The PL/SQL optimizer definitely does NOT try to replace my open-fetch-close with a select-into and exception handling. Cannot safely do that.

      Delete
    2. Thanks for reply. I'm always curious, why things are implemented one way or another. Interesting - why PL/SQL has select-into and open-fetch?
      Which one was the first one implemented and why the second command was created? It would be nice to hear.

      Delete
  4. Hello All,

    And, just to show how real life is always more complicated than pure theory,
    I remember a case where I knew from the beginning that a SELECT is likely to return several
    rows, but wanted to accomplish BOTH of the following:

    1. Returning column values from the first result row, as by an ORDER BY included
    in the SELECT
    2. Letting the user know that there was more than one row that satisfied the query,
    a situation that usually invited an additional check for any possible data errors
    and/or choosing a different row than the (first) one, chosen implicitly.

    The specific table was a kind of setup table, where in rare cases might have contained
    "too many rows" cases, which were still logically correct for some limit cases,
    here involving validity dates.

    The solution was, of course, an explicit cursor loop, where the first row fetched was used
    to populate the output variables, and a second fetch for setting a "too many rows" flag.

    This is preferable to the alternative of executing two separate SELECT-s,
    one returning column values from a single row and the other one a COUNT(*).


    Happy holidays and a Very Happy New Year 2018 for you all :)

    Cheers & Best Regards,
    Iudith


    ReplyDelete
    Replies
    1. In such cases I use explicit cursor + analytic count(*) over().
      Have you tried this way? Setup tables usually are small thus count(*) over all records is not a performance killer.

      Pavel

      Delete
    2. If you need the number of rows in the result set it might be an option to include a COUNT (*) OVER ... No need for two separate SELECTs and if you have a lot of rows you can insert the information for log ops - "n of m rows processed"

      Delete
    3. Hello All,

      Yes, of course you are right :)

      But I am an "old wolf" ... and that solution was born much before the analytic functions ... still back in the Oracle7 days :)

      At least, I suddenly remembered it while reading this thread :)

      Best Regards,
      Iudith

      Delete
  5. I have packaged procedures that perform a test for existence. So I might have something like this:
    SELECT COUNT(*) cnt
    INTO v_cnt
    FROM t.
    WHERE t.pkey = p_key
    AND ROW=1;

    RETURN(v_cnt);

    Guaranteed to return a 0 or 1.
    No WHEN NO_DATA_FOUND handler needed, since it's a simple aggregate.

    ReplyDelete
    Replies
    1. Hello,

      You probably meant "ROWNUM = 1", which is indeed a commonly used way to limit the result to at most one row :)

      Cheers & Best Regards,
      Iudith

      Delete
  6. Hola todos, estuve haciendo unas pruebas con lo leído, buscando una conclusión para mi experimento fue armar 3 escenarios con 1000.000 de interacciones.

    NOFETCH
    a) implícito - 31seg
    b) explicito - 27seg
    c) cfl - 17 seg

    FETCH
    a) implícito - 16seg
    b) explicito - 19seg
    c) cfl - 23 seg

    Al final los tres métodos son igual de eficientes solo se deben aplicar de la manera correcta frente a cada situación.

    Como referencia Steven cambiar select-into por open-fetch-close o cfl, no es la respuesta correcta hay que analizar bien la situación de cada caso.

    ReplyDelete

Post a Comment

Popular posts from this blog

Table Functions, Part 1: Introduction and Exploration

Recommendations for unit testing PL/SQL programs

The future of Oracle PL/SQL: some thoughts on Sten Vesterli's thoughts