
The Joy of Low Hanging Fruit, Part 2: the Dismay and the Delight

In the first post of this "Joy" series, I explained the problem: a procedure that runs in a daily batch job was threatening to take more than a day to complete. Not a good situation.

I invited readers to analyze the situation and come up with their own solution before I continued. 

Now I continue!

My Dismay and My Delight – First, the Dismay

I was dismayed when I saw the implementation of reset_meme_status, because I found in that procedure a cursor FOR loop that contained two non-query DML statements (an insert and an update). That is a classic "anti-pattern", meaning a general pattern of coding that should be avoided.

It should be avoided because the inserts and updates are changing the tables on a row-by-row basis, which maximizes the number of context switches and consequently greatly slows the performance of the code. Fortunately, this classic anti-pattern has a classic, well-defined solution: use BULK COLLECT and FORALL to switch from row-by-row processing to bulk processing.

Before moving on to explaining why it is also my delight, let's take a closer look at the anti-pattern. I am suggesting that you have a problem whenever you see code like this:

BEGIN
   FOR my_record IN (SELECT ...)
   LOOP
      INSERT ....
         and/or
      UPDATE ...
         and/or
      DELETE ...
   END LOOP;
END; 
 
Or like this:

BEGIN
   FOR my_index IN low_value .. high_value
   LOOP
      INSERT ....
         and/or
      UPDATE ...
         and/or
      DELETE ...
   END LOOP;
END; 

Furthermore, those SQL statements could be static (fixed at compile time, also referred to as embedded SQL) or dynamic (finalized at run time).
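For example, here is the same anti-pattern in its dynamic form (a sketch only, assuming the standard HR employees table); moving the statement text to run time changes nothing about the row-by-row execution:

BEGIN
   FOR rec IN (SELECT employee_id FROM employees)
   LOOP
      -- Still one UPDATE, and one round trip to the SQL engine, per row.
      EXECUTE IMMEDIATE
         'UPDATE employees SET salary = salary * 1.01
           WHERE employee_id = :id'
         USING rec.employee_id;
   END LOOP;
END;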

You might be wondering: why should this code cause a problem? Why should it run the SQL slowly? PL/SQL is supposed to be tightly integrated with SQL. Shouldn't Oracle be able to optimize this sort of code to run blazingly fast?

Well, let's give Oracle its due: it does optimize this code to run blazingly fast. The Oracle Database, SQL and PL/SQL are all very efficient. But however hot the fire, it can always get hotter, right? So Oracle has built features into PL/SQL to make your code run even faster (hey, I would have said that even before I rejoined Oracle!).

The fundamental cause of the relative slowness of row-by-row processing is that SQL statements are executed by a different runtime engine than PL/SQL statements. So Oracle must "switch" between them; these switches are called context switches, and Oracle has paid lots of attention over the years to reducing the cost of a single context switch. That's good, because we are switching back and forth all the time.

For example, I create a simple function on top of SUBSTR:

CREATE OR REPLACE FUNCTION betwnstr (
   string_in   IN   VARCHAR2
 , start_in    IN   INTEGER
 , end_in      IN   INTEGER
)
   RETURN VARCHAR2
IS
BEGIN
   RETURN (SUBSTR (string_in, start_in, end_in - start_in + 1));
END;
/

I can then call it inside an SQL statement:

BEGIN
   FOR rec IN (SELECT betwnstr (last_name, 1, 5)
                 FROM employees)
   LOOP
      NULL;
   END LOOP;
END;

In this case, Oracle must switch from the SQL engine to the PL/SQL engine each time betwnstr is called.
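You can observe that overhead for yourself with a rough timing harness. This sketch assumes the standard HR employees table and uses DBMS_UTILITY.GET_TIME, which returns elapsed time in hundredths of a second (with a table as small as employees, you may need to repeat the queries many times, or use a much larger table, to see a measurable difference):

DECLARE
   l_start   PLS_INTEGER;
   l_value   VARCHAR2 (100);
BEGIN
   l_start := DBMS_UTILITY.get_time;

   FOR rec IN (SELECT SUBSTR (last_name, 1, 5) str FROM employees)
   LOOP
      l_value := rec.str;   -- built-in function: no switch to the PL/SQL engine
   END LOOP;

   DBMS_OUTPUT.put_line ('SUBSTR:   ' || (DBMS_UTILITY.get_time - l_start));

   l_start := DBMS_UTILITY.get_time;

   FOR rec IN (SELECT betwnstr (last_name, 1, 5) str FROM employees)
   LOOP
      l_value := rec.str;   -- user-defined function: one switch per row
   END LOOP;

   DBMS_OUTPUT.put_line ('betwnstr: ' || (DBMS_UTILITY.get_time - l_start));
END;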

Old-timers like me can remember back in the days of Oracle8 and Oracle8i, when you rarely if ever called your own (aka user-defined) functions inside SQL, as I showed you above. It was simply too slow.

Now, we do this all the time, because Oracle has sliced the time it takes to perform the switch, and it continues to look for new ways to do so. That is why, in Oracle Database 12c, you can use the WITH clause to define a function right inside your SQL statement, and you can use the UDF pragma in your function definition: two techniques to reduce context-switching overhead.
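Here is what those two techniques look like applied to betwnstr; both are sketches of the 12c syntax. First, the WITH clause function (in SQL*Plus, terminate the query with a slash, since the statement text contains semicolons):

WITH
   FUNCTION betwnstr (
      string_in   IN   VARCHAR2
    , start_in    IN   INTEGER
    , end_in      IN   INTEGER
   )
      RETURN VARCHAR2
   IS
   BEGIN
      RETURN (SUBSTR (string_in, start_in, end_in - start_in + 1));
   END;
SELECT betwnstr (last_name, 1, 5)
  FROM employees
/

And second, the UDF pragma, added to the standalone function's declaration section:

CREATE OR REPLACE FUNCTION betwnstr (
   string_in   IN   VARCHAR2
 , start_in    IN   INTEGER
 , end_in      IN   INTEGER
)
   RETURN VARCHAR2
IS
   PRAGMA UDF;   -- tells the compiler this function will mainly be called from SQL
BEGIN
   RETURN (SUBSTR (string_in, start_in, end_in - start_in + 1));
END;
/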


Still, another way to get around that overhead is to reduce the number of context switches needed. And that's what the bulk processing features of PL/SQL offer to us.


Consider the following figure:

[Figure: a cursor FOR loop containing an UPDATE; each iteration triggers a context switch between the PL/SQL engine and the SQL engine]

I have a cursor FOR loop executing an update statement.

Each time the PL/SQL engine gets to the update, it stops, prepares the bind variables, and then passes a packet of info (the SQL string, the variables, and other data as well) to the SQL engine. The SQL engine "does its thing" and passes back a result set (or an error). This occurs for each update statement, whether there are two or two thousand.

Hence, my dismay with extremememe.info's code.

OK, Back to My Delight

The presence of this anti-pattern was also the source of my delight. The developers needed to improve the performance of the procedure significantly. That could be very hard to do if they had already taken full advantage of the performance optimization features in SQL and PL/SQL. Since their code still contained non-query DML inside loops, I knew that was not the case; in fact, we were looking at a case of "low hanging fruit" (a goal that is easily achievable).

It was possible that we could solve their problem with a relatively straightforward change to their existing code. Consequently, they might not need to make changes to their database design, which would have a much greater impact and incur a much greater cost of development and testing.

Well, you've seen the anti-pattern: non-query DML inside the loop. What's the (anti-anti-) pattern? We already saw how Oracle has gone to great lengths to reduce the cost of a context switch. So the next obvious thing to look at is: how can we reduce the number of context switches? The answer is: BULK COLLECT and FORALL (bulk processing).

This next figure demonstrates at a very high level the performance potential offered by bulk processing.

[Figure: a FORALL statement passing the entire set of DML statements to the SQL engine in a single context switch]
I have replaced my cursor FOR loop with a FORALL statement. Under the header of the FORALL, I attach my DML statement.

When the PL/SQL engine hits the meta-DML FORALL statement, it says:

"Oh, I know what you want me to do: for each element in the driving array, generate a DML statement with bind variables set from the array.
"Then I will pass the whole set of DML statements to the SQL engine with one context switch.
"The SQL engine does whatever it is supposed to do (that's none of my business) and then I get the results back....
"Way faster!"
Seriously, that is exactly how the PL/SQL engine talks. I should know. We've had lots of conversations over the years.

And that is the fundamental idea behind bulk processing: the fewer context switches, the faster your program runs. And we are not talking about a 10% or 20% increase. You can see DML statements execute an order of magnitude faster using FORALL.
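To make the pattern concrete, here is a minimal sketch against a hypothetical memes table with a meme_status column (this is not extremememe.info's actual code, just the shape of the transformation):

DECLARE
   TYPE meme_ids_t IS TABLE OF memes.meme_id%TYPE;

   l_meme_ids   meme_ids_t;
BEGIN
   -- One context switch to fetch all the IDs into a collection.
   SELECT meme_id
     BULK COLLECT INTO l_meme_ids
     FROM memes
    WHERE meme_status = 'PENDING';

   -- One context switch to pass the entire set of updates to the SQL engine.
   FORALL indx IN 1 .. l_meme_ids.COUNT
      UPDATE memes
         SET meme_status = 'RESET'
       WHERE meme_id = l_meme_ids (indx);
END;

Two context switches in total, no matter how many rows are updated; the cursor FOR loop version pays for a switch on every single row.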

Of course, that's talk, and you want to see action. In the coming weeks, I will step through the transformation of the row-by-row solution to a fully compatible and much faster bulk solution.
If you are eager to see that end point, simply point your browser to:


In the meantime, I will end this chapter in my saga with a simple piece of advice that can have a big impact on your code:

Unless working with a trivial amount of data, convert all loops containing non-query DML to a combination of BULK COLLECT and FORALL.

Comments

  1. Hi Steven,

    Your explanation of BULK COLLECT and FORALL is outstanding, and we are encouraged to convert all loops containing non-query DML to a combination of BULK COLLECT and FORALL. However, I have a question:

    When we have millions of records to be updated, we use a counter to control when the commit should be issued, to avoid the DML (update, insert, delete) consuming too much UNDO tablespace. The code is something like this:
    ...
    LOOP
       INSERT ...;
       cont := cont + 1;
       IF cont >= 5000 THEN
          COMMIT;
          cont := 0;
       END IF;
       ...
    END LOOP;
    COMMIT;


    By using BULK COLLECT and FORALL, when should the commit be issued? (By the way, I didn't see a commit in the procedures.)

    Best regards,
    Lavie


