The Joy of Low Hanging Fruit, Part 2: the Dismay and the Delight

In the first post of this "Joy" series, I explained the problem: a procedure in a daily batch job was threatening to take more than a day to complete. Not a good situation.

I invited readers to analyze the situation and come up with their own solution before I continued. 

Now I continue!

My Dismay and My Delight – First, the Dismay

I was dismayed when I saw the implementation of reset_meme_status, because in that procedure I found a cursor FOR loop that contained two non-query DML statements. That is a classic "anti-pattern": a general pattern of coding that should be avoided.

It should be avoided because the inserts and updates are changing the tables on a row-by-row basis, which maximizes the number of context switches and consequently greatly slows the performance of the code. Fortunately, this classic anti-pattern has a classic, well-defined solution: use BULK COLLECT and FORALL to switch from row-by-row processing to bulk processing.

Before moving on to explaining why it is also my delight, let's take a closer look at the anti-pattern. I am suggesting to you that you have a problem whenever you see code like this:

BEGIN
   FOR my_record IN (SELECT ...)
   LOOP
      INSERT ....
         and/or
      UPDATE ...
         and/or
      DELETE ...
   END LOOP;
END; 
 
Or like this:

BEGIN
   FOR my_index IN low_value .. high_value
   LOOP
      INSERT ....
         and/or
      UPDATE ...
         and/or
      DELETE ...
   END LOOP;
END; 

Furthermore, those SQL statements could be static (fixed at compile time, also referred to as embedded SQL) or dynamic (finalized at run time).
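
For example, the dynamic-SQL version of the anti-pattern might look like the sketch below (the table and the salary adjustment are just illustrations, using the standard HR employees table):

BEGIN
   FOR rec IN (SELECT employee_id FROM employees)
   LOOP
      /* The UPDATE text is not finalized until run time,
         but it is still executed row by row. */
      EXECUTE IMMEDIATE
         'UPDATE employees SET salary = salary * 1.01
           WHERE employee_id = :id'
         USING rec.employee_id;
   END LOOP;
END;
/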

You might be wondering: why should this code cause a problem? Why should it run the SQL slowly? PL/SQL is supposed to be tightly integrated with SQL. Shouldn't Oracle be able to optimize this sort of code to run blazingly fast?

Well, let's give Oracle its due: it does optimize this code to run blazingly fast. The Oracle Database, SQL, and PL/SQL are all very efficient. But however hot the fire, it can always get hotter, right? So Oracle has built features into PL/SQL to make your code run even faster (hey, I would have said that even before I rejoined Oracle!).

The fundamental cause of the relative slowness of row-by-row processing is that SQL statements are executed by a different runtime engine from PL/SQL statements. So Oracle must "switch" between them; these are called context switches, and Oracle has paid lots of attention over the years to reducing the cost of a single context switch. That's good, because we are switching back and forth all the time.

For example, I create a simple function on top of SUBSTR:

CREATE OR REPLACE FUNCTION betwnstr (
   string_in   IN   VARCHAR2
 , start_in    IN   INTEGER
 , end_in      IN   INTEGER
)
   RETURN VARCHAR2
IS
BEGIN
   RETURN (SUBSTR (string_in, start_in, end_in - start_in + 1));
END;
/

I can then call it inside an SQL statement:

BEGIN
   FOR rec IN (SELECT betwnstr (last_name, 1, 5)
                 FROM employees)
   LOOP
      NULL;
   END LOOP;
END;

In this case, Oracle must switch from the SQL engine to the PL/SQL engine each time betwnstr is called.

Old-timers like me remember back in the days of Oracle8 and Oracle8i, when you rarely, if ever, called your own (that is, user-defined) functions inside SQL, as I did above. It was simply too slow.

Now we do this all the time, because Oracle slashed the time it takes to do the switch, and it continues to look for new ways to do so. That is why, in Oracle Database 12c, you can use the WITH clause to define a function right inside your SQL statement, and you can use the UDF pragma in your function definition: two techniques to reduce some of the context-switching overhead.
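
To give a sense of those two techniques, here is a sketch of each, applied to the betwnstr function above (the employees query is the same one used earlier):

-- Define the function right inside the statement (Oracle 12.1 and up).
WITH
   FUNCTION betwnstr (
      string_in   IN   VARCHAR2
    , start_in    IN   INTEGER
    , end_in      IN   INTEGER
   )
      RETURN VARCHAR2
   IS
   BEGIN
      RETURN (SUBSTR (string_in, start_in, end_in - start_in + 1));
   END;
SELECT betwnstr (last_name, 1, 5)
  FROM employees
/

-- Or ask the compiler to optimize the function for invocation from SQL.
CREATE OR REPLACE FUNCTION betwnstr (
   string_in   IN   VARCHAR2
 , start_in    IN   INTEGER
 , end_in      IN   INTEGER
)
   RETURN VARCHAR2
IS
   PRAGMA UDF;
BEGIN
   RETURN (SUBSTR (string_in, start_in, end_in - start_in + 1));
END;
/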


Still, another way to get around that overhead is to reduce the number of context switches needed. And that's what the bulk processing features of PL/SQL offer to us.


Consider the following figure:


I have a cursor FOR loop executing an update statement.

Each time the PL/SQL engine gets to the update, it stops, prepares the bind variables, and then passes a packet of information (the SQL string, the variables, and other data as well) to the SQL engine. The SQL engine "does its thing" and passes back a result (or an error). This occurs for each update statement, whether there are two or two thousand.
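
In code, the scenario in the figure looks something like this sketch (again with the illustrative employees update):

BEGIN
   FOR rec IN (SELECT employee_id FROM employees)
   LOOP
      -- One context switch to the SQL engine for every row fetched.
      UPDATE employees
         SET salary = salary * 1.01
       WHERE employee_id = rec.employee_id;
   END LOOP;
END;
/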

Hence, my dismay at extremememe.info's code.

OK, Back to My Delight

The presence of this anti-pattern was also the source of my delight. The developers needed to improve the performance of the procedure significantly. That could have been very hard to do if they had already taken full advantage of the performance optimization features in SQL and PL/SQL. Since their code still contained non-query DML inside loops, I knew that was not the case, and that we were, in fact, looking at "low-hanging fruit" (a goal that is easily achievable).

It was possible that we could solve their problem with a relatively straightforward change to their existing code. Consequently, they might not need to make changes to their database design, which would have a much greater impact and incur a much greater cost of development and testing.

Well, you've seen the anti-pattern: non-query DML inside a loop. What's the (anti-anti-) pattern? We already saw how Oracle has gone to great lengths to reduce the cost of a context switch. So the next obvious thing to look at is: how can we reduce the number of context switches? The answer: BULK COLLECT and FORALL (bulk processing).

This next figure demonstrates at a very high level the performance potential offered by bulk processing.


I have replaced my cursor FOR loop with a FORALL statement; within the FORALL header, I attach my DML statement.

When the PL/SQL engine hits the meta-DML FORALL statement, it says:

"Oh, I know what you want me to do: for each element in the driving array, generate a DML statement with bind variables set from the array.
"Then I will pass the whole set of DML statements to the SQL engine with one context switch.
"The SQL engine does whatever it is supposed to do (that's none of my business) and then I get the results back....
"Way faster!"
Seriously, that is exactly how the PL/SQL engine talks. I should know. We've had lots of conversations over the years.

And that is the fundamental idea behind bulk processing: the fewer context switches, the faster your program runs. And we are not talking about a 10% or 20% increase. You can see DML statements execute an order of magnitude faster using FORALL.
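
Here is a sketch of what that transformation might look like for the row-by-row update shown earlier (the LIMIT of 100 is a reasonable starting point, matching the default used by the cursor FOR loop optimization):

DECLARE
   TYPE ids_t IS TABLE OF employees.employee_id%TYPE;

   l_ids      ids_t;

   CURSOR emps_cur
   IS
      SELECT employee_id FROM employees;
BEGIN
   OPEN emps_cur;

   LOOP
      -- One context switch fetches up to 100 rows into the collection.
      FETCH emps_cur BULK COLLECT INTO l_ids LIMIT 100;
      EXIT WHEN l_ids.COUNT = 0;

      -- One context switch executes the UPDATE for all those rows.
      FORALL indx IN 1 .. l_ids.COUNT
         UPDATE employees
            SET salary = salary * 1.01
          WHERE employee_id = l_ids (indx);
   END LOOP;

   CLOSE emps_cur;
END;
/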

Of course, that's all talk, and you want to see action. In the coming weeks, I will step through the transformation of the row-by-row solution into a fully compatible and much faster bulk solution.

If you are eager to see the end point now, simply point your browser to:


In the meantime, I will end this chapter in my saga with a simple piece of advice that can have a big impact on your code:

Unless working with a trivial amount of data, convert all loops containing non-query DML to a combination of BULK COLLECT and FORALL.

Comments

  1. Hi Steven,

    Your explanation of BULK COLLECT and FORALL is outstanding, and we are encouraged to convert all loops containing non-query DML to a combination of BULK COLLECT and FORALL. However, I have a question:

    When we have millions of records to be updated, we use a counter to control when the commit command should be issued, to avoid the DML (update, insert, delete) consuming too much UNDO tablespace. The code is something like this:
    ...
    LOOP
       INSERT ...;
       cont := cont + 1;
       IF cont >= 5000 THEN
          COMMIT;
          cont := 0;
       END IF;
       ...
    END LOOP;
    COMMIT;


    By using BULK COLLECT and FORALL, when should the commit command be issued? (By the way, I didn't see a commit command in the procedures.)

    Best regards,
    Lavie
