Maintaining transaction integrity with FORALL and multiple DML statements

FORALL is used to avoid row-by-row execution of the same DML statement (differing only in the values bound into it), thereby reducing context switching between the PL/SQL and SQL engines.
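
As a quick refresher, here is a minimal sketch (against the standard HR employees table, with a hypothetical 10% raise for department 50) of a row-by-row update replaced by a single FORALL:

DECLARE
   TYPE ids_t IS TABLE OF employees.employee_id%TYPE;

   l_ids   ids_t;
BEGIN
   SELECT employee_id
     BULK COLLECT INTO l_ids
     FROM employees
    WHERE department_id = 50;

   /* A single context switch to the SQL engine for the whole array,
      instead of one switch per row as in a cursor FOR loop. */
   FORALL indx IN 1 .. l_ids.COUNT
      UPDATE employees
         SET salary = salary * 1.1
       WHERE employee_id = l_ids (indx);
END;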

I will assume in this post that you have a basic working knowledge of BULK COLLECT and FORALL. If not, check out:

A Checklist for Conversion to Bulk Processing

As noted in the checklist, you need to document your current transaction behavior, and then make sure that the same behavior manifests in your bulk implementation.

I published a video that takes you through the key steps in the conversion from row-by-row to bulk. The code for this video is available on LiveSQL.

A viewer, Jinay, asked:
Hi Steven - at minute 21:56 in the video, in the function update_employee: in case of any update failures, you are handling an exception "bulk_errors", but how do you ensure that the corresponding insert gets rolled back? You showed a way to communicate the failed employee records from insert_history to update_employee, but doesn't it need to be done the other way round, too, to ensure that "INSERT + ADJUST + UPDATE" is one transaction, as in the case of the cursor FOR loop?
I decided to answer this question by writing a post on my blog - this one. So let's dive in. The original code looked like this:
CREATE OR REPLACE PROCEDURE upd_for_dept (
   dept_in     IN employees.department_id%TYPE
 , newsal_in   IN employees.salary%TYPE)
IS
   CURSOR emp_cur
   IS
      SELECT employee_id, salary, hire_date
        FROM employees
       WHERE department_id = dept_in
         FOR UPDATE;
BEGIN
   FOR rec IN emp_cur
   LOOP
      BEGIN
         INSERT INTO employee_history (employee_id, salary, hire_date)
              VALUES (rec.employee_id, rec.salary, rec.hire_date);

         rec.salary := newsal_in;

         adjust_compensation (rec.employee_id, rec.salary);

         UPDATE employees
            SET salary = rec.salary
          WHERE employee_id = rec.employee_id;
      EXCEPTION
         WHEN OTHERS
         THEN
            log_error;
      END;
   END LOOP;
END upd_for_dept;
[Note: I have added the "FOR UPDATE" clause above; it was not in the video or the original code, but it should be there to ensure that while I am doing my updates, no one else can come along and modify a row in my "target" dataset. Thanks to Martin Rose for pointing this out in his acerbic comments on my video. :-) ]

And I noted in the video that if an insert fails, the corresponding update does not take place - behavior we need to preserve in the bulk version. I did so by removing the employee ID of any failed insert from the bind array that is used in the second FORALL, the update:
PROCEDURE insert_history
IS
BEGIN
   FORALL indx IN 1 .. l_employees.COUNT SAVE EXCEPTIONS
      INSERT INTO employee_history (employee_id, salary, hire_date)
           VALUES (l_employees (indx).employee_id,
                   l_employees (indx).salary,
                   l_employees (indx).hire_date);
EXCEPTION
   WHEN bulk_errors
   THEN
      FOR indx IN 1 .. SQL%BULK_EXCEPTIONS.COUNT
      LOOP
         /* Log the error then ... communicate this failure to the update:
            delete this row so that the update will not take place. */
         l_employees.delete (SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX);
      END LOOP;
END insert_history;
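In case you are wondering about bulk_errors: it is not a predefined exception. The full program must declare it and associate it with ORA-24381, the error raised when at least one iteration of a FORALL ... SAVE EXCEPTIONS statement fails, along these lines:

   bulk_errors   EXCEPTION;
   PRAGMA EXCEPTION_INIT (bulk_errors, -24381);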
But Jinay correctly points out that in my bulkified code, if the update fails, I do not reverse the insert. 

Well, I can justify that omission easily: it is not an omission. The original code has this same flaw in it!

So really what Jinay has identified is a possible/likely bug in my original code. This is a very common experience when developers perform code reviews and is a primary motivator for pair programming (kill those bugs before/as they are written).

To achieve the "effect" described by Jinay, I can add a SAVEPOINT:
BEGIN
   FOR rec IN emp_cur
   LOOP
      BEGIN
         SAVEPOINT before_insert;

         INSERT
           INTO employee_history (employee_id, salary, hire_date)
         VALUES (rec.employee_id, rec.salary, rec.hire_date);

         rec.salary := newsal_in;

         adjust_compensation (rec.employee_id, rec.salary);

         UPDATE employees
            SET salary = rec.salary
          WHERE employee_id = rec.employee_id;
      EXCEPTION
         WHEN OTHERS
         THEN
            ROLLBACK TO before_insert;
            log_error;
      END;
   END LOOP;

END upd_for_dept;
Now if the insert succeeds and the update fails, the insert will be rolled back.

OK, so then the question is: how can I achieve the same effect when I convert to bulk processing? Easily...with more bulk processing and the helpful RETURNING clause.
PROCEDURE insert_history
IS
BEGIN
   FORALL indx IN 1 .. l_employees.COUNT SAVE EXCEPTIONS
      INSERT INTO employee_history (employee_id, salary, hire_date)
           VALUES (l_employees (indx).employee_id,
                   l_employees (indx).salary,
                   l_employees (indx).hire_date)
        RETURNING id, employee_id BULK COLLECT INTO l_inserted;
EXCEPTION
   WHEN bulk_errors
   THEN
      FOR indx IN 1 .. SQL%BULK_EXCEPTIONS.COUNT
      LOOP
         /* Log the error then ... communicate this failure to the update:
            delete this row so that the update will not take place. */
         l_employees.delete (SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX);
      END LOOP;
END insert_history;
where l_inserted is declared as follows:
   TYPE inserted_rt IS RECORD
   (
      id            employee_history.id%TYPE,
      employee_id   employee_history.employee_id%TYPE
   );

   TYPE inserted_t IS TABLE OF inserted_rt;

   l_inserted inserted_t := inserted_t();
I can then modify the exception handler in the update "phase" of this bulkified procedure as follows:

First, add another nested subprogram:
   PROCEDURE remove_history_row (
      employee_id_in   IN employees.employee_id%TYPE)
   IS
      l_found_index   INTEGER;
      l_index         INTEGER := l_inserted.FIRST;
   BEGIN
      /* Find the matching element in l_inserted, then delete
         the corresponding history row from the table. */

      WHILE l_found_index IS NULL AND l_index IS NOT NULL
      LOOP
         IF l_inserted (l_index).employee_id = employee_id_in
         THEN
            l_found_index := l_index;
         ELSE
            l_index := l_inserted.NEXT (l_index);
         END IF;
      END LOOP;

      IF l_found_index IS NOT NULL
      THEN
         DELETE FROM employee_history
               WHERE id = l_inserted (l_found_index).id;
      END IF;
   END;
Then invoke the subprogram inside the exception section of the update procedure:
   EXCEPTION
      WHEN bulk_errors
      THEN
         FOR indx IN 1 .. SQL%BULK_EXCEPTIONS.COUNT
         LOOP
            remove_history_row (
               l_employees (
                  SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX).employee_id);

            log_error (
                  'Unable to update salary for employee '
               || l_employees (
                     SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX).employee_id,
               SQL%BULK_EXCEPTIONS (indx).ERROR_CODE);
         END LOOP;
   END update_employees;
I have now carried over the SAVEPOINT-based behavior to my bulkified version. Note that I am performing row-by-row deletes in remove_history_row. Perhaps you, my dear reader, would like to take on a little exercise: enhance the solution I offer above to use FORALL to delete all the inserted rows for which the update failed!

Comments

  1. Hi Steven:

    TYPE t_failed IS TABLE OF employee_history.id%TYPE
       INDEX BY PLS_INTEGER;

    l_failed   t_failed;

    PROCEDURE remove_history
    IS
    BEGIN
       FORALL indx IN 1 .. l_failed.COUNT
          DELETE FROM employee_history
                WHERE id = l_failed (indx);
    END remove_history;

    PROCEDURE update_employees
    IS
       l_err_idx    NUMBER;
       l_fail_idx   NUMBER;
    BEGIN
       FORALL indx IN l_employees.FIRST .. l_employees.LAST SAVE EXCEPTIONS
          UPDATE employees
             SET salary = l_employees (indx).salary
           WHERE employee_id = l_employees (indx).employee_id;
    EXCEPTION
       WHEN e_bulk_errors
       THEN
          l_fail_idx := 1;

          FOR indx IN 1 .. SQL%BULK_EXCEPTIONS.COUNT
          LOOP
             l_err_idx := SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX;

             plog.error (
                   'Unable to update salary for employee '
                || l_employees (l_err_idx).employee_id
                || ' ERROR: '
                || SQL%BULK_EXCEPTIONS (indx).ERROR_CODE);

             l_failed (l_fail_idx) := l_employees (l_err_idx).history_id;
             l_fail_idx := l_fail_idx + 1;
          END LOOP;

          remove_history;
    END update_employees;


    Ronan

  2. Hello Steven,

    It looks to me that the last statement in the LiveSQL example is not entirely correct.

    In the update_employees procedure, when handling the "bulk_errors" exception raised by "FORALL indx IN INDICES OF ...",
    since the collection l_employees might be sparse, the value given by

    SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX

    represents the iteration number on which the error occurred, NOT the actual index value in the bind collection at which the error occurred. So it is not correct to use that value directly to extract the employee_id to be deleted from the bind collection; instead, we should translate that iteration number into the corresponding collection index, as in the following:


    PROCEDURE update_employees
    IS
       l_index   PLS_INTEGER;

       FUNCTION bind_array_index_for (
          bind_array_in    IN employees_tt,
          error_index_in   IN PLS_INTEGER)
          RETURN PLS_INTEGER
       IS
          l_index   PLS_INTEGER := bind_array_in.FIRST;
       BEGIN
          FOR indx IN 1 .. error_index_in - 1
          LOOP
             l_index := bind_array_in.NEXT (l_index);
          END LOOP;

          RETURN l_index;
       END;
    BEGIN
       FORALL indx IN INDICES OF l_employees SAVE EXCEPTIONS
          UPDATE employees
             SET salary = l_employees (indx).salary,
                 hire_date = l_employees (indx).hire_date
           WHERE employee_id = l_employees (indx).employee_id;
    EXCEPTION
       WHEN bulk_errors
       THEN
          FOR indx IN 1 .. SQL%BULK_EXCEPTIONS.COUNT
          LOOP
             l_index :=
                bind_array_index_for (
                   l_employees,
                   SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX);

             remove_history_row (l_employees (l_index).employee_id);

             log_error (
                   'Unable to update salary for employee '
                || l_employees (l_index).employee_id,
                SQL%BULK_EXCEPTIONS (indx).ERROR_CODE);
          END LOOP;
    END update_employees;


    Another variant would be to "extend" the inserted_rt record by adding to it the fields needed for the update (salary and hire_date) and then to use the dense collection l_inserted in both procedures adj_comp_for_arrays and update_employees, instead of the sparse collection l_employees.
    In such a case, the bulk_errors exception handler code in update_employees is correct as it is now.
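
    For concreteness, the extended record in that variant might look like this (a sketch; the salary and hire_date anchors are assumptions based on the update statement above):

    TYPE inserted_rt IS RECORD
    (
       id            employee_history.id%TYPE,
       employee_id   employee_history.employee_id%TYPE,
       salary        employees.salary%TYPE,
       hire_date     employees.hire_date%TYPE
    );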


    Thanks a lot & Best Regards,
    Iudith

  3. Hi Steven,

    We have a requirement in PL/SQL to validate the records of a table. The fields that need to be checked for a particular check will be specified in a metadata table.
    For example:

    Table_Name   Field_Name   Check      Column_Length   Primary_Key
    Sales        Order        NOT_NULL   30              Y
    Sales        Name         LENGTH     40              N
    Sales        Amount       INTEGER    76              N
    Product      ID           NOT_NULL   10              Y

    Which of the two approaches below is best for checking the fields of the table, or would you suggest another approach?

    1) Column-level check: Create a dynamic query based on the input table name, open a cursor, and insert the bad records into an error table.
    For example: select Order from Sales where Order is NULL; -- query will return all rows where Order is null
    select Name from Sales where length(Name) <> 40; -- query will return all rows where Name does not have the required length

    The only challenge will be creating the dynamic query according to the type of check required and the amount of data.
    In this method, if the table holds a huge amount of data, will there be any performance issue?

    2) Row-by-row check: Read each and every record row by row and check each and every field, i.e., a top-down approach.



    Thanks,
    Srikant

    Replies
    1. Srikant, please ask your question at asktom.oracle.com. That way it and the answer can join our extensive Q&A database for others to benefit from later.

