FORALL is used to avoid row-by-row execution of the same DML statement (differing only in the values bound into it), thereby reducing context switching between the PL/SQL and SQL engines.
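Here, in miniature, is the shape of that pattern (just a sketch; the department and raise are invented for illustration):

DECLARE
   TYPE ids_t IS TABLE OF employees.employee_id%TYPE;

   l_ids   ids_t;
BEGIN
   SELECT employee_id
     BULK COLLECT INTO l_ids
     FROM employees
    WHERE department_id = 50;

   /* One context switch for the whole set of rows,
      instead of one per row */
   FORALL indx IN 1 .. l_ids.COUNT
      UPDATE employees
         SET salary = salary * 1.1
       WHERE employee_id = l_ids (indx);
END;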
I will assume in this post that you have a basic working knowledge of BULK COLLECT and FORALL. If not, check out:
A Checklist for Conversion to Bulk Processing
As noted in the checklist, you need to document your current transaction behavior, and then make sure that the same behavior manifests in your bulk implementation.
I published a video that takes you through the key steps in the conversion from row-by-row to bulk. The code for this video is available on LiveSQL.
A viewer asked:
Hi Steven - @minute 21.56 in the video, function update_employee: in case of any update failures, you are handling an exception "bulk_errors", but how do you ensure that the corresponding insert gets rolled back? You showed a way to communicate the failed employee records from insert_history to update_employee, but doesn't it need to be done the other way round too, to ensure that "INSERT + ADJUST + UPDATE" is one transaction, as in the cursor FOR loop version?

I decided to answer this question by writing a post on my blog - this one. So let's dive in. The original code looked like this:
CREATE OR REPLACE PROCEDURE upd_for_dept (
   dept_in     IN employees.department_id%TYPE
 , newsal_in   IN employees.salary%TYPE)
IS
   CURSOR emp_cur
   IS
      SELECT employee_id, salary, hire_date
        FROM employees
       WHERE department_id = dept_in
      FOR UPDATE;
BEGIN
   FOR rec IN emp_cur
   LOOP
      BEGIN
         INSERT INTO employee_history (employee_id, salary, hire_date)
              VALUES (rec.employee_id, rec.salary, rec.hire_date);

         rec.salary := newsal_in;
         adjust_compensation (rec.employee_id, rec.salary);

         UPDATE employees
            SET salary = rec.salary
          WHERE employee_id = rec.employee_id;
      EXCEPTION
         WHEN OTHERS
         THEN
            log_error;
      END;
   END LOOP;
END upd_for_dept;
[Note: I have added the FOR UPDATE clause above; it was not in the video and original code, but it should be, to ensure that while I am doing my updates, no one else can come along and modify a row in my "target" dataset. Thanks to Martin Rose for pointing this out in his acerbic comments on my video. :-) ]

And I noted in the video that if an insert fails, the update does not take place, so we need to mirror that behavior in the bulk processing. I did so by removing the employee ID of any failed insert from the bind array that is used in the second FORALL for the update:
PROCEDURE insert_history
IS
BEGIN
   FORALL indx IN 1 .. l_employees.COUNT SAVE EXCEPTIONS
      INSERT INTO employee_history (employee_id
                                  , salary
                                  , hire_date)
           VALUES (l_employees (indx).employee_id
                 , l_employees (indx).salary
                 , l_employees (indx).hire_date);
EXCEPTION
   WHEN bulk_errors
   THEN
      FOR indx IN 1 .. SQL%BULK_EXCEPTIONS.COUNT
      LOOP
         /* Log the error, then communicate this failure to the update step:
            delete this row so that the update will not take place. */
         l_employees.delete (SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX);
      END LOOP;
END insert_history;
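By the way, the bulk_errors exception referenced in that handler is declared in an enclosing scope (not shown in these snippets). If you are building this yourself, the standard declaration associates it with ORA-24381, the error raised when a FORALL with SAVE EXCEPTIONS encounters one or more failures:

   bulk_errors   EXCEPTION;
   PRAGMA EXCEPTION_INIT (bulk_errors, -24381);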
But Jinay correctly points out that in my bulkified code, if the update fails, I do not reverse the insert.
Well, I can justify that omission easily: it is not an omission. The original code has this same flaw in it!
So really what Jinay has identified is a possible/likely bug in my original code. This is a very common experience when developers perform code reviews and is a primary motivator for pair programming (kill those bugs before/as they are written).
To achieve the "effect" described by Jinay, I can add SAVEPOINTs:
BEGIN
   FOR rec IN emp_cur
   LOOP
      BEGIN
         SAVEPOINT before_insert;

         INSERT INTO employee_history (employee_id, salary, hire_date)
              VALUES (rec.employee_id, rec.salary, rec.hire_date);

         rec.salary := newsal_in;
         adjust_compensation (rec.employee_id, rec.salary);

         UPDATE employees
            SET salary = rec.salary
          WHERE employee_id = rec.employee_id;
      EXCEPTION
         WHEN OTHERS
         THEN
            ROLLBACK TO before_insert;
            log_error;
      END;
   END LOOP;
END upd_for_dept;
Now if the insert succeeds and the update fails, the insert will be rolled back.

OK, so then the question is: how can I achieve the same effect when I convert to bulk processing? Easily...with more bulk processing and the helpful RETURNING clause.
PROCEDURE insert_history
IS
BEGIN
   FORALL indx IN 1 .. l_employees.COUNT SAVE EXCEPTIONS
      INSERT INTO employee_history (employee_id
                                  , salary
                                  , hire_date)
           VALUES (l_employees (indx).employee_id
                 , l_employees (indx).salary
                 , l_employees (indx).hire_date)
        RETURNING id, employee_id BULK COLLECT INTO l_inserted;
EXCEPTION
   WHEN bulk_errors
   THEN
      FOR indx IN 1 .. SQL%BULK_EXCEPTIONS.COUNT
      LOOP
         /* Log the error, then communicate this failure to the update step:
            delete this row so that the update will not take place. */
         l_employees.delete (SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX);
      END LOOP;
END insert_history;
where l_inserted is declared as follows:

TYPE inserted_rt IS RECORD
(
   id            employee_history.id%TYPE,
   employee_id   employee_history.employee_id%TYPE
);

TYPE inserted_t IS TABLE OF inserted_rt;

l_inserted   inserted_t := inserted_t ();
I can then modify the exception handler in the update "phase" of this bulk-ified procedure as follows. First, add another nested subprogram:
PROCEDURE remove_history_row (
   employee_id_in IN employees.employee_id%TYPE)
IS
   l_found_index   INTEGER;
   l_index         INTEGER := l_inserted.FIRST;
BEGIN
   /* Find the matching element in l_inserted ... */
   WHILE l_found_index IS NULL AND l_index IS NOT NULL
   LOOP
      IF l_inserted (l_index).employee_id = employee_id_in
      THEN
         l_found_index := l_index;
      ELSE
         l_index := l_inserted.NEXT (l_index);
      END IF;
   END LOOP;

   /* ... and remove the history row that was inserted for it. */
   IF l_found_index IS NOT NULL
   THEN
      DELETE FROM employee_history
            WHERE id = l_inserted (l_found_index).id;
   END IF;
END remove_history_row;
Then invoke the subprogram inside the exception section of the update procedure:

EXCEPTION
   WHEN bulk_errors
   THEN
      FOR indx IN 1 .. SQL%BULK_EXCEPTIONS.COUNT
      LOOP
         remove_history_row (
            l_employees (SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX).employee_id);

         log_error (
            'Unable to update salary for employee '
            || l_employees (SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX).employee_id,
            SQL%BULK_EXCEPTIONS (indx).ERROR_CODE);
      END LOOP;
END update_employees;
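For context, here is a sketch of the FORALL at the top of update_employees. The LiveSQL version uses INDICES OF rather than a 1 .. COUNT range (as quoted in the comments below), since the deletes in insert_history can leave l_employees sparse:

   FORALL indx IN INDICES OF l_employees SAVE EXCEPTIONS
      UPDATE employees
         SET salary = l_employees (indx).salary
       WHERE employee_id = l_employees (indx).employee_id;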
I have now carried over the SAVEPOINT-based behavior to my bulk-ified version. Note that I am performing row-by-row deletes in remove_history_row. Perhaps you, my dear reader, would like to take on a little exercise of enhancing the solution I offer above to use FORALL to delete all the inserted rows for which the update failed!
Hi Steven:
TYPE t_failed IS TABLE OF employee_history.id%TYPE
   INDEX BY PLS_INTEGER;

l_failed   t_failed;
PROCEDURE remove_history
IS
BEGIN
   FORALL indx IN 1 .. l_failed.COUNT
      DELETE FROM employee_history
            WHERE id = l_failed (indx);
END remove_history;
PROCEDURE update_employees
IS
   l_err_idx    NUMBER;
   l_fail_idx   NUMBER;
BEGIN
   FORALL indx IN l_employees.FIRST .. l_employees.LAST SAVE EXCEPTIONS
      UPDATE employees
         SET salary = l_employees (indx).salary
       WHERE employee_id = l_employees (indx).employee_id;
EXCEPTION
   WHEN e_bulk_errors
   THEN
      l_fail_idx := 1;

      FOR indx IN 1 .. SQL%BULK_EXCEPTIONS.COUNT
      LOOP
         l_err_idx := SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX;
         plog.error (
            'Unable to update salary for employee '
            || l_employees (l_err_idx).employee_id
            || ' ERROR: '
            || SQL%BULK_EXCEPTIONS (indx).ERROR_CODE);
         l_failed (l_fail_idx) := l_employees (l_err_idx).history_id;
         l_fail_idx := l_fail_idx + 1;
      END LOOP;

      remove_history;
END update_employees;
Ronan
Yep, that looks good. Thanks, Ronan!
Hello Steven,

It looks to me that the last statement in the LiveSQL example is not entirely correct.
In the update_employees procedure, when handling the "bulk_errors" exception raised by "FORALL indx IN INDICES OF ...", the collection l_employees might be sparse. In that case, the value given by SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX represents the iteration number on which the error occurred, NOT the actual index into the bind collection. So it is not correct to use that value directly as an index to extract the employee_id to be deleted; instead, we should translate the iteration number into the corresponding bind collection index, as in the following:
PROCEDURE update_employees
IS
   l_index   PLS_INTEGER;

   FUNCTION bind_array_index_for (
      bind_array_in    IN employees_tt,
      error_index_in   IN PLS_INTEGER)
      RETURN PLS_INTEGER
   IS
      l_index   PLS_INTEGER := bind_array_in.FIRST;
   BEGIN
      /* Walk the (possibly sparse) collection, one step per iteration,
         to translate the iteration number into an index value. */
      FOR indx IN 1 .. error_index_in - 1
      LOOP
         l_index := bind_array_in.NEXT (l_index);
      END LOOP;

      RETURN l_index;
   END;
BEGIN
   FORALL indx IN INDICES OF l_employees SAVE EXCEPTIONS
      UPDATE employees
         SET salary = l_employees (indx).salary,
             hire_date = l_employees (indx).hire_date
       WHERE employee_id = l_employees (indx).employee_id;
EXCEPTION
   WHEN bulk_errors
   THEN
      FOR indx IN 1 .. SQL%BULK_EXCEPTIONS.COUNT
      LOOP
         l_index :=
            bind_array_index_for (l_employees,
                                  SQL%BULK_EXCEPTIONS (indx).ERROR_INDEX);

         remove_history_row (l_employees (l_index).employee_id);

         log_error (
            'Unable to update salary for employee '
            || l_employees (l_index).employee_id,
            SQL%BULK_EXCEPTIONS (indx).ERROR_CODE);
      END LOOP;
END update_employees;
Another variant would be to "extend" the inserted_rt record by adding to it the fields needed for the update (salary and hire_date) and then to use the dense collection l_inserted in both procedures adj_comp_for_arrays and update_employees, instead of the sparse collection l_employees.
In such a case, the bulk_errors exception handler code in update_employees is correct as it is now.
Thanks a lot & Best Regards,
Iudith
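For anyone who wants to try the variant Iudith describes, the extended record might look like this (a sketch; only the salary and hire_date fields are new):

TYPE inserted_rt IS RECORD
(
   id            employee_history.id%TYPE,
   employee_id   employee_history.employee_id%TYPE,
   salary        employees.salary%TYPE,
   hire_date     employees.hire_date%TYPE
);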
Hi Steven,
We have a requirement in PL/SQL to validate the records of a table. The fields that need to be checked for each particular check are specified in a metadata table.
For example:

Table_Name   Field_Name   Check      Column_Length   Primary_Key
Sales        Order        NOT_NULL   30              Y
Sales        Name         LENGTH     40              N
Sales        Amount       INTEGER    76              N
Product      ID           NOT_NULL   10              Y
Which of the two approaches below is better for checking the fields of the table, or would you suggest another approach?
1) Column-level check: create a dynamic query based on the input table name, open a cursor, and insert the bad records into an error table. For example:

select Order from Sales where Order is NULL;      -- returns all rows where the field is null
select Name from Sales where length(Name) <> 40;  -- returns all rows where the field does not have the required length

The only challenge is building the dynamic query to match the type of check required and the amount of data. In this method, if the table holds a huge amount of data, will there be a performance issue?
2) Row-by-row check: read each record row by row and check every field, i.e., a top-down approach.
Thanks,
Srikant
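(Purely as a rough sketch of what approach 1 might look like - the bad_records table and all names here are hypothetical, and identifiers coming from metadata should be validated, e.g. with DBMS_ASSERT, before being concatenated into the SQL:)

DECLARE
   -- In practice these values would be read from the metadata table.
   l_table    VARCHAR2 (128) := 'SALES';
   l_column   VARCHAR2 (128) := 'ORDER_NUM';
BEGIN
   /* Concatenate only validated identifiers; bind the rest. */
   EXECUTE IMMEDIATE
         'INSERT INTO bad_records (table_name, column_name, bad_rowid) '
      || 'SELECT :t, :c, ROWID FROM '
      || DBMS_ASSERT.simple_sql_name (l_table)
      || ' WHERE '
      || DBMS_ASSERT.simple_sql_name (l_column)
      || ' IS NULL'
      USING l_table, l_column;
END;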
Srikant, please ask your question at asktom.oracle.com. That way it and the answer can join our extensive Q&A database for others to benefit from later.