Skip to main content

Best Type of Collection for FORALL?

I recently received this question in my In Box:

Is FORALL faster with Associative Arrays or Nested Tables? Oracle 12.2 documentation says: "The most efficient way to pass collections to and from the database server is to use associative arrays with the FORALL statement or BULK COLLECT clause." And a blog post claims Associative Arrays with "indices of" option is fastest in 10.2. Just wondering if you have noticed any differences and if so, how much faster Associative Arrays are in 12.2 than Nested Tables?

Quick Answer

  1. There is no significant difference I can see in the performance based on different collection types (with the limited tests I have run).
  2. Don't trust performance tests run on very old versions of Oracle Database (e.g,, 10.2).
  3. Use the documentation as a starting, not ending, point of your exploration.
  4. Try it yourself! Writing and running the code will teach you more than reading the doc or my blog post.
If you would like to read further, I will show you the code I used in my tests.

You can run these yourself on LiveSQL. I encourage you to build your own tests and post them on LiveSQL as well. I will be happy to add links in this blog post if you send them to me.

I create a table into which I will insert rows. I then create a package to help me calculate elapsed time of my test code.
CREATE TABLE parts
(
   partnum    NUMBER,
   partname   VARCHAR2 (15)
)
/

CREATE OR REPLACE PACKAGE tmr
IS
   PROCEDURE start_timer;

   PROCEDURE show_elapsed_time (message_in IN VARCHAR2);
END;
/

CREATE OR REPLACE PACKAGE BODY tmr
IS
   l_start   INTEGER;

   PROCEDURE start_timer
   IS
   BEGIN
      l_start := DBMS_UTILITY.get_cpu_time;
   END start_timer;

   PROCEDURE show_elapsed_time (message_in IN VARCHAR2)
   IS
   BEGIN
      DBMS_OUTPUT.put_line (
            CASE
               WHEN message_in IS NULL THEN 'Completed in:'
               ELSE '"' || message_in || '" completed in: '
            END
         || (DBMS_UTILITY.get_cpu_time - l_start)
         || ' cs');

      /* Reset timer */
      start_timer;
   END show_elapsed_time;
END;
/

With Associative Arrays

Otherwise known as "index by tables" because of their use of the INDEX BY clause, these collections have been around in PL/SQL since 7.3, greatly enhanced over the years to offer string indexing and other fun stuff.

In the block below, I populate two collections, once with the part IDs and the other with the part names. I then use FORALL to insert them into the table. I use the tmr subprograms to set the starting time, and then display the number of hundredths of seconds (centi-seconds) that elapsed to do the FORALL inserts.
DECLARE
   PROCEDURE compare_inserting (num IN INTEGER)
   IS
      TYPE numtab IS TABLE OF parts.partnum%TYPE
         INDEX BY PLS_INTEGER;

      TYPE nametab IS TABLE OF parts.partname%TYPE
         INDEX BY PLS_INTEGER;

      pnums       numtab;
      pnames      nametab;
   BEGIN
      FOR indx IN 1 .. num
      LOOP
         pnums (indx) := indx;
         pnames (indx) := 'Part ' || TO_CHAR (indx);
      END LOOP;

      tmr.start_timer;

      FORALL indx IN 1 .. num
         INSERT INTO parts
              VALUES (pnums (indx), pnames (indx));

      tmr.show_elapsed_time ('Inserted ' || num || ' rows');

      ROLLBACK;
   END;
BEGIN
   DBMS_OUTPUT.put_line ('Associative Arrays');
   compare_inserting (1000000);
END;

With Nested Tables

The nested table type was added way back in Oracle8, as part of the object-relational model. The definition of a nested table type looks just like the associative array, without the INDEX BY clause. That's because, officially, nested tables have no index, no order to their elements - they are multisets. But in practice, you can use the index value, which we certainly do in the FORALL statement.

With nested tables, you must initialize the collection using its constructor function, and explicitly extend the collection to make room for new elements.
DECLARE
   PROCEDURE compare_inserting (num IN INTEGER)
   IS
      TYPE numtab IS TABLE OF parts.partnum%TYPE;

      TYPE nametab IS TABLE OF parts.partname%TYPE;

      pnums       numtab := numtab();
      pnames      nametab := nametab();
   BEGIN
      pnums.extend (num);
      pnames.extend (num);
      
      FOR indx IN 1 .. num
      LOOP
         pnums (indx) := indx;
         pnames (indx) := 'Part ' || TO_CHAR (indx);
      END LOOP;

      tmr.start_timer;

      FORALL indx IN 1 .. num
         INSERT INTO parts
              VALUES (pnums (indx), pnames (indx));

      tmr.show_elapsed_time ('Inserted ' || num || ' rows');

      ROLLBACK;
   END;
BEGIN
   DBMS_OUTPUT.put_line ('Nested Table');
   compare_inserting (1000000);
END;

With Varrays

Finally, varrays - a shortening of "varying arrays." Which is kind of funny, because varrays are the least varying of all the collection types. With varrays, you have to declare the maximum number of elements allowed in the collection up front.

With varrays, like nested tables, you must initialize the collection using its constructor function, and explicitly extend the collection to make room for new elements.
DECLARE
   PROCEDURE compare_inserting (num IN INTEGER)
   IS
      TYPE numtab IS VARRAY (1000000) OF parts.partnum%TYPE;

      TYPE nametab IS VARRAY (1000000) OF parts.partname%TYPE;

      pnums       numtab := numtab();
      pnames      nametab := nametab();
   BEGIN
      pnums.extend (num);
      pnames.extend (num);
      
      FOR indx IN 1 .. num
      LOOP
         pnums (indx) := indx;
         pnames (indx) := 'Part ' || TO_CHAR (indx);
      END LOOP;

      tmr.start_timer;

      FORALL indx IN 1 .. num
         INSERT INTO parts
              VALUES (pnums (indx), pnames (indx));

      tmr.show_elapsed_time ('Inserted ' || num || ' rows');

      ROLLBACK;
   END;
BEGIN
   DBMS_OUTPUT.put_line ('Varrays');
   compare_inserting (1000000);
END;

Results

I ran each of the blocks three times, with the results you see below. I conclude that there is no significant difference in performance based on the type of collection you choose.
Associative Arrays
"Inserted 1000000 rows" completed in: 84 cs
"Inserted 1000000 rows" completed in: 78 cs
"Inserted 1000000 rows" completed in: 77 cs

Nested Table
"Inserted 1000000 rows" completed in: 78 cs
"Inserted 1000000 rows" completed in: 76 cs
"Inserted 1000000 rows" completed in: 77 cs

Varrays
"Inserted 1000000 rows" completed in: 81 cs
"Inserted 1000000 rows" completed in: 77 cs
"Inserted 1000000 rows" completed in: 77 cs
Now, obviously, this is not a comprehensive test. Will things go more slowly with nested tables for updates instead of inserts? Are associative arrays faster with small sets of data and nested tables better with super-duper large datasets? But it seems rather unlikely that these factors would affect overall performance, so I am not going to build scripts for those.

I do, however, encourage each of you to try this yourself. Feel free to download and modify my LiveSQL script. Test some variations, let me know what you discover.

In the meantime, I suggest that you pick the collection type that allows you to most easily build and manage the collections you will feed into the FORALL statement. Associative arrays are particularly good with sparse collections (not every index value filled between first and last). Nested tables offer lots of features (not connected to FORALL) for set-oriented management of collection contents. Varrays: well, honestly, I doubt you'll ever use a array. It's intended more for usage as nested columns in relational tables, offering some performance advantages there.

Comments

Popular posts from this blog

Get rid of mutating table trigger errors with the compound trigger

When something mutates, it is changing. Something that is changing is hard to analyze and to quantify. A mutating table error (ORA-04091) occurs when a row-level trigger tries to examine or change a table that is already undergoing change (via an INSERT, UPDATE, or DELETE statement). In particular, this error occurs when a row-level trigger attempts to read or write the table from which the trigger was fired. Fortunately, the same restriction does not apply in statement-level triggers.

In this post, I demonstrate the kind of scenario that will result in an ORA-04091 errors. I then show the "traditional" solution, using a collection defined in a package. Then I demonstrate how to use the compound trigger, added in Oracle Database 11g Release1,  to solve the problem much more simply.

All the code shown in this example may be found in this LiveSQL script.

How to Get a Mutating Table Error

I need to implement this rule on my employees table:
Your new salary cannot be more than 25x th…

Table Functions, Part 1: Introduction and Exploration

August 2018 update: please do feel encourage to read this and my other posts on table functions, but you will learn much more about table functions by taking my Get Started with PL/SQL Table Functions class at the Oracle Dev Gym. Videos, tutorials and quizzes - then print a certificate when you are done!


Table functions - functions that can be called in the FROM clause of a query from inside the TABLE operator - are fascinating and incredibly helpful constructs.

So I've decided to write a series of blog posts on them: how to build them, how to use them, issues you might run into.

Of course, I am not the first to do so. I encourage to check out the documentation, as well as excellent posts from Adrian Billington (search for "table functions") and Tim Hall. Adrian and Tim mostly focus on pipelined table functions, a specialized variant of table functions designed to improve performance and reduce PGA consumption. I will take a look at pipelined table functions in the latte…

Quick Guide to User-Defined Types in Oracle PL/SQL

A Twitter follower recently asked for more information on user-defined types in the PL/SQL language, and I figured the best way to answer is to offer up this blog post.

PL/SQL is a strongly-typed language. Before you can work with a variable or constant, it must be declared with a type (yes, PL/SQL also supports lots of implicit conversions from one type to another, but still, everything must be declared with a type).

PL/SQL offers a wide array of pre-defined data types, both in the language natively (such as VARCHAR2, PLS_INTEGER, BOOLEAN, etc.) and in a variety of supplied packages (e.g., the NUMBER_TABLE collection type in the DBMS_SQL package).

Data types in PL/SQL can be scalars, such as strings and numbers, or composite (consisting of one or more scalars), such as record types, collection types and object types.

You can't really declare your own "user-defined" scalars, though you can define subtypes from those scalars, which can be very helpful from the perspective…