Skip to main content

PL/SQL 101: Less code is better - avoid unnecessarily complex algorithms

With programmers new to PL/SQL (and SQL), it is not uncommon to find that they overcomplicate things, writing more code than is necessary, and putting too much logic into PL/SQL.  That problem can then be compounded by accidentally getting the "right answer" based on inadequate testing and test data.

So it is always good to be reminded: 
Do as much work as you can in SQL, and then finish up in PL/SQL. 
Where "finish up" means do whatever is appropriate within the database, and then make that available to whatever language is being used to write the UI!

A recent quiz on the PL/SQL Challenge explored this topic, and I offer it below as a learning exercise via blog post.

Suppose I have a table and dataset as follows:

CREATE TABLE plch_animals
(
   animal_id     INTEGER PRIMARY KEY,
   animal_name   VARCHAR2 (100)
)
/

BEGIN
   INSERT INTO plch_animals (animal_id, animal_name)
        VALUES (1, 'Bonobo');

   INSERT INTO plch_animals (animal_id, animal_name)
        VALUES (2, 'Cockatoo');

   INSERT INTO plch_animals (animal_id, animal_name)
        VALUES (3, 'Spider');

   COMMIT;
END;
/

And I need to write a program that produces the following output:

Animals in Alphabetical Order
Bonobo
Cockatoo
Spider

Let's take a look at some ways to achieve this objective (some of them distinctly sub-optimal, holding off the best till last):

#1. Accidental Success Through Bad Test Data

DECLARE
   l_count   INTEGER;
   l_name    plch_animals.animal_name%TYPE;
BEGIN
   DBMS_OUTPUT.put_line ('Animals in Alphabetical Order');

   SELECT COUNT (*) INTO l_count FROM plch_animals;

   FOR indx IN 1 .. l_count
   LOOP
      SELECT animal_name
        INTO l_name
        FROM plch_animals
       WHERE animal_id = indx;

      DBMS_OUTPUT.put_line (l_name);
   END LOOP;
END;
/

You see the desired output, but only because the animal IDs happen to ascend in exactly the same sequence as an alphabetical ordering of the animal names.(see line in blue).

You certainly do not want to ever assume this will be the case with real data. Even if you look at the data today and can confirm that pattern. That's just for today. A warning sign that this code is problematic is the SELECT COUNT(*).

You should generally not need to do a separate query just to control the number of iterations of a loop. A cursor FOR loop will usually take care of that for you.

#2. Too Much SQL, Not Enough Data

DECLARE
   TYPE animal_ids_t IS TABLE OF plch_animals.animal_id%TYPE;

   l_animal_ids   animal_ids_t;
   l_name         plch_animals.animal_name%TYPE;
BEGIN
   DBMS_OUTPUT.put_line ('Animals in Alphabetical Order');

   SELECT animal_id
     BULK COLLECT INTO l_animal_ids
     FROM plch_animals;

   FOR indx IN 1 .. l_animal_ids.COUNT
   LOOP
      SELECT animal_name
        INTO l_name
        FROM plch_animals
       WHERE animal_id = indx;

      DBMS_OUTPUT.put_line (l_name);
   END LOOP;
END;
/

Oh my. This choice is similar to #1, in terms of the basic, flawed assumptions (the animal IDs happen to ascend in exactly the same sequence as an alphabetical ordering of the animal names).

But it's even worse because I fetch all the animal IDs via BULK COLLECT into an array, then when I loop through the array, I execute a single row fetch to get the name.

And then it's even worse worse because the only reason this works is that the ID values happen to align with names in alphabetical order. So the results "look" OK when the test is run with this pathetic set of test data, but in the real world? Ugh.

That's a waste of code, PGA memory and CPU cycles.

#3. Having Too Much Fun With Collections

DECLARE
   TYPE animal_ids_t IS TABLE OF plch_animals.animal_id%TYPE
      INDEX BY plch_animals.animal_name%TYPE;

   l_animal_ids   animal_ids_t;
   l_index        plch_animals.animal_name%TYPE;
   l_name         plch_animals.animal_name%TYPE;
BEGIN
   DBMS_OUTPUT.put_line ('Animals in Alphabetical Order');

   FOR rec IN (  SELECT *
                   FROM plch_animals
               ORDER BY animal_name DESC)
   LOOP
      l_animal_ids (rec.animal_name) := rec.animal_id;
   END LOOP;

   l_index := l_animal_ids.FIRST;

   WHILE l_index IS NOT NULL
   LOOP
      DBMS_OUTPUT.put_line (l_index);

      l_index := l_animal_ids.NEXT (l_index);
   END LOOP;
END;
/

I very much enjoy collections (PL/SQL's version of arrays) and use them all over my code. And I especially appreciate string-indexed associative arrays, like I use above. But, really, there is such a thing as too much.

I use a cursor FOR loop to grab the rows from the animals table, and load the ID into my array, using the name as the index value. Then I use a WHILE loop to iterate through that array, simply displaying the index value. I never even use the ID!

It gets the job done, and sure performance will be fine unless you are looking at a very large volume, but this code is downright bewildering.

#4. SQL Simple

SELECT animal_name FROM plch_animals
 ORDER BY animal_name

Doesn't get much simpler than that. And if you need it inside PL/SQL....

#5. PL/SQL Simple

BEGIN
   DBMS_OUTPUT.put_line ('Animals in Alphabetical Order');

   FOR rec IN (SELECT animal_name FROM plch_animals
                ORDER BY animal_name)
   LOOP
      DBMS_OUTPUT.put_line (rec.animal_name);
   END LOOP;
END;

Lessons Learned

1. Let SQL do the heavy-lifting, as much as possible (not that there was anything very "heavy" to lift in this exercise!)

2. Don't over-complicate matters.

3. Make sure your test data has enough volume and variety to truly exercise your algorithm.

4. If you find yourself thinking "Does it have to be this complicated?", almost certainly the answer is a resounding "No!" and you should take a step back, challenge your assumptions, and see how you can simplify your code.

And don't forget:

a. Follow me on Twitter: @sfonplsql
b. Subscribe to my YouTube channel: PracticallyPerfectPLSQL
c. Check out the PL/SQL home page: oracle.com/plsql

Comments

  1. And the converse: simplicity = beauty!

    ReplyDelete
  2. #4 is not quite correct, Header is missing from output ;-)

    ReplyDelete
  3. Right, Herald, I guess it should be something more like:

    SELECT animal_name "Animals in Alphabetical Order"
    FROM plch_animals
    ORDER BY animal_name

    ReplyDelete
  4. Hi Steven, this morning i am drinking coffee and reading this blog again and i think this statement should be extended to: Do as much work as you can in SQL, and then finish up in PL/SQL. => Do as much work as you can in less SQL statements as you can in your code, and then finish up in PL/SQL. Part with Java and external C I will skip :D Also English is not my native language so i believe this sentence can be more simplify. :)

    ReplyDelete
    Replies
    1. Excellent thoughts, Neven. What do you think of this:

      Do as much as you can in the smallest possible number of SQL statements, then wrap those in a PL/SQL API.

      What do you think of that?

      Delete
    2. That is it :) simple and clean as code should be

      Delete
  5. For real world scenarios, I often prefere the number 3) example. It separates the data fetch process from the data manipulation/interpretation part. As soon as the logic becomes more complex, this is a pattern that helps tremendously to write good modularized code.

    ReplyDelete
    Replies
    1. Very interesting, Sven. I'd love to hear what others think of this.

      Delete
    2. My first gut reaction was seeing the "As soon as" and thinking "YAGNI".

      My deeper reaction is that this really depends a lot on whether you subscribe to Tom Kyte's long-standing advice to do as much as possible in SQL. I've usually found that this means the PL/SQL-required part is actually quite small, and putting it inside the cursor loop makes eminent sense.

      If you lean more toward doing the heavy lifting in PL/SQL and use the query as just a way to get data out of tables, this start to look more appealing.

      Delete
  6. And now we come to old topic, we all agree that this is best approach, less, simple etc. But many times I believe PL/SQL - SQL developers come to problem as i do. I cant write it in one SQL statement i need two. I am not sure will i use this SQL statements individually in future as part of some others stored procedures. Anyway should i separate them in two different stored procedure or functions or keep them in one stored procedure or function?
    Questions is, should we wrap all sql statements individually in stored procedures or functions?

    ReplyDelete
  7. Lets make SQL & PLSQL great again

    ReplyDelete
    Replies
    1. Again?! Nah, SQL & PLSQL are already great :)

      Delete

Post a Comment

Popular posts from this blog

Running out of PGA memory with MULTISET ops? Watch out for DISTINCT!

A PL/SQL team inside Oracle made excellent use of nested tables and MULTISET operators in SQL, blending data in tables with procedurally-generated datasets (nested tables).  All was going well when they hit the dreaded: ORA-04030: out of process memory when trying to allocate 2032 bytes  They asked for my help.  The error occurred on this SELECT: SELECT  *    FROM header_tab trx    WHERE (generated_ntab1 SUBMULTISET OF trx.column_ntab)       AND ((trx.column_ntab MULTISET             EXCEPT DISTINCT generated_ntab2) IS EMPTY) The problem is clearly related to the use of those nested tables. Now, there was clearly sufficient PGA for the nested tables themselves. So the problem was in executing the MULTISET-related functionality. We talked for a bit about dropping the use of nested tables and instead doing everything in SQL, to avoid the PGA error. That would, however require lots of work, revamping algorithms, ensuring correctness, you know the score. Then my eyes snagge

How to Pick the Limit for BULK COLLECT

This question rolled into my In Box today: In the case of using the LIMIT clause of BULK COLLECT, how do we decide what value to use for the limit? First I give the quick answer, then I provide support for that answer Quick Answer Start with 100. That's the default (and only) setting for cursor FOR loop optimizations. It offers a sweet spot of improved performance over row-by-row and not-too-much PGA memory consumption. Test to see if that's fast enough (likely will be for many cases). If not, try higher values until you reach the performance level you need - and you are not consuming too much PGA memory.  Don't hard-code the limit value: make it a parameter to your subprogram or a constant in a package specification. Don't put anything in the collection you don't need. [from Giulio Dottorini] Remember: each session that runs this code will use that amount of memory. Background When you use BULK COLLECT, you retrieve more than row with each fetch,

Quick Guide to User-Defined Types in Oracle PL/SQL

A Twitter follower recently asked for more information on user-defined types in the PL/SQL language, and I figured the best way to answer is to offer up this blog post. PL/SQL is a strongly-typed language . Before you can work with a variable or constant, it must be declared with a type (yes, PL/SQL also supports lots of implicit conversions from one type to another, but still, everything must be declared with a type). PL/SQL offers a wide array of pre-defined data types , both in the language natively (such as VARCHAR2, PLS_INTEGER, BOOLEAN, etc.) and in a variety of supplied packages (e.g., the NUMBER_TABLE collection type in the DBMS_SQL package). Data types in PL/SQL can be scalars, such as strings and numbers, or composite (consisting of one or more scalars), such as record types, collection types and object types. You can't really declare your own "user-defined" scalars, though you can define subtypes  from those scalars, which can be very helpful from the p