Skip to main content

PL/SQL 101: Less code is better - avoid unnecessarily complex algorithms

With programmers new to PL/SQL (and SQL), it is not uncommon to find that they overcomplicate things, writing more code than is necessary, and putting too much logic into PL/SQL.  That problem can then be compounded by accidentally getting the "right answer" based on inadequate testing and test data.

So it is always good to be reminded: 
Do as much work as you can in SQL, and then finish up in PL/SQL. 
Where "finish up" means do whatever is appropriate within the database, and then make that available to whatever language is being used to write the UI!

A recent quiz on the PL/SQL Challenge explored this topic, and I offer it below as a learning exercise via blog post.

Suppose I have a table and dataset as follows:

CREATE TABLE plch_animals
(
   animal_id     INTEGER PRIMARY KEY,
   animal_name   VARCHAR2 (100)
)
/

BEGIN
   INSERT INTO plch_animals (animal_id, animal_name)
        VALUES (1, 'Bonobo');

   INSERT INTO plch_animals (animal_id, animal_name)
        VALUES (2, 'Cockatoo');

   INSERT INTO plch_animals (animal_id, animal_name)
        VALUES (3, 'Spider');

   COMMIT;
END;
/

And I need to write a program that produces the following output:

Animals in Alphabetical Order
Bonobo
Cockatoo
Spider

Let's take a look at some ways to achieve this objective (some of them distinctly sub-optimal, holding off the best till last):

#1. Accidental Success Through Bad Test Data

DECLARE
   l_count   INTEGER;
   l_name    plch_animals.animal_name%TYPE;
BEGIN
   DBMS_OUTPUT.put_line ('Animals in Alphabetical Order');

   SELECT COUNT (*) INTO l_count FROM plch_animals;

   FOR indx IN 1 .. l_count
   LOOP
      SELECT animal_name
        INTO l_name
        FROM plch_animals
       WHERE animal_id = indx;

      DBMS_OUTPUT.put_line (l_name);
   END LOOP;
END;
/

You see the desired output, but only because the animal IDs happen to ascend in exactly the same sequence as an alphabetical ordering of the animal names.(see line in blue).

You certainly do not want to ever assume this will be the case with real data. Even if you look at the data today and can confirm that pattern. That's just for today. A warning sign that this code is problematic is the SELECT COUNT(*).

You should generally not need to do a separate query just to control the number of iterations of a loop. A cursor FOR loop will usually take care of that for you.

#2. Too Much SQL, Not Enough Data

DECLARE
   TYPE animal_ids_t IS TABLE OF plch_animals.animal_id%TYPE;

   l_animal_ids   animal_ids_t;
   l_name         plch_animals.animal_name%TYPE;
BEGIN
   DBMS_OUTPUT.put_line ('Animals in Alphabetical Order');

   SELECT animal_id
     BULK COLLECT INTO l_animal_ids
     FROM plch_animals;

   FOR indx IN 1 .. l_animal_ids.COUNT
   LOOP
      SELECT animal_name
        INTO l_name
        FROM plch_animals
       WHERE animal_id = indx;

      DBMS_OUTPUT.put_line (l_name);
   END LOOP;
END;
/

Oh my. This choice is similar to #1, in terms of the basic, flawed assumptions (the animal IDs happen to ascend in exactly the same sequence as an alphabetical ordering of the animal names).

But it's even worse because I fetch all the animal IDs via BULK COLLECT into an array, then when I loop through the array, I execute a single row fetch to get the name.

And then it's even worse worse because the only reason this works is that the ID values happen to align with names in alphabetical order. So the results "look" OK when the test is run with this pathetic set of test data, but in the real world? Ugh.

That's a waste of code, PGA memory and CPU cycles.

#3. Having Too Much Fun With Collections

DECLARE
   TYPE animal_ids_t IS TABLE OF plch_animals.animal_id%TYPE
      INDEX BY plch_animals.animal_name%TYPE;

   l_animal_ids   animal_ids_t;
   l_index        plch_animals.animal_name%TYPE;
   l_name         plch_animals.animal_name%TYPE;
BEGIN
   DBMS_OUTPUT.put_line ('Animals in Alphabetical Order');

   FOR rec IN (  SELECT *
                   FROM plch_animals
               ORDER BY animal_name DESC)
   LOOP
      l_animal_ids (rec.animal_name) := rec.animal_id;
   END LOOP;

   l_index := l_animal_ids.FIRST;

   WHILE l_index IS NOT NULL
   LOOP
      DBMS_OUTPUT.put_line (l_index);

      l_index := l_animal_ids.NEXT (l_index);
   END LOOP;
END;
/

I very much enjoy collections (PL/SQL's version of arrays) and use them all over my code. And I especially appreciate string-indexed associative arrays, like I use above. But, really, there is such a thing as too much.

I use a cursor FOR loop to grab the rows from the animals table, and load the ID into my array, using the name as the index value. Then I use a WHILE loop to iterate through that array, simply displaying the index value. I never even use the ID!

It gets the job done, and sure performance will be fine unless you are looking at a very large volume, but this code is downright bewildering.

#4. SQL Simple

SELECT animal_name FROM plch_animals
 ORDER BY animal_name

Doesn't get much simpler than that. And if you need it inside PL/SQL....

#5. PL/SQL Simple

BEGIN
   DBMS_OUTPUT.put_line ('Animals in Alphabetical Order');

   FOR rec IN (SELECT animal_name FROM plch_animals
                ORDER BY animal_name)
   LOOP
      DBMS_OUTPUT.put_line (rec.animal_name);
   END LOOP;
END;

Lessons Learned

1. Let SQL do the heavy-lifting, as much as possible (not that there was anything very "heavy" to lift in this exercise!)

2. Don't over-complicate matters.

3. Make sure your test data has enough volume and variety to truly exercise your algorithm.

4. If you find yourself thinking "Does it have to be this complicated?", almost certainly the answer is a resounding "No!" and you should take a step back, challenge your assumptions, and see how you can simplify your code.

And don't forget:

a. Follow me on Twitter: @sfonplsql
b. Subscribe to my YouTube channel: PracticallyPerfectPLSQL
c. Check out the PL/SQL home page: oracle.com/plsql

Comments

  1. And the converse: simplicity = beauty!

    ReplyDelete
  2. #4 is not quite correct, Header is missing from output ;-)

    ReplyDelete
  3. Right, Herald, I guess it should be something more like:

    SELECT animal_name "Animals in Alphabetical Order"
    FROM plch_animals
    ORDER BY animal_name

    ReplyDelete
  4. Hi Steven, this morning i am drinking coffee and reading this blog again and i think this statement should be extended to: Do as much work as you can in SQL, and then finish up in PL/SQL. => Do as much work as you can in less SQL statements as you can in your code, and then finish up in PL/SQL. Part with Java and external C I will skip :D Also English is not my native language so i believe this sentence can be more simplify. :)

    ReplyDelete
    Replies
    1. Excellent thoughts, Neven. What do you think of this:

      Do as much as you can in the smallest possible number of SQL statements, then wrap those in a PL/SQL API.

      What do you think of that?

      Delete
    2. That is it :) simple and clean as code should be

      Delete
  5. For real world scenarios, I often prefere the number 3) example. It separates the data fetch process from the data manipulation/interpretation part. As soon as the logic becomes more complex, this is a pattern that helps tremendously to write good modularized code.

    ReplyDelete
    Replies
    1. Very interesting, Sven. I'd love to hear what others think of this.

      Delete
    2. My first gut reaction was seeing the "As soon as" and thinking "YAGNI".

      My deeper reaction is that this really depends a lot on whether you subscribe to Tom Kyte's long-standing advice to do as much as possible in SQL. I've usually found that this means the PL/SQL-required part is actually quite small, and putting it inside the cursor loop makes eminent sense.

      If you lean more toward doing the heavy lifting in PL/SQL and use the query as just a way to get data out of tables, this start to look more appealing.

      Delete
  6. And now we come to old topic, we all agree that this is best approach, less, simple etc. But many times I believe PL/SQL - SQL developers come to problem as i do. I cant write it in one SQL statement i need two. I am not sure will i use this SQL statements individually in future as part of some others stored procedures. Anyway should i separate them in two different stored procedure or functions or keep them in one stored procedure or function?
    Questions is, should we wrap all sql statements individually in stored procedures or functions?

    ReplyDelete
  7. Lets make SQL & PLSQL great again

    ReplyDelete
    Replies
    1. Again?! Nah, SQL & PLSQL are already great :)

      Delete

Post a Comment

Popular posts from this blog

Quick Guide to User-Defined Types in Oracle PL/SQL

A Twitter follower recently asked for more information on user-defined types in the PL/SQL language, and I figured the best way to answer is to offer up this blog post. PL/SQL is a strongly-typed language . Before you can work with a variable or constant, it must be declared with a type (yes, PL/SQL also supports lots of implicit conversions from one type to another, but still, everything must be declared with a type). PL/SQL offers a wide array of pre-defined data types , both in the language natively (such as VARCHAR2, PLS_INTEGER, BOOLEAN, etc.) and in a variety of supplied packages (e.g., the NUMBER_TABLE collection type in the DBMS_SQL package). Data types in PL/SQL can be scalars, such as strings and numbers, or composite (consisting of one or more scalars), such as record types, collection types and object types. You can't really declare your own "user-defined" scalars, though you can define subtypes  from those scalars, which can be very helpful from the p

The differences between deterministic and result cache features

 EVERY once in a while, a developer gets in touch with a question like this: I am confused about the exact difference between deterministic and result_cache. Do they have different application use cases? I have used deterministic feature in many functions which retrieve data from some lookup tables. Is it essential to replace these 'deterministic' key words with 'result_cache'?  So I thought I'd write a post about the differences between these two features. But first, let's make sure we all understand what it means for a function to be  deterministic. From Wikipedia : In computer science, a deterministic algorithm is an algorithm which, given a particular input, will always produce the same output, with the underlying machine always passing through the same sequence of states.  Another way of putting this is that a deterministic subprogram (procedure or function) has no side-effects. If you pass a certain set of arguments for the parameters, you will always get

My two favorite APEX 5 features: Regional Display Selector and Cards

We (the over-sized development team for the PL/SQL Challenge - myself and my son, Eli) have been busy creating a new website on top of the PLCH platform (tables and packages): The Oracle Dev Gym! In a few short months (and just a part time involvement by yours truly), we have leveraged Oracle Application Express 5 to create what I think is an elegant, easy-to-use site that our users will absolutely love.  We plan to initially make the Dev Gym available only for current users of PL/SQL Challenge, so we can get feedback from our loyal user base. We will make the necessary adjustments and then offer it for general availability later this year. Anyway, more on that as the date approaches (the date being June 27, the APEX Open Mic Night at Kscope16 , where I will present it to a packed room of APEX experts). What I want to talk about today are two features of APEX that are making me so happy these days: Regional Display Selector and Cards. Regional Display Sel