Skip to main content

Declarative PL/SQL

SQL is a set-oriented, declarative language. A language or statement is declarative if it describes the computation to be performed (or in SQL, the set of data to be retrieved/modified) without specifying how to compute it.

PL/SQL is a procedural language, tightly integrated with SQL, and intended primarily to allow us to build powerful, secure APIs to underlying data (via SQL).

If I try hard, I can maximize the procedural side of PL/SQL and minimize the declarative aspect of SQL (primarily by ignoring or discounting the set-orientation of SQL). That's generally a bad idea. Instead, we Oracle Database developers make the most of the many powerful features of SQL (think: analytic functions, pattern matching, joins, etc.), and minimize processing in PL/SQL.

What we should also do, though, is recognize the make the most of the declarative features of PL/SQL. There are two big reasons to do this:

1. When you don't tell the PL/SQL compiler how to do things, the optimizer has more freedom to improve performance.

2. You can write less code, improving your productivity and reduce the cost of maintaining your code in the future.

Here are a few of my favorite declarative constructs of PL/SQL

The Cursor FOR Loop

Definitely the best showcase for the benefits of declarative programming in PL/SQL.

I've identified a set of rows and columns you need to do something with (in the example below, simply display last name of all employees in department 100 whose salary > 5000).

I could do it the hard, procedural way:
DECLARE
   CURSOR emps_cur
   IS
      SELECT * FROM hr.employees
       ORDER BY last_name;

   l_emp   emps_cur%ROWTYPE;
BEGIN
   OPEN emps_cur;

   LOOP
      FETCH emps_cur INTO l_emp;

      EXIT WHEN emps_cur%NOTFOUND;

      IF l_emp.department_id = 100 AND l_emp.salary > 5000
      THEN
         DBMS_OUTPUT.put_line (l_emp.last_name);
      END IF;
   END LOOP;

   CLOSE emps_cur;
END;
There. Job done. But...really? I am going to fetch all the rows in the employees table and then check the department ID and salary? No, no, no! That should be done in SQL. So it should at least look like this:
DECLARE
   CURSOR emps_cur
   IS
      SELECT * FROM hr.employees
       WHERE department_id = 100 AND salary > 5000
       ORDER BY last_name;

   l_emp   emps_cur%ROWTYPE;
BEGIN
   OPEN emps_cur;

   LOOP
      FETCH emps_cur INTO l_emp;

      EXIT WHEN emps_cur%NOTFOUND;

      DBMS_OUTPUT.put_line (l_emp.last_name);
   END LOOP;

   CLOSE emps_cur;
END;
Now at least we are using a little of the power of SQL, and thereby minimizing the number of rows brought needlessly back from the context switch to the SQL engine.

I could, however, make things much simpler - and faster - with a cursor FOR loop:
BEGIN
   FOR l_emp IN (
      SELECT last_name FROM hr.employees
       WHERE department_id = 100 AND salary > 5000
       ORDER BY last_name)
   LOOP
      DBMS_OUTPUT.put_line (l_emp.last_name);
   END LOOP;
END;
In this third iteration, I have stepped waaaay back from writing an algorithm (declare cursor, open cursor, fetch next row, stop when no more rows, display data, back to fetch), and instead have told the compiler, in effect:
Please display the last name for all rows identified by that query.
I don't tell it how to get the job done. I let the compiler figure out the best execution path. No need to open a cursor, fetch, check to see if done, close the cursor, etc.

And boy does the compiler figure out a better execution path! Since I no longer explicitly fetch on a row-by-row basis, the PL/SQL optimizer is free to choose a different approach, and it does (with the PL/SQL optimization level set to 2 or higher): it retrieves 100 rows with each "bulk" fetch, resulting in many fewer context switches and much better performance.

Plus fewer lines of code, which will be greatly appreciated by the developers who maintain your code in years to come.

Nested Table Operators

Nested tables are just one of the three different types of collections in PL/SQL (the others are associative arrays and arrays). But there are a whole boatload of set-oriented features available only to nested tables. That's because they were designed from the start to be multisets, like relational tables. We've got....
  • MULTISET UNION - similar to SQL UNION
  • MULTISET EXCEPT - similar to SQL MINUS
  • MULTISET INTERSECT - similar to SQL INTERSECT
  • SET - Removes duplicates from a nested table (multisets can have duplicates)
  • SUBMULTISET - Returns TRUE if one nested table is entirely contained within another
  • MEMBER OF - Is a value an element of (in) the nested table?
  • = - Um, do I need to explain what this does?
You could implement the functionality embedded in each one of these, and you might even have some fun doing it. But almost certainly your code would have bugs, or would be lots slower, or would be...lots of code.

In a way, my favorite of all these is "=". It's the clearest demonstration of the power of declarative programming in this section. Suppose I define two nested tables as follows (that's right: a nested table of nested tables!).
CREATE OR REPLACE TYPE numbers_t IS TABLE OF NUMBER
/

CREATE OR REPLACE TYPE nt_of_numbers_t IS TABLE OF numbers_t
/

CREATE OR REPLACE PACKAGE nts AUTHID DEFINER
IS
   n1 nt_of_numbers_t 
      := nt_of_numbers_t (numbers_t (1, 2, 3), numbers_t (4, 5, 6)); 
   n2 nt_of_numbers_t 
      := nt_of_numbers_t (numbers_t (4, 5, 6), numbers_t (1, 2, 3)); 
END;
/
Now I want to know if n1 and n2 are equal (that is, they contain the same elements - and order is not significant), I could write something like this:
DECLARE 
   l_equals   BOOLEAN := TRUE; 
BEGIN 
   FOR indx IN 1 .. nts.n1.COUNT 
   LOOP 
      FOR indx2 IN 1 .. nts.n1 (indx).COUNT
      LOOP
         l_equals := 
            nts.n1 (indx)(indx2) = nts.n2 (indx)(indx2); 
         EXIT WHEN NOT l_equals; 
      END LOOP;
   END LOOP;

   DBMS_OUTPUT.put_line (
      CASE WHEN l_equals THEN '=' ELSE '<>' END); 
END; 
/
That's already bad enough - but then if you factor in the logic you need to write to ensure that order is not significant....OMG. Sure, we can write that stuff. We all took classes on algorithms in university (or some of us, anyway. My university education in computer science was actually very skimpy). We all know how to type s-t-a-c-k-o-v... in Google.

So, yeah, we could muscle our way through it. But why? Instead I could write nothing more than:
BEGIN  
   DBMS_OUTPUT.put_line (
      CASE WHEN nts.n1 = nts.n2 THEN '=' ELSE '<>' END); 
END; 
/
Ah, so nice.

And again, in addition to the simplicity of the code, you make it possible for your code to get faster over time, as the very smart folks over at Oracle HQ continually work to optimize performance of built-in elements of the PL/SQL language.

Here are some links to LiveSQL scripts demonstrating these features:

MULTISET Union Examples
MULTISET Intersect Examples
MULTISET Except Examples
SUBMULTISET Demonstation

Anchored Declarations

OK, I confess, that using anchored declarations doesn't always result in less code. Sometimes you will type a few more characters. But over time you are sure to save many keystrokes, when it comes to not having change your code in the future - and debug it today.

What, you may ask, is an anchored declaration? When you anchor or connect the datatype for your variable or constant declaration back to another, previously-defined element.

There are two forms of anchored declarations, %TYPE and %ROWTYPE:
name%TYPE
name%ROWTYPE
You can anchor back to another PL/SQL variable or constant, but the real value of this syntax comes from the ability to anchor back to a column or table. PL/SQL is, after all, a database programming language. So it should be hyper-aware of and able to take advantage of stuff that's in the database.

Suppose I want to declare a variable that "looks like" the last_name column in the employees table. I could look up the definition of the table, see that last_name is VARCHAR2(100), and then write this code:
PROCEDURE do_stuff
IS
   l_last_name VARCHAR2(100);
Fine. They are in synch. Well, only in your mind. When the DBA or another developer comes along later and issues this DDL statement:
ALTER TABLE employees MODIFY last_name VARCHAR2(200)
your code and your table are now officially out of synch. And if you select a really long last name from the table into that variable, kaboom! Your program fails with a VALUE_ERROR exception.

That's no good. Instead, declare your variable as follows
PROCEDURE do_stuff 
IS
   l_last_name employees.last_name%TYPE;
Now you have declaratively linked your variable to this database structure. And, lo and behold, if the definition of this column changes, Oracle Database will automatically invalidate the DO_STUFF procedure. When it is recompiled, it will pick up the new definition of last_name and your code will be fully in synch with your table.

%TYPE's perfect for individual columns. Use %ROWTYPE when you want to declare a record based on a table, view or cursor (essentially what the cursor FOR loop does for you implicitly).
PROCEDURE do_stuff 
IS
   l_employee employees%ROWTYPE;
Bottom line: you tell the PL/SQL compiler "Declare this variable to be like that other thing." and you are done. Oracle Database automatically tracks dependencies and ensures that your code always matches the state of your database objects.

How cool is that?

What Else?

Do you have a suggestion for another handy-dandy declarative construct? Let me know and I can add it the post (with your name up in lights!).

Comments

  1. One caution I have for those who want to use anchored declarations is to be careful in putting these in store procedure parameters or in package specs. The reason is the following. If execute on the pl/sql code is granted to another user, that user also needs to have select to the source object of the %TYPE and %ROWTYPE. This may not be what you want so think about who is going to consume your pl/sql.

    ReplyDelete
  2. Anchoring your variables as described is a needless obfuscation. Just make all your character variables LONG (ie; VARCHAR2(32767)), and your variables will never explode anyway (except when they reach the max. length of 32K).

    ReplyDelete
    Replies
    1. Again with the 32K varchar2s! :-)

      Well, Martin, I've tweeted your recommendation to see if anyone has any thoughts to share - I'd like to get as broad a perspective as I can, since you've raised this before.

      I don't like that idea. I don't like hard-coding a limit that could grow over time. I don't like writing code that will have developers shaking their heads and wondering why I did that.

      And I don't like green eggs with ham.

      Hopefully some other developers will chime in.

      Delete
    2. > I've tweeted your recommendation to see if anyone has any thoughts to share

      I shall resist the temptation to response myself. ;-)


      Honestly, it's so much easier. You never have to worry about exceeding the length, and the word LONG stands out to the eye much better than a %TYPE does. You don't have worry about the correct spelling of the table/column name either.

      You say you "don't like writing code that will have developers shaking their heads..." but why would they do that? I find having to decypher the %TYPE in my head much harder than reading the 4-letter word LONG -- and yet they equate (functionally) to the same thing.

      Granted, it would be much better/meaningful if you could add an unbounded definition of the VARCHAR2 datatype into the package STANDARD instead of having to use the word LONG. (Or perhaps since VARCHAR (without the 2) is now never used, use it for that instead).


      > And I don't like green eggs with ham. Hopefully some other developers will chime in.

      If you're eating green eggs & ham, then it's def. chyme, not chime. ;-)

      Delete
    3. From the doc: "Oracle supports the LONG and LONG RAW data types only for backward compatibility with existing applications."

      Delete
    4. A perfect reason to introduce a new unbounded VARCHAR datatype. ;-)

      Delete
  3. A week later, how did you get on....?

    ReplyDelete
    Replies
    1. Thanks for asking, Martin. I assume you are wondering what kind of response I got re: your recommendation. Here you go: https://twitter.com/sfonplsql/status/1058067569335304194

      Delete
    2. I can't look at that while at work, but I wondered if I had convinced you that using %TYPE was not as useful as using simple datatypes?

      Delete
    3. "not as useful as" - not quite sure how to respond to that phrasing.

      Seems to me that the usage of %TYPE has its place, and there are also scenarios where it might be better off to not use it.

      Bottom line: if you want to explicitly and in a self-documenting way tie your variable/parameter/etc to an underlying table or column, the %TYPE and %ROWTYPE attributes are great.

      Delete
    4. > Bottom line: if you want to explicitly and in a self-documenting way tie your variable/parameter/etc to an underlying table or column, the %TYPE and %ROWTYPE attributes are great.

      Sure, if you WANT to tie the down, but my point is that it's better not to.
      Easier to read, less obfuscation.

      Delete
    5. I more and more come to the conclusion that it depends (who had thought :D):
      There are cases where %TYPE brings a lot of clarity, revealing a huge amount of intention.
      There are also a lot of cases where explicit attributes (ideally combined with subtype constraints) bring a ton of value in terms of robust code.
      And there are cases, where it's just so much more convenient to use %ROWTYPE...

      Delete
    6. %ROWTYPE ? Sure, but that's not what we're debating. We're debating %TYPE.

      I've since thought of another positive reason to use my max. VARCHAR2 suggestion (aka. LONG) instead of %TYPE. There is not the need to recompile the code unlike when using %TYPE.

      Steven said: "Now you have declaratively linked your variable to this database structure. And, lo and behold, if the definition of this column changes, Oracle Database will automatically invalidate the DO_STUFF procedure. When it is recompiled, it will pick up the new definition of last_name and your code will be fully in synch with your table".

      This isn't necessary if you use VARCHAR2(32767) - aka. PL/SQL LONG.

      Delete

Post a Comment

Popular posts from this blog

Running out of PGA memory with MULTISET ops? Watch out for DISTINCT!

A PL/SQL team inside Oracle made excellent use of nested tables and MULTISET operators in SQL, blending data in tables with procedurally-generated datasets (nested tables).  All was going well when they hit the dreaded: ORA-04030: out of process memory when trying to allocate 2032 bytes  They asked for my help.  The error occurred on this SELECT: SELECT  *    FROM header_tab trx    WHERE (generated_ntab1 SUBMULTISET OF trx.column_ntab)       AND ((trx.column_ntab MULTISET             EXCEPT DISTINCT generated_ntab2) IS EMPTY) The problem is clearly related to the use of those nested tables. Now, there was clearly sufficient PGA for the nested tables themselves. So the problem was in executing the MULTISET-related functionality. We talked for a bit about dropping the use of nested tables and instead doing everything in SQL, to avoid the PGA error. That would, however require lots of work, revamping algorithms, ensuring correctness, you know the score. Then my eyes snagge

How to Pick the Limit for BULK COLLECT

This question rolled into my In Box today: In the case of using the LIMIT clause of BULK COLLECT, how do we decide what value to use for the limit? First I give the quick answer, then I provide support for that answer Quick Answer Start with 100. That's the default (and only) setting for cursor FOR loop optimizations. It offers a sweet spot of improved performance over row-by-row and not-too-much PGA memory consumption. Test to see if that's fast enough (likely will be for many cases). If not, try higher values until you reach the performance level you need - and you are not consuming too much PGA memory.  Don't hard-code the limit value: make it a parameter to your subprogram or a constant in a package specification. Don't put anything in the collection you don't need. [from Giulio Dottorini] Remember: each session that runs this code will use that amount of memory. Background When you use BULK COLLECT, you retrieve more than row with each fetch,

Quick Guide to User-Defined Types in Oracle PL/SQL

A Twitter follower recently asked for more information on user-defined types in the PL/SQL language, and I figured the best way to answer is to offer up this blog post. PL/SQL is a strongly-typed language . Before you can work with a variable or constant, it must be declared with a type (yes, PL/SQL also supports lots of implicit conversions from one type to another, but still, everything must be declared with a type). PL/SQL offers a wide array of pre-defined data types , both in the language natively (such as VARCHAR2, PLS_INTEGER, BOOLEAN, etc.) and in a variety of supplied packages (e.g., the NUMBER_TABLE collection type in the DBMS_SQL package). Data types in PL/SQL can be scalars, such as strings and numbers, or composite (consisting of one or more scalars), such as record types, collection types and object types. You can't really declare your own "user-defined" scalars, though you can define subtypes  from those scalars, which can be very helpful from the p