Zero tolerance for out-of-scope references!

I received this question last week:
Hi Steven,
Here is a question that I have been pondering for sometime and haven't found a definitive answer to.  Perhaps you can shed some light.
When writing a procedure in PL/SQL, I pass it all the variables that it needs to do it's job.  Of course, it has access to any variables that are defined at the global level - but I rarely if ever use these (perhaps a few constants here and there). 
However, if I have a procedure that has a couple nested sub-procedures - should I follow the same pattern as, above?  It seems a bit silly to pass parameters to a proc and then re-pass those same parameters to a nested sub-proc when that nested sub-proc already has the variables in scope. 
So should the rule of thumb be: 
For procedures, pass all parameters in or use local variables as needed. Use global variables only in extraordinary circumstances. 
or 
For nested sub-procedures, don't pass variables in or out.  Use what is already in scope in the outer procedure. Only pass parameters to nested sub-procedures for special situations.  
Thanks in advance, Alan
And here is my response:

In an ideal world (looking around my basement office, can't find it), the answer is simple and clear:
NEVER code out-of-scope references in your subprograms. Use the parameter list to pass in and out any variables that are declared outside that subprogram.
There, all done. Nice and neat.

Well, except for reality. The reality is that it can be time consuming to extract all out-of-scope references in that new subprogram and convert them into parameters.

So check to see if your editor can help you do that!

But if you do take the time to carefully refactor your code when you created nested subprograms (and you should do both: created nested subprograms and carefully refactor your code), the results will be appreciated by all

The likelihood of you putting a bug into your code as you write drops. And there will be an even steeper decline in the chance of someone coming along later and inadvertently introducing a bug.

Example: instead of this...

CREATE OR REPLACE PROCEDURE my_proc AUTHID DEFINER
IS
   TYPE numbers_t IS TABLE OF NUMBER;

   l_numbers   numbers_t;

   PROCEDURE double_up
   IS
   BEGIN
      FOR indx IN l_numbers.FIRST .. l_numbers.LAST
      LOOP
         l_numbers (indx) := 2 * l_numbers (indx);
      END LOOP;
   END;
BEGIN
   l_numbers := numbers_t (1,2,3,4,5);

   double_up;
END;
/

do this...

CREATE OR REPLACE PROCEDURE my_proc AUTHID DEFINER
IS
   TYPE numbers_t IS TABLE OF NUMBER;

   l_numbers   numbers_t;

   PROCEDURE double_up (numbers_io IN OUT numbers_t)
   IS
   BEGIN
      FOR indx IN numbers_io.FIRST .. numbers_io.LAST
      LOOP
         numbers_io (indx) := 2 * numbers_io (indx);
      END LOOP;
   END;
BEGIN
   l_numbers := numbers_t (1,2,3,4,5);

   double_up (l_numbers);
END;
/

[4 September addition to post:  As you will see in the comments, Joao suggested using PL/Scope to help us identify out of scope references. I agreed. I even agreed as I was writing and publishing the original post. But I was lazy or busy or something. So no PL/Scope solution was included. Then I got less busy or less lazy or something. See what you think of the following.]

So we all agree - ? - that out of scope references should be avoided. It would be awfully nice if we could automatically detect all such references. If only, if only, PL/SQL gave us a way to do such analysis....

It does! It's called PL/Scope and it is a fantastic and still poorly understand (by me) and under-utilized (by everyone) code analysis tool built into PL/SQL. Read all about it here.

So suppose I execute the following statements (first, I enable collection of identifier data for PL/Scope, then I compile a package). Out of scope references marked in orange.

ALTER SESSION SET plscope_settings='identifiers:all'
/

CREATE OR REPLACE PACKAGE plscope_demo AUTHID DEFINER
IS
   g_global   NUMBER;

   PROCEDURE my_procedure (
      param1_in   IN INTEGER,
      param2      IN employees.last_name%TYPE);
END plscope_demo;
/

CREATE OR REPLACE PACKAGE BODY plscope_demo
IS
   g_private_global   NUMBER;

   PROCEDURE my_procedure (
      param1_in   IN INTEGER,
      param2      IN employees.last_name%TYPE)
   IS
      c_no_such       CONSTANT NUMBER := 100;
      l_local_variable         NUMBER;
      another_local_variable   DATE;
      
      FUNCTION return_local RETURN NUMBER
      IS
      BEGIN
         RETURN l_local_variable;
      END;
   BEGIN
      IF param1_in > l_local_variable
      THEN
         DBMS_OUTPUT.put_line (param2);
         g_global := 100;
      ELSE
         DBMS_OUTPUT.put_line (c_no_such);
      END IF;
      
      IF c_no_such IS NOT NULL
      THEN 
         g_private_global := 1;
      END IF;
   END my_procedure;
END plscope_demo;
/

I can then run the following query to identify variables, constants and exceptions that are referenced out of scope:

WITH declared_in
     AS (SELECT decl.NAME variable_name,
                ctxt.TYPE || '-' || ctxt.NAME declared_in
           FROM USER_IDENTIFIERS decl, USER_IDENTIFIERS ctxt
          WHERE     decl.USAGE_CONTEXT_ID = ctxt.USAGE_ID
                AND decl.TYPE IN ('CONSTANT',
                                  'VARIABLE',
                                  'EXCEPTION')
                AND decl.USAGE = 'DECLARATION'
                AND decl.OBJECT_NAME = ctxt.OBJECT_NAME
                AND decl.OBJECT_TYPE = ctxt.OBJECT_TYPE),
     vars_used_in
     AS (SELECT decl.NAME variable_name,
                ctxt.TYPE || '-' || ctxt.NAME referenced_in
           FROM USER_IDENTIFIERS decl, USER_IDENTIFIERS ctxt
          WHERE     decl.USAGE_CONTEXT_ID = ctxt.USAGE_ID
                AND decl.TYPE IN ('VARIABLE',
                                  'EXCEPTION')
                AND decl.USAGE IN ('REFERENCE', 'ASSIGNMENT')
                AND decl.OBJECT_NAME = ctxt.OBJECT_NAME
                AND decl.OBJECT_TYPE = ctxt.OBJECT_TYPE),
     constants_used_in 
     AS (SELECT decl.NAME variable_name,
                ctxt.TYPE || '-' || ctxt.NAME referenced_in
           FROM USER_IDENTIFIERS decl, USER_IDENTIFIERS ctxt
          WHERE     decl.USAGE_CONTEXT_ID = ctxt.USAGE_ID
                AND decl.TYPE = 'CONSTANT'
                AND decl.USAGE = 'REFERENCE'
                AND decl.OBJECT_NAME = ctxt.OBJECT_NAME
                AND decl.OBJECT_TYPE = ctxt.OBJECT_TYPE)
(SELECT * FROM vars_used_in UNION SELECT * FROM constants_used_in) 
MINUS
SELECT * FROM declared_in
/

Here's my output:










Now there is certainly more to be done here: add line numbers to make it easy to locate the out of scope ref, and more. 

But I feel, I feel, I feel a sense of laziness or is it busy-ness or something coming over me now, so perhaps someone else can take my starting script and finish it up.

Pretty please?

Comments

  1. Hi Steven,

    This is for Alan: you can use PL/SCOPE to help you identify those out-of-scope references.
    A query like this might help:

    WITH ui
    AS (SELECT *
    FROM user_identifiers
    WHERE object_name = 'MY_PROC')
    SELECT line
    , LPAD (' ', 2 * (LEVEL - 1)) || name n
    , TYPE
    , usage
    FROM ui
    START WITH usage_context_id = 0
    CONNECT BY PRIOR usage_id = usage_context_id
    ORDER SIBLINGS BY line, col;

    Unfortunately I haven't found a (pure-sql) way to detect the end of a sub-procedure and I'm not sure I want to create a pl/sql parser.
    If someone knows a way, please be so kind and share it with the world :)

    Cheers

    ReplyDelete
  2. Thanks, Joao! Funny thing - I was struggling to find time yesterday to use PL/Scope to show *only* out of scope references. I am sure I can do it, just a bit rusty with my PL/Scope skills. Should not need to do any parsing. Hopefully get it done today.

    ReplyDelete
  3. Hi Steven.
    That’s an interesting topic. Technical, yet a bit philosophical.
    I agree with the general rule, but feel a bit uncomfortable with the categorical “NEVER” and “Zero tolerance”.

    First, what about read-only declarations, like CONSTANTS, TYPES and CURSORS? I once read this good advice somewhere ;-)

    Package application-named literal constants together
    - Define constants that are referenced throughout your application in a single, central package
    - Define constants that are more specific to a single area of functionality within the package that encapsulates that functionality

    (“Oracle PL/SQL Best Practices” by Steven Feuerstein; I own a copy of the first edition from 2001)

    Second, there is a huge difference (in my opinion) between package-level global variables and local variables used in sub-modules. It’s true that both may be manipulated from different “places”, and this is where the potential bugs may rear their ugly head. But while the “globally-used local variables” live only for the duration of a single procedure call, the package-level variables continue to live for the duration of the session, between different calls to different procedures, which makes the risk for potential bugs much higher.

    Having said that, I don’t find *private* package-level variables useless. For example, they are great for implementing session-level cache, especially if you are using Standard Edition and therefore cannot use Function Result Cache.
    Regarding *public* package-level (non-constant) variables – can’t think of a justified use case for them.

    Finally, the query for identifying out-of-scope references is a great idea, but there are some “false negative” results it may find. Two that I think of are predefined exceptions and loop variables. For example, for this package:

    CREATE OR REPLACE PACKAGE plscope_demo AUTHID DEFINER IS
    PROCEDURE my_procedure;
    END plscope_demo;
    /

    CREATE OR REPLACE PACKAGE BODY plscope_demo IS

    PROCEDURE my_procedure IS
    l_local_variable NUMBER;
    BEGIN
    FOR l_implicitly_declared IN 1 .. 10
    LOOP
    l_local_variable := l_implicitly_declared;
    END LOOP;
    EXCEPTION
    WHEN no_data_found THEN
    NULL;
    END my_procedure;
    END plscope_demo;
    /

    The query will return:
    VARIABLE_NAME REFERENCED_IN
    L_LOCAL_VARIABLE ITERATOR-L_IMPLICITLY_DECLARED
    NO_DATA_FOUND PROCEDURE-MY_PROCEDURE

    Thanks,
    Oren Nakdimon @DBoriented
    http://db-oriented.com

    ReplyDelete
  4. Oren - excellent response! I agree:

    1. Package level public constants - excellent reason to reference them "out of scope"

    2. Package level private variables for caching - excellent reason to reference them "out of scope"

    False negatives: nice catch. I think that the script needs to be updated so that it walks up the tree from the reference/assignment till it hits the "owning" program unit. I would do it with a PL/SQL function. Anyone know how to do it in "pure" SQL?

    ReplyDelete
  5. Hm. As for standalone stored procedures I do agree with your my_proc / double_up example. However; in 99.9% of cases at least I don't use standalone procedures for production code. 99.9% of my PL/SQL code goes into packages. And when using packages if I pass all the parameters I don't see a reason for declaring your double_up procedure locally in the my_proc procedure but simply make it package private.

    And using packages I sometimes do like to make use of nested procedures to refactor bits of code which is somehow repeated all over the place. For example if I need to do a insert and based on some criteria a column differs; like

    procedure do_my_insert(ivVal1 in varchar2, ivVal2 in varchar2,
    some_condition in number, some_other_condition in number) is
    procedure do_insert(ivcond in varchar2) is
    begin
    insert into my_tab(val1, val2, cond) values (ivVal1, ivVal2, ivCond);
    end;
    begin
    if some_condition = some_other_condition then
    do_insert(ivVal1, ivVal2, 'MATCHED');
    elsif [....]
    do_insert(ivVal1, ivVal2, 'WHATEVER');
    [...]
    end if;
    end;

    In a package at least to me it wouldn't make sense to blow up the parameter list of the do_insert procedure, as then it wouldn't need to be declared as a nested procedure. It would be package private.
    I do keep the number of such nested procedures to a minimum, and I also try to keep them as short as possible. If a procedure contains a truckload of nested procedures which themselves are huge as well it might be time to refactor the outer procedure and maybe put the procedure in a new package with the nested procedures as package private.

    Of course this refactoring process is easier if the nested procedures have all their parameters passed as well; however then I go back to Argument A: why do they have to be nested procedures? Why didn't theybecome package private procedures in the first place?

    cheers

    ReplyDelete
  6. Christian, first, to me the decision about created a nested subprogram or private (package level) subprogram has to do with scope of usage. If that subprogram is only going to be used, only makes sense, within another subprogram, it is nested. If it could be or is used in multiple subprograms, then it is private.

    Even for that rule, though, a balancing act is necessary. I might be writing a really complex procedure and end up with 5 levels of nested subprograms. I then decide that is too much - makes the code hard to read, if only from the standpoint of excessive indentation. And so I move some of that out to private subprograms.

    Same balancing act regarding passing parameters: yes, the refactoring can become onerous and the parameter list bloated.

    So what do you do? You "cut corners" (violate the best practice to bow to reality) and/or revamp your parameter list to pass in, say, a record, instead of 25 parameters.

    ReplyDelete

Post a Comment

Popular posts from this blog

Table Functions, Part 1: Introduction and Exploration

Recommendations for unit testing PL/SQL programs

The future of Oracle PL/SQL: some thoughts on Sten Vesterli's thoughts