Skip to main content

Do you REALLY need that SQL to be dynamic?


Dynamic SQL means a SQL statement that is constructed, parsed and executed "dynamically" at run time (vs. "statically" at compile time).

It's very easy to write static SQL in PL/SQL program units (one of the great joys of working with this database programming language). It's also quite easy to implement dynamic SQL requirements in PL/SQL.

But that doesn't mean you should. The bottom line regarding dynamic SQL is:
Construct and execute SQL at runtime only when you have to.
There are several good reasons to avoid unnecessary dynamic SQL:
  1. Security: dynamic SQL opens up the door to SQL injection, which can lead to data corruption and the leaking of sensitive data.
  2. Performance: while the overhead of executing dynamic SQL has gone way down over the years, it is certainly still faster to use static SQL.
  3. Maintainability: the code you write to support dynamic SQL is more - literally more code - and harder to understand and maintain.
Sometimes the misuse of dynamic SQL is obvious. Consider the following:

CREATE OR REPLACE FUNCTION name_from_id (id_in IN INTEGER)
   RETURN VARCHAR2
   AUTHID DEFINER
IS
   l_the_name   the_table.the_name%TYPE;
BEGIN
   EXECUTE IMMEDIATE 'select the_name from the_table where id = ' || id_in
      INTO l_the_name;

   RETURN l_the_name;
END;
/

The developer apparently believed, however, that since the value of the ID can change, it needs to be concatenated to the SQL statement, so it's gotta be done dynamically. Ha! There's nothing dynamic about this query. It should be rewritten to:

CREATE OR REPLACE FUNCTION name_from_id (id_in IN INTEGER)
   RETURN VARCHAR2
   AUTHID DEFINER
IS
   l_the_name   the_table.the_name%TYPE;
BEGIN
   SELECT the_name
     INTO l_the_name
     FROM the_table
    WHERE id = id_in;

   RETURN l_the_name;
END;
/

Often, the need for dynamic SQL is compelling at first, but then disappears with a little bit of analysis. Consider this function:

CREATE OR REPLACE FUNCTION name_from_id (table_in   IN VARCHAR2,
                                         id_in      IN INTEGER)
   RETURN VARCHAR2
   AUTHID DEFINER
IS
   l_the_name   VARCHAR2 (32767);
BEGIN
   EXECUTE IMMEDIATE
      'select the_name from ' || table_in || ' where id = ' || id_in
      INTO l_the_name;

   RETURN l_the_name;
END;
/

Well, heck, if I don't know the table name till run time, I sure can't use a static SELECT statement, right? Of course, right!

So, first of all, if this code is really and truly necessary, you need to think about SQL injection.

1. Is the table name provided directly by the user? You should never allow user input to be stuffed directly into your dynamic statement. They could enter all sorts of nasty stuff.

2. Even if not passed directly from user fingertips to EXECUTE IMMEDIATE, you should use DBMS_ASSERT to make sure that the table name passed in is a valid DB object, as in:

CREATE OR REPLACE FUNCTION name_from_id (table_in   IN VARCHAR2,
                                         id_in      IN INTEGER)
   RETURN VARCHAR2
   AUTHID DEFINER
IS
   l_the_name   VARCHAR2 (32767);
BEGIN
   EXECUTE IMMEDIATE
      'select the_name from ' || 
         SYS.DBMS_ASSERT.sql_object_name (table_in) || 
      ' where id = :id'
      INTO l_the_name
      USING id_in;

   RETURN l_the_name;
END;
/

Great, so the function is less vulnerable than before to injection. But does it really need to be dynamic?

The only way to answer that question is to check and see where and how the function is used. I could do a text search through my code (via an editor like Sublime) or an object search (in SQL Developer). I could also use PL/Scope if I'd gathered identifier information across my code base.

Suppose after doing my analysis, I find that the function is called twice as follows:

l_name := name_from_id (table_in => 'TABLE1', id_in => l_id);

l_recent_name := name_from_id (table_in => 'TABLE2', id_in => l_most_recent_id);

OK, I could shrug and say "Yep, two different table names. I need to use dynamic SQL."

But that would be a mistake. What I should think and say is:

"What? Just two different tables? I don't need dynamic SQL for that. That's just laziness."

Instead I could do either of the following:
  1. Create two different functions.
  2. Change to static SQL inside the function.
Two Different Functions

Really, why not? It's so easy and fast to write PL/SQL functions that call SQL. Just change to:

CREATE OR REPLACE FUNCTION name_from_table1_id (id_in IN INTEGER)
   RETURN VARCHAR2
   AUTHID DEFINER
IS
   l_the_name table1.the_name%TYPE;
BEGIN
   SELECT the_name
     INTO l_the_name
     FROM table1
    WHERE id = id_in;

   RETURN l_the_name;
END;
/

CREATE OR REPLACE FUNCTION name_from_table2_id (id_in IN INTEGER)
   RETURN VARCHAR2
   AUTHID DEFINER
IS
   l_the_name table2.the_name%TYPE;
BEGIN
   SELECT the_name
     INTO l_the_name
     FROM table2
    WHERE id = id_in;

   RETURN l_the_name;
END;
/

One Function, No Dynamic SQL

So you don't want two, three many functions for the "same" functionality (get name for ID). Fine, keep them all in one subprogram, but move the conditional logic inside.

CREATE OR REPLACE FUNCTION name_from_id (table_in   IN VARCHAR2,
                                         id_in      IN INTEGER)
   RETURN VARCHAR2
   AUTHID DEFINER
IS
   l_the_name   VARCHAR2 (32767);
BEGIN
   CASE table_in
      WHEN 'TABLE1'
      THEN
         SELECT the_name
           INTO l_the_name
           FROM table1
          WHERE id = id_in;
      WHEN 'TABLE2'
      THEN
         SELECT the_name
           INTO l_the_name
           FROM table2
          WHERE id = id_in;
      ELSE
         raise_application_error (
            -20000,
            'name_from_id does not support fetching from ' || table_in);
   END CASE;

   RETURN l_the_name;
END;
/

Sure, sometimes you really do need to use EXECUTE IMMEDIATE and dynamic SQL. In those situations, make sure you minimize the attack surfaces for SQL injection. Make sure you've got strong exception handling and logging.

But do make certain that this extra "technical debt" is necessary. There's a good chance you can get by without dynamic SQL. If so, your users, your fellow developers and your manager will all thank you!

Comments

  1. Hey, Steven

    Aren't your last two code examples swapped?

    And your code example using DBMS_ASSERT is a bit safer on "table_in" but not on "id_in". Besides DBMS_ASSERT it ought to use a bind variable and USING. Not just for safety, but also to avoid too much hard parsing.

    Apart from the nitpicking I agree totally with your main point - only use dynamic SQL where it is really really necessary.

    If not for the good reasons shown here, then at least the lazy developer should think about how much more troublesome it is to develop, test and debug dynamic SQL, since it isn't parsed until runtime. You won't catch your errors at compiletime ;-)

    Cheerio
    /Kim

    ReplyDelete
    Replies
    1. Thanks, Kim, I switched the code around, absolutely correct on that score. And certainly that should be a bind variable. Done!

      Delete
  2. This PL/Sql is sweet dream of every developer :)

    ReplyDelete
  3. Hello Steven

    I agree with your point about usage of dynamic SQL. However, i have to notice that your examples are rather simplistic. For analysis sake, lets consider a case where you have to handle arbitrary number of parameters that you have to use as filter.

    In example i have a page where i can dynamically define filtering criteria. I can select which fields i will use for filtering and i of course provide those values. The form on the page (browser page) is totally dynamic meaning that i can select which fields i want to use and i can define relationship between fields. I can also group fields and have multiple groups as filter. So, I start with blank form. Only control i have at the beginning is a drop down menu where i select the field on which i will do the search and then i build from there adding additional fields, grouping them adding AND/OR between fields or between field groups. As you can see one search can contain one set of field combinations and another will contain completely separate set of field combinations.

    How would you tackle this problem without using the dynamic SQL? Also for analysis sake lets say that we have 10 different fields to choose from and they are different data types.

    Cheers

    ReplyDelete
    Replies
    1. Hi, Nenad

      We see in daily life out there in the developers world loads of cases where developers used dynamic SQL for just such simplistic cases. Steven is highlighting that simple cases don't need dynamic SQL.

      Your case isn't simplistic ;-)

      With some restriction on what users may choose in your search form, I think I might (after lots and lots of hard thinking work) design a monster SQL statement that could handle it - but it would be terrible to do so and be absolutely awful for the optimizer to handle. I won't try it.

      I'd consider such a case a prime example of just what dynamic SQL is meant for. Steven says do dynamic SQL "only when you have to" - and this would in my opinion be a case where "you have to".

      But the second point of Stevens post then is, that "when you have to", you at the same time have to be very careful when you construct your dynamic SQL. A search form like yours where users select fields from dropdowns is better than allowing users to type field names, for example. Even so, use of DBMS_ASSERT would still be a good idea. Or alternatively not sending field names as parameters but some numeric field identifier that you make a function "field_id_2_name" with a CASE structure - that way SQL injection also is not possible. And of course bind the search values the user enters rather than just appending text also hardens your dynamic SQL.

      The point remains - for simple cases it is very very unlikely you'll need dynamic SQL, so don't. For complex cases, think about if it might be broken down into multiple simple cases you can do with static SQL - if it can't, then you probably have a legitimate use for dynamic SQL :-)

      My 2 cents...

      Cheerio
      /Kim

      Delete
    2. Thanks for that in-depth and nuanced response, Kim.

      Yes, Nenad, I am definitely not saying "Don't ever use dynamic SQL."

      Sometimes even if it is *possible* to use static SQL for a solution, that solution is so difficult to understand or manage, dynamic SQL becomes the preferred approach.

      Delete
    3. Oh, let me add another choice you have, which could sort of be called "dynamic at compile-time" as opposed to "dynamic at run-time" - that is generating code.

      If you have for example a fixed number of combinations of "building blocks" you would build your dynamic SQL with, an alternative to the dynamic SQL is to make a bit of code that iterates all the combinations and generate the CASE structure like Stevens example, just with 117 THEN clauses each containing one generated static SQL statement.

      This generator-approach for "medium complexity" situations (not the full-blown dynamic search form) has the advantage of being completely SQL injection proof, plus the static cursors are more likely to be in cursor cache and avoid even the soft parsing.

      Often dynamic SQL will be quite appropriate for even such cases (if you take care of the SQL injection), but there can be cases where the generator approach is preferable.

      Delete
  4. Thanks Kim and Steven
    I agree it is a complex form and it was not a trick question. It is actually based on real life scenario. I just wanted to make sure i am not missing something.
    Thanks again.

    Cheers

    ReplyDelete
  5. I'll one thing to the concept of "generic search" which is a common challenge where we "build" a SQL on the fly.

    This is one of the great arguments for quality instrumentation in your code. Track every search that's done in your app, and typically you'll end up with a set of SQL's that are far more common than others. You might end up with metrics like:

    - all records for today: 1000 occurences
    - all records for customer: 800 occurences
    - records for customer with sales less than $10 paid with credit card: 10 occurences

    and so forth. In these instance, I *still* like to consider the use of some static SQL along the lines of:

    if [criteria = records for today] then
    specifically crafted SQL
    elseif [criteria = records for customer ] then
    specifically crafted SQL
    else
    no we build the dynamic SQL in the normal way


    That lets us manufacture the very best SQL's we can for the more frequent searches.

    ReplyDelete
    Replies
    1. Love this. Great combination of: "Do some slick programming" and "Fine tune your app to its actual use in the real world." Need to think about where to apply this in the Dev Gym!

      Delete
  6. From SQL tuning perspective, to reduce the hard parseing of SQL codes in library cache, most of DBAs, of course including Oracle official Docs, encourage to code the dynamic SQL so that can make use of bind variables. Aren't they right?
    By the way, Steven - did you participate in writing the Oracle online help Docs before?

    Thanks,
    Gavin

    ReplyDelete
    Replies
    1. Gavin, could you please provide me with a URL to doc that says this? This is flat-out wrong for PL/SQL. Static SQL *always* makes use of bind variables in PL/SQL. With Dynamic SQL, you can and *should* use bind variables but many developers do not.

      My participation in PL/SQL doc is minimal. I provide feedback and sometimes review content, but I do not write it, per se.

      Delete
    2. Sorry. I tried to find out the exact passage in Oracle online help Docs, which encourages/recommends to use dynamic SQL to make use of bind variables better. But I have not yet. Maybe I made mistake ...
      I will get a chance to find it. and if get it, I will let you know.

      Thanks,

      Gavin

      Delete

Post a Comment

Popular posts from this blog

Running out of PGA memory with MULTISET ops? Watch out for DISTINCT!

A PL/SQL team inside Oracle made excellent use of nested tables and MULTISET operators in SQL, blending data in tables with procedurally-generated datasets (nested tables).  All was going well when they hit the dreaded: ORA-04030: out of process memory when trying to allocate 2032 bytes  They asked for my help.  The error occurred on this SELECT: SELECT  *    FROM header_tab trx    WHERE (generated_ntab1 SUBMULTISET OF trx.column_ntab)       AND ((trx.column_ntab MULTISET             EXCEPT DISTINCT generated_ntab2) IS EMPTY) The problem is clearly related to the use of those nested tables. Now, there was clearly sufficient PGA for the nested tables themselves. So the problem was in executing the MULTISET-related functionality. We talked for a bit about dropping the use of nested tables and instead doing everything in SQL, to avoid the PGA error. That would, however require lots of wo...

How to Pick the Limit for BULK COLLECT

This question rolled into my In Box today: In the case of using the LIMIT clause of BULK COLLECT, how do we decide what value to use for the limit? First I give the quick answer, then I provide support for that answer Quick Answer Start with 100. That's the default (and only) setting for cursor FOR loop optimizations. It offers a sweet spot of improved performance over row-by-row and not-too-much PGA memory consumption. Test to see if that's fast enough (likely will be for many cases). If not, try higher values until you reach the performance level you need - and you are not consuming too much PGA memory.  Don't hard-code the limit value: make it a parameter to your subprogram or a constant in a package specification. Don't put anything in the collection you don't need. [from Giulio Dottorini] Remember: each session that runs this code will use that amount of memory. Background When you use BULK COLLECT, you retrieve more than row with each fetch, ...

PL/SQL 101: Three ways to get error message/stack in PL/SQL

The PL/SQL Challenge quiz for 10 September - 16 September 2016 explored the different ways you can obtain the error message / stack in PL/SQL. Note: an error stack is a sequence of multiple error messages that can occur when an exception is propagated and re-raised through several layers of nested blocks. The three ways are: SQLERRM - The original, traditional and (oddly enough) not currently recommended function to get the current error message. Not recommended because the next two options avoid a problem which you are unlikely  to run into: the error stack will be truncated at 512 bytes, and you might lose some error information. DBMS_UTILITY.FORMAT_ERROR_STACK - Returns the error message / stack, and will not truncate your string like SQLERRM will. UTL_CALL_STACK API - Added in Oracle Database 12c, the UTL_CALL_STACK package offers a comprehensive API into the execution call stack, the error stack and the error backtrace.  Note: check out this LiveSQL script if...