Minimize context switches and unnecessary PL/SQL code: an example from the PL/SQL Challenge

On the PL/SQL Challenge, when you click on a link to play a quiz, you are taken to the "launch" page. We give you an opportunity to review assumptions and instructions, and then press the Start button when you are ready (your score is based in part on the time it takes you to answer).

However, if you've taken that particular quiz before, and there have been no changes to assumptions or instructions, the launch page just gets in the way.

So I decided to streamline the flow on our site as follows:

1. If a person has never taken this quiz before, go to the launch page.

2. Otherwise, if assumptions or instructions have changed since the last playing of the quiz, go to the launch page.

3. Otherwise, go straight to the Play Quiz page.

I figured the way to do this is to build a function that will be invoked from Oracle Application Express. Here is a first pass, using the top-down design technique, at implementing the function.

CREATE OR REPLACE PACKAGE qdb_player_mgr
IS
   FUNCTION can_skip_launch_page (comp_event_id_in   IN INTEGER,
                                  user_id_in         IN INTEGER)
      RETURN BOOLEAN;
END qdb_player_mgr;
/

CREATE OR REPLACE PACKAGE BODY qdb_player_mgr
IS
   FUNCTION can_skip_launch_page (comp_event_id_in   IN INTEGER,
                                  user_id_in         IN INTEGER)
      RETURN BOOLEAN
   IS
      l_can_skip          BOOLEAN;
      l_last_changed_on   DATE;
   BEGIN
      l_can_skip := competition_answered (comp_event_id_in, user_id_in);

      IF l_can_skip
      THEN
         l_last_changed_on :=
            GREATEST (last_assumption_change (comp_event_id_in),
                      last_instruction_change (comp_event_id_in));
         l_can_skip :=
            l_last_changed_on <=
               most_recent_answer (comp_event_id_in, user_id_in);
      END IF;

      RETURN l_can_skip;
   END;
END qdb_player_mgr;
/

That's easy to understand, right? So the next step would be to implement each of those functions, which I can describe as follows (a sketch of one appears after the list):
  • competition_answered: use the comp_event_id to get its parent competition ID. See if the specified user answered any events in this competition.
  • last_assumption_change: use the comp_event_id to get its parent competition ID. Get the last-changed date for assumptions for that competition.
  • last_instruction_change: use the comp_event_id to get its parent competition ID. Get the last-changed date for instructions for that competition.
  • most_recent_answer: use the comp_event_id to get its parent competition ID. Get the maximum start date for any answers by this user for any events in this competition.
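
Just to make the cost concrete, here is a rough sketch of what one of those helpers, last_assumption_change, might look like (hypothetical code; the table and column names are borrowed from the query shown later in this post):

   -- Hypothetical sketch only: this helper does not exist yet, and the
   -- table and column names are assumed from the cursor shown further below.
   FUNCTION last_assumption_change (comp_event_id_in IN INTEGER)
      RETURN DATE
   IS
      l_last_changed_on   DATE;
   BEGIN
      /* Use the comp_event_id to get its parent competition, then return
         that competition's assumptions last-changed date. */
      SELECT c.assump_last_changed_on
        INTO l_last_changed_on
        FROM qdb_comp_events ce, qdb_competitions c
       WHERE     ce.comp_event_id = last_assumption_change.comp_event_id_in
             AND ce.competition_id = c.competition_id;

      RETURN l_last_changed_on;
   END last_assumption_change;

The other three would look almost identical: the same join from the event up to its parent competition, repeated four times, each in its own SELECT.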
I hope that you are now saying: "No, Steven, don't do it! Don't write a separate function for each of these conditions!"

But, but, why not? I so looooove to write PL/SQL functions!

Even without really understanding our data model, the first hint of a problem with this approach should come with the repetition of "Use the comp_event_id to get its parent competition ID." Why would I want to have to do that over and over again?

The other, more general problem with this approach is that it looks like I will execute four SELECT statements en route to coming up with my return value. That's four context switches between the PL/SQL and SQL engines. 

For best performance, I would like to minimize context switches, and beyond that avoid unnecessary SQL statements. In other words, when looking at requirements for a new program like this, I should first ask: "How much information can I get in a single SELECT?"

Turns out, an awful lot. In the implementation below, I join my competition events table to its parent competitions table, so I can get the information for that competition once. Then I use it in a single-value (scalar) subquery to get the latest answer submitted by that user across all events in the competition.

   FUNCTION can_skip_launch_page (comp_event_id_in   IN INTEGER,
                                  user_id_in         IN INTEGER)
      RETURN BOOLEAN
   IS
      l_can_skip          BOOLEAN;

      CURSOR answer_info_cur
      IS
         SELECT c.instr_last_changed_on,
                c.assump_last_changed_on,
                (SELECT MAX (start_date)
                   FROM qdb_compev_answers eva, qdb_comp_events ce
                  WHERE     eva.user_id = can_skip_launch_page.user_id_in
                        AND eva.comp_event_id = ce.comp_event_id
                        AND ce.competition_id = c.competition_id)
                   last_taken_on
           FROM qdb_comp_events ce, qdb_competitions c
          WHERE     ce.comp_event_id = can_skip_launch_page.comp_event_id_in
                AND ce.competition_id = c.competition_id;

      answer_info_r       answer_info_cur%ROWTYPE;
   BEGIN
      OPEN answer_info_cur;
      FETCH answer_info_cur INTO answer_info_r;
      CLOSE answer_info_cur;

      l_can_skip := 
            answer_info_r.last_taken_on >=
               GREATEST (answer_info_r.instr_last_changed_on,
                         answer_info_r.assump_last_changed_on);

      /* If no answer, then last_taken_on is NULL, so "can skip" might
         be NULL at this point, and that means "cannot skip." */
      RETURN NVL (l_can_skip, FALSE);
   END;

I could have used a SELECT-INTO for this single-row query, but then I would have had to declare a record type to match the cursor's SELECT list, or declare individual variables for each expression returned. With an explicit cursor, I can simply declare a record using %ROWTYPE.
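
For comparison, here is a hedged sketch of what that SELECT-INTO variant might have looked like; the hand-declared record type is exactly the extra baggage that %ROWTYPE avoids:

   -- Hypothetical SELECT-INTO variant of the same function; the record
   -- type must now be declared by hand to match the SELECT list.
   FUNCTION can_skip_launch_page (comp_event_id_in   IN INTEGER,
                                  user_id_in         IN INTEGER)
      RETURN BOOLEAN
   IS
      TYPE answer_info_rt IS RECORD
      (
         instr_last_changed_on    DATE,
         assump_last_changed_on   DATE,
         last_taken_on            DATE
      );

      answer_info_r   answer_info_rt;
      l_can_skip      BOOLEAN;
   BEGIN
      SELECT c.instr_last_changed_on,
             c.assump_last_changed_on,
             (SELECT MAX (start_date)
                FROM qdb_compev_answers eva, qdb_comp_events ce
               WHERE     eva.user_id = can_skip_launch_page.user_id_in
                     AND eva.comp_event_id = ce.comp_event_id
                     AND ce.competition_id = c.competition_id)
                last_taken_on
        INTO answer_info_r
        FROM qdb_comp_events ce, qdb_competitions c
       WHERE     ce.comp_event_id = can_skip_launch_page.comp_event_id_in
             AND ce.competition_id = c.competition_id;

      l_can_skip :=
         answer_info_r.last_taken_on >=
            GREATEST (answer_info_r.instr_last_changed_on,
                      answer_info_r.assump_last_changed_on);

      RETURN NVL (l_can_skip, FALSE);
   END can_skip_launch_page;

Note one behavioral difference: SELECT-INTO raises NO_DATA_FOUND if the comp_event_id does not exist, while the open-fetch-close approach above simply leaves the record NULL, so the function returns FALSE.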

This second solution is not as easy to read as the first implementation - unless you are comfortable with SQL and the data model of the application at hand (which obviously you would be, if it was your code!). 

But if you are going to write applications on top of a relational database, you should be comfortable - very comfortable - with SQL.
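
And to close the loop on the earlier mention of Oracle Application Express: one way such a function might be wired in is as a condition (or process) of type "PL/SQL Function Body Returning Boolean". This is only a sketch; the page item names below are hypothetical, not the real ones from our application.

   -- Hypothetical APEX "PL/SQL Function Body Returning Boolean" body;
   -- the page item names are assumptions for illustration only.
   RETURN qdb_player_mgr.can_skip_launch_page (
             comp_event_id_in => TO_NUMBER (:P40_COMP_EVENT_ID),
             user_id_in       => TO_NUMBER (:P40_USER_ID));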

Comments

  1. Great example.

    I remember first embracing the concept of PL/SQL table APIs back in '95 when I read your pioneering book. I took it too far though, over-engineering and designing functions for all sorts of queries and operations I thought the app devs might need. When I tried to use them myself, it left me ill to see how inefficient and chatty the code was compared to a well-written DML statement. It also peppered the schemas with tons of unused code that became more difficult to clean out as time wore on.

    In the end, neither the table APIs nor my plethora of "clever" functions ended up being what the applications needed. They needed custom, efficient SQL statements that satisfied unique requirements, as you've done here. The pursuit of a re-usable data layer API has left me more inclined to create nothing but the common framework up front (security, logging, messages, email, etc.), only crafting PL/SQL routines as needed by the JSON-fluent middle tier services.

  2. Thanks, Bill, very interesting. But you still really just want to have one "way" for any consumer of the table to, say, insert a row into the table, right?

    1. Yes, sorta. One published, authorized PL/SQL insert proc to be used by OLTP apps for transactional tables (and less volatile tables the app is allowed to administer). We also use various technical and cultural measures to ensure enthusiastic developers aren't going around the API to do their own thing.

      But the Dev DBAs sometimes have to write their own inserts, because we have backend jobs, loads, AQ processes, web service calls, and sync processes that bring data in bulk and need custom SQL the simple insert proc doesn't satisfy.

      So publicly, exposed to the developers, yes, one way to insert new rows. Privately, we might have two or three.

      I think the overall point of the post is great wisdom, that sometimes the goodness and readability of PL/SQL functions can backfire and be much more efficiently handled with well-designed, set-oriented SQL.


