Friday, June 16, 2017

So you want to write a technical book?


I received this question today:
If I wanted to write a tech book, where/how would I start?
Rather than provide an individual answer, I thought I'd answer on my blog. Here goes.

First, how I answer this question for myself (the variation being: "Do you want to write another book?"):

No, don't do it.

:-)

I decided a few years ago that I would not write new books and instead keep my core set of books on PL/SQL up to date (for anyone who's wondering, that means essentially 3 out of my 10 books on PL/SQL).

It takes a lot of time to write a book, any sort of book. And certainly with a technical book you need to be concerned about technical accuracy (slightly less critical with fiction :-) ).

In addition, people aren't buying books like they used to. Gee, thanks, Google (and people publishing ripped-off e-copies of books, and all the free content published on blogs and...).

So you definitely should not go into such a project thinking you are going to make much, if any, money on the book.

Some reasons to go ahead with such a project anyway:

  • You always wanted to publish a book, see your name listed as an author. 
  • You want to build your reputation in a given technology.
  • Along with (or through) that, you want to increase the revenue you can generate around that technology (speaking fees, hourly consulting rates).
Assuming you have decided to take the plunge, you need to:
  • Decide on a topic
  • Do lots of writing.
  • Find a publisher.
Mostly in that order. But I suggest that you do not write a whole book and then look for a publisher. That is likely necessary if you are writing a work of fiction. But with a technical book, it's a bit different.

Here's my suggestion, after you decide on a topic:

1. Come up with a table of contents for your book.

2. Start blogging about your topic. You don't even have to create your own blog. Publish on LinkedIn or Medium or any number of other channels.

Pick a chapter (maybe start at the beginning, maybe not) and do some writing. Publish it. See how people respond - to your writing, to the topic, etc.

If you get a strong response, then it is time to approach publishers. This is where getting a technical book published can be so much easier than a work of fiction.

You can offer your TOC, some samples of your writing, and an overall summary of the book, and from that alone, secure a contract with a publisher.

I have a long, happy history with O'Reilly Media. But there are lots of technical publishers out there. And certainly an editor I very much respect and encourage you to seek out is Jonathan Gennick. I am sure he'd be happy to talk to you, and give you even more and better advice.

Friday, June 9, 2017

PL/Scope 12.2: Find all commits and rollbacks in your code

Yes, another post on PL/Scope, that awesome code analysis feature of PL/SQL (first added in 11.1, and then given a major upgrade in 12.2 with the analysis of SQL statements in PL/SQL code)!

A question on StackOverflow included this comment:
But there can be scenarios where it is difficult to identify where the ROLLBACK statement are executed in a complex PL SQL program (if you have to do only a modification to the existing code).
As of 12.2, it is super-duper easy to find all commits and rollbacks in your code.

Find all commits:

SELECT st.object_name,
       st.object_type,
       st.line,
       src.text
  FROM all_statements st, all_source src
 WHERE     st.TYPE = 'COMMIT'
       AND st.object_name = src.name
       AND st.object_type = src.TYPE
       AND st.owner = src.owner
       AND st.line = src.line
ORDER BY st.object_name,
         st.object_type,
         st.line
/

Find all rollbacks:

SELECT st.object_name,
       st.object_type,
       st.line,
       src.text
  FROM all_statements st, all_source src
 WHERE     st.TYPE = 'ROLLBACK'
       AND st.object_name = src.name
       AND st.object_type = src.TYPE
       AND st.owner = src.owner
       AND st.line = src.line
ORDER BY st.object_name,
         st.object_type,
         st.line
/
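
If you would rather see every commit and rollback in a single result set, the two queries can be folded into one. This is a sketch, assuming the same 12.2 views used above. The statement_type alias is my own addition, and I match on object type as well as name so that a package specification and body with the same name don't get mixed up.

```sql
-- Sketch: all commits and rollbacks in one pass.
-- STATEMENT_TYPE is just an alias chosen for this example.
SELECT st.owner,
       st.object_name,
       st.object_type,
       st.TYPE statement_type,
       st.line,
       src.text
  FROM all_statements st, all_source src
 WHERE     st.TYPE IN ('COMMIT', 'ROLLBACK')
       AND st.owner = src.owner
       AND st.object_name = src.name
       AND st.object_type = src.TYPE
       AND st.line = src.line
ORDER BY st.owner,
         st.object_name,
         st.object_type,
         st.line
/
```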

Reminder: PL/Scope gathers its data at compile time, so these data dictionary views are populated only for program units compiled while these settings are enabled (for your session or for that program unit):

ALTER SESSION SET plscope_settings='identifiers:all, statements:all'
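
If those queries come back empty, it may simply be that the code was compiled before PL/Scope was switched on. Here is a minimal sketch of how I might check and fix that for one program unit. MY_PROCEDURE is a placeholder name, not something from the StackOverflow question.

```sql
-- Which of my program units were compiled with PL/Scope enabled?
SELECT name, type, plscope_settings
  FROM user_plsql_object_settings
ORDER BY name, type
/

-- Recompile one unit with PL/Scope turned on, keeping its other
-- compiler settings as they were. MY_PROCEDURE is a placeholder.
ALTER PROCEDURE my_procedure COMPILE
   PLSCOPE_SETTINGS = 'identifiers:all, statements:all'
   REUSE SETTINGS
/
```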

Friday, June 2, 2017

More 12.2 PL/Scope Magic: Find SQL statements that call user-defined functions

When a SQL statement executes a user-defined function, your users pay the price of a context switch, which can be expensive, especially if the function is called in the WHERE clause. Even worse, if that function itself contains a SQL statement, you can run into data consistency issues.

Fortunately, you can use PL/Scope in Oracle Database 12c Release 2 to find all the SQL statements in your PL/SQL code that call a user-defined function, and then analyze from there.

I go through the steps below. You can run and download all the code on LiveSQL.

First, I turn on the gathering of PL/Scope data in my session:

ALTER SESSION SET plscope_settings='identifiers:all, statements:all'
/

Then I create a table, two functions and a procedure, so I can demonstrate this great application of PL/Scope:

CREATE TABLE my_data (n NUMBER)
/

CREATE OR REPLACE FUNCTION my_function1
   RETURN NUMBER
   AUTHID DEFINER
IS
BEGIN
   RETURN 1;
END;
/

CREATE OR REPLACE FUNCTION my_function2
   RETURN NUMBER
   AUTHID DEFINER
IS
BEGIN
   RETURN 1;
END;
/

CREATE OR REPLACE PROCEDURE my_procedure (n_in IN NUMBER)
   AUTHID DEFINER
IS
   l_my_data   my_data%ROWTYPE;
BEGIN
   SELECT my_function1 ()
     INTO l_my_data
     FROM my_data
    WHERE     n = n_in
          AND my_function2 () = 0
          AND n = (SELECT my_function1 () FROM DUAL);

   SELECT COUNT (*)
     INTO l_my_data
     FROM my_data
    WHERE n = n_in;

   UPDATE my_data
      SET n = my_function2 ()
    WHERE n = n_in;
END;
/

Note that only two of the three DML statements in MY_PROCEDURE contain a function call (the first query and the update).

Now I UNION ALL rows from ALL_STATEMENTS and ALL_IDENTIFIERS to get a full picture:

WITH one_obj_name AS (SELECT 'MY_PROCEDURE' object_name FROM DUAL)
    SELECT plscope_type,
           usage_id,
           usage_context_id,
           LPAD (' ', 2 * (LEVEL - 1)) || usage || ' ' || name usages
      FROM (SELECT 'ID' plscope_type,
                   ai.object_name,
                   ai.usage usage,
                   ai.usage_id,
                   ai.usage_context_id,
                   ai.TYPE || ' ' || ai.name name
              FROM all_identifiers ai, one_obj_name
             WHERE ai.object_name = one_obj_name.object_name
            UNION ALL
            SELECT 'ST',
                   st.object_name,
                   st.TYPE,
                   st.usage_id,
                   st.usage_context_id,
                   'STATEMENT'
              FROM all_statements st, one_obj_name
             WHERE st.object_name = one_obj_name.object_name)
START WITH usage_context_id = 0
CONNECT BY PRIOR usage_id = usage_context_id
/

And I see these results:

PLSCOPE_TYPE  USAGE_ID  USAGE_CONTEXT_ID  USAGES
ID            1         0                 DECLARATION PROCEDURE MY_PROCEDURE
ID            2         1                   DEFINITION PROCEDURE MY_PROCEDURE
ID            3         2                     DECLARATION FORMAL IN N_IN
ID            4         3                       REFERENCE NUMBER DATATYPE NUMBER
ID            5         2                     DECLARATION VARIABLE L_MY_DATA
ID            6         5                       REFERENCE TABLE MY_DATA
ST            7         2                     SELECT STATEMENT
ID            8         7                       REFERENCE TABLE MY_DATA
ID            9         7                       REFERENCE COLUMN N
ID            10        7                       REFERENCE FORMAL IN N_IN
ID            11        7                       REFERENCE COLUMN N
ID            13        7                       CALL FUNCTION MY_FUNCTION1
ID            14        7                       CALL FUNCTION MY_FUNCTION2
ID            15        7                       ASSIGNMENT VARIABLE L_MY_DATA
ID            16        15                        CALL FUNCTION MY_FUNCTION1
ST            17        2                     SELECT STATEMENT
ID            18        17                      REFERENCE TABLE MY_DATA
ID            19        17                      REFERENCE FORMAL IN N_IN
ID            20        17                      REFERENCE COLUMN N
ID            21        17                      ASSIGNMENT VARIABLE L_MY_DATA
ST            22        2                     UPDATE STATEMENT
ID            23        22                      REFERENCE TABLE MY_DATA
ID            24        22                      REFERENCE FORMAL IN N_IN
ID            25        22                      REFERENCE COLUMN N
ID            26        22                      REFERENCE COLUMN N
ID            27        22                      CALL FUNCTION MY_FUNCTION2


OK. Now let's get to the substance of this blog post. I use subquery factoring (the WITH clause) to create and then use some data sets:

  • my_prog_unit - specify the program unit of interest just once
  • full_set - the full set of statements and identifiers
  • dml_statements - the SQL DML statements in the program unit

Then I find all the DML statements whose tree in full_set contains a call to a function.

WITH my_prog_unit AS (SELECT USER owner, 'MY_PROCEDURE' object_name FROM DUAL),
     full_set
     AS (SELECT ai.usage,
                ai.usage_id,
                ai.usage_context_id,
                ai.TYPE,
                ai.name
           FROM all_identifiers ai, my_prog_unit
          WHERE ai.object_name = my_prog_unit.object_name
            AND ai.owner = my_prog_unit.owner
         UNION ALL
         SELECT st.TYPE,
                st.usage_id,
                st.usage_context_id,
                'type',
                'name'
           FROM all_statements st, my_prog_unit
          WHERE st.object_name = my_prog_unit.object_name
            AND st.owner = my_prog_unit.owner),
     dml_statements
     AS (SELECT st.owner, st.object_name, st.line, st.usage_id, st.type
           FROM all_statements st, my_prog_unit
          WHERE     st.object_name = my_prog_unit.object_name
                AND st.owner = my_prog_unit.owner
                AND st.TYPE IN ('SELECT', 'INSERT', 'UPDATE', 'DELETE', 'MERGE'))
SELECT st.owner,
       st.object_name,
       st.line,
       st.TYPE,
       s.text
  FROM dml_statements st, all_source s
 WHERE     ('CALL', 'FUNCTION') IN (    SELECT fs.usage, fs.TYPE
                                          FROM full_set fs
                                    CONNECT BY PRIOR fs.usage_id =
                                                  fs.usage_context_id
                                    START WITH fs.usage_id = st.usage_id)
       AND st.line = s.line
       AND st.object_name = s.name
       AND st.owner = s.owner
/

And I see these results:

STEVEN    MY_PROCEDURE    6     SELECT    SELECT my_function1 ()
STEVEN    MY_PROCEDURE    18    UPDATE    UPDATE my_data

Is that cool or what?

Tuesday, May 9, 2017

Use records to improve readability and flexibility of your code

Suppose I've created a table to keep track of hominids:

CREATE TABLE hominids
(
   hominid_name     VARCHAR2 (100),
   home_territory   VARCHAR2 (100),
   brain_size_cm    INTEGER
)
/

I might then write code like this:

DECLARE
   l_b_hominid_name    VARCHAR2 (100) := 'Bonobo';
   l_b_brain_size_cm   INTEGER := 500;
   l_g_hominid_name    VARCHAR2 (100) := 'Gorilla';
   l_g_brain_size_cm   INTEGER := 750;
   l_n_hominid_name    VARCHAR2 (100) := 'Neanderthal';
   l_n_brain_size_cm   INTEGER := 1800;

What do you think?

I find the little voice of Relational Theory inside my head rebelling.

"All that repetition! All that denormalization! All that typing (or copy-pasting, which is even worse)!"

Surely if I should avoid having redundant data in rows of my tables, I should avoid redundant code, too?

Yes, I should.  I don't like to see long lists of declarations, especially when the names are very similar and follow a pattern. 

A fine way to avoid this kind of code is to use record types to group related variables together within a named context: the record variable. So I could rewrite the declaration section above to:


DECLARE
   l_bonobo        hominids%ROWTYPE;
   l_gorilla       hominids%ROWTYPE;
   l_neanderthal   hominids%ROWTYPE;
BEGIN
   l_bonobo.hominid_name := 'Bonobo';
   l_bonobo.brain_size_cm := 500;
   l_gorilla.hominid_name := 'Gorilla';
   l_gorilla.brain_size_cm := 750;
   l_neanderthal.hominid_name := 'Neanderthal';
   l_neanderthal.brain_size_cm := 1800;

Notice that I now move the initializations of the variable (well, record.field) values to the executable section. That's because PL/SQL does not yet offer a built-in function (in object-oriented lingo, a constructor method) for record types.

So I no longer have six declarations - just three. And, of course, if my table had 15 columns and I had declared a separate variable for each of those, I would have been able to shrink down my declarations from 45 to 3!

Still, I don't like putting all that initialization code in the main body of my block. How about if I create my own "record constructor" function, and then call that:

CREATE OR REPLACE FUNCTION new_hominid (
   name_in IN hominids.hominid_name%TYPE,
   home_territory_in IN hominids.home_territory%TYPE,
   brain_size_cm_in IN hominids.brain_size_cm%TYPE)
   RETURN hominids%ROWTYPE
IS
   l_return hominids%ROWTYPE;
BEGIN
   l_return.hominid_name := name_in;
   l_return.home_territory := home_territory_in;
   l_return.brain_size_cm := brain_size_cm_in;
   RETURN l_return;
END;
/

DECLARE
   l_bonobo        hominids%ROWTYPE := new_hominid ('Bonobo', NULL, 500);
   l_gorilla       hominids%ROWTYPE := new_hominid ('Gorilla', NULL, 750);
   l_neanderthal   hominids%ROWTYPE := new_hominid ('Neanderthal', NULL, 1800);
BEGIN
   DBMS_OUTPUT.put_line (l_neanderthal.brain_size_cm);
END;
/

Ahhhhh. Just three declarations. Default values assigned in the declaration section. All the details of the assignments hidden away behind the function header.

And when I add a new column to the table (or generally a field to a record), I can add a parameter to my new_hominid function, along with a default value of NULL, and none of my existing code needs to change (unless that new column or field is needed).
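
Here is a sketch of what that might look like. The DIET column and the diet_in parameter are hypothetical, invented only for this illustration. Everything else comes from the code above.

```sql
-- Hypothetical new column, added only for illustration.
ALTER TABLE hominids ADD (diet VARCHAR2 (100))
/

CREATE OR REPLACE FUNCTION new_hominid (
   name_in             IN hominids.hominid_name%TYPE,
   home_territory_in   IN hominids.home_territory%TYPE,
   brain_size_cm_in    IN hominids.brain_size_cm%TYPE,
   -- New parameter goes last, with a default, so old calls still compile.
   diet_in             IN hominids.diet%TYPE DEFAULT NULL)
   RETURN hominids%ROWTYPE
IS
   l_return   hominids%ROWTYPE;
BEGIN
   l_return.hominid_name := name_in;
   l_return.home_territory := home_territory_in;
   l_return.brain_size_cm := brain_size_cm_in;
   l_return.diet := diet_in;
   RETURN l_return;
END;
/

DECLARE
   -- Existing three-argument call: unchanged.
   l_bonobo    hominids%ROWTYPE := new_hominid ('Bonobo', NULL, 500);
   -- New code supplies the extra value, here by name.
   l_gorilla   hominids%ROWTYPE
      := new_hominid ('Gorilla', NULL, 750, diet_in => 'Herbivore');
BEGIN
   DBMS_OUTPUT.put_line (l_gorilla.diet);
END;
/
```

Putting the new parameter at the end of the list, with a default, is what keeps the existing positional calls valid.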

Yes, I like that better.

How about you?

Wednesday, May 3, 2017

Getting my Oracle Database 12c Release 2 up and running on Mac via Docker

I love to follow in the footsteps of people who are braver, smarter and more knowledgeable than me.

So I was happy to wait until SQL Maria (Maria Colgan) published her blog post, Oracle Database 12c now available on Docker, with step-by-step instructions for taking advantage of the new Docker image for 12.2 (specifically, 12.2 via Docker on GitHub and 12.2 via Docker at the Docker Store).

I am happy to report that I can now connect SQL Developer to my containerized 12.2 database. Thank you, Maria, for a very helpful post!

Now, I am not going to repeat everything Maria already wrote. That would be silly. I will simply point out some things you might find helpful as you do the same thing I did (follow in Maria's footsteps - which, literally, meant lots of copy-pasting rather dumbly).

1. Watch out for those dashes when you copy/paste.

Docker was not responding as expected to my commands and I (well, actually, Gerald) eventually noticed that the dash, copied from the blog post, was too long - it had been translated into a different character. So watch out for that! You might need to retype the command yourself.

I hate that.

:-)

2. Create your own folder for your Oracle Database files. I know it should be obvious. But I am a copy-paste sorta guy, and probably the only one in the world who would copy this command into my terminal and expect it to work:

docker run --name oracle -p 1521:1521 -p 5500:5500 
-v /Users/mcolgan-mac/oradata:/opt/oracle/oradata 
oracle/database:12.2.0.1-ee

And it did - once I created my own folder for the files, and replaced that in the command.

Oh, and by the way, that entire command (once you swap out mcolgan-mac for your own folder) needs to be on one line.

After that, everything went very smoothly and, again following Maria's wonderfully clear steps, I had my database up and running.

Then I set up my connection in SQL Developer:



and voila! My own 12.2 database running in a Docker container, on my Mac.

Thanks, Maria!
Thanks, Gerald!
Thanks, Docker!
Thanks, Oracle!



Monday, May 1, 2017

Deterministic functions, caching, and worries about consistent data

A developer contacted me with the following questions last week:

We have created a function that returns a single row column value from a query. When we call this function with the same input values it takes too long to return. Example:

select max (det_function('A2')) from dual connect by rownum <= 1000000

But when we change the function to a deterministic function the statement returns really fast. The only thing where we are unsure is what happens when the tables has changed to which the statement of the function selects? Do we need to commit this table to bring Oracle to re-execute the statement in the function and not use the cache or what should we do to get a consistent return value?

FUNCTION det_function (v_in_id VARCHAR2) RETURN NUMBER DETERMINISTIC
AS
   v_ident   NUMBER;
BEGIN
   SELECT VALUE INTO v_ident
     FROM my_table
    WHERE id = v_in_id;

   RETURN v_ident;
EXCEPTION
   WHEN VALUE_ERROR OR NO_DATA_FOUND THEN RETURN -1;
END;

A function is deterministic if the value returned by the function is determined entirely by its input(s).

The following function, for example, is deterministic:

FUNCTION betwnstr (
   string_in      IN   VARCHAR2
 , start_in       IN   INTEGER
 , end_in         IN   INTEGER
)
   RETURN VARCHAR2
IS
BEGIN
   RETURN (SUBSTR (string_in, start_in, end_in - start_in + 1));
END betwnstr;

You can also quickly see, I hope, that any function that contains SQL (like the first function defined above) cannot possibly be deterministic: it depends on the contents of one or more tables to do its job, and those datasets are not passed as IN parameters.

Does that mean the compiler will complain? No! But it does mean that you could create real problems for yourself if you are not careful about your use of this keyword.

So the rule should be: Only add the DETERMINISTIC keyword to truly deterministic functions.

Why? Why should it matter? Because under certain circumstances (such as the one identified by the developer above), Oracle Database will not execute your function, but instead simply use a previously cached return value.

Within the scope of a single server call (e.g., execution of a PL/SQL block), Oracle Database will keep track of input and return values for your deterministic functions. If in that same server call, you pass the same input values to the function, the database engine may choose to not actually execute the function, but instead simply pass back the previously-computed return value (for those same inputs).

That's why this developer saw such a great leap forward in performance.

Once that SELECT statement finishes, though, memory for the cache is released. When and if that same query is run again, the engine will start rebuilding and using that cache.

While that statement is executing, though, no matter what sort of changes are made to the table, no matter if a commit is issued or not, those changes will not be visible to the statement that called the function.

That's why I will repeat The Rule again:

Only add the DETERMINISTIC keyword to truly deterministic functions.

If your function contains a SELECT statement and you want to call it from a SELECT statement, the best thing to do is take the SQL out of the function and "merge" it into your SQL - in other words, no user-defined functions. Just SQL.
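
As a rough sketch of that "merge", suppose the developer's function were being called once per row of some outer table. The MY_ORDERS table and its ORDER_ID and ITEM_ID columns are hypothetical, made up for this example. I am also assuming that ID is unique in MY_TABLE, so the join cannot multiply rows.

```sql
-- Before: one function call, and one query inside it, for each row.
-- SELECT o.order_id, det_function (o.item_id) FROM my_orders o;

-- After: just SQL. The outer join plus CASE reproduces the function's
-- "return -1 when no row is found" behavior.
SELECT o.order_id,
       CASE WHEN t.id IS NULL THEN -1 ELSE t.VALUE END item_value
  FROM my_orders o LEFT OUTER JOIN my_table t ON t.id = o.item_id
/
```

Now everything is read from a single read-consistent snapshot, and there is no function call per row.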

Rob van Wijk offers lots more details on the behavior and performance of deterministic functions here. You will also be well-served to read Bryn Llewellyn's in-depth exploration of How to write a safe result-cached function.

Rather than repeat all those findings, I will simply conclude with:

1. Use the DETERMINISTIC keyword primarily as a way to document to future developers that your function is currently free of side effects, and should stay that way.

2. If you are looking for ways to improve the performance of functions executed inside SQL, learn more about the UDF pragma (new in Oracle Database 12c Release 1).

3. See if the function result cache feature (also explored in Bryn's blog post) might be applicable to your situation.

4. Do not call, from SQL statements, user-defined functions that themselves contain SQL statements (or at least do so with extreme caution). The SQL inside the function is not part of the same read-consistent image as the data set identified by the "outer" SQL.
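
To make points 2 and 3 a little more concrete, here are two minimal sketches. The first adds PRAGMA UDF (and DETERMINISTIC, since it really is) to the betwnstr function shown earlier. The second is a result-cached lookup against the developer's MY_TABLE. The name cached_value_for is my own invention, and whether the result cache actually helps depends on how often that table changes.

```sql
-- Point 2: a truly deterministic function, marked as called mostly from SQL.
CREATE OR REPLACE FUNCTION betwnstr (
   string_in   IN VARCHAR2,
   start_in    IN INTEGER,
   end_in      IN INTEGER)
   RETURN VARCHAR2
   DETERMINISTIC
IS
   PRAGMA UDF;
BEGIN
   RETURN (SUBSTR (string_in, start_in, end_in - start_in + 1));
END betwnstr;
/

-- Point 3: a lookup that depends on table data, so NOT deterministic.
-- RESULT_CACHE lets the database reuse results across calls and sessions
-- until a committed change to MY_TABLE invalidates them.
-- CACHED_VALUE_FOR is a hypothetical name used only in this sketch.
CREATE OR REPLACE FUNCTION cached_value_for (id_in IN VARCHAR2)
   RETURN NUMBER
   RESULT_CACHE
IS
   l_value   NUMBER;
BEGIN
   SELECT VALUE
     INTO l_value
     FROM my_table
    WHERE id = id_in;

   RETURN l_value;
EXCEPTION
   WHEN NO_DATA_FOUND
   THEN
      RETURN -1;
END;
/
```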

Thursday, April 20, 2017

Tips for getting along with your DBA


Developers and DBAs: can't we all just get along?

Sure we can!

We just have to break out of the old routine of

Developer: Hey, DBA, add twelve indexes to make my code run faster!
DBA: Hey, Developer, tune your code to make it run faster!

That is, finger-pointing.

Instead, we need to work together, and, developers, I am not the least bit reluctant to say:

It's up to us, not the DBAs, to take the first steps.

So here are tips on what you, the developer, can do to foster a strong, collaborative and highly productive relationship with your DBA:

1. Ask your DBA for advice. 

"I want to make my code run faster. What do you think I should do?" There's no better way to improve a relationship than to show some humility and express interest in the opinions - and knowledge - of others.

2. Do the right thing. 

Learn about the performance-related features of PL/SQL (and SQL) and apply them. Here are some links to help get started:

PL/SQL Optimization and Tuning (Doc)
High Performance PL/SQL Videos
SQL Analytics Videos by Connor McDonald
Introduction to Indexing Videos by Chris Saxon

3. Give your DBA a heads-up when your pattern of writing code changes. 

Utilizing new and different features of PL/SQL can have a ripple effect on memory consumption and overall application performance. Don't blindside your DBA.

For example, you learn about executing "bulk SQL" from PL/SQL. So cool! So powerful! And potentially a big PGA memory suck, through the use of collections.
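
One way to keep that memory consumption in check is to fetch in batches with the LIMIT clause rather than pulling every row into a collection at once. Here is a minimal sketch. The MY_EMPLOYEES table, its columns and the batch size of 100 are all assumptions for illustration, and the right limit is something to work out with your DBA.

```sql
DECLARE
   -- MY_EMPLOYEES (employee_id, salary) is a hypothetical table.
   CURSOR emps_cur
   IS
      SELECT employee_id FROM my_employees;

   TYPE ids_t IS TABLE OF my_employees.employee_id%TYPE;

   l_ids   ids_t;
BEGIN
   OPEN emps_cur;

   LOOP
      -- Never hold more than 100 elements in PGA memory at a time.
      FETCH emps_cur BULK COLLECT INTO l_ids LIMIT 100;

      EXIT WHEN l_ids.COUNT = 0;

      -- One context switch per batch instead of one per row.
      FORALL indx IN 1 .. l_ids.COUNT
         UPDATE my_employees
            SET salary = salary * 1.1
          WHERE employee_id = l_ids (indx);
   END LOOP;

   CLOSE emps_cur;
END;
/
```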

Or you discover the Function Result Cache. Another very exciting enhancement added in 11.1. "Hey, I'm going to add the RESULT_CACHE clause to 100 functions. So easy!" Yes, but you might kill overall database activity with latch contention.