Wednesday, April 15, 2015

Table Functions, Part 2: Returning complex (non-scalar) datasets

In my introduction to table functions, I showed how to build and "query" from a table function that returns a collection of scalars (number, date, string, etc.).

If that's all you need to do, well, lucky you!

Most of the time, however, you need to pass back a row of data consisting of more than one value, just as you would with "regular" tables when, say, you needed the ID and last name of all employees in department 10, as in:

SELECT employee_id, last_name
  FROM employees
 WHERE department_id = 10

This post explores how you can go about doing that with table functions.

So, first, let's explore how not to do it. That's usually a bit more entertaining. :-)

Let's take the very simple example above: return an ID and a name. Let's further assume that this data is not readily available from a table - we need a table function to put it all together.

Here's one approach I could take:

CREATE OR REPLACE TYPE two_values_in_one_t
   IS TABLE OF VARCHAR2 (4000)
/

CREATE OR REPLACE FUNCTION tabfunc_concat
   RETURN two_values_in_one_t
IS
   l_return   two_values_in_one_t := two_values_in_one_t ();
BEGIN
   l_return.EXTEND (2);
   l_return (1) := '143-Feuerstein';
   l_return (2) := '147-Silva';

   RETURN l_return;
END;
/

SELECT COLUMN_VALUE 
  FROM TABLE (tabfunc_concat ())
 ORDER BY COLUMN_VALUE
/

COLUMN_VALUE
-------------------------
143-Feuerstein
147-Silva

But wait! That's just a single value, not two values. So what I really need is something more like this:

  SELECT SUBSTR (COLUMN_VALUE, 1, INSTR (COLUMN_VALUE, '-') - 1)
            employee_id,
         SUBSTR (COLUMN_VALUE, INSTR (COLUMN_VALUE, '-') + 1)
            last_name
    FROM TABLE (tabfunc_concat ())
ORDER BY employee_id
/

I could even put it inside a view:

CREATE OR REPLACE VIEW two_values_v
AS
     SELECT SUBSTR (COLUMN_VALUE,
                    1,
                    INSTR (COLUMN_VALUE, '-') - 1)
               employee_id,
            SUBSTR (COLUMN_VALUE, INSTR (COLUMN_VALUE, '-') + 1)
               last_name
       FROM TABLE (tabfunc_concat ())
   ORDER BY employee_id;

SELECT * FROM two_values_v
 ORDER BY employee_id
/

EMPLOYEE_ID LAST_NAME
----------- -------------
143         Feuerstein
147         Silva

Is that slick or what?

Actually, well, "what". Not very slick at all. Ugly. Cumbersome. Error prone.

What if, for  example, an employee's name was hyphenated, as in "Silva-Feuerstein"?

So, yes, sure, you could take this approach when you need to pass back multiple values per row, but it's awfully kludgy and will not enhance your reputation with other developers on your team.

Which means it's time for....another way not to do it. :-)

Suppose I have a package defined as follows:

CREATE OR REPLACE PACKAGE two_values_pkg
AS
   TYPE two_values_rt IS RECORD
   (
      employee_id   INTEGER,
      last_name     VARCHAR2 (100)
   );
END;
/

Then it seems perfectly reasonable and "the way to go" to take the following steps:

1. Create a nested table type based on the record type.

/* Warning! This will not work. */
CREATE OR REPLACE TYPE two_values_nt
   IS TABLE OF two_values_pkg.two_values_rt
/

2. Build the function around that nested table, populating values of fields of each of two records:

CREATE OR REPLACE FUNCTION tabfunc_no_concat
   RETURN two_values_nt
IS
   l_return   two_values_in_one_t := two_values_in_one_t ();
BEGIN
   l_return.EXTEND (2);
   
   l_return (1).employee_id := '143';
   l_return (1).last_name := 'Feuerstein';
   
   l_return (2).employee_id := '147';
   l_return (3).last_name := 'Silva';

   RETURN l_return;
END;
/

Yes, it seems so reasonable - at first glance. And yet when I try to create that type, I see:

PLS-00201: identifier 'TWO_VALUES_PKG.TWO_VALUES_RT' must be declared

Must be declared? But it is declared. The following block, for example, executes without any kind of compilation error (when you see "PLS" you know that is a compilation error):

DECLARE
   r   two_values_pkg.two_values_rt;
BEGIN
   NULL;
END;
/

Yes, well. Let's face it: PL/SQL is very gracious about understanding and leveraging PL/SQL. SQL, on the other hand, is a bit arrogant. Sure, it will recognize and execute functions declared at the schema level (CREATE OR REPLACE FUNCTION) or in a package specification. But outside of that, SQL really doesn't want to have anything to do with PL/SQL.

OK, kidding aside. It is worth remembering that "PL/SQL" stands for Procedural Language extensions to SQL. So of course it must know about and support SQL - that is, in fact, one of the most wonderful things about PL/SQL: native support for SQL.

But SQL exists independently of PL/SQL and so it would make sense that it does not recognize a whole boatload of procedural constructs from The Language That Changed My Life.

Bottom line: you cannot reference a PL/SQL record type (even one defined with %ROWTYPE against a database table) as the datatype of a schema-level nested table.

Bummer, 'cause that's a fast and easy way to do things.

Instead, I must dip my toe into the world of object-oriented programming in PL/SQL and create an object type to serve as the base datatype for the nested table.

[Note: there is an exception to this rule, as noted by Patrick Barel in comments below. If you are defining a pipelined table function, then you can define your nested table type in a package specification, based on a record type.]

Now, if you are one of the 99% of PL/SQL developers who are not comfy with object orientation and have never had the inclination or reason to learn about PL/SQL's OO features, don't worry! You will not have to do so in order to use object types with your table functions.

You will, in essence, create an object type, just like you create a relational table or a record type, defining an attribute (analogous to a record's field) for each distinct value you want returned in the row from the table function.

To do this, you do not need to understand object-oriented principles, nor pretend to get all excited about dynamic polymorphism.

Instead, create the object type:

CREATE OR REPLACE TYPE two_values_ot AS OBJECT
   (
      employee_id   INTEGER,
      last_name     VARCHAR2 (100)
   );
/

Then create a nested table based on that type:

CREATE OR REPLACE TYPE two_values_nt
   IS TABLE OF two_values_ot
/

Then return that nested table type in the function and populate the nested table with two instances of that type:

CREATE OR REPLACE FUNCTION tabfunc_no_concat
   RETURN two_values_nt
IS
   l_return two_values_nt := 
      two_values_nt (two_values_ot (143, 'Feuerstein'),
                     two_values_ot (147, 'Silva'));
BEGIN
   RETURN l_return;
END;
/

and then....voila!

SELECT employee_id, last_name 
  FROM TABLE (tabfunc_no_concat ())
 ORDER BY employee_id
/

EMPLOYEE_ID LAST_NAME
----------- -------------
143         Feuerstein
147         Silva

Notice:

1.  That hard-coded COLUMN_VALUE is no longer a part of the equation. The names of the object type attributes become the names of the "columns" in the SELECT list. You can reference those names in the WHERE clause, ORDER BY clause, etc.

2. In this latest table function, I took advantage of the constructor functions that come with both nested tables and object types to set all the desired values to be returned by the table function right in the declaration section. This is not something you can do with records and associative arrays. That's very handy, but practically speaking will not likely be the way you build your datasets within a table function.

So now you know the secret handshake: when your table function needs to return rows with more than one column, create an object type that defines the desired structure, build the nested table on that object type, and then have the table function return a collection that that nested table type.

Links to Table Function Series

5 comments:

  1. You stated:
    ----
    Bottom line: you cannot reference a PL/SQL record type (even one defined with %ROWTYPE against a database table) as the datatype of a schema-level nested table.
    ----
    But I was doing a presentation on this subject at the OUGN conference and someone came up to me saying: that isn't quite true.
    He sent me a testscript, which I adjusted a bit (naming, formatting) to show me what he meant:
    ----- script begins ------
    CREATE OR REPLACE PACKAGE tf_demo AS

    TYPE emps IS TABLE OF emp%ROWTYPE;
    FUNCTION getemps RETURN emps
    PIPELINED;

    TYPE userrec IS RECORD(
    id NUMBER
    ,VALUE VARCHAR2(50));
    TYPE userrecs IS TABLE OF userrec;
    FUNCTION getuserrecs RETURN userrecs
    PIPELINED;

    END tf_demo;
    /
    sho err

    CREATE OR REPLACE PACKAGE BODY tf_demo AS

    FUNCTION getemps RETURN emps
    PIPELINED AS
    outp emp%ROWTYPE;
    BEGIN
    FOR outp IN (SELECT * FROM emp) LOOP
    PIPE ROW(outp);
    END LOOP;
    RETURN;
    END getemps;

    FUNCTION getuserrecs RETURN userrecs
    PIPELINED AS
    outp tf_demo.userrec;
    BEGIN
    FOR i IN 1 .. 14 LOOP
    outp.id := i;
    outp.value := 'iteration:' || i;
    PIPE ROW(outp);
    END LOOP;
    RETURN;
    END getuserrecs;

    END tf_demo;
    /
    sho err

    select * from table(tf_demo.getemps);
    select * from table(tf_demo.getuserrecs);
    ----- script ends ------
    I tested this on Oracle 10G (10.2.0.4.0) and 11G (11.2.0.2.0) and it works as he told me.
    Thanks to Kristian Saksen for pointing this out to me.

    ReplyDelete
  2. Thanks, Patrick and Kristian. Yes, for pipelined functions that exception holds: you can use a package-based nested table type and then you can use %ROWTYPE to define the return type. I will point out that exception in the [corrected] post.

    ReplyDelete
  3. Hello All,

    I think that it is an issue of terminology here:

    When we talk about "datatype of a schema-level nested table" we mean a nested table that was created using a

    CREATE TYPE my_nested_table AS TABLE OF ...

    and, as such, Steven's statement is correct.

    A type defined inside a package is, well ... NOT exactly what we call
    "schema-level nested table" ... though, strictly speaking,
    "sitting inside a package" means that is also "sits" inside a schema ...
    as opposed to being defined in an ad-hoc anonymous block ...

    It is true that behind the scenes Oracle creates schema-level types
    for supporting PIPELINED functions that return collections defined inside packages, but this still does not make OUR package level type
    to be a "schema-level nested table".
    The supporting types automatically created by Oracle are, indeed,
    schema-level nested tables.

    What to do ...
    PL/SQL Challenge players have learned to be very strict about terminology :):)

    Thanks a lot & Best Regards,
    Iudith





    ReplyDelete
  4. Thank you! You are the best teacher.

    ReplyDelete