
Table Functions, Part 5b: Table Functions vs Pipelined Table Functions

August 2018 update: please do feel encouraged to read this and my other posts on table functions, but you will learn much more about table functions by taking my Get Started with PL/SQL Table Functions class at the Oracle Dev Gym. Videos, tutorials and quizzes - then print a certificate when you are done!

Last week, I published my first post in the Table Function series on pipelined table functions. I included this advice at the top of the post:

Here's my first piece of advice regarding this interesting and in many ways unique (within PL/SQL) feature:
You are unlikely to ever need a pipelined table function, but when you do, it is going to be amazing.
I bring this up right at the start because I have found over the years that many developers talk in the most glowing terms about pipelined table functions. Yet when I look at what they are doing with those functions, it becomes clear that they are not pumped up about the pipelining. They are not, in fact, benefiting at all from the pipelining feature.

Instead, they are excited simply about table functions. And that's cool, that's fine. Table functions are, indeed, quite awesome. Read my series! :-)

Part of the reason I gave this advice is that I'd run some performance tests on TFs and PTFs and noticed that PTFs seemed to run a bit slower than TFs when you weren't taking advantage of pipelining. So I didn't want to be too encouraging of users to change all their TFs into PTFs.

But then Patrick Barel pointed readers to his article on PTFs, which concluded with this advice:
Bottom line is to always use pipelined table functions instead of normal table functions, except when the function is only meant for PL/SQL or client consumption
Well, gee, that sorta conflicted with my advice. But I didn't get mad at Patrick. No, sir, I did not. And that's because I am aware, more than anyone else in the world (I hope), of how often I have been wrong, and how much I have learned from others.

So instead, Patrick, Erik Van Roon, Kim Berg Hansen and I all embarked on a small exploration into the comparative performance of TFs and PTFs. And that (along with a conversation with PL/SQL Product Manager Bryn Llewellyn) led me to re-evaluate my thinking (um, fancy way of saying: I was wrong).

So I have now changed my original post on pipelined table functions, and I will share with you some of what Patrick, Erik and Kim discovered. If I have not done each of you justice, let us know!

Let's start with Kim's nice encapsulation from Patrick's article of a recommendation for choosing between TFs and PTFs:
  • If your table function is only used within SQL [[ SELECT ... FROM TABLE( myfunc() ) ... ]], then just go ahead and always use pipelined table functions.
  • If your table function is only assigned directly to a collection variable (in either PL/SQL or a client language), then you must use a regular table function.
  • If your table function needs to be used in both places (SQL and PL/SQL):
    • If it is relatively small amounts of data, just use the regular table function, it'll be OK for your SQL too.
    • If it is relatively large amounts of data, consider having two versions of the function:
      • One version pipelined for SQL TABLE() calls.
      • One version regular for collection assignments. This may be a simple wrapper that populates and returns a collection with the results of the pipelined version.
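
Kim's "two versions" recommendation can be sketched roughly as follows. The names here (names_t, names_pkg, get_names and so on) are hypothetical, invented purely for illustration; they are not taken from anyone's actual test code:

```sql
CREATE OR REPLACE TYPE names_t IS TABLE OF VARCHAR2 (100);
/

CREATE OR REPLACE PACKAGE names_pkg
IS
   -- Pipelined version: for SELECT ... FROM TABLE (...) calls in SQL.
   FUNCTION get_names_piped RETURN names_t PIPELINED;

   -- Regular version: for direct assignment to a collection variable.
   FUNCTION get_names RETURN names_t;
END names_pkg;
/

CREATE OR REPLACE PACKAGE BODY names_pkg
IS
   FUNCTION get_names_piped RETURN names_t PIPELINED
   IS
   BEGIN
      FOR indx IN 1 .. 3
      LOOP
         -- Each row is handed to the consuming query as soon as it is
         -- piped; the full collection never materializes in the PGA.
         PIPE ROW ('Name ' || indx);
      END LOOP;

      RETURN;
   END;

   FUNCTION get_names RETURN names_t
   IS
      l_names   names_t;
   BEGIN
      -- Simple wrapper: drain the pipelined version into a collection.
      SELECT COLUMN_VALUE
        BULK COLLECT INTO l_names
        FROM TABLE (get_names_piped);

      RETURN l_names;
   END;
END names_pkg;
/
```

With this arrangement, SQL callers query TABLE (names_pkg.get_names_piped) and get the pipelining benefit, while PL/SQL callers assign names_pkg.get_names to a local collection variable.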
Hopefully that is clear enough. So the big question we tried to get our heads around was: are table functions faster or slower than pipelined table functions (when you are not taking advantage of the pipelining effect)? 
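
To make the question concrete, here is a rough sketch of how such a timing comparison might look (this is not the actual test script any of us ran). The names tf_version and ptf_version are hypothetical placeholders for a regular and a pipelined table function, each accepting a row count; DBMS_UTILITY.GET_TIME reports elapsed time in hundredths of a second:

```sql
DECLARE
   l_start   PLS_INTEGER;
   l_count   PLS_INTEGER;
BEGIN
   -- Time the regular table function in a FROM clause.
   l_start := DBMS_UTILITY.get_time;

   SELECT COUNT (*) INTO l_count
     FROM TABLE (tf_version (1000000));

   DBMS_OUTPUT.put_line (
      'TF : ' || (DBMS_UTILITY.get_time - l_start) || ' cs');

   -- Time the pipelined version the same way.
   l_start := DBMS_UTILITY.get_time;

   SELECT COUNT (*) INTO l_count
     FROM TABLE (ptf_version (1000000));

   DBMS_OUTPUT.put_line (
      'PTF: ' || (DBMS_UTILITY.get_time - l_start) || ' cs');
END;
/
```
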

I will hand over the microphone to Erik to sum up the results we got, after running a variety of tests:

There seems to be a slight difference in performance between the two types of table functions. Apart from Kim’s test they all seem to be in favor of the pipelined version. 

However, the difference is minimal, ranging from 0.5 – 15%, while the absolute differences range from 0.05 – 3 seconds. And that is for processing a million rows.

As Patrick says, and as we all know, “the argument for using Pipelined Table Functions is mostly the impact on the PGA”. In fact, with my current settings, if I run my script for 10 million records the PTFs still run fine, but the TF, as can be expected, crashes with memory errors.

I think that, in general, the overhead cost of one or the other should not be a consideration.

Erik
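
Erik's PGA observation is worth illustrating. In the hypothetical sketch below (again, names invented for illustration), the regular function must materialize the entire collection in session (PGA) memory before it can return, while the pipelined version hands each row to the consuming query as it is produced, so its memory footprint stays flat no matter how many rows it pipes:

```sql
CREATE OR REPLACE TYPE numbers_t IS TABLE OF NUMBER;
/

-- Regular: allocates PGA for all 10,000,000 elements at once,
-- which is what can lead to ORA-04030 at larger volumes.
CREATE OR REPLACE FUNCTION lots_of_numbers RETURN numbers_t
IS
   l_data   numbers_t := numbers_t ();
BEGIN
   l_data.EXTEND (10000000);

   FOR indx IN 1 .. l_data.COUNT
   LOOP
      l_data (indx) := indx;
   END LOOP;

   RETURN l_data;
END;
/

-- Pipelined: rows stream to the SQL engine one at a time;
-- PGA consumption stays flat regardless of row count.
CREATE OR REPLACE FUNCTION lots_of_numbers_piped RETURN numbers_t PIPELINED
IS
BEGIN
   FOR indx IN 1 .. 10000000
   LOOP
      PIPE ROW (indx);
   END LOOP;

   RETURN;
END;
/
```
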

I will leave it to Kim, Erik and Patrick to decide if they would like to publish the details of their test runs (including code? My readers love to get their hands on code written by experts!).

But I would like to thank all three of you for helping keep me honest and accurate and up-to-date. I had fallen into a trap I warn others of ("A long, long time ago, my tests showed that PTFs were slower than TFs, so watch out.").

Any time your statement starts (silently or otherwise) with "A long time ago..." or "Back in Oracle9i...", it is time to challenge those assumptions and run some new tests.

Especially before posting a new entry on my blog.

Comments

  1. I'm relieved to hear that Bryn didn't refute the findings of the test ;-)

  2. After we talked about it for a while, Bryn's basic conclusion was that performance should not be the guiding concern. Pipelining was designed for a specific use case: calling a table function exclusively in the FROM clause of a SELECT. So when you are doing that, use a PTF. Which matches up very nicely with your write-up, Kim.
