
Table Functions, Part 5b: Table Functions vs Pipelined Table Functions

August 2018 update: please do feel encouraged to read this and my other posts on table functions, but you will learn much more about table functions by taking my Get Started with PL/SQL Table Functions class at the Oracle Dev Gym. Videos, tutorials and quizzes - then print a certificate when you are done!

Last week, I published my first post in the Table Function series on pipelined table functions. I included this advice at the top of the post:

Here's my first piece of advice regarding this interesting and in many ways unique (within PL/SQL) feature:
You are unlikely to ever need a pipelined table function, but when you do, it is going to be amazing.
I bring this up right at the start because I have found over the years that many developers talk in the most glowing terms about pipelined table functions. Yet when I look at what they are doing with those functions, it becomes clear that they are not pumped up about the pipelining. They are not, in fact, benefiting at all from the pipelining feature.

Instead, they are excited simply about table functions. And that's cool, that's fine. Table functions are, indeed, quite awesome. Read my series! :-)
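To make the distinction concrete, here is a minimal sketch of the two styles (all names are hypothetical, for illustration only). A regular table function builds an entire collection and returns it in one step; a pipelined table function pipes each row back to the calling SQL statement as it is produced.

```sql
CREATE OR REPLACE TYPE names_t AS TABLE OF VARCHAR2 (100);
/

-- Regular table function: builds the entire collection in memory,
-- then returns it in a single step.
CREATE OR REPLACE FUNCTION names_tf
   RETURN names_t
IS
   l_names   names_t := names_t ('Patrick', 'Erik', 'Kim');
BEGIN
   RETURN l_names;
END names_tf;
/

-- Pipelined table function: pipes each row back to the calling
-- SQL statement as it is produced; no collection is returned.
CREATE OR REPLACE FUNCTION names_ptf
   RETURN names_t
   PIPELINED
IS
   l_names   names_t := names_t ('Patrick', 'Erik', 'Kim');
BEGIN
   FOR indx IN 1 .. l_names.COUNT
   LOOP
      PIPE ROW (l_names (indx));
   END LOOP;

   RETURN;
END names_ptf;
/
```

From SQL, the two are invoked identically - SELECT COLUMN_VALUE FROM TABLE (names_tf ()) or FROM TABLE (names_ptf ()). The difference lies in how the rows are delivered, not in how they are consumed.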

Part of the reason I gave this advice is that I'd run some performance tests on TFs and PTFs and noticed that PTFs seemed to run a bit slower than TFs when you weren't taking advantage of pipelining. So I didn't want to encourage users too strongly to change all their TFs into PTFs.

But then Patrick Barel pointed readers to his article on PTFs, which concluded with this advice:
Bottom line is to always use pipelined table functions instead of normal table functions, except when the function is only meant for PL/SQL or client consumption.
Well, gee, that sorta conflicted with my advice. But I didn't get mad at Patrick. No, sir, I did not. And that's because I am aware, more than anyone else in the world (I hope), of how often I have been wrong, and how much I have learned from others.

So instead, Patrick, Erik Van Roon, Kim Berg Hansen and I all embarked on a small exploration into the comparable performances of TFs and PTFs. And that (along with a conversation with PL/SQL Product Manager Bryn Llewellyn) led me to re-evaluate my thinking (um, fancy way of saying: I was wrong).

So I have now changed my original post on pipelined table functions, and I will share with you some of what Patrick, Erik and Kim discovered. If I have not done each of you justice, let us know!

Let's start with Kim's nice encapsulation from Patrick's article of a recommendation for choosing between TFs and PTFs:
  • If your table function is only used within SQL [[ SELECT ... FROM TABLE( myfunc() ) ... ]], then just go ahead and always use pipelined table functions.
  • If your table function is only assigned directly to a collection variable (either in PL/SQL or a client language), then you must use a regular table function.
  • If your table function needs to be used in both places (SQL and PL/SQL):
    • If it is relatively small amounts of data, just use the regular table function, it'll be OK for your SQL too.
    • If it is relatively large amounts of data, consider having two versions of the function:
      • One version pipelined for SQL TABLE() calls.
      • One version regular for collection assignments. This may be a simple wrapper that populates and returns a collection with the results of the pipelined version.
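That last bullet - the wrapper pattern - can be sketched like this. I am assuming a nested table type names_t and a pipelined function names_ptf already exist; all names here are hypothetical:

```sql
-- Hypothetical wrapper: a regular table function for PL/SQL callers
-- (collection assignments), implemented on top of the pipelined version.
CREATE OR REPLACE FUNCTION names_tf_wrapper
   RETURN names_t
IS
   l_names   names_t;
BEGIN
   SELECT COLUMN_VALUE
     BULK COLLECT INTO l_names
     FROM TABLE (names_ptf ());

   RETURN l_names;
END names_tf_wrapper;
/

-- PL/SQL callers can then assign the result directly:
DECLARE
   l_names   names_t := names_tf_wrapper ();
BEGIN
   DBMS_OUTPUT.put_line (l_names.COUNT);
END;
/
```

This way the pipelined version remains the single point of definition for the logic, and the wrapper simply materializes its output for callers that need a collection.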
Hopefully that is clear enough. So the big question we tried to get our heads around was: are table functions faster or slower than pipelined table functions (when you are not taking advantage of the pipelining effect)? 

I will hand over the microphone to Erik to sum up the results we got, after running a variety of tests:

There seems to be a slight difference in performance between the two types of table functions. Apart from Kim's test, they all seem to be in favor of the pipelined version.

However, the difference is minimal, ranging from 0.5–15%, while the absolute differences range from 0.05–3 seconds. And that is for processing a million rows.

As Patrick says, and as we all know, "the argument for using Pipelined Table Functions is mostly the impact on the PGA". In fact, with my current settings, if I run my script for 10 million records the PTFs still run fine, but the TF, as can be expected, crashes with memory errors.

I think that, in general, the overhead cost of one or the other should not be a consideration.

Erik
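Erik's point about the PGA is the heart of the matter, and it is worth seeing why. Here is a sketch (hypothetical names) of the kind of row generator involved:

```sql
CREATE OR REPLACE TYPE numbers_t AS TABLE OF NUMBER;
/

-- Hypothetical row generator. Because it is PIPELINED and called from
-- the FROM clause, each row streams straight to the SQL engine; the
-- ten-million-element collection is never materialized in the PGA.
CREATE OR REPLACE FUNCTION lots_of_rows (count_in IN PLS_INTEGER)
   RETURN numbers_t
   PIPELINED
IS
BEGIN
   FOR indx IN 1 .. count_in
   LOOP
      PIPE ROW (indx);
   END LOOP;

   RETURN;
END lots_of_rows;
/

SELECT COUNT (*) FROM TABLE (lots_of_rows (10000000));
```

A regular table function doing the same job would first have to build the full ten-million-element collection in session memory - exactly the scenario that produced Erik's memory errors.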

I will leave it to Kim, Erik and Patrick to decide if they would like to publish the details of their test runs (including code? My readers love to get their hands on code written by experts!).

But I would like to thank all three of you for helping keep me honest and accurate and up-to-date. I had fallen into a trap I warn others of ("A long, long time ago, my tests showed that PTFs were slower than TFs, so watch out.").

Any time your statement starts (silently or otherwise) with "A long time ago..." or "Back in Oracle9i...", it is time to challenge those assumptions and run some new tests.

Especially before posting a new entry on my blog.

Comments

  1. I'm relieved to hear that Bryn didn't refute the findings of the test ;-)

  2. After we talked about it for a while, Bryn's basic conclusion was that performance should not be the guiding concern. Pipelining was designed for a specific use case: calling a table function exclusively in the FROM clause of a SELECT. So when you are doing that, use a PTF. Which matches up very nicely with your write-up, Kim.

