Tuesday, August 25, 2015

Table Functions, Part 5b: Table Functions vs Pipelined Table Functions

Last week, I published my first post in the Table Function series on pipelined table functions. I included this advice at the top of the post:

Here's my first piece of advice regarding this interesting and in many ways unique (within PL/SQL) feature:
You are unlikely to ever need a pipelined table function, but when you do, it is going to be amazing.
I bring this up right at the start because I have found over the years that many developers talk in the most glowing terms about pipelined table functions. Yet when I look at what they are doing with those functions, it becomes clear that they are not pumped up about the pipelining. They are not, in fact, benefiting at all from the pipelining feature.

Instead, they are excited simply about table functions. And that's cool, that's fine. Table functions are, indeed, quite awesome. Read my series! :-)

Part of the reason I gave this advice is that I'd run some performance tests on TFs and PTFs and noticed that PTFs seemed to run a bit slower than TFs when you weren't taking advantage of pipelining. So I didn't want to be too encouraging of users to change all their TFs into PTFs.

But then Patrick Barel pointed readers to his article on PTFs, which concluded with this advice:
Bottom line is to always use pipelined table functions instead of normal table functions, except when the function is only meant for PL/SQL or client consumption
Well, gee, that sorta conflicted with my advice. But I didn't get mad at Patrick. No, sir, I did not. And that's because I am aware, more than anyone else in the world (I hope), of how often I have been wrong, and how much I have learned from others.

So instead, Patrick, Erik Van Roon and Kim Berg Hansen and I all embarked on a small exploration into the comparable performances of TFs and PTFs. And that (along with a conversation with PL/SQL Product Manager, Bryn Llewellyn, led me to re-evaluate my thinking (um, fancy way of saying: I was wrong).

So I have now changed my original post on pipelined table functions, and I will share with you some of what Patrick, Erik and Kim discovered. If I have not done each of you justice, let us know!

Let's start with Kim's nice encapsulation from Patrick's article of a recommendation for choosing between TFs and PTFs:
  • If your table function is only used within SQL [[ SELECT ... FROM TABLE( myfunc() ) ... ]], then just go ahead and always use pipelined table functions.
  • If your table function is only assigned directly to collection variable (either PL/SQL or client language), then you must use a regular table function.
  • If your table function needs to be used in both places (SQL and PL/SQL):
    • If it is relatively small amounts of data, just use the regular table function, it'll be OK for your SQL too.
    • If it is relatively large amounts of data, consider having two versions of the function:
      • One version pipelined for SQL TABLE() calls.
      • One version regular for collection assignments. This may be a simple wrapper that populates and returns a collection with the results of the pipelined version.
Hopefully that is clear enough. So the big question we tried to get our heads around was: are table functions faster or slower than pipelined table functions (when you are not taking advantage of the pipelining effect)? 

I will hand over the microphone to Erik to sum up the results we got, after running a variety of tests:

There seems to be a slight difference in performance between the two types of table functions. Apart from Kim’s test they all seem to be in favor of the pipelined version. 

However, the difference is minimal, ranging from 0.5 – 15%, while the absolute differences range from 0.05 – 3 seconds. And that is for processing a million rows.

As Patrick says, and as we all know “the argument for using Pipelined Table Functions is mostly the impact on the PGA”. In fact, with my current settings, if I run my script for 10 million records the PTF’s still run fine, but the TF as can be expected crashes with memory errors.

I think in general the cost of overhead of one or the other should not be a consideration.

Erik

I will leave it to Kim, Erik and Patrick to decide if they would like to publish the details of their test runs (including code? My readers love to get their hands on code written by experts!).

But I would like to thank all three of you for helping keep me honest and accurate and up-to-date. I had fallen into a trap I warn others of ("A long, long time ago, my tests showed that PTFs were slower than TFs, so watch out.").

Any time your statement starts (silently or otherwise) with "A long time ago..." or "Back in Oracle9i...", it is time to challenge those assumptions and run some new tests.

Especially before posting a new entry on my blog.

2 comments:

  1. I'm relieved to hear that Bryn didn't refute the findings of the test ;-)

    ReplyDelete
  2. After we talked about it for awhile, Bryn's basic conclusion was that performance should not be the guiding concern. Pipelining was designed for a specific use case: calling a table function exclusively in the FROM clause of a SELECT. So when you are doing that, use a PTF. Which matches up very nicely with your write-up, Kim.

    ReplyDelete