Got an email from an expert Oracle Database developer (let's call him Sheldon, in honor of my dad, who taught himself programming - RPG, no less!) yesterday:
CREATE OR REPLACE PACKAGE pkg
Is inlining (via PRAGMA INLINE) even something worth talking about? I’m getting the feeling that it’s not. I’m trying to come up with an example to show folks the beauty of inlining function calls. And I’m getting the idea that either oracle is doing something like inlining code all the time or is just really super-efficient on the calls these days.
Well, sure, Oracle Database is really super-efficient. Always. Automagically. Don't have to worry about doing any tuning yourself. Oh, wow, what a wonderful world!
Pinch. Wake up! That was a dream you were having, Steven.
Oracle Database does an awful lot to ensure high performance for our applications, but it's not yet fully automatic or automagic. We can do things to drastically screw up performance. And we can also take actions, both in the way we write our code and in the features we utilize, to improve performance.
Inlining is one of those. First, an explanation of the basic concept, then I will address Sheldon's question.
Suppose that I have written very nicely modularized code, with lots of functions defining APIs to underlying rules and data, with nested subprograms to improve readability, etc.
If you want to write an application that is easy to debug, maintain and enhance, you simply must modularize your logic.
But there is a price to pay: each time the PL/SQL engine executes the invocation of a subprogram, there is some overhead involved (look up the code, transfer data back and forth, etc.).
The idea behind inlining is that at the time of compilation, the compiler replaces the invocation of a subprogram with that subprogram's implementation. Then at runtime, the cost of executing the subprogram is reduced.
Here's a very crude example. I write this code:
CREATE PROCEDURE my_proc
IS
n NUMBER;
FUNCTION nested_func (p NUMBER)
RETURN PLS_INTEGER
IS
BEGIN
RETURN p * 10;
END;
BEGIN
FOR indx IN 1 .. 10000000
LOOP
n := nested_func (indx);
END LOOP;
END;
/
But with inlining enabled, what is actually executed looks more like this:
CREATE PROCEDURE my_proc
IS
n NUMBER;
BEGIN
FOR indx IN 1 .. 10000000
LOOP
n := indx * 10;
END LOOP;
END;
/
Simple enough, right?
OK. In Oracle Database 11g, we (not that I was a part of "we" - Oracle Corporation - back when 11.1 was released) added a new PL/SQL optimization level (3) and a new pragma (INLINE) to allow PL/SQL developers to take advantage of inlining. You can turn it on globally via the optimization level or in a very granular fashion with the pragma (which means: command to the compiler).
I won't go into all the details here. Check out the doc or Tim Hall's fine article or Adrian Billington's helpful exploration for lots more information.
So then - back to Sheldon's question. He was unable to verify the performance benefit from inlining. What's the problem? Well, let's read the introduction to Subprogram Inlining from the doc:
One optimization that the compiler can perform is subprogram inlining. Subprogram inlining replaces a subprogram invocation with a copy of the invoked subprogram (if the invoked and invoking subprograms are in the same program unit).
Note that last section within parentheses:
The invoked and invoking subprograms must be in the same program unit.
What exactly does that mean?
It means that the function you want to inline must be:
- Defined as a nested subprogram within the subprogram that invokes it, or...
- Defined within the same package.
You've already seen an example of the former. But before I move on to an example of the latter (inlining subprograms from same package) or even showing you an example where inlining will not happen, I'd like to cover:
How do I know whether or not inlining is working?
My approach has been to perform timings on various implementations and see if a particular scenario makes things faster. That can be productive, except when the optimization technique is fairly fine-tuned in its applicability (which can be the case with inlining).
Bryn Llewellyn, PL/SQL Product Manager (follow @brynlite), suggests a far superior approach: use compile-time warnings (though in this case, less of a warning than an FYI).
Let's compile the my_proc procedure with inlining enabled (that's what level 3 optimization means) and compile-time warnings enabled:
alter session set Plsql_Warnings = 'Enable:All'
/
alter session set Plsql_Optimize_Level = 3
/
CREATE PROCEDURE my_proc
IS
n NUMBER;
FUNCTION nested_func (p NUMBER)
RETURN PLS_INTEGER
IS
BEGIN
RETURN p * 10;
END;
BEGIN
FOR indx IN 1 .. 10000000
LOOP
n := nested_func (indx);
END LOOP;
END;
/
Then I issue a SHOW ERRORS command and see in SQL Developer the following:
Inlining was performed, and once that was done, there were no more calls to NESTED_FUNC, so the compiler also removed the nested subprogram definition. Cool!
Now let's do the same thing for an invocation of a subprogram that is defined within the same package, but outside of the invoking subprogram:
alter session set Plsql_Warnings = 'Enable:All'
/
alter session set Plsql_Optimize_Level = 3
/CREATE OR REPLACE PACKAGE pkg
AUTHID DEFINER
IS
PROCEDURE test_inlining;
END;
/
CREATE OR REPLACE PACKAGE BODY pkg
IS
FUNCTION private_func (p NUMBER)
RETURN PLS_INTEGER
IS
BEGIN
RETURN p * 10;
END;
PROCEDURE test_inlining
IS
n NUMBER;
BEGIN
FOR indx IN 1 .. 10000000
LOOP
n := private_func (indx);
END LOOP;
END;
END;
/
And SHOW ERRORS reveals:
Finally, let's try to inline a subprogram defined outside of the invoking subprogram's program unit.
CREATE OR REPLACE FUNCTION my_func (p NUMBER)
RETURN PLS_INTEGER
IS
BEGIN
RETURN p * 10;
END;
/
CREATE OR REPLACE PROCEDURE test_inlining
IS
n NUMBER;
BEGIN
FOR indx IN 1 .. 10000000
LOOP
n := my_func (indx);
END LOOP;
END;
/
Finally, let's try to inline a subprogram defined outside of the invoking subprogram's program unit.
alter session set Plsql_Warnings = 'Enable:All'
/
alter session set Plsql_Optimize_Level = 3
/CREATE OR REPLACE FUNCTION my_func (p NUMBER)
RETURN PLS_INTEGER
IS
BEGIN
RETURN p * 10;
END;
/
CREATE OR REPLACE PROCEDURE test_inlining
IS
n NUMBER;
BEGIN
FOR indx IN 1 .. 10000000
LOOP
n := my_func (indx);
END LOOP;
END;
/
And what warnings do I see now?
Not a single mention of inlining. Even with the optimization level set to 3. That should be a pretty good hint, right?
But let's make sure. I will explicitly request inlining of the call to MY_FUNC (which should not be necessary, given the optimization level, but bear with me).
CREATE OR REPLACE PROCEDURE test_inlining
IS
n NUMBER;
BEGIN
FOR indx IN 1 .. 10000000
LOOP
PRAGMA INLINE (my_func, 'YES');
n := my_func (indx);
END LOOP;
END;
/
And then I see the following warnings:
Ah....so you can that the compiler tells me that I've made the request - but it was not "done", only "requested".
As stated in the documentation, since my_func is not defined within the same program unit as the invoking subprogram, inlining does not apply.
So now I've shown you when inlining will take place and how you can verify that the compiler performed as expected.
Compile-time warnings are really quite cool. Turn them on in your IDE and pay attention to what they say!
Going back to the original question from Sheldon, though: "Is inlining (via PRAGMA INLINE) even something worth talking about? I’m getting the feeling that it’s not."
In my (admittedly limited) experience, it can be hard to come up with scripts that demonstrate massive performance improvements. It will definitely help most in "extreme" scenarios involving a very large number of subprogram invocations (loops, recursion, etc.).
[2016-09-23] The recommendation from the PL/SQL Development Team is unequivocal on this matter: you should set the PL/SQL optimization level to 3 at the instance level. Give the compiler the maximum number of tools to use to optimize your code. It may not make a big difference, but it will surely help overall.
I could, I suppose, take the position that you should just turn on inlining globally (optimization level) and let the compiler have at it, except that the documentation also states:
So I would suggest that you not bump up the optimization level to 3 globally, and instead put in specific, explicit requests to inline precisely in those programs in which the use case fits clearly and you can demonstrate a noticeable improvement.
Not a single mention of inlining. Even with the optimization level set to 3. That should be a pretty good hint, right?
But let's make sure. I will explicitly request inlining of the call to MY_FUNC (which should not be necessary, given the optimization level, but bear with me).
CREATE OR REPLACE PROCEDURE test_inlining
IS
n NUMBER;
BEGIN
FOR indx IN 1 .. 10000000
LOOP
PRAGMA INLINE (my_func, 'YES');
n := my_func (indx);
END LOOP;
END;
/
And then I see the following warnings:
Ah....so you can that the compiler tells me that I've made the request - but it was not "done", only "requested".
As stated in the documentation, since my_func is not defined within the same program unit as the invoking subprogram, inlining does not apply.
So now I've shown you when inlining will take place and how you can verify that the compiler performed as expected.
Compile-time warnings are really quite cool. Turn them on in your IDE and pay attention to what they say!
Going back to the original question from Sheldon, though: "Is inlining (via PRAGMA INLINE) even something worth talking about? I’m getting the feeling that it’s not."
In my (admittedly limited) experience, it can be hard to come up with scripts that demonstrate massive performance improvements. It will definitely help most in "extreme" scenarios involving a very large number of subprogram invocations (loops, recursion, etc.).
[2016-09-23] The recommendation from the PL/SQL Development Team is unequivocal on this matter: you should set the PL/SQL optimization level to 3 at the instance level. Give the compiler the maximum number of tools to use to optimize your code. It may not make a big difference, but it will surely help overall.
If a particular subprogram is inlined, performance almost always improves. However, because the compiler inlines subprograms early in the optimization process, it is possible for subprogram inlining to preclude later, more powerful optimizations.
If subprogram inlining slows the performance of a particular PL/SQL program, then use the PL/SQL hierarchical profiler (explained in Oracle Database Development Guide) to identify subprograms for which you want to turn off inlining.
Thanks!!!
ReplyDeleteHello Steve,
ReplyDeleteI would like to ask for your assistance on the question below for my assignment on my Online Masteral. Currently, I do not have any background in SQL and I am starting to learn the basic of it. Hope you can assist me so that I may use as a reference same with the other questions. Thank you in advance!
------------------------
Create a procedure that displays the name of the customer with the highest amount of orders among all customers. Name the procedure PRINT_CUST_PROC. The procedure will also invoke a function called GET_HIGHORDER_FUNC that retrieves the name and total amount ordered of customer with the highest amount of order. Invoke the procedure and view the results.
------------------------
I am glad to hear you are getting started with SQL and PL/SQL. I am sorry, though, that I do not have time to do this assignment for you. You will find many examples of PL/SQL programs and code at livesql.oracle.com. I also encourage you check out devgym.oracle.com.
DeleteHi Steven,
ReplyDeleteThank you for the nice write-up.
I have a question, if you don't mind. Using the compilation warnings method, I was able to observe the following (using PL/SQL 12.2.0.1 on top of Oracle Linux Server 7.9).
Using the first example from your post, with plsql_optimize_level set to 3 and plsql_code_type set to 'interpreted', I can see that the compiler inlines the call to the nested function. However, if I change the compilation type to 'native', the nested function is no longer inlined. And it's not just a mirage; if I specifically add PRAGMA INLINE(nested_func, 'yes') before the invocation, the warnings show that the function is again inlined.
Any thoughts as to why, without a specific request to inline, with optimization level set to 3 and compilation type set to native, the compiler would choose to NOT inline the function call?
This is not just academic curiosity; for example, I found that inlining calls to PUSH and POP in a simple stack implementation, called a very large number of times from a small procedure, reduces execution time by a factor of two. Not huge, but not negligible either, especially given the extremely low cost of inlining (at least in this case).
Thank you! - mathguy
Interesting.
DeleteI do not have an answer for this, but I will see what I can find out.