Comments on Obsessed with Oracle PL/SQL: Table Functions, Part 5c: Another use case for Pipelined Table Functions (and simple example)
Steven Feuerstein
2024-03-21T22:50:39.997-07:00

2015-10-26T07:11:58.628-07:00

I got a lot of mileage using this approach to bind arrays of ROWIDs when issuing [C]RUD operations where the ROWIDs are known. This gives me the performance benefits of using ROWIDs without requiring a dynamic number of bind variables or having to use literals.

SELECT | UPDATE | DELETE
.......
WHERE ROWID IN (SELECT CHARTOROWID(COLUMN_VALUE) FROM TABLE(?))

(In Java JDBC we reference ROWIDs as strings)

nickman

2015-08-28T10:32:49.569-07:00

( I already posted the continuation yesterday, but it seems to have been lost ...
 so I post it once again ...
)

As we might expect, the execution plan (for both functions) uses a hash join and looks as follows:

---------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |             |     1 |     7 |   312   (3)| 00:00:04 |
|   1 |  SORT AGGREGATE                     |             |     1 |     7 |            |          |
|*  2 |   HASH JOIN RIGHT SEMI              |             |     1 |     7 |   312   (3)| 00:00:04 |
|   3 |    COLLECTION ITERATOR PICKLER FETCH| MY_LIST_PTF |  8168 | 16336 |    19   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL                | PLCH_DATA   |  1000K|  4882K|   289   (2)| 00:00:04 |
---------------------------------------------------------------------------------------------------

The cardinality that we see here for the TABLE function result set is the default of 8168, which the Oracle optimizer uses when we do not "help" it toward a more realistic number
( Adrian Billington at http://www.oracle-developer.net has several nice articles on this topic ).

But, anyway, as we see in the execution plan, the result set of the TABLE function is used as the build table for the HASH join, so intuitively I cannot see how exactly the pipelined nature of the function could be an advantage for the join itself.

We do indeed see that, while the pipelined function is still faster than the non-pipelined one, the two CPU times are not dramatically different.

I guess that the only advantage of using the pipelined function here may come during the creation of the in-memory hash join "build table" based on the table function result set, that is, hash values can be stored in the hash table while data is still coming in.

I think that the join step itself to table plch_data is not influenced by whether the table function is pipelined or not.

Of course, this
conclusion may differ for a NESTED LOOPS execution plan driven by the table function result set.
In such a case, the join process itself (probing of the plch_data inner table) can already be performed while rows are still being returned from the pipelined table function.

But Oracle will probably choose such a plan only when the data table is much bigger than the driving table function data set and has an index on the join column.


Just for comparison, for your example with table plch_data having only 1 row,
the execution plan looks like this:

---------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |             |     1 |     5 |    23   (5)| 00:00:01 |
|   1 |  SORT AGGREGATE                     |             |     1 |     5 |            |          |
|*  2 |   HASH JOIN SEMI                    |             |     1 |     5 |    23   (5)| 00:00:01 |
|   3 |    TABLE ACCESS FULL                | PLCH_DATA   |     1 |     3 |     3   (0)| 00:00:01 |
|   4 |    COLLECTION ITERATOR PICKLER FETCH| MY_LIST_PTF |  8168 | 16336 |    19   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------

Here we see that the smaller table plch_data was chosen as the hash build table,
and then the time difference between the pipelined and non-pipelined table functions can indeed be higher, because the pipelined nature of the function does influence the join itself.


I would be glad if the performance gurus reading this post could confirm my conclusions :)

Thanks a lot & Best Regards,
Iudith

iudith

2015-08-27T01:01:20.371-07:00

Hello Steven,

This post was really nice and stimulated some research.
Of course, the natural temptation was to perform a test where table plch_data has a higher number of rows, so, using your code already at hand, I performed the test below, on 11.2.0.3.0.


DECLARE
   l_count   INTEGER;
   l_start   PLS_INTEGER;

   -- Record the current CPU time (in centiseconds) as the baseline.
   PROCEDURE mark_start
   IS
   BEGIN
      l_start := DBMS_UTILITY.get_cpu_time;
   END mark_start;

   -- Print the CPU time used since the last baseline, then reset the baseline.
   PROCEDURE show_elapsed (NAME_IN IN VARCHAR2)
   IS
   BEGIN
      DBMS_OUTPUT.put_line (
            '"'
         || NAME_IN
         || '" elapsed CPU time: '
         || TO_CHAR (DBMS_UTILITY.get_cpu_time - l_start)
         || ' centiseconds');
      mark_start;
   END show_elapsed;
BEGIN
   -- The original example used a single row:
   -- INSERT INTO plch_data VALUES (1);
   FOR i IN 1 .. 1000000 LOOP
      INSERT INTO plch_data VALUES (i);
   END LOOP;

   COMMIT;

   DBMS_STATS.GATHER_TABLE_STATS (USER, 'PLCH_DATA');

   mark_start;

   SELECT COUNT (*)
     INTO l_count
     FROM plch_data
    WHERE n IN (SELECT * FROM TABLE (my_list_tf));

   show_elapsed ('TF match on first');

   SELECT COUNT (*)
     INTO l_count
     FROM plch_data
    WHERE n IN (SELECT * FROM TABLE (my_list_ptf));

   show_elapsed ('PTF match on first');

   -- ----------------------------------------------------
   -- Shift every value up by 500000 and repeat both queries.

   UPDATE plch_data
      SET n = n + 500000;

   COMMIT;

   DBMS_STATS.GATHER_TABLE_STATS (USER, 'PLCH_DATA');

   mark_start;

   SELECT COUNT (*)
     INTO l_count
     FROM plch_data
    WHERE n IN (SELECT * FROM TABLE (my_list_tf));

   show_elapsed ('TF match on last');

   SELECT COUNT (*)
     INTO l_count
     FROM plch_data
    WHERE n IN (SELECT * FROM TABLE (my_list_ptf));

   show_elapsed ('PTF match on last');
END;
/
"TF match on
first" elapsed CPU time: 94 centiseconds
"PTF match on first" elapsed CPU time: 74 centiseconds

"TF match on last" elapsed CPU time: 87 centiseconds
"PTF match on last" elapsed CPU time: 72 centiseconds

PL/SQL procedure successfully completed.

( to be continued )

iudith
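
The test block in this comment relies on the table plch_data and the functions my_list_tf and my_list_ptf from Steven's post, none of which are reproduced in the comments. For readers who want to rerun the comparison, here is a minimal sketch of what such a setup could look like. Everything not named in the comments is an assumption: the collection type name my_list_t, the NUMBER datatype of column n, and the row count of 100000 are hypothetical and may differ from the original post.

```sql
-- Assumed setup: a one-column table, as implied by "WHERE n IN (...)".
CREATE TABLE plch_data (n NUMBER);

-- Hypothetical nested table type returned by both functions.
CREATE OR REPLACE TYPE my_list_t IS TABLE OF NUMBER;
/

-- Non-pipelined version: builds the whole collection in memory,
-- then returns it in one piece.
CREATE OR REPLACE FUNCTION my_list_tf
   RETURN my_list_t
IS
   l_list   my_list_t := my_list_t ();
BEGIN
   l_list.EXTEND (100000);   -- row count is an assumption

   FOR i IN 1 .. 100000 LOOP
      l_list (i) := i;
   END LOOP;

   RETURN l_list;
END my_list_tf;
/

-- Pipelined version: hands each row to the caller as it is produced,
-- so the calling query can start consuming rows before the loop ends.
CREATE OR REPLACE FUNCTION my_list_ptf
   RETURN my_list_t
   PIPELINED
IS
BEGIN
   FOR i IN 1 .. 100000 LOOP
      PIPE ROW (i);
   END LOOP;

   RETURN;
END my_list_ptf;
/
```

With these objects in place, the anonymous block above should compile and run as posted, although the timings will of course depend on the actual functions, the data volume, and the system.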