I recently heard a very interesting story about a use for LOG ERRORS: namely, to verify the correctness (and improve the performance) of a problematic SQL statement.
I thought you might enjoy hearing about it.
Greg Belliveau of Digital Motorworks buttonholed me at ODTUG's Kscope14 conference with this report:
As part of a data mart refresh, we had an insert statement that refreshed a FACT table, and over the years had gotten to the point where it took well over 3 hours to complete.
There was a NOT IN clause in the statement, and we were pretty sure it was the cause of the degrading performance. Why was it there? Our best guess was so that the statement would not fail in case someone ran the refresh for a date range that had already been run...even though there was code that checked for and prevented that from happening.
[Steven: In other words, programmer's insurance. "Just in case, let's do this." That's always problematic in software. Is that a real "case"? Will it ever happen? What are the consequences of including this insurance? Will we document it so that others coming along later can figure out why this weird code is here? Usually not.]
Not wanting to lose the "safety net" that was in place, we decided to try the LOG ERRORS feature you had mentioned in your session at Oracle Open World in 2010. We removed the NOT IN clause and added LOG ERRORS. The insert statement then ran (and continues to run to this day) in roughly one minute (down from three hours!).
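For readers who haven't used DML error logging: the change Greg describes boils down to creating an error log table with DBMS_ERRLOG and appending a LOG ERRORS clause to the insert. Here's a minimal sketch of the idea; the table and column names (sales_fact, staging_sales, and so on) are hypothetical, not Greg's actual schema:

```sql
-- One-time setup: create the error log table for the fact table.
-- By default, this creates a table named ERR$_SALES_FACT.
BEGIN
   DBMS_ERRLOG.create_error_log (dml_table_name => 'SALES_FACT');
END;
/

-- Before: the NOT IN clause guards against re-inserting existing rows.
INSERT INTO sales_fact (sale_date, store_id, amount)
   SELECT s.sale_date, s.store_id, s.amount
     FROM staging_sales s
    WHERE s.sale_date NOT IN (SELECT f.sale_date FROM sales_fact f);

-- After: drop the NOT IN. Any row that would violate a constraint
-- (say, a unique key on sale_date/store_id) is written to the error
-- log table instead of failing the whole statement.
INSERT INTO sales_fact (sale_date, store_id, amount)
   SELECT s.sale_date, s.store_id, s.amount
     FROM staging_sales s
   LOG ERRORS INTO err$_sales_fact ('nightly refresh')
   REJECT LIMIT UNLIMITED;
```

Note that REJECT LIMIT UNLIMITED matters: the default reject limit is zero, which means the first bad row would still abort the statement.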
Oh, and there's never been a single row inserted in the error table!
Nice, nice, very nice.
It's always a bit scary to mess with code that's been around for years and (since it's been around for years) is a gnarly bunch of logic. SQL, with its set orientation, can be even more off-putting than, say, a PL/SQL function.
So, of course, the solution is to build a comprehensive automated regression test script so that you can compare before and after results.
Yes, but almost no one will (ever) do that. So we do what we can.
And in this case, Greg and his team came up with a very creative solution:
We are pretty sure NOT IN is causing the performance problem, but we also cannot afford to remove the clause and have the statement fail.
So we'll take it out, but add LOG ERRORS to ensure that all inserts are at least attempted.
Then we can check the error log table afterwards to see if the NOT IN really was excluding data that would have caused errors.
In Greg's case, the answer was "Nope". So far as they can tell, the NOT IN was unnecessary.
OK, but maybe it will be necessary a week or month or year from now. What then?
Well, the bad data that should have been excluded will still be excluded (the insert will fail), and then the job can check the log table and issue whatever kind of report (or alarm) is needed.
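That post-run check can be as simple as counting rows in the error log table for the current run and raising an alert if any show up. A sketch, again with hypothetical names (the ORA_ERR_TAG$ column is standard in tables created by DBMS_ERRLOG; the tag value matches whatever string you pass in the LOG ERRORS clause):

```sql
DECLARE
   l_errors   PLS_INTEGER;
BEGIN
   -- Count rows rejected by the most recent refresh run.
   SELECT COUNT (*)
     INTO l_errors
     FROM err$_sales_fact
    WHERE ora_err_tag$ = 'nightly refresh';

   IF l_errors > 0
   THEN
      -- Replace with your notification mechanism of choice
      -- (email, monitoring hook, etc.).
      raise_application_error (
         -20001,
         l_errors || ' row(s) rejected during refresh; see ERR$_SALES_FACT.');
   END IF;
END;
/
```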
So LOG ERRORS serves as part regression test, part performance booster.
Nice work, Greg.
Nice work, Oracle.