In the second post of this "Joy" series, I
introduced the primary technique we would use to improve the performance of
extremememe.info's daily job: bulk processing with FORALL and BULK COLLECT.
I emphasized that whenever you see a loop that contains
non-query DML, you should convert it to use these bulk features.
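As a quick refresher, here is the general shape of that conversion. This is a minimal sketch, not extremememe.info code; I am assuming a hypothetical em_incoming_archive table with the same structure as em_incoming as the DML target.

DECLARE
   -- An associative array to hold the fetched rows.
   TYPE incoming_tt IS TABLE OF em_incoming%ROWTYPE
      INDEX BY PLS_INTEGER;

   l_incoming   incoming_tt;
BEGIN
   -- Phase 1: fetch all the rows with a single context switch
   -- (for very large tables, fetch in chunks with the LIMIT clause).
   SELECT * BULK COLLECT INTO l_incoming FROM em_incoming;

   -- Phase 2: bind the entire collection to a single INSERT,
   -- rather than executing one INSERT per row inside a loop.
   FORALL indx IN 1 .. l_incoming.COUNT
      INSERT INTO em_incoming_archive
           VALUES l_incoming (indx);
END;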
So we've got some problematic code, and we know which PL/SQL features to use. Now it's time to check the documentation, make sure we've got the syntax down, and apply the feature, right?
Change the code around and watch the bits and bytes flow like greased lightning!
Not quite.
Sure, we could have done that. But after some 30 years in
software, I have grown leery and wary of excitement. When we get all excited
about a new feature or a solution to a problem, we tend to then be in a hurry
to apply it. When you hurry, you miss critical information. When you hurry, you
don't question your assumptions. When you hurry, you are much more likely to
make mistakes.
In fact, I have found that deferred satisfaction (holding off as long as you can before writing the actual code) has served me well, saved me from creating bigger problems, and definitely saved me time.
With that in mind, let's continue with the story.
Code Review Identifies a Logic Flaw
I shared my dismay (non-query DML inside loop) and delight (a
known fix with greatly increased performance) with the extremememe.info team.
They were excited about the potential improvements and were eager to get
started. Then, however, I noticed something odd about the algorithm they'd presented. Let me show you the relevant part of the code; can you see what might be the problem?
The original em_update_status
 1  CREATE OR REPLACE PROCEDURE em_update_status
 2  IS
 3     CURSOR incoming_cur
 4     IS
 5        SELECT * FROM em_incoming;
 6
 7     l_mention   em_mentions%ROWTYPE;
 8     l_status    em_memes.meme_status%TYPE;
 9  BEGIN
10     FOR incoming_r IN incoming_cur
11     LOOP
12        BEGIN
13           SAVEPOINT new_transaction;
14
15           l_mention := em_memes_pkg.unpacked_incoming (incoming_r);
16
17           INSERT INTO em_mentions (meme_id,
18                                    source_name,
19                                    source_details,
20                                    occurred_on)
21                VALUES (l_mention.meme_id,
22                        l_mention.source_name,
23                        l_mention.source_details,
24                        l_mention.occurred_on);
25
26           em_memes_pkg.reset_meme_status (l_mention.meme_id,
27                                           l_status);
28
29           IF l_status IS NOT NULL
30           THEN
31              UPDATE em_memes
32                 SET meme_status = l_status
33               WHERE meme_id = l_mention.meme_id;
34           END IF;
35        EXCEPTION
36           WHEN OTHERS
37           THEN
38              em_memes_pkg.log_error;
39              ROLLBACK TO new_transaction;
40        END;
41     END LOOP;
42  END;
A brief Q&A session soon clarified matters.
"Can the table of incoming data contain more than one
reference to the same meme?"
Heads nodded vigorously. They told me that the more "viral" a meme's behavior, the more mentions there would be. For some very popular memes, the daily incoming data could contain millions of references to the same meme. That led directly to my second question:
"Then why are you updating the meme status after each
insert of a mention?" (see lines 26-34)
Now eyebrows shot up and heads turned to look at each other.
"Wow," said one developer. "Oh. My. Gosh." said another.
The team lead turned to me. "We never thought of that
before. We've been doing an enormous amount of unnecessary processing. That's
kind of embarrassing. It sure helps having a fresh pair of eyes looking at our
code."
Yes, it does.
Never underestimate the power of your brain to hide the obvious from you. Once you've
spent a lot of time writing or working with a program, you don't really see the code anymore. You see what you think the code should be. When someone
else looks at it, they come at it without any preconceived notions and see very
different aspects of the program.
SO REMEMBER!
If you're the only one who's ever looked at your code, you can be certain it contains bugs, and perhaps even substantial errors in your algorithms.*
We quickly came to an agreement that the meme status updates should be performed only after all mentions were inserted. Furthermore, only those memes with new mentions should be updated, so we would need to keep track of those.
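One lightweight way to keep track of them is an associative array indexed by meme ID. Here is a sketch of the idea (the ID values are made up, and I am assuming meme IDs are integers):

DECLARE
   -- Indexing by meme ID means assigning the same ID twice simply
   -- overwrites the element, so each affected meme appears only once.
   TYPE meme_id_tt IS TABLE OF BOOLEAN
      INDEX BY PLS_INTEGER;

   l_affected   meme_id_tt;
   l_meme_id    PLS_INTEGER;
BEGIN
   -- As each mention is successfully inserted, record its meme.
   l_affected (123) := TRUE;
   l_affected (456) := TRUE;
   l_affected (123) := TRUE;    -- Same meme again: a harmless no-op.

   -- After all mentions are inserted, visit each distinct meme once.
   l_meme_id := l_affected.FIRST;

   WHILE l_meme_id IS NOT NULL
   LOOP
      DBMS_OUTPUT.put_line ('Update status of meme ' || l_meme_id);
      l_meme_id := l_affected.NEXT (l_meme_id);
   END LOOP;
END;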
The resulting code might be a bit more complicated, but written correctly, it can still be easy to understand and maintain. Bottom line, though: we are focused on performance, and this change could be a big help.
* Which reminds me of one of my favorite sayings: If you
look around the room and wonder who the chump** is, you are the chump.
** Chump: a foolish or easily deceived person.
Analyze Current Behavior
Before you can think about making changes to a program, you
need to make sure that you understand how the current state of the code works -
and what it does when errors occur. If you don't do this, how can you be sure
that the new version still meets user requirements? Oh, right. You could build
a regression test. I address that in the next section.
For now, let's focus simply on the importance of analyzing
and understanding (deeply) your code before embarking on changes. This is
especially critical for programs that change the contents of tables.
Here is a list of the behaviors we need to reproduce in the new, bulked-up version of em_update_status:
- If the incoming table is empty, then make no changes.
- If an error occurs when inserting a row into the mentions table, do not update the meme status.
- If an error occurs processing either DML statement, log the error and continue processing.
But wait a minute – that certainly reflects the current program, but not the program we want to build. Remember that we figured out that the meme status update should occur after all incoming rows are processed. So, actually, the new set of behaviors we need is:
- If the incoming table is empty, then make no changes.
- If an error occurs when inserting a row into the mentions table, log the error and continue processing.
- If an error does not occur, record this meme as being changed.
- After all incoming rows have been processed, update the status on all affected memes.
This change actually greatly simplifies the kind of code we will need to write using BULK COLLECT and FORALL. The reason is that a FORALL statement can execute only a single DML statement. So if my loop contains two DML statements, I have to find a way to "communicate" from one FORALL to the next. I will explore this challenge more fully in the next installment of this series.
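To give you a taste of what's coming, here is roughly what that communication can look like. This is only a skeleton, with the steps that load the collections and compute the statuses left out; SAVE EXCEPTIONS and SQL%BULK_EXCEPTIONS are standard PL/SQL features, but the collection names and their contents are placeholders of my own.

DECLARE
   TYPE mentions_tt IS TABLE OF em_mentions%ROWTYPE
      INDEX BY PLS_INTEGER;
   TYPE meme_ids_tt IS TABLE OF em_memes.meme_id%TYPE
      INDEX BY PLS_INTEGER;
   TYPE statuses_tt IS TABLE OF em_memes.meme_status%TYPE
      INDEX BY PLS_INTEGER;

   l_mentions    mentions_tt;   -- Loaded earlier via BULK COLLECT.
   l_meme_ids    meme_ids_tt;   -- The "message" to the second FORALL:
   l_statuses    statuses_tt;   -- only memes whose inserts succeeded.

   bulk_errors   EXCEPTION;
   PRAGMA EXCEPTION_INIT (bulk_errors, -24381);
BEGIN
   -- First DML statement: SAVE EXCEPTIONS tells Oracle to finish
   -- the whole batch even if individual rows fail.
   BEGIN
      FORALL indx IN 1 .. l_mentions.COUNT SAVE EXCEPTIONS
         INSERT INTO em_mentions
              VALUES l_mentions (indx);
   EXCEPTION
      WHEN bulk_errors
      THEN
         -- SQL%BULK_EXCEPTIONS identifies exactly which rows failed,
         -- so they can be logged and left out of the next phase.
         FOR indx IN 1 .. SQL%BULK_EXCEPTIONS.COUNT
         LOOP
            em_memes_pkg.log_error;
         END LOOP;
   END;

   -- Second DML statement: update only the affected memes, using
   -- statuses computed for them (details in the next installment).
   FORALL indx IN 1 .. l_meme_ids.COUNT
      UPDATE em_memes
         SET meme_status = l_statuses (indx)
       WHERE meme_id = l_meme_ids (indx);
END;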
So now we know what we need to do in the bulked-up version. That means it's time to start writing code, right? Wrong.
Build the Regression Test
Now, let's talk about seriously
deferring satisfaction.
Let's talk about the importance of building automated
regression tests before you embark on a substantial revision to
mission-critical code.
The em_update_status procedure had been in production for several years. It had been thoroughly tested and was now relied upon implicitly to do its job without errors.
There was no margin for error; the new version had to work just as well (but lots faster). And the only way to ensure that this would be the case was to build a script that verifies the current behavior and could then be run against the new version as well.
This is called a regression test. You want to make sure that your code has not regressed (to an earlier, buggier state) after you apply a change.
It isn't easy to build regression tests for database
programs. First, you must analyze and document current behavior. Then you need
to come up with a list of all the test cases needed to verify that behavior.
Then you have to build a script to set up the tables with all the variability
of production data. Then you have to write code to check the contents of those
tables after your program runs. Finally, you need a way to run all that code so
that you can easily identify failures or confirm success.
Some development groups write all of this code by hand.
Others use one of several regression test tools, including:
utPLSQL - an open source project that I wrote back in 1999; it is still used by many groups around the world and has recently come back to life under the administration of Paul Walker. You must write your own test packages, but the utPLSQL framework then runs the package and verifies the results (see the sketch after this list).
SQL Developer Unit
Testing - Oracle's SQL Developer tool offers integrated unit testing that
also allows you to describe the expected behavior of your program and then
generates the code needed to verify that behavior.
Code Tester for
Oracle - a commercial product from Dell that will generate test packages
from your descriptions of expected behavior.
There are others besides these, including PLUTO, PL/Unit and
dbFit.
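To give you a feel for the test-package style, here is what a first utPLSQL test of the "empty incoming table" behavior might look like. This sketch assumes the version 3 annotation syntax, and the package name is one I made up:

CREATE OR REPLACE PACKAGE test_em_update_status
IS
   --%suite(Regression tests for em_update_status)

   --%test(Empty incoming table means no changes to mentions)
   PROCEDURE no_changes_when_incoming_empty;
END test_em_update_status;
/

CREATE OR REPLACE PACKAGE BODY test_em_update_status
IS
   PROCEDURE no_changes_when_incoming_empty
   IS
      l_count_before   INTEGER;
      l_count_after    INTEGER;
   BEGIN
      -- Set up: make sure there is nothing to process.
      DELETE FROM em_incoming;

      SELECT COUNT (*) INTO l_count_before FROM em_mentions;

      em_update_status;

      SELECT COUNT (*) INTO l_count_after FROM em_mentions;

      -- Verify: no mentions were added.
      ut.expect (l_count_after).to_equal (l_count_before);
   END no_changes_when_incoming_empty;
END test_em_update_status;
/

You would then run the suite with a call to ut.run ('test_em_update_status') and let the framework report the success or failure of each test.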
I encourage you to explore these options and to make the
greatest effort possible to build regression tests, so that you can avoid
getting calls from users that go like this: "Thanks for adding all those
new features in version 2.6, but you broke three old features that I rely on every day."
Now, having said all of that, I am also resigned to the fact
that it will be the rarest of PL/SQL developers (and teams) who actually make
time to build automated regression tests.
I don't do a whole lot of them myself.
So I would like to offer a simple alternative that could
give you some nice benefits without taking an enormous amount of time.
Before I do, I have a question:
You need to write a new program. Your manager says so. OK,
so you think about what you need to do and you start typing. Here's the
question: how do you know when you're done?
Seems like an obvious question. But the answer often comes
down to a variation of:
"I'll know it when I see it."
Unfortunately, our vision is all too often blurred by the
pressures we face to get our work done and our programs coded.
So here is my suggestion: before you start a new program,
take out a piece of paper and ask yourself "How will I know when I'm
done?"
Write down brief descriptions of distinct scenarios the program should handle and what should happen. Don't worry about too many details; the bulk of the value of this step is to externalize the list from your head (a very bad place to store a to-do list when you are pressed for time).
Then (finally),
you write your code, and you check off the items on the list as you go.
The checklist would contain reminders of key best practices,
but more importantly lists out the things your program should be able to do
before you declare it done.
Of course, the most important items on the checklist are
taken from the previous section: what does the new version of the program need
to do?
Task (check off each item as it is completed):

Program Requirements
[ ] If the incoming table is empty, then make no changes.
[ ] If an error occurs when inserting a row into the mentions table, log the error and continue processing.
[ ] If an error does not occur, record this meme as being changed.
[ ] After all incoming rows have been processed, update the status on all affected memes.

Best Practices
[ ] Add instrumentation to trace program activity.
[ ] Avoid hard-coded declarations. Use %TYPE or subtypes.
[ ] Validate all assumptions you are making (and document them).
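To make the last two best-practice items concrete, here is a minimal sketch; the module and action labels are merely examples of mine:

DECLARE
   -- Anchored declaration: if meme_status's datatype ever changes,
   -- this code adapts without any edits.
   l_status   em_memes.meme_status%TYPE;
BEGIN
   -- Instrumentation: record what this session is doing; DBAs can
   -- then see it in V$SESSION while the job runs.
   DBMS_APPLICATION_INFO.set_module (
      module_name   => 'em_update_status',
      action_name   => 'processing incoming mentions');

   -- ... the real work would go here ...

   -- Clear the labels when the work is done.
   DBMS_APPLICATION_INFO.set_module (NULL, NULL);
END;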
I realize that when you look at my little checklist here, it
can seem kind of obvious. Actually, that is (partly) the point.
These things should
be obvious, they often are obvious,
but it is so easy to forget them, to push hard and fast along the path to
production, and inadvertently leave unimplemented requirements in your wake.
Having a piece of paper (or, better, a tool) to help you remember can make an
enormous difference.
OK, so remember what I said about deferred satisfaction?
Here's another deferral: I will post this next episode of
my series without even getting to bulk processing!
In my next article, though, I promise to offer a detailed look at how bulk processing involves a shift from row-by-row to phased processing, and what effect that can have on your code.
Until then, happy PL/SQL coding!
Steven
steven.feuerstein@oracle.com