Remember the good old days when there was a <userid>.PLAN_TABLE with all your access path data in it?
When the package was dropped or changed by some program maintenance you could simply do a DELETE WHERE NOT EXISTS style SQL to keep your data up to date and pristine?
Of course those days have *long* gone…
Nowadays everyone on the block has got DataStudio installed and it installs a whole bunch of “new” hidden tables to enable Query Tuning. This is OK of course, but do you actually take care of this data? If it is the production EXPLAIN tables – Are you image copying, runstating and reorging when you should? Do you have all the indexes you need? Are you deleting too much or not enough data? Can you, in fact, delete anything these days without the risk that you will delete an access path that may “come back from the dead”?
First, a quick review of which tables you may well have in existence at the moment:
DB2 V8
PLAN_TABLE | The Good old original, now with 58 columns |
DSN_STATEMNT_TABLE | Also available as a published API with 12 columns used for Costs |
DSN_FUNCTION_TABLE | As above with 15 columns for checking which function is called |
DSN_DETCOST_TABLE | The Hidden tables first officially externalized by DataStudio |
DSN_FILTER_TABLE | |
DSN_PGRANGE_TABLE | |
DSN_PGROUP_TABLE | |
DSN_PREDICAT_TABLE | |
DSN_PTASK_TABLE | |
DSN_QUERY_TABLE | Not populated by EXPLAIN(YES) for BIND/REBIND |
DSN_SORT_TABLE | |
DSN_SORTKEY_TABLE | |
DSN_STRUCT_TABLE | |
DSN_VIEWREF_TABLE |
Now, most of the descriptive text for the new ones is “IBM internal use only” but some of them are pretty cool! Please remember that all this data is asis, and will probably *not* help you in your day to day tuning one little bit. That being said, I do use the DSN_PREDICAT_TABLE quite a lot.
The first “problem” is: the DSN_QUERY_TABLE because it contains a CLOB(2M) column, which means that it can grow very very quickly – well, not the DSN_QUERY_TABLE itself of course, rather its auxiliary table that holds the LOB data.
DB2 9
PLAN_TABLE | Now with 59 columns | |
DSN_STATEMNT_TABLE | ||
DSN_FUNCTION_TABLE | ||
DSN_DETCOST_TABLE | ||
DSN_FILTER_TABLE | ||
DSN_PGRANGE_TABLE | ||
DSN_PGROUP_TABLE | ||
DSN_PREDICAT_TABLE | ||
DSN_PTASK_TABLE | ||
DSN_QUERYINFO_TABLE | New for query rewrite & accelerator usage and with two LOBs | |
DSN_QUERY_TABLE | ||
DSN_SORT_TABLE | ||
DSN_SORTKEY_TABLE | ||
DSN_STATEMENT_CACHE_TABLE | New with a LOB column | |
DSN_STRUCT_TABLE | ||
DSN_VIEWREF_TABLE |
The DSN_STATEMENT_CACHE_TABLE must be separately maintained as it has nothing to do with the PLAN_TABLE of course!
DB2 10
PLAN_TABLE | Now with 64 columns |
DSN_STATEMNT_TABLE | |
DSN_FUNCTION_TABLE | |
DSN_COLDIST_TABLE | New and contains the SYSCOLDIST data |
DSN_DETCOST_TABLE | |
DSN_FILTER_TABLE | |
DSN_KEYTGTDIST_TABLE | New and contains the SYSKEYTGTDIST data |
DSN_PGRANGE_TABLE | |
DSN_PGROUP_TABLE | |
DSN_PREDICAT_TABLE | |
DSN_PTASK_TABLE | |
DSN_QUERYINFO_TABLE | |
DSN_QUERY_TABLE | |
DSN_SORT_TABLE | |
DSN_SORTKEY_TABLE | |
DSN_STATEMENT_CACHE_TABLE | |
DSN_STRUCT_TABLE | |
DSN_VIEWREF_TABLE |
Now it must be apparent that the amount and size of this data is getting out of hand. If someone, by accident, creates all of these tables in production, then every BIND or REBIND with EXPLAIN(YES) will be doing a vast amount of I/O possibly up to the point, that you run out of space in one or more of the tablespaces (Especially the LOB ones really hurt!). I actually filled two packs with data…cost me a pizza…
In DB2 10, IBM aligned all of the EXPLAIN tables to have a “common” key: QUERYNO, APPLNAME, PROGNAME, COLLID, GROUP_MEMBER, SECTNOI, VERSION and EXPLAIN_TIME. Before this, some or nearly all of these columns were not there (DSN_PGRANGE_TABLE and DSN_QUERY_TABLE being the worst with only QUERYNO, GROUP_MEMBER, and EXPLAIN_TIME as a “key”)
This leads to some minor problems in determining which rows can be deleted, but nothing disastrous!
To clean-up in DB2 10, the simplest and best is a DELETE WHERE NOT EXISTS from the child tables to the PLAN_TABLE and as these are nowadays all the same; it is very simple:
DELETE FROM <userid>.DSN_<name>_TABLE A WHERE NOT EXISTS (SELECT 1 FROM <userid>.PLAN_TABLE B WHERE A.QUERYNO = B.QUERYNO AND A.APPLNAME = B.APPLNAME AND A.PROGNAME = B.PROGNAME AND A.COLLID = B.COLLID AND A.GROUP_MEMBER = B.GROUP_MEMBER AND A.SECTNOI = B.SECTNOI AND A.VERSION = B.VERSION AND A.EXPLAIN_TIME = B.EXPLAIN_TIME) ;
And you are done once you have done this on all 16 tables…
DSN_STATEMNT_TABLE |
DSN_FUNCTION_TABLE |
DSN_COLDIST_TABLE |
DSN_DETCOST_TABLE |
DSN_FILTER_TABLE |
DSN_KEYTGTDIST_TABLE |
DSN_PGRANGE_TABLE |
DSN_PGROUP_TABLE |
DSN_PREDICAT_TABLE |
DSN_PTASK_TABLE |
DSN_QUERYINFO_TABLE |
DSN_QUERY_TABLE |
DSN_SORT_TABLE |
DSN_SORTKEY_TABLE |
DSN_STRUCT_TABLE |
DSN_VIEWREF_TABLE |
Of course the first real work is to find out which data you no longer need in the <userid>.PLAN_TABLE; and here the new catalog table SYSIBM.SYSPACKCOPY can help you – once you have bound everything in NFM; of course!
DELETE FROM <u serid>.PLAN_TABLE A WHERE NOT EXISTS (SELECT 1 FROM (SELECT COLLID , NAME , BINDTIME FROM SYSIBM.SYSPACKCOPY UNION ALL SELECT COLLID , NAME , BINDTIME FROM SYSIBM.SYSPACKAGE ) B WHERE A.COLLID = B.COLLID AND A.PROGNAME = B.NAME AND A.BIND_TIME = B.BINDTIME) ;
This finds all of the currently “in use” (or possibly in use) collections and packages and then simply deletes from the PLAN_TABLE any that are not there. Having run this, you can then run the 16 other tidy up queries before finally doing a REORG with inline RUNSTATS and COPY, as this data is critical for your business! Now as it is critical… where have you got it stored? Does it also have the same stringent standards as other production data? One table per tablespace with a correct SEGSIZE, Bufferpool etc. Now is the time to review where your EXPLAIN data actually lives and take corrective action – it will speed up BIND and REBIND as deadlocks will reduce, and finally add all of the tablespaces to your DB2 Database Maintenance system! Just doing INSERTs is all well and good but when you try and SELECT the data, the performance can be terrible! For this the RealTime DBAExpert (RTDX) is your friend.
At this point it is also worth remembering that indexes also can be a great help! You should have two indexes on the PLAN_TABLE on BIND_TIME, COLLID, NAME and also BIND_TIME, QUERYNO, PROGNAME. All of the other EXPLAIN tables need at least EXPLAIN_TIME, QUERYNO and PROGNAME. You can happily add or alter these, of course, as it might even make sense to have indexes on QUERYNO, APPLNAME, PROGNAME, COLLID, GROUP_MEMBER, SECTNOI, VERSION, and EXPLAIN_TIME. Feel free to mix’n’ match for what you are doing with your tables! For example if using HINTs, it is recommended to have QUERYNO, APPLNAME, PROGNAME, VERSION, COLLID, OPTHINT or the performance will be pretty bad.
Bottom line is – Just make sure that you *only* create the EXPLAIN tables you actually *need*
and you *only* create the indexes you actually *need*
Happy EXPLAINing! As usual comments and queries are always welcome!
TTFN,
Roy Boxwell
Senior Architect