Every now and again, the DBAs of the world meet with the developers of the world and try and bang some heads together…I already wrote a newsletter all about existence checks which has changed and evolved over the years and now I wish to investigate another age-old problem “Is Db2 there?”
Why?
First up, is “Why?” – Why would you want to know that? Well think about our wonderful agile world these days. The JAVA developers do not even know what Db2 *is* and so when they write/script/generate/get generated for them SQL (Delete whatever is not appropriate), they have no idea *where* the code will run. To be fair, they should not really need to…the idea these days is that your back end can be anywhere and on any hardware platform.
So, wouldn’t it be cool to be able to tell the application that the required database is “not currently available” – This is when the “Is Db2 there?” SQL question, and all of its problems, was born.
Problems
How to do this ?
How often to do this ?
Do you need to do this ?
Haven’t you got better things to do than this?
1 – How to do this
The
first problem was solved when a developer wrote this SQL:
SELECT * FROM SYSIBM.SYSDUMMY1
PREPAREd, OPENed, FETCHed it and CLOSEd it. Now remember, with dynamic SQL, everything must be in cursors so you really did have to define a cursor, open the cursor, fetch the cursor and, if you are being a good coder, close the cursor. If all of this was successful, then you “knew” that Db2 on z/OS was up and running and available for all your further SQL requests.
This was ok when there was not so much dynamic SQL going on (Before DB2 V8), but naturally it started getting a bit out of hand very quickly.
The
SQL then changed a little bit to be:
SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1
The
logic here was it could only return one row/value back and so made the network
traffic less. Naturally Db2 went off and counted the single row in that table
using a tablespace scan *every* time…
This slowly became a small, but very noticeable, overhead with 1,000,000’s of getpages against the smallest Db2 Catalog table ever invented…
Now
Terry Purcell and the optimizer crew came up with the PRUNED access plan in Db2
9:
PRUNED
Db2 does not generate an access path for the query because the query is guaranteed to qualify zero rows, such as the case of an always-false WHERE clause.
For
example:
WHERE 0 = 1
This happens when a “leg” of an SQL query cannot ever be true. In this case, the complete leg of the SQL query is thrown away. With this advance in optimizer logic the predicate:
WHERE 1 = 0
Actually became really useful, as it meant that the *entire* SQL was only run through the optimizer and the SQLCODE was set to +100 if doing a FETCH with the * style SQL or the value of your host variable was set to zero if using the COUNT(*) style. This meant that the getpages for SYSIBM.SYSDUMMY1 dropped down to zero – which is a very very good thing
IBM have been improving what PRUNED can do ever since, and Terry Purcell wrote
“Another benefit of column pruning is that it can provide extra table pruning opportunities. In Db2 10, an enhancement was delivered to prune outer join tables if no columns were required from that table and it was guaranteed that the table would not introduce duplicates. In Db2 12, this pruning enhancement is extended to outer joins to views/table expressions that are guaranteed not to return duplicates and no columns are required from the view/table expression.”
2 – How often to do this?
How long is a piece of string? The problem is that when you do the “Is Db2 there?” test at the start of a long conversational transaction, it could well be that Db2 has disappeared at the end…
3 – Do you need to do this?
Again, an age-old question – Most JAVA frameworks do this (Like they do embedded COMMIT and ROLLBACK!) and so the simple answer is yes!
4 – Haven’t you got better things to do than this?
Of
course you do! We all do, but these little changes can actually have a major
impact on the overall machine throughput!
Measured Response
Here are some real word results of tests with dynamic SQL with/without the 1 = 0 and running a million times each. I have tried doing a variety of flavors, like Heinz…
First
up are the SELECT * style selects:
SEL-867 SELECT * FROM SYSIBM.SYSDUMMY1
SEL-868 SELECT * FROM SYSIBM.SYSDUMMY1
WHERE 1 = 0
SEL-869 SELECT * FROM SYSIBM.SYSDUMMY1
WHERE 1 = 0
WITH UR
SEL-870 SELECT * FROM SYSIBM.SYSDUMMY1
WHERE 1 = 0
WITH UR
FOR FETCH ONLY
SEL-871 SELECT * FROM SYSIBM.SYSDUMMY1
WHERE 1 = 0
WITH CS
SEL-872 SELECT * FROM SYSIBM.SYSDUMMY1
WHERE 1 = 0
WITH CS
FOR FETCH ONLY
SEL-873 SELECT * FROM SYSIBM.SYSDUMMY1
WHERE 1 = 0
WITH RR
SEL-874 SELECT * FROM SYSIBM.SYSDUMMY1
WHERE 1 = 0
WITH RR
FOR FETCH ONLY
Now
the SELECT COUNT(*) style selects:
SEL-875 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1
SEL-876 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1
WHERE 1 = 0
SEL-877 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1
WHERE 1 = 0
WITH UR
SEL-878 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1
WHERE 1 = 0
WITH UR
FOR FETCH ONLY
SEL-879 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1
WHERE 1 = 0
WITH CS
SEL-880 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1
WHERE 1 = 0
WITH CS
FOR FETCH ONLY
SEL-881 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1
WHERE 1 = 0
WITH RR
SEL-882 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1
WHERE 1 = 0
WITH RR
FOR FETCH ONLY
The
results were:
Rows Rows Index Tblsp
StmtID Execs Getpg exam proc Scans Scans
---------- ----- ----- ----- ----- ----- -----
867 1000K 2M 1000K 1000K 0 1000K
868 1000K 0 0 0 0 0
869 1000K 0 0 0 0 0
870 1000K 0 0 0 0 0
871 1000K 0 0 0 0 0
872 1000K 0 0 0 0 0
873 1000K 0 0 0 0 0
874 1000K 0 0 0 0 0
875 1000K 2M 1000K 1000K 0 1000K
876 1000K 0 0 1000K 0 0
877 1000K 0 0 1000K 0 0
878 1000K 0 0 1000K 0 0
879 1000K 0 0 1000K 0 0
880 1000K 0 0 1000K 0 0
881 1000K 0 0 1000K 0 0
882 1000K 0 0 1000K 0 0
StmtID Tot. CPU Avg. CPU Tot. Elap Avg. Elap
---------- SS.mmmmmm SS.mmmmmm SS.mmmmmm SS.mmmmmm
867 2.481068 0.000002 16.869174 0.000017
868 1.291817 0.000001 8.119036 0.000008
869 1.331707 0.000001 6.220308 0.000006
870 1.330709 0.000001 9.460538 0.000009
871 1.306633 0.000001 8.984930 0.000009
872 1.326517 0.000001 9.181043 0.000009
873 1.324213 0.000001 7.392330 0.000007
874 1.322115 0.000001 7.694403 0.000008
875 3.066573 0.000003 18.824193 0.000019
876 1.467454 0.000001 6.168161 0.000006
877 1.478887 0.000001 7.714925 0.000008
878 1.480241 0.000001 9.903416 0.000010
879 1.477947 0.000001 9.965071 0.000010
880 1.469971 0.000001 9.195251 0.000009
881 1.467287 0.000001 8.687336 0.000009
882 1.487021 0.000001 11.742945 0.000012
The SELECT * with WHERE 1 = 0 WITH UR is actually the winner!
Remember
– You never stop learning!
As
always I would be pleased to hear from you!
TTFN,
Roy Boxwell
Senior Architect
Results
Reader test results from Isaac Yassin:
Measured for dynamic SQL (SPUFI in UR mode – DSNESPUR),
using Omegamon application trace. Same machine, same LPAR, same subsystem,
measuring the 3rd consecutive execution of each SQL. Both do a
“prepare/open/fetch/close” according to trace.
select * from sysibm.sysdummy1 vs. select * from
sysibm.sysdummy1 where 0=1
The differences are:
For Db2 10
Without “where 0=1”
PREPARE – InDB2 Time = 0.00048 InDB2 CPU = 0.00047
FETCH - 2 fetches, InDB2 time = 0.00131 InDB2 CPU= 0.00116 (avg. 0.00058)
LOCKs - Type = MDEL , Level = S , DB=DSNDB06 , PS=SYSEBCDC , Count = 1
Sequential scan of data page = 2 pages scanned, 1 row qualified
With “where 0=1”
PREPARE - InDB2 Time = 0.00077 InDB2 CPU = 0.00049
FETCH - 1 fetch, InDB2 time = 0.00008 , InDB2 CPU = 0.00008
Locks - none
NO sequential scan
For Db2 11
Without “where 0=1”
PREPARE - InDB2 Time = 0.00045 InDB2 CPU = 0.00045
FETCH - 2 fetches, InDB2 time = 0.00121 InDB2 CPU= 0.00120 (avg. 0.00060)
LOCKs - Type = MDEL , Level = S , DB=DSNDB06 , PS=SYSEBCDC , Count = 1
Sequential scan of data page = 2 pages scanned, 1 row qualified
With “where 0=1”
PREPARE - InDB2 Time = 0.00044 InDB2 CPU = 0.00044
FETCH - 1 fetch, InDB2 time = 0.00007 , InDB2 CPU = 0.00007
Locks - none
NO sequential scan
select 1 from sysibm.sysdummy1 vs. select 1 from sysibm.sysdummy1 where 0=1
The differences are:
For Db2 10
Without “where 0=1”
PREPARE - InDB2 Time = 0.00057 InDB2 CPU = 0.00054
FETCH - 2 fetches, InDB2 time = 0.00015 InDB2 CPU= 0.00015 (avg. 0.00007)
LOCKs - None
No Sequential scan of data page
With “where 0=1”
PREPARE - InDB2 Time = 0.00060 InDB2 CPU = 0.00034
FETCH - 1 fetch, InDB2 time = 0.00006 , InDB2 CPU = 0.00006
Locks - none
NO sequential scan
For Db2 11
Without “where 0=1”
PREPARE - InDB2 Time = 0.00047 InDB2 CPU = 0.00047
FETCH - 2 fetches, InDB2 time = 0.00087 InDB2 CPU= 0.00082 (avg. 0.00041)
LOCKs - none
NO Sequential scan of data pages
With “where 0=1”
PREPARE - InDB2 Time = 0.00044 InDB2 CPU = 0.00044
FETCH - 1 fetch, InDB2 time = 0.00007 , InDB2 CPU = 0.00007
Locks - none
NO sequential scan of data pages
Bottom line
“select 1” works better than “select *” as you don’t really scan the data and not taking locks, using “where 0=1” eliminates 1 fetch and the locks & scan as well.
Isaac Yassin