2019-11 Db2 Existence-checks: SELECT for DUMMIES

Every now and again, the DBAs of the world meet with the developers of the world and try and bang some heads together…I already wrote a newsletter all about existence checks which has changed and evolved over the years and now I wish to investigate another age-old problem “Is Db2 there?”


Newsletter: 2012-10: Existence check SQL – Through the ages
Reader test results: see the end of this newsletter for SELECT queries tested by one of our readers on Db2 10 and DB2 11.

Why?

First up, is “Why?” – Why would you want to know that? Well think about our wonderful agile world these days. The JAVA developers do not even know what Db2 *is* and so when they write/script/generate/get generated for them SQL (Delete whatever is not appropriate), they have no idea *where* the code will run. To be fair, they should not really need to…the idea these days is that your back end can be anywhere and on any hardware platform.

So, wouldn’t it be cool to be able to tell the application that the required database is “not currently available” – This is when the “Is Db2 there?” SQL question, and all of its problems, was born.

Problems


  1. How to do this ?

  2. How often to do this ?

  3. Do you need to do this ?

  4. Haven’t you got better things to do than this?


1 – How to do this

The first problem was solved when a developer wrote this SQL:

SELECT * FROM SYSIBM.SYSDUMMY1

PREPAREd, OPENed, FETCHed it and CLOSEd it. Now remember, with dynamic SQL, everything must be in cursors so you really did have to define a cursor, open the cursor, fetch the cursor and, if you are being a good coder, close the cursor. If all of this was successful, then you “knew” that Db2 on z/OS was up and running and available for all your further SQL requests.

This was ok when there was not so much dynamic SQL going on (Before DB2 V8), but naturally it started getting a bit out of hand very quickly.

The SQL then changed a little bit to be:

SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1

The logic here was it could only return one row/value back and so made the network traffic less. Naturally Db2 went off and counted the single row in that table using a tablespace scan *every* time…

This slowly became a small, but very noticeable, overhead with 1,000,000’s of getpages against the smallest Db2 Catalog table ever invented…

Now Terry Purcell and the optimizer crew came up with the PRUNED access plan in Db2 9:

PRUNED

Db2 does not generate an access path for the query because the query is guaranteed to qualify zero rows, such as the case of an always-false WHERE clause.

For example: 

WHERE 0 = 1

This happens when a “leg” of an SQL query cannot ever be true. In this case, the complete leg of the SQL query is thrown away. With this advance in optimizer logic the predicate:

WHERE 1 = 0

Actually became really useful, as it meant that the *entire* SQL was only run through the optimizer and the SQLCODE was set to +100 if doing a FETCH with the * style SQL or the value of your host variable was set to zero if using the COUNT(*) style. This meant that the getpages for SYSIBM.SYSDUMMY1 dropped down to zero – which is a very very good thing

IBM have been improving what PRUNED can do ever since, and Terry Purcell wrote

“Another benefit of column pruning is that it can provide extra table pruning opportunities. In Db2 10, an enhancement was delivered to prune outer join tables if no columns were required from that table and it was guaranteed that the table would not introduce duplicates. In Db2 12, this pruning enhancement is extended to outer joins to views/table expressions that are guaranteed not to return duplicates and no columns are required from the view/table expression.”

2 – How often to do this?

How long is a piece of string? The problem is that when you do the “Is Db2 there?” test at the start of a long conversational transaction, it could well be that Db2 has disappeared at the end…

3 – Do you need to do this?

Again, an age-old question – Most JAVA frameworks do this (Like they do embedded COMMIT and ROLLBACK!) and so the simple answer is yes!

4 – Haven’t you got better things to do than this?

Of course you do! We all do, but these little changes can actually have a major impact on the overall machine throughput!


Measured Response


Here are some real word results of tests with dynamic SQL with/without the 1 = 0 and running a million times each. I have tried doing a variety of flavors, like Heinz…

First up are the SELECT * style selects:

SEL-867 SELECT * FROM SYSIBM.SYSDUMMY1 
SEL-868 SELECT * FROM SYSIBM.SYSDUMMY1
     WHERE 1 = 0     
SEL-869 SELECT * FROM SYSIBM.SYSDUMMY1
     WHERE 1 = 0
     WITH UR 
SEL-870 SELECT * FROM SYSIBM.SYSDUMMY1 
     WHERE 1 = 0 
     WITH UR 
FOR FETCH ONLY 
SEL-871 SELECT * FROM SYSIBM.SYSDUMMY1 
    WHERE 1 = 0                 
    WITH CS                      
SEL-872 SELECT * FROM SYSIBM.SYSDUMMY1 
    WHERE 1 = 0                 
    WITH CS                      
FOR FETCH ONLY               
SEL-873 SELECT * FROM SYSIBM.SYSDUMMY1 
    WHERE 1 = 0                  
    WITH RR                      
SEL-874 SELECT * FROM SYSIBM.SYSDUMMY1 
    WHERE 1 = 0                  
    WITH RR                      
FOR FETCH ONLY               

Now the SELECT COUNT(*) style selects:

SEL-875 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1
SEL-876 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1
     WHERE 1 = 0     
SEL-877 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1
     WHERE 1 = 0
     WITH UR 
SEL-878 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1 
     WHERE 1 = 0 
     WITH UR 
FOR FETCH ONLY 
SEL-879 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1 
    WHERE 1 = 0                 
    WITH CS                      
SEL-880 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1 
    WHERE 1 = 0                 
    WITH CS                      
FOR FETCH ONLY               
SEL-881 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1 
    WHERE 1 = 0                  
    WITH RR                      
SEL-882 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1 
    WHERE 1 = 0                  
    WITH RR                      
FOR FETCH ONLY 

The results were:

                            Rows   Rows  Index  Tblsp   
    StmtID  Execs  Getpg    exam   proc  Scans  Scans   
----------  -----  -----   -----  -----  -----  -----   
       867  1000K     2M   1000K  1000K      0  1000K   
       868  1000K      0       0      0      0      0   
       869  1000K      0       0      0      0      0   
       870  1000K      0       0      0      0      0   
       871  1000K      0       0      0      0      0   
       872  1000K      0       0      0      0      0   
       873  1000K      0       0      0      0      0   
       874  1000K      0       0      0      0      0 

       875  1000K     2M   1000K  1000K      0  1000K   
       876  1000K      0       0  1000K      0      0   
       877  1000K      0       0  1000K      0      0   
       878  1000K      0       0  1000K      0      0   
       879  1000K      0       0  1000K      0      0   
       880  1000K      0       0  1000K      0      0   
       881  1000K      0       0  1000K      0      0   
       882  1000K      0       0  1000K      0      0     

    StmtID   Tot. CPU  Avg. CPU   Tot. Elap  Avg. Elap  
----------  SS.mmmmmm  SS.mmmmmm  SS.mmmmmm  SS.mmmmmm  
       867   2.481068   0.000002  16.869174   0.000017  
       868   1.291817   0.000001   8.119036   0.000008  
       869   1.331707   0.000001   6.220308   0.000006  
       870   1.330709   0.000001   9.460538   0.000009  
       871   1.306633   0.000001   8.984930   0.000009  
       872   1.326517   0.000001   9.181043   0.000009  
       873   1.324213   0.000001   7.392330   0.000007  
       874   1.322115   0.000001   7.694403   0.000008
 
       875   3.066573   0.000003  18.824193   0.000019  
       876   1.467454   0.000001   6.168161   0.000006  
       877   1.478887   0.000001   7.714925   0.000008  
       878   1.480241   0.000001   9.903416   0.000010  
       879   1.477947   0.000001   9.965071   0.000010  
       880   1.469971   0.000001   9.195251   0.000009  
       881   1.467287   0.000001   8.687336   0.000009  
       882   1.487021   0.000001  11.742945   0.000012 

The SELECT * with WHERE 1 = 0 WITH UR is actually the winner!

Remember – You never stop learning!

As always I would be pleased to hear from you!

TTFN,
Roy Boxwell
Senior Architect

Results


Reader test results from Isaac Yassin:


Measured for dynamic SQL (SPUFI in UR mode – DSNESPUR), using Omegamon application trace. Same machine, same LPAR, same subsystem, measuring the 3rd consecutive execution of each SQL. Both do a “prepare/open/fetch/close” according to trace.

select * from sysibm.sysdummy1 vs. select * from
sysibm.sysdummy1 where 0=1

The differences are:

For Db2 10

Without “where 0=1”

 PREPARE – InDB2 Time = 0.00048 InDB2 CPU = 0.00047
FETCH - 2 fetches, InDB2 time = 0.00131 InDB2 CPU= 0.00116 (avg. 0.00058)
LOCKs - Type = MDEL , Level = S , DB=DSNDB06 , PS=SYSEBCDC , Count = 1
Sequential scan of data page = 2 pages scanned, 1 row qualified

With “where 0=1”

 PREPARE - InDB2 Time = 0.00077 InDB2 CPU = 0.00049
FETCH - 1 fetch, InDB2 time = 0.00008 , InDB2 CPU = 0.00008
Locks - none
NO sequential scan

For Db2 11

Without “where 0=1”

PREPARE - InDB2 Time = 0.00045 InDB2 CPU = 0.00045
FETCH - 2 fetches, InDB2 time = 0.00121 InDB2 CPU= 0.00120 (avg. 0.00060)
LOCKs - Type = MDEL , Level = S , DB=DSNDB06 , PS=SYSEBCDC , Count = 1 
Sequential scan of data page = 2 pages scanned, 1 row qualified

With “where 0=1”

PREPARE - InDB2 Time = 0.00044 InDB2 CPU = 0.00044
FETCH - 1 fetch, InDB2 time = 0.00007 , InDB2 CPU = 0.00007
Locks - none
NO sequential scan
select 1 from sysibm.sysdummy1 vs. select 1 from sysibm.sysdummy1 where 0=1 

The differences are:

For Db2 10

Without “where 0=1”

 PREPARE - InDB2 Time = 0.00057 InDB2 CPU = 0.00054
 FETCH - 2 fetches, InDB2 time = 0.00015 InDB2 CPU= 0.00015 (avg. 0.00007)
 LOCKs - None
 No Sequential scan of data page

With “where 0=1”

PREPARE - InDB2 Time = 0.00060 InDB2 CPU = 0.00034
FETCH - 1 fetch, InDB2 time = 0.00006 , InDB2 CPU = 0.00006
Locks - none
NO sequential scan

For Db2 11

Without “where 0=1”

PREPARE - InDB2 Time = 0.00047 InDB2 CPU = 0.00047
FETCH - 2 fetches, InDB2 time = 0.00087 InDB2 CPU= 0.00082 (avg. 0.00041)
LOCKs - none
NO Sequential scan of data pages

With “where 0=1”

PREPARE - InDB2 Time = 0.00044 InDB2 CPU = 0.00044
FETCH - 1 fetch, InDB2 time = 0.00007 , InDB2 CPU = 0.00007
Locks - none
NO sequential scan of data pages

Bottom line


“select 1” works better than “select *” as you don’t really scan the data and not taking locks, using “where 0=1” eliminates 1 fetch and the locks & scan as well.

Isaac Yassin