GIVE and TAKE Program

1,2,3


Give and Take 2020

Information on the Give and Take 4,5,6,7


Previous Give & Take

We have “GIVEn” various free-of-charge Use Cases from our SQL WorkloadExpert for Db2 z/OS like:

1 Index Maintenance Costs

2 EXPLAIN Suppression

3 BIF Usage

Limited free-of-Charge Db2 Application

This Program started in Europe, during our 30th anniversary was such a success, that it is now being Extended for the benefit of North American Db2 z/OS sites.

<a href="http://www.seg.de/produkte/db2-zos-produkte/sql-workloadexpert-for-db2-zos/" target="

Index Maintenance Costs, EXPLAIN Suppression, BIF

Limited free-of-Charge DB2 Application

This Program started in Europe, during our 30th anniversary was such a success, that it is now being Extended for the benefit of North American DB2 z/OS sites.

SQL WorkloadExpert for DB2 z/OS (WLX) contains several “Use Cases”. We provided three of them, free of charge, for one month to different sites. In return, we received their results. We’d like to share this with you now.

We have “GIVEn” various free-of-charge Use Cases from SQL Workload Expert for DB2 z/OS like
1  Index Maintenance Costs
2  EXPLAIN Suppression
3  BIF Usage – This last one is still available
We TAKE the anonymized results for research
and will communicate with the local User Groups for discussions

Inspiring experiences

Customer Statements

3BiF USAGE
News
Read the Customer Comments across the Industry 

 

  • Health Care
  • Insurance
  • Banking
  • Car Manufacturing

First results from DB2 z/OS sites

1Index Mantenance CostsNearly all the data we got back showed a positive result for created Indexes…
2EXPLAIN SuppressionAbout 10% of SQLs are actually “left over”…
3 BIF Usage When migrating to a new DB2 version, the BIFs are not always compatible and an extreme amount of data is produced.

 

The difficulty of near-time analysis to track down BIFs within dynamic SQL have been solved with the BIF Usage Use Case…

[Results from DB2 z/OS sites]

Program 3 – BIF Usage –  has now started

BIF-Usage

 

Presentation

BIF CompatibilityDB2 10 compatibility mode
Changes to the STRING formating of decimal data within the CHAR and VARCHAR built-in function and to the CAST specification with CHAR and VARCHAR result types as well as  UNSUPPORTED TIMESTAMP STRINGs.
White PaperFinding BIFsAnd How to Lead a Problem-Free Life With Them in the Future
Navigating the Challenges of moving to a new DB2 Release
Newsletter2015-01 – BIFCIDS – Where’s the BIF?How will you deal with loop-hole usage in production code?
VideoBIF Usage(11min.) Trap  and correct the BIFs that will cause belly-ache one day soon

BIF Usage video

2015-09 A real CLUSTER Buster

 

Are you using the “default” clustering INDEX or are you defining the correct INDEX with the CLUSTER Attribute?

This newsletter is dedicated to all DDL designers who do their best but then omit that last tiny bit. No-one really notices, or even cares, for years and years until…

Imagine a huge table up in the billions of rows. Imagine now that you have SQL that accesses this table and it must, as always, run fast. So what do you do? You create an index, RUNSTATS it and Hey Presto! Everything is sweet and dandy! Now imagine this happening again and again over time… What you finally end up with is a huge table with billions of rows now with ten indexes! Not too brilliant for insert and update but that is not the point of this newsletter.

 

An SQL, that had worked perfectly well, suddenly went pear shaped…

So now stop imagining, as this had already really happened at a customer site. We come to the crux: An SQL, that had worked perfectly well, suddenly went pear shaped (belly-up for the non-British English readers!) and started using a two column index with one matching column instead of a six column index with six matching columns! This change in access path caused death by random-IO to occur and it all went horribly wrong.

 

Now the question is why? What on earth happened for the DB2 Optimizer to make such a terrible decision?

1- RUNSTATS review

First idea was, of course, my favourite – Incomplete or not Full RUNSTATS data. In fact there were “bogus stats” from 2003 in the SYSCOLDIST, but even after all the bogus stats were deleted and a complete RUNSTATS with HISTOGRAM and FREQVAL performed, the access path remained stuck on the “bad” index.

 

2- DDL review

I then reviewed the DDL that created all of the objects and noticed that none of the indexes was defined with the CLUSTER attribute. The table itself was “as old as the hills,” but all of the indexes had been created and/or altered many times over the last ten years or so.

 

3- Redefine the “bad” Index as CLUSTER

 The dummy CLUSTER

Now, as we all know, if no index is defined as CLUSTER DB2 picks one to be a dummy CLUSTER when it does a REORG. So you can end up with 100% clustering non-clustered indexes. In this case that was exactly what was happening. The “bad” index was, purely by fluke after many years of index maintenance, the “default” clustering index, however it was a *terrible* choice for a clustering index. Worse still: because of the fact it was non-unique with two columns and therefore small (well small in this case was still 40,000 pages!) it looked positively “good” to the DB2 Optimizer—hence the decision to abandon a six column matching index in favour of a single column one…

“Cleansing” with ALTER, REORG and the DB2 10 INCLUDE syntax

A quick ALTER of the original “first” index to get the CLUSTER attribute, a REORG scheduled for the weekend to get the data into *proper* CLUSTERing sequence and – Bob’s your uncle! Access path swapped back to the “good” index.

Now there’s still work to be done here as ten indexes is about seven too many, if you ask me.With DB2 10 it is possible to use the INCLUDE syntax to weed out some of the extra indexes and thus speed up all usage of this mega-table. But, for the right here and now, the job is done!

So now I am at the end of this sad story of how a little design “error” of just forgetting one little attribute on an index create statement caused major mayhem many years down the line… remember to check *all* of your tables and see if you have any beauties like this in your shop (surely not!)

 

 

Here’s a little SQL that will do the job for you:

--                                                                     
-- QUERY TO LIST OUT ALL TABLES WITH TWO OR MORE INDEXES WHERE NO INDEX
-- IS DEFINED AS CLUSTER                                               
--                                                                     
SELECT A.CREATOR                                                       
      ,A.NAME                                                          
FROM SYSIBM.SYSTABLES  A                                               
WHERE NOT A.CREATOR = 'SYSIBM'                                         
  AND NOT EXISTS                                                       
          (SELECT 1                                                    
           FROM SYSIBM.SYSINDEXES B                                    
           WHERE A.NAME       = B.TBNAME                               
             AND A.CREATOR    = B.TBCREATOR                            
             AND B.CLUSTERING = 'Y')                                   
  AND 1 < (SELECT COALESCE(COUNT(*) , 0)                               
           FROM SYSIBM.SYSINDEXES B                                    
           WHERE A.NAME    = B.TBNAME                                  
             AND A.CREATOR = B.TBCREATOR)                              
ORDER BY 1 , 2                                                         
;

Note that this query excludes the SYSIBM indexes as IBM also forgot to CLUSTER them!

 

As usual, any comments or questions please mail me!

 

TTFN

Roy Boxwell

2015-08 Overloaded Logs

 

Have all your SQL members an “equivalent” elapsed time?

 

Hey, just a quick newsletter this month to highlight an interesting observation I had recently…

 

The high-water mark for elapsed time for some SQL was inexplicably high…

At a customer site, the Lead DBA and I were analysing SQL performance—using our  DB2 SQL WorkloadExpert tool—and we saw in the data-sharing aggregated view of the executed SQLs that the high-water mark for elapsed time for some SQL was inexplicably high.

We then viewed the data at the Member level for these SQLs and what we saw practically jumped right out of the data at us. The same SQL on one Member was taking up to 45% longer to execute when compared against another member of the Data-sharing Group. Time to see what was going on…

 

High elapsed times and lots of Wait times on one member

On the member in question there were very high elapsed times and lots of Wait times—all well in excess of the other members. Viewing the SQLs that ran on the “problem” Member, it was quickly apparent that data change SQLs, (Insert, Delete, Update and Merge), were happening orders of magnitude more often than on any other member in the group. This started giving the Lead DBA a good idea about where the Problem could lie.

 

The problem was Logging

On this member, the OUTBUFF was increased in size, both the Active and Archive Logs were increased in size, and some traffic was switched to another Member in the group.

Tra la! The problem was fixed! All members now have an “equivalent” elapsed time for the same SQL.

 

When was the last time that you took a look to see how your workload for SQL Updates is “balanced” across your machine? Just saying.

 

As usual, any comments or questions are welcome!

TTFN

Roy Boxwell

2015-07 Bad Data Day

A Good time to check your DB2 Catalog Statistics !

One of my favorite topics is STATISTICS and RUNSTATS.

This month I have a short newsletter involving both of them!

 

Something jumped right out…

Some time ago we were helping one of our customers to perform an Early Precheck (Going from DB2 9 to 10). To do so, we requested a copy of their entire DB2 production statistics so the optimizer could work here at our labs in Dusseldorf – just like at the customer site.

 

…as we loaded up DB2 production statistics of a customer

We loaded up the data and I noticed something that just jumped right out…

SELECT TYPE 
      ,FREQUENCYF 
      ,NUMCOLUMNS 
      ,SUBSTR(COLVALUE , 1 , 11) AS COLVALUE 
      ,COLGROUPCOLNO 
       FROM SYSIBM.SYSCOLDIST 
       WHERE TBOWNER = 'aaaaaaa' 
         AND TBNAME  = 'bbbbbb' 
         AND NAME    = 'cccccc' 
ORDER BY 1 , 3 , 2 
; 
---------+---------+---------+---------+---------+---------+---------
TYPE             FREQUENCYF   NUMCOLUMNS   COLVALUE      COLGROUPCOLNO
---------+---------+---------+---------+---------+---------+---------
F    +0.6066539624818615E-02            1 . xxxxxxxxxx .. 
F    +0.6066539624818615E-02            1 . xxxxxxxxxx 
F    +0.6287988717751475E-02            1 . xxxxxxxxxx .. 
F    +0.6287988717751475E-02            1 . xxxxxxxxxx 
F    +0.8554928116458912E-02            1 . xxxxxxxxxx .. 
F    +0.8554928116458912E-02            1 . xxxxxxxxxx 
F    +0.8578238547293950E-02            1 . xxxxxxxxxx .. 
F    +0.8578238547293950E-02            1 . xxxxxxxxxx 
F    +0.8852136109605646E-02            1 . xxxxxxxxxx 
F    +0.8852136109605646E-02            1 . xxxxxxxxxx .. 
F    +0.1229042465777374E-01            1 . xxxxxxxxxx 
F    +0.1229042465777374E-01            1 . xxxxxxxxxx .. 
F    +0.1331608361451540E-01            1 . xxxxxxxxxx .. 
F    +0.1331608361451540E-01            1 . xxxxxxxxxx 
F    +0.1342098055327308E-01            1 . xxxxxxxxxx .. 
F    +0.1342098055327308E-01            1 . xxxxxxxxxx 
F    +0.1633478440765281E-01            1 . xxxxxxxxxx 
F    +0.1633478440765281E-01            1 . xxxxxxxxxx .. 
F    +0.8439891140288000E+00            1 . xxxxxxxxxx .. 
F    +0.8439891140288000E+00            1 . xxxxxxxxxx 
DSNE610I NUMBER OF ROWS DISPLAYED IS 20 
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 100 

 

There were “duplicates”!

This is just one of many examples. Also above, you can see the problem – Notice the data in COLGROUPCOLNO? Sometimes there is a hex value and sometimes not!

Now, believe this or not, all of this goes back to a bug in RUNSTATS in DB2 V8 which got this APAR:

PK33517:

COLGROUPCOLNO ASSOCIATED WITH SINGLE COLUMN CONTAINS A NUMERIC VALUE INSTEAD OF ZERO LENGTH FIELD ACCORDING TO SQL REF

What’s interesting here, is this APAR is marked as FIN so the “bug” disappeared in DB2 9!

The bug was fixed in DB2 9, but the “bad” data was not automatically cleaned up

As you can see, the bug caused an erroneous value in the COLGROUPCOLNO to be set for single column frequency rows.

The bug was then fixed in DB2 9, but the “bad” data that had been inserted was not automatically cleaned up and, as I hope you all know, the SYSCOLDIST data is never automatically deleted – it is only ever updated. So when the bug was fixed, the low-value or hexadecimal column number in that field was no longer EQUAL to a zero length field, and so an insert was done. Since then these rows have just stayed there…

 

Query to count this bad rows in the SYSCOLDIST

I wrote a little query just to show the count of how many of these bad rows existed in the SYSCOLDIST:

SELECT COUNT(*) AS BAD_GUYS 
FROM SYSIBM.SYSCOLDIST 
WHERE NUMCOLUMNS                = 1 
  AND TYPE                      = 'F' 
  AND NOT LENGTH(COLGROUPCOLNO) = 0 
; 
---------+---------+---------+---------+------
    BAD_GUYS 
---------+---------+---------+---------+------
       8680 
DSNE610I NUMBER OF ROWS DISPLAYED IS 1


All of these entries should be deleted from SYSCOLDIST to help the optimizer pick the right access paths!

Check your stats !

I first noticed this problem during a test of our Statistics HealthCheck product, which flagged over 1,200 critical problems in the DB2 catalog and, thinking I had broken something, I checked all of the checks and found the above bad data. Now is as good a time as any to check your stats *and* download our Statistics Healthcheck Freeware!!

 

As usual any questions or  comments are welcome,

TTFN Roy Boxwell

Senior Software Architect

2015-06 SOUNDEX and other “cool” features – Part four: DB2 10-DB2 11

 

Update for DB2 10 and all new for DB2 11

 

This newsletter completes my walk through of new Scalar functions that I began a few years ago. I will start today with two new ones that were introduced by APAR PM56631 in May 2012 to DB2 10. PACK (a Scalar Function), and its opposite number UNPACK (a Row Function).

 

New Scalar functions in DB2 10:

Some PACK examples

PACK basically “packs” together a list of columns, or literals, into one variable length binary string in UNICODE format. Here’s some real PACK examples—note that SYDUMMYU is used!

SELECT PACK(CCSID 1208, 'ALICE', DATE('2014-07-22') , DOUBLE(8.6))      
FROM SYSIBM.SYSDUMMYU;                                                  
---------+---------+---------+---------+---------+---------+---------+--
00000301C0018001E004B80005414C494345201407224021333333333333            
DSNE610I NUMBER OF ROWS DISPLAYED IS 1
SELECT PACK(CCSID 1208, 'ROY', BIGINT(123456789) ,                      
TIMESTAMP('2014-07-22-07.43.23.123456'))                                
FROM SYSIBM.SYSDUMMYU;                                                  
---------+---------+---------+---------+---------+---------+---------+--
00000301C001EC018804B80003524F5900000000075BCD15000620140722074323123456
DSNE610I NUMBER OF ROWS DISPLAYED IS 1
SELECT PACK(CCSID 1208,                                                 
TIMESTAMP('2014-07-22-07.55.23.123456'))                                
FROM SYSIBM.SYSDUMMYU;                                                  
---------+---------+---------+---------+---------+---------+---------+--
0000010188000620140722075523123456                                      
DSNE610I NUMBER OF ROWS DISPLAYED IS 1

You can see the format that it creates is actually pretty standard and can have uses in normal processing!

Using the second example above, we can see what is in the output:

The VARBIN string starts with a control flag byte which we can happily ignore

00

Then comes how many elements are in the data

0003

Then comes a set of “number of element” SQLTYPES. If any are odd numbered then the data value is *not* in the string as it is in fact NULL

01C0   Varying Character data 448 in decimal
01EC   BIGINT data 492 in decimal
0188   TIMESTAMP data 392 in decimal

 

The first data element CHAR containing “ROY” looks like

04B8   CCSID of String 1208 in decimal
0003   Length of string 
524F59 Data in UNICODE

 

The second data element BIGINT containing 123456789 looks like

00000000075BCD15      BIGINT data big endian always!

The third data element TIMESTAMP containing ‘2014-07-22-07.43.23.123456’ looks like

0006                  TIMESTAMP precision
20140722074323123456  TIMESTAMP data

UNPACK statements

Now, these VARBIN strings can get very long indeed, but what can you do with all of the data? Well, obviously UNPACK springs to mind… Here’s the above data again, but getting input to an UNPACK statement you have to create a FUNCTION first… something like this:

--#SET TERMINATOR $                                        
CREATE FUNCTION PACKUDF ()                                 
        RETURNS VARBINARY(128)                             
        VERSION V1                                         
        LANGUAGE SQL                                       
        DETERMINISTIC                                      
        NO EXTERNAL ACTION                                 
        CONTAINS SQL                                       
        CALLED ON NULL INPUT                               
        STATIC DISPATCH                                    
        NOT SECURED                                        
        DISABLE DEBUG MODE                                 
        BEGIN                                              
        DECLARE PACKSTR VARBINARY(128);                    
        SET PACKSTR =                                      
               PACK(CCSID 1208                             
                   ,VARCHAR('ROY' , 40)                    
                   ,BIGINT(123456789)                      
                   ,TIMESTAMP('2014-07-22-07.43.23.123456')
                   );                                      
        RETURN PACKSTR;                                    
        END                                                
$   

This just does what the second example did, but buried in a Scalar SQL Function. This can then easily be called by just doing this:

SELECT UNPACK(PACKUDF()).*                                                           
          AS (USER_NAME    VARCHAR(40) CCSID UNICODE                    
             ,LARGE_NUMBER BIGINT                                       
             ,WHEN_USED    TIMESTAMP)                                   
FROM SYSIBM.SYSDUMMY1 ;                                                 
---------+---------+---------+-..--------+---------+---------+---------+---------+---
USER_NAME                      ..            LARGE_NUMBER  WHEN_USED                 
---------+---------+---------+-..--------+---------+---------+---------+---------+---
ROY                                             123456789  2014-07-22-07.43.23.123456
DSNE610I NUMBER OF ROWS DISPLAYED IS 1   
            

I deleted some white space (..) above just to get it all to fit on the page, by the way! Now quite *what* you would do with PACK and UNPACK I’ll leave up to you!!!

 

On now to DB2 11: Aggregate Functions and 11 new scalar functions for new ARRAY data type

Here we got three new Aggregate Functions and 11 new Scalar Functions. Most of these are for the new ARRAY data type that was introduced in DB2 11. First, here’s how you can create ARRAYs, so that you can understand a bit more about the Functions that follow:

CREATE TYPE BOXWELL.ROYARRAY  AS INTEGER  ARRAY ??(??) ;     
CREATE TYPE BOXWELL.ROYARRAY4 AS CHAR(16) ARRAY ??( 32 ??) ; 
CREATE TYPE BOXWELL.ROYARRAY5 AS CHAR(16)                    
   ARRAY ??( VARCHAR(8) CCSID UNICODE FOR MIXED DATA ??) ;

Note that I use trigraphs ( ??( = [ and ??) = ] ), as I *hate* the square brackets! (I would *love* to meet that developer in a dark alley one night…I know, I know, it is like that in the “other” SQL world but it doesn’t mean I have to use them!).

 

Now arrays are pretty cool for passing data back from SQL Scalar Functions, or for Parameters of SQL Scalar Functions and Native SQL Procedures, as it simplifies the handling of lots of same type parameters. This is really good for all the JAVA programmers out there who use array types all the time!

In fact, all of the ARRAY stuff is *only* allowed within the confines of Stored Procedures and SQL PL.

What also arrived in DB2 11 is the ability to “unravel the array” using the new “collection-derived-table” syntax—otherwise known as UNNEST—and here are the two examples from the doc:

 

Example 1: Suppose that PHONENUMBERS is a user-defined array type that is defined as an ordinary array.

RECENT_CALLS is an array variable of the PHONENUMBERS type. RECENT_CALLS contains the following phone numbers:

9055553907
4165554213
4085553678
The following SELECT statement uses UNNEST to retrieve the list of phone numbers from the Array:

SELECT T.ID, T.NUM FROM UNNEST(RECENT_CALLS) WITH ORDINALITY AS T(NUM, ID);

ID NUM
1 9055553907
2 4165554213
3 4085553678

SET PHONELIST[’Home’] = ’4443051234’;
SET PHONELIST[’Work’] = ’4443052345’;
SET PHONELIST[’Cell’] = ’4447893456’;

The following SELECT statement is executed:

SELECT T.ID, T.PHONE FROM UNNEST(PHONELIST) AS T(ID, PHONE);

The result table looks like this, although the order of rows might differ:

ID PHONE
Cell 4447893456
Home 4443051234
Work 4443052345

 

The Array Aggregate Function, ARRAY_AGG, and all of the Scalar Functions:

ARRAY_DELETE,
ARRAY_FIRST,
ARRAY_LAST,
ARRAY_NEXT,
ARRAY_PRIOR,
CARDINALITY,
MAX_CARDINALITY,
and TRIM_ARRAY,
are all available to use but only within the confines of SQL PL.

 

CHAR9 and VARCHAR9

Also new in DB2 11 are CHAR9 and VARCHAR9, which are the good old functions from DB2 9 but now keeping the way they used to work forever—I hope they will not actually be used!

XSLTRANSFORM was also delivered to do XML transformations, but I have never actually tried it. You must note this bit of text:

 

This user-defined function requires IBM SDK for z/OS, Java Technology Edition Version 6.
This user-defined function uses the XSLT support that is provided by the W3C XSL Transformations V1.0 Recommendation.

 

GROUPING & MEDIAN

Then we got two new Aggregate Functions. The first is GROUPING—used in the OLAP extensions for doing automatic sub- and grand totaling, and then MEDIAN—which *only* works if you have the IDAA switched on!

So, lots of interesting stuff here. You may want to go back to the 3 newsletters  from a few years ago (2012) to recap all of these “cool” features. That way you’ll have plenty to read if you get bored on holiday!

 

Feel free to send me your comments and ask questions!

Link back to general newsletter page

2015-05 Top 10 Things to Ignore for DB2 z/OS

 

This newsletter was inspired by a recent article I read in the “Enterprise Systems Magazine” called “Top 10 Ways to Waste Money on CPU”. Why not the Top 10 things to ignore?

 

DB2 z/OS things you could ignore but most definitely should not!

So here’s my little list, in no particular order, of things you could ignore but most definitely should not!

  1SQL DELETE statements in mega-million  tablesSQL DELETE statements in mega-million  tables when a REORG DISCARD would kill two birds with one stone. (I love that phrase) Anyway, after 500,000 singleton deletes the tablespace probably needs a REORG anyway and so why not do two in one? A bit of a no-brainer really.
  2LOB columnsLOB columns, whose size would *easily* fit inside an inline LOB or even a VARCHAR. LOBs are still slow and cumbersome to use, but inline LOBs are great. If you can use ‘em – do so!
  3BP0 being used for *everything* by default…BP0 being used for *everything* by default… Please split the BP s into groups!!! BP0 is only, and I mean ONLY, for the Catalog and Directory. That way you can actually keep the size low and spare some memory for other BPs. LOB and XML tablespaces get their own BP. Tables and Indexes are split. Sort gets its own. You get the idea ?
  4Utility jobs still based on 1990’s ideasUtility jobs still based on 1990’s ideas. Are you still running a RUNSTATS to see if a REORG is needed? Are you running REORGs without inline RUNSTATS? Are your RUNSTATS using FREQVAL and, if required, HISTOGRAM?
  5Death by “indexiphication”.Death by “indexiphication”. Do you have tables with more than three indexes? Do you have ten or more indexes? Time to look for INCLUDE usage and LASTUSED Timestamps here!
  6PLAN_TABLE explosionPLAN_TABLE explosion. Do you have multiple PLAN_TABLEs in production? Are you REORGing, RUNSTATSing and Image Copying them on a regular basis? Are you purging them of rubbish data on a regular basis?
  7Are your ZPARMs up to date?Have you checked the Rules of Thumb in regard to ZPARMS since they were last set back in the 80’s? Now is the time to do a review of all the ZPARMS to see where you can really get performance boosts. (For example the default SRTPOOL In DB2 10 is now 10,000k but in DB2 V8 and 9 it was just 2,000k)
 8Are you removing garbage from the DB2 Catalog and Directory ?Are you removing garbage from the DB2 Catalog and Directory ? Do you really need all the packages and versions of those packages from 1989 these days? If a table gets RUNSTATSed that these ancient, never executed, packages uses then it should trigger a review of the access paths, which could, of course, flag up problems where no real problem exists.
  9 COMMIT frequency.You never need to check or change this do you…
 10 TrainingIDUG, Insight, and RUGs etc. you can never ever get enough info about how things work and how to make things better.

 

One thing you should certainly NOT ignore, is my newsletter! I have lots of exciting topics coming up in 2015 and I’ll also let you know about our webinars.

Upcoming Newsletters

  • SOUNDEX and other cool features part 4 – update for DB2 10 & all new for DB2 11
  • BAD Data Day
  • Overloaded Log
  • A real CLUSTER Buster

 

As usual, any comments or questions are welcome!

TTFN

Roy Boxwell

2015-04 SQLCODEs of interest

 

Are your DB2 11 SQLCODEs up-to-date?

I originally wanted to call this newsletter “SQLCODE101” but not all of my international readers may understand… Anyway, this newsletter is dedicated to that small group of people who, like me, share an interest in SQLCODEs.

The “newest” SQLCODES are not updated in copy book style checking routines

One thing I noticed years ago, is how often the “newest” codes are not updated in copy book style checking routines.

I was at one customer site once where their copy book entry looked like this:

 88 SQLCA-ERROR                VALUE -999 THRU -001.

And I choked on my coffee!
Scary huh?

It actually got worse when I then saw:

88 SQLCA-WARNING             VALUE +101 THRU +999.

Now these are soooo bad it hurts the eyes!
But as they were lurking in copy book code, they were simply never seen…

 

DB2 11, current documentation

As of DB2 11, current documentation shows the actual ranges are -30106 to -7 and +12 to +30100.

So the above COBOL should really look like this:

88 SQLCA-ERROR               VALUE -32000 THRU -001.
88 SQLCA-WARNING             VALUE   +1   THRU +99
+101 THRU +32000.

 

The SQLCODE +100 is special, as it is “no row found” or “end of cursor”.

 

What other SQLCODEs are of general interest then?

Well here’s my list in no particular order:

 

-803 (Duplicate key)

This is sometimes called the “lazy update check”. First do an insert, if it bounces with a -803 make the key unique or change it to an update statement. Now this logic is fine if the majority of the inserts succeed, but if the majority are failing it is probably time to swap the logic around and try an update, then if you get a +100 do an insert instead. This also stops any “negative SQLCODE monitors” from firing off too many false positives!

 

-802 and its younger brother +802 (Numeric exception)

Now do you see that I have two codes here? The +802 means a numeric exception has occurred, but the SQL carries on with -2 set in the indicator variable:

 

 +802 EXCEPTION ERROR

+802 EXCEPTION ERROR exception-type HAS OCCURRED DURING operation-type OPERATION ON data-type DATA, POSITION position-number

Explanation: The exception error exception-type occurred while performing one of the following operations on a field that has a data-type of DECIMAL, FLOAT, SMALLINT, or INTEGER:

 

Whereas the802 is a bit different:

 

 -802 EXCEPTION ERROR

-802 EXCEPTION ERROR exception-type HAS OCCURRED DURING operation-type OPERATION ON data-type DATA, POSITION position-number

Explanation: An exception error has occurred in the processing of an SQL arithmetic function or arithmetic expression. The exception error occurred in one of the following areas:

  • In the SELECT list of an SQL SELECT statement.
  • In the search condition of a SELECT, UPDATE, MERGE, or DELETE statement.
  • In the SET clause of the UPDATE operation.
  • During the evaluation of an aggregate function.

So you must really take good care here!

 

 Very interesting SQLcodes

 

-514 and -518 (Prepared SQL gone)

Now these are *very* interesting and you should have a quick look in your monitor to see how many of these you get. I really hope that all your SQL code has retry logic to rePREPARE the SQL that has gone. I was rather surprised to see prepared statements in current active use getting these messages. But if the data is flushed from the DSC then this is what can happen!

Here’s a little snippet from one of Namik Hrle’s DSC presentations:

 

It gives a little “hint” that the prepared statement can be thrown from the cache at any time, (not just the obvious bad guys like RUNSTATS etc.)

 

-331, +335, +445 and +20141 (Truncation or Character Conversion problem)

Here, only the -331 (CHARACTER CONVERSION CANNOT BE PERFORMED BECAUSE A STRING, POSITION position-number, CANNOT BE CONVERTED FROM source-ccsid TO target-ccsid, REASON reason-code) is actually returning an error code, while all the others – +335 (use of substitute character), +445, & +20141 (Truncation) – carry on regardless. This might not be what is actually desired.

 

+222 (Positioned on a hole)

You have positioned onto a deleted/updated row in a SENSITIVE STATIC cursor – These warnings should just trigger another fetch until either table end or a real row is found.

 

-911 (Timeout or Deadlock)

Now this beauty actually does the ROLLBACK for you, so you must be aware of that when you get this bad guy!

 

 

Do you have any SQLCODEs that you treat specially?

I would love to hear from you if you do!

 

 

As usual, if you have any comments or queries please feel free to drop me a line!

TTFN

 

Roy Boxwell

 

2014-01: Complementing IBMs ACCESS PLAN Stability 1, for Dynamic SQL

 

In collaborazione con Expertise4IT

Con il package di salvataggio globale delle RUNSTATS implementato da SEG: Per quale ragione questa buona idea in qualche caso fallisce?

 

• La storia
• Cos’è accaduto realmente?
• Come risolvere il problema?
• Rescue Steps
• Un esempio (comprensivo di pannelli JCL)
• Il prossimo mese
Questo mese vi voglio raccontare una storia. La storia è vera, ma I nomi sono stati cambiati per proteggere gli innocenti

 

 

La storia

Una grande azienda schedula regolarmente la fasatura dei processi di produzione nella notte di Giovedì. Lo scorso anno, una notte tutto procedeva regolarmente ma … al Venerdì mattina …….

…i telefoni cominciarono a squillare perchè gli utenti si lamentavano di tempi di risposta lunghi o nulli (timeout) in un’applicazione piuttosto critica per il business.

Il problema rapidamente evolve dai normali ‘spegnete e riavviate’ e ‘avete cambiato qualcosa?’ ai senior managers che chiedono per quale motivo le cose non funzionano più.

Allo stato il Gruppo DBA non era stato ancora coinvolto, dal momento che il primo pensiero era che si trattasse di un “bad” package passato in produzione accidentalmente nella notte.

  • Il Gruppo di gestione della produzione, allora, riportava indietro I packages in gioco, ma l’operazione non produceva effetti….
  • I ritardi di consegna diventavano peggiori. Vengono quindi ‘stoppati’ quasi tutti i server WebSphere, almeno per permettere che un pò di lavoro venisse eseguito sul sistema sovraccarico. Essendo ormai giunti al livello di panico, i DBA and il team JAVA vengono coinvolti.
  • E questi trovano rapidamente il colpevole; si trattava di un SQL dinamico molto complesso che si era ‘comportato’ correttamente e con efficienza fino ad un certo istante della notte di Giovedì, e adesso performava in maniera disastrosa.
  •  I DBA riorganizzano le tabelle più consistenti fra quelle in gioco, nella speranza che le cose vadano meglio, ma, purtroppo, non ci sono risultati apprezzabili….

    Finalmente il DBA propone la creazione di un indice con relative RUNSTATS in produzione.

L’access path si modifica sostanzialmente e le performance tornano ad essere buone.
I server WebSphere vengono restartati e, gradualmente, si torna alla normalità.
Tutto questo processo é durato 2 giorni!
L’azienda deve rifasare la sua catena logistica e le consegne Just-In-Time, quindi il blocco ha avuto qualche seria ripercussione, ovviamente….  E questa è tutta la storia

 

Cosa è realmente avvenuto?

Il DBA, investigando sulle possibili cause del blocco, ha scoperto che ciò che realmente era accaduto, era che era andata in macchina una RUNSTATS solo su una piccola tabella che, fatta girare durante la notte di Giovedì in un momento inopportuno, aveva determinato un passaggio dell’access path allo stato di “pear-shaped”(aka Belly Up) per tutta la giornata di Venerdì e per metà del Sabato…

Come risolvere rapidamente e facilmente il problema?

Il DBA pensò ai modi in cui il problema, che avrebbe potuto riproporsi in futuro, potesse essere risolto in maniera facile e rapida. E qui comincia la mia parte della storia…

Questa azienda utilizza software proprio e possiede una licenza d’uso della componente Enterprise Statistics Distribution (ESD) del Bind ImpactExpert, che estrae, e opzionalmente converte, tutte i dati di catalogo di cui ha bisogno l’ottimizzatore per fare il proprio lavoro. Normalmente, I clienti utilizzano questo tool per copiare le statistiche di produzione in stile ‘sand box’, per vedere se un APAR o una migrazione di DB2 può causare problemi imprevisti. Per questo motivo, essi usano la componente Early-PreCheck del nostro tool Bind ImpactExpert per SQL statico e dinamico. Adesso facciamo un altro scenario, chiamato DSC (Dynamic Statement Cache) Protection, che dovrebbe realizzare quello che desidera il cliente, ma fa anche molto altro e, ovviamente, ha costi più elevati!

Ed è così che nacque l’idea di un tool ‘tascabile’ che abbiamo chiamato RUNSTATS Rescue. Ho sentito che chiedevate “Perchè si chiama PocketTool?”; la risposta è: “Perchè costa quanto il denaro che si tiene in tasca!” (aka Pin Money negliUSA o una “allowance” se preferite). Questi tools sono realmente molto economici!

Adesso, prima che smettiate di leggere questa newsletter e cominciate a lamentarvi del fatto che si tratti di una pura attività di marketing, per favore mettetevi in testa che quanto io descrivo in questo documento potrebbe anche essere stato scritto da voi e, quindi, avete solo bisogno di darmi credito sufficiente per condividere l’idea…

 

RUNSTATS Rescue

L’idea è usare l’EXPLAIN in qualunque modalità, shape or form, sia in SPUFI, che direttamente all’interno di qualsiasi monitor, semplicemente per ‘EXPLAINare’ l’SQL critico, e tenere a mente quale PLAN_TABLE owner state usando e, anche, il QUERYNO che avete appena usato.

Usando questi due inputs, RUNSTATS Rescue analizza l’output dell’ EXPLAIN per costruire una lista di control cards di estrazione e update per il nostro tool ESD, per tutte le tabelle usate e per I relative indici –anche se, ovviamente, non in uso! Infine viene anche generato un flusso di RUNSTATS DSC per tutti I tablespaces coinvolti nella query al fine di avere la certezza che la prossima volta che lo statement critico andrà in macchina, userà le corrette statistiche.

Ora, ovviamente, sorge la domanda: “Come faccio a sapere quail sono le statistiche da usare come Rescue statistics?”

La risposta è: “Quelle che c’erano prima prima che venisse eseguita una REORG con inline statistics o una RUNSTATS “. Questo è il punto chiave da tenere a mente: dovete semplicemente eseguire lo ESD extract prima che venga eseguita qualunque operazione di maintenance sul Database.
Molti siti hanno giorni, o weekends, in cui girano le procedure di maintenance, e non è certo un problema estrarre I dati e salvarli, per esempio,in un GENGROUP, al fine di facilitare l’individuazione di data e time in cui le statistiche erano ‘buone’ e poi, con il job di RUNSTATS Rescue, ripristinare molto rapidamente le statistiche necessarie. Questo consente al gruppo DBA di avere più tempo per capire cos’è realmente successo e apportare le giuste correzioni, quantomeno rimanendo a proprio agio.

10 Rescue Steps

  • 1. Selezionare il nuovo scenario di RUNSTATS Rescue
  • 2. Generare JCL
  • 3. Opzionalmente copiarle in GENGROUP dataset
  • 4. Inserire EXPLAIN TABLE-CREATOR e QUERYNO
  • 5. Lancio automatico del nostro browser di catalogo
  • 6. Drill down fino al livello degli indici
  • 7. Chiedere un “new” file name per le “rescue” statistics estratte
  • 8. Eseguire l’estrazione delle RUNSTATS Rescue
  • 9. Resettare le statistiche ed eseguire le RUNSTATS
  • 10. Le “Rescued” Statistics

 

Di seguito un esempio guidato di cosa avviene in realtà:

 

In basso nella schermata, si può osservare il nuovo scenario delle  RUNSTATS Rescue – Selezionandolo si attiva una finestrella con tre passi. Il primo deve essere eseguito solo una volta e va semplicemente inserito in un job produttivo esistente. Si raccomanda di farne il primo job della normale catena di maintenance del DB2 Database Maintenance.

1 – Select the new scenario RUNSTATS Rescue

 

2 – Generate some JCL

La prima opzione genera alcuni JCL come sotto illustrato:

 

3 – Opzionalmente copiare in un GENGROUP dataset

Esempio di step finale di copia su GENGROUP

 

4 – Insert the EXPLAIN TABLE-CREATOR and QUERYNO

Selezionare il secondo passo per preparare le RUNSTATS Rescue

5 – Lancio automatico della scansione di catalogo

Premendo “enter” si lancia il browser sul catalogo per consentire la vision degli oggetti usati dallo SQL…..

 

6 – Drill down to the Index level

…con I relativi indici, ovviamente!

 

7 – Ask “new” file name for the extracted “rescue” statistics

Uscendo con PF3 il tool chiede il dsname delle Production Statistics originali, come estratte dal job del primo passo, e un  “new” file name per le rescue statistics estratte:

 

 

8 – Perform the RUNSTATS Rescue extraction

Il JCL che esegue l’estrazione delle RUNSTATS Rescue, incluso il passo opzionale su saving su GDG, si presenta come sotto:

 

9 – Reset the statistics and executes the RUNSTATS

Infine, viene selezionato il terzo step, che azzera le statistiche ed esegue la RUNSTATS che provvede al flush della DSC

 

10 – The “Rescued” Statistics

Adesso, la prossima volta che lo statement apparirà, userà le “rescued” statistics e riprodurrà il vecchio Access Path.

 

Nel prossimo mese

Nel prossimo mese desidero espandere la problematica trattata con la possibilità di estenderne la soluzione allo SQL statico.

Il mese successive, scenderò nei dettagli della ‘DSC Protection scenario’ che ho citato prima. Non si tratta di un tool tascabile, ovviamente, ma è molto interessante!

Come sempre questions or comments sono benvenuti,

TTFN Roy Boxwell
Senior Software Architect

 

 

2015-03: DB2 z/OS object changes: Quiet Times for maintenance

Do you have an idea when tables are in use?

 

Ahhh! Wouldn’t it be great if we all had just quiet times? Sadly we never have time for anything these days, let alone for peace and quiet!

The quiet before the Storm?

What I mean by Quiet Times is, however, different: it is the time when a given table, or set of tables, is not in use. This is very interesting to find out, especially when you are doing data definition changes (DDL). For example: you are given the task of adding some columns to some tables – naturally these days you have no idea who or what is actually using the tables, and absolutely no idea *when* they are being used.

What do you do?

Well, all you can do is schedule the change for early one morning and then quickly push the ALTERs and the REORGs through – hoping not to collide with any users of the data.

 

Guessing when tables are in use can be dangerous

This is all a bit haphazard and dangerous! Wouldn’t it be better if you could look at a calendar and see that this table is only used Mo – Th from 09:00 – 16:00 thus giving you a really big hint that Friday morning is a better bet?

 

Capture your DB2 SQL Workload & project the results into a Calendar view

Using the new and enhanced IFCIDs in DB2 10 you can now do this! Capture your workload and analyze when table(s) are being used and project the results into a Calendar view:

News from the labs Newsletter 2015-03: Quiet Times

 

Gives this style Output:

News from the labs Newsletter 2015-03: Quiet Times

 

Handy huh?

Video (3 min.)  – Presentation

– You can drag the dates back and forth to validate the assumptions of a period of time, and then you can happily do your ALTERs and REORGs during the day.

– Apart from not having to get up early, the added bonus is that you get to learn more about who uses the tables!

Of course this system is *not* a crystal ball! It is just showing historical usage. Who knows what the future holds?

Would this style of output be useful for you? Could you imagine this helping you in your day-to-day tasks?

 

As usual any queries or criticism gladly accepted!

TTFN,

Roy Boxwell

2015-01 BIFCIDS – Where’s the BIF?

How will you deal with loop-hole usage in production code?

 

 

The IFCIDs 366 and 376

DB2 provides many and varied IFCIDs. But for today, I’m most interested in the 366 and 376. The 366 is available in DB2 10 and the 376 in DB2 11. Now I like to call these “BIFCIDs” because they are triggered whenever a BIF is used that will behave differently than it is currently used when moving to the next release of DB2. (It’s also triggered when changing Application Compatibility settings in DB2 11 and higher).

 

So where’s the BIF?

BIF Usage Video (11min:)       Presentation

Well, a BIF is a Built-In Function such as CHAR, DECIMAL, etc. There are hundreds of them these days. In the last few DB2 releases, IBM has changed a few to make DB2 more compatible with SQL standards. They have actually closed a couple of loop-holes, where “bad” data could be accepted and processed.

 

 

Loop-hole user?

What happens is: someone somewhere found this loop-hole and used it in production code. Now when you upgrade your DB2, this code will either fail or give erroneous results – which is never good. Hence IBM created the IFCID 366. This is output every time an SQL statement is PREPARED, or executed, that contains a candidate BIF. There were so many of these, that IBM introduced a sort of condensed version so it only triggered one for the first execution, or prepare, but sadly that IFCID—376—is only for DB2 11.

Where can, or will, this really hurt?

 

Looking into the documentation for these IFCIDs you will see a long list of when they are written:

***********************************************************************
**  IFCID 0366 is a serviceability trace.                            **
**  It can be used to identify applications that are affected        **
**  by incompatible changes.                                         **
**  The QW0366FN field indicates the type of incompatible Change:    **
**                                                                   **                                                      
**  QW0366FN = 1                                                     **
**  Indicates that the pre Version 10 CHAR built-in function has     **
**  been invoked. There is an incompatible change to the output of   **
**  the CHAR function for some decimal data. The zparm               **
**  BIF_COMPATIBILITY and/or the SYSCOMPAT_V9 schema have been used  **
**  by this application to get the old behavior. Please make the     **
**  appropriate changes and rebind with the SYSCURRENT schema to     **
**  use the Version 10 CHAR(decimal) built-in function.              **
**  (PM29124 V10 only, usermod V8/V9)                                **
**                                                                   **
**  QW0366FN = 2                                                     **
**  Indicates that the pre Version 10 VARCHAR built-in function or   **
**  CAST(decimal AS CHAR or VARCHAR) has been invoked.               **
**                                                                   **
**  QW0366FN = 3                                                     ** 
**  Indicates that an unsupported character representation of a      ** 
**  timestamp string was used. PM48741 V10 only.                     **
**                                                                   ** 
**  QW0366FN = 4                                                     ** 
**  A QW0366FN 4 record indicates that the statement uses the        **     
**  word ARRAY_EXISTS as an unqualified user-defined function Name   **   
**  in a context that may be incompatible with Version 11.           **     
**                                                                   **
**  QW0366FN = 5                                                     ** 
**  A QW0366FN 5 record indicates that the statement uses the        **
**  word CUBE as an unqualified user-defined function Name           **
**  in a context that may be incompatible with Version 11.           **
**                                                                   **
**  QW0366FN = 6                                                     **
**  A QW0366FN 6 record indicates that the statement uses the        **
**  word ROLLUP as an unqualified user-defined function Name         **
**  in a context that may be incompatible with Version 11.           **
**                                                                   ** 
**  QW0366FN = 7                                                     **
**  A QW0366FN 7 record indicates that DB2 for z/OS server issued    **
**  a SQLCODE -301 for incompatible data type conversion from        **
**  string data type (e.g. CHAR, VARCHAR, GRAPHIC, VARGRAPHIC        **
**  etc.) to numeric data type in V10 CM mode when implicit          **
**  cast is not supported or V10 NFM mode when DDF_COMPATIBILITY     **
**  zparm is set to DISABLE_IMPCAST_NJV or SP_PARMS_NJV to           **
**  disable implicit cast, and the client is CLI Driver              **
**  or v11 NFM mode & APPLCOMPAT = V10R1 when DDF_COMPATIBILITY      **
**  is set to SP_PARMS_NJV or DISABLE_IMPCAST_NJV to disable         **
**  implicit cast either from string data type to numeric or         ** 
**  from numeric data type to string data type.                      **
**                                                                   **
**  QW0366FN = 8                                                     **
**  A QW0366FN 8 record indicates that DB2 for z/OS server           **
**  returned output data match the data types of the                 **
**  corresponding CALL statement arguments when DDF_COMPATIBILITY    **
**  zparm is set to SP_PARMS_NJV.                                    **
**                                                                   **
**  QW0366FN = 9                                                     **
**  A QW0366FN 9 record indicates a data type conversion from        **
**  a TIMESTAMP WITH TIME ZONE input to a TIMESTAMP data             **
**  during input host variable bind-in process on server when        **
**  DDF_COMPATIBILITY zparm is set to IGNORE_TZ to ignore the        **
**  time zone information sent by Java IBM Data Server Driver.       **
**                                                                   **
**  QW0366FN = 10                                                    ** 
**  RTRIM, LTRIM or STRIP version 9 being used with mixed data       **
**                                                                   **
**  QW0366FN = 1101                                                  ** 
**  Indicates that the INSERT statement that inserts into an XML     **
**  column without XMLDOCUMENT function has been processed (which    **
**  should result in SQLCODE -20345 when run on DB2 release prior    **
**  to V11). Starting with V11, SQL error will no longer be issued.  **
**  Application will no longer recieve SQLCODE for this Statement.   **
**                                                                   ** 
**  QW0366FN = 1102                                                  **
**  Indicates that V10 XPath evaluation behavior was in effect which **
**  resulted in an error. For instance, a data type conversion error **
**  could have occured for a predicate that would otherwise be       **
**  evaluated to false. Starting from V11, such "irrelevant" Errors  **
**  might be suppressed so an application might no longer recieve    **
**  the SQLCODE for this Statement.                                  **
**                                                                   **
**  QW0366FN = 1103                                                  **
**  Indicates that a dynamic SQL uses the ASUTime limit that has     **
**  been set for the entire thread for RLF reactive governing.       **
**  For instance, when a dynamic SQL is processed from package A,    **
** if the ASUTime limit is already set during other dynamic SQL      ** 
** processing from package B in the same thread, the SQL from        **
** package A will use the ASUTime limit set during the SQL           **
** processing from package B. Stating with v11, dynamic SQLs from    **
** multiple packages will use the ASUTime limit that is set          **
** considering its own package information.                          **
**                                                                   **
** QW0366FN = 1104, 1105, 1106, 1107                                 **
** Indicates that CLIENT special register (CLIENT_USERID,            **
** CLIENT_WRKSTNNAME, CLIENT_APPLNAME, CLIENT_ACCTNG) has been set   **
** to a value that is longer than what is supported prior to V11.    **
** A shorter value has been used instead.                            **
**                                                                   **
** QW0366FN = 1108                                                   **
** Indicates that CLIENT special register (CLIENT_USERID,            **
** CLIENT_WRKSTNNAME, CLIENT_APPLNAME, CLIENT_ACCTNG) has been set   **
** to a value that is longer than what is supported prior to V11.    **
** Truncated values upto the supported lengths prior to v11 have     **
** been used for RLF table search instead.                           **
**                                                                   **
** QW0366FN = 1109                                                   **
** Indicates that CAST(string AS TIMESTAMP) was processed for the    **
** input string of length 8 and input was treated as a store clock   **
** value (or input string was of length 13 and was treated as a      **
** GENERATE_UNIQUE value). This behavior is incorrect for a CAST     **
** and is valid for TIMESTAMP built-in function only. This behavior  **
** is being corrected in DB2 11 so that input to CAST is not         **
** treated as a store clock value nor GENERATE_UNIQUE.               **
**                                                                   **
** QW0366FN = 1110                                                   **
** Indicates the integer argument of SPACE function is greater       **
** than 32764.                                                       **
**                                                                   **
** QW0366FN = 1111                                                   **
** Indicates the optional integer argument of VARCHAR function       **
** has a value greater than 32764. *                                 **
***********************************************************************

 

Useful stuff indeed!

Phew! Not a bad list, huh? Now you see why these IFCIDs are so useful. It could well be, that you have none of these “alive” in your system today. Or, of course, it could be that you get millions of the things! Somehow you will have to work out a way to save the data, analyse it to get to the root cause, and then, finally, fix the problem(s).

 

Saved by APPLCOMPAT?

You could argue that the new DB2 11 parameter Application Compatibility will save you, but this is really a false economy. All it enables is the guarantee that the code will still “run”. However, in two more DB2 releases the code will fail and, in two more releases – so about six years – who will even know *how* to change which piece of source code and, perhaps even, where is that source code?

 

Saved by BIFCIDs

Personally, what I would do, is : to run our SQL WorkloadExpert tool to trap all the required [B]IFCIDs for a few hours (at first!).Then I would analyse the results, fix the code where it needs fixing – and repeat! I would keep doing this until no IFCID records are coming out and I would be set!

 

What is even better, is that our SQL WorkloadExpert will work correctly even when any new QW0366FN values appear – so when IBM decides to add another code (Like the new values 9 and 10 above for example) this BIF Usage still works correctly.

 

Of course, you may have another tool that you use at your site.

Can it see “Where’s the BIF?”

How will you deal with loop-hole usage in production code?

 

As usual, any question or comments gladly welcome!

 

TTFN

Roy Boxwell