2013-04: DSC Hit Ratio – Overrated metric???

 

Dynamic SQL Cache (DSC) settings

I have recently been to a bunch of shops where everyone was very happy with their Dynamic SQL Cache (DSC) settings. They all said “We have statements for up to four days in the cache and a 99.2% hit ratio so everything is green!”

They were so wrong….

The problem is, that one statement in the DSC that is executed 16,000,000 times, gives a HUGE false positive on the hit ratio, and *one* really old statement does not automatically imply that *all* statements are so long lived in the DSC.

 

What I have found is that the DSC flush rate per hour is a much much better metric to see if your DSC is thrashing itself to death. If you have 10,000 statements or more being flushed out of the cache, then your cache is not too healthy!

At the sites where I looked, the peak was 25,000 statements an hour being flushed! This is seriously bad when the DBAs were sure that everything “is green”…

 

So how do you work out your Flush/hour ???

Well, you could always buy and install the SQL WorkloadExpert that does the calculations for you, or you could run this set of SQL to also get nearly the same style of data!

 

First,

do an EXPLAIN STMTCACHE ALL on one of the members of your data-sharing groups or in your non-datasharing DB2 (Remember that this command only extracts the local data!!!)

 EXPLAIN STMTCACHE ALL;
---------+---------+---------+---------+---------+---------
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 0
---------+---------+---------+---------+---------+---------

 Then,

execute the following SQL:

SELECT COUNT(*) AS NUM_STMTS

  , INTEGER((MAX(STMT_ID) - MIN(STMT_ID) + 1 - COUNT(*)) /
  (
  TIMESTAMPDIFF ( 2 , CAST(
  MAX(CACHED_TS) - MIN(CACHED_TS) AS CHAR(22)
  )) / 3.6E+03
  ) )
  AS FLUSH_PER_HOUR
  , MAX(STMT_ID) - MIN(STMT_ID) + 1 - COUNT(*) AS FLUSHED_STMTS
  , MIN(STMT_ID) AS MIN_STMT
  , MAX(STMT_ID) AS MAX_STMT
  , MIN(CACHED_TS) AS MIN_STMT_TS
  , MAX(CACHED_TS) AS MAX_STMT_TS
  , DECIMAL(
    TIMESTAMPDIFF ( 2 , CAST(
      MAX(CACHED_TS) - MIN(CACHED_TS) AS CHAR(22)
                           )) / 3.6E+03
           , 12 , 3 ) AS MAX_AGE_OF_CACHE
FROM <userid>.DSN_STATEMENT_CACHE_TABLE
      ;

 

I use here the TIMESTAMPDIFF built-in function (BIF) with 2 to get the seconds, and then divide by 3600 to get back to hours.

The output will look like:

 ---------+---------+---------+---------+---------+---------+---------
NUM_STMTS   FLUSH_PER_HOUR   FLUSHED_STMTS    MIN_STMT     MAX_STMT
---------+---------+---------+---------+---------+---------+---------
     69                4               200        7750         8018

---------+---------+---------+---------+---------+---------+---------+-
MIN_STMT_TS                  MAX_STMT_TS               MAX_AGE_OF_CACHE
---------+---------+---------+---------+---------+---------+---------+-
2013-04-24-09.49.55.712231   2013-04-26-09.52.46.758709          48.047

 

Here, on one of our small test systems you can see we have a very low use and flush rate, however, I am sure that you all will be surprised when you run this in your pre or production DB2s!

 

Feel free to send me your comments and ask questions.

TTFN,
Roy Boxwell
Senior Software Architect.

2013-05: Access Path Changes During a DB2 10 Skip-Level Migration

Lately I have been asked a lot about skip level migration and so, even though I was surprised that people are *still* on DB2 V8 (Even the nails have rusted on the perch…), I have dug out this “old” marketing/whitepaper below as I thought it could actually make a nice newsletter as a final last gasp info output for DB2 V8…

Obviously this is not written in the normal “Roy” style as it is primarily a marketing paper but there is still good data in it – and next month we will be back in the land of technical newsletters – I promise!

So here it is DB2 10 Skip Level Migration:

 

DB2 10 Skip Level Migration
A speedier way to DB2 10 

For the second time in the life cycle of DB2 z/OS, IBM supports a version jump for the migration – now from version 8 to version 10. It was also possible to leave out a version ten years ago, namely the jump from version 5 to version 7. The direct jump from DB2 version 8 to DB2 version 10 is referred to as a „Skip-Level Migration„.

During the DB2 V10 beta program, this function was available to selected customers; and it has been intensively tested and is a great option for DB2 customers that are currently on DB2 version 8 (I sincerely hope that is not many!). If a site plans the skip migration ahead and prepares it wisely, the migration from V8 to DB2 10 can be as smooth as other migration scenarios. IBM estimates that the project duration for the migration should take around one and one-half times as long as it would take for a simple DB2 migration. However, a skip-level migration carries a much greater risk because software updates for two versions will be simultaneously implemented.

 

General recommendations

This newsletter describes some special aspects to be considered when performing a skip-level migration and rounds out the information provided by IBM.

IBM’s best practices contain a customer recommendation for detailed migration planning as well as an extensive test strategy.

It is especially important to ascertain the current service levels of all applications when conducting this special type of migration. The measures described in this newsletter should support this.

 

 

Test & review RUNSTATS

Regarding applications, IBM’s best practices strongly recommend that customers test and review their RUNSTATS procedures in order to guarantee that the DB2 Optimizer can choose efficient access paths by using all of the required information in the DB2 catalog (including the new stats in 9).

 

REBIND

Version 10 – as well as Version 9 – contains a number of Optimizer enhancements that can/will drastically improve performance, which is realized after the migration by REBINDing all packages (if static SQL is available). For dynamic SQL, the new Optimizer code will be used instantly. For several reasons, customers should plan to REBIND all static application packages as part of the skip-level migration. The changes of the static access paths can be cross checked based on the EXPLAIN information in the PLAN_TABLEs. Dynamic SQL access paths can be compared using the dynamic statement cache (DSC) as a base for an EXPLAIN.

 

Access Path Changes Due to Optimizer Enhancements

Part of the DB2 migration is the very time-consuming process to perform the necessary checks and validations for all of the current access paths. With skip-level migration, additional attention to the changed Optimizer behavior is strongly recommended.
Unfortunately, the access path changes caused by a migration are not predictable in advance. Or are they?

 

The Early Precheck

Each access path depends upon the rules and algorithms of the Optimizer and the DB2 version-dependent code.
By comparing all version 8 access paths with the resulting access paths of version 10, the tool Bind ImpactExpert (BIX) identifies the changes necessary for the new Optimizer in advance.

DB2 applies several version-specific optimizations to the internal executable code of an access path. Thus, it is important to reactivate those performance options – fast column processing a.k.a. SPROCS (selects) and UPROCS (updates) etc. – because the options get “lost” during a migration without REBINDs.
One important thing to mention here: IBM automatically REBINDs all packages that are from version 5 and older. To stay in control of the access paths during the AUTO-REBIND: BIX-IT! 
Bind ImpactExpert (BIX) safely ensured performance stability during migrations from V7 to V8 and from V8 to V9. According to individual service levels of applications being affected by the migration, the tool prechecks the whole environment or just a subset that requires special attention. At a number of customer installations, so-called Optimizer “patterns” were determined (patterns classify different types of access path changes).

Some examples:

– Index accesses change to table space scans for small tables
– Non-matching index scans change to table space scans for large indexes

The functional changes to RUNSTATS utilities in DB2 9 will also have an impact on the access paths since catalog statistics will change after executing the new RUNSTATS.

Version 8 also extended the functionality of the RUNSTATS utility. Adapting your RUNSTATS strategies accordingly can correct access path degradations that are discovered.

 

Preparing Suitable Catalog Statistics Prior to Migration

When preparing for a migration, special attention should be paid to your RUNSTATS procedures.

On the one hand, the Optimizer requires detailed statistics in order to choose (more) efficient access paths. On the other hand, after the migration, it is necessary to REBIND all packages in order to activate the new DB2 10 Optimizer code. Due to the amount of time required to execute RUNSTATS for all objects, it is not advisable to perform this on the day of the migration when entering DB2 conversion mode (CM). A RUNSTATS instantly invalidates the access paths in a DSC for dynamic SQL.

 

What is the best way to prepare the migration with RUNSTATS without risking the DSC?

Bind ImpactExpert’s DSC Protection allows the execution of RUNSTATS for all objects under DB2 Version 8 without any risk by preventing unwanted invalidation of the DSC. This means there will be no uncontrolled access path changes for dynamic SQL during the preparation phase. With static SQL, the Optimizer determines (new) access paths the REBIND is executed.

 

What is a practical method to compare all V8 and DB2 10 access paths?

Bind ImpactExpert  categorizes each SQL statement into:

  •  Same access paths (or unimportant changes)
  •  Improved access paths
  •  Degraded access paths
  •  Changed access paths (not automatically rated)

 

Is your RUNSTATS strategy under version 8 prepared for the new challenges of the DB2 Optimizer under Version 10?

Comparing access paths with Bind ImpactExpert before the migration will help with the preparations.

 

How do Optimizer patterns specifically influence your applications in your DB2 system?

This too is analyzed and answered by Bind ImpactExpert.

Enhancements to Bind ImpactExpert support skip-level migration. It provides extended rules and standards as well as results categories.

Using Bind ImpactExpert, the first analysis of the access path methods between versions 8 and 10 show results similar to those found in the past. However, the percentage of changes made during a skip-level migration is much higher.

During a “regular” migration, only 5% to 30% of all statements (dynamic and static) have access path changes. This number can double during skip-level migration.

 

Using APCOMPARE/APREUSE in DB2 10

There are two new REBIND options in DB2 10 that allow you to compare and reuse access paths (APCOMPARE and APREUSE).

These options suppress a REBIND execution in the event of an access path change. However, unless the packages were bound or rebound in DB2 9 NF, these options do not support skip-level migration. Using them later (like in your daily maintenance procedure) is also highly controversial because they only know that access paths have “changed”; thereby suppressing REBINDs for access path improvements as well.

 

Using Package and Access Path Stability after the Migration

Version 9 introduced Plan Management, which is also known as either package or access path stability. This is available as a REBIND PACKAGE parameter and as a ZPARM. In DB2 10, an enhancement to this function provides you with the option to save the old version 8 access paths when performing the REBIND. If required, an access path can be rolled back to the old access path. PLANMGMT(EXTENDED), for example, has 3 access path versions (CURRENT, PREVIOUS, and ORIGINAL).

Bind ImpactExpert conveniently includes Plan Management in the skip-level migration so a reversal of degraded access paths is possible at any time.

It reduces the overhead associated with this feature by up to 90% by limiting its use to those access paths that have degraded. For complete details about Plan Management, refer to SOFTWARE ENGINEERING’s white paper title “Package Stability“ – A great feature with some weak points”.

This Bind ImpactExpert function – which was so useful during migrations from V8 to DB2 9 – is also usable for skip-level migration.

 

Summary

DB2 10 offers significant performance improvements. To get the most out of DB2 10 improvements, global REBINDs are a must. With Bind ImpactExpert, you can predict access paths before you migrate and avoid surprises in a production DB2 10 environment.

As usual any questions or comments are welcome,
TTFN Roy Boxwell
Senior Software Architect

2013-06: Do you have a space problem in DB2?

I have recently been involved with trialing and testing our space monitor software on DB2 z/OS.

It was originally designed many moons ago to monitor the size of the secondary extents that DB2 was using and to issue dynamically on-the-fly an ALTER SECQTY to guarantee that the maximum size of the object was reached *before* you ran out of extents.

Now you might be wondering “What’s that got to do with the price of beef?”

Because, as we all know, DB2 V8 introduced a “sliding scale” to the size of the secondary extents so that it could also guarantee that a dataset hit its maximum size *before* you ran out of extents.

So what’s interesting here?
Simply put – both solutions are great until you actually hit the wall!
When you get to the maximum number of datasets – Who is gonna help you??? If it is 01:00 on Saturday morning and your critical production tablespace has got its maximum of 32 datasets – what are you going to do??? Could you actually get the REORG through before prime time starts on Monday morning?? Or what happens when partition 26 completely fills up?

IFCIDs to the rescue!

This is where our tool SAX (Space Assurance Expert) comes in handy. It is a STC that runs 24×7 catching the IFCIDs that DB2 spits out whenever a dataset issues a request for an extent. What it then does is the “clever” bit if you like! Using the DB2 Catalog SAX determines the exact make-up of the object being extended and can use two levels of warning percentages to start triggering alarm bells way, way before it all goes pear-shaped!
Here is my little “ready-reckoner” for Linear Dataset Allocations :

Object type: TABLESPACE      ! Maximum number of data sets 
-----------------------------+---------------------------- 
LOB tablespaces              ! 254 
-----------------------------+---------------------------- 
Non-partitioned tablespaces  ! 32 
-----------------------------+---------------------------- 
Partitioned tablespaces      ! 1 (Percent used check) 
-----------------------------+---------------------------- 
Partitioned By Growth        ! MAXPARTITIONS. LPS check if 
tablespaces                  ! more than one. If on last 
                             ! partition then percent used.
-----------------------------+---------------------------- 

Object type: INDEX ! Maximum number of data sets 
-----------------------------+---------------------------- 
Non-partitioned indexes on   ! MIN ( 4096 , 2 power 32 / 
tablespace with LARGE,       ! ( DSSIZE / TS PGSIZE)) 
DSSIZE, or more than 64      ! Eg: 128 GB DSSIZE with 
Partitions                   ! 8 KB Tablespace Page 
                             ! gives 256 Pieces (datasets) 
                             ! Or 4 GB DSSIZE with 
                             ! 4 KB Tablespace Page 
                             ! gives 4096 Pieces (datasets)
-----------------------------+---------------------------- 
Non-partitioned indexes      ! 32 
otherwise                    ! 
-----------------------------+---------------------------- 
Partitioned indexes          ! 1 (Percent used check) 
-----------------------------+----------------------------

Understand PBG space definitions

Here you can see that it is not as easy as it used to be and you must also make sure you understand PBG space definitions. We allow two percentages and use them in two different ways
1) The number of datasets that have been allocated
2) The used space with a linear dataset
The second is also used if it is a PBG with MAXPARTITIONS 1 (e.g. The DB2 Catalog) or if the partition being extended is the last allowable partition.

These warnings are issued as WTOs and can easily be picked up by system automation tools to open job tickets or send e-mails to alert DBAs days or weeks before the system stops working.

Watch the number of extents

What I have also seen is that the number of extents is sometimes getting very large indeed! One customer had numerous datasets with over 4,000 extents! Now we all know that no-one knows where data is really stored on the modern disk sub-systems but still… I would schedule a reorg at say 1,000 extents! The number of extents changed a *long* time ago in z/OS 1.7 to raise it from 255 to 7,257 spread over 59 volumes *but* still limited to 123 extents per volume. This little nugget of information is *very* important if you are thinking of going down the “one huge EAV volume for all my data” road (These disks can have up to 262,668 cylinders or about 223GB!) as the extents per volume is still there!

In comparison the good old MOD-3s had 3,339 cylinders and 3GB of space.

So what I want to tell you this month is:

1) Check your Linear Dataset Allocations – Are you banging your head on the wall yet?

2) Implement a methodology to warn you in advance!
 Buying our software is the smart route of course!

3) Talk with your storage personnel about space and space management on a regular
basis
.

Finally, our SAX also checks and alerts if your SMS storage groups start getting full.
This is especially handy for your DB2 10 NF Catalog, Copy Pools and Work Storage groups.

As usual any questions or comments are welcome,
TTFN Roy Boxwell
Senior Software Architect

2013-07: Death by Index Probe (DB2 10)

Now this might sound like a James Bond style bad guy’s death threat but actually this has happened at a shop I was at, and so I thought I would share the grisly details with you all!

In DB2 10 IBM increased the intelligence of the Optimizer again and so they introduced the dreaded “index probe”. It is a kind of double check – Is the data really empty?

 

Index probing is used in three distinct circumstances in DB2 10 any mode:

First is:
1) RUNSTATS reported an empty table
2) RUNSTATS reported empty qualifying parts
3) Catalog statistics are at default
4) A matching predicate is estimated to return zero rows

Second is:
1) Table has the VOLATILE attribute

Third is:
1) Table has fewer than ZPARM NPGTHRSH pages

Now the last two are basically the same – Don’t trust the RUNSTATS as it might have been run at an inopportune time, e.g. when the table was empty, so checking the index to see if there is data there really does make sense.

 

A quick solution in the past – A killer at present

Now what happened at our customer site was Dynamic SQL PREPAREs started taking up to 30 seconds for some SQL. What was weird was that the SQL was against many partitioned objects (Normally more than 250 and most actually at 999!) using the full partitioned key. The customer checked the statistics – everything fine, REORGed the DB2 Catalog – just in case! No change and then they got the answer from IBM – Remove the VOLATILE keyword!

Now we go back in time … When the customer migrated to DB2 V8 many, many moons ago the access to their, then only 254 partitions, partioned tables got much worse and IBM investigated and said “use the VOLATILE keyword”, and hey Presto! It worked! It also worked for DB2 9 but is a disaster in DB2 10. By the way – the 999 partitions were done “for future growth” and most of the partitions are, of course, empty.
IBM has said that a corrective PTF will come out in September.

 

Check the access path

Of course simply ALTERing the table to be NOT VOLATILE stopped the disastrous index probe of 999 datasets *but* it also, naturally enough, changed the access paths as well! The customer must now compare all the DB2 9 and DB2 10 dynamic SQL access paths to see what will now go pear shaped… Luckily they have our Bind ImpactExpert with   Early Precheck Dynamic and Static so the task is simple to do but someone must still do it and then analyze the results!
Here’s a little query to show you if *you* have any of these little darlings waiting to go BOOM at your shop…

SELECT SUBSTR(STRIP(A.CREATOR) CONCAT '.' 
       CONCAT STRIP(A.NAME) , 1 , 32 ) AS TABLE
     , MAX(B.PARTITION) AS MAX_PARTS 
FROM SYSIBM.SYSTABLES    A 
,SYSIBM.SYSTABLEPART     B 
WHERE A.SPLIT_ROWS = 'Y' 
AND A.TYPE         = 'T' 
AND A.DBNAME       = B.DBNAME 
AND A.TSNAME       = B.TSNAME 
AND B.PARTITION    > 200 
GROUP BY A.CREATOR, A.NAME 
ORDER BY 1 
;

 

As usual any questions or comments are welcome,
TTFN Roy Boxwell
Senior Software Architect

2013-11: FREE DB2 HealthCheck

 

This month, as a thank you to everyone on the News from the Labs „nice“ list, we have a seasonal freebie!

Our DB2 HealthCheck package pinpoints weaknesses and opportunities for improvements and optimization in DB2 z/OS systems.
Just click this link, fill in and select „Healthcheck Freeware“ down the page.
My support colleagues will ship our Licensed Freeware edition of our HealthCheck series or DB2 z/OS.

 

PerformanceHealthCheck for DB2 z/OS (PHC)

PerformanceHealthCheck for DB2 z/OS (PHC) checks your DB2 subsystem for a range of problems and lists out what it finds including the latest enhancement – the Coupling Facility checker.

I read on listserv about people with “Coupling Facilities under stress” and so I added some CF checks. It checks the six important values in your CF. The Level of the microcode, the transfer time, the number of rejects, the false contention percentage, the subchannel busy percentage and finally the all paths busy count. From these KPIs you can see if your CF is “under stress” or not! Now to get all this juicy data the LOAD library *must* be APF authorized of course!

Remember that the normal Performance HealthCheck still runs fine without being APF auth’d just the CF check must be.

 

Statistics HealthCheck for DB2 z/OS (SHC)

Along with PHC comes Statistics HealthCheck for DB2 z/OS (SHC), which lots of you may already be familiar with. It allows you to analyze and assess a complete DB2 subsystem down to a single database and tell you what is ʺwrongʺ or inconsistent with your DB2 catalog statistics.

This enables you to determine any problems before they get bad and to improve performance by providing the DB2 Optimizer with better information from which it can base its cost estimate on. It fully supports and is compliant for DB2 10. It is a perennial favorite and you cannot run it enough. Especially when you’re going to migrate to a new DB2 version, this software is a must to protect yourself from strange optimizer behavior.

 

The binaries come with the products documentation with a full feature overview that summarizes what our PHC can do for you!
As usual any  questions or comments are welcome,
TTFN Roy Boxwell
Senior Software Architect

2013-09: Roy’s first DB2 11 review

Well now I can finally talk about DB2 11 for the first time …

We have had it for over a year now and have been in the ESP from the beginning. In this newsletter I want to give you a „first look“ at what I think is cool, great or odd!

 

DB2 11 – here’s my personal list of *brilliant* down to *why*?

Brilliant
Great
Useful
Why?

 

Brilliant

REORG TP with NPSIs got a nice boost in performance and the new ability to do TP copies during a DSNUM ALL REORG is *very* good. REORG can now do much better and faster inline statistics including histograms. The switch phase got a massive boost as well. Mapping tables also became history in this release! You can finally drop ‘em all from your system. DB2 will create and use them on the fly. In NFM you *must* use the new format mapping table so you might as well let DB2 do the work for you!
RUNSTATS now works *without* a profile – Hooray! Terry Purcell can now sleep better at night but the update of the RTS TOTALROWS and TOTALENTRIES is *still* only done if you run with SHRLEVEL REFERENCE – and how many of you do that?? A new option RESET ACCESSPATH finally gives you the chance to *remove* all access path data in the catalog for the given objects. This then enables a new RUNSTATS to collect only the information you actually need/want. ZiiP offload has been greatly increased as well.
For the Optimizer there is Index Skipping to enable faster duplicate discarding processing, a new In Memory Data Cache (IMDC) to enable better run time optimizer choice depending on the current amount of free memory (This memory cache value (MXDTCACH) is default 20MB and can be increased) and a bunch of new query rewrite techniques to help „automatic bad SQL generators“ get a chance of index access. E.g. WHERE YEAR(date_column) = 2013 is rewritten to WHERE date_column BETWEEN ‚2013-01-01‘ AND ‚2013-12-31‘. Finally OR and IN list with NULLs gets a rewrite boost as well so WHERE COLUMN IN (‚A‘, ‚Y‘) OR COLUMN IS NULL is simply rewritten to COLUMN IN (‚A‘, ‚Y‘, NULL). All these changes allow index usage where it was not possible before.

The optimizer team also delivered a seriously good boost for embedded CASE structures that used to be anathema to DBAs:

SELECT COL2, COL3
FROM TAB1
WHERE COL1 = CASE (CAST(? AS INT))
             WHEN 1 THEN 'R'
             WHEN 2 THEN 'O'
             ELSE 'Y'
             END
;

SELECT TAB2.COL1, TAB1.COL2, TAB2.COL2
FROM TAB1, TAB2
WHERE TAB2.COL1 = CASE WHEN TAB1.COL1 = 'Y'
                  THEN TAB1.COL2
                  ELSE TAB1.COL3
                  END
;

Both of these used to be stage2 but now can actually use indexes!

 

Great

Expanded LRSN from six to ten bytes. DB2’s six byte RBA meant it could store 2 to the power of 48 bytes of data on the Log or 256TB of data. Back in the late 80’s this was thought to be „more than enough“ … So now we have 2 to the power of 80 or 1YB. Yep, that’s a Yottabyte of Log! This is optional but from NFM on DB2 will internally always store ten bytes – the externalized data is dependent on whether you do the migration or not. However, it is highly recommended as you will get a small performance hit to convert down to six bytes. The other boost this gives is a reduction in spins waiting for a unique LRSN in data sharing which – with the faster and faster hardware – was becoming a bottleneck.

DSN1COPY – The first time a dataset is opened for normal business after a DSN1COPY, some basic „sanity“ checks are run to try and flag „bad“ DSN1COPY runs before too much work, hopefully any, has actually been done. This is very handy as the DSN1COPY is notoriously difficult with its parameters and input control cards.

AUTOSIZE on Bufferpools gets a wonderful fix to stop „runaway“ autonomic mismanagement. In other words, you can set upper and lower bounds which will *not* be exceeded – very nice!

Pseudo-empty index page cleanup daemon – A very nice „set it and forget it“ feature that actually removes completely empty pages from indexes, thus reducing the need for REORG INDEX and boosting performance.

 

Useful

Global Variables – Pretty handy feature in DB2 11 to easily pass data along a chain of programs. A bit like a SPA in good old IMS!

JVM only being started once and using 64 Bit storage is a major winner for most shops as nearly everyone uses JAVA these days and sometimes even on Big Iron.

Archive Tables – Similar to „time travel“ tables from DB2 10 but looked upon as an „extension of the current table“. This is very handy indeed when all you want is a bucket to throw any DELETEs into. Plus you can use this bucket as input to be added to the current table or not – Very nice feature this!

DROP COLUMN – I will be using this as I have *always* : created a copy table, cross loaded it, dropped the original table, created the new table without the column, cross loaded back to it, created any dependent views, GRANT access back. Reorg, Runstats, Rebind – Done *phew*. Now you can simply Drop the column and then schedule a reorg and you are also done!

Autonomous procedures – This gives the chance to COMMIT any work done *just* in the stored procedure. The main task can still ROLLBACK but that will not affect the data COMMITted in the proc. This could be very handy or a complete disaster depending on your business logic!

Faster recovery of directory data because certain directory tables/indexes update RBA/LRSNs are now stored in the SYSLGRNX. This is a very good feature for recovery of catalog objects of course!

 

Why?

GROUP BY multiple columns and aggregates has been added – now here is another new SQL feature, like OLAP queries, that nearly no-one will ever use!

Enforce NUMTCB=1 for stored procedures … this is an enhancement?!?!?

And last but not least

Finally my personal little favorite – LIKE_BLANK_INSIGNIFICANT is a new ZPARM – it makes the rules for LIKE of static and varchar get a little bit more aligned. This is, of course, a change in behavior and you must make sure that your programs react as you still think they should!

 

That’s all for now –next time I will dive into details on some of the above topics.
If anyone has any wishes please email me and I will see what I can do!

As usual any questions or comments are welcome,
TTFN Roy Boxwell
Senior Software Architect

2013-08: Inline LOBs

In DB2 10 you can now define LOBs to also be „inline“.

This small feature is actually fantastic for performance and general use and I urge all DB2 users to evaluate using them!

What they enable is the use of just the „base“ table space and not the aux (LOB) space at all! This is a very good thing as LOB access can(!) sometimes be painfully slow and cause bottlenecks in the processing.

 

In the DB2 10 Performance guide Chapter 4.3 there is a nice list of benefits:

Inline LOBs offer the following performance advantages over LOBs that are stored in auxiliary tables (sometimes called outline LOBs):

  • Disk space savings because two LOBs cannot share a page on a LOB table space
  • Disk space savings because the inline portion of a LOB can be compressed
  • Synchronous I/Os to the AUX index and LOB table space are avoided
  • CPU savings associated with accessing the AUX index and LOB table space
  • Sequential and dynamic prefetch I/O for LOBs
  • Improved effectiveness of FETCH CONTINUE when scanning rows
  • Index on expression can be enabled for LOB data

Note the LAST one which I think is actually one of the best reasons!!!

 

Obviously there are a few „considerations“:

  1. Check how long your LOBs are now and try to size for the 80 / 20 rule
  2. Remember that the Page size and the related BP size will need to be adjusted and tuned
  3. If you rarely use the LOB column then don’t bother!
  4. If going from „old“ style space to an inline LOB with compression then you must do a sort of triple jump:

a. ALTER to get a UTS and also in RRF format – REORG to action
b. ALTER to get INLINE LOB usage – REORG to action
c. Now REORG it again to actually get COMPRESSION as that is not done by the earlier REORGs!

 

An example SQL from the DB2 Performance chapter 4.3

Here’s an example SQL that I took from the DB2 Performance chapter 4.3 and changed a little to report  more info:

WITH LOB_DIST_TABLE (LOB_LENGTH                    
                    ,LOB_COUNT)                           
AS (                                               
    SELECT LOBCOL_LENGTH                                 
            ,COUNT(*)                                       
    FROM (                                           
    SELECT ((LENGTH(STATEMENT) / 1000) + 1) * 1000 
            AS LOBCOL_LENGTH                     
            FROM SYSIBM.SYSPACKSTMT                     
            ) LOB_COL_LENGTH_TABLE                      
    GROUP BY LOBCOL_LENGTH                               
   )                                                     
SELECT '01000' AS SIZE                                   
       ,COALESCE(SUM(LOB_COUNT) , 0) AS COUNT                 
       ,(COALESCE(SUM(LOB_COUNT) , 0) * 100.00) /               
        (SELECT SUM(LOB_COUNT) FROM LOB_DIST_TABLE) AS PERCENTAGE
FROM LOB_DIST_TABLE                                             
WHERE LOB_LENGTH <= 1000                                       
UNION ALL                                                      
SELECT '02000' AS SIZE                                         
,COALESCE(SUM(LOB_COUNT) , 0) AS COUNT                   
,(COALESCE(SUM(LOB_COUNT) , 0) * 100.00) /               
(SELECT SUM(LOB_COUNT) FROM LOB_DIST_TABLE) AS PERCENTAGE

FROM LOB_DIST_TABLE                                            
WHERE LOB_LENGTH <= 2000                                       
UNION ALL                                                      
SELECT '04000' AS SIZE                                         
       ,COALESCE(SUM(LOB_COUNT) , 0) AS COUNT                   
       ,(COALESCE(SUM(LOB_COUNT) , 0) * 100.00) /               
        (SELECT SUM(LOB_COUNT) FROM LOB_DIST_TABLE) AS PERCENTAGE
FROM LOB_DIST_TABLE                                            
WHERE LOB_LENGTH <= 4000                                        
UNION ALL                                                      
SELECT '08000' AS SIZE                                         
,COALESCE(SUM(LOB_COUNT) , 0) AS COUNT                   
,(COALESCE(SUM(LOB_COUNT) , 0) * 100.00) /                
(SELECT SUM(LOB_COUNT) FROM LOB_DIST_TABLE) AS PERCENTAGE
FROM LOB_DIST_TABLE                                             

WHERE LOB_LENGTH <= 8000                                       
UNION ALL                                                        

SELECT '12000' AS SIZE                                         
,COALESCE(SUM(LOB_COUNT) , 0) AS COUNT                   
,(COALESCE(SUM(LOB_COUNT) , 0) * 100.00) /               
(SELECT SUM(LOB_COUNT) FROM LOB_DIST_TABLE) AS PERCENTAGE
FROM LOB_DIST_TABLE                                            
WHERE LOB_LENGTH <= 12000                                      
UNION ALL                                                       

SELECT '16000' AS SIZE                                          
,COALESCE(SUM(LOB_COUNT) , 0) AS COUNT                   
,(COALESCE(SUM(LOB_COUNT) , 0) * 100.00) /               
(SELECT SUM(LOB_COUNT) FROM LOB_DIST_TABLE) AS PERCENTAGE

FROM LOB_DIST_TABLE                                            
WHERE LOB_LENGTH <= 16000                                      
UNION ALL                                                       

SELECT '20000' AS SIZE                                          
,COALESCE(SUM(LOB_COUNT) , 0) AS COUNT                   
,(COALESCE(SUM(LOB_COUNT) , 0) * 100.00) /               
(SELECT SUM(LOB_COUNT) FROM LOB_DIST_TABLE) AS PERCENTAGE

FROM LOB_DIST_TABLE                                             
WHERE LOB_LENGTH <= 20000                                      
UNION ALL                                                      

SELECT '24000' AS SIZE                                        
,COALESCE(SUM(LOB_COUNT) , 0) AS COUNT                  
,(COALESCE(SUM(LOB_COUNT) , 0) * 100.00) /              
(SELECT SUM(LOB_COUNT) FROM LOB_DIST_TABLE) AS PERCENTAGE
FROM LOB_DIST_TABLE                                           
WHERE LOB_LENGTH <= 24000                                      
UNION ALL                                                     
SELECT '28000' AS SIZE                                        
      ,COALESCE(SUM(LOB_COUNT) , 0) AS COUNT                  
      ,(COALESCE(SUM(LOB_COUNT) , 0) * 100.00) /              
       (SELECT SUM(LOB_COUNT) FROM LOB_DIST_TABLE) AS PERCENTAGE
FROM LOB_DIST_TABLE                                           
WHERE LOB_LENGTH <= 28000                                     
UNION ALL                                                      
SELECT '32000' AS SIZE                                        
      ,COALESCE(SUM(LOB_COUNT) , 0) AS COUNT                  
      ,(COALESCE(SUM(LOB_COUNT) , 0) * 100.00) /              
       (SELECT SUM(LOB_COUNT) FROM LOB_DIST_TABLE) AS PERCENTAGE
FROM LOB_DIST_TABLE                                           
WHERE LOB_LENGTH <= 32000                                     
UNION ALL                                                     
SELECT '99999' AS SIZE                                        
      ,COALESCE(SUM(LOB_COUNT) , 0) AS COUNT                  
      ,(COALESCE(SUM(LOB_COUNT) , 0) * 100.00) /              
       (SELECT SUM(LOB_COUNT) FROM LOB_DIST_TABLE) AS PERCENTAGE
FROM LOB_DIST_TABLE                                           
WHERE LOB_LENGTH >  32000                                     
ORDER BY 1                                                    
;

Now it returns this data:

---------+---------+---------+-------
SIZE         COUNT           PERCENTAGE
---------+---------+---------+-------
01000       112102              95.66
02000       115957              98.95
04000       116771              99.65
08000       117044              99.88
12000       117110              99.94
16000       117140              99.96
20000       117173              99.99
24000       117174              99.99
28000       117175              99.99
32000       117176             100.00
99999            0                .00

DSNE610I NUMBER OF ROWS DISPLAYED IS 11
As can be seen just changing the inline LOB length to be 1000 bytes would „hit“ over 95% of the rows!
This might even mean no change to page size and/or bufferpool would be needed!

Remember to change the driver CTE to be your candidate LOB column and table as here I have used STATEMENT and SYSIBM.SYSPACKAGE.

 

Return the LOB column names and tables needed

Here’s a little query, taken from the performance book and enhanced/corrected, to return the LOB column names and tables needed:

SELECT SUBSTR(CO.NAME , 1 , 30) AS COLUMN_NAME              
,STRIP(CO.TBCREATOR) CONCAT '.' CONCAT STRIP(CO.TBNAME)
                                AS TABLE_NAME               

FROM SYSIBM.SYSCOLUMNS CO                                   
WHERE CO.COLTYPE IN ('BLOB' , 'CLOB' , 'DBCLOB')            
  AND NOT EXISTS (SELECT TB.NAME                            
                  FROM SYSIBM.SYSTABLES TB                  
                  WHERE TB.NAME    = CO.TBNAME              
                    AND TB.CREATOR = CO.TBCREATOR           
                    AND TB.TYPE    = 'X')                   
ORDER BY 2 , 1
;                                              

Create an index

One very nice feature is the ability to create an index on expression:

 CREATE INDEX IQATW005.LOB_IOE_IX      
  ON IQA0610.IQATW005                  
    (CHAR(SUBSTR(SQL_TEXT , 1 , 254)) )
     USING STOGROUP SYSDEFLT            
          PRIQTY    14400              
          SECQTY    14400              
     ERASE NO                           
     FREEPAGE    5                      
     PCTFREE     5                      
     BUFFERPOOL BP0                     
     CLOSE YES                          
;

Only available for SUBSTR and with fixed start and end but still very useful as you can now „scan“ the text in your WHERE clauses!

The performance benefits can be immense for in-line LOBs and so I recommend a quick test and then roll-out to production!
As usual any questions or comments are welcome,
TTFN Roy Boxwell
Senior Software Architect

2013-10: DB2 11 Expanded RBA/LRSN

Now if you were a good attendee at the IOD (and paid attention to the freebies!), you probably got a free copy of the great little book „DB2 11: „The Database for Big Data & Analytics“.

If you did not get a copy at the IOD you can download a free electronic copy here.

 

On page 30 there is the following text:

   “Implementation of the expanded RBA/LRSN formats is optional but highly recommended as DB2 11 will use 10-byte values internally when in NFM, and there will be a small performance penalty for converting these to old Format”

Italics added by me of course!

Now this caused a couple of questions at the IOD as how to manage a REORG of the whole world. So my newsletter this month is to suggest a way to do this without just REORGing everything from DB 0 to DB Z!

 

How to manage a REORG without just REORGing everything from DB0 to DBZ ?

1) Be on *usage* metric
Start with your “known suspects” list – Every shop has a top ten list of SQLs normally based on CPU. What you must do is change this list to not be on CPU but on *usage*. The trick is how to actually do this! Well, I would start with the average executes per hour for all objects (To get this metric you must EXPLAIN or have the EXPLAIN data). Obviously the objects at the top of this list should be REORGed first!

2) 50 Reorgs a night
Get your system automation or DB2 Database Maintenance system to use the above data to drive, say, 50 Reorgs a night *purely* based on usage metrics

3) Wait

4) You’re done when no more REORGs are triggered for this metric

Of course it would be easier if all the data was there in front of you but then you would need our SQL WorkLoadExpert (WLX) to get the usage metrics which is, of course, linked to our RealTime DBAExpert (RTDX) which takes care of the REORGs but I digress…

 

SYSCOPY’s handling

What is also “interesting” is the way SYSCOPY handles all of this… If you select the LENGTH of the START_RBA before the CATENFM job has REORGed SYSCOPY you will get 6 back. After the CATENFM has done SYSCOPY you then get 10 back. This way your “home grown” systems can “know” if the RBA is 6 or 10 Bytes…

SELECT LENGTH(START_RBA)
FROM SYSIBM.SYSCOPY 
FETCH FIRST 1 ROW ONLY 
;

RBA & LRSN extensions

The next interesting point is that a RBA has been extended on the left and a LRSN has been extended on the left *and* the right. So you need to be aware of whether or not the row was written in a data-sharing environment. Probably not that interesting to most people as their shops are one or the other and not normally a mix, but for our software we must know what is in the START_RBA so we use a little CASE to do the work for us if running in DB2 V8, 9 or 10:

CASE WHEN GROUP_MEMBER > ' ' 
     THEN 
          SUBSTR(X'00' CONCAT START_RBA CONCAT X'000000' , 1 , 10)
     ELSE 
          SUBSTR(X'00000000' CONCAT START_RBA , 1 , 10)
END

 

The last thing I will mention this month is the SYSLGRNX format has changed which might catch a few people out if you are using CI style programs to read it! The DB2 11 NFM version is now in a segmented space and has the extended trailer so beware!

As usual any questions or comments are welcome,
TTFN Roy Boxwell
Senior Software Architect

2014-01: Complementing IBMs ACCESS PLAN Stability

 

Part 1: Dynamic SQL

with SEGs complete RUNSTATS Rescue package: Why it is a good idea but why it fails in certain cases?

The story

This month I want to tell you a story. This story is true, but the names have been changed to protect the innocent! A big company regularly scheduled production staging on Thursday nights. One night last year everything went as normal – until the Friday morning…

… then the telephones started ringing, and people were complaining about slow or non-existent response times for a rather critical business application.

The problem quickly escalated from the standard “Help desk”-level answers of „Switch it off and on, reboot“, and “Have you changed anything?” to senior managers demanding to know why things were not working anymore.  At this point, the DBA group was not actually involved, as it was first thought that a “bad” package had been promoted “by accident” on the night before.

  • The production group backed out the staged packages, but it didn’t help…
  • The delays got worse. They then stopped nearly all of the WebSphere Servers to at least allow *some* work through the over loaded system. Now in full panic mode, the DBA and the JAVA teams got involved.They both quickly found the “culprit”, which was an extremely large and complex dynamic SQL statement that had worked fine until some time Thursday night, and was now behaving *very* badly indeed.
  • The DBA team REORGed the “big” tables involved, in the hope that it would then all get better… It didn’t.
  • Finally the DBA team proposed creating a new index which was quickly done and  RUNSTATed in production.

The SQL then switched the access path back to a good one. The WebSphere servers were all re-started, and gradually everything returned to normal. This whole process actually took two days! The company involved relies upon its logistics chains and Just-In-Time delivery, so this outage had some serious repercussions, of course…    Get the full story

 

What really happened?

The DBA team then investigated further and found out that what had really happened, was that a RUNSTATS, on just one small table, had been run on the Thursday night at “an inappropriate time” thus causing the statement’s access path to go „pear-shaped“(aka Belly Up) all day Friday and half of Saturday…

How to fix the problem quickly and easily?

The DBA team then thought about ways that such a problem – should it reoccur in the future – could be quickly and easily fixed. Now my part of the story begins… This company uses our software and had a license to run the Enterprise Statistics Distribution (ESD) component of Bind ImpactExpert, which extracts, and optionally converts, all of the DB2 Catalog data that the DB2 Optimizer needs to do its job. Normally, customers use this to copy all the production statistics over to a sand box style system, to see if a DB2 APAR or DB2 Migration will cause unforeseen problems. For this, they use the Early-PreCheck component of our tool Bind ImpactExpert for Dynamic or Static SQL. Now we do have another scenario, called DSC (Dynamic Statement Cache) Protection, that would nearly do what they wanted, but also does a lot more and, of course, costs more to use! And so arose the idea for our new PocketTool called RUNSTATS Rescue. “Why is it a PocketTool?” I hear you ask “Because it only costs pocket money!” (aka Pin Money in the USA or an „allowance“ if you prefer) These tools are inexpensive to use – really! Now, before you stop reading at this point and start complaining about the fact this Newsletter is just 100% Marketing, please bear in mind that what I describe here could also be written by you – then you just need to give me credit for sharing the idea…

RUNSTATS Rescue

The idea is to use EXPLAIN in any way, shape or form, either in SPUFI, or directly in any monitor, to simply EXPLAIN the “culprit” SQL, and to remember the PLAN_TABLE owner you are using, as well as the QUERYNO you just used. Using these two inputs, RUNSTATS Rescue analyses the EXPLAIN output to build a list of extract and update control cards for our ESD, for all of the tables used and *all* of the indexes – even those *not* used, of course! Finally a DSC flush RUNSTATS is also generated for all Tablespaces involved in the query to make sure that the next time this “culprit” SQL comes into the system, it will then use the correct statistics.

Now, of course, the question is: „How do I know which statistics to use as the Rescue statistics?“

The answer: “The ones that were there before you did a REORG with inline RUNSTATS or a stand-alone RUNSTATS”. This is the key point to bear in mind: You must simply run the ESD extract before any normal DB2 Database maintenance jobs run. Most shops have days, or weekends, when they run these, and it is not a problem to extract the data and then copy it, for example, to a GENGROUP, to enable easily finding the date and time of the last extract when the statistics were “good”, thus enabling the RUNSTATS Rescue job to revert the required statistics very quickly. This gives the DBA group much-needed time to find out what really happened and take any appropriate action – almost – at their leisure.

10 Rescue Steps

  1. Select the new scenario RUNSTATS Rescue
  2. Generate some JCL
  3. Optionally copy to a GENGROUP dataset
  4. Insert the EXPLAIN TABLE-CREATOR and QUERYNO
  5. Automatic launch of our catalog browser
  6. Drill down to the Index level
  7. Ask “new” file name for the extracted “rescue” statistics
  8. Perform the RUNSTATS Rescue extraction
  9. Reset the statistics and executes the RUNSTATS
  10. The „Rescued“ Statistics

Top of page

Here’s a walkthrough example of how it looks in real life:

News 2014-01 Bild1

Near the bottom, you can see the new scenario RUNSTATS Rescue – select it to get a little pop-up window with the three steps. The first step must only be done once and then simply be plugged into an existing production job. I would recommend the first job of the normal DB2 Database Maintenance job stream.

1 – Select the new scenario RUNSTATS Rescue in the little pop-up window with the three steps

2 – Generate some JCL

The first option just generates some JCL looking like: News 2014-01 bild3

3 – Optionally copy to a GENGROUP dataset

At the end the optional step to copy to a GENGROUP dataset News 2014-01 bild4

4 – Insert the EXPLAIN TABLE-CREATOR and QUERYNO

Selecting the second option then requests the required input data as discussed earlier News 2014-01 bild5

5 – Automatic launch of our catalog browser

Hitting „enter“ then launches our catalog browser to enable you to see which objects were being used by that SQL… News 2014-01 bild6

6 – Drill down to the Index Level

News 2014-01 bild7

7 – Ask “new” file name for the extracted “rescue” statistics

PF3-ing out of the tool then asks for your original Production Statistics datasets, as extracted by the job in the first step, and a “new” file name for the extracted “rescue” statistics: News 2014-01 bild8

8 – Perform the RUNSTATS Rescue extraction

The next JCL appears that performs the RUNSTATS Rescue extraction, including the optional steps for GENGROUP support, as seen here: News 2014-01 bild9

9 – Reset the statistics and executes the RUNSTATS

Finally, the third option is selected, which actually resets the statistics and executes the RUNSTATS to flush the DSC News 2014-01 bild10

10 – The „Rescued“ Statistics

Now the next time that statement appears, it will use the “rescued” statistics and get back its old Access Path.   Top of page

Next Month

Next month I wish to expand upon this topic with the capability of doing the same for Static SQL. The month after that, I will go into detail about the DSC Protection scenario I mentioned earlier. That is not a pocket tool, of course, but it *is* very interesting!   As usual any questions or comments are welcome, TTFN Roy Boxwell Senior Software Architect

2014-02: Complementing IBMs ACCESS PLAN Stability

 

Part 2: the Story continues for Static SQL…

 

Welcome back to the second part of this newsletter.
Remember that last month I talked about a nasty real-world event that caused pain and grief to a large number of fellow DBAs?

 

Short reminder of part 1: rescue Dynamic SQL

A big company regularly scheduled production staging on Thursday nights. One night last year everything went as normal – until the Friday morning……then the telephones started ringing, and people were complaining about slow or non-existent response times for a rather critical business application…

See the previous newsletter here and the  walk through” example for Dynamic SQL rescue.

 

The Story continues for Static SQL

Well that was “only” Dynamic SQL that was killing them.

SOFTWARE ENGINEERING GmbH created the new Pocket Tool: RUNSTATS Rescue to gallop over the hill and rescue them from bad statistics timing problems.

…Everyone then lamented “But my Static SQL *also* goes horribly wrong if the statistics are badly timed. What can SEG do there?“

 

Rescue Static & Dynamic SQL

Naturally we have the ability to interface with Static SQL as well as Dynamic. Now, as all roads lead to Rome, or as I like to say “There is more than one way to skin a cat” – my co-workers don’t like that saying much for some reason… anyway, I digress – the interface to the dynamic RUNSTATS Rescue is the PLAN_TABLE owner and the QUERYNO used for the EXPLAIN. RUNSTATS Rescue does the rest –

so for static, there is a “simple” entry point. Just cut–and–paste the static SQL statement that is “causing grief” into SPUFI and EXPLAIN it, just as if it were dynamic SQL. However, the problem here, is that you must then manually generate all required REBINDs.

Top of page

Rescue „pure“ Static SQL

What we decided to do, is to create a new Pocket Tool extension for “pure” Static SQL. If you only have Dynamic SQL or if you only have Static SQL, then you will only need a single licence! The data that the new extension requires is all currently available in your SQL monitor, Trace etc. which tells you who the bad guy is, namely: Collection ID, Package name, Version and Statement Id. If you have this data *and* you have done your last BIND or REBIND with EXPLAIN(YES) – and I hope that’s true for 100% of production packages! – you are ready to rock’n’roll!

 Two prepare options

So this is how it will look. First we have a new option in the Menu selection list:

News 2014-02 Bild1

 

 

 

 

 

Here you can see that we now have two Prepare options: one is for Dynamic, and is the same as in last month’s newsletter, and the other is a new Static one which then leads you to the next pop–up:

 

  A REBIND step instead of a RUNSTATS step

News 2014-02 Bild2

Here you simply enter the required data and then, after it has been verified, you generate and run the RUNSTATS Rescue job, just like last time.

However, now you do not get a RUNSTATS step at the end of the job. Instead,  you’ll get a REBIND step for either the single PACKAGE or, if REBIND ALL is Y, for all affected PACKAGES in any COLLECTION. Obviously this option must be used “with care,” however, if the statistics are bad for one package why shouldn’t they be bad for all packages?

 

An errant Access path onto the good path

That’s it!

Now you have three ways to get an errant Access Path back onto the good path. (Remember that you can *always* cut–and–paste Static SQL into an EXPLAIN SPUFI – which means there are actually two ways to do Static).

 

Top of page

Next month

Next month I will go into detail about the DSC Protection scenario from our Bind ImpactExpert tool that is *really* cool and helps deal with the same type of problems but at a much higher and fully-integrated manner. It also completely handles the tricky question of :

“OK, so this query gets better when I reset my statistics but *this* query gets worse!”

 

As usual any questions or comments are welcome,
TTFN Roy Boxwell
Senior Software Architect