z/OS newsletter Each month, SEG sends out valuable technical DB2 z/OS information based on knowledge gained by the experienced developers and technicians in our product labs. We would like to share this newsletter with other DB2 users, like yourself.

17. Februar 201714. Januar 2021

2017-02 Why SIZE still matters in Db2 12

What has changed for space management in Db2 12?

How to avoid SIZE limits in Db2 12 like in previous Db2 Releases?

How to monitor the maximum possible SIZE of table and index spaces and table and index partitions?

SIZE in Db2 12: Now that Db2 12 has gone GA, I thought it would be nice to do a quick re-recap of Space management and its problems over the releases. My “old” newsletter 2014-05: Why SIZE matters for Db2 still receives a lot of hits on our website, so I know that this is a big topic of interest for many of you. Some nifty things have been introduced in Db2 12 to make space a lot easier to use and manage.

In the beginning…secondary allocation for tablespaces and indexes since Db2 V7, Db2 V8…

Since Db2 V8, the DBAs of this world have all had the ability to forget about PQTY and SQTY in the DDL for Tablespaces and Indexes. At first, nearly no-one trusted the sliding scale algorithm, and SOFTWARE ENGINEERING’s product Space AssuranceExpert (aka SAX) monitored and reacted instantly to secondary allocations.

However, we now have Db2 12, and I thought it would be interesting to review what was done in Db2 V7 (when our SAX was launched), and the difference nowadays in the Db2 12 world.

IFCID issuing for space extents

Every time a secondary allocation is done in Db2, it can be made to spit out an IFCID. SAX runs as a started task, active 24×7, from Db2 start up until just before Db2 shut down. It catches all of these IFCIDs thrown by Db2, and performs an analysis with six basic questions:

1	Can this dataset reach its maximum physical size before running out of physical extents? (The actual size is dependent on the “geometry” of the object of course!)
2	Will this object run out of datasets? (The number of datasets an object can have is, once again, dependent on the “geometry” of the object)
3	Is this partition nearing its maximum size?
4	Did Db2 ask for one extent but got more back?
5	Are any of my SMS disk storage pools running out of space?
6	Are there any SEQUENCES that are about to hit the wall?

(Numbers five and six are actually triggered by a timer, naturally.)

Can this dataset reach its maximum size before running out of extents?

Remember, back in those old days of Db2 V7? We only had 255 extents and 254 partitions, but datasets could still get pretty big pretty fast.

The problem lots of shops had, was that an important dataset would “hit the buffers” of maximum number of extents *way* before it ever ran out of physical space. Thus causing grief, wailing and gnashing of teeth! SAX stopped all this by giving WTO “heads-up” style messages in two flavors. First, a warning message, and then a critical message. This gave DBAs and space managers much needed time to plan for the outage and the, inevitably, long running REORG to actually action the required ALTER, or perhaps even any DROP/RECREATE that had to be done.

IBM also noticed this problem and so introduced in Db2 V8 the “sliding scale” of secondary allocations, as long as the ZPARM OPTIMIZE EXTENT SIZING field (MGEXTSZ) was set to YES (this is the default from Db2 9, by the way). Of course, to really use this, you then had to ALTER all of the existing spaces PQTY and SQTY to be -1, and then remember to delete all PRIQTY and SECQTY lines in your DDL and also rely on the TSQTY and IXQTY ZPARMs giving a big enough “first default”. (By the way, defaults for these two ZPARMS are 0, which is actually translated to be 720k or one cylinder for normal spaces and 7200k or 10 cylinders for LOB spaces). This all probably explains why the take up of this great feature has not been that spectacular and, in fact, Listserv *still* gets questions about “How good is this feature?” This also explains why the primary reason for having SAX is still valid at most shops today!

However, most shops these days tend to ignore the extents problem and only REORG when over 1000 extents have been allocated. This is no problem for SAX, as it knows the SECQTY and the MGEXTSZ ZPARM settings and can decide to “ignore” an IFCID for extent and ALTER SECQTY processing if the SECQTY is -1 and the MGEXTSZ is YES.

Will this object run out of datasets?

Now the problem of running out of datasets is very, very evil indeed… For a non-partitioned space, you can have up to 32 datasets. Db2 will happily allocate away and you will never know, or even be informed, if, and when, the last possible dataset has just been allocated and, of course, you will not know that the 33rd one cannot be allocated until you get a -904 unavailable resource error! By definition this is “not good”, as you must do a HUGE REORG with a bunch of managers breathing down your neck and *not* make any mistakes with the new allocations. (Again, this is a very good reason to have SAX doing all the monitoring and triggering early warning “heads-up” style messages!)

Is this partition nearing its maximum size?

A partition running out of space is rare, but when it does happen it is, of course, a disaster! The idea in SAX, is to warn when “the end is near” for any of the partitions in use and thus, as before, allow time for the ALTER etc.

Did Db2 ask for one extent but got more back?

Degenerated extents are annoying as well. You have only 255 or 7,257 extents, Db2 requests one but gets up to five back! This is “wasting” your precious supply of extents and so SAX can also warn you if this starts happening. Remedial action can again be planned to correct the problem, (normally a volume defrag in this case). Now in z/OS 1.7 “Extent Constraint Removal” was introduced for the DATACLAS which, if set to “Y”, allows 7,257 extents but still limits you to 123 extents per volume and 59 volumes. So watch out if you are using huge “virtual” disks (E.g. MOD 54 or EAV), as you can end up wasting space because you still cannot exceed 123 extents per volume.

SAX also takes care of duplicate recording – This is where an Extent is registered but SMS “consolidates it into the primary/existing extent – This would normally get logged as an extent but SAX sees this and does not report it as an extent.

Are any of my SMS disk storage pools running out of space?

When an SMS Pool runs out of space, either for sort/work or image copy, it is *not* good! The idea here, is to also give a “heads-up” style alert. The DBA can trigger the space management people to have a look at the state and size of the SMS storage groups this time alerted by percentage used or GBs of space free.

Are there any SEQUENCES that are about to hit the wall?

The usage of SEQUENCES has taken off. Nowadays shops can run into the problem of SEQUENCES hitting the maximum/minimum number for a NOCYCLE defined sequence. SAX tests sequences at the same time as the SMS groups to warn about any encroaching problem with WTO/MSG and reporting.

What was new in Db2 V8?

Db2 V8 introduced a big change – Partitions went up to a maximum of 4,096, and the calculation about how many pieces your NPI can have got “a little bit complex” (see also my previous newsletter: “2014-04 Are you going to pieces”).

What was new in Db2 9? PBG and UTS spaces

In Db2 9 the next major advance came with UTS spaces. The one that caused the most grief was, of course, PBG. Why? Well, the first problem was that some people went mad and used MAXPARTITIONS 4096 right from the get-go. They then found out this could not simply be changed and ended up being a huge problem. IBM came out with a bunch of fixes for these people, but the recommendation is still true today: “Use the number you expect to use!”

PBGs, however, came with a new set of space management problems:

1	By definition every partition is full, and so a TP REORG is “dangerous” -especially if you have VARCHAR, and even more so if compressed.
2	ALTER at TP level is not supported for PBG.
3	Getting rid of empty partitions was not supported
4	Adding partitions dynamically (by command) was not supported.
5	What to do if the partition that is “in use” is growing and is a – The last allowed Partition b – MAXPARTITIONS is set to one? Now these are “non trivial” because the Db2 catalog is so defined and you would not want an alert every time someone created a table or index! The trick here, is to treat these conditions as if it was a normal space and so, instead of warning that you are using the last part, it waits until you are using, e.g. 80% of that part. Then, e.g. at 90% comes the critical threshold warning.

Big changes happened here in Db2 12.

What was new in Db2 10?

With Db2 10 came the ability to ALTER PBGs to add parts which made using DSN1COPY to clone data around a lot better!

What was new in Db2 11?

In Db2 11 the REORG utility can be used to remove any empty parts in PBGs by the use of the ZPARM REORG_DROP_PBG_PARTS being set to ENABLE (DISABLE is the default).

What is new in Db2 12?

Now in Db2 12 there is partition independence for DSSIZE. Before, all partitions had to have the same maximum size (DSSIZE). Now you can have different sizes for different parts. This requires either making a new tablespace (UTS Relative Page Numbering), or an ALTER and TS level reorg of an existing UTS space. The tablespace goes relative page numbering and the RID increases in size to seven bytes hence the need for a TS level REORG. The Partitioning indexes also get DSSIZE so they can vary in size as well. Once you are there, all of the Partitions can then be ALTERed up in size with no outage! This is really, really good!

REORG of a PBG can “spill” into a new partition. This is also really good, as it was the major problem with PBG TP level reorgs. The chance of LOB data going into COPYP during the log apply phase has been stopped – Thankfully! Finally, delete of empty partitions is controlled with a utility DROP PART syntax.

The SAX way for Space monitoring

The SAX tool way of processing all this info is neatly summarised in the help panel of the tool itself:

SUPERVISE LPS

Supervise linear pagesets. If specified, a warning is issued
in case of high allocated reaches this percentage of the
maximum data set size for partitioned objects.

For non-partitioned objects, a warning is issued for every
newly allocated data set as soon as the data set number
reaches this percentage of the maximum number of data sets:

Two different values may be entered for warning and critical
values with different message ids. This may be useful for
automation reasons (see below).

Object type: TABLESPACE      ! Maximum number of data sets
-----------------------------+----------------------------
LOB tablespaces              ! 254
-----------------------------+----------------------------
Non-partitioned tablespaces  ! 32
-----------------------------+----------------------------
Partitioned tablespaces      ! 1 (Percent used check)
-----------------------------+----------------------------
Partitioned By Growth        ! MAXPARTITIONS. LPS check if
tablespaces                  ! more than one. If on last
                             ! partition then percent used.
-----------------------------+----------------------------
Object type: INDEX           ! Maximum number of data sets
-----------------------------+----------------------------
Non-partitioned indexes on   ! MIN ( 4096 , 2 power 32 /
tablespace with LARGE,       !      ( DSSIZE / TS PGSIZE))
DSSIZE, or more than 64      ! Eg: 128 GB DSSIZE with
Partitions                   !       8 KB Tablespace Page
                             ! gives 256 Pieces (datasets)
                             ! Or    4 GB DSSIZE with
                             !       4 KB Tablespace Page
                             ! gives 4096 Pieces (datasets)
-----------------------------+----------------------------
Non-partitioned indexes      ! 32
otherwise                    !
-----------------------------+----------------------------
Partitioned indexes          ! 1 (Percent used check)
-----------------------------+----------------------------
To support automation based on WTO ids two different
thresholds may be specified:
Field (1) specifies a warning threshold using WTO ids
O2RTSU04 - 12W  (non-partitioned spaces)
O2RTSU04 - 14W  (partitioned spaces)
O2RTSU04 - 16W  (partition by growth spaces)
Field (2) specifies a critical threshold using WTO ids
O2RTSU04 - 13W  (non-partitioned spaces)
O2RTSU04 - 15W  (partitioned spaces)
O2RTSU04 - 17W  (partition by growth spaces)

AUDIT DEGENERATED XTS
Audit secondary quantity for de-generated extents. If
specified, a warning is issued in case of the last extent
does not reach this percentage of the SECQTY specified
in the Db2 catalog. If this field is left blank, no
auditing is performed.

AUDIT SMS STOGROUPS
Should the Space AssuranceExpert audit SMS stogroups. Y/N
If Y is entered, a pop-up window will allow you to enter up
to 24 SMS storage groups which will be audited.
If WARN IF % ALLOC > or WARN IF GB FREE < is specified and
exceeded, a warning (WTO) will be issued.

CHECK SYSSEQUENCES
Should the Monitor also check for SYSIBM.SYSSEQUENCES that
are running out of room every PING minutes?

N  - do nothing.  This is the default.
I  - check Identity Columns and Doc Ids for XML.
S  - check User Defined Sequences.
B  - do both.

PERCENT USED
If checking of SEQUENCES is desired then a threshold
percentage must be given from 1 to 99. If this percentage of
the available sequences is exceeded then an action is
triggered.

EXCEEDED ACTION
When a percentage is exceeded this specifies what type and
and which style of message should be externalized.

N  - do nothing.  This is the default.
W  - to write out a WTO.
M  - to write a message to the job log.
B  - do both.

To support automation based on WTO ids the following
messages are output:

O2RTS000 - 20W  (SEQUENCES MAXVALUE)
O2RTS000 - 21W  (SEQUENCES MINVALUE)
O2RTS000 - 22W  (IDENTITY MAXVALUE)
O2RTS000 - 23W  (IDENTITY MINVALUE)

So now you know why size still matters for Db2 12! The big question now is: “Are your space management and monitoring tools up-to-date, or are they still Db2 V7?”

As usual, if you have any comments or queries please feel free to drop me a line!

TTFN

Roy Boxwell

17. Januar 201714. Januar 2021

2017-01 Db2 12 technical overview: Roy’s first features review

This Db2 12 technical overview presents in an „easy to read“ table list a review of new Db2 12 features

Have you encountered any other Db2 12 changes you’d like to discuss?

Now that Db2 12 has gone GA I can finally talk about it. So here’s another new Features “first look” at what I think is cool, great, or odd !

This is my personal list for a Db2 12 technical overview – in no particular order :

Db2 12 SQL Optimizer, triggers, Arrays, Merge, UNICODE Columns. Temporal, SQL pagination, SQL Stability, Log, Partitions,…
Data Sharing
Utilities DSN1COPY, Alternate Copy Tools, Audit, REORG, PBG reorgs, COMPRESSRATIO, RELOAD, RO tablespaces, LOAD, BACKUP and Recovery, PiT, RUNSTATS…

AGILE This release of Db2 will be ”the last” release, as Db2 Development has gone all agile on us and will be doing Continuous Delivery (CD) from now on. CD promises Easier, Cheaper, Faster and Simpler Db2 maintenance and the quick realization of new functionality.

Db2 12 – SQL

Optimizer

MQT or Table expression columns are “trimmed” if they are not used in the outer query.

In LEFT OUTER JOIN, if columns are not used, they can be Pruned.

UNION ALL gets major work when pushing down join predicates as well as pushing down ORDER BY and FETCH FIRST

Outer table joins can get reordered to avoid unnecessary materializations

User-defined functions get two improvements with merge and the introduction of indexes on the join or correlation predicates that are passed in as parameters

Adaptive Index is designed for Multi Index and single index list prefetch to determine at execute time the filtering of each index. This ensures the optimal execution sequence of indexes or, perhaps, a quicker fallback to Tablespace scan if no filtering index exists.

TRIGGERS

The new “advanced” triggers enable SQL and Global variable usage and SQL PL.

ARRAYS

Get a couple of nice new features, specifically the use of a global variable as an array type and the ability to use the ARRAY_AGG without forcing an ORDER BY.

GLOBAL VARIABLES

Get LOB support and in a SET they can be the target.

PureXML

The XMLMODIFY can do multiple document updates in a single invocation. Various XML performance boosts are also included, e.g. XMLTable and the XSLTRANSFORM allows transformations to different formats.

JSON

When using the JSON_VAL function the first argument must not now always be a BLOB. It can be a view, CASE, table expression, trigger transition variable or SQL PL variable or parameter.

MERGE

Is now a full MERGE with the ability to use table references with multiple MATCHED clauses, including DELETE operations.

SQL PAGINATION

The ability of Db2 to “understand” typical paging has been greatly boosted. Typically it was always coded like:

SELECT blah blah blah
FROM mytable
WHERE (SURNAME = ‘BOXWELL’ AND FORENAME > ‘ROY’)
        OR (SURNAME > ‘BOXWELL’)

This is pretty horrible for the Db2 optimizer but we *all* know what we really mean! Now in Db2 12, so does the optimizer! Sadly you must rewrite your queries a little so this example becomes:

SELECT blah blah blah
FROM mytable
WHERE (SURNAME, FORENAME) > (‘BOXWELL’, ‘ROY’)

Also with this comes a nice little feature called OFFSET ROWS. Typically, this is for when the connection to the server is a bit shaky and so after some paging, when the cursor is reopened, the code “knows” it can miss the first 60 rows, so the cursor changes to be:

SELECT * FROM mytable OFFSET 60 ROWS

Nice feature, but beware of polluting the DSC! It is much better to use a parameter marker for these Offsets!

UNICODE Columns

In DB2 11, we got a “fix” for UNICODE columns that was really a “crutch”. This has now been fixed with real UNICODE columns in DB2 12. You must migrate your existing data though!

Piece-wise DELETE

This is a feature I have wanted for decades! Simply add the FETCH FIRST nnnn ROWS ONLY within a DELETE and then programmatically loop around until you are done. Much easier than the method we have today of DECLAREing a CURSOR with an UPDATE of a dummy column and the DELETE WHERE CURRENT OF and after 5000 or so issue a COMMIT.

TEMPORAL RI

You can now add RI as normal and not be forced to use a trigger or stored procedure.

TEMPORAL TABLES

Get the ability to not just be inclusive-exclusive but also inclusive-inclusive.

TEMPORAL Logical Transactions

Another new feature with temporal tables, is the ability to support logical units of work for SYSTEM_TIME. These logical units of work are not determined by COMMIT or ROLLBACK but by using a built-in Global Variable.

PERCENTILE functions

Two new functions PERCENTILE_CONT and PERCENTIL_DISC are new BIFs.

DRDA Fast Load

Is the ability to load data into z/OS DB2 from files sitting on distributed clients.

ODBC

Gets a new INI keyword KEEPDYNAMIC and the connection attribute of SQL_ATTR_KEEP_DYNAMIC.

Obfuscated Code

Mainly of interest to Vendors is the ability to hide your stored procedure, TRIGGER or UDF coding from prying eyes.

RLF for Static SQL

This is a big one! The Resource Limit Facility has always only been available for dynamic SQL. Now you can also use it to cap Static SQL.

TRANSFER OWNERSHIP

This is a very handy way of clearing out all the old owners from a DB2 system.

SQL Stability

Dynamic Plan Stability is nearly the same as BIND QUERY, but the hope is that it will be easier and better to use! But beware of saving all of your dynamic SQL away!

Static Plan Stability gets a good enhancement that allows FREE on the original or previous. What is really good, is that the current version can be in use so there is no application outage anymore.

Insert

New Insert algorythm can be used for faster unclustered insert processing in some cases. Only for UTS MEMBER CLUSTER (This is actually the default for these spaces).

CONCENTARTE LITERALS

Now supported at the Package Level.

FTB

Fast Index Traversal – Especially good for randomly accessed indexes. If the index is unique, and 64 bytes or less, it is eligible. Index is controlled with the new Catalog table SYSIBM.SYSINDEXCONTROL and the -DISPLAY STATS(INDEXMEMORYUSAGE) command.

Log

Active log size can go from 4GB now up to 768GB ! Be careful here!

In-Memory bufferpools

by using PAGESTEAL(NONE) keyword.

PARTITIONS

Finally we get the chance to give each partition its own DSSIZE as well as the Partitioning indexes! This is great, but sadly is only available to an existing space once you have reorged the whole tablespace…However, once you are there, you can then have data and index parts up to 1TB in size, plus, when you do an ALTER of the DSSIZE, it does not cause an outage (as long as you make it bigger!). A side effect of this is that the RID is now seven bytes (see REORG mapping table for other changes). You can now also add partitions in the middle of an existing PBR table.

Data Sharing

Recovery

of retained locks from a failed member can be handled automatically

LPL and GRECP recovery

LPL and GRECP recovery auto retries three times after waiting three minutes

Db2 12 Utilities

DSN1COPY

In DB2 11 this utility got a few sanity checks and now the REPAIR CATALOG utility can fix some of these. The REPAIR CATALOG TEST also looks for some problems caused by misuse/abuse of DSN1COPY.

ALTERNATE COPY POOLS

The usage of BACKUP SYTEM is growing. So is the amount of storage required! The idea here, is to define a set of copy pools, but only one for many DB2 subsystems. The alternate copy pool uses as many volumes as it needs and leaves the other volume free for a different subsystem backup. This reduces the amount of space that must be allocated.

Audit

A new Authorization arrived: UNLOADAUTH to “replace” the “Does the user have SELECT auth on the table?” check that has run up to now. UNLOAD is special and should be controlled over this auth and no longer over just SELECT.

REORG

PBG tablespaces get the best news here!

PBG reorgs can now spill over into a new PBG if the row(s) do not fit back into the original partition. Classic case here, is compressed data that no longer fits back. This forced people to use a TS level reorg or not use compression.

If the PBG contains LOB data and it extended to a new partition in the log apply phase, then the LOB space was left in COPY Pending… pretty horrible and that no longer happens in DB2 12.

Another PBG bonus, is the delete of “emptied” Partitions after a REORG has completed.

Improved FlashCopy support – You can now decide to stop the REORG if the flash copy fails.

New Catalog column COMPRESSRATIO for use by utilities that records the compression savings at the record instead of at the page level.

RELOAD phase can now be offloaded to zIIP.

RO tablespaces can now be REORGed at any SHRLEVEL.

The mapping table gets changed again due to the relative page numbering in the new PARTITION support (seven byte RID).

LOAD

PART REPLACE with dummy input against an empty (PBR) partition could be quicker.

LOAD SHRLEVEL CHANGE PARALLEL support for PBG for SHRLEVEL CHANGE.

Additional zIIP offload, like in REORG, in the RELOAD phase, including the data conversion and loading of the record into the page set.

LOAD RESUME BACKOUT YES to avoid RECP on failure. Adds a new option on LOAD RESUME SHRLEVEL NONE to allow LOAD to back out the rows already loaded upon encountering an error (such as conversion, LOB/XML, duplicate key, referential integrity violation) without leaving the page set in RECP.

PREFORMAT support for auxiliary tables. Support is extended to LOB table spaces and auxiliary indexes.

Maintain MAXASSIGNEDVAL for identity columns. LOAD now maintains the MAXASSIGNEDVAL for user-provided input and resets the value if a LOAD REPLACE is run on the table space.

LOAD REPLACE support for the COMPRESSRATIO column for use by utilities that records the compression savings at the record instead of at the page level column.

BACKUP and RECOVERY

Point-in-Time support for PBGs, Flashcopy FLASHCOPY_PPRCP keyword. As mentioned the default is changed to not recover unchanged objects. MODIFY RECOVERY gets two new options: DELETEDS to delete the datasets and NOCOPYPEND to not set COPY pending after doing the MODIFY.

PiT

Has been improved with the ability to skip unnecessary recoveries. SCOPE UPDATE only processes objects that have been updated up to the TOLOGPOINT or TORBA.

RUNSTATS

New CLUSTERRATIO formula which should better reflect dynamic prefetch. Terry Purcell has stated that it is not a huge change and does not require a RUNSTATS of all tablespaces!

FREQVAL COUNT nn

The COUNT nn is now optional and, if not used, then RUNSTATS will work out the best number for you. This is really, really nice and I would recommend this in an instant! It has also been retro fitted to DB2 11.

Autonomic Statistics with PROFILEs

I am no fan of this, as I believe it makes for a pretty nasty feedback loop where anyone’s “dumb” QMF/SAS/DSNTEP2/SPUFI will get inserted as a PROFILE COLGROUP, and then these PROFILEs will get bigger and bigger until no-one knows which are really useful and which are just fluff! I would recommend setting the ZPARMs STATFDBK_SCOPE to ALL (Default) STATFDBK_PROFILE to NO (Default is YES). When YES is used DB2 12 will create and/or maintain a PROFILE for you. Finally, validate that the SYSTABLES column STATS_FEEDBACK is set to “N” (Default is “Y”) for any and all tables where you do *not* want SYSSTATSFEEDBACK data. E.g. All the DSNDB01 tables where a RUNSTATS is not even allowed!

DSC

DSC Invalidation got switched off by default. In the past *any* RUNSTATS flushed the cache. Now you must add the key word INVALIDATECACHE YES to get this to occur. (Unless you use the REPORT NO UPDATE NONE syntax this still just flushes the DSC)

Inline Stats

Inline Stats got a huge boost with PROFILE support, MOST/BOTH/LEAST and LOAD PARALLEL got inline stats.

Have you encountered any other Db2 12 changes you’d like to discuss?

As usual, feel free to email me with questions or comments.

TTFN

Roy Boxwell

8. November 201614. Januar 2021

Rotten Results from RUNSTATS Require Rescue

Do you know the basic rules to ensure access path stability when using RUNSTATS?

DB2 z/OS Access path stability: Time for another of my “I noticed something strange at a customer site recently” Newsletters. Enjoy!

RUNSTATS are good aren’t they?

At this particular site, the RUNSTATS methodology of RUNSTATS was, shall we say, “sub-optimal.” They use an ancient system to decide when to RUNSTATS, and they do tablespace’s and index’s *never* at the same time. Just to complicate matters even more, they never use inline RUNSTATS because “if the REORG abends, the statistics in the DB2 catalog are dead”. Now you are all probably well aware of the scale of the disaster at this site?

Daily fire fighting

Nearly every day, some access path somewhere goes horribly wrong… the under- manned and over-worked DBA group are tasked to find and fix ASAP. Cures range from a quick INDEX create or change, or perhaps even a really needed RUNSTATS or REORG.

Why do the Access Paths go “wrong”?

The real goal is to stop firefighting and to investigate the root cause. Why do so many access paths go wrong on such a regular basis? The answer is the systemic horribleness of RUNSTATS collection. Dynamic SQL is, obviously, very very sensitive to RUNSTATS. For one thing, the statements are kicked out of the cache! The very next time they come back, the DB2 Optimizer redrives the cost calculations and “Hey Presto!” you have a bad access path. Terry Purcell and Pat Bossmann have often said that about 90% of DB2 performance problems stem from bad RUNSTATS. The old adage “garbage in – garbage out” is still true!

Timing is everything

The timing of the RUNSTATS is critical for stable access paths.

Basic rules are:

1	Only do a RUNSTATS if you really really need to!
	a. RUNSTATS are not cheap! b. The Dynamic Statement cache gets wiped c. Locks on the Catalog can occur
2	Avoid doing RUNSTATS even if RTS says to run one!
	a. Lots of people use the incorrect counters to trigger a RUNSTATS. Use the correct ones for the correct Object type b. Never RUNSTATS LOB spaces – completely pointless work! c. Even if a MASSDELETE has occurred do you really want to “reset” the DB2 catalog statistics? d. VOLATILE tables must be handled with extreme care!
3	Choose your RUNSTATS parameters wisely!
	a. Doing a blind “RUNSTATS the world” is just as bad as running an empty RUNSTATS! b. HISTOGRAM should be used with caution c. More than a hundred COLGROUPs should start alarm bells ringing

Quite a list here, and it really only shows some „Rules of Thumb“. I’ll bet you all have you own?

Is there a way back from the abyss?

But what happens if you have 1000’s of partitions with terabytes of data and the RUNSTATS was, shall we say, ill-advised or badly timed? Can you go back in time? Hands up those who wants to do a PiT recovery on the production catalog! No takers???

Yes! There is a way back from the abyss

I’ll bet you are all well ahead of me here, but the way to do this is pretty straightforward. You simply acquire our latest tool, RUNSTATS Rescue, to handle it all for you. Or, you could try and reset the data in the DB2 catalog from off-line backups that you happen to have taken before the RUNSTATS that is now killing you. …You did do that, right?

Why a tool?

Apart from the fact that this tool is from us, my firm, just trying to “roll your own” can be a real nightmare. Why?

Because you must first find out all of the objects that were touched by the badly performing SQL.
Then you must get all of the DB2 Optimizer relevant data back from a point in time before the RUNSTATS executed, and/or the last REBIND(s),
and then you must flush the dynamic statement cache and REBIND any static SQL.

Sounds like a lot of work.

What else must you do?

You also have to be transparent and so log what you do. You must allow for the ability to back-out your changes as perhaps you make another access path even worse. And it would be really cool if you could do “on the fly” explains to check that the RUNSTATS really *is* the root of all that evil. Remember that ZPARMS and BUFFERPOOLS also have a major influence on access paths. Even the speed of your machine! It is also a must to then be able to go even further back in time – perhaps as much as a year?

Hang on – What about PLAN STABILITY?

Doesn’t plan stability save you? I hear you all cry. Well, “No” is the short answer! If your package is invalidated by a Schema change (the classics are index drop and recreate or VIEW change), then plan stability does not work anymore. Further, in DB2 12, Dynamic Plan Stability has been announced. Sadly it *also* fails right here as there is no SWITCH PREVIOUS/ORIGINAL support!

It all works together

So, for the static SQL case where the package is not invalided, Plan Stability is good. If not: – RUNSTATS Rescue to the rescue. For Dynamic SQL – RUNSTATS Rescue is the answer.

As always, any questions or comments would be most welcome!

TTFN,

Roy Boxwell

21. Oktober 201614. Juni 2023

2016-10 Discovering hidden recovery problems in the SYSLGRNX – Update 2020-09

A query to read the SYSLGRNX

This month I want to have a look inside the SYSLGRNX – Not just for fun but actually to see if I, and you, have a serious RECOVER problem.

The Problem began with too many TP Image Copy steps

The problems all began when it was noticed by one of our customers that our DB2 Utility generation software (RTDX) was generating Image Copy jobs for “old” PBG partitions that were no longer being updated and they asked us why we were recommending an Image Copy.

RTDX and Image Copy

RTDX (RealTime DBAExpert for z/OS and SAX – Space AssuranceExpert) is threshold based and uses the Real-time Statistics (RTS) tables as the major driver of decisions about REORG, RUNSTATS, and IMAGE COPY. The normal rules for Image Copy are “If the RTS COPY counters have not been updated then there has not been an update and we do not need a Copy”. This is normally extended with a special copy rule “But generate a Full Image Copy every xx days even if no updates”. This is done to guarantee that you can perform a recovery even if your non-DB2 migration/back-up software automatically deletes datasets older than xx days.

Everything green?

So you think you are covered? Not so fast! When the RTS were first introduced back in DB2 V7 they were not that stable and, sometimes, the RTS DB was actually stopped. This meant that the numbers were sometimes “not that reliable” – Now for REORG and RUNSTATS it does not really matter because, at some point, the counters start to count again and all is OK. But for IMAGE COPY it would be a disaster to “miss” an update! RTDX handles this with an extra “belt and braces” or, for the Americans amongst us, “belt and suspenders” check: “CHECK SYSLGRNX? Y/N” setting. If set to “Y” then RTDX generates an extra job that extracts the SYSLGRNX data and loads it into a DB2 table. This can then be queried by RTDX to see when objects were opened for update (Just like DB2 does in RECOVER processing to get the required log ranges.)

Something Old Something New

So now you know how the system works I can explain what happened at the customer site. They use Partition By Growth tablespaces (PBGs) a lot as “containers” to just insert records into. They had PBGs with hundreds of partitions and they wondered why RTDX generated Image Copy steps for partitions 1 – 136 when they only held “old non-changed data”.

Check the code

First thing I did was to check the code for any silly mistakes. Everything was fine. I then got the SYSLGRNX extract file and saw that the Partitions were indeed in there… I then worked with the customer directly using log prints and tracing until we finally saw what was happening.

PBG doing the ripple!

When a new partition was added, all of the previous partitions header pages got an update “max no. of partitions”. Now this update counts as a system page and so is *not* in the RTS Counters as they only count data pages! RECOVER of course does not care! It would demand to look for the log ranges and what would happen if your logs have been deleted? Yep – Unrecoverable data!

APAR PI60104 fixes the problem

So you can apply the APAR and all is good? Not really. You still have to go and check if you have any potentially unrecoverable PBG datasets out there. How do you do that? Well – you ask me!

Reading SYSLGRNX is not so easy

The problem with SYSIBM.SYSLGRNX is that the data is nearly unusable when you want to join to the DB2 Catalog. E.g. Two byte character fields that actually contain SMALLINT values and, sometimes, with the high bit set…

Just creating a table and then a dumb INSERT INTO SELECT FROM dies because of the DSNDB01 Cursor rules. So first you’ll need to create your new version of SYSLGRNX:

-- CREATE NEW VERSION OF SYSLGRNX WITH SMALLINTS --
CREATE TABLE BOXWELL.SYSLGRNX2
   (LGRDBID          SMALLINT          NOT NULL
   ,LGRPSID          SMALLINT          NOT NULL
   ,LGRUCTS          TIMESTAMP         NOT NULL
   ,LGRSRBA          CHAR(10)          FOR BIT DATA NOT NULL
   ,LGRSPBA          CHAR(10)          FOR BIT DATA NOT NULL
   ,LGRPART          SMALLINT          NOT NULL
   ,LGRSLRSN         CHAR(10)          FOR BIT DATA NOT NULL
   ,LGRELRSN         CHAR(10)          FOR BIT DATA NOT NULL
   ,LGRMEMB          SMALLINT          NOT NULL
   ,LGRNEGPSID       CHAR(1)           NOT NULL)
;
COMMIT ;

Cross loader to the rescue!

After trying various tricks to get at the data I hit on the idea of abusing the cross loader to do it for me. Here is my JCL that you must tailor for your site to do the dirty deed:

//LOAD EXEC PGM=DSNUTILB,REGION=32M,PARM='DC10,LOAD' 
//STEPLIB  DD DISP=SHR,DSN=DSNC10.SDSNEXIT.DC10 
//         DD DISP=SHR,DSN=DSNC10.SDSNLOAD 
//SYSUT1   DD UNIT=SYSDA,SPACE=(CYL,(9,9)) 
//SORTOUT  DD UNIT=SYSDA,SPACE=(CYL,(9,9)) 
//SYSERR   DD UNIT=SYSDA,SPACE=(CYL,(9,9)) 
//SYSMAP   DD UNIT=SYSDA,SPACE=(CYL,(9,9)) 
//SYSDISC  DD UNIT=SYSDA,SPACE=(CYL,(9,9)) 
//UTPRINT  DD SYSOUT=* 
//SYSPRINT DD SYSOUT=* 
//SYSIN    DD * 
EXEC SQL 
DECLARE C1 CURSOR FOR 
SELECT COALESCE( 
       CASE WHEN ASCII(SUBSTR(A.LGRDBID, 1, 1)) * 256         
               + ASCII(SUBSTR(A.LGRDBID, 2, 1)) > 32767 THEN  
                (ASCII(SUBSTR(A.LGRDBID, 1, 1)) * 256         
               + ASCII(SUBSTR(A.LGRDBID, 2, 1))) - 32768      
            ELSE ASCII(SUBSTR(A.LGRDBID, 1, 1)) * 256         
               + ASCII(SUBSTR(A.LGRDBID, 2, 1))               
       END , 0)         AS LGRDBID                            
      ,COALESCE(                                              
       CASE WHEN ASCII(SUBSTR(A.LGRPSID, 1, 1)) * 256         
               + ASCII(SUBSTR(A.LGRPSID, 2, 1)) > 32767 THEN  
                (ASCII(SUBSTR(A.LGRPSID, 1, 1)) * 256         
               + ASCII(SUBSTR(A.LGRPSID, 2, 1))) - 32768      
            ELSE ASCII(SUBSTR(A.LGRPSID, 1, 1)) * 256         
               + ASCII(SUBSTR(A.LGRPSID, 2, 1))               
       END , 0)         AS LGRPSID                            
      ,TIMESTAMP(                                             
           CASE WHEN SUBSTR(A.LGRUCDT, 5 , 2 ) > '83' THEN '19'
                ELSE '20'                                     
           END CONCAT                                         
           SUBSTR(A.LGRUCDT, 5 , 2 ) CONCAT '-' CONCAT        
           SUBSTR(A.LGRUCDT, 1 , 2 ) CONCAT '-' CONCAT        
           SUBSTR(A.LGRUCDT, 3 , 2 ) CONCAT '-' CONCAT        
           SUBSTR(A.LGRUCTM, 1 , 2 ) CONCAT '.' CONCAT        
           SUBSTR(A.LGRUCTM, 3 , 2 ) CONCAT '.' CONCAT        
           SUBSTR(A.LGRUCTM, 5 , 2 ) CONCAT '.' CONCAT        
           SUBSTR(A.LGRUCTM, 6 , 2 ) CONCAT '0000'            
                 )      AS LGRUCTS                            
      ,A.LGRSRBA                                              
      ,A.LGRSPBA                                              
      ,A.LGRPART                                              
      ,A.LGRSLRSN                                             
      ,A.LGRELRSN                                             
      ,ASCII(SUBSTR(A.LGRMEMB, 2, 1)) AS LGRMEMB              
      ,CASE WHEN ASCII(SUBSTR(A.LGRPSID, 1, 1)) * 256         
               + ASCII(SUBSTR(A.LGRPSID, 2, 1)) > 32767 THEN ‘Y’
            ELSE ‘N’                                          
       END              AS LGRNEGPSID                         
FROM SYSIBM.SYSLGRNX A                                        
ENDEXEC                                                               
LOAD DATA INCURSOR(C1) RESUME NO SHRLEVEL NONE REPLACE                    
  INTO TABLE BOXWELL.SYSLGRNX2                                        
/*

The use of ASCII is to “translate” the character data into smallint and also to detect if we have negative numbers. This happens if you run out of DBIDs and also if you use CLONE objects. The requirement for the two COALESCEs is down to the cross loader syntax check on the CASE statements by the way…

This could load a lot of data of course!

Do not forget the INDEX

Now the Index that we need to speed up the processing (and the RUNSTATS of course!)

-- CREATE INDEX ON THE COLUMNS WE NEED --
CREATE INDEX BOXWELL.SYSLGRNX_IX
          ON BOXWELL.SYSLGRNX2
       (LGRDBID,LGRPSID,LGRPART,LGRMEMB)
        USING STOGROUP SYSDEFLT
             PRIQTY -1
             SECQTY -1
        CLUSTER
        CLOSE YES
;
COMMIT ;

And the RUNSTATS

RUNSTATS  TABLESPACE <yourdatabase>.<yourtablespace>
          TABLE (ALL)
          INDEX(ALL)
          FREQVAL NUMCOLS 1 COUNT 100 BOTH
          FREQVAL NUMCOLS 2 COUNT 100 BOTH
          FREQVAL NUMCOLS 3 COUNT 100 BOTH
          SHRLEVEL CHANGE REPORT NO UPDATE ALL HISTORY NONE

Now we can SQL So finally we have the SYSLGRNX data in a format that we can actually use! Here is my SQL that lists out any PBG spaces that have, according to the RTS, no updates but the last Image Copy timestamp is earlier than the SYSLGRNX record:

-- FINALLY THE QUERY TO SEE IF A PROBLEM EXISTS OR NOT --
--
-- SELECT ALL PBGS IN SYSTABLESPACE THAT HAVE DATA IN THE
-- SYSLGRNX WHICH ALSO HAVE AN ENTRY IN THE RTS WHERE THE
-- COUNTERS ARE NULL OR ZERO BUT THE COPYLASTTIME IS EARLIER
-- THAN THE SYSLGRNX TIMESTAMP.
--  THIS MEANS THAT THE RTS THINKS "NO REASON TO COPY"
--  BUT SYSLGRNX THINKS "SOMETHING WAS UPDATED"
--

SELECT A.DBNAME, A.NAME, B.LGRPART, A.DBID, A.OBID, A.PSID
      ,B.LGRUCTS, B.LGRSRBA, B.LGRSPBA, B.LGRSLRSN
      ,B.LGRELRSN, HEX(B.LGRMEMB) AS MEMBER
      ,C.COPYLASTTIME
      ,MAX(D.START_RBA) AS HIGHEST_IC_RBA
      ,MAX(D.TIMESTAMP) AS HIGHEST_IC_TS
FROM SYSIBM.SYSTABLESPACE A
INNER JOIN
    BOXWELL.SYSLGRNX2    B
 ON  A.DBID     = B.LGRDBID
 AND A.PSID     = B.LGRPSID
 AND A.INSTANCE = B.LGRINST
INNER JOIN
     SYSIBM.SYSTABLESPACESTATS C
 ON  C.DBID     = A.DBID
 AND C.PSID     = A.PSID
 AND C.PARTITION= B.LGRPART
 AND C.INSTANCE = A.INSTANCE
LEFT OUTER JOIN
     SYSIBM.SYSCOPY       D
  ON D.DBNAME    = A.DBNAME
 AND D.TSNAME    = A.NAME
 AND D.DSNUM   IN ( 0 , B.LGRPART)
WHERE A.TYPE   = 'G'
 AND (C.COPYUPDATEDPAGES = 0
  OR C.COPYUPDATEDPAGES IS NULL)
 AND (C.COPYCHANGES = 0
  OR C.COPYCHANGES IS NULL)
 AND (C.COPYLASTTIME < B.LGRUCTS
  OR C.COPYLASTTIME IS NULL)
 AND (D.ICTYPE = 'F'
  OR D.ICTYPE IS NULL)
GROUP BY A.DBNAME, A.NAME, B.LGRPART, A.DBID, A.OBID, A.PSID
     ,B.LGRUCTS, B.LGRSRBA, B.LGRSPBA, B.LGRSLRSN
     ,B.LGRELRSN, HEX(B.LGRMEMB)
     ,C.COPYLASTTIME
ORDER BY 1 , 2 , 3 , 7
WITH UR
;

And the winner is?

I hope you have zero records returned? Even if you do have some records, it is relatively easy to quickly get them image copied so that you can sleep at night. If zero, you still have a great copy of the SYSLGRNX that you can use for other purposes!

A customer replies

Here’s some real data back from a customer:

discovering hidden recovery problems in the SYSLGRNX; DB2 z/OS; PBG partition

The COUNT is the number of SYSLGRNX entries they have – So they have nine DB & TS groups and from a further analysis (removing the data sharing member duplicates) they have 1926 distinct DB, TS and Partitions!

They scheduled image copy processing for all of these and were then done.

I hope you liked this month’s topic.

As always, any questions or comments would be most welcome!

TTFN,

Roy Boxwell

26. September 201614. Januar 2021

2016-09 Solving SEQUENCE or space problems in Db2

What do you do if your critical production tablespace reaches its maximum of 32 datasets on a Saturday?

Could you actually get the REORG through before prime time starts on Monday morning?

I have recently been involved with trialing and testing our space monitor software (SpaceAssuranceExpert or SAX) on Db2 z/OS after some enhancements had been added. It was originally designed—many moons ago—to monitor the size of the secondary extents that Db2 was using and to dynamically issue an ALTER SECQTY to guarantee that the maximum size of the object was reached *before* users ran out of physical extents.

MGEXTSZ to the Rescue?

Now you might be wondering “What’s that got to do with the price of beef?” because, as we all know, Db2 V8 introduced a “sliding scale” to the size of the secondary extents so that it could also guarantee that a dataset hit its maximum size *before* you ran out of extents.

Extents are not everything

The “problem” is that extents are not everything. In fact, one major area of concern is the number of datasets. If it is 01:00 on a Saturday morning and your critical production tablespace has reached its maximum of 32 datasets – what are you going to do? Could you actually get the REORG through before prime time starts on Monday morning? Or what happens when Partition 26 completely fills up?

SAX to the rescue!

This is where our SAX tool saves the day. It is an STC that runs 24×7 catching the IFCIDs that Db2 throws whenever it issues an extent request for a dataset. Using the Db2 Catalog, SAX then determines the exact make-up (geometry) of the object being extended and can use two levels of warning percentages to start triggering alarm bells way, way before it all goes pear-shaped!

Here is my little “ready-reckoner” for Linear Dataset Allocations:
Space and sequence problems in DB2 z/OS; degenerated extent

Object type: TABLESPACE        !   Maximum number of data sets 
-------------------------------+------------------------------- 
LOB tablespaces                !   254                         
-------------------------------+------------------------------- 
Non-partitioned tablespaces    !   32                          
-------------------------------+------------------------------- 
Partitioned tablespaces        !    1 (Percent used check)      
-------------------------------+------------------------------- 
Partitioned By Growth          !   MAXPARTITIONS. LPS check if 
tablespaces                    !   more than one. If on last   
                               !   partition then percent used.
-------------------------------+------------------------------- 
                                                                  
Object type: INDEX             !   Maximum number of data sets 
-------------------------------+------------------------------- 
Non-partitioned indexes on     !   MIN ( 4096 , 2 power 32 /   
tablespace with LARGE,         !       ( DSSIZE / TS PGSIZE)) 
DSSIZE, or more than 64        !   Eg: 128 GB DSSIZE with      
Partitions                     !         8 KB Tablespace Page  
                               !   gives 256 Pieces (datasets) 
                               !   Or    4 GB DSSIZE with      
                               !         4 KB Tablespace Page  
                               !   gives 4096 Pieces (datasets)
-------------------------------+------------------------------ 
Non-partitioned indexes        !   32                          
otherwise                      !                             
-------------------------------+------------------------------ 
Partitioned indexes            !   1 (Percent used check)      
-------------------------------+------------------------------

Here you can see that it is not as easy to calculate how many datasets are allowed as it used to be. You must also make sure you understand PBG space definitions. SAX allows two percentages and uses them in two different ways:

The number of datasets that have been allocated
The used space within a linear dataset

The second is also used if it is a PBG with MAXPARTITIONS 1, (e.g. The Db2 Catalog), or if the Partition being extended is the last allowable Partition.

How big can my PARTITION get?

There is a full description in the SQL guide all about the maximum size of a partition. Here is a little summary of this info:

Use the DSSIZE parameter to control how big a partition is (or for LOB spaces how big the LOB space can get):

1G	1 Gigabyte
2G	2 Gigabytes
4G	4 Gigabytes
8G	8 Gigabytes
16G	16 Gigabytes
32G	32 Gigabytes
64G	64 Gigabytes
128G	128 Gigabytes
256G	256 Gigabytes

To specify a value greater than 4G, the data sets for the table space must be associated with a DFSMS data class that has been specified with extended format and extended addressability.

How does the number of partitions affect my size?

If NUMPARTS is used along with DSSIZE then the maximum size of each partition depends on the value of NUMPARTS, as shown in the following list. Otherwise, the maximum size of each partition defaults to 4G.

Value of NUMPARTS Maximum partition size (default for DSSIZE)

1 to 16	4GB (4G)
17 to 32	2GB (2G)
33 to 64	1GB (1G)
65 to 254	4GB (4G)

How does my size affect the number of partitions?

If NUMPARTS is greater than 254, the maximum partition size (and the default for DSSIZE) then depends on the actual page size of the table space.

Page size	Maximum partition size (default for DSSIZE)
4K	4GB (4G)
8K	8GB (8G)
16K	16GB (16G)
32K	32GB (32G)

If DSSIZE is explicitly specified, the maximum number of partitions that can be specified, or is the default, is limited by the maximum table space size. For example:

For a partitioned table space with a 4K page size, if DSSIZE 64GB is specified, the maximum NUMPARTS value is 256.
For a partitioned table space with an 8K page size, if DSSIZE 64GB is specified, the maximum NUMPARTS value is 512.
For a partitioned table space with a 32K page size, if DSSIZE 128GB is specified, the maximum NUMPARTS value is 1024.

Special rules for LOBs

For LOB table spaces, if DSSIZE is not specified, the default for the maximum size of each data set is 4GB. The maximum number of data sets is 254.

What about PBGs?

To use these UTS types you must specify the MAXPARTITIONS clause. It specifies that the table space is a partition-by-growth table space. The data set for the first partition is allocated unless the DEFINE NO clause is specified for the partition. The data sets for additional partitions are not allocated until they are needed. (Unless you use the NUMPARTS clause)

You specify the maximum number of partitions to which the table space can grow, which must be in the range of 1 to 4096, also depending on the corresponding values of the DSSIZE and page size clauses.

How does my MAXPARTITIONS affect my size?

The maximum value for MAXPARTITIONS is a function of DSSIZE and table space page size:

DSSIZE value	4K page	8K Page	16K page	32K page
1G – 4G	4096	4096	4096	4096
8G	2048	4096	4096	4096
16G	1024	2048	4096	4096
32G	512	1024	2048	4096
64G	254	512	1024	2048
128G	128	256	512	1024
256G	64	128	256	512

WTO to Job Ticket

These warnings are issued as WTOs and can easily be picked up by system automation tools to open job tickets or send e-mails to alert DBAs—days or weeks before the system stops working.

For a warning SAX issues WTO ids:

O2RTSU04 - 12W (non-partitioned spaces)

O2RTSU04 - 14W (partitioned spaces)

O2RTSU04 - 16W (partition by growth spaces)

For a critical SAX issues WTO ids:

O2RTSU04 - 13W (non-partitioned spaces)

O2RTSU04 - 15W (partitioned spaces)

O2RTSU04 - 17W (partition by growth spaces)

This is not all that SAX does, in fact, it covers all of these problems:

Can this data set reach its maximum physical size *before* running out of physical extents? (The actual size is dependent on the “geometry” of the object of course!)
Will this dataset run out of datasets? (Again, how many datasets an object can actually have is dependent on the “geometry” of the object)
Is this partition nearing its maximum size?
Do I have a SEQUENCE/IDENTITY problem coming up?
Did Db2 ask for one extent but got more back?
Are any of my SMS disk storage pools running out of space?

New in SAX – SEQUENCE Support

Number four on that list is brand new. We have a customer who got bitten by a nasty problem. They had a SEQUENCE defined with the NO CYCLE parameter so it could not loop around and then finally they hit the last available number. Not good! They asked if SAX could be modified to also take care of this hidden nasty and we readily agreed so that all customers can benefit. The parameters panel got this set of new Parameters:
Space and sequence problems in DB2 z/OS; degenerated extent
As mentioned in the screen shot above from our online help panel, it will check the SYSSEQUENCES every PING minutes which, by default, is 30. When the WARN SUPP INTVL is set, you can reduce the number of warnings issued so as not to overload your problem ticket System!

Degenerated Extent support

There is another WTO that can be issued by degenerated extents (Number five in the list)

O2RTSU04 – 10W (Audit SECQTY)

These occur when Db2 requests one extent but gets back, say, five. This is ok, but it eats through the number of extents quite quickly and it implies the need for a disk defragmentation to be scheduled.

Let’s talk about Extents

While talking about extents, what I have also seen, is that the number of extents is sometimes getting very large indeed. At one customer site they had numerous datasets with over 4,000 extents! Now we all know that no-one knows where data is really stored on the modern disk sub-systems, but still… I would schedule a reorg at say 1,000 extents. The number of extents changed a *long* time ago in z/OS 1.7 to raise it from 255 to 7,257, spread over 59 volumes, *but* still limited to 123 extents per volume. This little nugget of information is *very* important if you are thinking of going down the “one huge EAV volume for all my data” road, (these disks can have up to 262,668 cylinders or about 223GB), as the extents per volume limit is still there.

In comparison, the good old MOD-3s had 3,339 cylinders and 3GB of space.

SMS Storage Group Checks

Finally, SAX can also check and alert if your SMS storage groups start getting full. This is especially handy for your Db2 Catalog, Copy Pools and Work Pool SMS Storage groups. Depending on the thresholds you define you can get either

O2RTSU05 - 05W SMS STOGROUP XXXXXXXX: % ALLOC = XXXX

O2RTSU05 - 05W SMS STOGROUP XXXXXXXX: GB FREE = XXXXXXXXXXXX

WTOs being issued. Not really normal DBA work, but very handy nevertheless!

The Future?

SAX will continue to be updated and enhanced for the new features and functionality that Db2 brings in future releases. For example: Relative Page Numbers in Db2 12, where all partitions can get their own DSSIZE with seven byte RIDs.

Of course, you could do all of this on your own too—but then you’d have to maintain it! And that’s enough of me being a sales guy. Back to what I really love the most: solving Db2 problems.

As usual any questions or comments are welcome,

TTFN Roy Boxwell

26. August 201614. Januar 2021

2016-08 – Is it Safe? How to recover accidently dropped tables

Do you have the DDL anywhere?
Was there a “recent” image copy or disk back-up? Who can you call for help?

The newsletter title this month is really nothing to do with the film “Marathon Man”, but sometimes backup and recovery can feel just like having your teeth drilled… Anyway, the title is actually meant to get you to think again about your site’s back-up and recovery definitions—specifically that age-old chestnut about “accidently” dropped objects.

RECOVER from DROP?

I saw in LISTSERV a discussion about recovering from dropped tables and tablespaces. It is a pretty horrible situation when you realize that you just confirmed the drop of a test table to then suddenly see that as your finger is descending towards the ENTER key, that it is, in fact, a different name…

recover accidently dropped db2 zos table based on DDL extraction

DDL and Back-ups Handy?

Normally, at this point, the air is filled with colorful language and interesting local metaphors. Once it has calmed down a bit the real work starts: Do you have the DDL anywhere? Was there a “recent” image copy, or a disk back-up? Who can you call for help?

Mirror Mirror on the wall

Remember that mirroring etc. will not help you as the DROP has also been successfully mirrored. So within a moment, the data was also dropped at your disaster recovery site. (Argh!)

Newbies then start looking in SYSIBM.SYSCOPY for the last image copies, while us grey-haired oldies more mature experts start looking in production control copy libraries and BETA92!

The heat is on!

At this point a couple of things happen: The telephone starts ringing and a manager-type person materializes to ask annoying questions all the time. Typically: “How long will it take to get the data back?”

Now you have two possibilities:

1. You are in luck! Someone somewhere extracted all the DDL for the table(s) with DBID, PSID, and OBID(s) so you could generate a DSN1COPY job from the last found Image Copy dataset

recover accidently dropped db2 zos table based on DDL extraction

2. You are *not* in luck! You have no idea how the table looks, and you cannot see if anyone ALTERed it in the last five years or so.

Crashed and Burned

If you are in position 2, it is now a good time to update your CV and make sure your desk is neat and tidy…Now you might have access to some nifty third party vendor tools, but for that you must at least have the dataset name of the last Image Copy and—of course—the third party tool itself! Or the ability to read the DB2 log and “resurrect” the table defs from there—Not a pretty place to go I assure you!

Back up that chain of thought for a moment

So let’s rewind and imagine that you are doing this all differently… What about beginning today with an extract of all the DDL on your system and then capturing all of the IFCID 62’s that list out any DDL changes, going further what about getting all of the IFCID 220’s to get dataset allocations correlated with Utility IDs and DBIDs and PSIDs.

Imagine what you could then do?

Wow! Cavalry over the hill

Yep, you have a ”history” of all the DDL that has happened on your machine right up to the point when your object was DROPed *and* you have the dataset name(s) of the last image copies as well as all the internal IDs to enable a successful DSN1COPY job complete with OBID translation! Cool huh? So suddenly you are now a hero instead of a villain!
recover accidently dropped db2 zos table based on DDL extraction

recover accidently dropped db2 zos table based on DDL extraction

DIY or Buy in?

So much for the theory – What about in practice? Well you can write it all yourself or you could use this as a sort of useful side effect from our WorkloadExpert (WLX) software which has all this built in! We already get all these IFCIDs, we already have a DDL Generator bundled with the WLX Software, for the Audit Use Cases, and so it really kills two birds with one stone!

Restricted movement?

Now, of course, you could argue:“Wait! I have RESTRICT ON DROP set for all my productive tables!” Now this works really well for accidental drops, but I have seen lots of places where it should be used but was in fact forgotten. How can you check? Run this little SQL to validate that what you think is true really is true:

recover accidently dropped db2 zos table based on DDL extraction

SELECT COUNT(*) AS TABLES                                     
       ,COUNT(CASE WHEN CLUSTERTYPE = 'Y' THEN 1 ELSE NULL END) 
                 AS DROP_RESTRICT                               
       ,COUNT(CASE WHEN CLUSTERTYPE = ' ' THEN 1 ELSE NULL END) 
                 AS NOT_DROP_RESTRICT                           
FROM SYSIBM.SYSTABLES                                         
WHERE     TYPE   IN ('T' , 'M' , 'H' , 'R' )

  AND NOT DBNAME IN ('DSNDB01' , 'DSNDB06' )

WITH UR
 ;

I hope the results don’t have you feeling like Dustin Hoffman in the movie.

As usual any questions or comments are welcome,

TTFN Roy Boxwell

28. Juni 201614. Januar 2021

2016-07 – Migration Planning: Stuck in a BIND

Migration planning to DB2 11 z/OS:

Two queries to list the NULLID and the PLAN packages that need a DB2 REBIND

I was at a customer site the other week and we were getting weird data out of the IFCIDs to read the Dynamic Statement Cache (DSC) – in fact, it was ending with an RC 00E60833, which was very odd as that has nothing to do with the DSC!

Old Structure problem?

Anyway, I had a hunch that perhaps old structures were causing grief as, at this site, the problems were happening only for about four hours overnight; never during peak load and absolutely never, ever in Test or QA.

HealthCheck time!

So I ran our little PerformanceHealthCheck freeware and, among a ton of other info, got the following data:

PHC009W Packages last bound in DB2 V0710         :     358

Now this is OK, as long as they are *not* being executed! So, looking into our WorkloadExpert tool, we could see that there were lots of SQLs—both static and dynamic—that were running in these DB2 V7 last bound packages.

NULLID Problem?

What was especially worrying, was the high number of NULLID packages that were in the list. These NULLID Collections are normally used by JAVA and other remote access routines. They are normally always allocated and in-use—making a REBIND a bit of a challenge. I wrote a little SQL to list out the possible bad guys:

SELECT SUBSTR(A.COLLID, 1, 18) AS COLLID 
, SUBSTR(A.NAME, 1, 8) AS PACKAGE
, A.RELBOUND 
, A.LASTUSED
FROM SYSIBM.SYSPACKAGE A 
WHERE A.RELBOUND IN (' ', 'K', 'L', 'M', 'O')
AND   A.LASTUSED > CURRENT DATE - 14 DAYS 
ORDER BY A.LASTUSED DESC, A.NAME 
WITH UR
;

The contents of the column RELBOUND are:

Blank is before DB2 V7, K is DB2 V7, L is DB2 V8, M is DB2 9, O is DB2 10, and P is DB2 11.

Of course your query must be changed, depending on what level of DB2 system you are currently running on. If running on a DB2 10 NFM then remove the ‘O’ in the IN list. It will then return all executed (so in use!) packages that have run within the last 14 days – Feel free to change the ORDER BY too, if you like.

Our output was quite scary as lots of packages were there.

REBIND the world

The DBAs rebound all of the static packages that had been executed this year, (not quite the world, but still way more than I thought could still be running anywhere in the world!) and then, on a Sunday morning, when they could “break in” – they rebound all of the NULLID packages on the list.

Since then – no problem! But is this the end of my story? Naturally not!

Now this particular customer is on DB2 10 NFM, and so these very old structures made we wonder…I did a post on LISTSERV about this. Pat Bossman, from IBM, answered (Corrections from Pat came in a later mail and I have cut out some extra clutter) :

Hello,

For migration to DB2 10, packages last bound [before] DB2 6 or higher are autobound.

http://www.ibm.com/support/knowledgecenter/SSEPEK_10.0.0/com.ibm.db2z10.doc.inst
/src/tpc/db2z_relincompatapplsqlfromv9.dita?lang=en

For migration to DB2 11, it’s [before] Version 9 and higher.

http://www.ibm.com/support/knowledgecenter/SSEPEK_11.0.0/com.ibm.db2z11.doc.inst/
src/tpc/db2z_relincompatapplsqlfromv10.dita?lang=en

With RELBOUND of V7, you should hit it in on migration to DB2 11. So yes, you should REBIND those packages on DB2 10 to avoid post-migration autobind.

I’m looking into the nullid package issue. The assumption if a structure does not have embedded SQL it does not require REBIND is false. We still require PLAN REBINDs, or they also are autobound on migration – even if the PLAN does not have DBRMs anymore. There are still structures in there that need updating.

(Don’t forget about PLAN REBINDs!)

Best regards,

Pat

PLAN Ahead

So now you know! Even PLANs should get a REBIND *before* you migrate, here you can use our EarlyPrecheck or BIX software to aid in the pre-migration checklist, in order to stop any AUTOBIND funnies that could occur. In fact, I would recommend REBINDing the PLAN after successful completion of the DB2 release migration. It will save you a step in a year or two.

Finally I wrote another little SQL to give the “overview” of all “old” PLANs:

-- LIST OUT ALL EXECUTED PLANS BOUND IN "OLDER VERSIONS"
 -- OF DB2
 SELECT C.NAME AS PLAN
 FROM SYSIBM.SYSPLAN     C
 WHERE C.RELBOUND IN (' ', 'K', 'L', 'M', 'O'))
 ORDER BY 1
 WITH UR
 ;

(Again, remove the ‘O’ is you are in DB2 10 NFM.) This then gives you, hopefully, a small list of PLANs that need a REBIND and you are ready to go!

I hope you liked this month’s topic.

As always, any questions or comments would be most welcome!

TTFN,

Roy Boxwell

17. Mai 201614. Januar 2021

Can you guess how interesting the DSN_PREDICAT_TABLE really is?

DB2 z/OS literal replacement:

Do you know the queries to list the Dynamic SQL which have literals or Parameter markers?

Here is a very handy method to (fairly) quickly see if you have a literal problem that is killing the DSC or causing access path problems.

I was asked by a customer the other week, to help them out with a classic DB2 SQL Performance problem involving dynamic SQL. Should Literals or Parameter markers be used in dynamic SQL? Now, of course, the classic answer is: “It depends.” But this customer wanted to go a stage further and get a list of all the dynamic SQL which had literals in the text.

EXPLAIN to the rescue!

Naturally an EXPLAIN must be the starting point, because manually parsing the SQL text is incredibly difficult. If you have paid for the DB2 Optimizer—and the ability of EXPLAIN to output to seventeen tables—why not use it and see what you get?

Manual time

After trolling through various manuals, it quickly becomes apparent that the table of choice is the DSN_PREDICAT_TABLE, as it contains at least one row for each predicate and, after all, it is predicates that are truly interesting for the optimizer. (Naturally DSC usage also hangs on the use, or not, of literals also in the SELECT—but that is another story!)

What is in the table?

The contents are quite simple really:

SELECT  A.QUERYNO                 Identification

      , A.QBLOCKNO                Identification

      , A.APPLNAME                Identification

      , A.PROGNAME                Identification

      , A.PREDNO                  Identification

      , A.TYPE                    Type of op AND, OR, EQUAL etc

      , A.LEFT_HAND_SIDE          Column/Value/colexp etc

      , A.LEFT_HAND_PNO           Child predicate number

      , A.LHS_TABNO               Table no.

      , A.LHS_QBNO                Query Block no.

      , A.RIGHT_HAND_SIDE         As left hand

      , A.RIGHT_HAND_PNO          As left hand

      , A.RHS_TABNO               As left hand

      , A.RHS_QBNO                As left hand

      , A.FILTER_FACTOR           Estimated FF

      , A.BOOLEAN_TERM            Whole WHERE is Boolean?

      , A.SEARCHARG               DM or RDS?

      , A.JOIN                    Simple join or not?

      , A.AFTER_JOIN              Predicate after/during join?

      , A.ADDED_PRED              T Transitive clos., B Bubble,
                                  C correlation,

                                  J Join, K like, L local, P push down,

                                  R page range, S simplification

      , A.REDUNDANT_PRED          Is the predicate redundant?

      , A.DIRECT_ACCESS           ROWID Possible?

      , A.KEYFIELD                Is the predicate in indexes?

      , A.EXPLAIN_TIME            Identification

      , A.CATEGORY                IBM Internal use

      , A.CATEGORY_B              IBM Internal use

      , A.TEXT                    First 2000 bytes of text

      , A.PRED_ENCODE             IBM Internal use

      , A.PRED_CCSID              IBM Internal use

      , A.PRED_MCCSID             IBM Internal use

      , A.MARKER                  Host vars, parameter markers,
                                  special regs

      , A.PARENT_PNO              If a root predicate then zero

      , A.NEGATION                Is NOT used?

      , A.LITERALS                Literals separated by colons

      , A.CLAUSE                  HAVING, ON, WHERE or SELECT

      , A.GROUP_MEMBER            Identification

      , A.ORIGIN                  Origin of predicate. Blank, C, R or U

      , A.UNCERTAINTY             Level of uncertainty

      , A.SECTNOI                 Identification

      , A.COLLID                  Identification

      , A.VERSION                 Identification

--V11 , A.EXPANSION_REASON        Archive or Temporal table indicator

FROM BOXWELL.DSN_PREDICAT_TABLE A

The first four columns are used for joining to your existing EXPLAIN tables (PLAN_TABLE etc.)

Columns of interest

Now what interested me straightaway, were the columns MARKER and LITERALS. Looking in the Managing Performance documentation you see that these columns:

MARKER CHAR(1) NOT NULL WITH DEFAULT

Whether this predicate includes host variables, parameter markers, or special Registers.

LITERALS VARCHAR(128) NOT NULL

This column indicates the literal value or literal values separated by colon symbols.

So now it looks quite easy just do a select where LITERALS is non blank. Of course that fails miserably…