2024-12 Security & Audit Check

Hi all! Welcome to the end-of-year goody that we traditionally hand out to guarantee you have something to celebrate at the end-of-year party! This time, I wish to introduce to you a full vulnerability check of your Db2 for z/OS systems.

DORA!

You should all be aware, and scared, of DORA by now. If not, read my prior newsletter 2024-11 DORA or check out my IDUG presentation or my recorded webinar. Whatever you do, you must get up to speed with DORA as it comes into force on the 17th Jan 2025 which is only one month away from the publishing date of this newsletter!

Not just Born in the USA!

Remember, DORA is valid for the whole wide world, not just businesses within the EU. If you do *any* sort of financial trading within the EU you are under the remit of DORA, just like you are with GDPR! Even if you are not trading within the EU block, doing a full vulnerability check is still a pretty good idea!

PCI DSS V4.0.1

We also now have the Payment Card Industry Data Security Standard (PCI DSS) V4.0.1 going live at the end of March 2025… Coincidence? I don’t think so. Mind you, at least the Americans do not fine anyone who fails!

What are We Offering?

This year, the product is called the SecurityAudit HealthCheck for Db2 z/OS or SAC2 for short. It is a very lightweight and robust tool which basically does all of the CIS Vulnerability checks as published by the Center for Internet Security (CIS) in a document for Db2 13 on z/OS:

CIS IBM Z System Benchmarks (cisecurity.org)

https://www.cisecurity.org/benchmark/ibm_z

This contains everything you should do for audit and vulnerability checking and is well worth a read!

Step-By-Step

First Things First!

The first thing SAC2 does, is evaluate *all* security-relevant ZPARMs and report which ones are not set to a „good“ value. It then goes on to check that any default values have not been left at the default value. This especially means the TCP/IP Port number, for example. Then it finishes off by validating that SSL has been switched on for TCP/IP communications and that any and all TCP/IP ALIAS defs also have the correct settings.

Communication is Important!

Next up, is a full evaluation of your Communication Data Base (CDB). This data has been around for decades and started life for SNA and VTAM connections between Host Db2s. These days, SNA is dead and most of the connections are coming from PCs or Servers. That means that there *could* be a lot of dead data in the CDB and, even worse, ways of connecting to your mainframe that you did not even know, or forgot, existed! Think plain text password with SQLID translation for example!

Danger in the Details

Naturally, blindly changing CDB rows is asking for trouble, and if SAC2 finds anything odd/old/suspicious here, you must create a project to start removal. There is a strong correlation between „Oh I can delete that row!“ and „Why can’t any of my servers talk to the mainframe anymore?“. The tool points out certain restrictions and pre-reqs that have to be done *before* you hit the big button on the wall! JDBC version for example.

Taking it All for GRANTed?

GRANTs can be the root of all evil! GRANT TO PUBLICs just make auditors cry, and use of WITH GRANT OPTION makes them jump up and down. Even IBM is now aware that blanket GRANTing can be dangerous for your health! SAC2 analyzes *all* GRANTs to make sure that PUBLIC ones are discovered on the Catalog and Directory as these should NEVER be done (with the one tiny exception of, perhaps on a good day when the sun is shining, the SYSIBM.SYSDUMMY1), then further checking all User Data as PUBLIC is just lazy. Checking everything for WITH GRANT OPTION is just making sure you are working with modern security standards!

Fun Stuff!

These days you should be using Trusted Contexts to access from outside the Host. This then requires Roles and all of this needs tamper-proof Audit Policies. On top of all this are the extra bits and pieces of Row Permissions and Column Masks. All of these must be validated for the auditors!

Elevated Users?

Then it lists out the group of privileged User IDs. These all have elevated rights and must all be checked in detail as who has what and why?

Recovery Status Report

Finally, it lists out a full Recovery Status Report so that you can be sure that, at the time of execution, all of your data was Recoverable.

It is a Lot to Process!

It is indeed. The first time you run it, you might well get tens of thousands of lines of output but the important thing is to run it and break it down into little manageable sections that different groups can then work on. This is called „Due Diligence“ and can save your firm millions of euros in fines.

Lead Overseer

Great job title, but if this person requests data then you have 30 days to supply everything they request. Not long at all! SAC2 does the lion’s share of the work for you.

Again and Again and Again

Remember, you must re-run this vulnerability check on a regular basis for two major reasons:

  1. Things change – Software, Malware, Attackers, Defenders, Networks, Db2 Releases etc.
  2. Checks get updated – The auditors are alway looking for more!

Stay Out of Trouble!

Register, download, install and run today!

I hope it helps you!

TTFN

Roy Boxwell

Future Updates:

The SAC2 licensed version will be getting upgraded in the first quarter of 2025 to output the results of the run into a Comma Separated File (CSV) to make management reporting and delegation of projects to fix any found problems easier. It will also get System Level Backup (SLB) support added. SLB is good but you *still* need Full Image Copies! Further, it will be enhanced to directly interface with our WLX Audit product.

2024-11 DORA

A title that should strike fear into the hearts of all readers! Nah! I really hope not! This month, I wish to run through the parts of DORA that impact the Db2 for z/OS world mostly…

What is DORA?

Dora is the Digital Operational Resilience Act and it was created on the 14th of December 2022 with an enablement date of 17th January 2025, giving us all over two years to read it, understand it, action the requirements from it, and accept it fully into our computing culture.

How many of those things have you accomplished in the last two years? You mean you had a day job as well?

Not just Born in the USA!

DORA is valid for the whole wide world, not just businesses within the EU. If you do *any* sort of financial trading within the EU you are under the remit of DORA, just like you are with GDPR!

PCI SSC DSS new update!

As well as DORA, there is a „new“ version of Payment Card Industry Security Standards Council Data Security Standard 4.0.1. It comes into force on the 31st March 2025 and it contains a lot of overlap with „our“ DORA here in the EU! Here’s a link to their website.

What is the Aim of DORA?

The idea is to bring together all the disparate EU regulations into one new regulation for nearly every financial trading house (FINTEC), apart from a few exclusions e.g. microenterprises, thus simplifying the requirements and easing audit and control.

Is it Just a New Level of Audit?

Most definitely not! As the name suggests, Digital Resilience is all about surviving an attack, or a disaster, and being back and processing data (money) as soon as possible. Included within is indeed a bunch of auditable things, but I will get to them later in this newsletter.

What does DORA Cover then?

Security, Operations, Recoverability and Test. Not only these, but these are, at least for me, the biggies. The last of them – Test – is incredibly important in what they mean by „Test“. They mean Performance Test, Reliability Test and Vulnerability Test. These are not all new for us but some are. We all remember GDPR and friends, where types of data had to be „respected“ otherwise you got a hefty fine. Now, with DORA, the way you work, run, update and check your systems must all be proven and reported. If you do not deliver you get – guess what? – hefty fines!

Who’s the Boss?

You might have read about this, or seen a presentation, but the absolute „boss“ for this regulation is the English PDF version here:

All other versions are translated and so may have errors.

Just the Facts, Ma’am

Well, no, I am not actually going to list all the facts of DORA here, but the highlights for me are the following opening paragraphs:

46 Mandates vulnerability testing

48 Maintained systems (Current release/PTF/APAR etc.)

49 & 50 Recoverability and RTO

56 Performance, Testing and Scanning

This is a brave new world for lots of us! One of the buzzwords is ICT which is basically IT.

Book, Chapter and Verse

Chapter 2 Section II Article 6 ICT risk management framework Paragraphs 2, 4 & 6

This is all about risk management and covers unauthorized access, segregation of duties and regular internal audits.

Chapter 2 Section II Article 8 Identification Paragraphs 1, 3 & 7

Hammers home the requirement for regular updates and risk assessments after major changes.

Chapter 2 Section II Article 9 Protection and prevention Paragraphs 1, 2, 3 & 4

This is pure audit: Monitor continuously what is happening, get tooling and policies in place. Make sure that all data is secure at rest, in use and in transit. Validate authenticity and guarantee strong authentication.

This is the most important part for me – It means encrypt at rest, use SSL for *all* things remote, do not use technical user ids with passwords, use Certificates and/or use Trusted Contexts and implement at least MFA for all users. The „in use“ part is a bit crazy, especially for Db2, but I am pretty sure that the Bufferpool is „trusted“ and so we, as z/OS users, can ignore this part for now…

Chapter 2 Section II Article 10 Detection Paragraphs 1 & 3

Detect whether weird stuff is happening and monitor user activity!

Chapter 2 Section II Article 11 Response and recovery Paragraphs 1 & 3

Make sure you have the ability to recover quickly, and in a timely manner, making sure you have response and recovery plans all ironed out.

Chapter 2 Section II Article 12 Backup policies Paragraphs 1 & 2

Guarantee that your image copies are enough and available to minimize downtime and limit disruption. These must be regularly tested!

Chapter 4 Article 24 Testing Paragraphs 1 & 2

Test! Test and test again – at least yearly!

Chapter 4 Article 25 Testing of ICT tools and systems Paragraphs 1 & 2

Performance Testing, Penetration Testing and Vulnerability Testing.

Chapter 4 Article 26 Advanced testing and TLPT Paragraphs 1, 2 & 6

More advanced testing including a Thread-Led Penetration Test on live production systems at least every three years! All must be documented of course…

Chapter 5 Article 35 Powers of the Lead Overseer Paragraphs 1, 6, 7 & 8

Lead Overseer – I get this mixed up with Supreme Leader all the time…

The Lead Overseer is the DORA God in a given country and can simply ask for any of the details I have just listed. Failure to deliver the goods within 30 days (Calendar days, not working days!) will result in fines…

I am Fine!

Well, the fines are pretty nasty… The Lead Overseer can fine any firm up to 1% of the average daily turnover from the previous financial year. This is then compounded by the fact that the Lead Overseer can levy this fine *every* day for up to six months!!! Ouch!!!

An Example Calculation

Taking a large bank as an example, just to show the math. The turnover in 2023 was nearly 60 billion euros. Divided by 365 gives us 164 million euros per day. Taking 1% of this (worst case) gives 1.64 million euros. Assuming the Lead Overseer is being especially nasty and levels the fines for 182 days leads to around 298 million euros in fines.

I, for one, do *not* want to be the first firm in Europe getting this… and it is all in the public domain which is then a massive image loss as well!

What can I do?

Well, first up, make sure all data is encrypted at rest – This should be a no-brainer due to modern disks/SSDs anyway.

Then, make sure that *all* remote access is using the SECPORT and is being encrypted in flight – again, this should be easy but remember to then set the PORT to be the same as the SECPORT which then forces all of this. Do not forget to check your TCP/IP ALIASs!

Do a full recoverability test to guarantee that you have all the Image Copies, Logs, Archive logs that you require to actually do a full recovery. If you can also meet your RTOs here then even better! Here our RealTime DBAExpert (RTDX) software can really help out, with timely Image Copies and a verification of complete recoverability.

Audit your Db2 Systems!

Do a Vulnerability Test on your Db2 Systems!

Audit?

I have done a lot of blogs and Webinars just about Audit, so I will spare you the details, but you must actually get it done. You will almost certainly require an external auditor who does a final check/validation that your audit is good to go and then you are done. Here, our excellent WLX Audit product can suddenly become your best friend!!!

Feeling Vulnerable Today?

The Center for Internet Security (CIS) has released a document for Db2 13 on z/OS:
CIS IBM Z System Benchmarks

It contains everything you should do for Audit and vulnerability checking and is well worth a read and then action the reports within.

Docu Docu Docu

All of these things must be performed, documented and repeated on a regular basis and sometimes even after a simple „change“ has occurred.

The world is a bad place and DORA is here to help us really, but the start will, as always, be a hard climb!

TTFN,

Roy Boxwell

2024-10 Soundex and other cool features part eight(!) for Db2 z/OS 12 and 13

In this, obviously, never ending series of new features, I will roll up all the new ones since my „SOUNDEX and other „cool“ features – Part seven All new for Db2 12“ newsletter from 2021-05.

Starting with Db2 12 – PTFs first

APAR PH36071 added support for the SUBSTR_COMPATIBILITY ZPARM parameter with default PREVIOUS and other valid value CURRENT. In Db2 12 FL500 and above, with this APAR applied and the value set to CURRENT, then the SUBSTR Built-in Function (BiF) will return an error message for invalid input.

APAR PH42524 added MULTIPLY_ALT support to the IBM Db2 Analytics Accelerator (IDAA).

APAR PH47187 added support for UNI_90 locale in the LOWER, TRANSLATE and UPPER BiFs.

APAR PH48480 added LISTAGG and RAND to the IDAA offload support. Note: you must enter YES in ENABLE ACCELERATOR SPECIFIC RESULTS field on panel DSNTIPBA to get this boost.

Db2 12 FL 100

Yes, they introduced a new BiF for this level way after I wrote my last newsletter all about BiFs and Functions. The new BiF is the BLOCKING_THREADS table function. This is very handy when DBAs are about to start doing DDL work. Adding or ALTERing a Column or what have you! The output is a table about who is blocking access to the database in question and can really save a massive amount of work if you can check *before* you do all your ALTERs that you can indeed succeed in getting them all done!

Now Db2 13 – PTFs first

APAR PH47187 added support for UNI_90 locale in the LOWER, TRANSLATE and UPPER built-in functions (BiFs).

APAR PH48480 added LISTAGG and RAND to the IBM Db2 Analytics Accelerator (IDAA) offload support. Remember, you must enter YES in ENABLE ACCELERATOR SPECIFIC RESULTS field on panel DSNTIPBA.

APAR PH51892 introduced vector prefetch for SQL Data Insights and improves the BiF AI_SEMANTIC_CLUSTER. You must also go to Db2 13 FL504.

APAR PH55212 enhanced SQL Data Insights and added support of numeric data types to the BiF AI_ANALOGY.

Db2 13 FL500

This was the release that introduced the Db2 SQL Data Insights with the new BiFs AI_ANALOGY, AI_SEMANTIC_CLUSTER and AI_SIMILARITY.

Db2 13 FL504

A new AI BiF: AI_COMMONALITY was released and when you use LISTAGG you can now add an ORDER BY to the full select.

Db2 13 FL505

Another new BiF: INTERPRET which can change nearly any argument to nearly any other data type. The most useful thing you can do with it is something like:

INTERPRET(BX'0000000000B0370D' AS BIGINT)    --     11548429

So this is taking a hex RID and interpreting it as a BIGINT. This is very useful when you get RID problems with the LOAD utility, for example. You can then simply plug in the BIGINT value into a query like:

SELECT * FROM TABLE1 A WHERE RID(A) = 11548429;

And then you’ll find the bad guy(s) very easily!

Naturally, I will be keeping this newsletter up-to-date, as necessary.

Any questions or ideas, do not hesitate to drop me a line,

TTFN,

Roy Boxwell

2024-07 One size fits all?

A really brief blog this month, as I found something out while playing with buffer pool tuning (I have no hobbies!)

Buffer Pools …

There has been a long debate down the years about buffer pool management. How big should they be? How many should you have? Which parameters for which type? And so on. I have written numerous blogs, held IDUG Presentations and discussed myriad times with colleagues and customers about this topic.

Something New???

I found something new! Well, new to me anyways!!!

There was a trend a few years ago, called “big memory” where people were encouraged to recombine many data-sharing members downwards so you went from a 14 way to eight way or six way to four way. This was then backed up with more DBATs and expanded buffer pools. The argument was pretty convincing: Let Db2 do the magic!

Too Much Work!

You, as a puny human, have no chance of really knowing what is going on. So just splitting the buffer pools at the highest possible level is all you can actually reasonably do! The argument went that it is easiest to go for around 9 – 10 buffer pools.

DISPLAY GROUPBUFFERPOOL Done on a Regular Basis?

Of course you do!!!

What I found out recently, is the ominous counter in message DSNB415I titled “PREFETCH DISABLED NO READ ENGINE”. Nowadays, each Db2 subsystem/member of a datasharing system has 900 read engines but before, it was split out in 14 or eight sub-systems and is now being used in far fewer.

Zero Counter No More!

This counter was *always* zero whenever and wherever I checked, until last week… Now I see PREFETCH DISABLED NO READ ENGINE on a regular basis. There is a great blog from Robert Catterall.

In the environments I am looking at, they also have PREFETCH DISABLED NO BUFFER occurrences and well over 100 I/Os /second, so it does indeed mean trouble!

Time to Tune!

All you can do here is:

  • Check your VPSEQT and see if you can nudge it up a bit.
  • Throw some memory at the problem by increasing the VPSIZE and hope that the requested pages are then found in the buffer pool.
  • Move smaller objects into PGSTEAL(NONE) buffer pools thus requiring no PREFETCH.
  • As a last gasp, try and tune the SQL to not use PREFETCH as a method – very tricky of course!

Keep on Truckin‘

But let’s be honest for a minute, if you have 900 read engines all humming along then your system is really running under pressure and you should *expect* to get this counter, I guess!

As Robert says: No need to panic!

TTFN,

Roy Boxwell

2024-06 Time for a change?

This month I wish to share an interesting voyage of discovery to do with TIMESTAMP calculations.

How Interesting …

It all started decades ago, when I wanted to find out how many microseconds there were between two timestamps. A simple idea for a trace in various programs which, when analyzed, could highlight problematic code paths.

Use DISPLAY Stupid!

Just DISPLAY the current timestamp at the start of each SECTION. The trace analysis program would then simply subtract two timestamps – et Voila! You have a difference!

Nope … That Fails

Well, I found out very quickly that math on TIMESTAMPs does *not* work like that! What you actually get is a “duration” which, in my humble opinion, is worthless! Here’s an example:

SELECT TIMESTAMP('2023-12-21-15.06.40.120234') -
       TIMESTAMP('2023-12-21-15.06.40.120034') 
FROM SYSIBM.SYSDUMMY1  ;                       
---------+---------+---------+---------+--------
---------+---------+---------+---------+--------
           .000200

Looks good doesn’t it? Exactly 200 microseconds – as it should be.

Now look at this example:

SELECT TIMESTAMP('2023-12-21-15.06.41.120234') -
       TIMESTAMP('2023-10-22-17.08.55.130234') 
FROM SYSIBM.SYSDUMMY1  ;                        
---------+---------+---------+---------+--------
---------+---------+---------+---------+--------
  129215745.990000

What?

Yep, what the calculation does is really “subtract”… What you get is a decimal 14 containing the YYYY (leading zeroes blanked of course!) years, MM months, DD days HHMMSS then a decimal point and then your usual six digits for microseconds.

Not Good!

This, for me, was not actually usable! So, what I did, was then extract all the fields multiplying by the different values and adding them all together to get a SECONDS field. This gave me what I wanted but was a bit messy.

Months?

Whoever dreamed up the western calendar obviously was not a programmer! 31, 30, 28 sometimes 29 days in a month but then not always… The days in month has been, and always will be, a real PITA. However, IBM Db2 has a nice Built-in Function (BiF) called DAYS (not DAY – That just extracts the day portion of the calendar!)

DAYS ( expression ) – The result is 1 more than the number of days from January 1, 0001 to D, where D is the date that would occur if the DATE function were applied to the argument.

All in UTC and this is *very* handy as it bypasses the days-in-the-month problem completely!

Armed with this it made the COBOL code simpler and better.

TIMESTAMPDIFF to the Rescue!

Then in Db2 V8 came a new BiF – TIMESTAMPDIFF – will it be our savior?

When it was announced I thought: Wow! This is great – it will do all the work for me – just tell it what you want as output and give it two timestamps and you are done!

TIMESTAMPDIFF( numeric-expression, string-expression)

Not Really …

The problem is in the first sentence of the docu that most people do not bother to read:

The TIMESTAMPDIFF function returns an estimated number of intervals of the type that is defined by the first argument, based on the difference between two timestamps.

Seconds is Good?

Naturally, I used 2 (to get seconds) as the numeric-expression and, to begin with, was a happy bunny with the results.

Guestimate?

Then I noticed that all was not well in the world of TIMSTAMPDIFF and that the “estimated” number can be difficult to judge! The problem gets clearer when you review the “assumptions” list at the end of the docu:

The following assumptions are used in estimating a difference:

  • One year has 365 days
  • One year has 52 weeks
  • One year has 12 months
  • One month has 30 days
  • One day has 24 hours
  • One hour has 60 minutes
  • One minute has 60 seconds

Now we all know that this is not true… Not all years have 365 days or even 52 weeks and nearly all months do not have 30 days! Now in my trace program it was just a bit irritating, but if you are using TIMESTAMPDIFF believing you really get the number of DAYS between two dates then it is time to think again!

Seeing is Believing

Here’s an example showing where it first works fine and then one day earlier and it all goes astray:

SELECT                                                              
 TIMESTAMPDIFF( 2, CHAR(TIMESTAMP('2023-12-21-00.00.01') -          
                        TIMESTAMP('2023-10-22-00.00.01')))  AS TSDIFF
,           DAYS(       TIMESTAMP('2023-12-21-00.00.01')) -         
            DAYS(       TIMESTAMP('2023-10-22-00.00.01'))   AS DAYS 
FROM SYSIBM.SYSDUMMY1  ;                                            
---------+---------+---------+---------+---------+---------+---------
     TSDIFF         DAYS                                             
---------+---------+---------+---------+---------+---------+---------
    5184000           60                                             
DSNE610I NUMBER OF ROWS DISPLAYED IS 1                              
SELECT                                                              
 TIMESTAMPDIFF( 2, CHAR(TIMESTAMP('2023-12-21-00.00.01') -          
                        TIMESTAMP('2023-10-21-00.00.01')))  AS TSDIFF
,           DAYS(       TIMESTAMP('2023-12-21-00.00.01')) -         
            DAYS(       TIMESTAMP('2023-10-21-00.00.01'))   AS DAYS 
FROM SYSIBM.SYSDUMMY1  ;                                            
---------+---------+---------+---------+---------+---------+---------
     TSDIFF         DAYS                                            
---------+---------+---------+---------+---------+---------+---------
    5184000           61                                            
DSNE610I NUMBER OF ROWS DISPLAYED IS 1

 As you can easily see, the day has changed by one but the TSDIFF has not…Not good! The problem is naturally caused by going over some internal threshold. If you do not care about accuracy, it is fine.

I Do!

So, what I have done is write my own “timestampdiff” in SQL:

SELECT                                                              
       MIDNIGHT_SECONDS(TIMESTAMP('2023-12-21-00.00.01')) -         
       MIDNIGHT_SECONDS(TIMESTAMP('2023-10-22-00.00.01'))           
      + ( 86400 * (DAYS(TIMESTAMP('2023-12-21-00.00.01')) -         
                   DAYS(TIMESTAMP('2023-10-22-00.00.01')))) AS DIFF 
,TIMESTAMPDIFF( 2, CHAR(TIMESTAMP('2023-12-21-00.00.01') -          
                        TIMESTAMP('2023-10-22-00.00.01')))  AS TSDIFF
,           DAYS(       TIMESTAMP('2023-12-21-00.00.01')) -         
            DAYS(       TIMESTAMP('2023-10-22-00.00.01'))   AS DAYS 
FROM SYSIBM.SYSDUMMY1  ;                                            
---------+---------+---------+---------+---------+---------+---------
       DIFF       TSDIFF         DAYS                               
---------+---------+---------+---------+---------+---------+---------
    5184000      5184000           60                               
DSNE610I NUMBER OF ROWS DISPLAYED IS 1                              
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 100         
---------+---------+---------+---------+---------+---------+---------
SELECT                                                              
       MIDNIGHT_SECONDS(TIMESTAMP('2023-12-21-00.00.01')) -         
       MIDNIGHT_SECONDS(TIMESTAMP('2023-10-21-00.00.01'))            
      + ( 86400 * (DAYS(TIMESTAMP('2023-12-21-00.00.01')) -         
                   DAYS(TIMESTAMP('2023-10-21-00.00.01')))) AS DIFF 
,TIMESTAMPDIFF( 2, CHAR(TIMESTAMP('2023-12-21-00.00.01') -          
                        TIMESTAMP('2023-10-21-00.00.01')))  AS TSDIFF
,           DAYS(       TIMESTAMP('2023-12-21-00.00.01')) -         
            DAYS(       TIMESTAMP('2023-10-21-00.00.01'))   AS DAYS 
FROM SYSIBM.SYSDUMMY1  ;                                            
---------+---------+---------+---------+---------+---------+---------
       DIFF       TSDIFF         DAYS                               
---------+---------+---------+---------+---------+---------+---------
    5270400      5184000           61                                
DSNE610I NUMBER OF ROWS DISPLAYED IS 1                              
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 100         
---------+---------+---------+---------+---------+---------+---------
SELECT                                                               
       MIDNIGHT_SECONDS(TIMESTAMP('2023-12-21-00.00.01')) -         
       MIDNIGHT_SECONDS(TIMESTAMP('2023-10-20-00.00.01'))           
      + ( 86400 * (DAYS(TIMESTAMP('2023-12-21-00.00.01')) -          
                   DAYS(TIMESTAMP('2023-10-20-00.00.01')))) AS DIFF 
,TIMESTAMPDIFF( 2, CHAR(TIMESTAMP('2023-12-21-00.00.01') -          
                        TIMESTAMP('2023-10-20-00.00.01')))  AS TSDIFF
,           DAYS(       TIMESTAMP('2023-12-21-00.00.01')) -         
            DAYS(       TIMESTAMP('2023-10-20-00.00.01'))   AS DAYS 
FROM SYSIBM.SYSDUMMY1  ;                                   
---------+---------+---------+---------+---------+---------+
       DIFF       TSDIFF         DAYS                       
---------+---------+---------+---------+---------+---------+
    5356800      5270400           62                      
DSNE610I NUMBER OF ROWS DISPLAYED IS 1

It all looks a lot better and actually returns the correct number of seconds between two timestamps!

How Does it Work?

The key part is just some math:

       MIDNIGHT_SECONDS(TIMESTAMP('from ts')) -         
       MIDNIGHT_SECONDS(TIMESTAMP('to ts'))           
      + ( 86400 * (DAYS(TIMESTAMP('from ts')) -         
                   DAYS(TIMESTAMP('to ts')))) AS DIFF

I use the MIDNIGHT_SECONDS BiF (First appeared in DB2 V6.1) that returns the number of seconds from midnight up to the given timestamp. I then simply subtract the “to ts” value from the „from ts“ value and then use the DAYS BiF again subtracting from each other to get the days difference multiplying by 86400 as seconds/day. There is no need to subtract an extra day as, if that was required, it is automatically handled by the MIDNIGHT_SECONDS subtraction. Then these two results are simply added together.

Code Review Time?

I have re-reviewed all my code to make sure that any use of TIMESTAMPDIFF can accept “approximate” answers and in all other cases replaced it with the above code.

I might even open an Aha! Idea about getting this as a BiF – it is not that complex and is “better” than the best guess system we have now. In fact, I have – DB24ZOS-I-1599 – please vote if you think it would be great to get this as a simple BiF!

Db2 Does Not Stand Alone Here!

We are also not alone in this problem! Oracle, MySQL and JAVA all have the same “approximation” routines. I did a Google search for a web-based timestamp calculator and it made exactly the same mistakes. I guess that there is a “fast algorithm” out there that does the best guess quickly…but sadly not accurately!

This problem is right up there with Daylight Saving Time and the grief that causes! See my earlier blog.

TTFN,

Roy Boxwell

2024-05 pre or co processor?

You pick your frame of reference and you pays your money, as they say!

Which way should we all be going these days? The good old precompiler, since the beginning of time, or the modern sleek coprocessor? This month, I will show you what they both do, what they do the same, what they do differently, and the pros and cons of both!

COBOL For All!

Yep, this newsletter is *just* about COBOL. Sorry if you use JAVA, Ruby, Python, or CosyPinkTeabags I stick with a language that works on computers so big you cannot lift ‘em!

In the Beginning…

Many, many moons ago, the great precompiler was launched. The problem was that Structured Query Language (SQL) is not COBOL in any way shape or form, but companies needed a way to integrate SQL code into COBOL code. The mainframe world had had this problem before with CICS, where the elegant solution was a CICS translator that ran through the code, removed all the special CICS calls, and replaced them with correctly formed COBOL calls. All done under the covers so that the application programmer did not have to know, or do, anything. CICS also now supports the integrated CICS translator by the way.

SQL was More Complicated

Naturally, the abilities in SQL caused some headaches… The syntax checking required the use of TABLE DECLARES (Still optional to this day, which I find astonishing!) To generate executable code the system had to do two things:

  1. Replace the EXEC SQL with comments and then calls to Assembler routines with parameters that listed out the SQL statement to be executed and all of the host variables involved. It also created a CONTOKEN or Consistency token that Db2 uses to check that any given load module “fits” to the given Package (DBRM) at run time.
  2. Output a DBRM, also with the CONTOKEN within it, so that then the PACKAGE/PLAN could be bound into an executable object. Every statement in the package matched the COBOL assembler calls one to one. At execution time, if the CONTOKEN in the package did not match the CONTOKEN in the load module, you would get a nasty error, typically an SQLCODE -805 with one of five sub-types, as something is wrong somewhere!

The Problem?

At the point where the precompiler was run, *no* COBOL had been compiled so any use of COPY books caused syntax errors – here the EXEC SQL INCLUDE jumped in to help *but* for, some unknown reason, they did not implement a REPLACING syntax like COPY has… This meant that the INCLUDEd code was not 100% the same as your “normal” COPY book – Very annoying!

Coprocess This!

This situation caused problems, and after a few years IBM launched the coprocessor which is basically the IBM COBOL compiler with the precompiler bolted within. This gave two immediate benefits:

  1. Simpler JCL (No precompiler step anymore!)
  2. Simpler COBOL copy book management as the COPY syntax worked fine!

So Why are We Not *All* Using the Coprocessor Today?

Well, as always, the devil is in the detail. The one major stopper I have seen is Db2 columns defined as “FOR BIT DATA”. Now, in the normal COBOL world, this just means “no code page conversion please.” The data is quite probably hexadecimal and will be completely mangled if it gets a code page conversion! What you actually get depends on what you are doing but it is quite easy to get an SQLCODE -333.

Unicode – EBCDIC – ASCII

The triumvirate of pain! If I had a dollar for every file transfer I have received that was originally EBCDIC and got ASCII transferred… but I digress!

There is a Fix!

There are two solutions here:

  1. Use a DECLARE VARIABLE in the code to “inform” the Compiler *not* to do a code page conversion when this COBOL host variable is being processed. This is naturally more work for the programmer and “dangerous” too, as it is another “point of failure” that never existed before! This scares people – we all hate change after all.
  2. Use NOSQLCCSID in the COBOL coprocessor parameters. This is recommended as the easiest way to stay plug compatible with the precompiler. Naturally, at some point, it will be time to bite the bullet and do the code change required!

The DECLARE you might need looks like this:

exec sql                                          
   declare :PACKAGE-CONTOKEN variable for bit data
end-exec

The problem is normally caused by INSERT and UPDATE processing. Just SELECT appears to always work fine in my tests, (I created an EBCDIC table and a UNICODE table and then did multiple Inserts and Selects with hex data). However, your data constellation might well be different from mine – Always test first!

Bottom Line

If you are not using FOR BIT DATA you have no problem!

Here’s a little SQL to show you where you have any columns with FOR BIT DATA in your system:

SELECT SUBSTR(STRIP(TBCREATOR) CONCAT '.' CONCAT STRIP(TBNAME)
            , 1 , 32) AS TABLE_NAME                          
     , NAME           AS COL_NAME                            
FROM SYSIBM.SYSCOLUMNS                                       
WHERE FOREIGNKEY = 'B'                                        
ORDER BY 1 , 2                                               
FOR FETCH ONLY                                               
WITH UR                                                      
;

JCL Changes Required

To get the Coprocessor up and running I just added these two lines into my SYSOPTF for COBOL 6.3:                            

,CODEPAGE(1141),NOSQLCCSID                            
,SQL("APOSTSQL ATTACH(CAF) COMMA DATE(ISO) TIME(ISO)")

Notice the NOSQLCCSID so I do not have to do any code changes!

Naturally, the STEPLIB must be enhanced:    

//STEPLIB   DD DISP=SHR,DSN=IGY630.SIGYCOMP    
//          DD DISP=SHR,DSN=DSND1A.SDSNEXIT.DD10
//          DD DISP=SHR,DSN=DSND1A.SDSNLOAD

And there are two new DD cards:

//DBRMLIB   DD DISP=SHR,DSN=SE.MDB2VNEX.TDBRM(COCOMP)
//SYSTERM   DD SYSOUT=*

That’s It!

All in all, I think if you are still using the precompiler you should take some time to migrate over to the coprocessor as it makes the modules much smaller and, by default, faster!

Faster? I Hear you Shout

Well, what the coprocessor also does, is completely remove all SQL based working storage and, most importantly, the PERFORM on the 1st call of the SQL-INITIAL section. Now the precompiler is clever but it is not Einstein! When the very first SQL gets called, this section is performed which defines in memory *all* the SQL in the program. So, let’s say you have a program with 1,000 SQLs. You are only using one of them, but for that one call all 1,000 working storage areas will be initialized through DSNHADDR and DSNHADD2 calls using the PLIST blocks. It is fast I know, but it is also just overhead!

Another Bonus!

Plus, if your COBOL is passing host variables through linkage section usage and the address changes between calls then you *must* currently reset the SQL-INIT-FLAG to zero *every* time… With the coprocessor you do not have to do that anymore – Another win!

Just the Facts Ma’am

In one of my test programs the precompiler generated:

1674 lines of working storage

232 lines of SQL-INITIAL code

For every EXEC SQL (31 in total) that was commented out a

PERFORM SQL-INITIAL UNTIL SQL-INIT-DONE
CALL 'DSNHLI2' USING SQL-PLIST11       
END-CALL

Block was written.

The coprocessor generated nothing! and the executable load module was 8% smaller!

Pros and Cons

Pros of precompiler are: No need to DECLARE for bit data columns and no JCL change.

Cons of precompiler are: No real COPY book support, larger code and bigger module size.

Pros of coprocessor are: No reset of SQL-INIT-FLAG, Perfect COPY book support, no generated code, smaller module size, faster run time execution and load.

Cons of coprocessor are: JCL change required and possible DECLARE VARIABLEs needed depending on usage.

What are your experiences with the precompiler and/or the coprocessor?

I would love to hear from you!

TTFN,

Roy Boxwell

2024-04 SCA you like?

This month, I must thank one of my readers who simply asked, “Roy, can you do a newsletter about the Shared Communications Area (SCA) for me please?” Naturally, I replied (after getting two beers from him of course!) So now I wish to delve into the inner workings of the Coupling Facility (CF) and the SCA…

Warning: Scary Stuff Ahead!

In a non-datasharing world, a -DISPLAY GROUP shows this sort of output:                                       

DSN7100I  -DD10 DSN7GCMD
*** BEGIN DISPLAY OF GROUP(........) CATALOG LEVEL(V13R1M504)
                  CURRENT FUNCTION LEVEL(V13R1M504)          
                  HIGHEST ACTIVATED FUNCTION LEVEL(V13R1M504)
                  HIGHEST POSSIBLE FUNCTION LEVEL(V13R1M504) 
                  PROTOCOL LEVEL(2)                          
                  GROUP ATTACH NAME(....)                    
---------------------------------------------------------------------
DB2          SUB                     DB2    SYSTEM    IRLM    
MEMBER   ID  SYS  CMDPREF   STATUS   LVL    NAME      SUBSYS IRLMPROC
-------- --- ---- --------  -------- ------ --------  ----   --------
........   0 DD10 -DD10     ACTIVE   131504 S0W1      IDD1   DD10IRLM
---------------------------------------------------------------------
SPT01 INLINE LENGTH:        32138                                   
*** END DISPLAY OF GROUP(........)                                  
DSN9022I  -DD10 DSN7GCMD 'DISPLAY GROUP ' NORMAL COMPLETION

If that is what you see in your production system, dear reader, then this newsletter is, sadly, not for you!

Hopefully your output actually looks like my little test system:

DSN7100I  -SD10 DSN7GCMD                                             
*** BEGIN DISPLAY OF GROUP(GSD10C11) CATALOG LEVEL(V13R1M504)       
                  CURRENT FUNCTION LEVEL(V13R1M504)                 
                  HIGHEST ACTIVATED FUNCTION LEVEL(V13R1M504)       
                  HIGHEST POSSIBLE FUNCTION LEVEL(V13R1M504)        
                  PROTOCOL LEVEL(2)                                 
                  GROUP ATTACH NAME(SD1 )                           
---------------------------------------------------------------------
DB2          SUB                     DB2    SYSTEM    IRLM          
MEMBER   ID  SYS  CMDPREF   STATUS   LVL    NAME      SUBSYS IRLMPROC
-------- --- ---- --------  -------- ------ --------  ----   --------
MEMSD10    1 SD10 -SD10     ACTIVE   131504 S0W1      JD10   SD10IRLM
MEMSD11    2 SD11 -SD11     ACTIVE   131504 S0W1      JD11   SD11IRLM
---------------------------------------------------------------------
SCA   STRUCTURE SIZE:    16384 KB, STATUS= AC,   SCA IN USE:     3 %
LOCK1 STRUCTURE SIZE:    16384 KB                                   
NUMBER  LOCK ENTRIES:     4194304                                   
NUMBER  LIST ENTRIES:       16354, LIST ENTRIES  IN USE:          81
SPT01 INLINE LENGTH:        32138                                    
*** END DISPLAY OF GROUP(GSD10C11)                                  
DSN9022I  -SD10 DSN7GCMD 'DISPLAY GROUP ' NORMAL COMPLETION

The interesting stuff is from the “SCA   STRUCTURE SIZE:” line down to the “NUMBER  LIST ENTRIES:” line.

Off We Go!

To understand what an SCA is, we need to backtrack a second and first discuss what a CF is?

What is Inside?

The Coupling Facility is de facto *the* central part of data sharing, it contains three objects:

  1. A lock structure called LOCK1, where all the table locks using hashes live, and Record Lock Elements (RLE),
  2. A list structure that is actually the SCA, which contains a bunch of stuff I will go into later,
  3. The Group Buffer pools. Technically, you can run without these but then why are you data sharing if you are not sharing data?

Lock Structure Details

It is called LOCK1 and the system lock manager (SLM) uses the lock structure to control shared Db2 resources like tablespaces and pages and can enable concurrent access to these. It is split internally into two parts: The first part is the Lock Table Entry (LTE), and the second part is a list of update locks normally actually called the Record List Entry (RLE). The default is a 50:50 split of memory. The size of this structure must be big enough to avoid hash contention, which can be a major performance problem. Do not forget that the IRLM reserves 10% of the RLEs for “must complete” processes so you never actually get to use them all!

List Structure Details

This is really the SCA and it contains Member names, BSDS names, Data Base Exception Table (DBET) statuses and recovery information. Typically, at installation time, you pick a number for the INITSIZE (first allocated size of the SCA) from a list of 16 MB , 32 MB, 64 MB or 128 MB. Each of these INITSIZEs then has a SIZE which is typically twice the INITSIZE size as a maximum limit.

Baboom!

IBM write quite happily “Running out of SCA space can cause Db2 to fail” – I can change that to “does” not “can”!

Double Trouble?

The LOCK1 and the SCA do *not* have to be duplexed but it is very highly recommended to do so, otherwise, you have a single point of failure which defeats the whole point of going data sharing, really.

Death by DBET

The DBET data is, strangely enough, the thing that can easily kill ya!

How So?

Imagine you have the brilliant idea of using COPY YES indexes as you have tested a few and seen that RECOVER INDEX is quicker, better, faster than REBUILD or DROP/CREATE for the critical indexes at your shop.

What Happened Next?

So how do you enable COPY YES at the index level? Just a simple ALTER INDEX xxx.yyy COPY YES is all it takes. But *what*, dear friends, does this ALTER do under the covers? It sets the INDEX to ICOPY status – „Not too bad“, you say as the index is fully available, „just wait until the next COPY and that status is then cleared“ – But wait, that is a DBET status! It creates a DBET entry in your SCA… What if you alter 15,000 Indexes to all be COPY YES? Yep – Kiss goodbye to your Db2 sub-system!

Suicide Protection

Some software out there (RealTimeDBAExpert from SEGUS for example!) actually warns you about this and recommends you do this sort of thing in chunks. First the ALTER, then the COPY, one hundred blocks at a time and then the next batch etc.

and then if you do press PF1:

And even then, we have an emergency stop built-in that you can still override but then you must *know* the possible risk involved:

On the other hand, some software is like using SDSF DA when you put a P for “print” by your userid!

Now you know what *not* to do!

Automagic?

Use of the ALLOWAUTOALT(YES) has been discussed for years… on the one hand it automagically adds storage if you are running out, which is a good thing, *but* it also allows other competing systems to decrease storage which can then lead to you running out of storage and losing this Db2 sub-system… nasty, nasty!

Operator command time!

/f <yourirlm>,STATUS,STOR

Gives me:

DXR100I JD11002 STOR STATS
PC: YES  LTEW:  2 LTE:     4M RLE:   16354  RLEUSE:      18        
BB PVT:  1266M  AB PVT (MEMLIMIT):   2160M                         
CSA USE: ACNT:     0K  AHWM:     0K  CUR:  2541K  HWM:  6122K      
        ABOVE 16M:    64   2541K     BELOW 16M:     0      0K      
        AB CUR:               0K     AB HWM:               0K      
PVT USE:   BB CUR:  5684K        AB CUR:     5M                    
           BB HWM:    18M        AB HWM:     5M                    
CLASS   TYPE  SEGS     MEM   TYPE  SEGS     MEM   TYPE  SEGS     MEM
ACCNT    T-1     2      4M    T-2     1      1M    T-3     2      8K
PROC     WRK    14     70K    SRB     5      5K    OTH     4      4K
MISC     VAR    41   7549K    N-V    22    565K    FIX     1     24K

This maps pretty nicely to the – DISPLAY GROUP I just did again:

*** BEGIN DISPLAY OF GROUP(GSD10C11) CATALOG LEVEL(V13R1M504)       
                  CURRENT FUNCTION LEVEL(V13R1M504)                 
                  HIGHEST ACTIVATED FUNCTION LEVEL(V13R1M504)       
                  HIGHEST POSSIBLE FUNCTION LEVEL(V13R1M504)        
                  PROTOCOL LEVEL(2)                                 
                  GROUP ATTACH NAME(SD1 )                           
---------------------------------------------------------------------
DB2          SUB                     DB2    SYSTEM    IRLM          
MEMBER   ID  SYS  CMDPREF   STATUS   LVL    NAME      SUBSYS IRLMPROC
-------- --- ---- --------  -------- ------ --------  ----   --------
MEMSD10    1 SD10 -SD10     ACTIVE   131504 S0W1      JD10   SD10IRLM
MEMSD11    2 SD11 -SD11     ACTIVE   131504 S0W1      JD11   SD11IRLM
---------------------------------------------------------------------
SCA   STRUCTURE SIZE:    16384 KB, STATUS= AC,   SCA IN USE:     3 %
LOCK1 STRUCTURE SIZE:    16384 KB                                   
NUMBER  LOCK ENTRIES:     4194304                                   
NUMBER  LIST ENTRIES:       16354, LIST ENTRIES  IN USE:          18
SPT01 INLINE LENGTH:        32138                                   
*** END DISPLAY OF GROUP(GSD10C11)

Looking at this line in the Modify output:     

PC: YES  LTEW:  2 LTE:     4M RLE:   16354  RLEUSE:      18        

I only have two members so my LTEW (LTE Width) is two bytes, LTE is 4M which is my NUMBER LOCK ENTRIES: 4194304 , RLE is 16354 which is my NUMBER LIST ENTRIES: 16354 and RLEUSE is 18 which is my LIST ENTRIES IN USE: 18. A perfect match – if only they had agreed on a naming convention! ! !

You can also see that my SCA STATUS is AC for ACTIVE and the SCA IN USE is a measly 3%, so no worries for me today!

Finally, you can see that we have 50:50 split as the SCA and the LOCK1 are the same size: 16 MB.

One more line of interest:                  

BB PVT:  1266M  AB PVT (MEMLIMIT):   2160M       

Here you can see the calculated MEMLIMIT for the IRLM, which for my test system is very low, but you should check that the number is good for your site as the range is now much bigger!

Panic Time?

I would start to get seriously sweaty if the SCA IN USE ever got near 70%.

And?

I would not use ALLOWAUTOALT(YES) and allocate 256MB or, if forced, I would go with INITSIZE 128 MB and SIZE 256 MB and then ALLOWAUTOALT(YES).

Lock it Down!

The Max Storage for locks is between 2,048 MB and 16,384 PB with a default of 2,160 MB that I have in the PVT (MEMLIMIT) output above.

Locks per table(space) (NUMLKTS) is 0 to 104,857,600 with a default of 2,000 in Db2 12 and 5,000 in Db2 13. If this number is exceeded then lock escalation takes place unless it is zero in which case there is no lock escalation. As IBM nicely put it “Do not set the value to 0, because it can cause the IRLM to experience storage shortages”.

Locks per user (NUMLKUS) is also from 0 to 104,857,600 with a default of 10,000 in Db2 12 and 20,000 in Db2 13. 0 means no limit. IBM do not recommend 0 or a large number here unless it is really required to run an application. Db2 assumes that each lock is approximately 540 bytes of memory. Here you can also drain your IRLM until it runs out of ACCOUNT T-1 storage space:

DXR175E xxxxxxxx IRLM IS UNABLE TO OBTAIN STORAGE – PVT

This is not a message you ever want to see on the master console! Just do the math on your locks and your memory size!

The Bachelor Problem – Fear of Commitment

You must get the developers to COMMIT or not use row-level locking everywhere.

SCA DBET Calculation

The lock size for each DBET entry is approximately 1,864 bytes from my tests in Db2 12 FL 510. How did I determine that, you ask?

What I did, was multiple ALTER INDEX aaa.bbb COPY YES SQLs until the „SCA in use“ percentage changed from 4% to 5%, and then I kept doing ALTERs and DISPLAYs until the percentage changed from 5% to 6%. It took me exactly 90 ALTERs and with an SCA size of 16,384KB, that gives me a size of around 1,864 bytes per DBET entry. Use this as a Rule Of Thumb for your DBET inducing ALTERs! It gives me a hard limit of about 6,000 Alters.

I hope you found this little discourse into the world of the Coupling Facility and SCA useful!

TTFN,

Roy Boxwell

2024-03 I am fine, up to a certain DEGREE

This month is all about going parallel! In the Db2 world, we have had the ability to run SQLs using parallel processing for decades. It started off a bit wobbly and most people didn’t use it, or even like it, but these days it is extremely useful for certain cases.

Sort Yourself Out

Sort is very important here. If you can do a parallel sort with only one extra parallel task it will halve your elapsed time… if you can add more tasks it takes even less elapsed. Naturally, you do not save on CPU here, in fact it will probably “cost” more, but you are trading CPU for elapsed time.

Ground Rules

IBM states in the documentation that parallel processing is “Only for partitioned objects” but then they mention that even non-partitioned objects can benefit, as the access to the non-clustering index and the data can be done in parallel… which is not helpful if you only have clustering index access, of course!

Bells and Whistles!

There are quite a few things to adjust and play with on the road to parallel processing!

No Way!

If you declare your cursor as WITH HOLD and with isolation level RR or RS then there is *no* CPU parallelism allowed at all but you can still get parallel sorts.

ZPARM Time

First up is CDSSRDEF (CURRENT DEGREE) where the IBM recommendation must be read:

CURRENT DEGREE field (CDSSRDEF subsystem parameter)

The CDSSRDEF subsystem parameter determines the default value that is to be used for the CURRENT DEGREE special register. The default value is used when a degree is not explicitly set in the SQL statement SET CURRENT DEGREE.

Acceptable values: 1, ANY

Default: 1

Update: option 30 on panel DSNTIPB

DSNZPxxx: DSN6SPRM CDSSRDEF

1 Specifies that when a query is dynamically prepared, the execution of that query will not use parallelism. If this value is specified, Db2 does not use any optimization hints for parallelism.

ANY Specifies that when a query is dynamically prepared, the execution of that query can involve parallelism.

Recommendation: In almost all situations, accept the default value of 1. You should use parallelism selectively where it provides value, rather than globally. Although parallelism can provide a substantial reduction in elapsed time for some queries with only a modest overhead in processing time, parallelism does not always provide the intended benefit. For some queries, and in many other situations, query parallelism does not provide an improvement, or it uses too many resources. If you are using nearly all of your CPU, I/O, or storage resources, parallelism is more likely to cause degradation of performance. Use parallelism only where it is most likely to provide benefits.

The ZPARM INDEX_IO_PARALLELISM should get an honorable mention here as it was there right up until Db2 11 and it is, in fact, still within the Db2 12 and 13 indexes at the end of the installation guide pdf, but it has thrown off its mortal coil…

INDEX_IO_PARALLELISM

Specifies whether I/O parallelism is enabled for index insertion.

Acceptable values: YES, NO

Default: YES

DSNZPxxx: DSN6SPRM INDEX_IO_PARALLELISM

Security parameter: No

YES I/O parallelism is enabled for index processing. I/O parallelism allows concurrent insert operations on multiple indexes and can reduce I/O wait time when many indexes are defined in a table.

NO I/O parallelism is disabled for index processing.

Naturally, YES is what it should be set to!

To round out the dearly departed, there was also this one:

PARALLELISM EFFICIENCY field (PARA_EFF subsystem parameter)

Controls the efficiency that DB2 assumes for parallelism when DB2 chooses an access path. Valid values are integers 0 – 100. The integer represents a percentage efficiency.

It came in Db2 9 with PM16020 and started life with a default value of 100 before getting a default change to 50 in Db2 10. It got deprecated in Db2 12 and was removed in Db2 13.

Then PARAMDEG with a handy Top Tip:

MAX DEGREE field (PARAMDEG subsystem parameter)

The PARAMDEG subsystem parameter specifies the maximum degree of parallelism for a parallel group. When you specify a non-zero value for this parameter, you limit the degree of parallelism so that Db2 cannot create too many parallel tasks that use virtual storage.

Acceptable values: 0 – 254

Default: 0

Update: option 30 on panel DSNTIPB

DSNZPxxx: DSN6SPRM PARAMDEG

0 Specifies no limit to the maximum degree of parallelism that Db2 chooses based on the cost estimate for the query and the system configuration, in particular the number of processors online. Db2 counts both general purpose and zIIP processors equally, and applies further adjustment to determine the degree to use.

1 – 254 Specifies the maximum degree of parallelism that Db2 uses. When optimization hints for parallelism are used, the value of the PARAMDEG subsystem parameter does not limit the degree of parallelism at bind time. However, the value of the PARAMDEG subsystem parameter is enforced at execution time. So, if the value of the PARAMDEG subsystem parameter is lower than the degree of parallelism that is specified at bind time, the degree of parallelism is reduced at execution time.

Tip: For systems with more than two zIIP processors configured, use the number of zIIP processors as the starting value, and then adjust as needed for your response time requirements.

Basically, set this value to be one or two times the number of online available CPUs but take into account the ZiiPs.

Then its new DPSI baby brother:

MAX DEGREE FOR DPSI (PARAMDEG_DPSI subsystem parameter)

The PARAMDEG_DPSI subsystem parameter specifies the maximum degree of parallelism that you can specify for a parallel group in which a data partitioned secondary index (DPSI) is used to drive parallelism.

A DPSI is a non-partitioning index that is physically partitioned according to the partitioning scheme of the table. When you specify a value of greater than 0 for this parameter, you limit the degree of parallelism for DPSIs so that Db2 does not create too many parallel tasks that use virtual storage.

Acceptable values: 0-254, DISABLE

Default: 0

Update: option 30 on panel DSNTIPB

DSNZPxxx: DSN6SPRM PARAMDEG_DPSI

Data sharing scope: All members use the same setting

0 Specifies that Db2 uses the value that is specified for the PARAMDEG subsystem parameter, instead of PARAMDEG_DPSI, to control the degree of parallelism when DPSI is used to drive parallelism. This is the default value for the field.

1 Specifies that Db2 creates multiple child tasks but works on one task at a time when DPSI is used to drive parallelism.

2-254 Specifies that Db2 creates multiple child tasks and works concurrently on the tasks that are specified. The number of specified tasks may be larger or smaller than the number of tasks as specified in PARAMDEG. When PARAMDEG is set to 1, the rest of the query does not have any parallelism.

DISABLE Specifies that Db2 does not use DPSI to drive parallelism. Parallelism might still occur for the query if PARAMDEG is greater than 1.

This is for fine-tuning the DPSI use case. Remember, you can have 4096 partitions and so it could well be that a query goes crazy if it sees the ability to go massively parallel. Here you can limit, or even inhibit, that from happening. Only use if you have been bitten by a rogue SQL!

Then we get to Utility parallel processing which is not for all Utilities – but REORG TABLESPACE and COPY are there!

MAX UTILS PARALLELISM field (PARAMDEG_UTIL subsystem parameter)

The PARAMDEG_UTIL subsystem parameter specifies the maximum number of parallel subtasks for some utilities.

PARAMDEG_UTIL affects the following utilities:

• REORG TABLESPACE

• REBUILD INDEX

• CHECK INDEX

• UNLOAD

• LOAD

• COPY

• RECOVER

Acceptable values: 0 – 32767

Default: 99

Update: option 34 on panel DSNTIPB

DSNZPxxx: DSN6SPRM PARAMDEG_UTIL

0 No additional constraint is placed on the maximum degree of parallelism in a utility.

1 – 32767 Specifies the maximum number of parallel subtasks for all affected utilities.

Interesting default huh? In Db2 11 it was actually zero!

LOAD must get a special mention here, as back in Db2 11 it got a new keyword PARALLEL
( nnn ) which enabled parallel loading from a *single* input file (Initially not for PBGs but then that was allowed in Db2 12). This little chestnut has often been forgotten.

Further into the guts of REORG is this little one:

REORG LIST PROCESSING field (REORG_LIST_PROCESSING subsystem parameter)

The REORG_LIST_PROCESSING subsystem parameter specifies the default setting for the PARALLEL option of the Db2 REORG TABLESPACE utility.

Acceptable values: PARALLEL, SERIAL

Default: PARALLEL

Update: option 37 on panel DSNTIPB

DSNZPxxx: DSN6SPRM REORG_LIST_PROCESSING

PARALLEL The default value PARALLEL specifies that the REORG TABLESPACE utility is to use a default PARALLEL YES option when the PARALLEL keyword is not specified in the utility control statement. The PARALLEL YES option specifies that the REORG TABLESPACE utility is to process all partitions that are specified in the input LISTDEF statement in a single execution of the utility.

SERIAL Specifies that the REORG TABLESPACE utility is to use a default PARALLEL NO option when the PARALLEL keyword is not specified in the utility control statement. The PARALLEL NO option specifies that each partition that is specified in the input LISTDEF statement is to be processed in a separate execution of the utility.

I would also happily stick with the default unless you have experienced serious problems with VTS or some such.

Don’t Forget the Bufferpool, Stupid!

BUFFERPOOLs play a major role. The setting of VPSEQT (Virtual bufferpool sequential steal threshold) and VPPSEQT (Virtual bufferpool parallel sequential threshold) might well both have to be raised. Remember that the VPPSEQT is a percentage of the VPSEQT available pages. The bufferpool size itself might well have to be raised (VPSIZE) if you do not see an improvement in the degree of parallelism.

All Ready?

So, you have set, checked, reviewed, changed the ZPARMS and are ready to go?

Where’s the ON Switch?

For Static SQL just bind or rebind with DEGREE(ANY), for dynamic SQL issue a

SET CURRENT DEGREE = ‘ANY’ ;

CDSSRDEF Zparm has the default for this parameter.

If you bind with isolation level CS, then also try and make sure you use CURRENTDATA(NO) as well. This helps performance anyway and also aids Db2 in working with ambiguous cursors. Explicit read-only is always better!

For sorts, where the big elapsed time gains come from, make sure you have sufficiently sized work files allocated! Here, even the WITH HOLD and isolation RR or RS can benefit.

Gotchas?

Always at least one problem isn’t there? If you do DEGREE(ANY) you can expect the EDMPOOL usage to go up between 50% and 70% due to run-time structures. Check your SYSPACKAGE AVGSIZE before and after the BIND/REBIND if you are worried. Always monitor this pool and make sure it is correctly sized for your workload!

Naturally, the CPU might well go up but you should see a good drop in elapsed times and, as far as sort is concerned, you might actually manage a successful parallel sort in a normally constrained sub-system!

Where’s the OFF Switch?

For Static SQL just bind or rebind with DEGREE(1), for dynamic SQL issue a

SET CURRENT DEGREE = ‘1’;

This is also the default value but what if someone changed your default?

If you shrink the VPPSEQT to 0 that will disable all parallel access for objects in that bufferpool.

Insert rows in the resource limit tables – Not recommended as this is a lot of work!

Are You Running in Parallel then? EXPLAIN is Your Friend!

There are a few “access patterns” that allow CP parallelism and they are all documented in the Managing Performance book – “Checklist of query restrictions for query CP parallelism” table.

How do You Check?

The columns of interest in the PLAN_TABLE are the ACCESS_DEGREE and JOIN_DEGREE. If either of these two is not NULL then you are using, or hoping to use, parallel processing! The moment you have more than one table then four other columns become interesting: ACCESS_PGROUP_ID, JOIN_PGROUP_ID, SORTN_PGROUP_ID and SORTC_PGROUP_ID. PGROUP is short for PARALLEL GROUP and for a given PLANNO step they all have the same number. Finally, the PARALLELISM_MODE reflects which type you are using which, these days, can *only* be ‘C’ for Query CP parallelism (Came in DB2 V4). It used to also have the values ‘I’ for parallel I/O operations (Came in DB2 V3)  and ‘X’ for Sysplex query parallelism (Came in DB2 V5) but they were deprecated in Db2 9 and are now both dead and buried!

Which SQLs are Best for Parallel Processing?

SQLs that are I/O intensive and scan lots of pages while returning just a few rows, SQLs that have lots of aggregate functions, and naturally SQLs that require Sort – all of these are good candidates.

Trial it DUMMY!

Best thing to do is a trial rebind to a dummy collection of all your SQL in a sandbox system with production statistics copied over and with parallel processing enabled (Do not forget the bufferpools!). This will then quickly reveal which queries could indeed benefit from going parallel and enable you to activate it only at the package/SQL level that you require in production.

Apply and Test

Once you have your candidate list, and I hope it is not that long, you can simply enable it all in production and do a live run through and review. First with EXPLAIN and then really live.

Remember that it is not good for *all* SQLs but it can really help when it hits the spot!

My Favorite Table Ever!

Db2 11 introduced an automatic five level control system for parallel queries:

Level 1 OK: Query runs with planned parallel degree

Level 2 Mild warning: Reduce parallel degree by ¼

Level 3 Moderate warning: Reduce parallel degree by ½ or to degree 2

Level 4 Severe warning: Reduce to sequential run

Level 5 Melt down: Reduce to sequential run

Level 5 is sub-optimal!

The original table is here: (Do a search for „Melt“)

https://www.redbooks.ibm.com/redbooks/pdfs/sg248222.pdf

How to Tame the Beast?

For static SQL, just bind/rebind with DEGREE(1) to switch off or DEGREE(ANY) to switch on.

For dynamic SQL, if you cannot add the SET CURRENT DEGREE = ‘ANY’ and RLF tables do not work for you then the only way is to assign the tables in question to their own bufferpools and set the VPPSEQT to a value, or leave at default 50, and for *all* other bufferpools set the VPPSEQT to 0 which switches off parallel processing.

Another way of handling dynamic, but a bit over the top for my taste, is a new data-sharing member where the ZPARM CDSSRDEF is set to ‘ANY’. Then any dynamic work that should be allowed to go parallel is simply routed to just this member.

Whaddya all think?

Going to start testing out parallel processing anytime soon?

I’d love to hear from you!

TTFN

Roy Boxwell

Feedback:

One of my readers mentioned that the primary reason they use parallelism is to increase zIIP offload which offers not just elapsed reduction but also saves the cost of the cpu as well.

Naturally, if you get this, then you are really laughing all the way to the bank!

2024-01 Happy New SQLCODE!

Aren’t I supposed to be wishing Happy New Year there? Not this month! One of my readers asked me a question about SQLCODEs and it opened up a quite fascinating can of worms!

What is the SQLCODE?

Anyone reading this blog should already know exactly what the SQLCODE is, however, for those of you out there using Google in 2045, the definition is, at least in COBOL:

01 SQLCA.
   05 SQLCAID     PIC X(8).
   05 SQLCABC     PIC S9(9) COMP-5.
   05 SQLCODE     PIC S9(9) COMP-5.
   05 SQLERRM.
      49 SQLERRML PIC S9(4) COMP-5.
      49 SQLERRMC PIC X(70).
   05 SQLERRP     PIC X(8).
   05 SQLERRD     OCCURS 6 TIMES
                  PIC S9(9) COMP-5.
   05 SQLWARN.
      10 SQLWARN0 PIC X.
      10 SQLWARN1 PIC X.
      10 SQLWARN2 PIC X.
      10 SQLWARN3 PIC X.
      10 SQLWARN4 PIC X.
      10 SQLWARN5 PIC X.
      10 SQLWARN6 PIC X.
      10 SQLWARN7 PIC X.
   05 SQLEXT.
      10 SQLWARN8 PIC X.
      10 SQLWARN9 PIC X.
      10 SQLWARNA PIC X.
      10 SQLSTATE PIC X(5).

The third field within the SQLCA is the SQLCODE, which is a four-byte signed integer and it is filled after every SQL call that a program/transaction makes.

All Clear So Far?

So far so good! In the documentation from IBM is this paragraph about how to handle and process SQLCODEs:

SQLCODE

Db2 returns the following codes in SQLCODE:

• If SQLCODE = 0, execution was successful.

• If SQLCODE > 0, execution was successful with a warning.

• If SQLCODE < 0, execution was not successful.

SQLCODE 100 indicates that no data was found.

The meaning of SQLCODEs, other than 0 and 100, varies with the particular product implementing SQL.

Db2 Application Programming and SQL Guide

So, every programmer I have ever talked to checks if the SQLCODE is 0 – Green! Everything is fine, if the SQLCODE is negative – Bad message and ROLLBACK, if the SQLCODE is +100 – End of cursor or not found by direct select/update/delete – normally 100% Ok, everything else issues a warning and is naturally very dependent on the application and business logic!

What’s Wrong With This Picture?

Well, sometimes zero is not really zero… I kid you not, dear readers! The sharp-eyed amongst you, will have noticed the last bytes of the SQLCA contain the SQLSTATE as five characters. Going back to the documentation:

An advantage to using the SQLCODE field is that it can provide more specific information than the SQLSTATE. Many of the SQLCODEs have associated tokens in the SQLCA that indicate, for example, which object incurred an SQL error. However, an SQL standard application uses only SQLSTATE.

So, it still seems fine, but then it turns out that SQLCODE 0 can have non-all zero SQLSTATEs!

New in the Documentation

At least for a few people this is a bit of a shock. From the SQLCODE 000 documentation:

SQLSTATE

  • 00000 for unqualified successful execution.
  • 01003, 01004, 01503, 01504, 01505, 01506, 01507, 01517, or 01524 for successful execution with warning.

Say What?

Yep, this is simply stating that you have got an SQLCODE 0 but up to nine different SQLSTATEs are possible… This is not good! Most error handling is pretty bad, but now having to theoretically add SQLSTATE into the mix makes it even worse!

What are the Bad Guys Then?

01003 Null values were eliminated from the argument of an aggregate function.

01004 The value of a string was truncated when assigned to another string data type with a shorter length.

01503 The number of result columns is larger than the number of variables provided.

01504 The UPDATE or DELETE statement does not include a WHERE clause.

01505 The statement was not executed because it is unacceptable in this environment.

01506 An adjustment was made to a DATE or TIMESTAMP value to correct an invalid date resulting from an arithmetic operation.

01507 One or more non-zero digits were eliminated from the fractional part of a number used as the operand of a multiply or divide operation.

01517 A character that could not be converted was replaced with a substitute character.

01524 The result of an aggregate function does not include the null values that were caused by evaluating the arithmetic expression implied by the column of the view.

Not Good!

From this list the 01004, 01503, 01506 and especially 01517 just jump right out and scream at you! Here in Europe, we have a right to have our names or addresses correctly written and, in Germany with all the umlauts, it can get difficult if you then have a 01517 but SQLCODE 0 result!

I hope you don’t find this newsletter too unsettling as, after all, Db2 and SQL normally works fine, but I do think that these SQLSTATEs should really have warranted a positive SQLCODE when they were first created…

What do you all think?

TTFN,

Roy Boxwell

Update:

One of my readers wonders how practicle these „errors“ are. A good point, and so here is a nice and easy recreate for the 01003 problem:

CREATE TABLE ROY1 (KEY1   CHAR(8) NOT NULL,                 
                   VALUE1 INTEGER ,                         
                   VALUE2 INTEGER )                         
;                                                           
INSERT INTO ROY1 (KEY1)               VALUES ('A') ;        
INSERT INTO ROY1 (KEY1)               VALUES ('AA') ;       
INSERT INTO ROY1 (KEY1,VALUE1)        VALUES ('B', 1) ;     
INSERT INTO ROY1 (KEY1,VALUE1,VALUE2) VALUES ('BB', 1 , 1) ;
INSERT INTO ROY1 (KEY1,VALUE1)        VALUES ('C', 2) ;     
INSERT INTO ROY1 (KEY1,VALUE1,VALUE2) VALUES ('C', 2 , 2) ; 
SELECT * FROM ROY1                                          
;                                                           
CREATE VIEW ROYVIEW1 AS                                     
(SELECT AVG(VALUE1) AS AVGVAL1, AVG(VALUE2) AS AVGVAL2      
 FROM ROY1)                                                 
;                                                           
SELECT * FROM ROYVIEW1                                      
;                                                           

This set of SQL ends up with these outputs:

---------+---------+---------+--------
KEY1           VALUE1       VALUE2    
---------+---------+---------+--------
A         -----------  -----------    
AA        -----------  -----------    
B                   1  -----------    
BB                  1            1    
C                   2  -----------    
KEY1           VALUE1       VALUE2    
---------+---------+---------+--------
C                   2            2    
DSNE610I NUMBER OF ROWS DISPLAYED IS 6

---------+---------+---------+---------+---------+---------+-----
    AVGVAL1      AVGVAL2                                                
---------+---------+---------+---------+---------+---------+-----
          1            1                                                
DSNT400I SQLCODE = 000,  SUCCESSFUL EXECUTION                           
DSNT418I SQLSTATE   = 01003 SQLSTATE RETURN CODE                        
DSNT415I SQLERRP    = DSN SQL PROCEDURE DETECTING ERROR                 
DSNT416I SQLERRD    = 0 0  0  -1  0  0 SQL DIAGNOSTIC INFORMATION       
DSNT416I SQLERRD    = X'00000000'  X'00000000'  X'00000000'  X'FFFFFFFF'
         X'00000000'  X'00000000' SQL DIAGNOSTIC INFORMATION
DSNT417I SQLWARN0-5 = W,,W,,, SQL WARNINGS                              
DSNT417I SQLWARN6-A = ,,,,   SQL WARNINGS                               
DSNE610I NUMBER OF ROWS DISPLAYED IS 1                                  

This is not so good if you ask me…

2023-11 Are we really living in an agile world?

This month I want to run through all the purely SQL changes that we have received in Db2 12 and in Db2 13 right up until Db2 13 FL504. I was reading about agile development and how fast the deliveries are and so I wondered how many purely SQL changes have we, as SQL Users, actually received over the last six to seven years?

Back to the facts…

Ok, in Db2 12 FL501 we got LISTAGG which was very cool indeed, apart from the fact that you could not use ORDER BY which irritated all the SQL developers I know quite a bit! IBM also created a whole bunch of Accelerator only „pass-thru“ functions that I cannot ever use as I do not know whether or not any of my customers actually has an Accelerator… so, for me, they do not really count. In total 28 BiF’s either got pass-thru or extra Accelerator support so if you *have* an accelerator good news indeed! This was all enabled over Db2 12 FL Levels 504 and 507 as well as APAR PH48480.

Uniwhat?

A bunch of, for me, weird things were the UNI_60 and UNI_90 support added to LOWER, TRANSLATE and UPPER for both Db2 12 and 13. There must be a use case out there but I am lucky enough not to have found it yet!

Super MERGE

MERGE got a major overhaul in Db2 12 with the addition of DELETE support but the story of MERGE did not end there! It still had a major performance problem if any of the index columns got updated as then it was forced to fallback to a tablespace scan. With Db2 13 FL504 (APAR PH47581) this problem was solved – nearly…

The following conditions must be met to enable Db2 to use an index for a MERGE operation when index key columns are being updated:
– The MERGE statement contains a corresponding predicate in one of the
following forms, for each updated index key column:
index-key-column = literal-value, where literal-value is a constant or any
expression that can be treated as a literal, including a host variable, parameter
marker, or non-column expression.
index-key-column IS NULL
– If a view is involved, WITH CHECK OPTION is not specified.

Db2 SQL Reference

MERGE has basically become one of the most powerful SQL statements out there and you can actually cause terrible trouble if you use DRDA with VALUES clauses and hard coded „FOR 10 ROWS“ style of SQLs. All is very well documented and worth a read under the heading:

DRDA considerations when NOT ATOMIC CONTINUE ON SQLEXCEPTION is specified (or the NOT ATOMIC CONTINUE ON SQLEXCEPTION clause is not specified and source-values (VALUES) is specified)

Db2 SQL Reference

This Db2 13 APAR also enabled the chance of getting List Prefetch as an access path which is, as far as I can tell, the only „new“ access path in Db2 13.

Pagination anyone?

The use of OFFSET was a great innovation for Db2 but the „other“ pagination was better! I mean data-dependent pagination which changed this old chestnut of an SQL:

WHERE (LASTNAME = 'SMITH' AND FIRSTNAME > 'JOHN') 
   OR (LASTNAME > 'SMITH')

Into this modern SQL:

WHERE (LASTNAME, FIRSTNAME) > ('SMITH', 'JOHN')

Much much better and for online generated dynamic SQL, I am talking to you CICS, a fantastic win! To verify this you get a range-list index scan „NR“ in your ACCESSTYPE PLAN_TABLE column when you EXPLAIN it.

One at a time please

Piece-wise DELETE using the FETCH FIRST nnnn ROWS was also a really good idea instead of causing possible lock escalations and/or timeouts. A simple loop around the DELETE statement and Bob’s your uncle!

Db2 13 – What’s New?

The big stuff here was the extension of the PROFILE table (I keep talking about it don’t I?) as now it also handles Local things – This is a game changer! Starting with CURRENT_LOCK_TIMEOUT & DEADLOCK_RESOLUTION_PRIORITY but I am sure this list will grow and grow. The PROFILE table is just way to good not to use these days!

AI got a major boost

We got a bunch of AI stuff in Db2 13 (SQL Data Insights) but the first new „agile“ one was in FL504 when AI_COMMONALITY was released. It will hopefully enable shops to find outliers in the data which were not there at training time.

Db2 V8 finally done!

Finally, the last thing that was started way back when in DB2 V8 was done! The length of a column name has been expanded from 30 up to 128 bytes. However, do not do this! The SQLDA is *not* designed for this and so it might look nice on paper but, depending on how you interface to them, it might cause serious grief!

Lower cadence higher quality

IBM have announced a cadence of two FL’s per year down from the 3 – 4 when Agile all started and so I am happy that the list of changes will keep getting longer and the quality of the code higher.

Just SQL!

Please remember all I am talking about here is SQL relevant enhancements – there are tons of others as well – just think about Utilities or FTB etc. etc. For a full list always download and read the latest „What’s New?“ guide.

Did I miss anything? Drop me a line if you think so!

TTFN,

Roy Boxwell