Hi! Continuing on with my AI blog (last one, I promise!), I wish to delve into the innards of the USS part of the SQL Data Insights experience and show you what it all costs!
A Quick Review Perhaps?
Please check my older newsletters for everything about the installation etc. of SQL DI, and for one important thing: the latest Vector Prefetch APARs (also see my last newsletter for details). Now I will be doing “before and after” performance reviews with this feature on and off.
Bad News First!
What I have found is that when I take a 500,000 row table into SQL DI and choose 17 columns, it takes the *entire* machine as well as all local page datasets, and I was forced to cancel it after five hours…
Looking in the Logs…
If you go trawling around your Unix Directories, you will trip over these paths:
/u/work/sqldi
Home is Where the Spark is!
This is “home” where all of the SQL DI stuff is “installed”. Naturally, your name might be different!
Under here is the next layer of interest to me for the Spark processing.
It is Magic!
/u/work/sqldi/spark – Now this is where Spark does “the magic” and actually computes all your vector table data. It runs in stages and the first is the Base10 (I guess numeric analysis) part. For my test data it looks like this:
Scroll down to the bottom:
So, this ran really quickly!
Internals…
Then it does a ton of internal stuff and it starts actually doing the learning, which is “progressed” in a file like this:
Just Sitting There, Typing REF and Pressing ENTER…
Of course your name will be different, but just sitting there in OMVS and using the REF command you will see this file grow in size every now and again. When it does, quickly Browse on in and you will see stuff like this:
ibm-data2Vec (1.1.0 for zOS) starting execution using file /var/sqldi/temp/training/DSNAIDB_AIDB_DAI
ibm-data2vec found the required library: libzaio.so. Proceeding with the training..
ibm-data2vec will use following mode: CBLAS
User has not provided training chunk size. Using 1 GB chunk size for reading training file.
ibm-data2Vec is preallocating space for the model using user-provided value 1230314
ibm-data2Vec starting execution using file /var/sqldi/temp/training/DSNAIDB_AIDB_DAIN0610_IQATW001_1
83951683 ! 2023-03-15 07:17:27 ! Time elapsed learning vocab from train file = 145.91525s
Processed 13103200 words in the training file. There are 1213852 unique words in the vocabulary: Pri
Model training code will generate vectors for row-identifier (pk_id) or user-specified primary keys
83951683 ! 2023-03-15 07:17:27 ! Stage 1 completed. Time elapsed during file reading = 145.91643s
Training the database embedding (db2Vec) model using 12 CPU thread(s)
Whole Machine Gone – Oh Oh!
Now, in my case, it just sat there for a while taking all paging, all frames, all zIIP and CP CPU, and then it wrote out:
Epoch 0 learning rate Alpha=0.024704 Training Progress=5.00%
Epoch 0 learning rate Alpha=0.024404 Training Progress=10.00%
Epoch 0 learning rate Alpha=0.024099 Training Progress=15.00%
Epoch 0 learning rate Alpha=0.023791 Training Progress=20.00%
Epoch 0 learning rate Alpha=0.023486 Training Progress=25.00%
Epoch 0 learning rate Alpha=0.023182 Training Progress=30.00%
Epoch 0 learning rate Alpha=0.022885 Training Progress=35.00%
Epoch 0 learning rate Alpha=0.022582 Training Progress=40.00%
Epoch 0 learning rate Alpha=0.022286 Training Progress=45.00%
Epoch 0 learning rate Alpha=0.021980 Training Progress=50.00%
Epoch 0 learning rate Alpha=0.021673 Training Progress=55.00%
That last line was written out at 12:42. As training had started at 07:17, you can see that I still had nearly a five-hour wait ahead of me. Time to cancel and rethink this!
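If you want to put a number on that wait, here is a minimal sketch in Python (not part of SQL DI). It assumes that training progress is roughly linear in time, which the log does not guarantee, and it uses the 07:17:27 Stage 1 timestamp and the 12:42 observation from above. The function name estimate_remaining is purely illustrative.

```python
from datetime import datetime, timedelta

def estimate_remaining(start: datetime, now: datetime, progress_pct: float) -> timedelta:
    """Linear extrapolation: time left = elapsed * (100 - done) / done."""
    if not 0 < progress_pct <= 100:
        raise ValueError("progress_pct must be in (0, 100]")
    elapsed = now - start
    return elapsed * ((100.0 - progress_pct) / progress_pct)

# Timestamps from the log above; the 12:42 value only has minute precision.
start = datetime(2023, 3, 15, 7, 17, 27)   # "Stage 1 completed" line
seen = datetime(2023, 3, 15, 12, 42, 0)    # when the 55.00% line was written
remaining = estimate_remaining(start, seen, 55.0)
print(f"Estimated time still to go: {remaining}")
```

On these numbers the linear estimate lands between four and five hours, the same ballpark as my guess, and that only holds if the machine does not start paging even harder.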
Restart!
Thankfully, in the GUI (where, sadly, you cannot see this progress info!) the “Stop training” button worked after a while. If it does not respond, you can just issue the
S SQLDAPPS,OPTION='SQLDSTOP'
command to stop it. Then, once everything has stopped and the CPUs have cooled down a bit, you can select a smaller data set and retry learning!
Smaller is Sometimes Better!
And with 40,000 rows it is much faster:
50397300 ! 2023-03-15 12:17:16 ! Stage 1 completed. Time elapsed during file reading = 26.992490s
Training the database embedding (db2Vec) model using 12 CPU thread(s)
Epoch 0 learning rate Alpha=0.024765 Training Progress=5.00%
Epoch 0 learning rate Alpha=0.024539 Training Progress=10.00%
Epoch 0 learning rate Alpha=0.024308 Training Progress=15.00%
Epoch 0 learning rate Alpha=0.024073 Training Progress=20.00%
Epoch 0 learning rate Alpha=0.023826 Training Progress=25.00%
Epoch 0 learning rate Alpha=0.023591 Training Progress=30.00%
Epoch 0 learning rate Alpha=0.023354 Training Progress=35.00%
Epoch 0 learning rate Alpha=0.023115 Training Progress=40.00%
Epoch 0 learning rate Alpha=0.022878 Training Progress=45.00%
Epoch 0 learning rate Alpha=0.022637 Training Progress=50.00%
Epoch 0 learning rate Alpha=0.022406 Training Progress=55.00%
Naturally, this is heavily dependent on the machine you have, the memory you have and the size of your local paging dataset.
EXPLAIN Yourself!
So now to do some EXPLAIN runs and then a quick comparison of the “double” AI Whammy that I have, quickly followed by the “New” PTF that, hopefully, sorts it all out.
Double Trouble?
You might have noticed that in my test SQLs I have to use the AI BiF twice: once in the SELECT and once in the WHERE. This is because the use of the AI_VALUE column is not supported in the WHERE predicate.
Naturally, you can re-write the query to look like this:
SELECT * FROM
(SELECT AI_SEMANTIC_CLUSTER( PROGRAM,
'DSNTIAUL',
'DSN§EP2L',
'DSN§EP4L') AS AI_VALUE
,A.WLX_TIMESTAMP
,A.STMT_ID
,A.STMT_TIMESTAMP
,SUBSTR(A.PRIM_AUTHOR , 1 , 8) AS PRIM_AUTHOR
,SUBSTR(A.PROGRAM , 1 , 8) AS PROGRAM
,SUBSTR(A.REF_TABLE , 1 , 18) AS REF_TABLE
,A.EXECUTIONS
,A.GETP_OPERATIONS
,A.ELAPSE_TIME
,A.CPU_TIME
,A.STMT_TEXT
FROM DAIN0610.IQATW001 A
WHERE 1 = 1
AND A.PROGRAM NOT IN ('DSNTIAUL',
'DSN§EP2L',
'DSN§EP4L')
AND A.STMT_ORIGIN = 'D'
)
WHERE AI_VALUE IS NOT NULL
ORDER BY 1 DESC -- SHOW BEST FIRST
--ORDER BY 1 -- SHOW WORST FIRST
FETCH FIRST 10 ROWS ONLY ;
Does My Work File Look Big to You?
The problem is that now you have a HUGE work file… In my tests it was always much quicker to code the AI BiF twice. After all, it is always “Your Mileage May Vary”, “The Cheque is in the post” or “It depends”, isn’t it?
AI Does Use the Optimizer!
EXPLAIN Output… The AI BiF does indeed show up in the EXPLAIN output (I was surprised about this, to be honest!) for the following query:
SELECT AI_SEMANTIC_CLUSTER( PROGRAM,
'DSNTIAUL',
'DSN§EP2L',
'DSN§EP4L') AS AI_VALUE
,A.WLX_TIMESTAMP
,A.STMT_ID
,A.STMT_TIMESTAMP
,SUBSTR(A.PRIM_AUTHOR , 1 , 8) AS PRIM_AUTHOR
,SUBSTR(A.PROGRAM , 1 , 8) AS PROGRAM
,SUBSTR(A.REF_TABLE , 1 , 18) AS REF_TABLE
,A.EXECUTIONS
,A.GETP_OPERATIONS
,A.ELAPSE_TIME
,A.CPU_TIME
,A.STMT_TEXT
FROM DAIN0610.IQATW001 A
WHERE 1 = 1
AND A.PROGRAM NOT IN ('DSNTIAUL',
'DSN§EP2L',
'DSN§EP4L')
AND AI_SEMANTIC_CLUSTER( PROGRAM,
'DSNTIAUL',
'DSN§EP2L',
'DSN§EP4L')
IS NOT NULL
AND A.STMT_ORIGIN = 'D'
ORDER BY 1 DESC -- SHOW BEST FIRST
--ORDER BY 1 -- SHOW WORST FIRST
FETCH FIRST 10 ROWS ONLY ;
The EXPLAIN output looks like:
Then it gets an interesting STAGE2 RANGE predicate!
which resolves into:
So here we see what the BiF is doing from the perspective of the Optimizer! If you run the nested table version of the query then this line does *not* appear at all!
Notice here that the RANGE is now a STAGE1!
Optimize This!
So IBM Db2 has incorporated it into the Optimizer, which is a good thing. But please remember: your SQL can have local predicates that cut down the size of the work file and so even out the access times… Basically, you must code both and test to see which of the solutions is better for typical usage (as always, really…)
Time, Measure, Repeat
OK, now just one execution of the double query requires 2.58 seconds of CPU and 15.35 seconds elapsed. The statement is *in* the DSC, so prepare time can be ignored. Here you can see it has been executed twice, so we have average values, but I am using the CPU from the batch job as it is more precise.
Changing the query to now fetch back all rows instead of first ten requires 7.06 seconds of CPU and 48.78 seconds elapsed. But it returned over 200K rows!
While the query was running you can see the SQLD SQL DI in SDSF taking quite large chunks of zIIP time…
Now I will enable Vector Prefetch with a value of 10GB to see if it makes an impact for these queries. To do this you must update the ZPARM MXAIDTCACH and then enable the changed ZPARM.
That is Not What I was Expecting!
The first query is now 2.56 seconds CPU and 15.26 seconds elapsed. More like background noise than an improvement. And now, with the FETCH FIRST removed: 7.07 seconds CPU and 49.36 seconds elapsed. I guess my queries are not improved with Vector Prefetch!
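To see just how much “background noise” this is, here is a small Python sketch that turns the four before/after measurements from the text into percentage changes. The helper name pct_change and the dictionary layout are only illustrative; the numbers are the ones quoted above.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change in percent; negative means the 'after' run was faster."""
    return (after - before) / before * 100.0

# (CPU seconds, elapsed seconds) without and with Vector Prefetch, from the text
runs = {
    "FETCH FIRST 10": {"before": (2.58, 15.35), "after": (2.56, 15.26)},
    "all rows":       {"before": (7.06, 48.78), "after": (7.07, 49.36)},
}

changes = {}
for name, r in runs.items():
    cpu = pct_change(r["before"][0], r["after"][0])
    ela = pct_change(r["before"][1], r["after"][1])
    changes[name] = (cpu, ela)
    print(f"{name:15s} CPU {cpu:+.2f}%  elapsed {ela:+.2f}%")
```

All four deltas stay well under two percent in either direction, which for single runs on a shared test system is not something I would call a measurable effect.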
Could be Me…
From the IBM Vector Prefetch documentation:
Time to Join the Real World!
Now let’s try and see if I can discover something new in real data! Anything sensitive has been obfuscated!
SELECT AI_SIMILARITY( PROGRAM,
'IQADBACP') AS AI_VALUE
,SUBSTR(A.PRIM_AUTHOR , 1 , 8) AS PRIM_AUTHOR
,SUBSTR(A.PROGRAM , 1 , 8) AS PROGRAM
,SUBSTR(A.REF_TABLE , 1 , 18) AS REF_TABLE
,A.WLX_TIMESTAMP
,A.STMT_ID
,A.STMT_TIMESTAMP
,A.EXECUTIONS
,A.GETP_OPERATIONS
,A.ELAPSE_TIME
,A.CPU_TIME
,A.STMT_TEXT
FROM DAIN0610.IQATW001 A
WHERE 1 = 1
AND NOT A.PROGRAM = 'IQADBACP'
AND AI_SIMILARITY ( PROGRAM,
'IQADBACP')
IS NOT NULL
AND A.STMT_ORIGIN = 'D'
ORDER BY 1 DESC -- SHOW BEST FIRST
--ORDER BY 1 -- SHOW WORST FIRST
FETCH FIRST 10 ROWS ONLY;
This is similar to my test from last month but now on real data. Note that I have added a predicate A.STMT_ORIGIN = 'D' as I only want Dynamic SQL programs:
Dynamic Hits?
Here you can see that it has found a variety of programs that also do dynamic SQL, but I also “helped” it by only asking for dynamic SQL. So now once again, but this time without the predicate A.STMT_ORIGIN = 'D':
Success!
It has found nearly all from the first list but also different ones. Crucially, it has *not* found any Static SQL!
So, that’s enough of AI for the next few months for me. However, if you have any questions or ideas that I could try out feel free to email!
OK, I kept you all waiting long enough… Here are my AI results with Db2 13 FL501!
Start at the Start
We begin with the beginning as last time:
Let’s get Connected!
Here you can see that I have already defined my little test Db2 13 system to the system:
Join the Dots …
Now just click on the vertical dots:
Here you can Disconnect, Edit (Which shows you the same window as “add connection”), List AI objects or Delete.
What do we have?
Choosing List AI objects you see what has been created:
Clicking on the down arrow on the left-hand side to expand looks a lot better than last month:
Clickedy-click-click
Now, clicking on the vertical dots on the right hand side, you can choose to Disable AI query or Enable AI query. (I have actually added a new column for consideration, so first I clicked on Disable and then clicked again on Enable)
Just the Facts, Ma’am – Again
Here you must make your “Usual Suspects” decision: which columns to actually use in building the AI Model. I am using our WorkLoadExpert performance table in this newsletter and have selected 17 columns that I think will work together nicely. Only one can be a “Key” column, and I chose STMT_ID in this case. Once you are done selecting columns, click on the big blue “Next” button, where you may then add additional filters to remove any rows you know are to be ignored:
Playing Chicken?
When done, click on the big blue “Enable” button and you get your last chance to chicken out:
SIO and CPU Records!
Click here and then get a cup of coffee… or go to SDSF and marvel at how much CPU and I/O Spark actually uses, as the light bulbs dim in your part of the world…
You Keep me Spinning
Oddly, at least when I do this, the Initializing spinning wheels:
Right Round and Around
… never stop. The WLM Stored procedure for utilities was finally kicked off about 40 minutes later:
J E S 2 J O B L O G -- S Y S T E M
10.43.19 STC09611 ---- WEDNESDAY, 01 MAR 2023 ----
10.43.19 STC09611 $HASP373 DD10WLMU STARTED
10.43.19 STC09611 IEF403I DD10WLMU - STARTED - TIME=10.43.19
10.43.19 STC09611 ICH70001I SQLDIID LAST ACCESS AT 09:37:37
A Loaded Question?
And loaded all the required data:
ICE134I 0 NUMBER OF BYTES SORTED: 99083595
ICE253I 0 RECORDS SORTED - PROCESSED: 49173, EXPECTED: 49173
A quick exit and re-logon to the web interface…and Tra la!
Not only AI but Dr Who!
It is also strange that it seems to be in a time machine, one hour ahead of my local time… Anyway, my new data is there and so onward! (I have since heard that our time zone setting is actually to blame and that just going back one level, and then forward again, stops the spinning wheel problem. However, just wait until Spark finishes and the stored procedure has loaded your data!)
Never Trust a Statistic You haven’t Faked Yourself!
Clicking on Data statistics shows:
Influencer of the Day?
Then you can look at the Column influence:
Super Model?
Back at the top you can then review the Model details:
Or just a Cluster….
Here are the Cluster center details:
Going back to the List AI Objects window, there are two blue buttons: Add object and Run query. I did not discuss Run Query last month but it gives you a SPUFI-like ability on the PC, tailored to the AI BiFs:
Let’s RUN Away!
Clicking on Query type gives a drop-down list of the basic AI BiFs where it then gives you an example SQL (based on the documentation, *not* on any AI Tables you might have done!). Once you type in any query the “run” box turns blue:
It Works!
Click run and see the results:
Data Review
Once the model is trained, you can then review on the host what it has done. In SPUFI you can find the details in the pseudo Db2 catalog tables that support Data Insights (I have removed a ton of rows to make this readable, sort of!):
SELECT * FROM
SYSAIDB.SYSAIOBJECTS ;
---------+---------+---------+---------+---------+---------+---------+---------+-------+-
OBJECT_ID OBJECT_NAME OBJECT_TYPE SCHEMA NAME
---------+---------+---------+---------+---------+---------+---------+---------+-------+-
26 -------------------------------- T IQA061QB IQATW001
-------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+
STATUS CONFIGURATION_ID MODEL_ID CREATED_BY CREATED_DATE
-------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+
Enabled 36 36 SQLDIID 2023-02-24-07.57.42.086932
-------+---------+---------+---------+---------+---------+---------+---------+
LAST_UPDATED_BY LAST_UPDATED_DATE DESCRIPTION
-------+---------+---------+---------+---------+---------+---------+---------+
SQLDIID 2023-03-01-10.43.38.407460 ----------------
SELECT * FROM
SYSAIDB.SYSAICONFIGURATIONS ;
---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+-----
CONFIGURATION_ID NAME OBJECT_ID RETRAIN_INTERVAL KEEP_ROWIDENTIFIER_KEY
---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+-----
36 -------------------------------- 26 ---------------- Y
---+---------+-------
NEGLECT_VALUES
---+---------+-------
---+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+--------
CREATED_BY CREATED_DATE LAST_UPDATED_BY LAST_UPDATED_DATE
---+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+--------
SQLDIID 2023-03-01-09.51.00.994421 SQLDIID 2023-03-01-09.51.00.994461
SELECT * FROM
SYSAIDB.SYSAICOLUMNCONFIG
ORDER BY 1 , 3 , 2 ;
--+---------+---------+---------+---------+---------+---------+---------+---------+---------+
CONFIGURATION_ID COLUMN_NAME COLUMN_AISQL_TYPE COLUMN_PRIORITY NEGLECT_VALUES
--+---------+---------+---------+---------+---------+---------+---------+---------+---------+
36 END_USERID C H
36 PRIM_AUTHOR C H
36 PROGRAM C H
36 REF_TABLE C H
36 REF_TAB_QUAL C H
36 STMT_ORIGIN C H
36 STMT_TEXT C H
36 TRANSACTION C H
36 WORKSTATION C H
36 COPIES_NO I H
.
.
.
36 WLX_TYPE I H
36 WORKSTATION_OLD I H
36 STMT_ID K H
36 CPU_TIME N H
36 ELAPSE_TIME N H
36 EXECUTIONS N H
36 GETP_OPERATIONS N H
36 ROWS_EXAMINED N H
36 ROWS_PROCESSED N H
36 STMT_LENGTH N H
When the column COLUMN_AISQL_TYPE has a value of “I” it means it is ignored by AI processing. Also note that this table SYSAICOLUMNCONFIG gets two extra columns (COLUMN_VECTOR_CARDINALITY and MAX_DATA_VALUE_LEN) once you apply the vector prefetch upgrade APARs:
For IBM Z AI Optimization (zAIO) library and IBM Z AI Embedded (zADE) library in the IBM Z Deep Neural Network (zDNN) stack on z/OS:
• Apply OA63950 and OA63952 for z/OS 2.5 (HZAI250).
• Apply OA63949 and OA63951 for z/OS 2.4 (HBB77C0).
For OpenBLAS on z/OS:
• Apply PH49807 and PH50872 for both z/OS 2.5 and z/OS 2.4 (HTV77C0).
• Apply PH50881 for z/OS 2.5 (HLE77D0).
• Apply PH50880 for z/OS 2.4 (HLE77C0).
For Db2 13 for z/OS, apply PH51892. Follow the instructions for DDL migration outlined in the ++ HOLD text. By default, the new Db2 subsystem parameter MXAIDTCACH is set to 0, indicating that vector prefetch is disabled. To enable vector prefetch, set MXAIDTCACH to a value between 1 and 512. This parameter is online changeable. See “IBM Db2 13 for z/OS documentation” on MXAIDTCACH.
For SQL Data Insights 1.1.0 UI and model training (HDBDD18), apply PH51052.
Further, the table SYSAIMODELS got a new column MODEL_CODE_LEVEL and an increase in size for the METRIC column to 500K with the above APARs.
SELECT * FROM
SYSAIDB.SYSAIMODELS ;
---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------
MODEL_ID NAME OBJECT_ID CONFIGURATION_ID VECTOR_TABLE_CREATOR
---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------
36 -------------------------------- 26 36 DSNAIDB
+---------+---------+--
VECTOR_TABLE_NAME
+---------+---------+--
AIDB_IQA061QB_IQATW001
+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+
VECTOR_TABLE_STATUS VECTOR_TABLE_DBID VECTOR_TABLE_OBID VECTOR_TABLE_IXDBID VECTOR_TABLE_IXOBID VECTOR_TABLE_VERSION
+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+
A 329 3 329 4 1
-------+---------+---------+---------+---------+---------+---------+---------+-------
METRICS
-------+---------+---------+---------+---------+---------+---------+---------+-------
[{"discriminator":8.59443984950101,"influence":0.9367419701380996,"name":"TRANSACTION",
-------+---------+---------+
INTERPRETABILITY_STRUCT
-------+---------+---------+
-------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+----
CREATED_BY CREATED_DATE LAST_UPDATED_BY LAST_UPDATED_DATE
-------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+----
SQLDIID 2023-03-01-09.51.03.777504 SQLDIID 2023-03-01-10.43.37.796847
---+---------+---------+---------+---------+--
MODEL_ROWID
---+---------+---------+---------+---------+--
2495E518C773B081E09C018000000100000000002213
SELECT * FROM
SYSAIDB.SYSAICOLUMNCENTERS
ORDER BY 1 , 2 , 3 ;
----+---------+---------+---------+---------+---------+---------+
MODEL_ID COLUMN_NAME CLUSTER_MIN LABEL
----+---------+---------+---------+---------+---------+---------+
36 CPU_TIME -0.7200000000000000E+76 EMPTY
36 CPU_TIME +0.0 E+00 c0
36 CPU_TIME +0.2000000000000000E+01 c1
36 CPU_TIME +0.1617671400000000E+08 c9
36 ELAPSE_TIME -0.7200000000000000E+76 EMPTY
36 ELAPSE_TIME +0.0 E+00 c0
36 ELAPSE_TIME +0.2000000000000000E+01 c1
36 ELAPSE_TIME +0.1008466600000000E+08 c9
36 ELAPSE_TIME +0.1074954980000000E+09 c10
SELECT * FROM
SYSAIDB.SYSAITRAININGJOBS ;
---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+--------+-------
TRAINING_JOB_ID OBJECT_ID CONFIGURATION_ID MODEL_ID STATUS PROGRESS RESOURCE
---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+--------+-------
33 26 33 33 F 0
34 26 34 34 F 0
35 26 35 35 C 100
36 26 36 36 C 100
-+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+------
MESSAGES
-+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+------
{"messages":"failed to train model: Something went wrong with the zLoad, please check the SQL DI log for more details.","resumeI
{"messages":"failed to train model: Something went wrong with the zLoad, please check the SQL DI log for more details.","resumeI
{"messages":"model training is completed","sparkSubmitId":"driver-20230224105851-0002"}
{"messages":"model training is completed","sparkSubmitId":"driver-20230301085133-0003"}
-+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+----
START_TIME END_TIME CREATED_BY CREATED_DATE
-+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+----
2023-02-24-08.01.20.737455 2023-02-24-08.51.56.386011 SQLDIID 2023-02-24-08.01.20.737455
2023-02-24-10.52.27.687965 2023-02-24-11.43.22.095144 SQLDIID 2023-02-24-10.52.27.687965
2023-02-24-11.58.20.109571 2023-02-24-12.49.20.660143 SQLDIID 2023-02-24-11.58.20.109571
2023-03-01-09.51.03.777662 2023-03-01-10.43.38.407414 SQLDIID 2023-03-01-09.51.03.777662
---+---------+---------+---------+---------+---------+------
LAST_UPDATED_BY LAST_UPDATED_DATE
---+---------+---------+---------+---------+---------+------
SQLDIID 2023-02-24-08.51.56.386030
SQLDIID 2023-02-24-11.43.22.095164
SQLDIID 2023-02-24-12.49.20.660160
SQLDIID 2023-03-01-10.43.38.407425
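The START_TIME and END_TIME values above tell you how long each training job actually ran. Here is a minimal Python sketch that does the arithmetic on the Db2 timestamp format as SPUFI displays it. It assumes that the timestamp rows appear in the same order as the TRAINING_JOB_ID rows (33 to 36); the helper name duration is just for illustration.

```python
from datetime import datetime, timedelta

DB2_TS = "%Y-%m-%d-%H.%M.%S.%f"  # e.g. 2023-03-01-09.51.03.777662

def duration(start: str, end: str) -> timedelta:
    """Elapsed time between two Db2-style timestamps."""
    return datetime.strptime(end, DB2_TS) - datetime.strptime(start, DB2_TS)

# START_TIME / END_TIME pairs copied from the SYSAITRAININGJOBS output above
jobs = {
    33: ("2023-02-24-08.01.20.737455", "2023-02-24-08.51.56.386011"),
    34: ("2023-02-24-10.52.27.687965", "2023-02-24-11.43.22.095144"),
    35: ("2023-02-24-11.58.20.109571", "2023-02-24-12.49.20.660143"),
    36: ("2023-03-01-09.51.03.777662", "2023-03-01-10.43.38.407414"),
}

durations = {job_id: duration(s, e) for job_id, (s, e) in jobs.items()}
for job_id, d in durations.items():
    print(f"Training job {job_id}: {d}")
```

Under that ordering assumption every job, whether it failed or completed, took a little over 50 minutes from start to end.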
KPIs from my Data
Here are a few KPIs from these first test runs:
SELECT COUNT(*) FROM IQA061QB.IQATW001 ;
64390
SELECT COUNT(*) FROM DSNAIDB.AIDB_IQA061QB_IQATW001 ;
49173
SELECT SUBSTR(A.COLUMN_NAME, 1, 12) AS COLUMN_NAME
, SUBSTR(A.VALUE , 1, 12) AS VALUE
, A.VECTOR
FROM DSNAIDB.AIDB_IQA061QB_IQATW001 A
ORDER BY 1 , 2 ;
---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+--
COLUMN_NAME VALUE VECTOR
---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+--
CPU_TIME c0 3E594822BC9D2C7A3CD4F61DBD37E5033D34B314BD4CF8E3BD4B4D47BCB6CE293D1DBA1A3D858FDF3DC4DF08BD9E77753CCED43F
CPU_TIME c1 3D9214383CFE4C90BDB3DFE4BBE407563BBA69553DB48FEFBCF39451BC6BABF0BDA31BDFBDB52F883C30B992BC8D71AF3D9E54FF
ELAPSE_TIME c0 3E55B744BCCC5CED3D129B14BC9E553C3C9B121EBD8949C0BD4F838DBD1582A33D36D6363DA1F72F3DBCB033BDAFB88F3D4DE348
ELAPSE_TIME c1 3DE390AC3D2DCC98BD2DF437BC5B7F713D766D103BD1AC10BB48E2C43B9FA9E6BD80D5D7BDC40AFE3CE586C9BCACADE93DFE2745
END_USERID BOXWEL2 3D505075BD80E40F3D3AAB60BBA463F6BBCC51C43D92B118BD044D20BD8C6B3B3CC315133BBB087A3DC1D5923DC4EB763D039C8B
END_USERID BOXWEL3 3D2FB919BC5013E3BD6652DDBD4654DA3DA4AC83BA70024FBD7FAFD0BCF16670BB2CCB4B3DBE32E93DFE13383CB052283C82FD46
As I mentioned last month the vector tables are very “special”!
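If you are curious what is hiding in that VECTOR column, here is a small Python sketch that tries to decode the hex string. Be warned: this is purely my guess that the bytes are big-endian IEEE 754 single-precision floats (4 bytes each). I have not seen the layout documented, and the function name decode_vector is hypothetical. The sample is the first 16 bytes of the CPU_TIME c0 row above.

```python
import struct
from typing import List

def decode_vector(hex_string: str) -> List[float]:
    """Decode a hex string as big-endian IEEE 754 float32 values.

    ASSUMPTION: the vector layout is a guess, not taken from any IBM documentation.
    Trailing bytes that do not fill a complete 4-byte value are ignored.
    """
    raw = bytes.fromhex(hex_string)
    usable = len(raw) - (len(raw) % 4)
    count = usable // 4
    return list(struct.unpack(f">{count}f", raw[:usable]))

sample = "3E594822BC9D2C7A3CD4F61DBD37E503"  # start of the CPU_TIME c0 vector
values = decode_vector(sample)
for v in values:
    print(f"{v:+.6f}")
```

Read this way, the numbers come out as small positive and negative values, which is what you would expect from an embedding, but it is no proof that the guess is right.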
What now?
So now we have analyzed a bunch of SQL WorkLoadExpert data from our own labs. What can we do?
First up, I wish to see what user KKKKKKK does with dynamic SQL that is “similar” to what I do with table IQATW001 but I am only interested in those SQLs where the AI thinks it is more than 0.5 (so very analogous):
SELECT AI_ANALOGY('BOXWEL3' USING MODEL COLUMN PRIM_AUTHOR,
'IQATW001' USING MODEL COLUMN REF_TABLE ,
'KKKKKKK' USING MODEL COLUMN PRIM_AUTHOR,
REF_TABLE ) AS AI_VALUE
,A.WLX_TIMESTAMP
,A.STMT_ID
,A.STMT_TIMESTAMP
,SUBSTR(A.PRIM_AUTHOR , 1 , 8 ) AS PRIM_AUTHOR
,SUBSTR(A.PROGRAM , 1 , 8 ) AS PROGRAM
,SUBSTR(A.REF_TABLE , 1 , 18) AS REF_TABLE
,A.EXECUTIONS
,A.GETP_OPERATIONS
,A.ELAPSE_TIME
,A.CPU_TIME
,A.STMT_TEXT
FROM IQA061QB.IQATW001 A
WHERE A.PRIM_AUTHOR = 'KKKKKKK'
AND AI_ANALOGY('BOXWEL3' USING MODEL COLUMN PRIM_AUTHOR,
'IQATW001' USING MODEL COLUMN REF_TABLE ,
'KKKKKKK' USING MODEL COLUMN PRIM_AUTHOR,
REF_TABLE )
> 0.5
ORDER BY 1 DESC -- SHOW BEST FIRST
--ORDER BY 1 -- SHOW WORST FIRST
FETCH FIRST 2000 ROWS ONLY ;
All interesting stuff! I use dynamic SQL to INSERT into the table a lot, and it has determined that use of dynamic SQL with tables R510T002 and IQATA001 is analogous. In fact, it is! The SQLs were all INSERT, DELETE and UPDATE… Clever ol’ AI!
Dynamic Duo?
Now I wish to see which programs process dynamic SQL like the IBM DSNTIAD and DSNTIAP programs:
SELECT AI_SEMANTIC_CLUSTER( PROGRAM,
'DSNTIAD',
'DSNTIAP') AS AI_VALUE
,A.WLX_TIMESTAMP
,A.STMT_ID
,A.STMT_TIMESTAMP
,SUBSTR(A.PROGRAM , 1 , 8) AS PROGRAM
,A.EXECUTIONS
,A.GETP_OPERATIONS
,A.ELAPSE_TIME
,A.CPU_TIME
,A.STMT_TEXT
FROM IQA061QB.IQATW001 A
WHERE A.PROGRAM NOT IN ('DSNTIAD', 'DSNTIAP')
AND A.STMT_ORIGIN = 'D'
ORDER BY 1 DESC -- SHOW BEST FIRST
--ORDER BY 1 -- SHOW WORST FIRST
FETCH FIRST 10 ROWS ONLY ;
Again, very nice – it spotted all of the RealTime DBAExpert Dynamic SQL access programs in use…
Undynamic Duo?
OK, now the opposite of that query: show me the SQLs that are like them but not them!
SELECT AI_SEMANTIC_CLUSTER( PROGRAM,
'DSNTIAD',
'IQADBACP',
'SEDBTIAA') AS AI_VALUE
,A.WLX_TIMESTAMP
,A.STMT_ID
,A.STMT_TIMESTAMP
,SUBSTR(A.PRIM_AUTHOR , 1 , 8) AS PRIM_AUTHOR
,SUBSTR(A.PROGRAM , 1 , 8) AS PROGRAM
,SUBSTR(A.REF_TABLE , 1 , 18) AS REF_TABLE
,A.EXECUTIONS
,A.GETP_OPERATIONS
,A.ELAPSE_TIME
,A.CPU_TIME
,A.STMT_TEXT
FROM IQA061QB.IQATW001 A
WHERE A.PROGRAM NOT IN ('DSNTIAD', 'IQADBACP' ,'SEDBTIAA')
--AND A.STMT_ORIGIN = 'D'
--ORDER BY 1 DESC -- SHOW BEST FIRST
ORDER BY 1 -- SHOW WORST FIRST
FETCH FIRST 10 ROWS ONLY ;
Aha! It found a little assembler program that fires off SQL like the top three!
The Apple doesn’t Fall far from the Tree
Finally, I want to see which programs behave like IQADBACP (our main dynamic SQL driver program):
SELECT AI_SIMILARITY( PROGRAM,
'IQADBACP') AS AI_VALUE
,A.WLX_TIMESTAMP
,A.STMT_ID
,A.STMT_TIMESTAMP
,SUBSTR(A.PROGRAM , 1 , 8) AS PROGRAM
,A.EXECUTIONS
,A.GETP_OPERATIONS
,A.ELAPSE_TIME
,A.CPU_TIME
,A.STMT_TEXT
FROM IQA061QB.IQATW001 A
WHERE NOT A.PROGRAM = 'IQADBACP'
AND A.STMT_ORIGIN = 'D'
ORDER BY 1 DESC -- SHOW BEST FIRST
--ORDER BY 1 -- SHOW WORST FIRST
FETCH FIRST 10 ROWS ONLY;
And the output:
---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---
AI_VALUE WLX_TIMESTAMP STMT_ID STMT_TIMESTAMP PROGRAM
---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---
+0.4575602412223816E+00 2023-02-02-10.45.26.535375 7 2023-01-17-16.50.54.118774 O2DB6X
+0.4575602412223816E+00 2023-01-06-05.27.28.779825 7 2023-01-17-16.50.54.118774 O2DB6X
+0.4400676488876343E+00 2023-01-06-05.27.28.779825 220 2023-01-20-10.11.14.618038 DSMDSLC
+0.4400676488876343E+00 2023-01-06-05.27.28.779825 222 2023-01-20-10.11.38.136712 DSMDSLC
+0.4400676488876343E+00 2023-01-06-05.27.28.779825 221 2023-01-20-10.11.21.993833 DSMDSLC
+0.4400676488876343E+00 2023-01-06-05.27.28.779825 252 2023-01-20-10.55.07.078652 DSMDSLC
+0.4400676488876343E+00 2023-01-06-05.27.28.779825 251 2023-01-20-10.54.37.901247 DSMDSLC
+0.4400676488876343E+00 2023-01-06-05.27.28.779825 233 2023-01-20-10.47.23.961076 DSMDSLC
+0.4400676488876343E+00 2023-01-06-05.27.28.779825 232 2023-01-20-10.46.59.756430 DSMDSLC
+0.4400676488876343E+00 2023-01-06-05.27.28.779825 224 2023-01-20-10.33.42.609175 DSMDSLC
+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+-
EXECUTIONS GETP_OPERATIONS ELAPSE_TIME CPU_TIME STMT_TEXT
+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------+-
0 0 0 0 SELECT COALESCE(COALESCE(A.DBNAME,B.DBNAME),C
2 7 27753 1236 SELECT COALESCE(COALESCE(A.DBNAME,B.DBNAME),C
2 1387 57974 14900 SELECT CASE WHEN B.VCATNAME < ' ' THEN '00000
6 4170 68943 53330 SELECT CASE WHEN B.VCATNAME < ' ' THEN '00000
6 4596 286233 99773 SELECT CASE WHEN B.VCATNAME < ' ' THEN '00000
1 851 55367 42542 SELECT CASE WHEN B.VCATNAME < ' ' THEN '00000
1 298 122961 24848 SELECT CASE WHEN B.VCATNAME < ' ' THEN '00000
2 1260 68272 48952 SELECT CASE WHEN B.VCATNAME < ' ' THEN '00000
1 192 3395 2508 SELECT CASE WHEN B.VCATNAME < ' ' THEN '00000
3 810 43520 23771 SELECT CASE WHEN B.VCATNAME < ' ' THEN '00000
Again, it found all of the correct programs.
Quibble Time!
I did find some small problems…
I use ALIASes a lot and they appear in the drop-down selection box when in “Add object”, but if you choose one as an AI Object:
This then leads on to the second quibble… The red windowed error messages stay there until you click them away… This can lead you to believe that a problem exists when in reality everything is groovy!
I also found out that the spinning wheel completes if you wait for Spark and LOAD and then go back and forward on the panel.
Finally, the way you move around the product is a bit odd… sometimes you use the browser back function, sometimes you click on a “Back” button, sometimes you click on a bread crumb, sometimes there are multiple options hidden under triple vertical dots which change depending on where you are in the process.
I am sure these little UI bugs will all get ironed out very quickly!
End of Quibbles.
First Baby Steps Taken!
This little trip into the AI world is really just the tip of the iceberg. I will be doing many more AI queries over the coming months, and I hope to show all my results, either here or in another one of my Newsletters and/or at the German GUIDE in April 2023 and, hopefully, at the IDUG 2023 as well.
If you have any questions about AI, do not be afraid to ask, and if not me, then ChatGPT!
TTFN
Roy Boxwell