Part five of my – everlasting – walk through of new Aggregate and Scalar functions. This time I will start with the new AGGREGATE functions introduced in Db2 12.
LISTAGG
This is mainly of interest due to the fact that it was the very first Agile Function Level feature in Db2. You must be at FL501 to use this in SQL otherwise you get a
DSNT408I SQLCODE = -4700 ATTEMPT TO USE NEW FUNCTION BEFORE FUNCTION LEVEL IS ACTIVATED
Or a
DSNT408I SQLCODE = -4743, ERROR: ATTEMPT TO USE A FUNCTION WHEN THE APPLICATION COMPATIBILITY SETTING IS SET FOR A PREVIOUS LEVEL
What this function does, is basically a horizontal recursive join which can be sorted in a different sequence. Here’s the SQL example:
SELECT WORKDEPT,
LISTAGG(LASTNAME, ', ') WITHIN GROUP(ORDER BY LASTNAME)
AS EMPLOYEES
FROM SAMPLE.EMP
GROUP BY WORKDEPT;
Which returns:
---------+---------+---------+---------+---------+---------+---------+-----+
WORKDEPT EMPLOYEES
---------+---------+---------+---------+---------+---------+---------+-----+
A00 HAAS, LUCCHESI, O'CONNELL
B01 THOMPSON
C01 KWAN, NICHOLLS, QUINTANA
D11 ADAMSON, BROWN, JONES, LUTZ, PIANKA, SCOUTTEN, STERN, WALKER,
YOSHIMURA
D21 JEFFERSON, JOHNSON, MARINO, PEREZ, PULASKI, SMITH
E01 GEYER
E11 HENDERSON, PARKER, SCHNEIDER, SETRIGHT, SMITH
E21 GOUNOT, LEE, MEHTA, SPENSER
DSNE610I NUMBER OF ROWS DISPLAYED IS 8
So you can see it adds the LASTNAME and a comma repeatedly for each WORKDEPT value.
Db2 Catalog Docu
This function is actually very handy for one of the things I create for our programmers here in the labs. It is always a good idea to “know the catalog” and so I created a little word document years ago with all the tablespaces, tables, indexes and index columns with sort order so that they did not have to look in three separate books to find the data. Here is the SQL for listing out the tables with indexes:
SELECT SUBSTR(A.DBNAME, 1, 8) AS DB
,SUBSTR(A.NAME, 1, 8 ) AS TS
,SUBSTR(B.NAME, 1, 18) AS TABLE
,SUBSTR(C.NAME, 1, 8 ) AS INDEX
,B.TYPE
,CASE C.UNIQUERULE
WHEN 'D' THEN '-'
ELSE 'Y'
END AS U
,D.COLSEQ
,SUBSTR(D.COLNAME, 1, 18) AS COL
,CASE D.ORDERING
WHEN ' ' THEN 'I'
WHEN 'A' THEN '-'
WHEN 'D' THEN 'D'
WHEN 'R' THEN 'R'
END AS ORDER
FROM SYSIBM.SYSTABLESPACE A
,SYSIBM.SYSTABLES B
,SYSIBM.SYSINDEXES C
,SYSIBM.SYSKEYS D
WHERE A.DBNAME IN ('DSNDB01','DSNDB06')
AND A.DBNAME = B.DBNAME
AND A.NAME = B.TSNAME
AND B.CREATOR = 'SYSIBM'
AND B.NAME = C.TBNAME
AND B.CREATOR = C.TBCREATOR
AND C.NAME = D.IXNAME
AND C.CREATOR = D.IXCREATOR
ORDER BY DB, TS , TABLE , INDEX, D.COLSEQ ;
It returns 855 rows of data on my test Db2 12 Fl501 system:
---------+-------+-------+---+-------+------+--------+------------------+--
DB TS TABLE INDEX TYPE U COLSEQ COL ORDER
---------+-------+-------+---+-------+------+--------+------------------+--
DSNDB01 DBD01 DBDR DSNDB01X T Y 1 DBID -
DSNDB01 DBD01 DBDR DSNDB01X T Y 2 SECTION -
DSNDB01 SCT02 SCTR DSNSCT02 T Y 1 SCTNAME -
DSNDB01 SPT01 SPTR DSNSPT01 T Y 1 SPTLOCID -
DSNDB01 SPT01 SPTR DSNSPT01 T Y 2 SPTCOLID -
DSNDB01 SPT01 SPTR DSNSPT01 T Y 3 SPTNAME -
And ending with
DSNDB06 SYSXML SYSXMLRELS DSNXRX02 T - 1 XMLTBOWNER -
DSNDB06 SYSXML SYSXMLRELS DSNXRX02 T - 2 XMLTBNAME -
DSNDB06 SYSXML SYSXMLSTRINGS DSNXSX01 T Y 1 STRINGID -
DSNDB06 SYSXML SYSXMLSTRINGS DSNXSX02 T Y 1 STRING -
DSNE610I NUMBER OF ROWS DISPLAYED IS 855
Then I used my amazing WORD skills to move all the COL columns after each other to get a DOC file that looks like this:
DSNDB0n. SYSIBM. SYSIBM. U INDEX FIELD
DBD01 DBDR DSNDB01X Y DBID.SECTION
SCT02 SCTR DSNSCT02 Y SCTNAME
SPT01 SPTR DSNSPT01 Y SPTLOCID.SPTCOLID.SPTNAME.etc.
Which, of course, the programmers loved, as it enabled them to quickly find which columns, in which sort order, (I convert the ORDER D COL column to be in bold by the way), are available on the catalog and directory tables. I started this with DB2 V5…
Now with LISTAGG the query looks like:
SELECT SUBSTR(A.DBNAME, 1, 8)
,SUBSTR(A.NAME, 1, 8 )
,SUBSTR(B.NAME, 1, 18)
,SUBSTR(C.NAME, 1, 8 )
,B.TYPE AS T
,CASE C.UNIQUERULE
WHEN 'D' THEN '-'
ELSE 'Y'
END AS U
,LISTAGG( CHAR(SUBSTR(D.COLNAME, 1, 18) CONCAT
CASE D.ORDERING
WHEN ' ' THEN 'I'
WHEN 'A' THEN '-'
WHEN 'D' THEN 'D'
WHEN 'R' THEN 'R'
END
)
, ' ') WITHIN GROUP (ORDER BY D.COLSEQ ASC)
FROM SYSIBM.SYSTABLESPACE A
,SYSIBM.SYSTABLES B
,SYSIBM.SYSINDEXES C
,SYSIBM.SYSKEYS D
WHERE A.DBNAME IN ('DSNDB01','DSNDB06')
AND A.DBNAME = B.DBNAME
AND A.NAME = B.TSNAME
AND B.CREATOR = 'SYSIBM'
AND B.NAME = C.TBNAME
AND B.CREATOR = C.TBCREATOR
AND C.NAME = D.IXNAME
AND C.CREATOR = D.IXCREATOR
GROUP BY A.DBNAME, A.NAME, B.NAME, C.NAME, B.TYPE, C.UNIQUERULE
;
And the output:
---------+---------+-------+-----------+---------------------------+---
T U
---------+---------+-------+-----------+---------------------------+---
DSNDB01 DBD01 DBDR DSNDB01X T Y DBID - SECTION -
DSNDB01 SCT02 SCTR DSNSCT02 T Y SCTNAME -
DSNDB01 SPT01 SPTR DSNSPT01 T Y SPTLOCID - SPTCOLID - SPTNAME
And ends
DSNDB06 SYSXML SYSXMLRELS DSNXRX02 T - XMLTBOWNER - XMLTBNAME
DSNDB06 SYSXML SYSXMLSTRINGS DSNXSX01 T Y STRINGID -
DSNDB06 SYSXML SYSXMLSTRINGS DSNXSX02 T Y STRING -
DSNE610I NUMBER OF ROWS DISPLAYED IS 287
Which is a lot less data for my poor little fingers to work with!
Of course it is not perfect… the really big GOTCHA, is ORDER BY is *not* allowed!
PERCENTILES
For the next couple of examples, I will be using the table SAMPLE.EMP that contains these rows and columns of interest:
---------+---------+---------+---------+---------+-----
EMPNO WORKDEPT SALARY BONUS COMM
---------+---------+---------+---------+---------+-----
000210 D11 18270.00 400.00 1462.00
000190 D11 20450.00 400.00 1636.00
000180 D11 21340.00 500.00 1707.00
000160 D11 22250.00 400.00 1780.00
000170 D11 24680.00 500.00 1974.00
000150 D11 25280.00 500.00 2022.00
000200 D11 27740.00 600.00 2217.00
000220 D11 29840.00 600.00 2387.00
000060 D11 32250.00 600.00 2580.
DSNE610I NUMBER OF ROWS DISPLAYED IS 9
PERCENTILE_CONT
Calculates the requested percentile as a continuous value. Use this if you want a calculated value based upon your input. E.g.:
SELECT PERCENTILE_CONT(0,50)
WITHIN GROUP (ORDER BY SALARY ASC )
FROM SAMPLE.EMP
WHERE WORKDEPT = 'D11'
;
---------+---------+---------+---------+---------+---------
---------+---------+---------+---------+---------+---------
+0.2468000000000000E+05 (EMPLOYEE 150 row 5)
DSNE610I NUMBER OF ROWS DISPLAYED IS 1
Here you can see that I require the 50th percentile, and as the data happens to have nine rows it would be the 5th row If you look at the data you will see that that is indeed the case.
Now, reversing the direction of the percentile, I want the 90th descending value. This row does not exist in the input, so the function computes the value that will probably best fit
DSNE616I STATEMENT EXECUTION WAS SUCCESSFUL, SQLCODE IS 100
---------+---------+---------+---------+---------+---------
SELECT PERCENTILE_CONT(0,90)
WITHIN GROUP (ORDER BY SALARY DESC)
FROM SAMPLE.EMP
WHERE WORKDEPT = 'D11'
;
---------+---------+---------+---------+---------+---------
---------+---------+---------+---------+---------+---------
+0.2001400000000000E+05 (value between first and second rows)
DSNE610I NUMBER OF ROWS DISPLAYED IS 1
PERCENTILE_DISC
Calculates the requested percentile as a discrete value. Use this if you do not want a calculated value based upon your input. E.g.:
SELECT PERCENTILE_DISC(0,50)
WITHIN GROUP (ORDER BY SALARY ASC )
FROM SAMPLE.EMP
WHERE WORKDEPT = 'D11'
;
---------+---------+---------+---------+--
---------+---------+---------+---------+--
24680.00 (EMPLOYEE 150 row 5)
DSNE610I NUMBER OF ROWS DISPLAYED IS 1
Here you can see that I still require the 50th percentile, and, as the data happens to still have nine rows, it would be the 5th row again.
Now, reversing the direction of the percentile, I want the 90th descending value. This row does not exist in the input so the function returns the nearest input value.
DSNE616I STATEMENT EXECUTION WAS SUCCESSFU
---------+---------+---------+---------+--
SELECT PERCENTILE_DISC(0,90)
WITHIN GROUP (ORDER BY SALARY DESC)
FROM SAMPLE.EMP
WHERE WORKDEPT = 'D11'
;
---------+---------+---------+---------+--
---------+---------+---------+---------+--
18270.00 (90th is not in the input data but this is the nearest)
DSNE610I NUMBER OF ROWS DISPLAYED IS 1
Feel free to send me your comments and ask questions.
TTFN,
Roy Boxwell
Senior Architect