Hi Douglas,
looks like you have new management not knowing their data nor the history of it. ;-)
Theoretically you should have an XML DTD and have this assigned to the XML column to be able to verify the XML structure while inserting data. See https://www.ibm.com/docs/en/db2/11.5?topic=data-xml-parsing
Now you seem to have a mess of structured text in the XML column and the database admin shall generate a report for management. What info from the XML element does your management want? Let them have a look at https://www.w3schools.com/xml/xml_elements.asp and let them define their goal. A customer of mine used a CLOB to store a free text form in rougly XML syntax in a table for additional agreements in an insurance policy. This is not solvable by Db2 alone nor would the output of such a report be helpful.
The closest solution to your problem could export the XML into text and do some text extraction/search to find tag names in angle brackets < > and dump them, then do a sort and uniq -c to do the calculation.
I'm not aware, that Db2 is storing the tag names of the deconstructed XML document in the XML storage object https://www.ibm.com/docs/en/db2/11.5?topic=storage-xml-object in a form accessible by SQL.
if you would have indexes on XML data, you could find some paths in syscat.indexes.
Also annotated schema decomposition https://www.ibm.com/docs/en/db2/11.5?topic=purexml-annotated-xml-schema-decomposition could only help you, if you have a DTD, which probably management doesn't have.
Good luck!
------------------------------
Roland Schock
IBM Champion and IBM Gold Consultant
------------------------------
Original Message:
Sent: Wed October 11, 2023 04:41 PM
From: Douglas Kostelnik
Subject: Getting a list of XML Fields in an XML column
Hi,
I started a new job a couple months back. The company has a number of tables (<10) that have an XML column in them that people have been using for a variety of reasons. What they want to do is identify all of the fields in each row (per table) and count how many times those fields appear in the XML column in the table - basically doing an inventory and count of the XML fields in each table. Each row can have different fields in the column.
The end goal is to get a list of all of the fields (and number of times they exist) per table, and determine which ones are valid. Then add those fields as columns to the table, populate the new columns and remove the XML column completely.
The part I need help on is the Inventory and Counting of the individual fields. Does anyone have a process or method that does this?
------------------------------
Douglas Kostelnik
------------------------------