For organizations looking to build a vocabulary of business terms using extensive predefined taxonomies such as the IBM Knowledge Accelerators, the size of these vocabularies can present a challenge.
Business users typically just need access to the set of business terms required to catalog the data assets in their environment.
The proposed solution is to maintain the broader set of possible business terms as part of a "Source Vocabulary". From this source vocabulary, select and copy/clone the subset of business terms required by the business to a separate "Cloned Vocabulary". "Cloning" includes copying the business term, along with the associated term properties and relationships to other cloned business terms.
What's included?
- A set of Python scripts that automate the cloning of a subset of business terms in IBM Knowledge Catalog.
- A readme and set of guidelines for implementing a cloning process.
How does it work?
The first step is to identify the subset of business terms in the Source Vocabulary that are to be cloned to the Cloned Vocabulary. The identification can be based on:
- A specific category or set of categories containing business terms.
- A specific catalog which has data assets that are mapped to business terms.
- A combination of both of the above.
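As a rough illustration of the selection step, the sketch below models business terms as plain Python records and picks a subset by category, by catalog mapping, or both. All names here (`Term`, `select_terms`, the sample categories and catalog) are hypothetical and are not taken from the accelerator scripts or the IBM Knowledge Catalog API. It also assumes that "a combination" means the union of the two selections, which the actual scripts may interpret differently.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Term:
    """Hypothetical in-memory stand-in for a business term."""
    name: str
    category: str
    # Names of catalogs holding data assets mapped to this term (assumed shape).
    mapped_catalogs: frozenset = field(default_factory=frozenset)


def select_terms(source, categories=(), catalogs=()):
    """Return the names of source terms matching any given category or catalog."""
    wanted_categories = set(categories)
    wanted_catalogs = set(catalogs)
    selected = set()
    for term in source:
        in_category = term.category in wanted_categories
        in_catalog = bool(term.mapped_catalogs & wanted_catalogs)
        # Assumption: a combined selection is the union of both criteria.
        if in_category or in_catalog:
            selected.add(term.name)
    return selected


# Illustrative sample data, not real Knowledge Accelerator content.
source_vocabulary = [
    Term("Customer", "Party"),
    Term("Account", "Arrangement", frozenset({"Retail Catalog"})),
    Term("Branch", "Location"),
]

by_category = select_terms(source_vocabulary, categories=["Party"])
by_catalog = select_terms(source_vocabulary, catalogs=["Retail Catalog"])
combined = select_terms(
    source_vocabulary, categories=["Party"], catalogs=["Retail Catalog"]
)
```

In this sample, "Branch" matches neither criterion, so it stays in the Source Vocabulary only.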
The Python cloning scripts copy each selected business term to the Cloned Vocabulary, along with its associated term properties and its relationships to other cloned business terms.
Cloning can be done iteratively, so that required business terms can be added to the Cloned Vocabulary over time as the need arises.
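The cloning and iteration behavior can be sketched with plain dictionaries. This is an illustrative model only: `clone_terms`, the dictionary layout, and the sample terms are assumptions, not the accelerator's actual code or the IBM Knowledge Catalog API. It shows two ideas: a relationship is copied only when both ends exist in the Cloned Vocabulary, and a later run with more terms extends the clone. Whether the real scripts refresh relationships on previously cloned terms is not stated here, so treat that part as a design assumption.

```python
import copy


def clone_terms(source, cloned, selected):
    """Copy the selected terms from source into cloned (dicts keyed by term name).

    Relationships are kept only when the related term is also in the cloned
    vocabulary, so a clone never points at a term that was left behind.
    """
    for name in selected:
        cloned[name] = copy.deepcopy(source[name])
    # Recompute relationships for every cloned term so that links to terms
    # added in an earlier or later iteration are picked up as well.
    for name, term in cloned.items():
        term["related"] = [r for r in source[name]["related"] if r in cloned]
    return cloned


# Illustrative sample data, not real Knowledge Accelerator content.
source_vocabulary = {
    "Customer": {
        "properties": {"description": "A party that holds an account"},
        "related": ["Account", "Branch"],
    },
    "Account": {
        "properties": {"description": "An arrangement with a customer"},
        "related": ["Customer"],
    },
    "Branch": {
        "properties": {"description": "A physical location"},
        "related": ["Customer"],
    },
}

cloned_vocabulary = {}

# First iteration: clone two terms; the link to "Branch" is dropped.
clone_terms(source_vocabulary, cloned_vocabulary, ["Customer", "Account"])
first_pass_related = list(cloned_vocabulary["Customer"]["related"])

# Later iteration: add "Branch"; the link from "Customer" is restored.
clone_terms(source_vocabulary, cloned_vocabulary, ["Branch"])
second_pass_related = list(cloned_vocabulary["Customer"]["related"])
```

Because the source dictionary is deep-copied and never modified, the Source Vocabulary stays intact no matter how many iterations are run.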
Source to Cloned Vocabulary process flow.
Prerequisites
Required services: To use the industry accelerator, you must install one or more of the following services on IBM Cloud Pak for Data:
| Service | Required for |
| --- | --- |
| Watson Knowledge Catalog | Working with and cloning data governance artifacts, such as business terms and categories. See Installing Watson Knowledge Catalog. |
| Watson Studio | Optional. Can be used to run the Python cloning scripts. See Installing Watson Studio. Alternatively, another Python environment can be used; see the README included with the accelerator for more details. |
Importing the accelerator
To use this accelerator on Cloud Pak for Data v4.8.0.0, complete the following steps:
- Download the knowledge-accelerator-scripts.zip file from the https://github.com/IBM/Industry-Accelerators repository.
- In the zip file, refer to the following documentation:
  - For considerations and guidelines on implementing a cloning process, refer to the PDF "Business Term Cloning Script for use with IBM Knowledge Catalog".
  - For technical details on configuring and executing the cloning scripts, refer to the README file.
Release Notes
This accelerator has been verified on:
- Cloud Pak for Data v4.8.0.0
About the developer:
IBM
Terms and Conditions
The terms under which you are licensing IBM Cloud Pak for Data also apply to your use of the Industry Accelerators.
Before you use the Industry Accelerators, you must agree to the additional terms and conditions that are set forth here.
This information contains sample modules, exercises, and code samples (the code may be provided in source code form ("Source Code")) (collectively "Sample Materials").
License:
Subject to the terms herein, you may copy, modify, and distribute these Sample Materials within your enterprise only, for your internal use only; provided such use is within the limits of the license rights of the IBM agreement under which you are licensing IBM Cloud Pak for Data.
The Industry Accelerators might include applicable third-party licenses. Review the third-party licenses before you use any of the Industry Accelerators. You can find the third-party licenses that apply to each Sample Material in the notices.txt file that is included with each Sample Material.
Code Security:
Source Code may not be disclosed to any third parties for any reason without IBM's prior written consent, and access must be limited to your employees who have a need to know.
You have implemented and will maintain the technical and personnel focused security policies, procedures, and controls that are necessary to protect the Source Code against loss, alteration, unlawful forms of processing, unauthorized disclosure, and unauthorized access.
You will promptly (and in no event any later than 48 hours) notify IBM after becoming aware of any breach or other security incident that you know, or should reasonably suspect, affects or will affect the Source Code or IBM, and will provide IBM with reasonably requested information about such security incident and the status of any remediation and restoration activities.
You will not permit any Source Code to reside on servers located in the Russian Federation, the People's Republic of China, or any territories worldwide in which the Russian Federation or People's Republic of China claim sovereignty (collectively, "China or Russia"). Company shall not permit anyone to access or use any such Source Code from or within China or Russia, and Company will not permit any development, testing, or other work to occur in China or Russia that would require such access or use. Upon reasonable written notice, IBM may extend these restrictions to other countries that the United States government identifies as potential cyber security concerns.
IBM may request that you verify compliance with these Code Security obligations, and you agree to cooperate with IBM in that regard.
General:
Notwithstanding anything to the contrary, IBM PROVIDES THE SAMPLE MATERIALS ON AN "AS IS" BASIS AND IBM DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, ANY IMPLIED WARRANTIES OR CONDITIONS OF MERCHANTABILITY, SATISFACTORY QUALITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE, AND ANY WARRANTY OR CONDITION OF NON-INFRINGEMENT. IBM SHALL NOT BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY OR ECONOMIC CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR OPERATION OF THE SAMPLE MATERIALS. IBM SHALL NOT BE LIABLE FOR LOSS OF, OR DAMAGE TO, DATA, OR FOR LOST PROFITS, BUSINESS REVENUE, GOODWILL, OR ANTICIPATED SAVINGS. IBM HAS NO OBLIGATION TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS OR MODIFICATIONS TO THE SAMPLE MATERIALS.