Hello Watson Studio Community!
We are seeking community advice and guidance with our Visual Recognition project within Watson Studio. Any and all advice and discussion is greatly appreciated.
Initiative Background
I lead the Governance Team for a new marketplace initiative, and we are currently focused on expanding the Marketplace's business value proposition by incorporating AI and Machine Learning. We have kicked off a Visual Recognition (VR) proof-of-concept project within Watson Studio with the explicit purpose of determining whether VR can address an ongoing challenge for all marketplaces: proper item/product classification.
Business Problem
Proper classification of marketplace items is essential for usability as well as the proper functioning of many downstream features and processes. As sellers/suppliers load their products and services to the marketplace, they are required to classify each item with the most precise classification code possible (UNSPSC.org). Unfortunately, proper classification by suppliers is notoriously poor for a multitude of reasons including:
- Use of old and/or incorrect classification versions
- Copy/Paste mistakes
- Lack of effort, experience/skills, and/or resource bandwidth
Proposed Solution
Using various technologies, we have incorporated an "Automated Classification Analysis Tool", an analytics solution that evaluates the context of each item and recommends the most appropriate UNSPSC classification code(s). Unfortunately, many marketplace items do not carry enough context to yield a reliable confidence score for the recommended classification(s).
This Proof-of-Concept project is focused on determining whether we can significantly improve the results and confidence scores of the "Automated Classification Analysis Tool" by not only evaluating the context of marketplace items but by incorporating Watson VR to also analyze their associated images.
EXAMPLE: If a seller/supplier loads a computer mouse to the marketplace, Watson will evaluate the associated images that the supplier provided. If Watson cannot verify that each image represents a computer mouse, the item will be flagged for manual review by a member of the Marketplace Governance Team. If the reviewer finds that Watson is correct (the images do not show a computer mouse), the supplier will be asked to correct the item by providing valid images. If the reviewer finds that Watson is incorrect, the Governance Team will use the images to further train Watson VR.
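To make the intended flow concrete, here is a minimal Python sketch of the triage logic described in the example. It does not call Watson VR. It assumes the classifier's per-image scores are already available as plain dictionaries, and every name in it (`triage_item`, `resolve_review`, `CONFIDENCE_THRESHOLD`) as well as the 0.8 threshold is hypothetical and chosen only for illustration.

```python
from enum import Enum


class Outcome(Enum):
    ACCEPTED = "accepted"
    MANUAL_REVIEW = "manual_review"


# Hypothetical cut-off; the real value would have to be tuned during the POC.
CONFIDENCE_THRESHOLD = 0.8


def triage_item(expected_class, image_scores, threshold=CONFIDENCE_THRESHOLD):
    """Decide whether an item needs manual review.

    expected_class: the classification the item is supposed to belong to.
    image_scores: one dict per supplier image, mapping class name -> score.
    """
    if not image_scores:
        # Assumption: an item without any images cannot be verified.
        return Outcome.MANUAL_REVIEW
    for scores in image_scores:
        # Every image must be recognised as the expected class.
        if scores.get(expected_class, 0.0) < threshold:
            return Outcome.MANUAL_REVIEW
    return Outcome.ACCEPTED


def resolve_review(watson_was_correct):
    """Follow-up action after a Governance Team member reviews a flagged item."""
    if watson_was_correct:
        # The images really do not show the item: the supplier must fix them.
        return "ask_supplier_for_valid_images"
    # The images were fine: feed them back as additional training data.
    return "add_images_to_training_set"


if __name__ == "__main__":
    scores = [{"computer mouse": 0.93}, {"computer mouse": 0.41, "keyboard": 0.52}]
    print(triage_item("computer mouse", scores))
    print(resolve_review(watson_was_correct=False))
```

In this sketch a single low-scoring image is enough to flag the whole item, which mirrors the "each image" wording above. A more lenient rule (for example, a majority of images) would be an equally valid design choice.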
To train Watson VR on each of the targeted classification codes (~18K), we will initially load and train Watson with 100 – 500 images per UNSPSC classification code.
Issues / Questions / Help Needed
In our Visual Recognition project within Watson Studio, we have created a VR model and begun creating classifications for marketplace categories (UNSPSC.org codes) and loading training images for each. When loading the images, we receive an error stating that the training data must not exceed 250MB.
QUESTION #1: Is this limit due to our use of a "free trial" account?
QUESTION #2: Is this limit per classification or for the entire model or just the newly added training data?
QUESTION #3: The marketplace will eventually consist of ~18K classifications and we will provide ~100 training images for each. If the 250MB limit is not due to our use of a trial account, that would seem to indicate that we will have to spread the ~18K marketplace classifications across many VR models. Is this accurate?
QUESTION #4: Based on our training and understanding, the creation of thousands of VR models does not seem appropriate for our purposes. Any suggestions/recommendations?
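To give a sense of scale for Questions #3 and #4, here is a back-of-the-envelope estimate in Python. The ~18K classifications, ~100 images per classification, and the 250MB limit come from this post. The average image size of 200 KB is purely an assumption, as is the premise that the limit applies to a whole model (which is exactly what Question #2 asks). The function name `estimate_training_footprint` is ours.

```python
import math

NUM_CLASSES = 18_000      # ~18K UNSPSC classifications (from this post)
IMAGES_PER_CLASS = 100    # ~100 training images each (from this post)
LIMIT_MB = 250            # limit reported in the error message
AVG_IMAGE_KB = 200        # ASSUMPTION: average image size, not measured


def estimate_training_footprint(num_classes, images_per_class, avg_image_kb, limit_mb):
    """Rough sizing, using 1 MB = 1000 KB and assuming the limit is per model."""
    total_images = num_classes * images_per_class
    total_mb = total_images * avg_image_kb / 1000
    # How many images, and therefore full classifications, fit under the limit.
    images_per_model = (limit_mb * 1000) // avg_image_kb
    classes_per_model = images_per_model // images_per_class
    if classes_per_model == 0:
        models_needed = None  # not even one classification fits under the limit
    else:
        models_needed = math.ceil(num_classes / classes_per_model)
    return {
        "total_images": total_images,
        "total_mb": total_mb,
        "classes_per_model": classes_per_model,
        "models_needed": models_needed,
    }


if __name__ == "__main__":
    result = estimate_training_footprint(
        NUM_CLASSES, IMAGES_PER_CLASS, AVG_IMAGE_KB, LIMIT_MB
    )
    for key, value in result.items():
        print(f"{key}: {value}")
```

Under these assumed numbers the full training set is roughly 360 GB, only 12 classifications fit under 250MB, and about 1,500 models would be needed, which is why we doubt that splitting across models is the right approach.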
------------------------------
David W. Kovalcik
------------------------------