Analyze JSON datasets using MongoDB operations

By KALONJI BANKOLE posted Mon October 19, 2020 02:58 PM

MongoDB is a document-oriented NoSQL database that stores data as JSON-like documents. Instead of SQL, it offers a large set of query operators, collectively referred to as the "MongoDB Query Language" (MQL). Developers can use MQL filtering, aggregation, and sorting operations to quickly generate analytics from JSON documents. As a specific use case, we recently used MQL operations to determine the highest-performing mechanic shop based on customer reviews. In our analysis, we show how to use MongoDB query operations to do the following:

- Count documents that match a specific condition (for example, select reviews based on sentiment)
- Select businesses that are within a specific distance of an address using the "$geoNear" aggregation stage
- Use an aggregation pipeline to select the businesses within a given radius, count their reviews by sentiment and repair type (Brakes, Engine, etc.), and return a list of mechanic shops sorted from best to worst for a given repair type
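The three operations above can be sketched in Python roughly as follows. This is a minimal illustration and not the code from our notebook: the field names (`sentiment`, `repair_type`, `business_name`, `location`), the function names, and the sample coordinates are all assumptions made for the example. It also assumes the reviews live in a single collection with a `2dsphere` index on `location`, which "$geoNear" requires.

```python
# Hypothetical schema: each review document has "business_name", "sentiment",
# "repair_type", and a GeoJSON point in "location". These names are assumptions.

def sentiment_filter(sentiment):
    """Filter document matching reviews with the given sentiment label."""
    return {"sentiment": sentiment}


def build_best_shop_pipeline(lon, lat, radius_m, repair_type):
    """Build a pipeline ranking nearby shops by positive reviews for one repair type."""
    return [
        # $geoNear must be the first stage of a pipeline. With a GeoJSON
        # point, maxDistance is expressed in meters.
        {"$geoNear": {
            "near": {"type": "Point", "coordinates": [lon, lat]},
            "distanceField": "distance_m",
            "maxDistance": radius_m,
            "spherical": True,
            # Restrict to positive reviews of the requested repair type.
            "query": {"repair_type": repair_type, **sentiment_filter("positive")},
        }},
        # Count the matching reviews per business.
        {"$group": {"_id": "$business_name", "positive_reviews": {"$sum": 1}}},
        # Most positive reviews first.
        {"$sort": {"positive_reviews": -1}},
    ]


def rank_shops(collection, lon, lat, radius_m, repair_type):
    """Run the count and the pipeline against a PyMongo-style collection."""
    total_positive = collection.count_documents(sentiment_filter("positive"))
    pipeline = build_best_shop_pipeline(lon, lat, radius_m, repair_type)
    ranked = list(collection.aggregate(pipeline))
    return total_positive, ranked
```

With PyMongo, `rank_shops` would be called with a collection object, for example `rank_shops(client["mydb"]["reviews"], -97.74, 30.27, 8000, "Brakes")`, where the database and collection names are placeholders. Because the filter and pipeline are plain dictionaries, the same stages can be pasted into the mongo shell or another language's driver with little change.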

We provide a step-by-step walkthrough of this tutorial in a Python Jupyter notebook. Because MQL operations are expressed as JSON-like documents rather than in any one programming language, they can easily be integrated into other projects. For the GitHub repository containing the instructions to deploy MongoDB in the IBM Cloud and run the analytics notebook, please visit the following link.

We have also posted a video of the demo here.