Manipulating JSON From the Command Line

By Daniel Toczala posted Thu October 29, 2020 04:53 PM

  
Note: This is not MY article - it is a reprint of a blog post that I saw in an area not available to the public.  I contacted the author and asked if I could repost it, and he said that I could.  If I can get him to not be so shy, I will put his name here as the author, so he can collect the awards and accolades that he so richly deserves.  I made some minor changes to the post so that the information could be published more widely.  As an ex-developer, I love command line stuff like this!

Recently I was working with Watson Discovery queries and wanted an easy way to sift through the JSON output to get to specific data elements of interest.  Working from the command line, I could easily construct the queries and run them using curl, but the output was voluminous and difficult to navigate.  Also, I didn't want to (a) log on to the IBM Cloud (you know my feelings about that) or (b) keep rerunning the query just because I wanted to see different elements of the output.  Enter the open-source JSON query tool known as "jq"!

Now I could run the query once, capture the entire JSON output into a file, and then use jq to extract the JSON elements I was interested in.  You need some understanding of the format and elements of your JSON file in order to use this tool effectively.  I typically use a JSON viewer to get familiar with the elements.  In this example, I'm using Discovery News to get news items related to a specific named entity (some customer name, let's use "AcmeCorp"). Feel free to run these examples and use different entity names (use "%20" to represent spaces)!
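If you don't have a JSON viewer handy, jq itself can show you what a file contains. The sketch below is an illustration only: it builds a small hypothetical file, sample.json, that mimics the shape of a Discovery response. The field names are taken from the filters used in this post, but the values are made up and a real response carries many more fields.

```shell
# Hypothetical sample mimicking the shape of a Discovery query response.
# Field names come from the filters used in this post; the values are invented.
cat > sample.json <<'EOF'
{
  "matching_results": 2,
  "results": [
    {
      "title": "AcmeCorp opens new plant",
      "publication_date": "2020-10-01T00:00:00Z",
      "url": "https://example.com/a",
      "text": "First article body."
    },
    {
      "title": "AcmeCorp quarterly results",
      "publication_date": "2020-10-15T00:00:00Z",
      "url": "https://example.com/b",
      "text": "Second article body."
    }
  ]
}
EOF

# List the top-level keys (-c prints compact, single-line output).
top_keys=$(jq -c 'keys' sample.json)
echo "$top_keys"

# List the keys of the first result to see which fields each item carries.
result_keys=$(jq -c '.results[0] | keys' sample.json)
echo "$result_keys"

# Count the items in the results array.
result_count=$(jq '.results | length' sample.json)
echo "$result_count"
```

The same three filters should work against a real captured response; only the file name changes.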

  • The initial CURL command to get recent news about AcmeCorp (redirected to a file named acme.json):
curl -u "apikey:yQ7AAFZsyZZZXXJ5xs99bBZf1lW7864okYNssAsss4q" "https://api.us-south.discovery.watson.cloud.ibm.com/instances/123bafff-5993-4dff-8f2f-f877d8aaa365/v1/environments/system/collections/news-en/query?version=2019-04-30&query=enriched_text.entities.text:Acme%20Corp" >acme.json
  • How many results did I get?
cat acme.json | jq ".matching_results"
  • What are the titles of the results and when were they published? (those are square brackets following the results)
cat acme.json | jq ".results[] |.title, .publication_date"
  • Cool - ok - show me the text returned with the results and the URL for the article… (again, note that those are square brackets following the results)
cat acme.json | jq ".results[] |.title, .publication_date, .url, .text"
  • I don't really need an interim file, so I'll pipe the curl output directly to jq and then redirect THAT output to a file… (you know where those square brackets are by now)
curl -u "apikey:yQ7AAFZsyZZZXXJ5xs99bBZf1lW7864okYNssAsss4q" "https://api.us-south.discovery.watson.cloud.ibm.com/instances/123bafff-5993-4dff-8f2f-f877d8aaa365/v1/environments/system/collections/news-en/query?version=2019-04-30&query=enriched_text.entities.text:Acme%20Corp" | jq ".results[] |.title, .publication_date, .url, .text" >acme.json
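The filters above print each value as a JSON-quoted string on its own line. If you would rather have plain text, or one row per article, jq's -r (raw output) option and its @tsv formatter are one way to get there. This is a minimal sketch, not part of the original workflow: it runs against a small hypothetical file (sample.json) with made-up values, using the same field names as the filters above.

```shell
# Hypothetical two-item sample; values are invented for illustration.
cat > sample.json <<'EOF'
{"matching_results": 2,
 "results": [
  {"title": "AcmeCorp opens new plant", "publication_date": "2020-10-01T00:00:00Z",
   "url": "https://example.com/a", "text": "First article body."},
  {"title": "AcmeCorp quarterly results", "publication_date": "2020-10-15T00:00:00Z",
   "url": "https://example.com/b", "text": "Second article body."}
 ]}
EOF

# -r prints raw strings, without the surrounding JSON quotes.
titles=$(jq -r '.results[].title' sample.json)
echo "$titles"

# Build one array per result, then format it as a tab-separated row.
rows=$(jq -r '.results[] | [.publication_date, .title, .url] | @tsv' sample.json)
echo "$rows"
```

Tab-separated rows are easy to hand off to tools like cut, sort, or a spreadsheet, which can be handy in a bash script.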

So there you have it - a cool tool for working with JSON from the command line or a bash script.  I didn't dive into the syntax of jq in this brief post, but you can find everything you need here: https://stedolan.github.io/jq/. The tutorial is great and you can be up and running in no time.

 

 

