IBM Cloudability


 Cloudability API: how does pagination work?

Pedro Martins posted Fri January 13, 2023 12:22 PM

I am trying to retrieve data from the cost report endpoint, which returns a lot of results, but I think something is wrong with pagination. According to the documentation, I receive a "next" token that must be used in the next query. However, I am receiving repeated pagination tokens, as shown below (I have limited the run to a million results):

Current quantity: 50000
token='cfea0fff'
Current quantity: 100000
token='110becbd'
Current quantity: 150000
token='3c1f73ba'
Current quantity: 200000
token='7519c170'
Current quantity: 250000
token='cfea0fff'
Current quantity: 300000
token='110becbd'
Current quantity: 350000
token='3c1f73ba'
Current quantity: 400000
token='d27f7e41'
Current quantity: 450000
token='d27f7e41'
Current quantity: 500000
token='7519c170'
Current quantity: 550000
token='110becbd'
Current quantity: 600000
token='7519c170'
Current quantity: 650000
token='cfea0fff'
Current quantity: 700000
token='d27f7e41'
Current quantity: 750000
token='7519c170'
Current quantity: 800000
token='d27f7e41'
Current quantity: 850000
token='7519c170'
Current quantity: 900000
token='d27f7e41'
Current quantity: 950000
token='7519c170'
Current quantity: 1000000
token='cfea0fff'


Here I am using 50,000 as the limit and passing the token in the next query. Whether or not I use an offset value (adding 50,000 per query), the results are the same. I saved the content to an Excel worksheet, and the results repeat every 50,000 rows.

Can someone check whether there is a bug, or explain what I am doing wrong?


#Cloudability
Greg Winfield  Best Answer
Yes, that's right: the first call does not include a token in the API endpoint call, but the JSON body of the response supplies the next token, which needs to be appended to subsequent calls.

If you're seeing different behaviour that is not aligned with the documentation, I think it's best to raise a support ticket to get to the bottom of this, in case I am missing something.
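
The flow described above can be sketched in Python as follows. This is a minimal sketch, not the official client: `iter_pages`, `fetch_page`, and `FAKE_PAGES` are hypothetical names, and the fake pages stand in for real HTTP responses. It assumes, as described in this thread, that the next token is returned at `pagination.next` in the JSON body.

```python
from typing import Callable, Iterator, Optional


def iter_pages(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield each page, following pagination.next until it is absent."""
    token: Optional[str] = None  # the first call carries no token
    while True:
        page = fetch_page(token)
        yield page
        # Assumption: the next token lives at page["pagination"]["next"].
        token = (page.get("pagination") or {}).get("next")
        if not token:  # None, missing, or empty string ends the loop
            break


# Stand-in for the real HTTP call: three fake pages keyed by token.
FAKE_PAGES = {
    None: {"results": [1, 2], "pagination": {"next": "aaa"}},
    "aaa": {"results": [3, 4], "pagination": {"next": "bbb"}},
    "bbb": {"results": [5], "pagination": {}},
}

rows = [row for page in iter_pages(FAKE_PAGES.__getitem__) for row in page["results"]]
print(rows)
```

In real use, `fetch_page` would wrap the HTTP request and pass the token as a query parameter only when it is not `None`.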
#Cloudability
Rene Norskov
@Pedro Martins It's not something I can answer, but I have reached out to our SMEs, who will hopefully be able to help.
#Cloudability
Debbie Hagen
Hello @Pedro Martins,
I see that @Rene Norskov is working on getting you an answer. In the meantime, here are a couple of articles you may find useful:
Getting started with Cloudy APIs

Quick starter: migrating from v1 to v3
#Cloudability
Pedro Martins
@Debra I am using API version 3.
#Cloudability
Andrew Midgley
Can you share some of the requests you are making, @Pedro Martins, so I can look through them? Obviously, don't share your API key. When using automatic pagination (via token), there is no need to use the offset, so definitely don't include it when running the requests.
#Cloudability
Pedro Martins

Here is the main code to get results. No matter whether I use offset or not, tokens keep repeating.

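import sys
from pprint import pprint

offset = 0
limit = 50000
filters = ['enhanced_service_name==AWS EC2']
cost_data = []
# First call: no token yet, so the response supplies the first 'next' token.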
response = cldy.cost_reports.list_cost_reports(
    start_date=start_date,
    end_date=end_date,
    dimensions=dimensions,
    metrics=metrics,
    filters=filters,
    offset=offset,
    limit=limit,
    view_id=view_id)
if 'results' not in response:
    pprint(response)
    sys.exit(1)
cost_data.extend(response['results'])
print(f'Current quantity: {len(cost_data)}')
token = response.get('pagination', {}).get('next', '')
print(f'{token=}')
while token != '' and len(cost_data) < 1000000:
    response = cldy.cost_reports.list_cost_reports(
        start_date=start_date,
        end_date=end_date,
        dimensions=dimensions,
        metrics=metrics,
        filters=filters,
        offset=offset,
        limit=limit,
        view_id=view_id,
        token=token)
    offset += limit
    cost_data.extend(response['results'])
    print(f'Current quantity: {len(cost_data)}')
    token = response.get('pagination', {}).get('next', '')
    print(f'{token=}')
print(f'{len(cost_data)} entries loaded')

The Cldy class contains functions that use the requests library to access each endpoint. For the list_cost_reports function, the endpoint is https://api.cloudability.com/v3/reporting/cost/run.


#Cloudability
Andrew Midgley
Debugging that code is going to be beyond my abilities here, @Pedro Martins. If possible, the approach that would help is to go through the requests manually, or to get the script to print out the requests it is making. It would help even if you only made the first 6 or 7 requests yourself via a client like cURL or Postman and then verified the pagination.

For what it's worth, I just ran a test with Apptio's own Cloudability data and cycled through 15 cost report pages with Postman with no issues. Here is an example of the report I was running:

https://api.cloudability.com/v3/reporting/cost/run?start_date=2022-02-01&end_date=2022-02-28&dimensions=vendor,region,resource_identifier&metrics=total_adjusted_amortized_cost,usage_hours&sort=total_adjusted_amortized_costDESC&filters=transaction_type%3D%3Dusage&token=889f68f4

I was paginating 10k rows at a time since I didn't set a limit.
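
To print each request before it is sent, a script could build the URL explicitly. The following is a rough sketch using only the standard library: `build_url` is a hypothetical helper, and the parameter names are taken from the example URL above. It drops `offset`, following the advice in this thread that offset is not needed with token pagination.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.cloudability.com/v3/reporting/cost/run"


def build_url(params: dict, token: Optional[str] = None) -> str:
    """Return the request URL, adding the token only when one is given."""
    query = dict(params)       # copy so the caller's dict is not modified
    query.pop("offset", None)  # per the thread: no offset with token pagination
    if token:
        query["token"] = token
    return f"{BASE_URL}?{urlencode(query)}"


params = {
    "start_date": "2022-02-01",
    "end_date": "2022-02-28",
    "dimensions": "vendor,region",
    "metrics": "usage_hours",
    "offset": 0,
}
first = build_url(params)
second = build_url(params, token="889f68f4")
print(first)   # first request: no token
print(second)  # follow-up request: token appended
```

Comparing the printed URLs across iterations shows whether the token actually changes from one request to the next. The API key is sent separately and never appears in these URLs.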
#Cloudability
Greg Winfield
@Pedro Martins Yes, I came across this as well when querying the Azure API.
I'm not sure how much help it is, but this is what I wrote in bash against cldy. You should encounter a null token to indicate the end.

#!/bin/bash
# Prerequisite: jq must be installed for this to work correctly.
# Clear out any previous files or temp files
rm -f report.json output.json

# Append your API call here with the date range and dimensions/metrics you need.
parent_url="https://api.cloudability.com/v3/internal/reporting/cost/run?dimensions=resource_identifier&end=2022-11-29&limit=0&metrics=unblended_cost&relativePeriods=custom&start=2022-11-01&viewId=0"

# Add your Cldy API token (to do: use environmentID and opentoken)
auth_token="<CldyAPIKey>:"

# Make the first API call to get the first pagination token to pass into the loop
# (note: this first page stays in report.json; it is not appended to output.json)
curl -o report.json -s -X GET "$parent_url" -u "$auth_token"
# jq -r prints the raw string without quotes, or the literal text null when there is no next token
pagToken=$(jq -r '.pagination.next' report.json)

# Loop through API calls until the pagination token is null, writing each page to output.json
while [ "$pagToken" != "null" ]
do
        # Be kind to API endpoints
        sleep 5
        # Fetch the next page using the current pagination token
        curl -o newreport.json -s -X GET "${parent_url}&token=${pagToken}" -u "$auth_token"
        pagToken=$(jq -r '.pagination.next' newreport.json)
        cat newreport.json >> output.json
        rm newreport.json
        # Output the pagination token so you know something is going on
        echo "$pagToken"
done

#Cloudability
Andrew Midgley
What @Greg Winfield mentioned here also crossed my mind, @Pedro Martins: you may be reaching the last page and then effectively going back through the loop, since you start again with no token. There's a good chance that putting in a check, as Greg mentioned, will solve your issue.
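
Such a check could look like the sketch below. `should_continue` is a hypothetical helper, and treating a repeated token as a sign that pagination has wrapped around is an assumption rather than documented API behaviour. The sample tokens are the first five from the log in the original post.

```python
from typing import List, Optional, Set


def should_continue(token: Optional[str], seen: Set[str]) -> bool:
    """Return False at the end of the results or when a token repeats."""
    if not token or token == "null":  # no next token: last page reached
        return False
    if token in seen:  # assumption: a repeated token means we looped back
        return False
    seen.add(token)
    return True


# First five tokens from the log in the original post.
tokens = ["cfea0fff", "110becbd", "3c1f73ba", "7519c170", "cfea0fff"]
seen: Set[str] = set()
accepted: List[str] = []
for t in tokens:
    if not should_continue(t, seen):
        break
    accepted.append(t)
print(accepted)
```

With this guard the loop stops at the fifth token, which repeats the first, instead of collecting further pages.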
#Cloudability
Pedro Martins
@Greg Winfield, the first call does not have a token because it needs to retrieve the first continuation token. The loop then passes a token parameter, which is renewed on each iteration until no new token is returned. That is simple logic, and it works everywhere.

I put these million lines in a worksheet and noticed that the same lines repeat. It seems the token is being ignored, even though a different token is returned on each iteration.

I have read the documentation several times and I am following exactly what it explains: get the next value from the pagination property and use it in the next call as the token.
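
One way to confirm the repetition without a worksheet is to fingerprint each page and compare. This is only an illustrative sketch: `page_fingerprint` and `find_repeated_pages` are hypothetical helpers, and the sample rows are made up rather than real report data.

```python
import hashlib
import json
from typing import Dict, Iterable, List


def page_fingerprint(rows: Iterable[dict]) -> str:
    """Hash a page's rows so identical pages produce the same digest."""
    payload = json.dumps(list(rows), sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def find_repeated_pages(pages: List[List[dict]]) -> List[int]:
    """Return the indexes of pages whose content matches an earlier page."""
    seen: Dict[str, int] = {}
    repeats: List[int] = []
    for index, rows in enumerate(pages):
        digest = page_fingerprint(rows)
        if digest in seen:
            repeats.append(index)
        else:
            seen[digest] = index
    return repeats


# Made-up sample: the third page duplicates the first.
pages = [
    [{"id": "a", "cost": 1.0}],
    [{"id": "b", "cost": 2.0}],
    [{"id": "a", "cost": 1.0}],
]
print(find_repeated_pages(pages))
```

If every page after the first shows up as a repeat, that would support the suspicion that the token is not reaching the server.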
#Cloudability