Sterling Managed File Transfer

Sterling Managed File Transfer

Come for answers, stay for best practices. All we're missing is you.

 View Only

Paging and Filtering with Apache Cassandra

By Tanvi Kakodkar posted Fri January 24, 2020 02:31 PM

  
Originally updated on January 26, 2015 by foxmccloud

Paging and Filtering with Apache Cassandra
Authors: Jared Gray (jrgray@us.ibm.com) and Rick Gunderson (rgunderson@ca.ibm.com)

Introduction

Application developers using Cassandra will commonly need to display particular subsets of data. More specifically, the need often emerges to display only those database records matching certain filter criteria, e.g. all sales records where the seller is Supermart Corporation AND the sale value exceeds $100 USD. Cassandra’s limited query language can make these filtering tasks significantly difficult to design and implement. The degree of difficulty depends primarily on the number of filters to be supported and the complexity of the filter operations themselves.

Splitting the records that match the filter requirements into a series of pages is also necessary for a variety of practical purposes. For instance, the number of records matching the filter criteria may exceed the memory limits of a client machine, necessitating the use of paged result data. Additionally, if many records are to be displayed in an interactive user interface, users should not view all of the query results at once; a user should see small subsets of the results and be able to request the next and/or previous pages of data.

Cassandra does not natively support such end-to-end paging and filtering mechanisms, leaving application developers to implement their own solutions. In our own attempts to implement this functionality with Casssandra, we have discovered many pitfalls of the Cassandra Query Language and devised unique solutions to circumvent these limitations. Unfortunately, at the time of writing, we have also been unable to find robust, open source documentation describing pagination in Cassandra.

To address this deficiency, we outline a general approach to paging with Cassandra. Our design supports rich user interfaces by accommodating combinations of data filters, forward and backward page navigation, and sorting of the result data. In this paper, we describe the Cassandra database tables, PRIMARY KEY definitions, and query language constructs used to implement our solution.

In particular, we describe two kinds of filtering. The first is exact-match filtering, in which we filter out all records whose column values exactly match a set of desired filter values. As an example, we might want to retrieve all sales records where the seller is Supermart Corporation AND the sale value is exactly $100 USD. Range filtering, the other variety of filtering we support, is a generalization of exact-match filtering in which precisely one column value can lie between a user-specified range. For instance, we could retrieve all sales records where the seller is Supermart Corporation AND the sale value is between $0 and $100 USD.

Sample Data

The table below defines the sample data set we will use for this our paging and filtering query examples.

paging_table
partition cluster_01 cluster_02 cluster_03 non_primary_key
‘A01’ ‘B01’ ‘C01’ ‘D01’ ‘01’
‘A01’ ‘B01’ ‘C01’ ‘D02’ ‘02’
‘A01’ ‘B01’ ‘C02’ ‘D03’ ‘03’
‘A01’ ‘B01’ ‘C02’ ‘D04’ ‘04’
‘A01’ ‘B02’ ‘C03’ ‘D05’ ‘05’
‘A01’ ‘B02’ ‘C03’ ‘D06’ ‘06’
‘A02’ ‘B03’ ‘C04’ ‘D07’ ‘07’

The left-most column is special, as it is the partition key for a given table. There must be exactly one partition key column per table, which is the first column listed in a table’s primary key. This column is significant from a query construction standpoint because it can only be restricted in a WHERE clause by either of the IN or EQUALS (“=”) operators.

All columns whose names begin with cluster_ are clustering keys. These are significant because they can only be specified in a WHERE clause that restricts the partition key and all preceding clustering keys using the EQUALS operator.

More formally, let  be the set of clustering keys for a Cassandra database table ordered according to their appearance in the table’s PRIMARY KEY clause. That is,  is the first clustering key and  is the last clustering key. Suppose the table is not indexed by a secondary index. It is syntactically valid to add clustering key  to a WHERE clause only if all partition keys  have been specified, for all .

Note

Though necessary, this is not a sufficient condition. There are additional restrictions regarding the construction of syntactically-valid WHERE clauses, the most significant of which will be discussed as they are relevant.

For clarity, we now show the CREATE TABLE Cassandra Query Language (CQL) statement corresponding to the table above:

CREATE TABLE paging_table (
    partition       TEXT,   // partition key column
    cluster_01      TEXT,   // first clustering key column
    cluster_02      TEXT,   // second clustering key column
    cluster_03      TEXT,   // third clustering key column
    non_primary_key TEXT,   // column not part of primary key
    PRIMARY KEY (partition, cluster_01, cluster_02, cluter_03)
) WITH CLUSTERING ORDER BY (cluster_01 ASC, cluster_02 ASC, cluster_03 ASC);

As an additional note, the data above are sorted in ascending order of the table’s primary key definition.

Exact-Match Filtering and Pagination

In this section, we will discuss exact-match filtering. As a reminder, in exact-match filtering, we filter for all records whose column values exactly match the provided filter specification.

Exact-Match Filtering, Ascending Sort

Recall that our primary key definition lists partition as the partition key column, and cluster_01 is the first clustering key. Suppose we wish to use a filter value of ‘A01’ for partition, a filter value of ‘B01’ for cluster_01, and a page size of two.

Note

It is not possible in general to page without filtering at all, leaving single-column filtering as the simplest use case. See Paging Limitations & Considerations for more details.

First Page

To retrieve the first page of exact-match data from the table, in ascending order, execute the following query:

SELECT *
FROM paging_table
WHERE partition = 'A01' AND cluster_01 = 'B01'
ORDER BY cluster_01 ASC
LIMIT 2;

Notice that we order by the first clustering key column. Recall that Cassandra query results are returned in sorted order: first, on the partition key column, then on each of the ordered partition keys in sequence. Since we are querying for all records with the same partition key, ordering on the first cluster key will sort the output data correctly.

Note

Why we are ordering the results based on the value of the first clustering key, rather than using the partition key? If we attempt to sort based on the value of the partition key, we will observe a Cassandra error similar to the following: “Order by is only supported on the clustered columns of the PRIMARY KEY.” That is, the column name that appears in the ORDER BY clause must be a clustering key. Furthermore, ordering based on the first cluster key is always syntactically correct, independent of the contents of the WHERE clause.

The results for the first page query are shown below:

partition cluster_01 cluster_02 cluster_03 non_primary_key
‘A01’ ‘B01’ ‘C01’ ‘D01’ ‘01’
‘A01’ ‘B01’ ‘C01’ ‘D02’ ‘02’

Next Page

Suppose the current page of results is the following:

partition cluster_01 cluster_02 cluster_03 non_primary_key
‘A01’ ‘B01’ ‘C01’ ‘D01’ ‘01’
‘A01’ ‘B01’ ‘C01’ ‘D02’ ‘02’

To retrieve the next page of data from the table, we use the results from the last record that appeared on the current page. We then add an additional restriction to the WHERE clause to ensure that we retrieve subsequent results based on the contents of the current page. To retrieve the next page in this example, execute the following query:

SELECT *
FROM paging_table
WHERE partition = 'A01' AND
    (cluster_01, cluster_02, cluster_03) > ('B01', 'C01', 'D02')
ORDER BY cluster_01 ASC
LIMIT 2;

Notice, that we although we use a tuple inequality on the clustering keys–(cluster_01, cluster_02, cluster_03) > ('B01', 'C01', 'D02')–we maintain the use of a EQUAL operator on the partition key. This is in fact required, as the partition key cannot be part of a multi-column relation.

The results for the next page query are shown below:

partition cluster_01 cluster_02 cluster_03 non_primary_key
‘A01’ ‘B01’ ‘C02’ ‘D03’ ‘03’
‘A01’ ‘B01’ ‘C02’ ‘D04’ ‘04’

Previous Page

Suppose the current page of results is the following:

partition cluster_01 cluster_02 cluster_03 non_primary_key
‘A01’ ‘B02’ ‘C03’ ‘D05’ ‘05’
‘A01’ ‘B02’ ‘C03’ ‘D06’ ‘06’

To retrieve the previous page of data from the table, we use the results from the first record that appears on the current page.

A False Start

To retrieve the previous page in this example, we may be tempted to try the following query:

SELECT *
FROM paging_table
WHERE partition = 'A01' AND
    (cluster_01, cluster_02, cluster_03) < ('B01', 'C03', 'D05')
ORDER BY cluster_01 ASC
LIMIT 2;


Danger
This query is not correct!

Results are shown below:

partition cluster_01 cluster_02 cluster_03 non_primary_key
‘A01’ ‘B01’ ‘C01’ ‘D01’ ‘01’
‘A01’ ‘B01’ ‘C01’ ‘D02’ ‘02’

Clearly, we did not retrieve the correct results. Instead of getting the previous page, we inadvertently retrieved the first page of results. If we increase the LIMIT on the statement above, we would see that the results we are looking for are at the end of the result set attained from this query. This realization is key for devising a solution.

The Correct Approach

To retrieve the previous page in this example, we modify the “false start” query slightly; we now use a decending-order sort instead of an ascending-order sort in the ORDER BY clause. In particular, we execute the following query:

SELECT *
FROM paging_table
WHERE partition = 'A01' AND
    (cluster_01, cluster_02, cluster_03) < ('B01', 'C03', 'D05')
ORDER BY cluster_01 DESC
LIMIT 2;

Results are shown below:

partition cluster_01 cluster_02 cluster_03 non_primary_key
‘A01’ ‘B01’ ‘C02’ ‘D04’ ‘04’
‘A01’ ‘B01’ ‘C02’ ‘D03’ ‘03’

Notice that we have the correct results, but they are sorted in reverse of the desired, ascending order. To address this final problem, the application parsing Cassandra’s result set must reverse the order of the result data.


Caution

We have introduced a linear-time computational complexity by requiring that the application retrieving query results from Cassandra reverse the order of the result set. This may prove problematic for retrieving exceptionally large pages but can be mitigated by retrieving pages with fewer entries instead.

Once order-reversed, the results are as follows:

partition cluster_01 cluster_02 cluster_03 non_primary_key
‘A01’ ‘B01’ ‘C02’ ‘D03’ ‘03’
‘A01’ ‘B01’ ‘C02’ ‘D04’ ‘04’


Exact-Match Filtering, Descending Sort

The descending-order case is similar to the ascending-order case. Consult General Patterns for more information.

Range Filtering and Pagination

An alternative style of filtering is range filtering, in which we filter for all records whose column values exactly match the provided filter specification, except one column whose value resides between two extremes.

Range Filtering, Ascending Sort

Recall that our primary key definition lists partition as the partition key column, and cluster_01 is the first clustering key. Suppose we wish to use a filter value of ‘A01’ for partition, a filter range of ‘B01’ to ‘BO2’ (inclusive) for the cluster_01, and a page size of two.

First Page

To retrieve the first page of range-filtered data from the table, in ascending order, execute the following query:

SELECT *
FROM paging_table
WHERE partition = 'A01' AND
  cluster_01 >= 'B01' AND cluster_01 <= 'B02'
ORDER BY cluster_01 ASC
LIMIT 2;

The results for the first page query are shown below:

partition cluster_01 cluster_02 cluster_03 non_primary_key
‘A01’ ‘B01’ ‘C01’ ‘D01’ ‘01’
‘A01’ ‘B01’ ‘C01’ ‘D02’ ‘02’


Next Page

Suppose the current page of results is the following:

partition cluster_01 cluster_02 cluster_03 non_primary_key
‘A01’ ‘B01’ ‘C01’ ‘D01’ ‘01’
‘A01’ ‘B01’ ‘C01’ ‘D02’ ‘02’

To retrieve the next page of data from the table, we use the results from the last record that appeared on the current page. We then add additional restrictions to the WHERE clause to ensure that we retrieve subsequent results based on the contents of the current page. To retrieve the next page in this example, execute the following query:

SELECT *
FROM paging_table
WHERE partition = 'A01' AND
  (cluster_01, cluster_02, cluster_03) < ('B02', MAX_TEXT, MAX_TEXT) AND
  (cluster_01, cluster_02, cluster_03) > ('B01', 'C01', 'D02')
ORDER BY cluster_01 ASC
LIMIT 2;

The values supplied in the LESS THAN inequality are derived from the the end of the filter range. Those values in the GREATER THAN inequality are derived from the last record on the current page. Also, we use MAX_TEXT as a constant, representing the maximum possible TEXT value (exclusive) that this table will ever hold.

The results for the next page query are shown below:

partition cluster_01 cluster_02 cluster_03 non_primary_key
‘A01’ ‘B01’ ‘C02’ ‘D03’ ‘03’
‘A01’ ‘B01’ ‘C02’ ‘D04’ ‘04’


Previous Page

Suppose the current page of results is the following:

partition cluster_01 cluster_02 cluster_03 non_primary_key
‘A01’ ‘B02’ ‘C03’ ‘D05’ ‘05’
‘A01’ ‘B02’ ‘C03’ ‘D06’ ‘06’

To retrieve the previous page of data from the table, we use the results from the first record that appears on the current page. In this example, execute the following query:

SELECT *
FROM paging_table
WHERE partition = 'A01' AND
  (cluster_01, cluster_02, cluster_03) < ('B02', 'C03', 'D05') AND
  (cluster_01, cluster_02, cluster_03) > ('B01', MIN_TEXT, MIN_TEXT)
ORDER BY cluster_01 DESC
LIMIT 2;

The values supplied in the LESS THAN inequality are derived from the first record on the current page. Those values in GREATER THAN inequality are derived from the beginning of the filter range. Also, we use MIN_TEXT as a constant, representing the minimum possible TEXT value (exclusive) that this table will ever hold.

Results are shown below:

partition cluster_01 cluster_02 cluster_03 non_primary_key
‘A01’ ‘B01’ ‘C02’ ‘D04’ ‘04’
‘A01’ ‘B01’ ‘C02’ ‘D03’ ‘03’

Notice that we have the correct results, but they are sorted in reverse of the desired, ascending order. To address this final problem, the application parsing Cassandra’s result set must reverse the order of the result data.

Once order-reversed, the results are as follows:

partition cluster_01 cluster_02 cluster_03 non_primary_key
‘A01’ ‘B01’ ‘C02’ ‘D03’ ‘03’
‘A01’ ‘B01’ ‘C02’ ‘D04’ ‘04’


Range Filtering, Descending Sort

The descending-order case is similar to the ascending-order case. Consult General Patterns for more information.

Paging Limitations & Considerations

In this chapter, we review some of the query language limitations discussed or hinted at previously.

Paging without Filtering

It is not possible in general to page result sets when no filters are used. This is true because the partition key must be restricted in the WHERE clause by an EQUALS or IN relation (supposing secondary indexes are not used). It may be possible to implement paging without filtering when the cardinality of partition keys is very low for a particular table, but we have not explored this.


Note:

Apache Cassandra is also considering allowing IN relations to apply to clustering keys. This may increase the utility of our pagination approach for low-cardinality cluster key columns.

Hot Spots

Notice that all of our query examples rely on the fact that the records to be filtered have the same partition key column value. Because this value is used to determine the location of Cassandra replicas, a potentially small subset of Cassandra nodes (which are the replicas for a particular partition key) will be burdened by handling all pagination and filtering requests for records sharing a partition key. This can be mitigated by increasing the replication factor used in a Cassandra cluster, which in turn increases the replica count and balances load more evenly.

Primary Key Uniqueness

Recall that Cassandra’s primary keys are also a uniqueness constraint; if a new record is inserted with the same primary key values as an existing record, the existing record is overwritten. When the set of filter columns is not guaranteed to be unique, it is often necessary to add an additional column whose values are always globally unique to the primary key definitions of the filtering table.

Maintaining Multiple Tables

To support filtering with n different filter columns, and only one potential order of filters is allowed, then only one Cassandra table is needed. However, to support arbitrary combinations of filters and to allow users to omit arbitrary filter columns,  tables must be created and maintained.

If these tables become inconsistent with respect to each other, this may result in significant problems for the application attempting to use them for paging. Maintaining consistency between these tables is not the scope of this document.


Client-Side Sorting

To retrieve the previous page of results, the application consuming Cassandra’s result sets must reverse the order of the query results. This is a linear-time operation in the application, and may become problematic from a performance standpoint when large page sizes are used.

Asynchronous Updates to Database

If data is changed while users are paging through data using our paging and filtering approach, unexpected behavior may occur. For example, if new data is inserted, but the user has already paged past it, the newly-inserted data will not appear in the paged result set. It is worth noting that similar issues would occur in relational databases unless a transaction was held open for the duration of the paging access.

Maintaining Page Offset

The only way to compute the current page number is to query the database for all records between the first page and a record on the current page. This is mal-performant, and often user interfaces can get away with simply using an approximation instead of the actual offset, e.g. “paged forward X times”. Also, due to the potential of asynchronous updates, a reliable page offset count cannot be maintained. As soon as the current offset is computed, it may change.

Counting Records Matching a Filter

It may be tempting to count the number of all records matching a filter for various purposes, e.g. displaying “user is on page X of Y” in a user interface. However, this should not be attempted, as SELECT COUNT is mal-performant in Cassandra.

Forward-Incompatible Queries in Cassandra

Tuple relations appear not to be tested very well in Cassandra at the time of writing. Several of our queries have broken since Cassandra 2.0.6, particularly in Cassandra 2.0.9 and the 2.1 branch. Part of the purpose of this document is to clarify the kinds of queries we intend to run so that these regressions can be fixed.

String Collation

Cassandra’s internal collation logic for data types dictates the sort order of result data. Collation also informs the MIN_VALUE and MAX_VALUE calculations in the General Patterns section. It may be possible to define custom collators in Java to mitigate this, but we have done no work in this area. This may be a concern for internationalization efforts.

General Patterns

We conclude this document by reviewing the general syntax for pagination and filtering.

Conventions

We use a variety of naming conventions in this section, many of which are documented here for clarity. First, partition is the partition key column of the table to page. Next, let  be the set of clustering keys for a Cassandra database table ordered according to their appearance in the table’s PRIMARY KEY clause. That is,  is the first clustering key and  is the last clustering key. Let  be the number of columns to use in the filter. Let first_result and last_result be the first and last database records presented on the current page of results, respectively. Finally, let filter_begin represent the values at the start of the filter range (inclusive) and filter_end represent the values at the end of the filter range (inclusive).

Exact-Match Filtering, Ascending Sort
First Page

SELECT <columns>
FROM <table_name>
WHERE <partition> = <partition_filter_value> AND
    (cluster_1, cluster_2, ..., cluster_k) =
    (filter_value_1, filter_value_2, ..., filter_value_k)
ORDER BY <cluster_1> ASC
LIMIT <page_size>;

Next Page

SELECT <columns>
FROM <table_name>
WHERE <partition> = <partition_filter_value>  AND
    (cluster_1, cluster_2, ..., cluster_k) =
    (filter_value_1, filter_value_2, ..., filter_value_k) AND
    (cluster_k+1, cluster_k+2, ..., cluster_n) >
    (last_result.cluster_k+1, last_result.cluster_k+2, ..., last_result.cluster_n)
ORDER BY <cluster_1> ASC
LIMIT <page_size>;

Previous Page

SELECT <columns>
FROM <table_name>
WHERE <partition> = <partition_filter_value>  AND
    (cluster_1, cluster_2, ..., cluster_k) =
    (filter_value_1, filter_value_2, ..., filter_value_k) AND
    (cluster_k+1, cluster_k+2, ..., cluster_n) <
    (first_result.cluster_k+1, first_result.cluster_k+2, ..., first_result.cluster_n)
ORDER BY <cluster_1> DESC
LIMIT <page_size>;

Then, the application consuming the result set must reverse the ordered of the retrieved results.


Exact-Match Filtering, Descending Sort
First Page

SELECT <columns>
FROM <table_name>
WHERE <partition> = <partition_filter_value> AND
    (cluster_1, cluster_2, ..., cluster_k) =
    (filter_1, filter_2, ..., filter_k)
ORDER BY <cluster_1> DESC
LIMIT <page_size>;

Next Page

SELECT <columns>
FROM <table_name>
WHERE <partition> = <partition_filter_value>  AND
    (cluster_1, cluster_2, ..., cluster_k) =
    (filter_1, filter_2, ..., filter_k) AND
    (cluster_k+1, cluster_k+2, ..., cluster_n) <
    (last_result.cluster_k+1, last_result.cluster_k+2, ..., last_result.cluster_n)
ORDER BY <cluster_1> DESC
LIMIT <page_size>;

Previous Page

SELECT <columns>
FROM <table_name>
WHERE <partition> = <partition_filter_value>  AND
    (cluster_1, cluster_2, ..., cluster_k) =
    (filter_1, filter_2, ..., filter_k) AND
    (cluster_k+1, cluster_k+2, ..., cluster_n) >
    (first_result.cluster_k+1, first_result.cluster_k+2, ..., first_result.cluster_n)
ORDER BY <cluster_1> DESC
LIMIT <page_size>;

Then, the application consuming the result set must reverse the ordered of the retrieved results.


Range Filtering, Ascending Sort
First Page

SELECT <columns>
FROM <table_name>
WHERE <partition> = <partition_filter_value> AND
    (cluster_1, cluster_2, ..., cluster_k) >=
        (filter_begin.cluster_1, filter_begin.cluster_2, ..., filter_begin.cluster_k) AND
    (cluster_1, cluster_2, ..., cluster_k) <=
        (filter_end.cluster_1, filter_end.cluster_2, ..., filter_end.cluster_k)
ORDER BY <cluster_1> ASC
LIMIT <page_size>;

Next Page

SELECT <columns>
FROM <table_name>
WHERE <partition> = <partition_filter_value> AND
    (cluster_1, cluster_2, ..., cluster_n) >
        (last_result.cluster_1, last_result.cluster_2, ..., last_result.cluster_n) AND
    (cluster_1, cluster_2, ..., cluster_k,
        cluster_k+1, ..., cluster_n-1, cluster_n) <
    (filter_begin.cluster_1, filter_begin.cluster_2, ..., filter_begin.cluster_k,
        MAX_VALUE, ..., MAX_VALUE, MAX_VALUE)
ORDER BY <cluster_1> ASC
LIMIT <page_size>;

Previous Page

SELECT <columns>
FROM <table_name>
WHERE <partition> = <partition_filter_value> AND
    (cluster_1, cluster_2, ..., cluster_n) <
        (first_result.cluster_1, first_result.cluster_2, ..., first_result.cluster_n) AND
    (cluster_1, cluster_2, ..., cluster_k,
        cluster_k+1, ..., cluster_n-1, cluster_n) >
    (filter_begin.cluster_1, filter_begin.cluster_2, ..., filter_begin.cluster_k,
        MIN_VALUE, ..., MIN_VALUE, MIN_VALUE)
ORDER BY <cluster_1> DESC
LIMIT <page_size>;

Then, the application consuming the result set must reverse the ordered of the retrieved results.


Range Filtering, Descending Sort
First Page

SELECT <columns>
FROM <table_name>
WHERE <partition> = <partition_filter_value> AND
    (cluster_1, cluster_2, ..., cluster_k) >=
        (filter_begin.cluster_1, filter_begin.cluster_2, ..., filter_begin.cluster_k) AND
    (cluster_1, cluster_2, ..., cluster_k) <=
        (filter_end.cluster_1, filter_end.cluster_2, ..., filter_end.cluster_k)
ORDER BY <cluster_1> DESC
LIMIT <page_size>;

Next Page

SELECT <columns>
FROM <table_name>
WHERE <partition> = <partition_filter_value> AND
    (cluster_1, cluster_2, ..., cluster_k,
        cluster_k+1, ..., cluster_n-1, cluster_n) >
    (filter_begin.cluster_1, filter_begin.cluster_2, ..., filter_begin.cluster_k,
        MIN_VALUE, ..., MIN_VALUE, MIN_VALUE) AND
    (cluster_1, cluster_2, ..., cluster_n) <
    (last_result.cluster_1, last_result.cluster_2, ..., last_result.cluster_n)
ORDER BY <cluster_1> DESC
LIMIT <page_size>;

Previous Page

SELECT <columns>
FROM <table_name>
WHERE <partition> = <partition_filter_value> AND
    (cluster_1, cluster_2, ..., cluster_k,
        cluster_k+1, ..., cluster_n-1, cluster_n) <
    (filter_end.cluster_1, filter_end.cluster_2, ..., filter_end.cluster_k,
        MAX_VALUE, ..., MAX_VALUE, MAX_VALUE) AND
    (cluster_1, cluster_2, ..., cluster_cluster_n) >
    (last_result.cluster_1, last_result.cluster_2, ..., last_result.cluster_n)
ORDER BY <cluster_1> ASC
LIMIT <page_size>;

Then, the application consuming the result set must reverse the ordered of the retrieved results.

 


#DataExchange
#IBMSterlingB2BIntegratorandIBMSterlingFileGatewayDevelopers
0 comments
14 views

Permalink