Introduction
While distributed databases like Apache Cassandra® have been around for several years, they haven’t always been easy to work with. Although deploying and maintaining Cassandra has been greatly simplified with the introduction of Astra DB, development with tools from the Microsoft DotNet ecosystem has remained somewhat cumbersome. In an effort to simplify this process, DataStax (an IBM company) has released a new C# client for the Astra DB Data API.
In this article, we will discuss Astra DB and the Data API, as well as introduce the C# client for the Astra Data API. We will examine how the Data API client can be leveraged in a typical Model-View-Controller (MVC) application design and discuss any nuances with data access.
For the application, we will use the example KillrVideo streaming service that we discussed previously. Our discussion will cover the technical implementation of the C# DataStax Data API client in a model/repository layer, as well as how it supports various data access operations for KillrVideo’s video streaming use cases.
The DataStax Data API
Traditionally, developers have interacted with Cassandra and Astra DB by using the Cassandra Query Language (CQL) and its associated language drivers. Astra DB offers the Data API, a series of HTTP endpoints, as another option for application interaction. Additionally, there are language-specific Data API clients which are designed to use these endpoints and lower the learning curve for building effective data storage services.
The idea behind the Data API comes from the fact that Cassandra and CQL have historically been labeled “difficult to use.” At DataStax, we took a look at what other NoSQL databases were doing in terms of developer tooling, and noticed how their ability to make the development process easier correlated with increased developer adoption.
With the C# Data API client, DotNet developers can now work with Astra DB from a document perspective. Data is stored in Astra DB inside of a “collection,” which is a special kind of table. This document model gives developers the flexibility to build applications whose data storage can scale and grow as needed.
Additionally, the Data API allows developers to work with existing CQL tables in Astra DB from this document paradigm. This aspect is called the “Table API,” and it greatly lowers the complexity of modernizing legacy applications. Now, let’s move on to working with the C# Data API client.
Note: As of this writing, the C# Data API client is in a pre-GA preview. The syntax demonstrated in this article may be subject to change. Be sure to check the DataStax Data API client documentation for updates.
Installation of the C# Data API client
To install the C# Data API client, you can use the DotNet package manager:
dotnet add package DataStax.AstraDB.DataApi
This adds it as a dependency inside your .csproj file.
Note: The "dotnet add" command above will work as written once the client is GA. Until then, a pre-release version must be specified explicitly, as shown below:
dotnet add package DataStax.AstraDB.DataApi --version 2.0.1-beta
Connecting to Astra DB
To build our connection to Astra DB, we are going to define three environment variables:
Variable | Description
ASTRA_DB_API_ENDPOINT | API endpoint for your Astra database.
ASTRA_DB_APPLICATION_TOKEN | API token created for your DB in the Astra UI.
ASTRA_DB_KEYSPACE | Name of the keyspace in your Astra database.
Table 1 – Environment variables for the Astra DB credentials.
These variables will need to be defined in your DotNet environment for the backend service to run.
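Because a missing variable would otherwise only surface later as a failed connection, it can help to check for all three at startup. The following is a minimal sketch of such a check; the AstraEnvironment class and its methods are hypothetical helpers (not part of the Data API client), and the variable names are the ones read by the CassandraConnection constructor later in this article.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the Data API client.
public static class AstraEnvironment
{
    // Names read by the CassandraConnection constructor.
    public static readonly string[] RequiredVariables =
    {
        "ASTRA_DB_API_ENDPOINT",
        "ASTRA_DB_APPLICATION_TOKEN",
        "ASTRA_DB_KEYSPACE"
    };

    // Returns the names of required variables that are unset or blank.
    // The lookup delegate defaults to the process environment; it is a
    // parameter only so the check can be exercised with fake values.
    public static IReadOnlyList<string> FindMissing(Func<string, string?>? lookup = null)
    {
        if (lookup is null)
        {
            lookup = name => Environment.GetEnvironmentVariable(name);
        }

        return RequiredVariables
            .Where(name => string.IsNullOrWhiteSpace(lookup(name)))
            .ToList();
    }

    // Throws with a readable message if any required variable is missing.
    public static void EnsureConfigured()
    {
        var missing = FindMissing();
        if (missing.Count > 0)
        {
            throw new InvalidOperationException(
                "Missing environment variables: " + string.Join(", ", missing));
        }
    }
}
```

Calling AstraEnvironment.EnsureConfigured() early in application startup would stop the service with a clear message instead of a later connection error.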
KillrVideo C# Data API Backend
There are two backend repositories in the KillrVideo GitHub organization which run on the DotNet framework. We will want the one which connects to Astra DB using the Data API. You can clone this repository with the following command:
git clone git@github.com:KillrVideo/kv-be-csharp-dataapi-table.git
Once you have pulled the repository down, you can change into the kv-be-csharp-dataapi-table directory and set the environment variables from Table 1.
CassandraConnection.cs
To connect with Astra DB from our repository classes, we will build a new class named CassandraConnection. This class will implement an interface named ICassandraConnection, and will define private variables for the Astra token, keyspace name, and API endpoint, as shown below:
using DataStax.AstraDB.DataApi;
using DataStax.AstraDB.DataApi.Core;

namespace kv_be_csharp_dataapi_table.Repositories;

public class CassandraConnection : ICassandraConnection
{
    private readonly string? _astraDbApplicationToken;
    private readonly string? _astraDbKeyspace;
    private readonly string? _astraApiEndpoint;
The class will need a constructor to define these variables from the environment:
    public CassandraConnection()
    {
        _astraDbApplicationToken = System.Environment.GetEnvironmentVariable("ASTRA_DB_APPLICATION_TOKEN");
        _astraDbKeyspace = System.Environment.GetEnvironmentVariable("ASTRA_DB_KEYSPACE");
        _astraApiEndpoint = System.Environment.GetEnvironmentVariable("ASTRA_DB_API_ENDPOINT");
    }
Finally, the class will need to define a method to instantiate the client and return the database connection object:
    public Database GetDatabase()
    {
        var client = new DataApiClient();
        var database = client.GetDatabase(
            _astraApiEndpoint,
            _astraDbApplicationToken,
            _astraDbKeyspace
        );
        return database;
    }
}
With this class complete, we will now be able to inject it into our repository classes.
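One common way to make that injection happen in an ASP.NET Core application is to register the connection and repository classes with the built-in dependency injection container. The sketch below is an assumption about how a Program.cs could be wired, not a copy of the KillrVideo repository: the service lifetimes are illustrative choices, and IVideoDAL and VideoDAL refer to the repository class built later in this article. It also assumes the implicit usings that the ASP.NET Core web project template enables by default.

```csharp
// Program.cs (sketch; lifetimes and layout are assumptions)
using kv_be_csharp_dataapi_table.Repositories;

var builder = WebApplication.CreateBuilder(args);

// MVC controllers that will receive the repositories via constructor injection.
builder.Services.AddControllers();

// CassandraConnection only holds the three configuration strings,
// so a single shared instance is sufficient.
builder.Services.AddSingleton<ICassandraConnection, CassandraConnection>();

// One repository instance per HTTP request.
builder.Services.AddScoped<IVideoDAL, VideoDAL>();

var app = builder.Build();
app.MapControllers();
app.Run();
```

With these registrations in place, any constructor that asks for an ICassandraConnection (such as the VideoDAL constructor) receives the CassandraConnection instance from the container.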
Video.cs
Before we can build our repository classes, we will need to build our object model classes. For this example, we will start with the Video class. But first, we should have a look at the definition for the underlying videos CQL table:
CREATE TABLE killrvideo.videos (
    videoid uuid PRIMARY KEY,
    added_date timestamp,
    category text,
    content_features vector<float, 384>,
    content_rating text,
    description text,
    language text,
    location text,
    location_type int,
    name text,
    preview_image_location text,
    tags set<text>,
    userid uuid,
    views int,
    youtube_id text);

CREATE CUSTOM INDEX videos_content_features_idx ON killrvideo.videos (content_features)
    USING 'StorageAttachedIndex'
    WITH OPTIONS = {'similarity_function': 'COSINE'};
From our table definition, there are three things of note:
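Now, we can start on the Video class. There are three attributes that we will be using on various class properties: ColumnPrimaryKey, which marks a property as part of the table's primary key; ColumnName, which maps a property to the CQL column of the given name; and JsonProperty (from Newtonsoft.Json), which sets the property's name in serialized JSON.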
We can now use those attributes to annotate various properties of our Video model class:
using DataStax.AstraDB.DataApi.Tables;
using Newtonsoft.Json;

namespace kv_be_csharp_dataapi_table.Models;

public class Video
{
    [ColumnPrimaryKey(1)]
    [ColumnName("videoid")]
    [JsonProperty("video_id")]
    public Guid videoId { get; set; } = Guid.NewGuid();

    [ColumnName("userid")]
    [JsonProperty("user_id")]
    public Guid userId { get; set; } = Guid.NewGuid();

    public string name { get; set; } = string.Empty;
    public string description { get; set; } = string.Empty;
    public string location { get; set; } = string.Empty;

    [ColumnName("location_type")]
    [JsonProperty("location_type")]
    public int locationType { get; set; } = 0;

    [ColumnName("preview_image_location")]
    [JsonProperty("preview_image_location")]
    public string previewImageLocation { get; set; } = string.Empty;

    [ColumnName("content_features")]
    [JsonProperty("content_features")]
    public float[]? contentFeatures { get; set; }

    [ColumnName("added_date")]
    [JsonProperty("added_date")]
    public DateTime addedDate { get; set; } = DateTime.UtcNow;

    public HashSet<string> tags { get; set; } = new();
    public int views { get; set; } = 0;

    [ColumnName("youtube_id")]
    [JsonProperty("youtube_id")]
    public string youtubeId { get; set; } = string.Empty;

    [ColumnName("content_rating")]
    [JsonProperty("content_rating")]
    public string contentRating { get; set; } = string.Empty;

    public string category { get; set; } = string.Empty;
    public string language { get; set; } = string.Empty;
}
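To see what the JsonProperty attribute contributes on its own, the following sketch serializes a trimmed-down stand-in for the Video class with Newtonsoft.Json. VideoSummary and JsonNamingDemo are hypothetical names used only for this illustration, and the DataStax attributes are left out so the example depends on nothing but the Newtonsoft.Json package.

```csharp
using System;
using Newtonsoft.Json;

// Hypothetical, trimmed-down stand-in for the Video class above.
public class VideoSummary
{
    // Serialized as "video_id" rather than "videoId".
    [JsonProperty("video_id")]
    public Guid videoId { get; set; } = Guid.NewGuid();

    // Serialized as "youtube_id" rather than "youtubeId".
    [JsonProperty("youtube_id")]
    public string youtubeId { get; set; } = string.Empty;

    // No attribute: serialized under the property name itself, "name".
    public string name { get; set; } = string.Empty;
}

public static class JsonNamingDemo
{
    // Serializes with Newtonsoft.Json's default settings (no indentation).
    public static string Serialize(VideoSummary video)
    {
        return JsonConvert.SerializeObject(video);
    }
}
```

Serializing a VideoSummary this way produces JSON keyed by video_id, youtube_id, and name, which is why properties whose C# name already matches the desired JSON name need no attribute.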
With our Video class defined and properly attributed, we can move on to the VideoDAL repository class.
VideoDAL.cs
To perform data operations on the videos table in Astra DB, we will build a class named VideoDAL. It will implement the IVideoDAL interface and define a private variable for the database, which it will instantiate in its constructor:
using DataStax.AstraDB.DataApi.Core;
using kv_be_csharp_dataapi_table.Models;

namespace kv_be_csharp_dataapi_table.Repositories;

public class VideoDAL : IVideoDAL
{
    private readonly Database _database;

    public VideoDAL(ICassandraConnection cassandraConnection)
    {
        _database = cassandraConnection.GetDatabase();
    }
For our data access methods, we will need to be able to retrieve a video by its videoid. We will start by defining a Table object of type Video, by using the GetTable<T> method while passing the name of the videos table. Next, we will build a Filter object of type Video, and specify an equals-condition of the column named “videoid” and the value of the videoId variable. From there, we can await the result of the FindOneAsync method (while passing filter as a parameter):
    public async Task<Video?> GetVideoByVideoId(Guid videoId)
    {
        var table = _database.GetTable<Video>("videos");
        var filterBuilder = Builders<Video>.Filter;
        var filter = filterBuilder.Eq("videoid", videoId);
        return await table.FindOneAsync(filter);
    }
Inserting a new video is even simpler. No filter is needed, because of how we attributed the Video class. We can simply await the InsertOneAsync method while passing a Video object:
    public async Task SaveVideo(Video video)
    {
        var table = _database.GetTable<Video>("videos");
        await table.InsertOneAsync(video);
    }
Now suppose we need to update a specific column, such as the number of views for a specific video. For that, we can use a filter similar to the one above. We then build an Update object, where we can chain together calls to the Set method, each with a column and its new value. In this case, we can also see how a lambda expression can be used in both the Filter and Update objects in place of a column name string:
    public async Task UpdateVideoView(Video video)
    {
        var table = _database.GetTable<Video>("videos");
        var filter = Builders<Video>.Filter
            .Eq(v => v.videoId, video.videoId);
        var update = Builders<Video>.Update
            .Set(v => v.views, video.views);
        await table.UpdateOneAsync(filter, update);
    }
Another use case that we have is leveraging vector search to support a “related videos” query. To solve this problem, we can chain Sort and Limit calls onto the Find method. The Sort call takes a TableSort built with the Vector method, which indicates that we are running an approximate nearest neighbor (ANN) search on the content_features column. This should return rows whose content_features vectors are most similar to the supplied vector value. The Limit call restricts the query to a certain number of results:
    public async Task<IEnumerable<Video>> GetByVector(float[]? vector, int limit)
    {
        var table = _database.GetTable<Video>("videos");
        var vectorSearchData = table.Find()
            .Sort(
                Builders<Video>.TableSort
                    .Vector(v => v.contentFeatures, vector)
            ).Limit(limit);
        return vectorSearchData;
    }
}
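The index on content_features was created with the COSINE similarity function, so "most similar" here refers to cosine similarity. Astra DB evaluates this on the server as part of the ANN search; the sketch below only illustrates the underlying measure for two small vectors. VectorMath is a hypothetical helper and is not part of the Data API client.

```csharp
using System;

// Hypothetical helper for illustration; Astra DB computes similarity server-side.
public static class VectorMath
{
    // Cosine similarity: dot(a, b) / (|a| * |b|).
    // 1 means same direction, 0 means orthogonal, -1 means opposite direction.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a is null) throw new ArgumentNullException(nameof(a));
        if (b is null) throw new ArgumentNullException(nameof(b));
        if (a.Length != b.Length)
        {
            throw new ArgumentException("Vectors must have the same number of dimensions.");
        }

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }

        if (normA == 0 || normB == 0)
        {
            throw new ArgumentException("Cosine similarity is undefined for a zero vector.");
        }

        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

Because the measure depends only on direction, two embeddings that point the same way score as highly similar regardless of their magnitudes. Note also that the videos table declares content_features as a 384-dimensional vector, so the query vector must have that same length.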
With these data access methods defined in our “model” layer, we can continue to serve them from our controllers, which are called by the web front end. An example of this can be seen below:
Figure 1 - The KillrVideo web application showing details on a specific video, with results from the “Related Videos” (vector search) call on the right side.
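The hand-off from a controller to the model layer could look roughly like the sketch below. This is an illustration rather than the controller from the KillrVideo repository: the Controllers namespace, the route paths, and the default limit are hypothetical, and the code assumes that IVideoDAL declares the GetVideoByVideoId and GetByVector methods shown earlier and that ASP.NET Core's implicit usings are enabled.

```csharp
using Microsoft.AspNetCore.Mvc;
using kv_be_csharp_dataapi_table.Models;
using kv_be_csharp_dataapi_table.Repositories;

namespace kv_be_csharp_dataapi_table.Controllers; // hypothetical namespace

[ApiController]
[Route("api/videos")] // hypothetical route
public class VideosController : ControllerBase
{
    private readonly IVideoDAL _videoDal;

    // The repository is supplied by the dependency injection container.
    public VideosController(IVideoDAL videoDal)
    {
        _videoDal = videoDal;
    }

    // Returns a single video, or 404 if no row matches the id.
    [HttpGet("{videoId:guid}")]
    public async Task<ActionResult<Video>> GetVideo(Guid videoId)
    {
        var video = await _videoDal.GetVideoByVideoId(videoId);
        if (video is null)
        {
            return NotFound();
        }
        return Ok(video);
    }

    // Looks up the video, then uses its embedding for the vector search.
    [HttpGet("{videoId:guid}/related")]
    public async Task<ActionResult<IEnumerable<Video>>> GetRelated(
        Guid videoId, [FromQuery] int limit = 5)
    {
        var video = await _videoDal.GetVideoByVideoId(videoId);
        if (video?.contentFeatures is null)
        {
            return NotFound();
        }
        var related = await _videoDal.GetByVector(video.contentFeatures, limit);
        return Ok(related);
    }
}
```

Keeping the Data API calls inside VideoDAL means the controller only deals with Video objects and HTTP status codes, which is the separation the MVC design is meant to provide.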
Conclusions
As we have shown, the C# Data API client for Astra DB provides an easy path to working with existing Cassandra CQL tables in Astra DB. This makes it a great tool for modernizing legacy Cassandra/Astra applications, as well as for development of new, data-intensive solutions.
If your team uses the Microsoft DotNet ecosystem and you need an easy-to-use, distributed database, be sure to check out Astra DB and the Data API client for C#. Remember that Astra DB also allows you to store vector embeddings and execute vector searches, to help you bring your existing project into the future of AI-driven applications.