Introduction
A recent PwC study of the UK market estimates that, by 2022, the emerging open banking ecosystem will create new products and services adopted by 71% of SMEs and 64% of retail customers (Ref [1]). Numbers will differ across geographies, because they depend on the pace at which new standards and regulations are implemented. In this article I often refer to PSD2, the directive that is shaping open banking across Europe. However, industry and regulators are promoting similar initiatives across the world and, regardless of where you live, there is no doubt open banking will transform the way you pay for goods and manage your finances.
Many of the innovative solutions emerging in this space are based on aggregating account information beyond the traditional boundaries of a single institution. If you can build, easily and securely, a holistic view of a customer's financial profile, you can redesign many traditional financial processes. Typical examples include cash management, risk scoring, spending analysis and, more generally, financial advice. This is the first of two posts focused on the integration logic common to all these scenarios: retrieving the data, aggregating it, and subsequently querying it. This first article looks at the requirements and the logical solution, while the next will present a concrete implementation.
Actors and high-level process
The problem of exchanging sensitive information in an open ecosystem is at the core of this use case. In a previous article (Ref [2]) I looked at it from the point of view of a bank exposing customer data to third parties. In this one I focus instead on a third party consuming the APIs of multiple banks.
 Figure 1 – Actors
Let’s start by introducing the key actors involved in account information aggregation, shown in Figure 1:
- The bank: the institution exposing account information APIs.
- The account information service provider: a third party building value-added services on top of the APIs exposed by banks.
- The end customer: you and me, the end customers of a bank. We are the owners of our financial data and, according to recent regulations (for example PSD2), we can decide to give the third party consent to access that data on our behalf.
The “customer consent” just mentioned is at the core of the relationship between these parties. The process implemented by an account information service provider will typically include three high-level steps (Figure 2).
Figure 2 – High-level process
During the initial setup, a customer grants an individual consent for every bank the third party will access on their behalf. Under PSD2 that involves being authenticated by each one of those banks. Ref [2] describes how a bank’s APIs enable this process. At the end of these interactions the third party receives access tokens, the temporary keys that grant access to the data the customer has agreed to share. The account information service provider can then retrieve the data, aggregate it, and apply the logic provided by its service.
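To make the relationship between consents and access tokens more concrete, here is a minimal sketch of how a third party might keep track of them. It assumes one consent per customer and bank, and a token with an expiry date. The names `Consent`, `ConsentRegistry` and `usable_tokens`, the identifiers and the 90-day validity are all hypothetical and chosen for illustration; they do not come from PSD2 or from any specific bank API.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Consent:
    """One customer's consent for one bank, plus the token it produced."""
    user_id: str          # identity managed by the third party (hypothetical)
    bank_id: str          # bank that authenticated the customer
    access_token: str     # temporary key received after consent
    expires_at: datetime  # tokens are temporary, so expiry must be tracked


class ConsentRegistry:
    """In-memory store keyed by (user, bank): one consent per bank."""

    def __init__(self) -> None:
        self._consents: dict[tuple[str, str], Consent] = {}

    def grant(self, consent: Consent) -> None:
        # A new consent for the same user and bank replaces the old one.
        self._consents[(consent.user_id, consent.bank_id)] = consent

    def usable_tokens(self, user_id: str, now: datetime) -> dict[str, str]:
        """Return bank_id -> token for consents that have not expired."""
        return {
            c.bank_id: c.access_token
            for c in self._consents.values()
            if c.user_id == user_id and c.expires_at > now
        }


# Example data: the validity periods below are arbitrary.
now = datetime(2019, 1, 1, tzinfo=timezone.utc)
registry = ConsentRegistry()
registry.grant(Consent("alice", "bank-a", "token-a", now + timedelta(days=90)))
registry.grant(Consent("alice", "bank-b", "token-b", now - timedelta(days=1)))

# bank-b is skipped because its consent has already expired.
print(registry.usable_tokens("alice", now))
```

In a real service the registry would live in persistent, encrypted storage, and an expired entry would send the customer back to the bank to renew consent.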
Integration challenges
Automating the process described above is not trivial. Think of these technical challenges:
- Number of integration points. PSD2 doesn’t mandate a single API standard (see Ref [3]). Different standards are used in different geographies and even the implementation of the same standard can differ significantly between banks. Every bank must be treated as a different integration point. If you want to offer your services to a large number of customers, you have to integrate with all the banks operating in your market.
- Maintenance. A third party has no control over the lifecycle of the endpoints exposed by banks. Every bank moves to new interface versions according to its own schedule. Your implementation needs to keep up with that.
- Performance. You aggregate a large number of transactions across multiple accounts. Retrieving a full set of customer data requires multiple API calls returning large payloads. This can have a significant impact on the performance of your solution.
- Data query. PSD2 APIs provide a flat list of accounts and transactions; if you want to apply any logic to that data or just display it on a chart, you must be able, at a minimum, to query, filter and aggregate it.
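To illustrate the data query challenge, the sketch below loads a flat list of transactions into an in-memory SQLite database and computes spending per category across banks. The field names (`bank`, `account`, `date`, `amount`, `category`) and the sample values are assumptions made for the example, not the schema of any PSD2 API. SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical flat records, as they might look after retrieval from two banks.
transactions = [
    {"bank": "bank-a", "account": "acc-1", "date": "2019-01-03", "amount": -42.50, "category": "groceries"},
    {"bank": "bank-a", "account": "acc-1", "date": "2019-01-15", "amount": -18.00, "category": "transport"},
    {"bank": "bank-b", "account": "acc-9", "date": "2019-01-20", "amount": -30.25, "category": "groceries"},
    {"bank": "bank-b", "account": "acc-9", "date": "2019-01-25", "amount": 1500.00, "category": "salary"},
]

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE tx (bank TEXT, account TEXT, date TEXT, amount REAL, category TEXT)")
db.executemany(
    "INSERT INTO tx VALUES (:bank, :account, :date, :amount, :category)",
    transactions,
)

# Filter outgoing payments in January, then aggregate them per category
# regardless of which bank or account they came from.
spending = db.execute(
    """
    SELECT category, ROUND(SUM(-amount), 2) AS spent
    FROM tx
    WHERE amount < 0 AND date BETWEEN '2019-01-01' AND '2019-01-31'
    GROUP BY category
    ORDER BY spent DESC
    """
).fetchall()

print(spending)  # [('groceries', 72.75), ('transport', 18.0)]
```

Amounts are stored as floating-point numbers to keep the example short; a production system would more likely store integer minor units or decimals, together with a currency code.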
Solution architecture
Figure 3 presents an account information aggregation solution pattern, with the integration components highlighted in cyan.
 Figure 3 – Logical solution
Here is a description of each element:
- End user application. Because the focus of this post is integration, I’ll refer to “End user application” in very broad terms. It is the collection of capabilities responsible for orchestrating the end user experience and exposing interfaces directly to the end customer.
- User management. As mentioned above, under PSD2 banks are responsible for authenticating customers who share data with third parties. When that happens, the customer proves their identity directly to the bank, not to the third party. The third party can only count on the fact that “a” customer of that bank has given consent. Hence the data aggregator has to manage the identities of its own users and map them to customer consents.
- Local datastore. A database acting as a local cache of the data retrieved from banks. It supports advanced data query and filtering, reduces the number of roundtrips to the banks’ APIs and, in general, enables a better user experience.
- Open banking gateway. The component of the solution acting as a single gateway to the public APIs exposed by banks. It exposes a single stable interface, hiding the variations between the endpoints implemented by different institutions.
- Data flow orchestration. The integration component responsible for aggregating account information data and making it consumable for further processing. The third party is interested in the full transaction history of customers that might have several accounts with multiple institutions. This requires the aggregation of tens of API calls returning large payloads. The orchestration should be done asynchronously, managing every API call as an independent event.
- Connectors. Connectors act as a link between the data flow orchestration and the data sources/targets, in this case the “local datastore” and the “open banking gateway”. They simplify the solution by discovering the metadata of the target system, listening for events or handling pagination. In short, they take care of low-level integration concerns, so that you can focus on designing the data flow.
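The interplay between gateway, connectors and data flow orchestration can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation discussed in the next post. `BankAdapter`, `FakeBank`, `fetch_all` and `aggregate` are hypothetical names, the page-number pagination scheme is invented, and `FakeBank` replaces real HTTP calls with canned data.

```python
from __future__ import annotations

import asyncio
from typing import Protocol


class BankAdapter(Protocol):
    """Single stable interface the gateway exposes for every bank (hypothetical)."""

    async def fetch_page(self, token: str, page: int) -> tuple[list[dict], bool]: ...


class FakeBank:
    """Stand-in for a real bank API: serves canned transactions in pages."""

    def __init__(self, bank_id: str, amounts: list[float], page_size: int = 2) -> None:
        self.bank_id, self.amounts, self.page_size = bank_id, amounts, page_size

    async def fetch_page(self, token: str, page: int) -> tuple[list[dict], bool]:
        await asyncio.sleep(0)  # placeholder for a network round trip
        start = page * self.page_size
        chunk = self.amounts[start:start + self.page_size]
        rows = [{"bank": self.bank_id, "amount": a} for a in chunk]
        has_more = start + self.page_size < len(self.amounts)
        return rows, has_more


async def fetch_all(adapter: BankAdapter, token: str) -> list[dict]:
    """Connector concern: follow pagination until the bank reports no more pages."""
    rows: list[dict] = []
    page, has_more = 0, True
    while has_more:
        batch, has_more = await adapter.fetch_page(token, page)
        rows.extend(batch)
        page += 1
    return rows


async def aggregate(adapters: dict[str, BankAdapter], tokens: dict[str, str]) -> list[dict]:
    """Orchestration: one independent task per bank, run concurrently."""
    tasks = [fetch_all(adapters[bank], tokens[bank]) for bank in tokens]
    results = await asyncio.gather(*tasks, return_exceptions=True)
    # A failing bank should not block the others: keep only successful results.
    return [row for r in results if not isinstance(r, BaseException) for row in r]


adapters = {
    "bank-a": FakeBank("bank-a", [-10.0, -20.0, 5.0]),
    "bank-b": FakeBank("bank-b", [100.0]),
}
tokens = {"bank-a": "token-a", "bank-b": "token-b"}

rows = asyncio.run(aggregate(adapters, tokens))
print(len(rows))  # 4
```

Treating each bank as an independent task means that a slow or failing endpoint only delays or removes its own data. In a fuller version the aggregated rows would be written to the local datastore rather than returned in memory.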
Conclusion
In this article I have highlighted the role of customer consent in integrating data across an open ecosystem, called out technical challenges linked to the proliferation of API endpoints, and identified a logical solution architecture. If you want to further discuss any of these points, feel free to send me a note at cmarcoli@uk.ibm.com.
In the next post (Ref [4]) I’ll walk through a physical implementation of the architecture described here.
References
[1] The future of banking is open
[2] PSD2 Reference Architecture
[3] Will this be the year of API standardisation?
[4] Account information aggregation – 2.Implementation
[5] Open data: the new frontier of integration