Data Cloud Architecture Accelerates Collaboration in a Global Financial Institution

Brief on Data Cloud Architecture Framework

Constantin Stanca
Data Cloud Architecture
6 min read · May 3, 2024


The following picture summarizes the Data Cloud Architecture Framework core concepts.

The Data Cloud Architecture Framework is built on the idea of fitting into, and not conflicting with, how an organization runs its business. An Organization can consist of multiple Business Entities. Business Entities that share their Assets under a common set of standards, governance, and operating procedures, and that have a mandate or goal of collaborating on those Assets (a Trust Relationship), are members of the same Constellation. One Business Entity can be a member of multiple Constellations and so adhere to multiple Trust Relationships. Constellations can be Internal or External. Some Business Entities are members of Internal Constellations only, while others are members of both Internal and External Constellations.
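As a rough illustration of these concepts, the relationships can be sketched as a small data model. This is only a sketch: the class names, the `can_collaborate` helper, and the example entities are assumptions made for illustration and are not part of the framework itself.

```python
from dataclasses import dataclass, field
from enum import Enum


class Scope(Enum):
    INTERNAL = "internal"
    EXTERNAL = "external"


@dataclass(frozen=True)
class BusinessEntity:
    name: str


@dataclass
class Constellation:
    name: str
    scope: Scope
    trust_relationship: str  # label for the shared standards and governance
    members: set[BusinessEntity] = field(default_factory=set)

    def join(self, entity: BusinessEntity) -> None:
        self.members.add(entity)


def constellations_of(entity, constellations):
    """Return every constellation the entity belongs to."""
    return [c for c in constellations if entity in c.members]


def can_collaborate(a, b, constellations):
    """Two entities share assets only inside a constellation they both joined."""
    return any(a in c.members and b in c.members for c in constellations)


# Hypothetical entities and constellations, for illustration only.
risk = BusinessEntity("Risk")
markets = BusinessEntity("Markets")
vendor = BusinessEntity("Market Data Vendor")

internal = Constellation("Firm-wide", Scope.INTERNAL, "internal-trust")
external = Constellation("Marketplace", Scope.EXTERNAL, "external-trust")
for entity in (risk, markets):
    internal.join(entity)
for entity in (markets, vendor):
    external.join(entity)

all_constellations = [internal, external]
```

In this sketch, Markets belongs to both constellations, while Risk belongs only to the internal one, so Risk and the vendor have no shared Trust Relationship to collaborate under.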

Constellations and Trust in a Financial Institution

Many organizations struggle to even give access to systems and data within their own four walls. Others are mandated by regulatory concerns and are constrained not by the technology, but by process. Every industry has different regulatory requirements for how data is accessed.

A financial institution must be able to anticipate and respond to a broad range of threats while also complying with increasingly onerous and complicated laws and regulations. The top compliance regulations for financial institutions include the Sarbanes-Oxley Act (SOX), the Gramm-Leach-Bliley Act (GLBA), the Payment Card Industry Data Security Standard (PCI DSS), the 23 NYCRR 500 cybersecurity regulations, the California Consumer Privacy Act (CCPA), and the General Data Protection Regulation (GDPR). On top of these come other financial regulations and compliance requirements covering encryption, firewalls and web gateways, intrusion detection, logging and data collection, required policies and processes, and vendor management.

Financial institutions are organizationally hybrid, consisting of lines of business (LOB), e.g. Asset Management, Wealth Management, Markets, and Banking, and functional groups (FG), e.g. Strategy, Risk, Audit, Compliance, Human Capital Management, Legal, and Engineering. Functional groups cover aspects common to all lines of business. Together, lines of business and functional groups form an internal constellation. The organization as a whole, or individual lines of business, can also be members of external constellations. The diagram below shows how the Organization as a whole is a member of an external constellation via an “External Marketplace” that implements a Trust Relationship.

Let’s take a closer look at how a global financial institution like Goldman Sachs implements a Trust Relationship for data collaboration. Goldman Sachs is a member of an external constellation of providers and consumers governed by an External Trust Relationship, and it has also implemented an internal constellation guided by an Internal Trust Relationship. The Trust Relationship covers more than secure delivery. It also specifies the name, provider, consumer, purpose, schema, and data dictionary; service-level indicators such as interval of change, latency, completeness, freshness, availability, performance, and volume; terms (allowed usage and access pattern, query frequency); entitlements; and cost and billing.
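To make the shape of such a Trust Relationship more concrete, here is a minimal sketch of it as a typed record with a simple check. The field names, the `permits` method, and all example values are assumptions chosen for illustration. They do not reflect how Goldman Sachs, Legend, or Snowflake actually represent these agreements, and only a subset of the attributes listed above is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevels:
    max_latency_seconds: float
    max_staleness_hours: float
    min_availability_pct: float


@dataclass(frozen=True)
class Terms:
    allowed_usage: frozenset[str]
    max_queries_per_day: int


@dataclass(frozen=True)
class TrustRelationship:
    name: str
    provider: str
    consumer: str
    purpose: str
    schema: dict[str, str]        # column name -> type
    service_levels: ServiceLevels
    terms: Terms
    entitlements: frozenset[str]  # roles allowed to consume the data

    def permits(self, role: str, usage: str, queries_today: int) -> bool:
        """Check a request against entitlements, allowed usage, and query quota."""
        return (
            role in self.entitlements
            and usage in self.terms.allowed_usage
            and queries_today < self.terms.max_queries_per_day
        )


# Hypothetical agreement between a data vendor and an internal consumer.
contract = TrustRelationship(
    name="eod-prices",
    provider="Market Data Vendor",
    consumer="Asset Management",
    purpose="research",
    schema={"ticker": "string", "close": "float"},
    service_levels=ServiceLevels(5.0, 24.0, 99.9),
    terms=Terms(frozenset({"research"}), max_queries_per_day=1000),
    entitlements=frozenset({"quant_research"}),
)
```

Writing the agreement down as data, rather than leaving it in a document, is what lets a platform enforce it automatically at query time.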

The External Marketplace (Vendor Supplied Data) reshares public or private market data in a trusted manner to internal consumers, which are the various lines of business and functional groups. Goldman Sachs made Governed Data Collaboration possible using the Legend Data Platform, the Snowflake Data Cloud, the Snowflake Native Apps Framework, and Snowflake governance capabilities. Together these implement the Trust that governs collaboration among the Business Entities that are members of internal and external constellations. Teams can not only understand the data but also transform it, govern it, share it, and model it, which improves timely, data-driven insights and collaborative decision making. This approach turned a complex process into a self-service experience for researchers and business users.

“The whole idea is to reduce friction and bring this data as quickly and as smoothly as possible into the hands of our users,” says Abhishek Narang, Managing Director, Data Engineering at Goldman Sachs. “In the past, business teams received massive tables of data replicated into their databases, from which it was hard to derive insights. In this new model, we have built in governance models to make data access more meaningful to the end user.”

Third-party data is modeled in Legend, the correct digital rights and entitlements are enforced, data quality standards are met, and the data is versioned appropriately. This approach increased the level of trust and consequently accelerated adoption. The modeled, access-controlled data is made available to business units for anything from research initiatives and alpha generation to finding creative opportunities to help clients.

“Legend and Snowflake Native Apps in the Data Cloud allow us to encapsulate all the entitlements, all the model spec — which is our IP — and share that with the end user,” says Narang. “It basically superimposes governance with the classic data sharing approach.”

The combination of Legend’s semantic modeling, Snowflake Native Apps, and Secure Data Sharing simplifies governed collaboration for Goldman Sachs. The rules dictating how data is connected and entitled are packaged within the app, and remain within it when the app is reshared. This approach reduces risks associated with siloed data engineering teams in individual business groups, including disparate approaches to enforcing entitlements and governance, duplication of data and effort, cost increases, or different understandings of the data. For example, one team might write queries differently from others, or someone may misunderstand how one data set joins to another and write different semantics in their notebooks.

To further combat these risks and facilitate self-service insights, Goldman Sachs made discovery easy with a vendor data catalog that lets users identify what is available via Legend and which business units can consume the data. Trust is ensured by implementing appropriate access controls. Once users have figured out which data they need, all they have to do is request and consume the Snowflake Native App with a single click. To the user it looks like a regular view or table, because it is shared through user-defined table functions (UDTFs), but they are actually getting the full model specification packaged in the app with the appropriate governance.
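The idea of a table function that looks like a plain table while carrying its governance with it can be illustrated in a few lines. The sketch below is plain Python, not Snowflake's UDTF API, and the data, business unit names, and entitlement rules are all hypothetical.

```python
from typing import Iterator

# Hypothetical vendor data set; in the article this lives in Snowflake.
_PRICES = [
    {"ticker": "AAA", "region": "US", "close": 101.5},
    {"ticker": "BBB", "region": "EU", "close": 55.0},
    {"ticker": "CCC", "region": "US", "close": 12.25},
]

# Hypothetical entitlements: which regions each business unit may read.
_ENTITLED_REGIONS = {
    "asset_management": {"US", "EU"},
    "wealth_management": {"US"},
}


def governed_prices(business_unit: str) -> Iterator[dict]:
    """Yield only the rows the caller's business unit is entitled to see."""
    regions = _ENTITLED_REGIONS.get(business_unit, set())
    for row in _PRICES:
        if row["region"] in regions:
            yield dict(row)  # copy so callers cannot mutate the source rows


# The consumer just iterates over rows, as if reading a table.
rows = list(governed_prices("wealth_management"))
```

The consumer never sees the entitlement table or the filtering logic. Because the rules travel with the function, every team that consumes it gets the same governed result.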

Data Cloud Architecture Trust Accelerates Collaboration

The impact of this new approach is substantial for Goldman Sachs. The Data Cloud simplifies real-time data sharing: there is no data movement and no need to make copies. It has built-in access controls that are simple and granular. Data set onboarding and access processes that took weeks or months have been reduced to days. Researchers, quants, and data scientists can now use Legend Snowflake Native Apps in the Data Cloud to find appropriate data for their needs, analyze it, transform it, and share it without having to track down a data engineer, and without worrying about degraded quality or governance. Data engineering teams can spend their time on more strategic projects instead of managing operational requests for new data sets.

Collaboration relies on Trust, and a high level of Trust accelerates collaboration. While Data Cloud Architecture makes no assumption about the level of Trust between Business Entities, it does put forward considerations for how to layer restrictions and use common security practices to meet the objective.

Snowflake’s end-to-end governance framework, Horizon, and platform capabilities such as end-to-end encryption, column-level security, row-level security, data masking, cell-level encryption, differential privacy, data clean rooms, and intellectual property protection can support contractual standards (DPA, etc.), regulatory standards (GDPR, CCPA, etc.), and any other standards the financial institution might need to follow. All of these are inputs that ultimately define the Trust Relationship of the constellation.
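As one small example of how such controls layer restrictions, the following sketch shows role-based data masking on a single column. It is plain Python with hypothetical role names and a made-up masking rule. It illustrates the concept only and is not Snowflake's masking policy syntax.

```python
# Hypothetical roles that may see the raw value.
UNMASKED_ROLES = {"compliance", "audit"}


def mask_account_number(value: str, role: str) -> str:
    """Show the full value to privileged roles, otherwise only the last 4 characters."""
    if role in UNMASKED_ROLES:
        return value
    # Short values are masked entirely so nothing leaks.
    visible = value[-4:] if len(value) > 4 else ""
    return "*" * (len(value) - len(visible)) + visible


def apply_masking(rows: list[dict], role: str) -> list[dict]:
    """Return copies of the rows with the account_number column masked for the role."""
    return [
        {**row, "account_number": mask_account_number(row["account_number"], role)}
        for row in rows
    ]


# Hypothetical record, for illustration only.
accounts = [{"client": "Client A", "account_number": "1234567890"}]
analyst_view = apply_masking(accounts, "analyst")
compliance_view = apply_masking(accounts, "compliance")
```

The same query returns different results depending on who asks, which is how one shared data set can satisfy several contractual and regulatory standards at once.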


Constantin Stanca
Data Cloud Architecture

An experienced leader, architect and developer of complex and large enterprise systems