Summary
Coalesce Quality is a cloud application provided to customers as software as a service (SaaS). It is deployed in two regions, EU (default, hosted on Google Cloud in Europe) and US (hosted on Google Cloud in the United States), with total data isolation between regions. Coalesce Quality is designed and operated with security top of mind and follows the shared responsibility security model: the customer is responsible for security within the context of their use of the service, while Coalesce Quality is responsible for the security of the service itself and shares that responsibility with its cloud provider.
Figure: Coalesce Quality high-level architecture
Data access level
Coalesce Quality provides rich data monitoring and testing functionality that requires configurable levels of access to customer data depending on the use case:
- [Minimal] access to logs for relevant data tools such as transformation layers (dbt, SQLMesh, Coalesce Transform), data warehouse, data orchestrators, and BI tools to provide data observability functionality such as parsing information about the execution of data transformation, lineage, logs, and alerting
- [Recommended] access to `information_schema` in the data warehouse to understand freshness and volume of data across tables in the data warehouse to provide automated data anomaly detection
- [Recommended] access to query logs to allow Coalesce Quality to process query logs and expose additional functionality such as advanced lineage parsing, monitor automation, and query analytics
- [Recommended] access to code repository (GitHub, Bitbucket, GitLab) to connect source code with data assets and facilitate data diagnostics workflows
- [Where necessary] Access to relevant datasets to calculate aggregated metrics (count, sum, min, max, or similar) in selected tables if the customer wishes to deploy custom monitors to detect data anomalies across key segments of the customer’s data
- [Where necessary] Access to relevant datasets to execute SQL tests. Coalesce Quality by default executes all testing expressions wrapped in a `count(*)` query, which means only the number of failures is processed and stored. The only exception is tests with the configuration option `saveFailures=true`, in which case failed row-level records are processed in transit and in memory for the purpose of writing logs to audit tables in the customer data platform. No data is persisted in Coalesce Quality. See SQL Tests audit table schema for details on the table structure.
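The default count-only behaviour can be illustrated with a minimal sketch. It assumes a test is written as a SQL query that selects the failing rows; the helper name `wrap_in_count`, the `orders` table, and the in-memory SQLite database are illustrative assumptions and not part of Coalesce Quality.

```python
import sqlite3


def wrap_in_count(test_sql: str) -> str:
    """Wrap a test query so that only the number of failing rows is returned.

    Hypothetical helper for illustration; not Coalesce Quality's actual code.
    """
    # Strip a trailing semicolon so the query can be used as a subquery.
    inner = test_sql.strip().rstrip(";")
    return f"SELECT count(*) AS failures FROM ({inner}) AS failed_rows"


# Illustrative data: an in-memory table standing in for a customer dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, -5.0), (3, None), (4, 20.0)],
)

# The test selects rows that violate "amount is present and non-negative".
test_sql = "SELECT * FROM orders WHERE amount IS NULL OR amount < 0;"

# Only the aggregate is returned; no row-level values leave the database.
failures = conn.execute(wrap_in_count(test_sql)).fetchone()[0]
print(failures)
```

Because the wrapped query returns a single integer, the caller never sees the contents of the failing rows, only how many there are.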
Data Storage and Processing Locations
Coalesce Quality is available in two regions with total data isolation between them. Each region has its own processing and storage layers, and no customer data crosses region boundaries.
EU Region (Default):
- The processing layer uses ClickHouse in EU-based locations.
- The storage layer uses Google Cloud Platform with data stored in Europe.
US Region:
- The processing layer uses ClickHouse in US-based locations.
- The storage layer uses Google Cloud Platform with data stored in the United States.
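As a rough sketch of what per-region isolation means in practice, the snippet below models each region as its own configuration and rejects any operation that would span two regions. The class, registry, and function names are hypothetical and chosen for illustration; they do not describe Coalesce Quality's internal implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RegionConfig:
    """Locations used by one region; descriptions mirror the text above."""
    name: str
    processing: str
    storage: str


# Hypothetical registry; keys and field names are assumptions for illustration.
REGIONS = {
    "eu": RegionConfig("eu", "ClickHouse (EU-based locations)",
                       "Google Cloud Platform (Europe)"),
    "us": RegionConfig("us", "ClickHouse (US-based locations)",
                       "Google Cloud Platform (United States)"),
}
DEFAULT_REGION = "eu"  # EU is the default region.


def resolve_region(name: Optional[str] = None) -> RegionConfig:
    """Return the config for a region, falling back to the default."""
    key = (name or DEFAULT_REGION).lower()
    if key not in REGIONS:
        raise ValueError(f"unknown region: {name!r}")
    return REGIONS[key]


def check_same_region(account_region: str, resource_region: str) -> None:
    """Refuse any operation that would move data across a region boundary."""
    if resolve_region(account_region) != resolve_region(resource_region):
        raise PermissionError("cross-region data access is not allowed")
```

Under these assumptions, `check_same_region("eu", "eu")` returns silently, while `check_same_region("eu", "us")` raises `PermissionError`.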
Cloud Security
Coalesce Quality utilises Google Cloud to take advantage of the same secure-by-design infrastructure, built-in protection, and global network that Google uses to protect its information, identities, applications, and devices. We use Google Cloud Armor as a network security service and Google Cloud Monitoring to monitor the performance, availability, and health of our applications and infrastructure.
Authentication and Authorization
Access to the application is secured by Auth0. We currently support two authentication modes: a unique username/password pair generated for each user, or social login via Google Workspace. Coalesce Quality enables SSO via Google Workspace (further SSO apps can be added based on requirements) and uses role-based access authorization with two user profiles: administrator and analyst.
Encryption
Coalesce Quality protects individual systems or information by means of cryptographic controls. All data in transit and at rest is encrypted by default.
Google Cloud
- All data stored in Google Cloud is encrypted at the storage level using AES-256. Data for storage is split into chunks, and each chunk is encrypted with a unique data encryption key.
- Data in transit between end users’ browsers and the Coalesce Quality Google Cloud cluster is encrypted with SSL with automatic certificate rotation managed by Google Cloud.
ClickHouse
- Data in transit to and from ClickHouse Cloud over the public internet is encrypted using TLS 1.2 or TLS 1.3.
- Data at rest is encrypted with AES-256, applied to the AWS S3 buckets.
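On the client side, a connection that accepts only the TLS versions named above could be configured along the lines of the following sketch. It uses Python's standard `ssl` module and is a generic illustration under that assumption, not ClickHouse-specific client code.

```python
import ssl

# Start from the standard library defaults, which enable certificate
# verification and hostname checking.
context = ssl.create_default_context()

# Refuse anything older than TLS 1.2; TLS 1.3 is negotiated automatically
# when both the client and the server support it.
context.minimum_version = ssl.TLSVersion.TLSv1_2

print(context.minimum_version.name)
print(context.check_hostname)
```

A context built this way can be passed to any client library that accepts an `ssl.SSLContext`, so the version floor is enforced regardless of what the server would otherwise allow.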