Data Sources
The Qualytics platform connects to your enterprise data sources through "Datastores", our unified connection framework that enables organizations to:
- Connect to any Apache Spark-compatible data source
- Support both traditional databases and modern object storage
- Profile and monitor structured data across your ecosystem
- Maintain secure, performant data access
- Scale data quality operations across diverse data platforms
- Centralize data quality management across sources
These data source integrations ensure comprehensive quality management across your entire data landscape, regardless of where your data resides.
Understanding Datastores
A Datastore in Qualytics represents any structured data source, including:
- Traditional relational databases (RDBMS)
- Raw files (CSV, XLSX, JSON, Avro, Parquet)
- Cloud storage platforms (AWS S3, Azure Blob Storage, GCP Cloud Storage)
Qualytics integrates with these data sources through a layered architecture:
Configuring Data Sources
Start connecting your data sources by adding a new Datastore:
- Navigate to the Datastores tab in the main menu
- Click the Add Source Datastore button:
Info
Qualytics supports any Apache Spark-compatible data source:
1. Traditional RDBMS
2. Raw files (CSV, XLSX, JSON, Avro, Parquet) on:
- AWS S3
- Azure Blob Storage
- GCP Cloud Storage
Connection Management
Each Datastore type requires specific connection credentials. For example, here's a Snowflake connection configuration:
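As an illustration only, the sketch below models the kind of settings a Snowflake connection commonly needs: account, user, password, warehouse, database, schema, and an optional role. The `SnowflakeConnection` class, its field names, and the sample values are assumptions made for this example; they are not taken from the Qualytics connection form or API, which may label or group these settings differently.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SnowflakeConnection:
    """Hypothetical settings container; field names are assumed, not Qualytics' own."""

    account: str
    user: str
    password: str
    warehouse: str
    database: str
    schema: str
    role: str = ""  # treated as optional in this sketch

    def missing_fields(self) -> list[str]:
        """Return the names of required settings that are empty."""
        optional = {"role"}
        return [
            f.name
            for f in fields(self)
            if f.name not in optional and not getattr(self, f.name)
        ]

    def redacted(self) -> dict[str, str]:
        """Return the settings with the password masked, safe for logging."""
        values = {f.name: getattr(self, f.name) for f in fields(self)}
        values["password"] = "***"
        return values


# Placeholder values only; real credentials belong in a secrets manager.
conn = SnowflakeConnection(
    account="example_account",
    user="dq_service_user",
    password="not-a-real-password",
    warehouse="ANALYTICS_WH",
    database="SALES",
    schema="PUBLIC",
)
```

Checking `conn.missing_fields()` before submitting a configuration is one way to catch incomplete credentials early, and `conn.redacted()` avoids writing the password to logs.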
Successfully configured Datastores appear on your home screen:
Datastore Operations
Each Datastore provides access to Qualytics' core capabilities and operations:
Initial Datastore View:
Activity Monitoring:
Getting Started with a New Datastore
When you first configure a Datastore:
- A Catalog operation starts automatically to discover your data assets
- After cataloging completes, run a Profile operation to:
  - Generate comprehensive metadata
  - Infer data quality checks
  - Enable quality monitoring
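The ordering described above (Catalog first, then Profile) can be pictured with a small toy model. This is a sketch for reasoning about the sequence only: `Operation` and `DatastoreWorkflow` are hypothetical names invented here and do not correspond to any Qualytics API.

```python
from enum import Enum


class Operation(Enum):
    CATALOG = "catalog"
    PROFILE = "profile"


class DatastoreWorkflow:
    """Toy model of the operation order for a new Datastore (not Qualytics code)."""

    def __init__(self, name):
        self.name = name
        self.completed = []

    def complete(self, operation):
        """Record a finished operation, enforcing that Catalog precedes Profile."""
        if operation is Operation.PROFILE and Operation.CATALOG not in self.completed:
            raise RuntimeError("Profile requires a completed Catalog operation")
        self.completed.append(operation)

    def next_step(self):
        """Return the next operation to run, or None when both are done."""
        if Operation.CATALOG not in self.completed:
            return Operation.CATALOG
        if Operation.PROFILE not in self.completed:
            return Operation.PROFILE
        return None


workflow = DatastoreWorkflow("example_datastore")
first = workflow.next_step()          # Catalog comes first
workflow.complete(Operation.CATALOG)
second = workflow.next_step()         # Profile follows once cataloging is done
```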
To start profiling: