Data Sources

The Qualytics platform connects to your enterprise data sources through "Datastores", our unified connection framework that enables organizations to:

  - Connect to any Apache Spark-compatible data source
  - Support both traditional databases and modern object storage
  - Profile and monitor structured data across your ecosystem
  - Maintain secure, performant data access
  - Scale data quality operations across diverse data platforms
  - Centralize data quality management across sources

These data source integrations ensure comprehensive quality management across your entire data landscape, regardless of where your data resides.

Understanding Datastores

A Datastore in Qualytics represents any structured data source, including:

  - Traditional relational databases (RDBMS)
  - Raw files (CSV, XLSX, JSON, Avro, Parquet)
  - Cloud storage platforms (AWS S3, Azure Blob Storage, GCP Cloud Storage)

Qualytics integrates with these data sources through a layered architecture:

Screenshot

Configuring Data Sources

Start connecting your data sources by adding a new Datastore:

  1. Navigate to the Datastores tab in the main menu
  2. Click the Add Source Datastore button:

Screenshot

Screenshot

Info

Qualytics supports any Apache Spark-compatible data source:

1. Traditional RDBMS 
2. Raw files (CSV, XLSX, JSON, Avro, Parquet) on:
    - AWS S3
    - Azure Blob Storage
    - GCP Cloud Storage
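As a rough illustration of what "structured raw files" means here: the same tabular records can be serialized as CSV or as JSON Lines and still expose the row-and-column structure that profiling relies on. A standard-library sketch with invented sample records:

```python
import csv
import io
import json

# Invented sample records, standing in for rows in any structured raw file.
records = [
    {"id": 1, "country": "US", "amount": 120.5},
    {"id": 2, "country": "DE", "amount": 75.0},
]

# The same data serialized as CSV...
csv_buf = io.StringIO()
writer = csv.DictWriter(csv_buf, fieldnames=["id", "country", "amount"])
writer.writeheader()
writer.writerows(records)

# ...and as JSON Lines; both are "structured raw files" a Datastore can point at.
jsonl = "\n".join(json.dumps(r) for r in records)

# Reading either format back yields the same logical rows.
csv_rows = list(csv.DictReader(io.StringIO(csv_buf.getvalue())))
json_rows = [json.loads(line) for line in jsonl.splitlines()]

print(csv_rows[0]["country"])  # US
print(json_rows[1]["amount"])  # 75.0
```

Columnar formats like Parquet and Avro carry the schema inside the file itself, which is one reason they profile efficiently at scale.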

Screenshot

Connection Management

Each Datastore type requires specific connection credentials. For example, here's a Snowflake connection configuration:

Screenshot
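The exact fields vary by Datastore type. For Snowflake, a connection configuration typically includes fields like the following; this is a hypothetical sketch using common Snowflake connection parameters, not necessarily Qualytics' exact form fields:

```python
# Hypothetical Snowflake connection settings. Field names mirror common
# Snowflake connector parameters; the values are invented placeholders.
snowflake_connection = {
    "account": "mycompany.us-east-1",   # Snowflake account identifier
    "user": "QUALYTICS_SVC",            # service user for profiling
    "password": "********",             # or key-pair auth, depending on setup
    "role": "REPORTING_RO",             # a read-only role is usually sufficient
    "warehouse": "ANALYTICS_WH",        # compute warehouse used to run queries
    "database": "SALES",
    "schema": "PUBLIC",
}

# A quick sanity check that all required fields are present before connecting.
required = {"account", "user", "password", "warehouse", "database", "schema"}
missing = required - snowflake_connection.keys()
print(sorted(missing))  # an empty list when the configuration is complete
```

Using a dedicated service user with a read-only role keeps profiling access narrowly scoped.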

Successfully configured Datastores appear on your home screen:

Screenshot

Datastore Operations

Each Datastore provides access to Qualytics' core capabilities and operations:

Initial Datastore View: Screenshot

Activity Monitoring: Screenshot

Getting Started with a New Datastore

When you first configure a Datastore:

  1. An automatic Catalog operation initiates to discover your data assets
  2. After cataloging completes, run a Profile operation to:
      - Generate comprehensive metadata
      - Infer data quality checks
      - Enable quality monitoring
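Conceptually, a Profile operation computes column-level statistics and turns them into inferred checks that later power monitoring. A toy sketch of that idea using only the Python standard library (invented sample data; not Qualytics' actual implementation):

```python
from statistics import mean

# Invented sample column values; None stands in for a NULL.
values = [12.0, 15.5, None, 9.0, 14.2, None, 11.1]

# 1. Generate metadata: basic column statistics.
non_null = [v for v in values if v is not None]
profile = {
    "count": len(values),
    "null_count": values.count(None),
    "min": min(non_null),
    "max": max(non_null),
    "mean": round(mean(non_null), 2),
}

# 2. Infer a data quality check from the observed value range.
inferred_check = ("between", profile["min"], profile["max"])

# 3. Enable monitoring: flag future values that fall outside the inferred range.
def violates(value, check=inferred_check):
    kind, lo, hi = check
    return value is not None and not (lo <= value <= hi)

print(profile["null_count"])  # 2
print(violates(20.0))         # True: outside the observed 9.0-15.5 range
print(violates(10.0))         # False
```

Real profiling covers many more statistics and check types, but the flow is the same: catalog what exists, measure it, then monitor against what was measured.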

To start profiling:

Screenshot

Screenshot