Skip to content

Computed Fields

Computed Fields allow you to enhance data analysis by applying dynamic transformations directly to your data. These fields let you create new data points, perform calculations, and customize data views based on your specific needs, ensuring your data is both accurate and actionable.

Let's get started 🚀

Add Computed Fields

Step 1: Log in to Your Qualytics Account, navigate to the side menu, and select the source datastore where you want to create a computed field.

datastore datastore

Step 2: Select the Container within the chosen datastore where you want to create the computed field. This container holds the data to which the new computed field will be applied, enabling you to enhance your data analysis within that specific datastore.

For demonstration purposes, we have selected the Bank Dataset-Staging source datastore and the bank_transactions_.csv container within it to create a computed field.

container container

Step 3: After selecting the container, click on the Add button and select Computed Field from the dropdown menu to create a new computed field.

computed-field computed-field

A modal window will appear, allowing you to enter the details for your computed field.

add-field add-field

Step 4: Enter the Name for the computed field and select Transformation Type from the dropdown menu.

REF. FIELDS ACTION
1. Field Name (Required) Add a unique name for your computed field.
2. Transformation Type (Required) The type of transformation you want to apply from the available options.

Info

Transformations are changes made to data, like converting formats, doing calculations, or cleaning up fields. In Qualytics, you can use transformations to meet specific needs, such as cleaning entity names, converting formatted numbers, or applying custom expressions. With various transformation types available, Qualytics enables you to customize your data directly within the platform, ensuring it’s accurate and ready for analysis.

Transformation Types Purpose Reference
Cleaned Entity Name Removes business signifiers (such as 'Inc.' or 'Corp') from an entity name. See here
Convert Formatted Numeric Removes formatting (such as parentheses for denoting negatives or commas as delimiters) from values that represent numeric data, converting them into a numerically typed field. See here
Custom Expression Allows you to create a new field by applying any valid Spark SQL expression to one or more existing fields. See here

transformation-type transformation-type

Step 5: After selecting the appropriate Transformation Type, click the Save button.

save save

Step 6: After clicking on the Save button, your computed field is created and a success flash message will display saying A computed field has been created successfully.

success success

You can find your computed field by clicking on the dropdown arrow next to the container you selected when creating the computed field.

field-created field-created

Computed Fields Details

Totals

1. Quality Score: This provides a comprehensive assessment of the overall health of the data, factoring in multiple checks for accuracy, consistency, and completeness. A higher score, closer to 100, indicates optimal data quality with minimal issues or errors detected. A lower score may highlight areas that require attention and improvement.

2. Sampling: This shows the percentage of data that was evaluated during profiling. A sampling rate of 100% indicates that the entire dataset was analyzed, ensuring a complete and accurate representation of the data’s quality across all records, rather than just a partial sample.

3. Completeness: This metric measures how fully the data is populated without missing or null values. A higher completeness percentage means that most fields contain the necessary information, while a lower percentage indicates data gaps that could negatively impact downstream processes or analysis.

4. Active Checks: This refers to the number of ongoing quality checks being applied to the dataset. These checks monitor aspects such as format consistency, uniqueness, and logical correctness. Active checks help maintain data integrity and provide real-time alerts about potential issues that may arise.

5. Active Anomalies: This tracks the number of anomalies or irregularities detected in the data. These could include outliers, duplicates, or inconsistencies that deviate from expected patterns. A count of zero indicates no anomalies, while a higher count suggests that further investigation is needed to resolve potential data quality issues.

Totals Totals

Profile

This provides detailed insights into the characteristics of the field, including its type, distinct values, and length. You can use this information to evaluate the data's uniqueness, length consistency, and complexity.

No Profile Description
1 Declared Type Indicates whether the type is declared by the source or inferred.
2 Distinct Values Count of distinct values observed in the dataset.
3 Min Length Shortest length of the observed string values or lowest value for numerics.
4 Max Length Greatest length of the observed string values or highest value for numerics.
5 Mean Mathematical average of the observed numeric values.
6 Median The median of the observed numeric values.
7 Standard Deviation Measure of the amount of variation in observed numeric values.
8 Kurtosis Measure of the ‘tailedness’ of the distribution of observed numeric values.
9 Skewness Measure of the asymmetry of the distribution of observed numeric values.
10 Q1 The first quartile; the central point between the minimum and the median.
11 Q3 The third quartile; the central point between the median and the maximum.
12 Sum Total sum of all observed numeric values.

Profile Profile

You can hover over the (i) button to view the native field properties, which provide detailed information such as the field's type (numeric), size, decimal digits, and whether it allows null values.

Hover Hover

Manage Tags in field details

Tags can now be directly managed in the field profile within the Explore section. Simply access the Field Details panel to create, add, or remove tags, enabling more efficient and organized data management.

Step 1: Log in to your Qualytics account and click the Explore button on the left side panel of the interface.

exlore explore

Step 2: Click on the Profiles tab and select fields.

profiles-tab profiles-tab

Step 3: Click on the specific field that you want to manage tags.

specific specific

A Field Details modal window will appear. Click on the + button to assign tags to the selected field.

field field

Step 4: You can also create the new tag by clicking on the âž• button.

plus plus

A modal window will appear, providing the options to create the tag. Enter the required values to get started.

modal-window modal-window

For more information on creating tags, refer to the Add Tag section.

Filter and Sort Fields

Filter and Sort options allow you to organize your fields by various criteria, such as Name, Checks, Completeness, Created Date, and Tags. You can also apply filters to refine your list of fields based on Type and Tags

Sort

You can sort your checks by Anomalies, Checks, Completeness, Created Date, Name, Quality Score, and Type to easily organize and prioritize them according to your needs.

Sort Sort

No Sort By Description
1 Anomalies Sorts fields based on the number of detected anomalies.
2 Checks Sorts fields by the number of active validation checks applied.
3 Completeness Sorts fields based on their data completeness percentage.
4 Created Date Sorts fields by the date they were created, showing the newest or oldest fields first.
5 Name Sorts fields alphabetically by their names.
6 Quality Score Sorts fields based on their quality score, indicating the reliability of the data in the field.
7 Type Sorts fields based on their data type (e.g., string, boolean, etc.).

Sort Sort

Whatever sorting option is selected, you can arrange the data either in ascending or descending order by clicking the caret button next to the selected sorting criteria.

caret caret

Filter

You can filter your checks based on values like Type and Tag to easily organize and prioritize them according to your needs.

caret caret

No Filter Description
1 Type Filters fields based on the data type (e.g., string, boolean, date, etc.).
2 Tag Select this to filter the fields based on specific tags, such as Healthcare, Compliance, or Sensitive.

Types of Transformations

Cleaned Entity Name

This transformation removes common business signifiers from entity names, making your data cleaner and more uniform.

Options for Cleaned Entity Name

REF. FIELDS ACTIONS
1. Drop from Suffix Add a unique name for your computed field.
2. Drop from Prefix Removes specified terms from the beginning of the entity name.
3. Drop from Interior Removes specified terms from the beginning of the entity name.
4. Additional Terms to Drop (Custom) Allows you to specify additional terms that should be dropped from the entity name.
5. Terms to Ignore (Custom) Designate terms that should be ignored during the cleaning process.

Example for Cleaned Entity Name

Example Input Transformation Output
1 "TechCorp, Inc." Drop from Suffix: "Inc." "TechCorp"
2 "Global Services Ltd." Drop from Prefix: "Global" "Services Ltd."
3 "Central LTD & Finance Co." Drop from Interior: "LTD" "Central & Finance Co."
4 "Eat & Drink LLC" Additional Terms to Drop: "LLC", "&" "Eat Drink"
5 "ProNet Solutions Ltd." Terms to Ignore: "Ltd." "ProNet Solutions"

Convert Formatted Numeric

This transformation converts formatted numeric values into a plain numeric format, stripping out any characters like commas or parentheses that are not numerically significant.

Example for Convert Formatted Numeric

Example Input Transformation Output
1 "$1,234.56" Remove non-numeric characters: ",", "$" "1234.56"
2 "(2020)" Remove non-numeric characters: "(", ")" "-2020"
3 "100%" Remove non-numeric characters: "%" "100"

Custom Expression

Enables the creation of a field based on a custom computation using Spark SQL. This is useful for applying complex logic or transformations that are not covered by other types.

Using Custom Expression:

You can combine multiple fields, apply conditional logic, or use any valid Spark SQL functions to derive your new computed field.

Example: To create a field that sums two existing fields, you could use the expression SUM(field1, field2).

Example for Custom Expression

Example Input Fields Custom Expression Output
1 field1 = 10, field2 = 20 SUM(field1, field2) 30
2 salary = 50000, bonus = 5000 salary + bonus 55000
3 hours = 8, rate = 15.50 hours * rate 124
4 status = 'active', score = 85 CASE WHEN status = 'active' THEN score ELSE 0 END 85