Testing Data Products with Computational Policies

Testing a data product is essential to verify its quality and readiness for production. This process involves evaluating output ports, workloads, and compliance with data policies to ensure reliable and meaningful results. By running these tests, you can identify and resolve issues early, saving time and improving product trustworthiness and overall data governance.

1. Introduce Data Product Testing

Okay, we are going to look at how to test a Data Product.

2. Describe Product Components

As an example, this is a Data Product composed of two output ports and one workload.

3. Access Product Interface

Each component contributes to the overall Data Product descriptor.

4. Explain Descriptor Creation

Witboost automatically creates the full descriptor based on the target environment.

5. Test

This sidebar entry opens the test panel.

6. Run a test

Once here, we can run a test to check whether our Data Product is ready to be promoted to production.

7. Initiate Test Run

Let's run it.

8. Test Results

We now get the results from all the policies that have been enabled for the Data Product.

9. Report Test Result

Each policy can return one of three outcomes: failure, success, or warning.
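
The three outcomes can be modelled in a few lines. The sketch below is illustrative only and is not Witboost's API: the policy names are hypothetical, and the rule that only a failure blocks promotion (while warnings do not) is an assumption, not something stated in this walkthrough.

```python
from collections import Counter
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    WARNING = "warning"
    FAILURE = "failure"


def summarize(results: dict[str, Outcome]) -> Counter:
    """Count how many policies ended in each outcome."""
    return Counter(results.values())


def passes_gate(results: dict[str, Outcome]) -> bool:
    """Assumption: only a failure blocks promotion; warnings do not."""
    return all(outcome is not Outcome.FAILURE for outcome in results.values())


# Hypothetical policy names and outcomes, for illustration only.
results = {
    "data-quality": Outcome.FAILURE,
    "graphql-output-ports": Outcome.WARNING,
    "naming-convention": Outcome.SUCCESS,
}
summary = summarize(results)
gate_open = passes_gate(results)
```

With these sample results the gate stays closed because one policy failed; whether a warning should also block promotion is a governance choice.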

10. Identify Missing Constraints

In this case, we can see that we didn't implement Data Quality and that we are not fully compliant with architectural requirements, such as exposing output ports with GraphQL.

11. Metadata Issues

We also have some issues with metadata compliance and Dora classification. Let's check for more details.

12. Review Policy Problems

To better understand the issue, we can check the details. In this specific case, it is clear that we are missing one field.
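
A check of this kind can be sketched as a function that lists the required metadata fields that are absent or empty. This is a minimal illustration, not the actual policy: the field names and the flat dictionary layout are hypothetical and do not reflect the real descriptor format.

```python
def missing_fields(metadata: dict, required: list[str]) -> list[str]:
    """Return the required keys that are absent or empty in the metadata."""
    empty_values = (None, "", [], {})
    return [key for key in required if metadata.get(key) in empty_values]


# Hypothetical metadata and required fields, for illustration only.
metadata = {"name": "Orders", "domain": "sales", "owner": ""}
required = ["name", "domain", "owner", "classification"]
missing = missing_fields(metadata, required)
```

Here `missing` contains `owner` (present but empty) and `classification` (absent), which is the level of detail a policy report needs to make the problem actionable.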

13. Close the details

14. Description Deficiency

In the same way, we are missing meaningful descriptions in our data contract.

15. Schema Issues

Descriptions are really important to create the proper context and business documentation around Data Products.
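
What counts as a "meaningful" description is up to each organization. As a rough sketch, a policy could flag every schema column whose description is missing or too short. The three-word threshold, the column layout, and the sample columns below are all assumptions made for illustration.

```python
MIN_WORDS = 3  # assumption: shorter descriptions are treated as not meaningful


def weak_descriptions(columns: list[dict]) -> list[str]:
    """Return the names of columns whose description is missing or too short."""
    flagged = []
    for column in columns:
        description = (column.get("description") or "").strip()
        if len(description.split()) < MIN_WORDS:
            flagged.append(column["name"])
    return flagged


# Hypothetical data contract schema, for illustration only.
columns = [
    {"name": "customer_id", "description": "Unique identifier assigned to each customer"},
    {"name": "amount", "description": "amount"},
    {"name": "created_at"},
]
flagged = weak_descriptions(columns)
```

A word count is a crude proxy, but it already catches empty descriptions and ones that only repeat the column name.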

16. Open Descriptor Details

17. Discuss Data Duplication Issue

From this other policy, we can also see that we are about to duplicate too much information that is already present in the Data Product Marketplace.

18. Identify Duplication in Product

This policy compares the schemas of the Data Product we are testing with all the schemas that have already been published in the Marketplace.

19. Relate Duplication to Churn

We can also explore which of the existing Data Products are most similar to the one we are developing. It is important to detect this kind of issue before we complete all the development work.
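
The walkthrough does not say how the policy measures similarity. One simple way to picture it is Jaccard similarity over column names, used here purely as an assumption: each published schema is scored against the schema under test and the results are ranked. The product and column names are hypothetical.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Share of column names the two schemas have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(candidate: set[str], published: dict[str, set[str]]) -> list[tuple[str, float]]:
    """Rank published schemas by similarity to the candidate, highest first."""
    scores = [(name, jaccard(candidate, cols)) for name, cols in published.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical schemas, for illustration only.
candidate = {"customer_id", "order_id", "amount", "currency"}
published = {
    "Orders": {"order_id", "customer_id", "amount", "order_date"},
    "Customers": {"customer_id", "email", "country"},
}
ranking = most_similar(candidate, published)
```

In this sample, "Orders" ranks first because it shares three of five distinct column names with the candidate. A real policy would likely also consider data types or semantics, not just names.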

Testing these areas and creating a quality gate helps maintain high standards and reliable Data Products, even when they are developed in a highly decentralized way.
