Data Contract Guardian

Infrastructure template

The infrastructure template of a guardian can be registered just like other infrastructure templates by calling the templates registration endpoint. To configure the template as a data contract guardian template, you must include a properties.dataContractGuardian object in the request body along with the associated governance policies.

info

For detailed information about the endpoint and the properties.dataContractGuardian field, refer to the Builder's API reference.

When configuring the properties.dataContractGuardian object, you need to specify a list of policies from the Computational Governance Platform that will be associated with the guardian template. You can include at most one policy per system resource type.

For example, if you intend to apply the infrastructure template to data contracts contained in systems with resource type Data Product and in systems with resource type BI Project, you need to set up a policy for each of them.

You have two options for specifying policies:

  1. Automatic creation: provide the policy attributes, and Witboost will create it for you
  2. Manual creation: alternatively, you can manually create the passive policy in the Computational Governance Platform and provide its unique ID in your registration request

Example:

{
// ...
"properties": {
"dataContractGuardian": {
"policies": [
{
"resourceType": "dataproduct",
// This policy will be automatically created with the provided name and description
"spec": {
"name": "Data Product Guardian Policy",
"description": "Passive policy for monitoring Data Product contracts."
}
},
{
"resourceType": "biproject",
// This provides a PASSIVE policy already registered in the Computational Governance Platform
"spec": {
"id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
}
}
]
}
}
}
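
The request body above can also be assembled programmatically. The following is a minimal sketch, not part of the Witboost tooling: the helper names (`new_policy`, `existing_policy`, `build_guardian_properties`) are hypothetical, and only the `properties.dataContractGuardian.policies` structure is taken from the example. It rejects a second policy for the same resource type, mirroring the one-policy-per-resource-type rule described earlier.

```python
import json


def new_policy(resource_type, name, description):
    """Policy entry asking Witboost to create the policy automatically."""
    return {"resourceType": resource_type,
            "spec": {"name": name, "description": description}}


def existing_policy(resource_type, policy_id):
    """Policy entry referencing an already registered passive policy."""
    return {"resourceType": resource_type, "spec": {"id": policy_id}}


def build_guardian_properties(policies):
    """Build the `properties` object, allowing one policy per resource type."""
    seen = set()
    for policy in policies:
        resource_type = policy["resourceType"]
        if resource_type in seen:
            raise ValueError(
                f"more than one policy for resource type '{resource_type}'")
        seen.add(resource_type)
    return {"dataContractGuardian": {"policies": list(policies)}}


properties = build_guardian_properties([
    new_policy("dataproduct", "Data Product Guardian Policy",
               "Passive policy for monitoring Data Product contracts."),
    existing_policy("biproject", "a1b2c3d4-e5f6-7890-abcd-ef1234567890"),
])

# The remaining fields of the registration request are omitted here.
print(json.dumps({"properties": properties}, indent=2))
```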

Guardian provisioning

When the Provisioning Coordinator receives a deployment request for a guardian component, it checks that the related infrastructure template is registered and that it is a guardian infrastructure template. If either check fails, an error is emitted and the provisioning plan is terminated.

If the infrastructure template is correctly registered, the Coordinator retrieves the related passive policy based on the resource type of the system being deployed.

The ID of this policy is then included in the descriptor of the guardian component (or subcomponent), which is sent to the guardian's tech adapter in the provisioning request. Specifically, the passive policy ID is available as a string under the property info.privateInfo.__dataContractGuardian.policyId.
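
A tech adapter might read the policy ID as sketched below. This is an illustration only: it assumes the guardian component descriptor has already been parsed into a Python dictionary, and the function name and the abridged sample descriptor are hypothetical. Only the property path comes from the text above.

```python
def get_guardian_policy_id(component_descriptor):
    """Return the passive policy ID that the Coordinator added to the descriptor."""
    try:
        policy_id = (component_descriptor["info"]["privateInfo"]
                     ["__dataContractGuardian"]["policyId"])
    except (KeyError, TypeError) as exc:
        raise ValueError(
            "missing info.privateInfo.__dataContractGuardian.policyId") from exc
    if not isinstance(policy_id, str) or not policy_id:
        raise ValueError("policyId must be a non-empty string")
    return policy_id


# Hypothetical, heavily abridged guardian component descriptor.
descriptor = {
    "id": "urn:dmb:cmp:domain:producer-system:0:data-contract:guardian",
    "info": {"privateInfo": {"__dataContractGuardian": {
        "policyId": "a1b2c3d4-e5f6-7890-abcd-ef1234567890"}}},
}
print(get_guardian_policy_id(descriptor))
```

Failing early with a clear error when the property is missing makes it easier to spot a template that was not registered as a guardian template.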

Guardian policy

Once the guardian policy ID is received and the data contract is successfully deployed, the guardian can start monitoring the data contract and periodically send monitoring results as passive policy results.

To submit these results, the guardian will make a POST request to the Computational Governance Platform's endpoint /v1/computational-governance/policies/{policyId}/evaluation-results:

  • path param policyId is the passive policy ID received from the Coordinator
  • query param environment is the environment where guardian and data contract are deployed
  • request body (JSON array) — must contain a single object with the following properties:
    • resource — details about the data contract resource (i.e., the component or subcomponent with __dataContractEnabled: true)
      • id — data contract URN (component or subcomponent)
      • displayName — data contract name (optional)
      • descriptor — data contract descriptor. Can be left blank (empty string) on guardian policies
    • result — monitoring results
      • satisfiesPolicy — if false, the data contract is considered violated, if true the data contract is marked as compliant. Note: This flag takes precedence over the details property when declaring the compliance status of the data contract. In other words, even if the details provide a different indication, the satisfiesPolicy flag will ultimately determine whether the data contract is deemed compliant or in violation
      • errors — can be left empty on guardian policies
      • details — monitoring details. For guardian policies, this property must adhere to a specific JSON schema — detailed below — that outlines the required structure and data types
note

Refer to the Computational Governance Platform's API reference for additional details on the endpoint.
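
The request described above could be prepared as in the following sketch, which uses only the Python standard library. The base URL, the function name, and the choice of an empty list for `errors` are assumptions made for illustration; authentication is not covered by this page and is left out. The sketch builds the request without sending it.

```python
import json
import urllib.parse
import urllib.request


def build_evaluation_request(base_url, policy_id, environment, contract_urn,
                             satisfies_policy, details, display_name=None):
    """Prepare the POST request that submits one monitoring result."""
    resource = {"id": contract_urn, "descriptor": ""}  # descriptor may be blank
    if display_name is not None:
        resource["displayName"] = display_name
    body = [{  # JSON array containing a single object
        "resource": resource,
        "result": {
            "satisfiesPolicy": satisfies_policy,
            "errors": [],  # assumption: an empty list is accepted here
            "details": details,
        },
    }]
    query = urllib.parse.urlencode({"environment": environment})
    url = (f"{base_url.rstrip('/')}/v1/computational-governance/policies/"
           f"{urllib.parse.quote(policy_id, safe='')}/evaluation-results?{query}")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_evaluation_request(
    "https://witboost.example.com",  # hypothetical base URL
    "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
    "development",
    "urn:dmb:cmp:domain:producer-system:0:data-contract",
    satisfies_policy=True,
    details={"results": {"root": []}},
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```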

Result details schema

warning

If the details object does not comply with the specified JSON schema, the Witboost Marketplace will be unable to correctly render the status of the data contract. In such cases, an error message will be displayed, indicating that the monitoring results cannot be processed.

{
"$schema": "http://json-schema.org/draft-07/schema",
"title": "Guardian policy result details",
"type": "object",
"required": ["results"],
"properties": {
"results": {
"type": "object",
"patternProperties": {
".*": {
"type": "array",
"items": {
"$ref": "#/$defs/validation"
}
}
},
"description": "An object where each key represents the identifier of a component within the data contract and the related value is an array of validation results for that component. The `root` alias can be used to target the primary component with the `__dataContractEnabled: true` property in its descriptor. Its (consumable) subcomponents are individually addressable through their respective URNs."
},
"notes": {
"$ref": "#/$defs/notes"
}
},
"$defs": {
"notes": {
"type": "object",
"properties": {
"errorSummary": {
"type": "object",
"description": "Additional information to be attached to a data contract violation",
"properties": {
"message": {
"type": "string",
"description": "A display message that summarizes the current data contract issues",
"example": "Several issues detected on the data contract"
}
}
}
}
},
"validation": {
"type": "object",
"description": "Validation pertaining to a specific aspect of the data contract. This may refer to a particular section, such as the 'SLA section', which can include nested details, or it may specify a distinct requirement defined within the data contract, such as 'UP Time.' This structure facilitates precise referencing of both major sections and individual validation criteria.",
"properties": {
"key": {
"type": "string",
"description": "A unique key used to identify the validated element",
"example": "sla"
},
"name": {
"type": "string",
"description": "Display name of the validated aspect",
"example": "Service Level Agreement"
},
"description": {
"type": "string",
"description": "Description of the validated aspect"
},
"compliant": {
"type": "boolean",
"description": "Required when the validated aspect is not a section. Indicates whether the data contract meets the compliance criteria for this specific requirement."
},
"issues": {
"type": "array",
"description": "When `compliant` is false, this array lists the specific issues identified in relation to the data contract requirement",
"items": {
"type": "string",
"example": "Late data"
}
},
"children": {
"type": "array",
"description": "When the aspect represents a section, this array contains the validations for the elements included within the section",
"items": {
"$ref": "#/$defs/validation"
}
}
},
"required": ["key"],
"oneOf": [
{
"required": ["compliant"],
"not": {
"required": ["children"]
}
},
{
"required": ["children"],
"not": {
"required": ["compliant"]
}
}
],
"dependencies": {
"issues": ["compliant"]
}
}
}
}
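
Before submitting a result, a guardian may want to check the main structural rules of this schema locally. The sketch below is a hand-written, partial check with hypothetical function names: it covers the required `results` object, the required `key`, the "exactly one of `compliant` or `children`" rule, and the dependency of `issues` on `compliant`. It does not replace validation with a full JSON Schema validator.

```python
def validation_errors(node, path):
    """Return the rule violations found in one `validation` node and its children."""
    if not isinstance(node, dict):
        return [f"{path}: must be an object"]
    errors = []
    if not isinstance(node.get("key"), str):
        errors.append(f"{path}: 'key' is required and must be a string")
    has_compliant = "compliant" in node
    has_children = "children" in node
    if has_compliant == has_children:  # both present or both missing
        errors.append(f"{path}: exactly one of 'compliant' or 'children' is required")
    if has_compliant and not isinstance(node["compliant"], bool):
        errors.append(f"{path}: 'compliant' must be a boolean")
    if "issues" in node and not has_compliant:
        errors.append(f"{path}: 'issues' requires 'compliant'")
    if has_children:
        if not isinstance(node["children"], list):
            errors.append(f"{path}: 'children' must be an array")
        else:
            for i, child in enumerate(node["children"]):
                errors += validation_errors(child, f"{path}.children[{i}]")
    return errors


def details_errors(details):
    """Return the rule violations found in a `details` object."""
    results = details.get("results") if isinstance(details, dict) else None
    if not isinstance(results, dict):
        return ["details: 'results' is required and must be an object"]
    errors = []
    for target, validations in results.items():
        if not isinstance(validations, list):
            errors.append(f"results[{target!r}]: must be an array")
            continue
        for i, node in enumerate(validations):
            errors += validation_errors(node, f"results[{target!r}][{i}]")
    return errors


details = {"results": {"root": [
    {"key": "sla", "name": "Service Level Agreement", "children": [
        {"key": "upTime", "name": "UP Time", "compliant": True},
    ]},
]}}
print(details_errors(details))  # prints []
```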

Key points:

  • complete status reporting — the details object should report the compliance status for each data contract requirement using the compliant flag
  • previous results are overridden — each new monitoring result overwrites any prior results for the same data contract
  • subcomponent validations — when a data contract component includes consumable subcomponents, the monitoring result may include validations specific to those subcomponents by referencing them by URN in the results map. The user manual refers to this scenario as Distributed data contract definition.

Sample scenario #1

Data contract descriptor:

id: urn:dmb:cmp:domain:producer-system:0:data-contract
name: Data Contract
consumable: false
__dataContractEnabled: true
dataContract:
  schema:
    - name: message-field-1
      dataType: string
    - name: message-field-2
      dataType: boolean
      constraint: NOT_NULL
  SLA:
    upTime: 99.9%
  # ...
components: # subcomponents
  - id: urn:dmb:cmp:domain:producer-system:0:data-contract:landing-topic
    name: Landing Topic
    kind: storage
    technology: Kafka
    consumable: false
    # ...

  - id: urn:dmb:cmp:domain:producer-system:0:data-contract:non-compliant-topic
    name: Non-Compliant Topic
    kind: storage
    technology: Kafka
    consumable: false
    # ...

  - id: urn:dmb:cmp:domain:producer-system:0:data-contract:compliant-topic
    name: Compliant Topic
    kind: outputport
    technology: Kafka
    consumable: true
    # ...

  - id: urn:dmb:cmp:domain:producer-system:0:data-contract:guardian
    name: Data Contract Guardian
    kind: workload
    consumable: false
    # ...
  • the parent component is the data contract (__dataContractEnabled: true)
  • the parent component defines a data contract specification (dataContract)
  • there is a single consumable subcomponent (Compliant Topic), which, however, does not define any data contract specification (dataContract)

We expect this data contract's monitoring results to target only the parent component (i.e., the root component of the data contract):

[
{
"resource": {
"id": "urn:dmb:cmp:domain:producer-system:0:data-contract", // the data contract id (URN of the component with __dataContractEnabled: true)
"descriptor": ""
},
"result": {
"satisfiesPolicy": false, // there's a violation, so the policy should be marked as not satisfied
"details": {
"results": {
"root": [
{
"key": "schema",
"name": "Schema",
"description": "Events schema",
"children": [
{
"key": "message-field-1",
"name": "Message Field 1",
"compliant": true
},
{
"key": "message-field-2",
"name": "Message Field 2",
"compliant": false, // violation
"issues": ["Received some events without this required field"]
}
]
},
{
"key": "sla",
"name": "Service Level Agreement",
"children": [
{
"key": "upTime",
"name": "UP Time",
"compliant": true
}
]
}
]
},
"notes": {
"errorSummary": {
"message": "Data contract violations detected. Some events are currently under investigation, so they are not currently available in the Compliant Topic."
}
}
}
}
}
]
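
Since satisfiesPolicy takes precedence over details, a guardian has to keep the two consistent itself. One possible way, sketched below with hypothetical names, is to walk the details tree, collect every leaf reported with `compliant: false`, and derive the flag from that list. The sample data is an abridged copy of the result above.

```python
def failing_leaves(details):
    """Yield (target, path, issues) for every leaf with `compliant: false`."""
    def walk(node, prefix):
        path = prefix + [node["key"]]
        if "children" in node:  # a section: descend into its elements
            for child in node["children"]:
                yield from walk(child, path)
        elif node.get("compliant") is False:
            yield "/".join(path), node.get("issues", [])

    for target, validations in details["results"].items():
        for node in validations:
            for path, issues in walk(node, []):
                yield target, path, issues


details = {"results": {"root": [
    {"key": "schema", "name": "Schema", "children": [
        {"key": "message-field-1", "compliant": True},
        {"key": "message-field-2", "compliant": False,
         "issues": ["Received some events without this required field"]},
    ]},
    {"key": "sla", "name": "Service Level Agreement", "children": [
        {"key": "upTime", "compliant": True},
    ]},
]}}

violations = list(failing_leaves(details))
satisfies_policy = not violations  # False here: one leaf is not compliant
for target, path, issues in violations:
    print(target, path, issues)
```

The collected paths can also be used to compose the optional notes.errorSummary.message.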

Sample scenario #2

Data contract descriptor:

id: urn:dmb:cmp:domain:producer-system:0:tx-summary-data-contract
name: Daily Transaction Summary Data Contract
consumable: false
__dataContractEnabled: true
dataContract:
  SLA:
    upTime: 99.9%
    timeliness: 1 day
  # ...
components: # subcomponents
  - id: urn:dmb:cmp:domain:producer-system:0:tx-summary-data-contract:financial-reporting
    name: Financial Reporting Summary
    kind: outputport
    consumable: true
    dataContract:
      schema:
        - name: date
          dataType: date
          description: 'Date of the transaction summary'
        - name: total_transactions
          dataType: int
          description: 'Total number of transactions for the day'
        - name: total_volume
          dataType: float
          description: 'Total transaction volume in the specified currency'
        # ...
    # ...
  - id: urn:dmb:cmp:domain:producer-system:0:tx-summary-data-contract:fraud-trend-reporting
    name: Fraud Trend Analysis Summary
    kind: outputport
    consumable: true
    dataContract:
      schema:
        - name: date
          dataType: date
          description: 'Date of the transaction summary'
        - name: total_transactions
          dataType: int
          description: 'Total number of transactions for the day'
        - name: high_risk_transaction_count
          dataType: int
          description: 'Count of transactions flagged as high risk'
        # ...
    # ...
  • the parent component is the data contract (__dataContractEnabled: true)
  • the parent component defines a data contract specification (dataContract)
  • there are two consumable subcomponents which complement the parent data contract with their own specification (dataContract)

We expect this data contract's monitoring results to reference three targets:

  • root
  • urn:dmb:cmp:domain:producer-system:0:tx-summary-data-contract:financial-reporting
  • urn:dmb:cmp:domain:producer-system:0:tx-summary-data-contract:fraud-trend-reporting
[
{
"resource": {
"id": "urn:dmb:cmp:domain:producer-system:0:tx-summary-data-contract", // the data contract id (URN of the component with __dataContractEnabled: true)
"descriptor": ""
},
"result": {
"satisfiesPolicy": true, // no violations, the policy is satisfied
"details": {
"results": {
"root": [
{
"key": "sla",
"name": "Service Level Agreement",
"children": [
{
"key": "upTime",
"name": "UP Time",
"compliant": true
},
{
"key": "timeliness",
"name": "Timeliness",
"compliant": true
}
]
}
],
"urn:dmb:cmp:domain:producer-system:0:tx-summary-data-contract:financial-reporting": [
{
"key": "schema",
"name": "Schema",
"children": [
// Schema validations
// ...
]
}
],
"urn:dmb:cmp:domain:producer-system:0:tx-summary-data-contract:fraud-trend-reporting": [
{
"key": "schema",
"name": "Schema",
"children": [
// Schema validations
// ...
]
}
]
}
}
}
}
]
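
The targets expected in scenarios #1 and #2 can be derived from the descriptor. The sketch below assumes a rule inferred from those two scenarios rather than stated by the platform: a subcomponent is addressed by its URN only when it is consumable and defines its own dataContract section. The function name is hypothetical and the descriptors are abridged and already parsed into dictionaries.

```python
def expected_targets(contract):
    """Return the keys expected in `details.results` for one data contract."""
    targets = ["root"]  # alias for the component with __dataContractEnabled: true
    for sub in contract.get("components", []):
        # Assumption inferred from scenarios #1 and #2, not a documented rule.
        if sub.get("consumable") and "dataContract" in sub:
            targets.append(sub["id"])
    return targets


base = "urn:dmb:cmp:domain:producer-system:0"

scenario_1 = {
    "id": f"{base}:data-contract",
    "__dataContractEnabled": True,
    "dataContract": {"SLA": {"upTime": "99.9%"}},
    "components": [
        {"id": f"{base}:data-contract:landing-topic", "consumable": False},
        {"id": f"{base}:data-contract:compliant-topic", "consumable": True},
    ],
}

scenario_2 = {
    "id": f"{base}:tx-summary-data-contract",
    "__dataContractEnabled": True,
    "dataContract": {"SLA": {"upTime": "99.9%", "timeliness": "1 day"}},
    "components": [
        {"id": f"{base}:tx-summary-data-contract:financial-reporting",
         "consumable": True, "dataContract": {"schema": []}},
        {"id": f"{base}:tx-summary-data-contract:fraud-trend-reporting",
         "consumable": True, "dataContract": {"schema": []}},
    ],
}

print(expected_targets(scenario_1))
print(expected_targets(scenario_2))
```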

Sample scenario #3

Let’s consider a descriptor similar to that of scenario #2, but instead of defining a parent data contract with two consumable subcomponents, we will designate each consumable subcomponent as an independent data contract.

id: urn:dmb:cmp:domain:producer-system:0:tx-summary
name: Daily Transaction Summary
consumable: false
# ...
components: # subcomponents
  - id: urn:dmb:cmp:domain:producer-system:0:tx-summary:financial-reporting-data-contract
    name: Financial Reporting Summary
    kind: outputport
    consumable: true
    __dataContractEnabled: true
    dataContract:
      schema:
        - name: date
          dataType: date
          description: 'Date of the transaction summary'
        - name: total_transactions
          dataType: int
          description: 'Total number of transactions for the day'
        - name: total_volume
          dataType: float
          description: 'Total transaction volume in the specified currency'
        # ...
    # ...
  - id: urn:dmb:cmp:domain:producer-system:0:tx-summary:fraud-trend-reporting-data-contract
    name: Fraud Trend Analysis Summary
    kind: outputport
    consumable: true
    __dataContractEnabled: true
    dataContract:
      schema:
        - name: date
          dataType: date
          description: 'Date of the transaction summary'
        - name: total_transactions
          dataType: int
          description: 'Total number of transactions for the day'
        - name: high_risk_transaction_count
          dataType: int
          description: 'Count of transactions flagged as high risk'
        # ...
    # ...

The parent component contains two data contracts but it is NOT a data contract itself.

The two data contracts operate independently, and could be guarded by different guardians, each with its own policies. As a result, their monitoring results are submitted separately, and each subcomponent serves as the root for its own monitoring results.

Sample monitoring result for data contract urn:dmb:cmp:domain:producer-system:0:tx-summary:financial-reporting-data-contract:

[
{
"resource": {
"id": "urn:dmb:cmp:domain:producer-system:0:tx-summary:financial-reporting-data-contract", // the data contract id (URN of the component with __dataContractEnabled: true)
"descriptor": ""
},
"result": {
"satisfiesPolicy": true,
"details": {
"results": {
"root": [
{
"key": "schema",
"name": "Schema",
"children": [
// Schema validations on Financial Reporting Data Contract
// ...
]
}
]
}
}
}
}
]

Sample monitoring result for data contract urn:dmb:cmp:domain:producer-system:0:tx-summary:fraud-trend-reporting-data-contract:

[
{
"resource": {
"id": "urn:dmb:cmp:domain:producer-system:0:tx-summary:fraud-trend-reporting-data-contract", // the data contract id (URN of the component with __dataContractEnabled: true)
"descriptor": ""
},
"result": {
"satisfiesPolicy": true,
"details": {
"results": {
"root": [
{
"key": "schema",
"name": "Schema",
"children": [
// Schema validations on Fraud Trend Analysis Data Contract
// ...
]
}
]
}
}
}
}
]
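
Scenario #3 can be summarized in code: every component or subcomponent flagged with `__dataContractEnabled: true` is its own data contract and gets its own request body, with `root` pointing at that component. The following sketch uses hypothetical helper names and an abridged descriptor that is assumed to be already parsed into a dictionary.

```python
def find_data_contracts(component):
    """Yield every component or subcomponent with __dataContractEnabled: true."""
    if component.get("__dataContractEnabled") is True:
        yield component
    for sub in component.get("components", []):
        yield from find_data_contracts(sub)


def payload_for(contract, satisfies_policy, root_validations):
    """Build the request body for one independent data contract."""
    return [{
        "resource": {"id": contract["id"], "descriptor": ""},
        "result": {
            "satisfiesPolicy": satisfies_policy,
            "details": {"results": {"root": root_validations}},
        },
    }]


base = "urn:dmb:cmp:domain:producer-system:0:tx-summary"
parent = {
    "id": base,  # the parent is NOT a data contract itself
    "consumable": False,
    "components": [
        {"id": f"{base}:financial-reporting-data-contract",
         "consumable": True, "__dataContractEnabled": True},
        {"id": f"{base}:fraud-trend-reporting-data-contract",
         "consumable": True, "__dataContractEnabled": True},
    ],
}

contracts = list(find_data_contracts(parent))
payloads = [payload_for(c, True, [{"key": "schema", "children": []}])
            for c in contracts]
for payload in payloads:
    print(payload[0]["resource"]["id"])
```

Each payload would then be submitted separately, possibly by different guardians and against different policies.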