
Data Catalog Plugin API (2.2.0)

A Data Catalog Plugin is a microservice responsible for provisioning metadata into, and unprovisioning it from, a Data Catalog at the end of the deployment phase of a resource through Witboost.

It takes in a system descriptor enriched with components' provisioning results and returns the result of the provisioned Data Catalog entities.

The Data Catalog Plugin contract is similar to that of a technology adapter, with a few differences. In particular, it receives the whole system descriptor, already enriched with the technology adapters' deployment information.

Base URL

A Data Catalog Plugin is not meant to be called directly by users from their browsers; instead, it is called by the Witboost Control Plane. The base URL of the API therefore depends on the network address assigned by Kubernetes, e.g. http://<service k8s endpoint name>.<namespace>:<port>. The base URL provided below is only an example; yours may differ.
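As a minimal sketch of the address pattern above, the helper below composes a base URL from the Kubernetes service name, namespace, and port. The function name and the sample values are hypothetical and only illustrate the pattern; they are not part of the Plugin API.

```python
def plugin_base_url(service: str, namespace: str, port: int, scheme: str = "http") -> str:
    """Compose the in-cluster base URL: <scheme>://<service>.<namespace>:<port>."""
    if not service or not namespace:
        raise ValueError("service and namespace must be non-empty")
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return f"{scheme}://{service}.{namespace}:{port}"


# Hypothetical service coordinates, for illustration only.
print(plugin_base_url("datacatalog-plugin", "witboost", 8080))
```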

DataCatalogPlugin

Validate a Data Catalog Entity provision request (async)

Validates the output port metadata attached to the provisioning request.

In particular, it validates the format and the existence of glossary terms and classification tags.
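A rough sketch of such a check, under stated assumptions: the output port has already been parsed from YAML into a dictionary, its columns live under dataContract.schema as in the request sample, each tag is a mapping with a tagFQN key, and the set of tags known to the catalog is supplied by the caller. The tag shape, the function name, and the known-tag lookup are assumptions, not something this specification defines.

```python
from typing import Any


def validate_output_port_tags(output_port: dict[str, Any], known_tags: set[str]) -> list[str]:
    """Return one error message per malformed or unknown column tag."""
    errors: list[str] = []
    # dataContract, schema and tags may be missing or null in a descriptor.
    columns = (output_port.get("dataContract") or {}).get("schema") or []
    for column in columns:
        name = column.get("name")
        for tag in column.get("tags") or []:
            # Assumed tag shape: {"tagFQN": "<fully qualified name>"}.
            fqn = tag.get("tagFQN") if isinstance(tag, dict) else None
            if not isinstance(fqn, str) or not fqn:
                errors.append(f"column {name!r}: malformed tag {tag!r}")
            elif fqn not in known_tags:
                errors.append(f"column {name!r}: unknown tag {fqn!r}")
    return errors


# Hypothetical output port and catalog contents.
port = {
    "dataContract": {
        "schema": [
            {"name": "id", "dataType": "string", "tags": []},
            {"name": "counterpartyId", "dataType": "string", "tags": [{"tagFQN": "PII.Sensitive"}]},
        ]
    }
}
print(validate_output_port_tags(port, known_tags={"PII.Sensitive"}))
```

An empty list means both the format check and the existence check passed for every column.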

Request Body schema: application/json
required

Details of a provisioning request to be validated

descriptorKind
required
string (DescriptorKind)
Enum: "DATAPRODUCT_DESCRIPTOR" "DATAPRODUCT_DESCRIPTOR_WITH_RESULTS"
descriptor
required
string

A provisioning request in YAML format
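The sketch below illustrates how the two required fields fit together: the YAML descriptor travels as a single JSON string, so its line breaks end up escaped as \n in the request body. The helper name and the shortened descriptor are hypothetical; only the field names and enum values come from the schema above.

```python
import json

# Enum values listed for descriptorKind in the schema above.
DESCRIPTOR_KINDS = ("DATAPRODUCT_DESCRIPTOR", "DATAPRODUCT_DESCRIPTOR_WITH_RESULTS")


def build_request_body(descriptor_yaml: str, descriptor_kind: str = "DATAPRODUCT_DESCRIPTOR") -> str:
    """Wrap a YAML descriptor, as a plain string, into the JSON request object."""
    if descriptor_kind not in DESCRIPTOR_KINDS:
        raise ValueError(f"unsupported descriptorKind: {descriptor_kind!r}")
    return json.dumps({"descriptorKind": descriptor_kind, "descriptor": descriptor_yaml})


# A shortened, hypothetical descriptor; a real one carries the full system.
yaml_text = "kind: dataproduct\nid: urn:dmb:dp:finance:cashflow:0\n"
body = build_request_body(yaml_text)
print(body)
```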

Responses

Request samples

Content type
application/json
{
  "descriptorKind": "DATAPRODUCT_DESCRIPTOR",
  "descriptor": "dataProductOwnerDisplayName: John Smith\nprojectOwner: user:john.smith_example.com\nprojectOwnerDisplayName: John Smith\nenvironment: production\ndomain: finance\nprojectKind: system\nkind: dataproduct\ndomainId: urn:dmb:dmn:finance\nid: urn:dmb:dp:finance:cashflow:0\ndescription: Data product representing operating cashflows\ndevGroup: teamalpha\nownerGroup: john.smith_example.com\ndataProductOwner: user:john.smith_example.com\nversion: 0.1.0\nfullyQualifiedName: Cash Flows\nname: CashFlow\ninformationSLA: daily\nmaturity: Approved\nuseCaseTemplateId: urn:dmb:utm:dataproduct-template:0.0.0\ninfrastructureTemplateId: urn:dmb:itm:dataproduct-provisioner:1\nbilling: {}\ntags: []\nspecific: {}\ncomponents:\n - kind: outputport\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-output-port\n description: Output port exposing cashflow data\n name: CashFlow OutputPort\n fullyQualifiedName: null\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:output-port-provisioner:1\n useCaseTemplateId: urn:dmb:utm:output-port-template:1.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n outputPortType: SQL\n creationDate: null\n startDate: null\n processDescription: null\n dataContract:\n schema:\n - name: id\n dataType: string\n tags: []\n - name: direction\n dataType: string\n tags: []\n - name: amount\n dataType: number\n description: monetary value of the cashflow\n tags: []\n - name: counterpartyId\n dataType: string\n description: identifier of the counterparty\n tags: []\n termsAndConditions: \"\"\n endpoint: null\n SLA:\n intervalOfChange: 2 day\n timeliness: 2 day\n upTime: 99.9%\n dataSharingAgreement:\n purpose: \"\"\n billing: \"\"\n security: \"\"\n intendedUsage: \"\"\n limitations: \"\"\n lifeCycle: \"\"\n confidentiality: \"\"\n tags: []\n sampleData:\n columns:\n - id\n - direction\n - amount\n - counterpartyId\n rows:\n - - CF-001\n - Incoming\n - 10000\n - cpty:12345\n - - CF-002\n - Outgoing\n - 5000\n - cpty:67890\n semanticLinking: []\n specific:\n format: Iceberg\n consumable: true\n shoppable: true\n - kind: workload\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-calculation\n description: Workload that calculates cashflows from input deals\n useCaseTemplateId: urn:dmb:utm:workload-template:0.0.0\n infrastructureTemplateId: urn:dmb:itm:workload-provisioner:1\n fullyQualifiedName: CashFlow Calculation\n name: CashFlow Calculation\n technology: Spark\n version: 0.0.0\n dependsOn:\n - urn:dmb:cmp:finance:cashflow:0:storage-component\n readsFrom: []\n tags: []\n specific: {}\n consumable: false\n shoppable: false\n - kind: storage\n id: urn:dmb:cmp:finance:cashflow:0:storage-component\n description: Internal storage for the data product\n name: Storage Component\n fullyQualifiedName: Internal Storage\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:storage-provisioner:0\n useCaseTemplateId: urn:dmb:utm:storage-template:0.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n storageType: Database\n tags: []\n specific:\n bucket: finance-cashflows\n consumable: false\n shoppable: false",
  "removeData": false
}

Response samples

Content type
application/json
"2f4b3b3b-4b3b-4b3b-4b3b-4b3b4b3b4b3b"

Get status of an async validation task

path Parameters
token
required
string
Example: 2a4bb060-27c9-403d-ab5d-d3776cf56c3b

Token obtained after a call to the async validate endpoint

Responses

Response samples

Content type
application/json
{
  "status": "COMPLETED",
  "info": {}
}
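A caller of the async validate endpoint typically holds on to the returned token and checks this status endpoint until the task finishes. The sketch below shows one way to do that without assuming any endpoint path: the HTTP call is injected as a function. The "RUNNING" status value, the function names, and the retry settings are assumptions; only "COMPLETED" appears in the sample above.

```python
import time
from typing import Callable


def wait_for_task(
    fetch_status: Callable[[str], dict],
    token: str,
    interval_s: float = 2.0,
    max_attempts: int = 30,
    running_status: str = "RUNNING",  # assumed name of the in-progress state
) -> dict:
    """Poll fetch_status(token) until the task leaves the running state."""
    for attempt in range(1, max_attempts + 1):
        result = fetch_status(token)
        if result.get("status") != running_status:
            return result
        if attempt < max_attempts:
            time.sleep(interval_s)
    raise TimeoutError(f"task {token} still running after {max_attempts} attempts")


# Stand-in for the real HTTP call: two in-progress answers, then the final one.
_answers = iter([{"status": "RUNNING"}, {"status": "RUNNING"}, {"status": "COMPLETED", "info": {}}])
final = wait_for_task(lambda _token: next(_answers), "2a4bb060-27c9-403d-ab5d-d3776cf56c3b", interval_s=0)
print(final["status"])
```

Injecting the fetch function keeps the polling logic independent of the HTTP client and of the plugin's actual route layout.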

Validate a Data Catalog Entity provision request (sync)

Synchronously validate a provisioning request and return the validation result.

Implement this endpoint to check that the provided resource descriptor conforms to the expected structure and contains all the information needed to deploy it on the infrastructure.

It is highly recommended to implement the asynchronous validation endpoint instead of this one.

Request Body schema: application/json
required

Details of a provisioning request to be validated

descriptorKind
required
string (DescriptorKind)
Enum: "DATAPRODUCT_DESCRIPTOR" "DATAPRODUCT_DESCRIPTOR_WITH_RESULTS"
descriptor
required
string

A provisioning request in YAML format

Responses

Request samples

Content type
application/json
{
  "descriptorKind": "DATAPRODUCT_DESCRIPTOR",
  "descriptor": "dataProductOwnerDisplayName: John Smith\nprojectOwner: user:john.smith_example.com\nprojectOwnerDisplayName: John Smith\nenvironment: production\ndomain: finance\nprojectKind: system\nkind: dataproduct\ndomainId: urn:dmb:dmn:finance\nid: urn:dmb:dp:finance:cashflow:0\ndescription: Data product representing operating cashflows\ndevGroup: teamalpha\nownerGroup: john.smith_example.com\ndataProductOwner: user:john.smith_example.com\nversion: 0.1.0\nfullyQualifiedName: Cash Flows\nname: CashFlow\ninformationSLA: daily\nmaturity: Approved\nuseCaseTemplateId: urn:dmb:utm:dataproduct-template:0.0.0\ninfrastructureTemplateId: urn:dmb:itm:dataproduct-provisioner:1\nbilling: {}\ntags: []\nspecific: {}\ncomponents:\n - kind: outputport\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-output-port\n description: Output port exposing cashflow data\n name: CashFlow OutputPort\n fullyQualifiedName: null\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:output-port-provisioner:1\n useCaseTemplateId: urn:dmb:utm:output-port-template:1.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n outputPortType: SQL\n creationDate: null\n startDate: null\n processDescription: null\n dataContract:\n schema:\n - name: id\n dataType: string\n tags: []\n - name: direction\n dataType: string\n tags: []\n - name: amount\n dataType: number\n description: monetary value of the cashflow\n tags: []\n - name: counterpartyId\n dataType: string\n description: identifier of the counterparty\n tags: []\n termsAndConditions: \"\"\n endpoint: null\n SLA:\n intervalOfChange: 2 day\n timeliness: 2 day\n upTime: 99.9%\n dataSharingAgreement:\n purpose: \"\"\n billing: \"\"\n security: \"\"\n intendedUsage: \"\"\n limitations: \"\"\n lifeCycle: \"\"\n confidentiality: \"\"\n tags: []\n sampleData:\n columns:\n - id\n - direction\n - amount\n - counterpartyId\n rows:\n - - CF-001\n - Incoming\n - 10000\n - cpty:12345\n - - CF-002\n - Outgoing\n - 5000\n - cpty:67890\n semanticLinking: []\n specific:\n format: Iceberg\n consumable: true\n shoppable: true\n - kind: workload\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-calculation\n description: Workload that calculates cashflows from input deals\n useCaseTemplateId: urn:dmb:utm:workload-template:0.0.0\n infrastructureTemplateId: urn:dmb:itm:workload-provisioner:1\n fullyQualifiedName: CashFlow Calculation\n name: CashFlow Calculation\n technology: Spark\n version: 0.0.0\n dependsOn:\n - urn:dmb:cmp:finance:cashflow:0:storage-component\n readsFrom: []\n tags: []\n specific: {}\n consumable: false\n shoppable: false\n - kind: storage\n id: urn:dmb:cmp:finance:cashflow:0:storage-component\n description: Internal storage for the data product\n name: Storage Component\n fullyQualifiedName: Internal Storage\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:storage-provisioner:0\n useCaseTemplateId: urn:dmb:utm:storage-template:0.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n storageType: Database\n tags: []\n specific:\n bucket: finance-cashflows\n consumable: false\n shoppable: false",
  "removeData": false
}

Response samples

Content type
application/json
{
  "valid": true
}
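As a starting point for a synchronous validator, the sketch below checks only the request envelope against the schema above (both required fields present, descriptorKind within the enum, descriptor a non-empty string) and answers in the shape of the response sample. It deliberately does not parse the YAML or inspect the descriptor contents, which a real implementation would also need to do. The function name is hypothetical.

```python
# Enum values listed for descriptorKind in the schema above.
DESCRIPTOR_KINDS = ("DATAPRODUCT_DESCRIPTOR", "DATAPRODUCT_DESCRIPTOR_WITH_RESULTS")


def validate_request_envelope(body: object) -> dict:
    """Check the required fields of the request object; return {"valid": bool}."""
    valid = (
        isinstance(body, dict)
        and body.get("descriptorKind") in DESCRIPTOR_KINDS
        and isinstance(body.get("descriptor"), str)
        and body["descriptor"].strip() != ""
    )
    return {"valid": valid}


print(validate_request_envelope({"descriptorKind": "DATAPRODUCT_DESCRIPTOR", "descriptor": "kind: dataproduct\n"}))
print(validate_request_envelope({"descriptorKind": "UNKNOWN", "descriptor": "kind: dataproduct\n"}))
```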

Provision Data Catalog entity

Provisioning is the operation that deploys the metadata of a system or a single component into a third-party Data Catalog.

This request can be handled in synchronous or asynchronous mode depending on the implementation.

Request Body schema: application/json
required

A Data Product descriptor, enriched with components' provisioning results, wrapped as a string into a simple object

descriptorKind
required
string (DescriptorKind)
Enum: "DATAPRODUCT_DESCRIPTOR" "DATAPRODUCT_DESCRIPTOR_WITH_RESULTS"
descriptor
required
string

A provisioning request in YAML format

Responses

Request samples

Content type
application/json
{
  "descriptorKind": "DATAPRODUCT_DESCRIPTOR",
  "descriptor": "dataProductOwnerDisplayName: John Smith\nprojectOwner: user:john.smith_example.com\nprojectOwnerDisplayName: John Smith\nenvironment: production\ndomain: finance\nprojectKind: system\nkind: dataproduct\ndomainId: urn:dmb:dmn:finance\nid: urn:dmb:dp:finance:cashflow:0\ndescription: Data product representing operating cashflows\ndevGroup: teamalpha\nownerGroup: john.smith_example.com\ndataProductOwner: user:john.smith_example.com\nversion: 0.1.0\nfullyQualifiedName: Cash Flows\nname: CashFlow\ninformationSLA: daily\nmaturity: Approved\nuseCaseTemplateId: urn:dmb:utm:dataproduct-template:0.0.0\ninfrastructureTemplateId: urn:dmb:itm:dataproduct-provisioner:1\nbilling: {}\ntags: []\nspecific: {}\ncomponents:\n - kind: outputport\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-output-port\n description: Output port exposing cashflow data\n name: CashFlow OutputPort\n fullyQualifiedName: null\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:output-port-provisioner:1\n useCaseTemplateId: urn:dmb:utm:output-port-template:1.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n outputPortType: SQL\n creationDate: null\n startDate: null\n processDescription: null\n dataContract:\n schema:\n - name: id\n dataType: string\n tags: []\n - name: direction\n dataType: string\n tags: []\n - name: amount\n dataType: number\n description: monetary value of the cashflow\n tags: []\n - name: counterpartyId\n dataType: string\n description: identifier of the counterparty\n tags: []\n termsAndConditions: \"\"\n endpoint: null\n SLA:\n intervalOfChange: 2 day\n timeliness: 2 day\n upTime: 99.9%\n dataSharingAgreement:\n purpose: \"\"\n billing: \"\"\n security: \"\"\n intendedUsage: \"\"\n limitations: \"\"\n lifeCycle: \"\"\n confidentiality: \"\"\n tags: []\n sampleData:\n columns:\n - id\n - direction\n - amount\n - counterpartyId\n rows:\n - - CF-001\n - Incoming\n - 10000\n - cpty:12345\n - - CF-002\n - Outgoing\n - 5000\n - cpty:67890\n semanticLinking: []\n specific:\n format: Iceberg\n consumable: true\n shoppable: true\n - kind: workload\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-calculation\n description: Workload that calculates cashflows from input deals\n useCaseTemplateId: urn:dmb:utm:workload-template:0.0.0\n infrastructureTemplateId: urn:dmb:itm:workload-provisioner:1\n fullyQualifiedName: CashFlow Calculation\n name: CashFlow Calculation\n technology: Spark\n version: 0.0.0\n dependsOn:\n - urn:dmb:cmp:finance:cashflow:0:storage-component\n readsFrom: []\n tags: []\n specific: {}\n consumable: false\n shoppable: false\n - kind: storage\n id: urn:dmb:cmp:finance:cashflow:0:storage-component\n description: Internal storage for the data product\n name: Storage Component\n fullyQualifiedName: Internal Storage\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:storage-provisioner:0\n useCaseTemplateId: urn:dmb:utm:storage-template:0.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n storageType: Database\n tags: []\n specific:\n bucket: finance-cashflows\n consumable: false\n shoppable: false",
  "removeData": false
}

Response samples

Content type
application/json
{
  "status": "COMPLETED",
  "info": {}
}
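Because provisioning may be handled synchronously or asynchronously, a caller has to tell the two kinds of answer apart. The sketch below does so by payload shape, assuming that an asynchronous plugin answers with a bare token string (as the async validate endpoint does in its sample) and a synchronous one with a status object like the sample above. That mapping and the function name are assumptions, not something this page states.

```python
from typing import Union


def interpret_provision_response(payload: Union[str, dict]) -> tuple[str, Union[str, dict]]:
    """Classify a decoded JSON response as a pending task or a final result."""
    if isinstance(payload, str) and payload:
        # Assumed async case: the payload is the task token to look up later.
        return ("pending", payload)
    if isinstance(payload, dict) and "status" in payload:
        # Assumed sync case: the payload already carries the final status.
        return ("done", payload)
    raise ValueError(f"unexpected provision response: {payload!r}")


print(interpret_provision_response("f6440015-1404-4e96-a250-7f680dbc32d4"))
print(interpret_provision_response({"status": "COMPLETED", "info": {}}))
```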

Unprovision Data Catalog entity

Unprovisioning is the operation that removes the metadata of a system or a single component from a third-party Data Catalog.

This request is synchronous and returns the result of the unprovisioning process.

Request Body schema: application/json
required

A system descriptor and the provisioning results, wrapped as a simple object

descriptorKind
required
string (DescriptorKind)
Enum: "DATAPRODUCT_DESCRIPTOR" "DATAPRODUCT_DESCRIPTOR_WITH_RESULTS"
descriptor
required
string

A provisioning request in YAML format

Responses

Request samples

Content type
application/json
{
  "descriptorKind": "DATAPRODUCT_DESCRIPTOR",
  "descriptor": "dataProductOwnerDisplayName: John Smith\nprojectOwner: user:john.smith_example.com\nprojectOwnerDisplayName: John Smith\nenvironment: production\ndomain: finance\nprojectKind: system\nkind: dataproduct\ndomainId: urn:dmb:dmn:finance\nid: urn:dmb:dp:finance:cashflow:0\ndescription: Data product representing operating cashflows\ndevGroup: teamalpha\nownerGroup: john.smith_example.com\ndataProductOwner: user:john.smith_example.com\nversion: 0.1.0\nfullyQualifiedName: Cash Flows\nname: CashFlow\ninformationSLA: daily\nmaturity: Approved\nuseCaseTemplateId: urn:dmb:utm:dataproduct-template:0.0.0\ninfrastructureTemplateId: urn:dmb:itm:dataproduct-provisioner:1\nbilling: {}\ntags: []\nspecific: {}\ncomponents:\n - kind: outputport\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-output-port\n description: Output port exposing cashflow data\n name: CashFlow OutputPort\n fullyQualifiedName: null\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:output-port-provisioner:1\n useCaseTemplateId: urn:dmb:utm:output-port-template:1.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n outputPortType: SQL\n creationDate: null\n startDate: null\n processDescription: null\n dataContract:\n schema:\n - name: id\n dataType: string\n tags: []\n - name: direction\n dataType: string\n tags: []\n - name: amount\n dataType: number\n description: monetary value of the cashflow\n tags: []\n - name: counterpartyId\n dataType: string\n description: identifier of the counterparty\n tags: []\n termsAndConditions: \"\"\n endpoint: null\n SLA:\n intervalOfChange: 2 day\n timeliness: 2 day\n upTime: 99.9%\n dataSharingAgreement:\n purpose: \"\"\n billing: \"\"\n security: \"\"\n intendedUsage: \"\"\n limitations: \"\"\n lifeCycle: \"\"\n confidentiality: \"\"\n tags: []\n sampleData:\n columns:\n - id\n - direction\n - amount\n - counterpartyId\n rows:\n - - CF-001\n - Incoming\n - 10000\n - cpty:12345\n - - CF-002\n - Outgoing\n - 5000\n - cpty:67890\n semanticLinking: []\n specific:\n format: Iceberg\n consumable: true\n shoppable: true\n - kind: workload\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-calculation\n description: Workload that calculates cashflows from input deals\n useCaseTemplateId: urn:dmb:utm:workload-template:0.0.0\n infrastructureTemplateId: urn:dmb:itm:workload-provisioner:1\n fullyQualifiedName: CashFlow Calculation\n name: CashFlow Calculation\n technology: Spark\n version: 0.0.0\n dependsOn:\n - urn:dmb:cmp:finance:cashflow:0:storage-component\n readsFrom: []\n tags: []\n specific: {}\n consumable: false\n shoppable: false\n - kind: storage\n id: urn:dmb:cmp:finance:cashflow:0:storage-component\n description: Internal storage for the data product\n name: Storage Component\n fullyQualifiedName: Internal Storage\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:storage-provisioner:0\n useCaseTemplateId: urn:dmb:utm:storage-template:0.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n storageType: Database\n tags: []\n specific:\n bucket: finance-cashflows\n consumable: false\n shoppable: false",
  "removeData": false
}

Response samples

Content type
application/json
{
  "status": "COMPLETED",
  "info": {}
}

Get status of Data Catalog entity provision/unprovision task

path Parameters
token
required
string
Example: f6440015-1404-4e96-a250-7f680dbc32d4

Data Catalog Provision/Unprovision Task Token obtained after a call to the provision/unprovision endpoint

Responses

Response samples

Content type
application/json
{
  "status": "COMPLETED",
  "info": {}
}

Get linked Data Catalog entity

Returns the reference (ID, links, etc.) to the Data Catalog entity linked to the provided Output Port.

query Parameters
componentId
required
string
Example: componentId=urn:dmb:dp:finance:a-system:0:component:1

URN of the Output Port whose linked Data Catalog entity reference is requested
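Since the URN contains colons, a client may prefer to percent-encode it when building the query string. The sketch below uses Python's standard library for that and decodes the result again to show that the round trip preserves the URN. Only the componentId parameter name and the example URN come from this page; whether a given plugin also accepts the unencoded form shown in the example is not stated here.

```python
from urllib.parse import parse_qs, urlencode

component_id = "urn:dmb:dp:finance:a-system:0:component:1"

# urlencode percent-encodes reserved characters such as ':' in the value.
query = urlencode({"componentId": component_id})
print(query)

# Decoding the query string yields the original URN.
decoded = parse_qs(query)["componentId"][0]
print(decoded == component_id)
```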

Responses

Response samples

Content type
application/json