A Data Catalog Plugin is a microservice responsible for provisioning and unprovisioning metadata into a Data Catalog at the end of the deployment phase of a resource through Witboost.
It takes in a system descriptor enriched with components' provisioning results and returns the result of the provisioned Data Catalog entities.
The Data Catalog contract is similar to the technology adapter contract, with a few differences. Notably, it receives the whole system descriptor, already enriched with the tech adapters' deployment information.
A Data Catalog Plugin is not meant to be called directly by users from their browsers; it is called by the Witboost Control Plane. The base URL of the API therefore depends on the address assigned by the Kubernetes network, e.g. http://<service k8s endpoint name>.<namespace>:<port>. The base URL shown below is only an example; yours may differ.
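As a rough illustration of the address pattern above, the following sketch assembles the in-cluster base URL from its three parts. The function name and the sample values (`data-catalog-plugin`, `witboost`, `8080`) are hypothetical and only serve the example.

```python
def build_base_url(service: str, namespace: str, port: int, scheme: str = "http") -> str:
    """Compose <scheme>://<service>.<namespace>:<port> for in-cluster calls."""
    if not service or not namespace:
        raise ValueError("service and namespace must be non-empty")
    if not 0 < port < 65536:
        raise ValueError(f"invalid port: {port}")
    return f"{scheme}://{service}.{namespace}:{port}"


# Hypothetical values; replace them with the ones from your own deployment.
base_url = build_base_url("data-catalog-plugin", "witboost", 8080)
```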
Validates the output port metadata attached to the provisioning request.
In particular, it validates the format and the existence of glossary terms and classification tags.
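To make the existence check more concrete, here is a minimal sketch of how a plugin might look for unknown classification tags on output port columns. It works on an already parsed descriptor (a plain dictionary) and assumes that each tag is a mapping with a `tagFQN` field and that the set of tags known to the catalog is available as `known_tags`. Both are assumptions for illustration, not part of this specification.

```python
from typing import Any, Dict, List, Set


def find_unknown_tags(descriptor: Dict[str, Any], known_tags: Set[str]) -> List[str]:
    """Return one error message per column tag that is not in known_tags."""
    errors: List[str] = []
    for component in descriptor.get("components") or []:
        # Only output ports carry the metadata validated here.
        if component.get("kind") != "outputport":
            continue
        schema = (component.get("dataContract") or {}).get("schema") or []
        for column in schema:
            for tag in column.get("tags") or []:
                # Assumption: tags are mappings exposing a "tagFQN" key.
                fqn = tag.get("tagFQN") if isinstance(tag, dict) else None
                if fqn not in known_tags:
                    errors.append(
                        f"{component.get('id')}: column '{column.get('name')}' "
                        f"has unknown tag {fqn!r}"
                    )
    return errors
```

An empty result means every tag was found; a real plugin would typically query the target Data Catalog instead of using an in-memory set.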
Details of a provisioning request to be validated
| Name | Required | Type | Description |
|---|---|---|---|
| descriptorKind | yes | string (DescriptorKind) | Enum: "DATAPRODUCT_DESCRIPTOR" "DATAPRODUCT_DESCRIPTOR_WITH_RESULTS" |
| descriptor | yes | string | A provisioning request in YAML format |
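The request body wraps the YAML descriptor as a plain string inside a JSON object. A small sketch of building such a body is shown here; the helper name is hypothetical, and only the two documented fields are included.

```python
import json

# The two values listed for the DescriptorKind enum.
ALLOWED_KINDS = {"DATAPRODUCT_DESCRIPTOR", "DATAPRODUCT_DESCRIPTOR_WITH_RESULTS"}


def build_request_body(descriptor_yaml: str,
                       descriptor_kind: str = "DATAPRODUCT_DESCRIPTOR") -> str:
    """Serialize the descriptor string and its kind into a JSON request body."""
    if descriptor_kind not in ALLOWED_KINDS:
        raise ValueError(f"unsupported descriptorKind: {descriptor_kind}")
    # json.dumps escapes newlines and quotes inside the YAML string.
    return json.dumps({"descriptorKind": descriptor_kind,
                       "descriptor": descriptor_yaml})


body = build_request_body("kind: dataproduct\nname: CashFlow\n")
```

Letting `json.dumps` handle the escaping avoids hand-written `\n` sequences in the descriptor string.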
{- "descriptorKind": "DATAPRODUCT_DESCRIPTOR",
- "descriptor": "dataProductOwnerDisplayName: John Smith\nprojectOwner: user:john.smith_example.com\nprojectOwnerDisplayName: John Smith\nenvironment: production\ndomain: finance\nprojectKind: system\nkind: dataproduct\ndomainId: urn:dmb:dmn:finance\nid: urn:dmb:dp:finance:cashflow:0\ndescription: Data product representing operating cashflows\ndevGroup: teamalpha\nownerGroup: john.smith_example.com\ndataProductOwner: user:john.smith_example.com\nversion: 0.1.0\nfullyQualifiedName: Cash Flows\nname: CashFlow\ninformationSLA: daily\nmaturity: Approved\nuseCaseTemplateId: urn:dmb:utm:dataproduct-template:0.0.0\ninfrastructureTemplateId: urn:dmb:itm:dataproduct-provisioner:1\nbilling: {}\ntags: []\nspecific: {}\ncomponents:\n - kind: outputport\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-output-port\n description: Output port exposing cashflow data\n name: CashFlow OutputPort\n fullyQualifiedName: null\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:output-port-provisioner:1\n useCaseTemplateId: urn:dmb:utm:output-port-template:1.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n outputPortType: SQL\n creationDate: null\n startDate: null\n processDescription: null\n dataContract:\n schema:\n - name: id\n dataType: string\n tags: []\n - name: direction\n dataType: string\n tags: []\n - name: amount\n dataType: number\n description: monetary value of the cashflow\n tags: []\n - name: counterpartyId\n dataType: string\n description: identifier of the counterparty\n tags: []\n termsAndConditions: \"\"\n endpoint: null\n SLA:\n intervalOfChange: 2 day\n timeliness: 2 day\n upTime: 99.9%\n dataSharingAgreement:\n purpose: \"\"\n billing: \"\"\n security: \"\"\n intendedUsage: \"\"\n limitations: \"\"\n lifeCycle: \"\"\n confidentiality: \"\"\n tags: []\n sampleData:\n columns:\n - id\n - direction\n - amount\n - counterpartyId\n rows:\n - - CF-001\n - Incoming\n - 10000\n - cpty:12345\n - - CF-002\n - Outgoing\n - 5000\n - cpty:67890\n semanticLinking: []\n 
specific:\n format: Iceberg\n consumable: true\n shoppable: true\n - kind: workload\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-calculation\n description: Workload that calculates cashflows from input deals\n useCaseTemplateId: urn:dmb:utm:workload-template:0.0.0\n infrastructureTemplateId: urn:dmb:itm:workload-provisioner:1\n fullyQualifiedName: CashFlow Calculation\n name: CashFlow Calculation\n technology: Spark\n version: 0.0.0\n dependsOn:\n - urn:dmb:cmp:finance:cashflow:0:storage-component\n readsFrom: []\n tags: []\n specific: {}\n consumable: false\n shoppable: false\n - kind: storage\n id: urn:dmb:cmp:finance:cashflow:0:storage-component\n description: Internal storage for the data product\n name: Storage Component\n fullyQualifiedName: Internal Storage\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:storage-provisioner:0\n useCaseTemplateId: urn:dmb:utm:storage-template:0.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n storageType: Database\n tags: []\n specific:\n bucket: finance-cashflows\n consumable: false\n shoppable: false",
- "removeData": false
}

"2f4b3b3b-4b3b-4b3b-4b3b-4b3b4b3b4b3b"

| Name | Required | Type | Description |
|---|---|---|---|
| token | yes | string | Token obtained after a call to the async validate endpoint. Example: 2a4bb060-27c9-403d-ab5d-d3776cf56c3b |
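The token returned by the asynchronous validation call is exchanged later for the validation result. A minimal polling sketch follows. The HTTP call is injected as a hypothetical `fetch_status` callable, because endpoint paths are not shown here. Only `COMPLETED` appears in the samples, so treating `FAILED` as a second terminal state is an assumption.

```python
import time
from typing import Any, Callable, Dict, Iterable


def wait_for_result(
    fetch_status: Callable[[str], Dict[str, Any]],
    token: str,
    terminal_states: Iterable[str] = ("COMPLETED", "FAILED"),  # "FAILED" is assumed
    interval_s: float = 1.0,
    max_attempts: int = 30,
) -> Dict[str, Any]:
    """Call fetch_status(token) until the payload reports a terminal status."""
    terminal = set(terminal_states)
    for attempt in range(max_attempts):
        payload = fetch_status(token)
        if payload.get("status") in terminal:
            return payload
        # Sleep between attempts, but not after the last one.
        if attempt < max_attempts - 1:
            time.sleep(interval_s)
    raise TimeoutError(
        f"no terminal status for token {token} after {max_attempts} attempts"
    )
```

Injecting the fetch function keeps the loop independent of any HTTP client and makes it easy to exercise with a stub.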
{
  "status": "COMPLETED",
  "info": {
    "publicInfo": {
      "valid": true
    },
    "privateInfo": {
      "valid": true
    }
  }
}

Synchronously validate a provisioning request and return the validation result.
Implement this endpoint to check that the provided resource descriptor conforms to the expected structure and contains all the information needed to deploy it on the infrastructure.
It is highly recommended to implement the asynchronous validation endpoint instead of this one.
Details of a provisioning request to be validated
| Name | Required | Type | Description |
|---|---|---|---|
| descriptorKind | yes | string (DescriptorKind) | Enum: "DATAPRODUCT_DESCRIPTOR" "DATAPRODUCT_DESCRIPTOR_WITH_RESULTS" |
| descriptor | yes | string | A provisioning request in YAML format |
{- "descriptorKind": "DATAPRODUCT_DESCRIPTOR",
- "descriptor": "dataProductOwnerDisplayName: John Smith\nprojectOwner: user:john.smith_example.com\nprojectOwnerDisplayName: John Smith\nenvironment: production\ndomain: finance\nprojectKind: system\nkind: dataproduct\ndomainId: urn:dmb:dmn:finance\nid: urn:dmb:dp:finance:cashflow:0\ndescription: Data product representing operating cashflows\ndevGroup: teamalpha\nownerGroup: john.smith_example.com\ndataProductOwner: user:john.smith_example.com\nversion: 0.1.0\nfullyQualifiedName: Cash Flows\nname: CashFlow\ninformationSLA: daily\nmaturity: Approved\nuseCaseTemplateId: urn:dmb:utm:dataproduct-template:0.0.0\ninfrastructureTemplateId: urn:dmb:itm:dataproduct-provisioner:1\nbilling: {}\ntags: []\nspecific: {}\ncomponents:\n - kind: outputport\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-output-port\n description: Output port exposing cashflow data\n name: CashFlow OutputPort\n fullyQualifiedName: null\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:output-port-provisioner:1\n useCaseTemplateId: urn:dmb:utm:output-port-template:1.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n outputPortType: SQL\n creationDate: null\n startDate: null\n processDescription: null\n dataContract:\n schema:\n - name: id\n dataType: string\n tags: []\n - name: direction\n dataType: string\n tags: []\n - name: amount\n dataType: number\n description: monetary value of the cashflow\n tags: []\n - name: counterpartyId\n dataType: string\n description: identifier of the counterparty\n tags: []\n termsAndConditions: \"\"\n endpoint: null\n SLA:\n intervalOfChange: 2 day\n timeliness: 2 day\n upTime: 99.9%\n dataSharingAgreement:\n purpose: \"\"\n billing: \"\"\n security: \"\"\n intendedUsage: \"\"\n limitations: \"\"\n lifeCycle: \"\"\n confidentiality: \"\"\n tags: []\n sampleData:\n columns:\n - id\n - direction\n - amount\n - counterpartyId\n rows:\n - - CF-001\n - Incoming\n - 10000\n - cpty:12345\n - - CF-002\n - Outgoing\n - 5000\n - cpty:67890\n semanticLinking: []\n 
specific:\n format: Iceberg\n consumable: true\n shoppable: true\n - kind: workload\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-calculation\n description: Workload that calculates cashflows from input deals\n useCaseTemplateId: urn:dmb:utm:workload-template:0.0.0\n infrastructureTemplateId: urn:dmb:itm:workload-provisioner:1\n fullyQualifiedName: CashFlow Calculation\n name: CashFlow Calculation\n technology: Spark\n version: 0.0.0\n dependsOn:\n - urn:dmb:cmp:finance:cashflow:0:storage-component\n readsFrom: []\n tags: []\n specific: {}\n consumable: false\n shoppable: false\n - kind: storage\n id: urn:dmb:cmp:finance:cashflow:0:storage-component\n description: Internal storage for the data product\n name: Storage Component\n fullyQualifiedName: Internal Storage\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:storage-provisioner:0\n useCaseTemplateId: urn:dmb:utm:storage-template:0.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n storageType: Database\n tags: []\n specific:\n bucket: finance-cashflows\n consumable: false\n shoppable: false",
- "removeData": false
}

{
  "valid": true
}

Provisioning is the operation that deploys the metadata of a system or a single component into a third-party Data Catalog.
This request can be handled in synchronous or asynchronous mode depending on the implementation.
A Data Product descriptor, enriched with components' provisioning results, wrapped as a string into a simple object
| Name | Required | Type | Description |
|---|---|---|---|
| descriptorKind | yes | string (DescriptorKind) | Enum: "DATAPRODUCT_DESCRIPTOR" "DATAPRODUCT_DESCRIPTOR_WITH_RESULTS" |
| descriptor | yes | string | A provisioning request in YAML format |
{- "descriptorKind": "DATAPRODUCT_DESCRIPTOR",
- "descriptor": "dataProductOwnerDisplayName: John Smith\nprojectOwner: user:john.smith_example.com\nprojectOwnerDisplayName: John Smith\nenvironment: production\ndomain: finance\nprojectKind: system\nkind: dataproduct\ndomainId: urn:dmb:dmn:finance\nid: urn:dmb:dp:finance:cashflow:0\ndescription: Data product representing operating cashflows\ndevGroup: teamalpha\nownerGroup: john.smith_example.com\ndataProductOwner: user:john.smith_example.com\nversion: 0.1.0\nfullyQualifiedName: Cash Flows\nname: CashFlow\ninformationSLA: daily\nmaturity: Approved\nuseCaseTemplateId: urn:dmb:utm:dataproduct-template:0.0.0\ninfrastructureTemplateId: urn:dmb:itm:dataproduct-provisioner:1\nbilling: {}\ntags: []\nspecific: {}\ncomponents:\n - kind: outputport\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-output-port\n description: Output port exposing cashflow data\n name: CashFlow OutputPort\n fullyQualifiedName: null\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:output-port-provisioner:1\n useCaseTemplateId: urn:dmb:utm:output-port-template:1.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n outputPortType: SQL\n creationDate: null\n startDate: null\n processDescription: null\n dataContract:\n schema:\n - name: id\n dataType: string\n tags: []\n - name: direction\n dataType: string\n tags: []\n - name: amount\n dataType: number\n description: monetary value of the cashflow\n tags: []\n - name: counterpartyId\n dataType: string\n description: identifier of the counterparty\n tags: []\n termsAndConditions: \"\"\n endpoint: null\n SLA:\n intervalOfChange: 2 day\n timeliness: 2 day\n upTime: 99.9%\n dataSharingAgreement:\n purpose: \"\"\n billing: \"\"\n security: \"\"\n intendedUsage: \"\"\n limitations: \"\"\n lifeCycle: \"\"\n confidentiality: \"\"\n tags: []\n sampleData:\n columns:\n - id\n - direction\n - amount\n - counterpartyId\n rows:\n - - CF-001\n - Incoming\n - 10000\n - cpty:12345\n - - CF-002\n - Outgoing\n - 5000\n - cpty:67890\n semanticLinking: []\n 
specific:\n format: Iceberg\n consumable: true\n shoppable: true\n - kind: workload\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-calculation\n description: Workload that calculates cashflows from input deals\n useCaseTemplateId: urn:dmb:utm:workload-template:0.0.0\n infrastructureTemplateId: urn:dmb:itm:workload-provisioner:1\n fullyQualifiedName: CashFlow Calculation\n name: CashFlow Calculation\n technology: Spark\n version: 0.0.0\n dependsOn:\n - urn:dmb:cmp:finance:cashflow:0:storage-component\n readsFrom: []\n tags: []\n specific: {}\n consumable: false\n shoppable: false\n - kind: storage\n id: urn:dmb:cmp:finance:cashflow:0:storage-component\n description: Internal storage for the data product\n name: Storage Component\n fullyQualifiedName: Internal Storage\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:storage-provisioner:0\n useCaseTemplateId: urn:dmb:utm:storage-template:0.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n storageType: Database\n tags: []\n specific:\n bucket: finance-cashflows\n consumable: false\n shoppable: false",
- "removeData": false
}

{
  "status": "COMPLETED",
  "info": {
    "publicInfo": {
      "entitiesCommitted": 10
    },
    "privateInfo": {
      "entitiesCommitted": 10,
      "glossaryId": "dfasd13"
    },
    "logs": [
      {
        "timestamp": "2021-09-30T10:00:00Z",
        "level": "INFO",
        "message": "Entity provisioned successfully"
      }
    ]
  }
}

Unprovisioning is the operation that removes the metadata of a system or a single component from a third-party Data Catalog.
This request is synchronous and returns the result of the unprovisioning process.
A system descriptor and the provisioning results, wrapped as a simple object
| Name | Required | Type | Description |
|---|---|---|---|
| descriptorKind | yes | string (DescriptorKind) | Enum: "DATAPRODUCT_DESCRIPTOR" "DATAPRODUCT_DESCRIPTOR_WITH_RESULTS" |
| descriptor | yes | string | A provisioning request in YAML format |
{- "descriptorKind": "DATAPRODUCT_DESCRIPTOR",
- "descriptor": "dataProductOwnerDisplayName: John Smith\nprojectOwner: user:john.smith_example.com\nprojectOwnerDisplayName: John Smith\nenvironment: production\ndomain: finance\nprojectKind: system\nkind: dataproduct\ndomainId: urn:dmb:dmn:finance\nid: urn:dmb:dp:finance:cashflow:0\ndescription: Data product representing operating cashflows\ndevGroup: teamalpha\nownerGroup: john.smith_example.com\ndataProductOwner: user:john.smith_example.com\nversion: 0.1.0\nfullyQualifiedName: Cash Flows\nname: CashFlow\ninformationSLA: daily\nmaturity: Approved\nuseCaseTemplateId: urn:dmb:utm:dataproduct-template:0.0.0\ninfrastructureTemplateId: urn:dmb:itm:dataproduct-provisioner:1\nbilling: {}\ntags: []\nspecific: {}\ncomponents:\n - kind: outputport\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-output-port\n description: Output port exposing cashflow data\n name: CashFlow OutputPort\n fullyQualifiedName: null\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:output-port-provisioner:1\n useCaseTemplateId: urn:dmb:utm:output-port-template:1.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n outputPortType: SQL\n creationDate: null\n startDate: null\n processDescription: null\n dataContract:\n schema:\n - name: id\n dataType: string\n tags: []\n - name: direction\n dataType: string\n tags: []\n - name: amount\n dataType: number\n description: monetary value of the cashflow\n tags: []\n - name: counterpartyId\n dataType: string\n description: identifier of the counterparty\n tags: []\n termsAndConditions: \"\"\n endpoint: null\n SLA:\n intervalOfChange: 2 day\n timeliness: 2 day\n upTime: 99.9%\n dataSharingAgreement:\n purpose: \"\"\n billing: \"\"\n security: \"\"\n intendedUsage: \"\"\n limitations: \"\"\n lifeCycle: \"\"\n confidentiality: \"\"\n tags: []\n sampleData:\n columns:\n - id\n - direction\n - amount\n - counterpartyId\n rows:\n - - CF-001\n - Incoming\n - 10000\n - cpty:12345\n - - CF-002\n - Outgoing\n - 5000\n - cpty:67890\n semanticLinking: []\n 
specific:\n format: Iceberg\n consumable: true\n shoppable: true\n - kind: workload\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-calculation\n description: Workload that calculates cashflows from input deals\n useCaseTemplateId: urn:dmb:utm:workload-template:0.0.0\n infrastructureTemplateId: urn:dmb:itm:workload-provisioner:1\n fullyQualifiedName: CashFlow Calculation\n name: CashFlow Calculation\n technology: Spark\n version: 0.0.0\n dependsOn:\n - urn:dmb:cmp:finance:cashflow:0:storage-component\n readsFrom: []\n tags: []\n specific: {}\n consumable: false\n shoppable: false\n - kind: storage\n id: urn:dmb:cmp:finance:cashflow:0:storage-component\n description: Internal storage for the data product\n name: Storage Component\n fullyQualifiedName: Internal Storage\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:storage-provisioner:0\n useCaseTemplateId: urn:dmb:utm:storage-template:0.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n storageType: Database\n tags: []\n specific:\n bucket: finance-cashflows\n consumable: false\n shoppable: false",
- "removeData": false
}

{
  "status": "COMPLETED",
  "info": {
    "publicInfo": {
      "entitiesRemoved": 10
    },
    "privateInfo": {
      "entitiesRemoved": 10,
      "glossaryId": "dfasd13"
    },
    "logs": [
      {
        "timestamp": "2021-09-30T10:00:00Z",
        "level": "INFO",
        "message": "Entity unprovisioned successfully"
      }
    ]
  }
}

| Name | Required | Type | Description |
|---|---|---|---|
| token | yes | string | Data Catalog Provision/Unprovision Task Token obtained after a call to the provision/unprovision endpoint. Example: f6440015-1404-4e96-a250-7f680dbc32d4 |
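Once a status payload has been retrieved with the task token, a caller usually wants to know whether the task completed and what was logged. The sketch below reads only the fields visible in the sample responses (`status`, `info.publicInfo`, `info.logs`). The function name is hypothetical, and every field is treated as optional as a precaution.

```python
from typing import Any, Dict, List, Tuple


def summarize_status(payload: Dict[str, Any]) -> Tuple[bool, Dict[str, Any], List[str]]:
    """Return (completed, publicInfo, formatted log lines) from a status payload."""
    info = payload.get("info") or {}
    public_info = info.get("publicInfo") or {}
    # Render each log entry as "<timestamp> [<level>] <message>".
    log_lines = [
        f"{entry.get('timestamp')} [{entry.get('level')}] {entry.get('message')}"
        for entry in info.get("logs") or []
    ]
    return payload.get("status") == "COMPLETED", public_info, log_lines
```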
{
  "status": "COMPLETED",
  "info": {
    "publicInfo": {
      "entitiesCommitted": 10
    },
    "privateInfo": {
      "entitiesCommitted": 10,
      "glossaryId": "dfasd13"
    },
    "logs": [
      {
        "timestamp": "2021-09-30T10:00:00Z",
        "level": "INFO",
        "message": "Entity provisioned successfully"
      }
    ]
  }
}

Return the reference (id, links, etc) to the Data Catalog entity that refers to the provided Output Port
| Name | Required | Type | Description |
|---|---|---|---|
| componentId | yes | string | Output Port URN to get the reference to the Data Catalog entity. Example: componentId=urn:dmb:dp:finance:a-system:0:component:1 |
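URNs contain colons, so the value should be percent-encoded if `componentId` is sent as a query parameter, which the example format suggests but this section does not state outright. A small sketch using the standard library follows; the helper name is hypothetical.

```python
from urllib.parse import urlencode


def component_query(component_id: str) -> str:
    """Build the percent-encoded query string for a component URN."""
    if not component_id.startswith("urn:"):
        raise ValueError(f"not a URN: {component_id}")
    # urlencode percent-encodes reserved characters such as ':'.
    return urlencode({"componentId": component_id})


query = component_query("urn:dmb:dp:finance:a-system:0:component:1")
```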