Data Catalog Plugin API
Data Catalog Plugin API (2.2.0)
A Data Catalog Plugin is a microservice responsible for provisioning and unprovisioning metadata into a Data Catalog at the end of the deployment phase of a resource through Witboost.
It takes in a system descriptor enriched with the components' provisioning results and returns the result of the provisioned Data Catalog entities.
The Data Catalog contract is similar to that of a technology adapter, with a few differences. Notably, it receives the whole system descriptor, already enriched with the tech adapters' deploy info.
A Data Catalog Plugin is not meant to be called directly by users from their browsers; it is called by the Witboost Control Plane. Therefore, the base URL of the API depends on the network address assigned by the Kubernetes network, e.g. http://<service k8s endpoint name>.<namespace>:<port>. The base URL shown below is only an example; yours may differ.
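As a rough illustration of the address pattern above, the following Python sketch assembles a base URL from a Kubernetes service name, a namespace, and a port. The function name and the sample values are hypothetical; use whatever your cluster actually exposes.

```python
def build_base_url(service: str, namespace: str, port: int, scheme: str = "http") -> str:
    """Assemble an in-cluster base URL of the form <scheme>://<service>.<namespace>:<port>."""
    if not service or not namespace:
        raise ValueError("service and namespace must be non-empty")
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return f"{scheme}://{service}.{namespace}:{port}"


# Hypothetical values, for illustration only.
print(build_base_url("datacatalog-plugin", "witboost", 8080))
```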
Validate a Data Catalog Entity provision request (async)
Validates the output port metadata attached to the provisioning request.
In particular, it checks the format and the existence of glossary terms and classification tags.
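To make the idea of an existence check concrete, here is a minimal Python sketch of how a plugin might look for unknown classification tags on output ports. It works on a descriptor that has already been parsed from YAML into a dictionary. The function names, the `known_tags` set, the `tagFQN` key inside each tag, and the `errors` key in the result are all assumptions made for illustration; only the `valid` field comes from the response samples on this page.

```python
from typing import Any


def find_unknown_tags(descriptor: dict[str, Any], known_tags: set[str]) -> list[str]:
    """Return one error message per output port tag that is missing from known_tags."""
    errors: list[str] = []
    for component in descriptor.get("components", []):
        if component.get("kind") != "outputport":
            continue  # only output ports are checked
        port_id = component.get("id", "<unknown>")
        contract = component.get("dataContract") or {}
        # Collect the tags on the port itself and on each schema column.
        tagged = [("port", component.get("tags") or [])]
        for column in contract.get("schema") or []:
            tagged.append((f"column '{column.get('name')}'", column.get("tags") or []))
        for where, tags in tagged:
            for tag in tags:
                # Assumption: each tag is a mapping with a "tagFQN" key.
                fqn = tag.get("tagFQN") if isinstance(tag, dict) else None
                if fqn not in known_tags:
                    errors.append(f"{port_id}: {where} has unknown tag {fqn!r}")
    return errors


def validation_result(errors: list[str]) -> dict[str, Any]:
    # "valid" mirrors the response samples; the "errors" key is an assumption.
    return {"valid": not errors, "errors": errors}
```

A real plugin would fetch the set of known tags and glossary terms from the target Data Catalog rather than receive it as an argument.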
Request Body schema: application/json (required)
Details of a provisioning request to be validated
| descriptorKind required | string (DescriptorKind) Enum: "DATAPRODUCT_DESCRIPTOR" "DATAPRODUCT_DESCRIPTOR_WITH_RESULTS" |
| descriptor required | string A provisioning request in yaml format |
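The two required fields can be assembled as in the sketch below, which wraps a YAML descriptor string into the JSON request body. The helper name is hypothetical. The `removeData` flag that appears in the request sample is left out because the schema table does not describe it.

```python
import json

# Allowed values, taken from the DescriptorKind enum in the table above.
DESCRIPTOR_KINDS = {"DATAPRODUCT_DESCRIPTOR", "DATAPRODUCT_DESCRIPTOR_WITH_RESULTS"}


def build_request_body(descriptor_yaml: str, descriptor_kind: str = "DATAPRODUCT_DESCRIPTOR") -> str:
    """Wrap a YAML descriptor (as a plain string) into the JSON request body."""
    if descriptor_kind not in DESCRIPTOR_KINDS:
        raise ValueError(f"unsupported descriptorKind: {descriptor_kind}")
    # json.dumps escapes newlines and quotes, so the YAML travels as one JSON string.
    return json.dumps({"descriptorKind": descriptor_kind, "descriptor": descriptor_yaml})


body = build_request_body("kind: dataproduct\nname: CashFlow\n")
print(body)
```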
Responses
Request samples
- Payload
{- "descriptorKind": "DATAPRODUCT_DESCRIPTOR",
- "descriptor": "dataProductOwnerDisplayName: John Smith\nprojectOwner: user:john.smith_example.com\nprojectOwnerDisplayName: John Smith\nenvironment: production\ndomain: finance\nprojectKind: system\nkind: dataproduct\ndomainId: urn:dmb:dmn:finance\nid: urn:dmb:dp:finance:cashflow:0\ndescription: Data product representing operating cashflows\ndevGroup: teamalpha\nownerGroup: john.smith_example.com\ndataProductOwner: user:john.smith_example.com\nversion: 0.1.0\nfullyQualifiedName: Cash Flows\nname: CashFlow\ninformationSLA: daily\nmaturity: Approved\nuseCaseTemplateId: urn:dmb:utm:dataproduct-template:0.0.0\ninfrastructureTemplateId: urn:dmb:itm:dataproduct-provisioner:1\nbilling: {}\ntags: []\nspecific: {}\ncomponents:\n - kind: outputport\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-output-port\n description: Output port exposing cashflow data\n name: CashFlow OutputPort\n fullyQualifiedName: null\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:output-port-provisioner:1\n useCaseTemplateId: urn:dmb:utm:output-port-template:1.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n outputPortType: SQL\n creationDate: null\n startDate: null\n processDescription: null\n dataContract:\n schema:\n - name: id\n dataType: string\n tags: []\n - name: direction\n dataType: string\n tags: []\n - name: amount\n dataType: number\n description: monetary value of the cashflow\n tags: []\n - name: counterpartyId\n dataType: string\n description: identifier of the counterparty\n tags: []\n termsAndConditions: \"\"\n endpoint: null\n SLA:\n intervalOfChange: 2 day\n timeliness: 2 day\n upTime: 99.9%\n dataSharingAgreement:\n purpose: \"\"\n billing: \"\"\n security: \"\"\n intendedUsage: \"\"\n limitations: \"\"\n lifeCycle: \"\"\n confidentiality: \"\"\n tags: []\n sampleData:\n columns:\n - id\n - direction\n - amount\n - counterpartyId\n rows:\n - - CF-001\n - Incoming\n - 10000\n - cpty:12345\n - - CF-002\n - Outgoing\n - 5000\n - cpty:67890\n semanticLinking: []\n 
specific:\n format: Iceberg\n consumable: true\n shoppable: true\n - kind: workload\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-calculation\n description: Workload that calculates cashflows from input deals\n useCaseTemplateId: urn:dmb:utm:workload-template:0.0.0\n infrastructureTemplateId: urn:dmb:itm:workload-provisioner:1\n fullyQualifiedName: CashFlow Calculation\n name: CashFlow Calculation\n technology: Spark\n version: 0.0.0\n dependsOn:\n - urn:dmb:cmp:finance:cashflow:0:storage-component\n readsFrom: []\n tags: []\n specific: {}\n consumable: false\n shoppable: false\n - kind: storage\n id: urn:dmb:cmp:finance:cashflow:0:storage-component\n description: Internal storage for the data product\n name: Storage Component\n fullyQualifiedName: Internal Storage\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:storage-provisioner:0\n useCaseTemplateId: urn:dmb:utm:storage-template:0.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n storageType: Database\n tags: []\n specific:\n bucket: finance-cashflows\n consumable: false\n shoppable: false",
- "removeData": false
}
Response samples
- 202
- 400
- 500
"2f4b3b3b-4b3b-4b3b-4b3b-4b3b4b3b4b3b"
Get status of an async validation task
path Parameters
| token required | string Example: 2a4bb060-27c9-403d-ab5d-d3776cf56c3b Token obtained after a call to the async validate endpoint |
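A caller would typically poll this endpoint with the token until the task leaves its in-progress state. The sketch below shows one way to do that in Python. The `fetch_status` callable stands in for the actual HTTP GET and is hypothetical, as is the `RUNNING` state name; only `COMPLETED` appears in the samples on this page.

```python
import time
from typing import Any, Callable


def poll_status(
    fetch_status: Callable[[str], dict[str, Any]],
    token: str,
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
    running_states: frozenset[str] = frozenset({"RUNNING"}),  # assumed state name
) -> dict[str, Any]:
    """Call fetch_status(token) until the task is no longer in a running state."""
    for attempt in range(max_attempts):
        response = fetch_status(token)
        if response.get("status") not in running_states:
            return response  # terminal state reached, e.g. COMPLETED
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)
    raise TimeoutError(f"task {token} still running after {max_attempts} attempts")
```

Passing the fetch function in as an argument keeps the loop independent of any particular HTTP client and makes it easy to exercise with canned responses.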
Responses
Response samples
- 200
- 400
- 500
{
  "status": "COMPLETED",
  "info": {
    "publicInfo": {
      "valid": true
    },
    "privateInfo": {
      "valid": true
    }
  }
}
Validate a Data Catalog Entity provision request (sync)
Synchronously validate a provisioning request and return the validation result.
Implement this endpoint to verify that the provided resource descriptor conforms to the expected structure and contains all the information needed to be deployed on the infrastructure.
It is highly recommended to implement the asynchronous validation endpoint instead of this one.
Request Body schema: application/json (required)
Details of a provisioning request to be validated
| descriptorKind required | string (DescriptorKind) Enum: "DATAPRODUCT_DESCRIPTOR" "DATAPRODUCT_DESCRIPTOR_WITH_RESULTS" |
| descriptor required | string A provisioning request in yaml format |
Responses
Request samples
- Payload
{- "descriptorKind": "DATAPRODUCT_DESCRIPTOR",
- "descriptor": "dataProductOwnerDisplayName: John Smith\nprojectOwner: user:john.smith_example.com\nprojectOwnerDisplayName: John Smith\nenvironment: production\ndomain: finance\nprojectKind: system\nkind: dataproduct\ndomainId: urn:dmb:dmn:finance\nid: urn:dmb:dp:finance:cashflow:0\ndescription: Data product representing operating cashflows\ndevGroup: teamalpha\nownerGroup: john.smith_example.com\ndataProductOwner: user:john.smith_example.com\nversion: 0.1.0\nfullyQualifiedName: Cash Flows\nname: CashFlow\ninformationSLA: daily\nmaturity: Approved\nuseCaseTemplateId: urn:dmb:utm:dataproduct-template:0.0.0\ninfrastructureTemplateId: urn:dmb:itm:dataproduct-provisioner:1\nbilling: {}\ntags: []\nspecific: {}\ncomponents:\n - kind: outputport\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-output-port\n description: Output port exposing cashflow data\n name: CashFlow OutputPort\n fullyQualifiedName: null\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:output-port-provisioner:1\n useCaseTemplateId: urn:dmb:utm:output-port-template:1.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n outputPortType: SQL\n creationDate: null\n startDate: null\n processDescription: null\n dataContract:\n schema:\n - name: id\n dataType: string\n tags: []\n - name: direction\n dataType: string\n tags: []\n - name: amount\n dataType: number\n description: monetary value of the cashflow\n tags: []\n - name: counterpartyId\n dataType: string\n description: identifier of the counterparty\n tags: []\n termsAndConditions: \"\"\n endpoint: null\n SLA:\n intervalOfChange: 2 day\n timeliness: 2 day\n upTime: 99.9%\n dataSharingAgreement:\n purpose: \"\"\n billing: \"\"\n security: \"\"\n intendedUsage: \"\"\n limitations: \"\"\n lifeCycle: \"\"\n confidentiality: \"\"\n tags: []\n sampleData:\n columns:\n - id\n - direction\n - amount\n - counterpartyId\n rows:\n - - CF-001\n - Incoming\n - 10000\n - cpty:12345\n - - CF-002\n - Outgoing\n - 5000\n - cpty:67890\n semanticLinking: []\n 
specific:\n format: Iceberg\n consumable: true\n shoppable: true\n - kind: workload\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-calculation\n description: Workload that calculates cashflows from input deals\n useCaseTemplateId: urn:dmb:utm:workload-template:0.0.0\n infrastructureTemplateId: urn:dmb:itm:workload-provisioner:1\n fullyQualifiedName: CashFlow Calculation\n name: CashFlow Calculation\n technology: Spark\n version: 0.0.0\n dependsOn:\n - urn:dmb:cmp:finance:cashflow:0:storage-component\n readsFrom: []\n tags: []\n specific: {}\n consumable: false\n shoppable: false\n - kind: storage\n id: urn:dmb:cmp:finance:cashflow:0:storage-component\n description: Internal storage for the data product\n name: Storage Component\n fullyQualifiedName: Internal Storage\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:storage-provisioner:0\n useCaseTemplateId: urn:dmb:utm:storage-template:0.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n storageType: Database\n tags: []\n specific:\n bucket: finance-cashflows\n consumable: false\n shoppable: false",
- "removeData": false
}
Response samples
- 200
- 500
{
  "valid": true
}
Provision Data Catalog entity
Provisioning is the operation that deploys the metadata of a system or a single component into a third-party Data Catalog.
This request can be handled in synchronous or asynchronous mode depending on the implementation.
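Because the plugin may answer either way, a client has to branch on the HTTP status code. The following Python sketch shows one possible shape for that branch. It assumes that a 200 response carries the final result and that a 202 response carries a task token to be used with the status endpoint, as the async validation sample suggests; the class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ProvisionOutcome:
    finished: bool                            # True when the result is already final
    result: Optional[dict[str, Any]] = None   # final status body (synchronous mode)
    token: Optional[str] = None               # task token to poll (asynchronous mode)


def interpret_provision_response(status_code: int, body: Any) -> ProvisionOutcome:
    """Map an HTTP status code and decoded JSON body to a ProvisionOutcome."""
    if status_code == 200:
        return ProvisionOutcome(finished=True, result=body)
    if status_code == 202:
        # Assumption: the body of a 202 response is the task token as a JSON string.
        return ProvisionOutcome(finished=False, token=str(body))
    raise RuntimeError(f"provisioning request failed with HTTP {status_code}: {body!r}")
```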
Request Body schema: application/json (required)
A Data Product descriptor, enriched with components' provisioning results, wrapped as a string into a simple object
| descriptorKind required | string (DescriptorKind) Enum: "DATAPRODUCT_DESCRIPTOR" "DATAPRODUCT_DESCRIPTOR_WITH_RESULTS" |
| descriptor required | string A provisioning request in yaml format |
Responses
Request samples
- Payload
{- "descriptorKind": "DATAPRODUCT_DESCRIPTOR",
- "descriptor": "dataProductOwnerDisplayName: John Smith\nprojectOwner: user:john.smith_example.com\nprojectOwnerDisplayName: John Smith\nenvironment: production\ndomain: finance\nprojectKind: system\nkind: dataproduct\ndomainId: urn:dmb:dmn:finance\nid: urn:dmb:dp:finance:cashflow:0\ndescription: Data product representing operating cashflows\ndevGroup: teamalpha\nownerGroup: john.smith_example.com\ndataProductOwner: user:john.smith_example.com\nversion: 0.1.0\nfullyQualifiedName: Cash Flows\nname: CashFlow\ninformationSLA: daily\nmaturity: Approved\nuseCaseTemplateId: urn:dmb:utm:dataproduct-template:0.0.0\ninfrastructureTemplateId: urn:dmb:itm:dataproduct-provisioner:1\nbilling: {}\ntags: []\nspecific: {}\ncomponents:\n - kind: outputport\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-output-port\n description: Output port exposing cashflow data\n name: CashFlow OutputPort\n fullyQualifiedName: null\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:output-port-provisioner:1\n useCaseTemplateId: urn:dmb:utm:output-port-template:1.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n outputPortType: SQL\n creationDate: null\n startDate: null\n processDescription: null\n dataContract:\n schema:\n - name: id\n dataType: string\n tags: []\n - name: direction\n dataType: string\n tags: []\n - name: amount\n dataType: number\n description: monetary value of the cashflow\n tags: []\n - name: counterpartyId\n dataType: string\n description: identifier of the counterparty\n tags: []\n termsAndConditions: \"\"\n endpoint: null\n SLA:\n intervalOfChange: 2 day\n timeliness: 2 day\n upTime: 99.9%\n dataSharingAgreement:\n purpose: \"\"\n billing: \"\"\n security: \"\"\n intendedUsage: \"\"\n limitations: \"\"\n lifeCycle: \"\"\n confidentiality: \"\"\n tags: []\n sampleData:\n columns:\n - id\n - direction\n - amount\n - counterpartyId\n rows:\n - - CF-001\n - Incoming\n - 10000\n - cpty:12345\n - - CF-002\n - Outgoing\n - 5000\n - cpty:67890\n semanticLinking: []\n 
specific:\n format: Iceberg\n consumable: true\n shoppable: true\n - kind: workload\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-calculation\n description: Workload that calculates cashflows from input deals\n useCaseTemplateId: urn:dmb:utm:workload-template:0.0.0\n infrastructureTemplateId: urn:dmb:itm:workload-provisioner:1\n fullyQualifiedName: CashFlow Calculation\n name: CashFlow Calculation\n technology: Spark\n version: 0.0.0\n dependsOn:\n - urn:dmb:cmp:finance:cashflow:0:storage-component\n readsFrom: []\n tags: []\n specific: {}\n consumable: false\n shoppable: false\n - kind: storage\n id: urn:dmb:cmp:finance:cashflow:0:storage-component\n description: Internal storage for the data product\n name: Storage Component\n fullyQualifiedName: Internal Storage\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:storage-provisioner:0\n useCaseTemplateId: urn:dmb:utm:storage-template:0.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n storageType: Database\n tags: []\n specific:\n bucket: finance-cashflows\n consumable: false\n shoppable: false",
- "removeData": false
}
Response samples
- 200
- 202
- 400
- 500
{
  "status": "COMPLETED",
  "info": {
    "publicInfo": {
      "entitiesCommitted": 10
    },
    "privateInfo": {
      "entitiesCommitted": 10,
      "glossaryId": "dfasd13"
    },
    "logs": [
      {
        "timestamp": "2021-09-30T10:00:00Z",
        "level": "INFO",
        "message": "Entity provisioned successfully"
      }
    ]
  }
}
Unprovision Data Catalog entity
Unprovisioning is the operation that removes the metadata of a system or a single component from a third-party Data Catalog.
This request is synchronous and returns the result of the unprovisioning process.
Request Body schema: application/json (required)
A system descriptor and the provisioning results, wrapped as a simple object
| descriptorKind required | string (DescriptorKind) Enum: "DATAPRODUCT_DESCRIPTOR" "DATAPRODUCT_DESCRIPTOR_WITH_RESULTS" |
| descriptor required | string A provisioning request in yaml format |
Responses
Request samples
- Payload
{- "descriptorKind": "DATAPRODUCT_DESCRIPTOR",
- "descriptor": "dataProductOwnerDisplayName: John Smith\nprojectOwner: user:john.smith_example.com\nprojectOwnerDisplayName: John Smith\nenvironment: production\ndomain: finance\nprojectKind: system\nkind: dataproduct\ndomainId: urn:dmb:dmn:finance\nid: urn:dmb:dp:finance:cashflow:0\ndescription: Data product representing operating cashflows\ndevGroup: teamalpha\nownerGroup: john.smith_example.com\ndataProductOwner: user:john.smith_example.com\nversion: 0.1.0\nfullyQualifiedName: Cash Flows\nname: CashFlow\ninformationSLA: daily\nmaturity: Approved\nuseCaseTemplateId: urn:dmb:utm:dataproduct-template:0.0.0\ninfrastructureTemplateId: urn:dmb:itm:dataproduct-provisioner:1\nbilling: {}\ntags: []\nspecific: {}\ncomponents:\n - kind: outputport\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-output-port\n description: Output port exposing cashflow data\n name: CashFlow OutputPort\n fullyQualifiedName: null\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:output-port-provisioner:1\n useCaseTemplateId: urn:dmb:utm:output-port-template:1.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n outputPortType: SQL\n creationDate: null\n startDate: null\n processDescription: null\n dataContract:\n schema:\n - name: id\n dataType: string\n tags: []\n - name: direction\n dataType: string\n tags: []\n - name: amount\n dataType: number\n description: monetary value of the cashflow\n tags: []\n - name: counterpartyId\n dataType: string\n description: identifier of the counterparty\n tags: []\n termsAndConditions: \"\"\n endpoint: null\n SLA:\n intervalOfChange: 2 day\n timeliness: 2 day\n upTime: 99.9%\n dataSharingAgreement:\n purpose: \"\"\n billing: \"\"\n security: \"\"\n intendedUsage: \"\"\n limitations: \"\"\n lifeCycle: \"\"\n confidentiality: \"\"\n tags: []\n sampleData:\n columns:\n - id\n - direction\n - amount\n - counterpartyId\n rows:\n - - CF-001\n - Incoming\n - 10000\n - cpty:12345\n - - CF-002\n - Outgoing\n - 5000\n - cpty:67890\n semanticLinking: []\n 
specific:\n format: Iceberg\n consumable: true\n shoppable: true\n - kind: workload\n id: urn:dmb:cmp:finance:cashflow:0:cashflow-calculation\n description: Workload that calculates cashflows from input deals\n useCaseTemplateId: urn:dmb:utm:workload-template:0.0.0\n infrastructureTemplateId: urn:dmb:itm:workload-provisioner:1\n fullyQualifiedName: CashFlow Calculation\n name: CashFlow Calculation\n technology: Spark\n version: 0.0.0\n dependsOn:\n - urn:dmb:cmp:finance:cashflow:0:storage-component\n readsFrom: []\n tags: []\n specific: {}\n consumable: false\n shoppable: false\n - kind: storage\n id: urn:dmb:cmp:finance:cashflow:0:storage-component\n description: Internal storage for the data product\n name: Storage Component\n fullyQualifiedName: Internal Storage\n version: 0.0.0\n infrastructureTemplateId: urn:dmb:itm:storage-provisioner:0\n useCaseTemplateId: urn:dmb:utm:storage-template:0.0.0\n dependsOn: []\n platform: AWS\n technology: S3\n storageType: Database\n tags: []\n specific:\n bucket: finance-cashflows\n consumable: false\n shoppable: false",
- "removeData": false
}
Response samples
- 200
- 202
- 400
- 500
{
  "status": "COMPLETED",
  "info": {
    "publicInfo": {
      "entitiesRemoved": 10
    },
    "privateInfo": {
      "entitiesRemoved": 10,
      "glossaryId": "dfasd13"
    },
    "logs": [
      {
        "timestamp": "2021-09-30T10:00:00Z",
        "level": "INFO",
        "message": "Entity unprovisioned successfully"
      }
    ]
  }
}
Get status of Data Catalog entity provision/unprovision task
path Parameters
| token required | string Example: f6440015-1404-4e96-a250-7f680dbc32d4 Data Catalog Provision/Unprovision Task Token obtained after a call to the provision/unprovision endpoint |
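The status body for this endpoint carries a `status`, an `info` object with `publicInfo` and `privateInfo`, and a list of `logs`. As a small illustration, the Python sketch below turns such a body into a readable summary. The function name is hypothetical, and `privateInfo` is skipped on the assumption that it is not meant for display.

```python
from typing import Any


def summarize_status(response: dict[str, Any]) -> str:
    """Render the status, the public info, and the log entries as plain text."""
    status = response.get("status", "<missing>")
    info = response.get("info") or {}
    lines = [f"status: {status}"]
    # publicInfo is free-form, so print whatever key/value pairs it contains.
    for key, value in (info.get("publicInfo") or {}).items():
        lines.append(f"{key}: {value}")
    for entry in info.get("logs") or []:
        lines.append(f"[{entry.get('timestamp')}] {entry.get('level')}: {entry.get('message')}")
    return "\n".join(lines)
```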
Responses
Response samples
- 200
- 400
- 500
- 501
{
  "status": "COMPLETED",
  "info": {
    "publicInfo": {
      "entitiesCommitted": 10
    },
    "privateInfo": {
      "entitiesCommitted": 10,
      "glossaryId": "dfasd13"
    },
    "logs": [
      {
        "timestamp": "2021-09-30T10:00:00Z",
        "level": "INFO",
        "message": "Entity provisioned successfully"
      }
    ]
  }
}
Get linked Data Catalog entity
Returns the reference (id, links, etc.) to the Data Catalog entity that corresponds to the provided Output Port.
query Parameters
| componentId required | string Example: componentId=urn:dmb:dp:finance:a-system:0:component:1 Output Port URN to get the reference to the Data Catalog entity |
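Since the URN contains colons, the `componentId` value should be percent-encoded when it is placed in the query string. The Python sketch below shows this with the standard library. The endpoint path is not given on this page, so the `path` argument and the sample values are placeholders rather than the real route.

```python
from urllib.parse import urlencode


def linked_entity_url(base_url: str, path: str, component_id: str) -> str:
    """Build the request URL with componentId percent-encoded in the query string."""
    query = urlencode({"componentId": component_id})
    return f"{base_url.rstrip('/')}/{path.lstrip('/')}?{query}"


# Placeholder base URL and path, for illustration only.
print(
    linked_entity_url(
        "http://datacatalog-plugin.witboost:8080",
        "/placeholder-path",
        "urn:dmb:dp:finance:a-system:0:component:1",
    )
)
```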
Responses
Response samples
- 200
- 500