
How to develop and register Source Adapters

A Source Adapter is a small HTTP service that allows KGM to ingest or query business knowledge from an upstream system.

Depending on your needs, an adapter may support:

  • Ingestion, by exposing the entire knowledge graph as JSON-LD
  • Passthrough, by forwarding Search and SPARQL queries

Adapters are registered via KGM configuration.

info

Refer to the Source Adapter API reference for complete details about the REST endpoints that must be implemented by a Source Adapter.

Implementing an Ingestion Adapter

To support ingestion, your adapter must expose:

GET /v1/graph/jsonld

KGM calls this endpoint to ingest the graph of concepts and relationships from the source.

Requirements

Your adapter must:

  • return a valid JSON-LD document (Content-Type: application/ld+json)
  • include all concepts and relationships from the source
  • represent the graph using RDF in JSON-LD
  • return the JSON-LD as a file (OpenAPI: type: string, format: binary)
  • support large outputs (KGM streams the file on read)
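
The requirements above can be sketched with Python's standard library alone. This is a minimal illustration, not a reference implementation: the export file path, the port, and the chunk size are assumptions, and the names `make_handler` and `make_server` are hypothetical. The file is copied to the response in chunks so that a large graph is never held in memory at once.

```python
import shutil
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

CHUNK_SIZE = 64 * 1024  # assumed copy buffer size


def make_handler(graph_file: Path):
    """Build a request handler that serves one pre-generated JSON-LD export."""

    class IngestionHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Only the ingestion endpoint is served by this sketch.
            if self.path.split("?", 1)[0] != "/v1/graph/jsonld":
                self.send_error(404, "Not Found")
                return
            if not graph_file.is_file():
                self.send_error(503, "Graph export not available")
                return
            self.send_response(200)
            self.send_header("Content-Type", "application/ld+json")
            self.send_header("Content-Length", str(graph_file.stat().st_size))
            self.end_headers()
            # Stream the file in chunks instead of loading it fully into memory.
            with graph_file.open("rb") as source:
                shutil.copyfileobj(source, self.wfile, CHUNK_SIZE)

    return IngestionHandler


def make_server(graph_file: Path, port: int = 8080) -> HTTPServer:
    """Create (but do not start) an HTTP server bound to localhost."""
    return HTTPServer(("127.0.0.1", port), make_handler(graph_file))
```

Calling `make_server(Path("graph.jsonld")).serve_forever()` would start the adapter; a production adapter would typically add authentication, logging, and a way to regenerate the export.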

Example

1. Source

Imagine your upstream source contains the following concepts:

Customer
Definition: "A person or organization that purchases goods or services."

Order
Definition: "A commercial request issued by a customer to purchase goods or services."
Related to: Customer

Invoice
Definition: "A financial document issued to a customer after an order has been created."
Broader concept: Order

This is a typical business glossary structure:

  • Customer is a top-level concept
  • Order is related to Customer
  • Invoice is a subtype/narrower concept of Order

Your Source Adapter must convert this into JSON-LD representing RDF triples.

2. JSON-LD Representation

The ingestion adapter returns a single JSON-LD document describing all concepts and relationships.

The adapter may use a custom vocabulary, SKOS, or any other RDF-compliant schema.

Below is an example using a simple custom vocabulary, with compact IRIs defined via @context.

{
  "@context": {
    "ex": "https://example.com/vocab/",
    "concept": "https://example.com/concepts/",
    "label": "ex:label",
    "definition": "ex:definition",
    "broader": "ex:broader",
    "related": "ex:related"
  },
  "@graph": [
    {
      "@id": "concept:Customer",
      "@type": "ex:BusinessConcept",
      "label": "Customer",
      "definition": "A person or organization that purchases goods or services."
    },
    {
      "@id": "concept:Order",
      "@type": "ex:BusinessConcept",
      "label": "Order",
      "definition": "A commercial request issued by a customer to purchase goods or services.",
      "related": { "@id": "concept:Customer" }
    },
    {
      "@id": "concept:Invoice",
      "@type": "ex:BusinessConcept",
      "label": "Invoice",
      "definition": "A financial document issued to a customer after an order has been created.",
      "broader": { "@id": "concept:Order" }
    }
  ]
}
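
The mapping from glossary entries to this document could be sketched as follows. The shape of the `GLOSSARY` records is hypothetical (a real adapter would read them from the upstream system), and `to_node` and `to_jsonld` are illustrative names, not part of any KGM API.

```python
from urllib.parse import quote

CONTEXT = {
    "ex": "https://example.com/vocab/",
    "concept": "https://example.com/concepts/",
    "label": "ex:label",
    "definition": "ex:definition",
    "broader": "ex:broader",
    "related": "ex:related",
}

# Hypothetical shape of the records read from the upstream glossary.
GLOSSARY = [
    {"name": "Customer",
     "definition": "A person or organization that purchases goods or services."},
    {"name": "Order",
     "definition": "A commercial request issued by a customer to purchase goods or services.",
     "related": "Customer"},
    {"name": "Invoice",
     "definition": "A financial document issued to a customer after an order has been created.",
     "broader": "Order"},
]


def concept_id(name: str) -> str:
    """Build a compact IRI, percent-encoding characters not allowed in IRIs."""
    return f"concept:{quote(name, safe='')}"


def to_node(entry: dict) -> dict:
    """Convert one glossary entry into a JSON-LD node object."""
    node = {
        "@id": concept_id(entry["name"]),
        "@type": "ex:BusinessConcept",
        "label": entry["name"],
        "definition": entry["definition"],
    }
    # Relationships become references to other nodes, not nested copies.
    for relation in ("related", "broader"):
        if relation in entry:
            node[relation] = {"@id": concept_id(entry[relation])}
    return node


def to_jsonld(entries: list) -> dict:
    """Wrap all nodes in a single document with an inline @context."""
    return {"@context": CONTEXT, "@graph": [to_node(e) for e in entries]}
```

Serializing `to_jsonld(GLOSSARY)` with `json.dumps` yields a document equivalent to the example above.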
warning

Remote context loading is disabled

The JSON-LD specification supports loading contexts from external URLs (for example, "@context": "https://example.tld/context.tld").

Example of a JSON-LD document with remote context:

{
  "@context": "https://example.tld/context.tld",
  "@type": "Person",
  "prefix:name": "John Doe",
  "prefix:email": "john@example.com"
}

In the example above, a JSON-LD processor would normally make an HTTPS request to https://example.tld/context.tld to fetch the context definition.

However, to prevent outbound network calls from the cluster, remote context loading is disabled.

If your JSON-LD documents rely on specific context definitions, include the context inline in each document instead of referencing a remote URL. Example with an explicit context object:

{
  "@context": {
    "prefix": "http://sample.tld/"
  },
  "@type": "Person",
  "prefix:name": "John Doe",
  "prefix:email": "john@example.com"
}
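
An adapter could check its own output for remote context references before serving it. The sketch below is an illustration only (`find_remote_contexts` is a hypothetical helper, not part of KGM): it treats any `@context` whose value is a string, or a list containing a string, as a remote reference. It does not cover other remote mechanisms such as `@import`.

```python
def find_remote_contexts(node, path="$"):
    """Return the paths of all @context entries that reference a URL."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key == "@context":
                # A context may be a single entry or a list of entries;
                # string entries are references that must be fetched.
                entries = value if isinstance(value, list) else [value]
                if any(isinstance(entry, str) for entry in entries):
                    found.append(child)
            # Recurse to catch contexts nested deeper in the document.
            found.extend(find_remote_contexts(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(find_remote_contexts(item, f"{path}[{index}]"))
    return found
```

An empty result means every context in the document is defined inline.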

Configuring an Ingestion Adapter in KGM

Ingestion adapters are registered statically in the KGM configuration.

Example:

kgm:
  sources:
    - id: domain-glossary
      name: My Domain Glossary
      baseUrl: 'https://glossary.example.com'
      modes:
        ingest:
          enabled: true
          schedule: '0 0 12 ? * * *'

Meaning:

  • KGM calls the ingestion endpoint every day at 12:00 (noon)
  • The entire JSON-LD graph is streamed and ingested, replacing any knowledge previously ingested from the same source
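
The seven-field schedule resembles Quartz-style cron syntax. Assuming that field order (an assumption; this page does not name the cron dialect), the expression can be broken down as in the sketch below. `describe_schedule` is a hypothetical helper for illustration only.

```python
# Assumed Quartz-style field order; not confirmed by the KGM documentation.
QUARTZ_FIELDS = (
    "second", "minute", "hour", "day_of_month", "month", "day_of_week", "year",
)


def describe_schedule(expression: str) -> dict:
    """Label each field of a 6- or 7-field cron expression."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError(f"expected 6 or 7 fields, got {len(parts)}")
    # zip stops at the shorter sequence, so the optional year may be absent.
    return dict(zip(QUARTZ_FIELDS, parts))


fields = describe_schedule("0 0 12 ? * * *")
```

Under this reading, `fields["hour"]` is `"12"` with second and minute both `"0"`, and `?` leaves the day of month unspecified, which matches the "every day at noon" meaning described above.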
warning

When a Source Adapter is unregistered (removed from the KGM configuration), all triples previously ingested from that source are automatically removed from KGM.

info

It's also possible to manually trigger ingestion from a Source Adapter via the KGM APIs.

Implementing a Passthrough Adapter

If the source supports native query capabilities, you may implement passthrough endpoints.

Passthrough Search

GET /v1/search/query?term=<term>&offset=<n>&limit=<n>

Requirements:

  • accept the same query parameters as KGM's internal search
  • forward the search to the upstream system
  • return the source's response
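
The forwarding step could look like the sketch below, which translates the incoming request target into an upstream URL. The upstream address, the default `offset` and `limit` values, and the assumption that the upstream system accepts the same parameter names are all hypothetical; adapt them to the actual source.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical upstream search endpoint.
UPSTREAM_SEARCH_URL = "https://upstream.example.com/api/search"


def upstream_search_url(request_target: str, upstream: str = UPSTREAM_SEARCH_URL) -> str:
    """Map an incoming /v1/search/query request onto the upstream search URL."""
    params = parse_qs(urlsplit(request_target).query)
    term = params.get("term", [""])[0]
    offset = int(params.get("offset", ["0"])[0])   # assumed default
    limit = int(params.get("limit", ["20"])[0])    # assumed default
    if not term:
        raise ValueError("term is required")
    if offset < 0 or limit < 1:
        raise ValueError("offset must be >= 0 and limit >= 1")
    # Re-encode the parameters so special characters stay escaped.
    query = urlencode({"term": term, "offset": offset, "limit": limit})
    return f"{upstream}?{query}"
```

The adapter would then fetch this URL and relay the upstream response body to KGM.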

Passthrough SPARQL

POST /v1/graph/sparql
Content-Type: application/sparql-query

Requirements:

  • accept a SPARQL query
  • forward it to the upstream SPARQL endpoint
  • return the raw SPARQL JSON results unchanged
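
A minimal sketch of the forwarding step, using only the standard library. The upstream endpoint URL is hypothetical, and requesting `application/sparql-results+json` via the `Accept` header is an assumption about what the upstream endpoint supports.

```python
import urllib.request

# Hypothetical upstream SPARQL endpoint.
UPSTREAM_SPARQL_URL = "https://upstream.example.com/sparql"


def build_sparql_request(query: bytes, upstream: str = UPSTREAM_SPARQL_URL) -> urllib.request.Request:
    """Prepare a POST that sends the query body to the upstream endpoint as-is."""
    return urllib.request.Request(
        upstream,
        data=query,
        method="POST",
        headers={
            "Content-Type": "application/sparql-query",
            "Accept": "application/sparql-results+json",
        },
    )


def forward_sparql(query: bytes) -> bytes:
    """Send the query upstream and return the raw response body unchanged."""
    with urllib.request.urlopen(build_sparql_request(query)) as response:
        return response.read()
```

Returning the bytes untouched, rather than parsing and re-serializing them, keeps the adapter from altering the SPARQL JSON results.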

Configuring a Passthrough Adapter in KGM

Passthrough mode is also configured statically.

kgm:
  sources:
    - id: domain-glossary
      name: My Domain Glossary
      baseUrl: 'https://glossary.example.com'
      passthrough:
        enabled: true

In this example:

  • KGM will not ingest anything from this source
  • All search and SPARQL queries for this source will be forwarded to the upstream system
    • When KGM receives GET /passthrough/domain-glossary/v1/search/query?term=customer it forwards it to GET https://glossary.example.com/v1/search/query?term=customer
    • When KGM receives POST /passthrough/domain-glossary/v1/graph/sparql it forwards it to POST https://glossary.example.com/v1/graph/sparql
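
The URL mapping in the two examples above can be expressed as a small function. This only illustrates the rewriting rule as described on this page; it is not KGM's implementation, and `SOURCES` and `adapter_url` are hypothetical names.

```python
# Registered sources, keyed by id, mirroring the configuration example.
SOURCES = {"domain-glossary": "https://glossary.example.com"}


def adapter_url(kgm_target: str, sources: dict = SOURCES) -> str:
    """Rewrite /passthrough/<source-id>/<rest> to <baseUrl>/<rest>."""
    # Split into ['', 'passthrough', '<source-id>', '<rest>'].
    parts = kgm_target.split("/", 3)
    if len(parts) != 4 or parts[0] != "" or parts[1] != "passthrough":
        raise ValueError(f"not a passthrough path: {kgm_target}")
    source_id, rest = parts[2], parts[3]
    if source_id not in sources:
        raise KeyError(f"unknown source: {source_id}")
    # The remainder, including any query string, is kept verbatim.
    return f"{sources[source_id].rstrip('/')}/{rest}"
```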
tip

Refer to the KGM API reference for additional info on the passthrough endpoints.