How to develop and register Source Adapters
A Source Adapter is a small HTTP service that allows KGM to ingest or query business knowledge from an upstream system.
Depending on your needs, an adapter may support:
- Ingestion, by exposing the entire knowledge graph as JSON-LD
- Passthrough, by forwarding Search and SPARQL queries
Adapters are registered via KGM configuration.
Refer to the Source Adapter API reference for complete details about the REST endpoints that must be implemented by a Source Adapter.
Implementing an Ingestion Adapter
To support ingestion, your adapter must expose:
GET /v1/graph/jsonld
KGM calls this endpoint to ingest the graph of concepts and relationships from the source.
Requirements
Your adapter must:
- return a valid JSON-LD document (Content-Type: application/ld+json)
- include all concepts and relationships from the source
- represent the graph using RDF in JSON-LD
- return the JSON-LD as a file (OpenAPI: type: string, format: binary)
- support large outputs (KGM streams the file on read)
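The requirements above can be sketched as a very small HTTP service. This is a minimal illustration using only the Python standard library, not a reference implementation: the `GRAPH` placeholder, the `render_graph` helper, and the port are assumptions made for the example.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical placeholder: a real adapter would build this from the upstream system.
GRAPH = {
    "@context": {"ex": "https://example.com/vocab/"},
    "@graph": [],
}


def render_graph() -> bytes:
    """Serialize the whole JSON-LD document as UTF-8 bytes."""
    return json.dumps(GRAPH).encode("utf-8")


class IngestionHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Ignore any query string when matching the path.
        if self.path.split("?", 1)[0] != "/v1/graph/jsonld":
            self.send_error(404, "Not Found")
            return
        body = render_graph()
        self.send_response(200)
        self.send_header("Content-Type", "application/ld+json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Start the adapter; blocks until the process is stopped."""
    ThreadingHTTPServer(("", port), IngestionHandler).serve_forever()
```

Calling `serve()` starts the adapter. For very large graphs, a real adapter would probably write the body in chunks instead of building it fully in memory.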
Example
1. Source
Imagine your upstream source contains the following concepts:
Customer
Definition: "A person or organization that purchases goods or services."
Order
Definition: "A commercial request issued by a customer to purchase goods or services."
Related to: Customer
Invoice
Definition: "A financial document issued to a customer after an order has been created."
Broader concept: Order
This is a typical business glossary structure:
- Customer is a top-level concept
- Order is associated with Customer
- Invoice is a subtype/narrower concept of Order
Your Source Adapter must convert this into JSON-LD representing RDF triples.
2. JSON-LD Representation
The ingestion adapter will return one JSON-LD document describing all concepts and relationships.
The adapter may use a custom vocabulary, SKOS, or any other RDF-compliant schema.
Below is an example using a simple custom vocabulary, with compact IRIs defined via @context.
{
  "@context": {
    "ex": "https://example.com/vocab/",
    "concept": "https://example.com/concepts/",
    "label": "ex:label",
    "definition": "ex:definition",
    "broader": "ex:broader",
    "related": "ex:related"
  },
  "@graph": [
    {
      "@id": "concept:Customer",
      "@type": "ex:BusinessConcept",
      "label": "Customer",
      "definition": "A person or organization that purchases goods or services."
    },
    {
      "@id": "concept:Order",
      "@type": "ex:BusinessConcept",
      "label": "Order",
      "definition": "A commercial request issued by a customer to purchase goods or services.",
      "related": { "@id": "concept:Customer" }
    },
    {
      "@id": "concept:Invoice",
      "@type": "ex:BusinessConcept",
      "label": "Invoice",
      "definition": "A financial document issued to a customer after an order has been created.",
      "broader": { "@id": "concept:Order" }
    }
  ]
}
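As an illustration of the conversion step, the sketch below builds a document of this shape from an in-memory list of concepts. The `SOURCE_CONCEPTS` structure and the `to_jsonld` function are hypothetical; a real adapter would read the concepts from its upstream system and might map them differently.

```python
import json

# Same custom vocabulary as the example document.
CONTEXT = {
    "ex": "https://example.com/vocab/",
    "concept": "https://example.com/concepts/",
    "label": "ex:label",
    "definition": "ex:definition",
    "broader": "ex:broader",
    "related": "ex:related",
}

# Hypothetical in-memory form of the upstream glossary.
SOURCE_CONCEPTS = [
    {"name": "Customer",
     "definition": "A person or organization that purchases goods or services."},
    {"name": "Order",
     "definition": "A commercial request issued by a customer to purchase goods or services.",
     "related": "Customer"},
    {"name": "Invoice",
     "definition": "A financial document issued to a customer after an order has been created.",
     "broader": "Order"},
]


def to_jsonld(concepts):
    """Turn a list of concept dicts into one JSON-LD document."""
    graph = []
    for c in concepts:
        node = {
            "@id": f"concept:{c['name']}",
            "@type": "ex:BusinessConcept",
            "label": c["name"],
            "definition": c["definition"],
        }
        # Relationships become node references ({"@id": ...}), not plain strings.
        for rel in ("related", "broader"):
            if rel in c:
                node[rel] = {"@id": f"concept:{c[rel]}"}
        graph.append(node)
    return {"@context": CONTEXT, "@graph": graph}


document = to_jsonld(SOURCE_CONCEPTS)
print(json.dumps(document, indent=2))
```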
Remote context loading is disabled
The JSON-LD specification supports loading contexts from external URLs (for example, "@context": "https://example.tld/context.tld").
Example of a JSON-LD document with remote context:
{
  "@context": "https://example.tld/context.tld",
  "@type": "Person",
  "prefix:name": "John Doe",
  "prefix:email": "john@example.com"
}
In the example above, the JSON-LD processor would normally make an HTTPS request to https://example.tld/context.tld to fetch the context definition.
However, to prevent outbound network calls from the cluster, remote context loading is disabled.
If your JSON-LD documents rely on specific context definitions, consider including the context inline within your documents rather than referencing remote URLs.
{
  "@context": {
    "prefix": "http://sample.tld/"
  },
  "@type": "Person",
  "prefix:name": "John Doe",
  "prefix:email": "john@example.com"
}
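An adapter can check its own output for remote context references before serving it. The following is a minimal sketch of such a check, not part of KGM; `find_remote_contexts` is a hypothetical helper that treats any string value under `@context` as a remote reference.

```python
def find_remote_contexts(node):
    """Return every string-valued @context reference in a parsed JSON-LD document."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "@context":
                # A context may be a single value or an array of values.
                entries = value if isinstance(value, list) else [value]
                found.extend(e for e in entries if isinstance(e, str))
            # Recurse to catch contexts nested deeper in the document.
            found.extend(find_remote_contexts(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_remote_contexts(item))
    return found


remote_doc = {"@context": "https://example.tld/context.tld", "@type": "Person"}
inline_doc = {"@context": {"prefix": "http://sample.tld/"}, "@type": "Person"}

print(find_remote_contexts(remote_doc))
print(find_remote_contexts(inline_doc))
```

A document for which the function returns an empty list uses only inline contexts.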
Configuring an Ingestion Adapter in KGM
Ingestion adapters are registered statically in the KGM configuration.
Example:
kgm:
  sources:
    - id: domain-glossary
      name: My Domain Glossary
      baseUrl: 'https://glossary.example.com'
      modes:
        ingest:
          enabled: true
          schedule: '0 0 12 ? * * *'
Meaning:
- KGM will call the ingestion endpoint every day at 12:00 (noon)
- The entire JSON-LD graph will be streamed and ingested, replacing any knowledge previously ingested from the same source
When a Source Adapter is unregistered (removed from the KGM configuration), all triples previously ingested from that source are automatically removed from KGM.
It's also possible to manually trigger ingestion from a Source Adapter via the KGM APIs.
Implementing a Passthrough Adapter
If the source supports native query capabilities, you may implement passthrough endpoints.
Passthrough Search
GET /v1/search/query?term=<term>&offset=<n>&limit=<n>
Requirements:
- Accepts the same query parameters as KGM's internal search
- Forwards the search to the upstream system
- Returns the source's response
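The forwarding step can be sketched as a function that maps the adapter's incoming request target to an upstream URL. This is only an illustration: `UPSTREAM_SEARCH_URL` is a hypothetical address, and the sketch assumes the upstream system accepts the same `term`, `offset`, and `limit` parameter names, which may not hold for a real source.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical upstream search endpoint.
UPSTREAM_SEARCH_URL = "https://upstream.example.com/search"


def build_upstream_search_url(request_target: str) -> str:
    """Map '/v1/search/query?term=...' to the upstream search URL."""
    query = dict(parse_qsl(urlsplit(request_target).query))
    if "term" not in query:
        raise ValueError("missing required 'term' parameter")
    params = {"term": query["term"]}
    # Paging parameters are forwarded only when the caller supplied them.
    for name in ("offset", "limit"):
        if name in query:
            params[name] = query[name]
    return f"{UPSTREAM_SEARCH_URL}?{urlencode(params)}"


print(build_upstream_search_url("/v1/search/query?term=customer&offset=0&limit=10"))
```

The adapter would then fetch this URL and return the upstream response body to KGM.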
Passthrough SPARQL
POST /v1/graph/sparql
Content-Type: application/sparql-query
Requirements:
- accept a SPARQL query
- forward it to the upstream SPARQL endpoint
- return the raw SPARQL JSON results unchanged
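A minimal sketch of the forwarding logic, using only the Python standard library. `UPSTREAM_SPARQL_URL` is a hypothetical address, and the sketch assumes the upstream endpoint accepts a query posted directly as `application/sparql-query` and can return `application/sparql-results+json`.

```python
import urllib.request

# Hypothetical upstream SPARQL endpoint.
UPSTREAM_SPARQL_URL = "https://upstream.example.com/sparql"


def build_sparql_forward(query: str) -> urllib.request.Request:
    """Build the upstream request that carries the query text unchanged."""
    return urllib.request.Request(
        UPSTREAM_SPARQL_URL,
        data=query.encode("utf-8"),
        headers={
            "Content-Type": "application/sparql-query",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def forward_sparql(query: str) -> bytes:
    """Send the query upstream and return the raw response bytes as-is."""
    with urllib.request.urlopen(build_sparql_forward(query), timeout=30) as resp:
        return resp.read()
```

Returning the bytes from `forward_sparql` without parsing them keeps the results unchanged.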
Configuring a Passthrough Adapter in KGM
Passthrough mode is also configured statically.
kgm:
  sources:
    - id: domain-glossary
      name: My Domain Glossary
      baseUrl: 'https://glossary.example.com'
      passthrough:
        enabled: true
In this example:
- KGM will not ingest anything from this source
- All search and SPARQL queries for this source will be forwarded to the upstream system
- When KGM receives GET /passthrough/domain-glossary/v1/search/query?term=customer, it forwards it to GET https://glossary.example.com/v1/search/query?term=customer
- When KGM receives POST /passthrough/domain-glossary/v1/graph/sparql, it forwards it to POST https://glossary.example.com/v1/graph/sparql
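The URL mapping described in this list can be illustrated with a short function. It is a sketch of the mapping only, not KGM's actual implementation; the `SOURCES` table and the `resolve_passthrough_target` name are assumptions made for the example.

```python
# Hypothetical registry derived from the configuration: source id -> baseUrl.
SOURCES = {"domain-glossary": "https://glossary.example.com"}

PREFIX = "/passthrough/"


def resolve_passthrough_target(path_and_query: str) -> str:
    """Map '/passthrough/<source-id>/<rest>' to '<baseUrl>/<rest>'."""
    if not path_and_query.startswith(PREFIX):
        raise ValueError("not a passthrough request")
    source_id, sep, rest = path_and_query[len(PREFIX):].partition("/")
    if not sep or source_id not in SOURCES:
        raise ValueError(f"unknown source or missing path: {source_id!r}")
    # The remainder of the path and the query string are kept unchanged.
    return f"{SOURCES[source_id].rstrip('/')}/{rest}"


print(resolve_passthrough_target(
    "/passthrough/domain-glossary/v1/search/query?term=customer"))
```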
Refer to the KGM API reference for additional info on the passthrough endpoints.