Governance Service
Introduction
The Governance Service is the central repository for all data governance data and functions inside the OIH. It offers a database for long-term storage of relevant data and an API for data retrieval and validation.
It also provides functionality for displaying this data and for validating policies against provided data.
Technologies used
MongoDB: MongoDB is used as the Governance Service storage solution.
How it works
Policies
The “Policy” functions of the Governance Service are intended to let you control exactly which data objects are passed through a flow, and how, irrespective of connector logic.
To this end, the Governance Service provides an endpoint that validates policies against provided data.
You can find more details about policies here:
- OIH Policies: OIH Policies offer a way for you to control exactly which and how data objects are passed through a flow, irrespective of connector logic.
Additionally, the Governance Service manages default and user-created functions that can be used in policies.
Data Provenance
The “Data Provenance” function of the Governance Service is intended to allow users to reconstruct their data’s path through the OIH from the very first time it was synchronized up until the current moment.
This way, the data owner can track all origins and destinations of their data and see whether it has been modified inside the OIH. This makes it easier to comply with data governance policies and laws, such as the GDPR.
To this end, the Governance Service is capable of receiving metadata about certain events, such as a data object being transmitted from one application to another, and stores it as a detailed data provenance event. These events can then be retrieved, filtered, and searched using the Service’s API.
Provenance data model
The data model is based on PROV-DM, which allows for easy mapping and export of provenance data to other systems. The model describes tuples of entities, agents, and activities, plus optional situational fields, such as one agent acting on behalf of another.
Example provenance object:
{
  "entity": {
    "id": "aoveu03dv921dvo",
    "entityType": "oihUid"
  },
  "activity": {
    "activityType": "ObjectReceived",
    "used": "getPersons",
    "startedAtTime": "2020-10-19T09:47:11+00:00",
    "endedAtTime": "2020-10-19T09:47:15+00:00"
  },
  "agent": {
    "id": "w4298jb9q74z4dmjuo",
    "agentType": "Component",
    "name": "Google Connector"
  },
  "actedOnBehalfOf": [
    {
      "first": true,
      "id": "w4298jb9q74z4dmjuo",
      "agentType": "Component",
      "actedOnBehalfOf": "j460ge49qh3rusfuoh"
    },
    {
      "id": "j460ge49qh3rusfuoh",
      "agentType": "User",
      "actedOnBehalfOf": "t454rt565zz57"
    },
    {
      "id": "t454rt565zz57",
      "agentType": "Tenant"
    }
  ]
}
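The actedOnBehalfOf entries in the example form a chain from the acting component up to the tenant. The following is a minimal sketch of how a client might resolve that chain. The helper name delegation_chain is hypothetical, and the traversal rule (start at the entry flagged "first", then follow each actedOnBehalfOf id) is an assumption read off the example object, not a documented algorithm of the service.

```python
from typing import Dict, List


def delegation_chain(event: dict) -> List[str]:
    """Return the agent types from the acting agent up to the last principal.

    Assumption: the chain starts at the entry flagged "first" and each
    entry's "actedOnBehalfOf" field holds the id of the next entry.
    """
    entries = event.get("actedOnBehalfOf", [])
    by_id: Dict[str, dict] = {entry["id"]: entry for entry in entries}
    # Start at the flagged entry; fall back to the event's own agent id.
    current = next((e["id"] for e in entries if e.get("first")), event["agent"]["id"])
    chain: List[str] = []
    seen = set()
    while current in by_id and current not in seen:
        seen.add(current)  # guard against cyclic references
        entry = by_id[current]
        chain.append(entry["agentType"])
        current = entry.get("actedOnBehalfOf")
    return chain


# Reduced copy of the example provenance object shown above.
example_event = {
    "agent": {"id": "w4298jb9q74z4dmjuo", "agentType": "Component"},
    "actedOnBehalfOf": [
        {"first": True, "id": "w4298jb9q74z4dmjuo", "agentType": "Component",
         "actedOnBehalfOf": "j460ge49qh3rusfuoh"},
        {"id": "j460ge49qh3rusfuoh", "agentType": "User",
         "actedOnBehalfOf": "t454rt565zz57"},
        {"id": "t454rt565zz57", "agentType": "Tenant"},
    ],
}

print(" -> ".join(delegation_chain(example_event)))  # Component -> User -> Tenant
```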
Data visualization
The Governance Service also provides several endpoints for showing the data distribution and any flow warnings related to the governance functionality.
It can also provide an interactive HTML view that visualizes the data distribution as a graph and can be embedded.
Using the Governance Service
In addition to a running instance of the Governance Service, the flag governance: true must be set in the nodeSettings of each flow step.
Furthermore, the ID-Linking functionality of the Ferryman should be activated by setting the nodeSettings flag idLinking:true. This will also require that the Data Hub Service is running. Otherwise, it is not guaranteed that every provenance event has the required oihId.
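As an illustration of the two flags, the sketch below checks a flow definition for steps that lack them. It assumes that flow steps are listed under graph.nodes and that each step carries an optional nodeSettings object; that layout, the step ids, and the helper name governance_warnings are assumptions made for this example and should be checked against your actual flow definitions.

```python
from typing import List


def governance_warnings(flow: dict) -> List[str]:
    """List flow steps whose nodeSettings lack the governance or idLinking flag."""
    warnings: List[str] = []
    # Assumed layout: steps live under flow["graph"]["nodes"].
    for node in flow.get("graph", {}).get("nodes", []):
        node_id = node.get("id")
        settings = node.get("nodeSettings")
        if settings is None:
            warnings.append(f"{node_id}: nodeSettings missing")
            continue
        if settings.get("governance") is not True:
            warnings.append(f"{node_id}: governance is not enabled")
        if settings.get("idLinking") is not True:
            warnings.append(f"{node_id}: idLinking is not enabled")
    return warnings


# Hypothetical flow with three steps.
example_flow = {
    "graph": {
        "nodes": [
            {"id": "step_1", "nodeSettings": {"governance": True, "idLinking": True}},
            {"id": "step_2", "nodeSettings": {"governance": True}},
            {"id": "step_3"},
        ]
    }
}

for warning in governance_warnings(example_flow):
    print(warning)
```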
Governance Service API
The Governance Service offers a REST API through which the stored provenance data can be retrieved. To interact with this API, the user must supply a valid bearer token generated by the Identity Management.
List of supported Methods and Routes
endpoint | method | description | comments |
---|---|---|---|
/event | GET | Searches stored provenance events. | Based on the supplied filter criteria as detailed below |
The following query parameters can be appended to the URL to further refine the result list:
- page[size]
- page[number]
- from
- until
- filter[agent.id]
- filter[agent.agentType]
- filter[actedOnBehalfOf]
- filter[activityId]
- filter[activityType]
More details about the endpoint can be found in the Swagger documentation of the service.
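A request against /event could be assembled roughly as follows. This is a sketch under assumptions: the base URL is taken from the host of the API documentation link and may differ in your deployment, the token is a placeholder, and the filter values are borrowed from the example provenance object. The request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: host taken from the Swagger link; adjust to your deployment.
BASE_URL = "http://governance-service.openintegrationhub.com"


def build_event_request(token: str, params: dict) -> Request:
    """Build a GET request for /event with query parameters and a bearer token."""
    url = f"{BASE_URL}/event?{urlencode(params)}"
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


request = build_event_request(
    "my-token",  # placeholder; use a token issued by the Identity Management
    {
        "page[size]": 10,
        "page[number]": 1,
        "filter[agent.agentType]": "Component",
        "filter[activityType]": "ObjectReceived",
    },
)
print(request.full_url)
```

Note that urlencode percent-encodes the square brackets in the parameter names. Passing the request to urllib.request.urlopen would send it.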
Policy related endpoints
You can find more details about policies here:
- OIH Policies: OIH Policies offer a way for you to control exactly which and how data objects are passed through a flow, irrespective of connector logic.
endpoint | method | description | comments |
---|---|---|---|
/applyPolicy | POST | Applies a policy and returns the modified data and a passed flag | Based on the supplied policy and the userId |
Expects the following parameters:
Body:
- data (the data object)
- metadata (with policy)
Query:
- action
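The parameters above could be combined into a request along these lines. This is only a sketch: it assumes a JSON body with the keys data and metadata, and the base URL, the action value, and the contents of metadata (including the key name policy) are placeholders, since the policy format is described in the OIH Policies documentation rather than here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: host taken from the Swagger link; adjust to your deployment.
BASE_URL = "http://governance-service.openintegrationhub.com"


def build_apply_policy_request(token: str, action: str, data: dict, metadata: dict) -> Request:
    """Build a POST request for /applyPolicy with data and metadata in the body."""
    url = f"{BASE_URL}/applyPolicy?{urlencode({'action': action})}"
    body = json.dumps({"data": data, "metadata": metadata}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return Request(url, data=body, headers=headers, method="POST")


policy_request = build_apply_policy_request(
    "my-token",                       # placeholder bearer token
    "exampleAction",                  # hypothetical action value
    {"firstName": "Jane"},            # the data object to check
    {"policy": {}},                   # placeholder; real policy content goes here
)
print(policy_request.full_url)
```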
Each policy can use the provided default functions as well as any stored functions that have been created.
Default functions:
- equals, notEquals
- exists, notExists
- Numeric: smallerThan, biggerThan, smallerOrEqual, biggerOrEqual
- Strings: contains, notContains, hasLength
- Object: keyLength (number of keys in object)
- isType: checks if a value is of a given type
- anonymize: anonymizes the field specified by key with ‘XXXXXXXXXX’
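To give a feel for what some of these default functions compute, here is an illustrative sketch of four of them. It is not the service's implementation: the Python names, argument order, and edge-case behaviour are assumptions inferred from the function names and the short descriptions above.

```python
from typing import Any


def exists(obj: dict, key: str) -> bool:
    """True if the key is present in the object (assumed semantics)."""
    return key in obj


def has_length(value: str, length: int) -> bool:
    """True if the string has exactly the given length (assumed semantics)."""
    return len(value) == length


def key_length(obj: dict) -> int:
    """Number of keys in the object."""
    return len(obj)


def anonymize(obj: dict, key: str) -> dict:
    """Return a copy with the field specified by key replaced by 'XXXXXXXXXX'."""
    result: dict = dict(obj)
    if key in result:
        result[key] = "XXXXXXXXXX"
    return result


person: dict = {"firstName": "Jane", "email": "jane@example.com"}
masked: Any = anonymize(person, "email")
print(masked)  # {'firstName': 'Jane', 'email': 'XXXXXXXXXX'}
```

The sketch returns a copy from anonymize so the original object stays untouched; whether the service modifies data in place is not stated here.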
endpoint | method | description | comments |
---|---|---|---|
/storedFunction | POST | Adds a new stored function | Based on the userId |
/storedFunction | DELETE | Deletes a stored function | Based on functionId and the userId |
/storedFunction | GET | Gets a list of all stored functions | Based on the supplied filter criteria as detailed below and the userId |
The following query parameters can be appended to the URL to further refine the result list:
- page[size]
- page[number]
- from
- until
- filter[name]
- filter[id]
- names (‘function1,function2, … function-n’)
- sort, with one of the following values:
  - createdAt
  - -createdAt
  - updatedAt
  - -updatedAt
Dashboard related endpoints
endpoint | method | description | comments |
---|---|---|---|
/distribution | GET | Returns overview of data distribution. | Based on the supplied user credentials |
Returns an object with the total amounts of:
- retrieved
- updated
- created
- deleted
for each connector
endpoint | method | description | comments |
---|---|---|---|
/distribution/graph | GET | Returns data distribution in form of a graph | Based on the supplied user credentials |
Format:
{
  nodes: [
    {
      data: {
        id: serviceName,
        created: 0,
        updated: 0,
        retrieved: 0,
        deleted: 0,
      },
    },
  ],
  edges: [
    {
      data: {
        id: flowId,
        created: 0,
        updated: 0,
        retrieved: 0,
        deleted: 0,
        source: false,
        target: false,
      },
    },
  ]
}
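A client might summarize such a graph response per service as sketched below. The helper name node_totals and the sample values are hypothetical. The code accepts node objects with or without a data wrapper, because the exact nesting should be confirmed against the Swagger documentation.

```python
from typing import Dict

COUNTERS = ("created", "updated", "retrieved", "deleted")


def node_totals(graph: dict) -> Dict[str, int]:
    """Sum the four counters for every node in a /distribution/graph response."""
    totals: Dict[str, int] = {}
    for node in graph.get("nodes", []):
        # Accept both {"data": {...}} and flat node objects.
        fields = node.get("data", node)
        totals[fields["id"]] = sum(fields.get(name, 0) for name in COUNTERS)
    return totals


# Hypothetical response with two services.
sample_graph = {
    "nodes": [
        {"data": {"id": "Google Connector", "created": 2, "updated": 1,
                  "retrieved": 5, "deleted": 0}},
        {"data": {"id": "Another Connector", "created": 0, "updated": 3,
                  "retrieved": 0, "deleted": 1}},
    ],
    "edges": [],
}

print(node_totals(sample_graph))  # {'Google Connector': 8, 'Another Connector': 4}
```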
endpoint | method | description | comments |
---|---|---|---|
/distribution/graph/html | GET | Delivers an HTML page that renders the graph of the data distribution | Also dynamically shows additional data. Based on the supplied user credentials |
/objectStatus/:id | GET | Returns oihUid and all recordUids of a given object | Based on the supplied user credentials |
The following parameter is required in the URL:
oihUid or recordUid
endpoint | method | description | comments |
---|---|---|---|
/objectStatus/:id | GET | Returns a list of current warnings and advisories regarding the flow configuration | Based on the supplied user credentials |
Currently warns about nonexistent nodeSettings and missing governance flags in flows.
endpoint | method | description | comments |
---|---|---|---|
/objectStatus/:id | GET | Returns a combined list of object distribution overview and flow warnings | Based on the supplied user credentials |
REST-API documentation
Visit http://governance-service.openintegrationhub.com/api-docs/ to view the Swagger API documentation.
Interaction with other Services
- Ferryman: The Governance Service receives provenance events emitted by the Ferryman module running on top of each connector. The Ferryman also calls the Governance Service to check any provided policies.
- Data Hub: Optionally, the Ferryman sends the recordId that the connector provides for an entry to the Data Hub, which links it to a single OihId.
- Identity Management: The Governance Service API endpoints rely on a bearer token supplied by the Identity Management to determine which integration flows the current user may see, and which actions they may take.