Introduction
In my last article about the Cumulocity IoT Domain Model, I described how you can design a good model and shared some best practices for storing device metadata and transactional device data.
In this article, I'll focus on additional aspects of the domain model: storing application configuration, using audit data, and why you should always keep in mind how data will be retrieved and queried when you design a domain model.
Let’s get started with the application data!
Author remark: The guidance provided in this article is based on the project experience of @Tobias_Sommer and @Stefan_Witschel (me). Kudos go to Tobias, who collected & shared this with me.
How to store application data
When you start developing applications on top of Cumulocity IoT you will come to the point where you need to persist certain information from your application. This can be information that exists only once for your application (e.g. general configurations), this can be sensitive data (e.g. access credentials for 3rd party systems) or also user-specific metadata that the application needs to store per user.
You will have different options to store such information and each option has certain advantages and disadvantages so let us go through them.
Storing the data in a managed object in the inventory
Using tenant options which is a simple key-value store.
Using the tenant object
1. Storing application data in inventory
The inventory of a tenant is the most generic collection and you can store any JSON structure in it. It is especially well suited to more complex metadata, as you can create a single JSON object that holds a lot of information. The Cockpit application, for example, stores the configuration and layout of a dashboard in the inventory. But be aware that in most tenants there will be users who can access the whole inventory, at least via the API. So if you want to store sensitive data, you might want to encrypt the value and save the encrypted string in the JSON object. You should also be aware that many other things, like assets and devices, are stored in the inventory, so you need an easy and fast way to query your configuration, e.g. using the identity API with a unique identifier or using a common type.
Here is an example of Dynamic MQTT Mapper using two kinds of managed objects.
One is to store application information about the MQTT Broker connection status and MQTT Broker SYS stats in a managed object. Here an external ID in the identity API is used with the key MQTT_MAPPING_SERVICE. When the microservice starts up, it first attempts to retrieve an existing object. If this retrieval fails because the object doesn't exist, it creates a new object.
The second kind of managed object is the mapping, which is persisted. Here we use a common type c8y_mqttMapping to persist all mappings configured in that tenant. That way we can also retrieve them very easily by filtering on that type.
{
"owner": "me",
"creationTime": "2022-11-24T08:06:33.697Z",
"type": "c8y_mqttMapping",
"lastUpdated": "2022-11-30T14:07:30.299Z",
"id": "102364668",
"c8y_mqttMapping": {
"snoopStatus": "NONE",
"extension": null,
"templateTopicSample": "device/express/berlin_01",
"ident": "d90f2961-3fad-4c49-9b41-2778bd440ed0",
"tested": false,
"mapDeviceIdentifier": true,
"active": true,
"targetAPI": "INVENTORY",
"source": "{\"line\":\"Bus-Berlin-Rom\",\"operator\":\"EuroBus\",\"customFragment\":{\"customFragmentValue\":\"Express\"},\"capacity\":64,\"customArray\":[\"ArrayValue1\",\"ArrayValue2\"],\"customType\":\"type_International\"}",
"target": "{\"c8y_IsDevice\":{},\"name\":\"Vibration Sensor\",\"capacity\":100,\"type\":\"maker_Vibration_Sensor\"}",
"externalIdType": "c8y_Serial",
"templateTopic": "device/express/+",
"qos": "AT_LEAST_ONCE",
"substitutions": [
{
"pathSource": "_TOPIC_LEVEL_[2]",
"pathTarget": "_DEVICE_IDENT_",
"repairStrategy": "DEFAULT",
"expandArray": false
},
{
"pathSource": "line",
"pathTarget": "name",
"repairStrategy": "DEFAULT",
"expandArray": false
},
{
"pathSource": "customType",
"pathTarget": "type",
"repairStrategy": "DEFAULT",
"expandArray": false
},
{
"pathSource": "capacity",
"pathTarget": "capacity",
"repairStrategy": "DEFAULT",
"expandArray": false
}
],
"updateExistingDevice": true,
"mappingType": "JSON",
"lastUpdate": 1669817250289,
"name": "Device Mapping",
"snoopedTemplates": [],
"createNonExistingDevice": false,
"id": "102364668",
"subscriptionTopic": "device/#"
}
}
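The two access patterns described above (get-or-create via an external ID, and listing by a common type) could be sketched in Python roughly as follows. This is a minimal sketch, not the Dynamic MQTT Mapper's actual code. The `http` callable, the external ID type `c8y_Serial`, the object name and the helper names are assumptions made for illustration. The REST paths are the standard identity and inventory endpoints.

```python
from typing import Callable, Optional
from urllib.parse import quote, urlencode

# Hypothetical transport: http(method, path, body) -> (status_code, parsed_json_or_None).
# In a real service this would wrap an authenticated HTTP client.
Http = Callable[[str, str, Optional[dict]], tuple]

EXTERNAL_ID = "MQTT_MAPPING_SERVICE"  # key named in the article
EXTERNAL_ID_TYPE = "c8y_Serial"       # assumption: the article does not name the type


def get_or_create_service_object(http: Http) -> str:
    """Return the ID of the service's managed object, creating it if missing."""
    path = f"/identity/externalIds/{quote(EXTERNAL_ID_TYPE)}/{quote(EXTERNAL_ID)}"
    status, body = http("GET", path, None)
    if status == 200:
        return body["managedObject"]["id"]
    if status != 404:
        raise RuntimeError(f"unexpected status {status} for {path}")

    # Not found: create the object, then register the external ID for it.
    _, created = http("POST", "/inventory/managedObjects",
                      {"name": "MQTT Mapping Service"})  # hypothetical name
    http("POST", f"/identity/globalIds/{created['id']}/externalIds",
         {"externalId": EXTERNAL_ID, "type": EXTERNAL_ID_TYPE})
    return created["id"]


def mappings_path(page_size: int = 100) -> str:
    """Build the path that lists all mappings via the `type` parameter."""
    params = {"type": "c8y_mqttMapping", "pageSize": page_size}
    return "/inventory/managedObjects?" + urlencode(params)
```

Keeping the transport injectable makes the lookup logic easy to exercise without a tenant. A real implementation would also need to handle paging and concurrent start-ups.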
2. Storing application data in tenant options
Tenant options are a simple key-value store available on each tenant. They give you very simple and fast access to a dedicated store for metadata that belongs to the tenant. Besides the key and value, a tenant option also has a category. If you set a common application-specific category, you can retrieve all tenant options holding your metadata in a single query.
One of the most useful features of tenant options is their built-in encryption, which lets you store sensitive data like passwords without implementing your own encryption. Only service users (like those generated for microservices) can access these parameters unencrypted. Even an admin user with access to the tenant options would only get the encrypted values. You can use that feature simply by adding the prefix credentials to your key, e.g. credentials.password
Taking the Dynamic MQTT Mapper as an example again, we use tenant options in a separate category to store the credentials for connecting to the MQTT Broker:
GET {{url}}/tenant/options/mqtt.dynamic.service
{
"credentials.connection.configuration": "<<Encrypted>>",
"service.configuration": "{\"logPayload\":true,\"logSubstitution\":true}"
}
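As a rough illustration of how an application might prepare and read such options, here is a small Python sketch. The helper names and the `sensitive` flag are hypothetical. The category and keys are taken from the example above, and the payload shape (category, key, value) is the one used by POST /tenant/options.

```python
import json
from urllib.parse import quote

CATEGORY = "mqtt.dynamic.service"  # category from the example above


def option_payload(key: str, value: str, sensitive: bool = False) -> dict:
    """Build the body for POST /tenant/options.

    Keys starting with "credentials." are stored encrypted by the platform,
    so sensitive keys get that prefix if it is missing.
    """
    if sensitive and not key.startswith("credentials."):
        key = "credentials." + key
    return {"category": CATEGORY, "key": key, "value": value}


def category_path(category: str = CATEGORY) -> str:
    """Path that returns every option of one category in a single request."""
    return "/tenant/options/" + quote(category, safe="")


def parse_service_configuration(options: dict) -> dict:
    """Decode the JSON string stored under the "service.configuration" key."""
    return json.loads(options.get("service.configuration", "{}"))
```

Since option values are plain strings, structured configuration has to be serialized (here as JSON) before it is stored and parsed again after it is read.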
3. Storing application data in tenant object
The tenant object is a bit of a special location to store metadata. This object can only be written by the parent tenant and the actual tenant can only read it. It is the perfect place to store billing relevant information e.g. how many device licenses this tenant owns. Such information shouldn’t be changeable by the tenant but might be necessary to know for certain application functionality.
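A small Python sketch of this split between writer and reader is shown below. It assumes the information is kept in the tenant's customProperties fragment, and the property name deviceLicenses is purely hypothetical. The parent tenant would send the update to /tenant/tenants/{tenantId}, while the tenant itself would read its own object from /tenant/currentTenant.

```python
def update_request(tenant_id: str, device_licenses: int) -> tuple:
    """Request the parent tenant would send: (method, path, body)."""
    # "deviceLicenses" is a hypothetical property name used for illustration.
    body = {"customProperties": {"deviceLicenses": device_licenses}}
    return "PUT", f"/tenant/tenants/{tenant_id}", body


def licensed_devices(current_tenant: dict, default: int = 0) -> int:
    """Read the license count from a GET /tenant/currentTenant response."""
    props = current_tenant.get("customProperties") or {}
    return int(props.get("deviceLicenses", default))
```

An application could compare the returned number with the current device count before allowing another device to be registered.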
Using audit records
Another part of the Cumulocity IoT Domain Model is the audit records. They are mostly written by standard components, but custom components can also add audit records, mainly to track important changes.
Audit records are similar to MEAs (measurements, events and alarms, see my other article) and should be cleaned up by retention rules.
An audit record should contain:
a string activity, which is basically the title of the audit record
a string text, which is the description of the audit record
a timestamp as time, ideally in UTC time format
a managed object as source
a type string as type
an optional user
optionally one or multiple custom properties
an optional changes array that contains one or multiple changes, each with:
  newValue - the new value of a property
  previousValue - the previous value of a property, or "null" when empty
  attribute - the property that has been changed
  type - the type of the attribute
Example record:
{
"activity": "User updated",
"creationTime": "2023-06-28T15:34:09.682Z",
"changes": [
{
"newValue": "sdfsdf",
"attribute": "firstname",
"type": "java.lang.String",
"previousValue": null
}
],
"source": {
"self": "https://xxx.eu-latest.cumulocity.com/inventory/managedObjects/buguser",
"id": "buguser"
},
"type": "User",
"application": "administration",
"self": "https://xxx.eu-latest.cumulocity.com/audit/auditRecords/109942970",
"time": "2023-06-28T15:34:09.682Z",
"id": "109942970",
"text": "User buguser updated: firstname='sdfsdf'",
"user": "admin"
}
As the name suggests, audit records should mainly be used to track changes made in the system by users or applications, e.g. permissions have been changed or operations have been triggered by a user. To track the status of devices, please use events.
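A custom component could assemble such a record along the following lines. This is a minimal Python sketch: the helper names, the default value type and the decision to send only the id of the source are assumptions. The resulting dictionary would be sent as the JSON body of a POST to /audit/auditRecords.

```python
from datetime import datetime, timezone
from typing import Optional


def change(attribute: str, new_value, previous_value=None,
           value_type: str = "java.lang.String") -> dict:
    """Describe one changed property for the `changes` array."""
    return {"attribute": attribute, "type": value_type,
            "newValue": new_value, "previousValue": previous_value}


def build_audit_record(activity: str, text: str, source_id: str,
                       record_type: str, user: Optional[str] = None,
                       changes: Optional[list] = None,
                       time: Optional[datetime] = None) -> dict:
    """Build the JSON body of an audit record with a UTC timestamp."""
    when = (time or datetime.now(timezone.utc)).astimezone(timezone.utc)
    record = {
        "activity": activity,
        "text": text,
        # ISO 8601 in UTC with millisecond precision, e.g. 2023-06-28T15:34:09.682Z
        "time": when.isoformat(timespec="milliseconds").replace("+00:00", "Z"),
        "source": {"id": source_id},
        "type": record_type,
    }
    # Optional fields are only added when provided.
    if user is not None:
        record["user"] = user
    if changes:
        record["changes"] = changes
    return record
```

Called with the values from the example record, this yields the same activity, text, time, type, user and changes fields. The platform adds id, self and creationTime when the record is stored.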
Make clever use of the query language
The query language that is available to filter on the inventory API is a great tool both for UIs and applications. It is however by design not necessarily fast as it allows you to search on every parameter and of course, not every parameter is indexed in the database. When performing queries you can differentiate between a query using an index (IXSCAN) or not using any index (COLLSCAN). As the inventory will grow over time (more devices, more assets, etc.) the collection scan queries will become slower and slower and eventually it might be noticeable.
A very simple trick to mitigate that is to include parameters in the query that are indexed. Those parameters can be easily identified as direct query parameters exist for them (e.g. on inventory the type). So if you for example have a list of your shipping containers and want to filter this list on all their parameters, ensure that every shipping container object has the same type value and include this parameter in the query. Then it will automatically utilize the existing type index and quickly ignore all other objects with different types in the database.
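To make the shipping container example concrete, here is a minimal Python sketch that always puts the type condition in front of any additional filter. The type value c8y_ShippingContainer, the property name destination and the helper name are hypothetical. The $filter syntax with eq and and is the inventory query language.

```python
from urllib.parse import urlencode


def container_query_path(extra_filter: str = "",
                         container_type: str = "c8y_ShippingContainer") -> str:
    """Build an inventory query that always constrains the `type` property.

    `extra_filter` is a query-language expression such as
    "destination eq 'Hamburg'" (hypothetical property name).
    """
    conditions = f"type eq '{container_type}'"
    if extra_filter:
        conditions += f" and {extra_filter}"
    # urlencode takes care of escaping spaces, quotes and the "$" sign.
    return "/inventory/managedObjects?" + urlencode({"query": f"$filter=({conditions})"})
```

For example, `container_query_path("destination eq 'Hamburg'")` filters on a custom property while still constraining the type, so the lookup can start from the type index instead of the whole collection.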
This is mostly relevant for the inventory API and the query language, but you should also be careful when building requests that retrieve data from the other APIs.
Here are some details about the parameters which are indexed (tested with 10.16 on eu-latest):
Please note: The indexes in use can be configured at instance level. In the future we also want to handle this more dynamically, so your instance might have more (or fewer) indexed parameters.
API | Parameter |
Inventory | type |
Inventory | text |
Inventory | childAdditionId |
Inventory | childAssetId |
Inventory | fragmentType |
Inventory | childDeviceId |
Inventory | ids |
events | source |
events | type |
events | dateFrom |
events | dateTo |
events | createdFrom |
events | createdTo |
alarms | source |
alarms | dateFrom |
alarms | dateTo |
alarms | createdFrom |
alarms | createdTo |
measurements | source |
measurements | dateFrom |
measurements | dateTo |
measurements | valueFragmentType |
measurements | valueFragmentSeries |
operations | deviceId |
operations | dateFrom |
operations | dateTo |
operations | status |
operations | agentId |
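The same idea applies to the other APIs: narrow the request with parameters from the table above before anything else. The following Python sketch builds an events request restricted by source, type and a time window. The helper name and the example values are hypothetical. The parameter names are the ones listed in the table.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def _iso_utc(value: datetime) -> str:
    """Format a timezone-aware datetime as ISO 8601 in UTC."""
    return value.astimezone(timezone.utc).isoformat(timespec="seconds").replace("+00:00", "Z")


def events_path(source_id: str, event_type: str,
                date_from: datetime, date_to: datetime,
                page_size: int = 100) -> str:
    """Build an events request that only uses parameters listed as indexed."""
    params = {
        "source": source_id,
        "type": event_type,
        "dateFrom": _iso_utc(date_from),
        "dateTo": _iso_utc(date_to),
        "pageSize": page_size,
    }
    return "/event/events?" + urlencode(params)
```

Restricting the time window matters especially for events and measurements, because these collections typically grow much faster than the inventory.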
Summary
In this article I explained how application data can be stored in Cumulocity IoT, how you can make good use of audit records and how to optimize your queries.
You should now understand all the concepts of a "good" domain model and be able to avoid major pitfalls in your data design!