IoT Platform Architecture Design

Overview of Architecture Design

As the bridge between physical devices and the digital world, an IoT platform depends on its architecture to manage devices efficiently, stably, and securely. A well-designed IoT platform should provide the following basic capabilities to meet the needs of typical developers:

Multi-protocol support: The platform must be able to receive and process data from different devices using different communication protocols. Whether a device uses MQTT, CoAP, HTTP, or another protocol, the platform should be able to accept its data.

Data processing capabilities: The received data needs to be effectively parsed and converted for storage, query, statistics, and analysis. The platform should provide powerful data processing capabilities, including real-time data analysis and visualization, to help developers gain insights from the data.

Flexible alert and notification mechanisms: When data is abnormal or meets specific conditions, the platform should be able to issue alerts in a timely manner and notify relevant personnel via email, SMS, or other means. In addition, developers should be able to easily export data for further analysis or reporting as needed.

Fine-grained permission management: To ensure data security and compliance, the IoT platform needs to provide fine-grained permission management functions, including data permission management, user management, role management, and department management. This way, developers can control who can access specific data or perform specific operations.

Comprehensive data security measures: Data security is the top priority of the IoT platform. The platform needs to implement comprehensive data security measures, including data encryption, secure storage, regular backups, and rapid recovery, to ensure the security and integrity of the data.

At the architectural level, the IoT platform can be divided into the following key layers:

Communication layer: Responsible for handling data transmission between devices and the platform, ensuring that data is delivered reliably and in near real time.
Data parsing layer: Converts the received raw data into structured data, providing a foundation for subsequent processing.
Data storage layer: Provides stable and efficient storage solutions for massive device data.
User management layer: Manages user information and permissions, ensuring that users can safely access and manage the platform.
Data permission layer: Controls different users’ access to data, ensuring data security.
Data analysis layer: Provides data analysis tools to help developers extract valuable information from the data.
Notification layer: Responsible for notifying relevant personnel of alert information, including email, SMS, phone calls, etc.
Operation and maintenance layer: Responsible for monitoring, maintaining, and optimizing the platform to ensure its stable operation.

Through such an architecture design, the IoT platform can not only meet the needs of ordinary developers in device connection, data processing, and user management but also provide strong security guarantees and flexible scalability, supporting developers in building innovative IoT applications.

Communication Layer

In an IoT development platform, the communication layer is the key layer connecting devices and the platform. It handles data transmission between the two and ensures that data is delivered reliably and in near real time. A typical IoT development platform needs to support the following communication protocols:

MQTT: A lightweight publish/subscribe message transport protocol suitable for data transmission in low-bandwidth and unreliable network environments.
CoAP: A lightweight protocol based on UDP, suitable for resource-constrained IoT devices.
HTTP: A widely used application layer protocol suitable for scenarios requiring reliable transmission.
WebSocket: Provides full-duplex communication, suitable for scenarios requiring real-time data interaction.
Modbus: A communication protocol commonly used in the field of industrial automation, suitable for industrial control systems.
TCP/IP: Provides reliable data transmission, suitable for scenarios requiring high reliability.

Communication Layer - MQTT Protocol

When integrating the MQTT protocol into IoT projects, choosing the right MQTT Broker is crucial. Here are some popular MQTT Broker options and key factors to consider when choosing:

EMQ X (now branded EMQX): A high-performance, scalable MQTT Broker that supports multiple protocols and plugins, suitable for scenarios requiring high concurrency processing. Official website: https://www.emqx.io/
HiveMQ: Provides community and enterprise editions, supports MQTT 5, and has complete functions and enterprise-level features such as data persistence and cluster support. Official website: https://www.hivemq.com/
VerneMQ: A distributed MQTT Broker focused on reliability and scalability, suitable for large-scale IoT deployments. Official website: https://vernemq.com/
RabbitMQ: Although RabbitMQ itself is not an MQTT Broker, it supports the MQTT protocol through plugins and is a widely used open-source message broker that supports multiple messaging protocols. Official website: https://www.rabbitmq.com/
Mosquitto: An open-source MQTT Broker that is lightweight and easy to configure, suitable for small to medium-sized IoT projects. Official website: https://mosquitto.org/

MQTT Broker Comparison

Feature/Broker	EMQ X	HiveMQ	VerneMQ	RabbitMQ	Mosquitto
Open Source	Yes	Yes (Community Edition)	Yes	Yes	Yes
Supported Protocols	MQTT, CoAP, LwM2M, HTTP, etc.	MQTT, WebSocket, etc.	MQTT	AMQP, STOMP, MQTT, etc.	MQTT
Performance	High performance, supports 100M+ connections	High reliability, enterprise-level performance	Distributed, high availability	Multi-functional, high throughput	Lightweight, suitable for small to medium scale
Security Features	TLS/SSL, access control, data encryption	TLS/SSL, client authentication	TLS/SSL, plugin-supported security features	TLS/SSL, access control, RabbitMQ advanced security plugins	TLS/SSL, access control
Cluster Support	Supported	Supported	Supported	Supported	No native clustering (bridging only)
Management Interface	User-friendly management UI	Management console and API	CLI tools and integrated API	Management plugins and API	Command line tools and configuration files
Data Persistence	Supported	Supported	Supported	Supported through plugins	Supported
Community Activity	High	High	Medium	High	Medium
Enterprise Support	Provided	Provided	Provided	Provided (commercial offering)	Limited

Considerations when choosing an MQTT Broker:

Performance and Scalability: Ensure the Broker can handle a large number of concurrent connections and message throughput.
Security: Check if it supports TLS/SSL encryption and authentication to protect data transmission security.
Cluster Support: Evaluate whether the Broker supports distributed deployment for load balancing and high availability.
Management Tools: Whether the Broker provides easy-to-use management interfaces and monitoring tools to simplify daily operations.
Protocol Support: Confirm the MQTT protocol versions supported by the Broker, including compatibility with the latest version MQTT 5.0.

Recommendation for Enterprise-level IoT Platforms:
For enterprise-level IoT platforms seeking high performance, reliability, and professional technical support, EMQ X or HiveMQ is a strong choice. Both offer commercial editions with powerful data processing capabilities and comprehensive security features for large-scale deployments.

In IoT projects, the MQTT protocol exchanges data through its publish/subscribe topic mechanism: devices or applications subscribe to the topics they care about and receive the messages published to them. On the application side, this means developers need a fully functional MQTT client management module.

Next, consider a question: can an application create an unlimited number of MQTT clients? The answer is, of course, no. The MQTT client management module can therefore be designed around the following points.

The number of MQTT clients in a single application needs to be limited based on certain external factors. Common external factors include:
1. The memory size of the application
2. The CPU usage of the application
3. The network bandwidth usage of the application
4. The number of threads in the application
Unified management of MQTT clients is required, including creation, destruction, monitoring, exception handling, etc. Since the number of MQTT clients in a single application is limited, multi-application (multi-instance) deployment issues, multi-instance load balancing issues, and multi-instance failover issues also need to be considered.
Consider what work each MQTT client performs. If devices push data at very short intervals, running heavy tasks inside the MQTT client drives up the application's resource usage. It is therefore recommended that the client hand each message straight to a message queue and let separate consumers process it, which decouples reception from processing.
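
The first and third points can be sketched as follows. This is a minimal illustration rather than a complete module: ClientManager, Message, and the buffered channel that stands in for an external message queue are all hypothetical names, and no real MQTT library is involved.

```go
package main

import (
	"errors"
	"sync"
)

// ErrLimitReached is returned when this instance already hosts its maximum
// number of MQTT clients.
var ErrLimitReached = errors.New("mqtt client limit reached")

// Message is a hypothetical envelope for one received MQTT payload.
type Message struct {
	ClientID string
	Topic    string
	Payload  []byte
}

// ClientManager caps the number of clients in one application instance and
// hands received messages to a queue instead of processing them inline.
type ClientManager struct {
	mu      sync.Mutex
	limit   int
	clients map[string]struct{}
	Queue   chan Message // stand-in for an external message queue
}

// NewClientManager builds a manager; limit would come from a flag or config file.
func NewClientManager(limit, queueSize int) *ClientManager {
	return &ClientManager{
		limit:   limit,
		clients: make(map[string]struct{}),
		Queue:   make(chan Message, queueSize),
	}
}

// Create registers a client ID and refuses once the limit is reached.
func (m *ClientManager) Create(id string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, exists := m.clients[id]; exists {
		return nil // already registered, nothing to do
	}
	if len(m.clients) >= m.limit {
		return ErrLimitReached
	}
	m.clients[id] = struct{}{}
	return nil
}

// Destroy removes a client and frees its slot.
func (m *ClientManager) Destroy(id string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.clients, id)
}

// Count reports how many clients are currently registered.
func (m *ClientManager) Count() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return len(m.clients)
}

// OnMessage is what a real client's message callback would call. It only
// enqueues, and reports false when the queue is full and the message is dropped.
func (m *ClientManager) OnMessage(msg Message) bool {
	select {
	case m.Queue <- msg:
		return true
	default:
		return false
	}
}
```

Dropping messages when the queue is full is only one possible policy; blocking or spilling to disk may fit better depending on how much data loss the project can tolerate.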

Next, we will introduce an MQTT client management solution based on Redis. The detailed operations are as follows:

Create two Redis keys of the list type, one named use and the other named no_use. The former holds the MQTT clients that are in use, the latter those that are not. Each entry in these keys carries the following basic information:
1. Current MQTT client ID
2. MQTT Broker IP address
3. MQTT Broker port
4. Authentication information (account password or others)
5. Topics to subscribe to
The application sets the maximum number of MQTT clients through command line, configuration files, etc., when starting.
External programs interact with the running applications by calling their APIs (requests can be distributed through Nginx). Each request carries the MQTT client ID, MQTT Broker IP address, MQTT Broker port, authentication information (account and password or other credentials), and the topics to subscribe to.
Any one of the started applications performs the following operations after receiving the request:
1. Select an application for processing through a load balancing algorithm; here the simplest strategy is used, picking the application that currently hosts the fewest MQTT clients.
2. Forward the request to the selected application to complete the creation of the MQTT client.
3. The selected application needs to establish a mapping table to maintain the relationship between the application and the MQTT client.
4. Store the successfully created MQTT client in the use key of Redis.
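
The four steps above can be sketched as follows. This is an in-memory illustration under stated assumptions: Registry and ClientInfo are hypothetical names, plain Go maps stand in for the Redis use and no_use keys and for the mapping table, and the selection step is read as "choose the application hosting the fewest clients".

```go
package main

import "sort"

// ClientInfo mirrors the fields stored for one MQTT client.
type ClientInfo struct {
	ClientID string
	BrokerIP string
	Port     int
	Username string
	Password string
	Topics   []string
}

// Registry is an in-memory stand-in for the Redis state: Use and NoUse play
// the role of the "use" and "no_use" keys, and Owner is the mapping table
// between applications and the clients they host.
type Registry struct {
	Use   map[string]ClientInfo
	NoUse map[string]ClientInfo
	Owner map[string][]string // application ID -> client IDs
}

// NewRegistry creates an empty registry that knows the given applications.
func NewRegistry(apps ...string) *Registry {
	r := &Registry{
		Use:   make(map[string]ClientInfo),
		NoUse: make(map[string]ClientInfo),
		Owner: make(map[string][]string),
	}
	for _, app := range apps {
		r.Owner[app] = nil
	}
	return r
}

// LeastLoaded returns the application hosting the fewest clients. Ties are
// broken by application ID so the choice is deterministic.
func (r *Registry) LeastLoaded() (string, bool) {
	ids := make([]string, 0, len(r.Owner))
	for id := range r.Owner {
		ids = append(ids, id)
	}
	if len(ids) == 0 {
		return "", false
	}
	sort.Strings(ids)
	best := ids[0]
	for _, id := range ids[1:] {
		if len(r.Owner[id]) < len(r.Owner[best]) {
			best = id
		}
	}
	return best, true
}

// Assign picks an application, records the ownership, and marks the client
// as used. It returns the chosen application ID.
func (r *Registry) Assign(c ClientInfo) (string, bool) {
	app, ok := r.LeastLoaded()
	if !ok {
		return "", false
	}
	r.Owner[app] = append(r.Owner[app], c.ClientID)
	delete(r.NoUse, c.ClientID)
	r.Use[c.ClientID] = c
	return app, true
}
```

In a real deployment the forwarding in step 2 would be an HTTP or RPC call to the chosen application, and the map updates would become Redis commands executed atomically.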

This process keeps MQTT client management simple, but it has a gap: when an application crashes, every MQTT client it hosted stops working, and nothing moves those clients to the no_use key or restarts them elsewhere. A failover mechanism is therefore needed. The following is one feasible solution:

Each application should periodically write its own liveness information to Redis under a key with an expiration time, refreshing it before it expires.
Each application should also watch for the expiration of these liveness keys (for example through Redis keyspace notifications, if they are enabled). When a key expires, the corresponding application is considered to have crashed. The process after the crash is as follows:
1. Find the MQTT clients associated with the crashed application through the application-to-client mapping, and transfer them to the no_use key.
2. Select an application for processing through a load balancing algorithm; here the simplest strategy is used, picking the application that currently hosts the fewest MQTT clients.
3. Transfer the associated MQTT clients of the crashed application to the selected application.
4. The selected application needs to establish a mapping table to maintain the relationship between the application and the MQTT clients.
5. Store the successfully created MQTT clients in the use key of Redis.
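
The failover steps can be sketched as follows. This is a simplified, single-process illustration: Cluster is a hypothetical type whose maps stand in for the Redis liveness keys and the mapping table, times are plain Unix seconds passed in by the caller, and a real system would also need a lock so that only one surviving application performs the reassignment.

```go
package main

import "sort"

// Cluster is an in-memory stand-in for the Redis state: Heartbeat holds the
// expiry (Unix seconds) of each application's liveness key, Owner maps
// applications to the MQTT client IDs they host, and NoUse holds clients
// waiting for a new host.
type Cluster struct {
	Heartbeat map[string]int64
	Owner     map[string][]string
	NoUse     []string
}

// NewCluster returns an empty cluster state.
func NewCluster() *Cluster {
	return &Cluster{
		Heartbeat: make(map[string]int64),
		Owner:     make(map[string][]string),
	}
}

// Beat refreshes an application's liveness key with a new TTL in seconds.
func (c *Cluster) Beat(app string, now, ttl int64) {
	c.Heartbeat[app] = now + ttl
	if _, ok := c.Owner[app]; !ok {
		c.Owner[app] = nil
	}
}

// leastLoaded returns the live application hosting the fewest clients.
func (c *Cluster) leastLoaded() (string, bool) {
	apps := make([]string, 0, len(c.Heartbeat))
	for app := range c.Heartbeat {
		apps = append(apps, app)
	}
	if len(apps) == 0 {
		return "", false
	}
	sort.Strings(apps)
	best := apps[0]
	for _, app := range apps[1:] {
		if len(c.Owner[app]) < len(c.Owner[best]) {
			best = app
		}
	}
	return best, true
}

// Failover treats every application whose key expired before now as crashed,
// parks its clients in NoUse, then hands each one to a surviving application.
// It returns the number of clients that were reassigned.
func (c *Cluster) Failover(now int64) int {
	for app, expiry := range c.Heartbeat {
		if now > expiry {
			c.NoUse = append(c.NoUse, c.Owner[app]...)
			delete(c.Owner, app)
			delete(c.Heartbeat, app)
		}
	}
	moved := 0
	for len(c.NoUse) > 0 {
		target, ok := c.leastLoaded()
		if !ok {
			break // no survivor: clients stay in NoUse
		}
		id := c.NoUse[0]
		c.NoUse = c.NoUse[1:]
		c.Owner[target] = append(c.Owner[target], id)
		moved++
	}
	return moved
}
```

Clients that cannot be placed remain in NoUse, so a later call to Failover, made once a new application has started beating, can pick them up.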

With the MQTT client management scheme designed, the remaining question is which MQTT client library to use in each language. Candidates include:

Java:
- Eclipse Paho Java Client: https://github.com/eclipse/paho.mqtt.java
- HiveMQ MQTT Client: https://github.com/hivemq/hivemq-mqtt-client
Python:
- paho-mqtt: https://github.com/eclipse/paho.mqtt.python
- gmqtt: https://github.com/wialon/gmqtt
JavaScript/Node.js:
- MQTT.js: https://github.com/mqttjs/MQTT.js
- Paho MQTT JavaScript Client: https://github.com/eclipse/paho.mqtt.javascript
C#/.NET:
- MQTTnet: https://github.com/chkr1011/MQTTnet
- M2Mqtt: https://github.com/eclipse/paho.mqtt.m2mqtt
Go:
- Paho MQTT Go Client: https://github.com/eclipse/paho.mqtt.golang
- gmqtt: https://github.com/DrmagicE/gmqtt
C/C++:
- Eclipse Paho C Client: https://github.com/eclipse/paho.mqtt.c
- mosquitto-cpp: https://github.com/eclipse/mosquitto
Ruby:
- ruby-mqtt: https://github.com/njh/ruby-mqtt
- paho-mqtt: https://github.com/RubyDevInc/paho.mqtt.ruby
PHP:
- phpMQTT: https://github.com/bluerhinos/phpMQTT
- Mosquitto-PHP: https://github.com/mgdm/Mosquitto-PHP
Rust:
- rumqtt: https://github.com/bytebeamio/rumqtt
- paho-mqtt-rust: https://github.com/eclipse/paho.mqtt.rust
Swift:
- CocoaMQTT: https://github.com/emqx/CocoaMQTT
- MQTT-Client-Framework: https://github.com/novastone-media/MQTT-Client-Framework

Communication Layer - CoAP Protocol

When integrating CoAP (Constrained Application Protocol) into IoT projects, choosing the right CoAP library implementation is key. Here are some popular CoAP library options:

Californium: An open-source CoAP implementation developed by the Eclipse Foundation, written in Java, offering rich features and good scalability. Suitable for projects requiring high performance and reliability. Official website: https://www.eclipse.org/californium/
Libcoap: A lightweight C language CoAP implementation, suitable for resource-constrained embedded devices. It provides the core functions of the CoAP protocol and is easy to integrate into existing projects. Official website: https://libcoap.net/
Go-CoAP: A CoAP library written in Go, supporting both client and server functions. It provides a concise API, suitable for building high-performance CoAP applications. Official website: https://github.com/plgd-dev/go-coap

A CoAP service is a long-running service that supports both a request-response model and an observer model. If using the request-response model, it can be designed as follows:

Use an open-source CoAP library to implement the CoAP server.
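Create multiple CoAP servers and put a reverse proxy in front of them for load balancing, providing a unified CoAP service externally. Because CoAP normally runs over UDP, the proxy must support UDP load balancing (for example, Nginx through its stream module).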
The device side sends data through the CoAP client.

If using the observer model, its specific implementation can refer to the MQTT client management scheme.

Communication Layer - HTTP Protocol

Although HTTP (Hypertext Transfer Protocol) was originally designed for web applications, it is also widely used in IoT projects, especially on devices that are not heavily resource-constrained. Choosing the right web framework for the HTTP service is crucial for building efficient IoT applications. Commonly used frameworks include:

Django: A high-level web framework based on Python that provides a complete development stack. Its “batteries included” philosophy and powerful ORM make it very suitable for quickly developing complex IoT backend systems. Official website: https://www.djangoproject.com/
Flask: A lightweight Python web framework with high flexibility, suitable for building small to medium-sized IoT applications. Its simple API and rich extension ecosystem make the development process more efficient. Official website: https://flask.palletsprojects.com/
Express.js: A web application framework based on Node.js, known for its simplicity and flexibility. It is particularly suitable for building fast, lightweight APIs, making it ideal for data processing and device communication in IoT projects. Official website: https://expressjs.com/
Spring Boot: A powerful framework based on Java that provides the ability to quickly build standalone, production-grade Spring applications. Its auto-configuration features and rich ecosystem make it an ideal choice for building IoT backend services. Official website: https://spring.io/projects/spring-boot
Gin: A lightweight web framework written in Go, known for its high performance. Gin’s speed and low memory footprint make it well suited to building high-concurrency IoT API services. Official website: https://gin-gonic.com/

Web applications are long-running services that follow a request-response model and can be designed as follows:

Use open-source web application development libraries to implement web application services.
Create multiple web application services and use a reverse proxy tool (Nginx) to achieve load balancing, providing unified web services externally.
The device side sends data through the HTTP client.

Communication Layer - WebSocket Protocol

WebSocket is a protocol that provides full-duplex communication over a single TCP connection, making it particularly suitable for IoT applications that require real-time data exchange. Unlike the request-response model of HTTP, WebSocket allows the server to push data to the client proactively, which is very useful in IoT scenarios. Here are some popular WebSocket libraries and frameworks:

Socket.IO: A powerful JavaScript library that supports real-time, bidirectional, and event-based communication. It can automatically downgrade to other transport methods when WebSocket is not available. Official website: https://socket.io/
ws: A simple, efficient, and easy-to-use pure WebSocket implementation suitable for the Node.js environment. It is the foundation of many other WebSocket libraries. Official website: https://github.com/websockets/ws
Spring WebSocket: Part of the Spring Framework, providing WebSocket support for Java applications and seamlessly integrating with other Spring components. Official website: https://docs.spring.io/spring-framework/docs/current/reference/html/web.html#websocket

The specific implementation of WebSocket applications can refer to the following process:

Use the selected WebSocket library to implement the WebSocket service.
Create multiple WebSocket service instances and use a load balancer (such as HAProxy) to achieve load balancing.
Implement simple connection management:
- Perform regular cleanup tasks to remove long-inactive connections.
- Implement a basic heartbeat mechanism, but complex connection pool management is not required.
Design the message processing flow, including message parsing, validation, and routing.
Implement a security authentication mechanism to ensure that only authorized devices can establish WebSocket connections.
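
The connection management step can be sketched as follows. This is a minimal illustration of idle-connection cleanup only: ConnRegistry is a hypothetical type keyed by a connection or device ID, times are Unix seconds supplied by the caller, and a real server would also hold the connection handle of whichever WebSocket library is in use.

```go
package main

import (
	"sort"
	"sync"
)

// ConnRegistry tracks when each connection was last heard from.
type ConnRegistry struct {
	mu       sync.Mutex
	lastSeen map[string]int64 // Unix seconds of the last message or heartbeat
}

// NewConnRegistry returns an empty registry.
func NewConnRegistry() *ConnRegistry {
	return &ConnRegistry{lastSeen: make(map[string]int64)}
}

// Touch records activity; call it for every message and heartbeat reply.
func (r *ConnRegistry) Touch(id string, now int64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.lastSeen[id] = now
}

// Sweep removes connections idle for longer than maxIdle seconds and returns
// their IDs (sorted) so the caller can close the underlying sockets.
func (r *ConnRegistry) Sweep(now, maxIdle int64) []string {
	r.mu.Lock()
	defer r.mu.Unlock()
	var removed []string
	for id, seen := range r.lastSeen {
		if now-seen > maxIdle {
			removed = append(removed, id)
			delete(r.lastSeen, id)
		}
	}
	sort.Strings(removed)
	return removed
}

// Len reports how many connections are currently tracked.
func (r *ConnRegistry) Len() int {
	r.mu.Lock()
	defer r.mu.Unlock()
	return len(r.lastSeen)
}
```

Sweep would typically run from a periodic timer, with maxIdle set to a small multiple of the heartbeat interval so that one missed heartbeat does not drop a healthy device.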

Communication Layer - TCP/IP Protocol

TCP/IP is the fundamental protocol suite of the Internet and also plays an important role in IoT platforms. Although many higher-level protocols (such as HTTP and MQTT) are built on TCP/IP, it is sometimes necessary to communicate over TCP/IP directly. Common libraries and stacks for TCP/IP development include:

Boost.Asio: A cross-platform C++ library that provides support for the TCP/IP protocol. Official website: https://www.boost.org/doc/libs/release/libs/asio/
Netty: An asynchronous event-driven network application framework based on Java that supports the TCP/IP protocol. Official website: https://netty.io/
lwIP: A lightweight open-source TCP/IP protocol stack suitable for embedded systems. Official website: https://savannah.nongnu.org/projects/lwip/
libuv: A cross-platform asynchronous I/O library that supports the TCP/IP protocol, commonly used in Node.js. Official website: https://libuv.org/

When using the TCP/IP protocol directly for communication in IoT platforms, the following points need to be noted:

Connection management: Implement the establishment, maintenance, and closure of connections to ensure the stability and reliability of the connection.
Data transmission: Design efficient data transmission mechanisms to ensure data integrity and timeliness.
Security: Implement data encryption and identity authentication to prevent data leakage and unauthorized access.
Compatibility: Ensure compatibility with different devices and operating systems, and support various network environments.
Developer skill: Because TCP/IP is a low-level protocol, working with it directly requires strong network programming capabilities.

By using the TCP/IP protocol reasonably, efficient and reliable IoT communication can be achieved to meet the needs of various application scenarios.

Communication Layer - Security Authentication

The previous sections introduced the protocols used in the communication layer. Here, we will focus on security authentication in the communication layer. In the MQTT protocol, the implementation methods of security authentication include:

Username and password authentication: Assign a unique username and password to each device, and authenticate during connection establishment. Once authenticated, the device can transmit data; otherwise, the broker rejects the connection.
Certificate authentication: Use SSL/TLS certificates for mutual authentication to ensure the security and integrity of data transmission. Both the device and the server need to hold valid certificates and authenticate each other during connection establishment.

For other protocols, such as CoAP, HTTP, WebSocket, etc., developers need to implement the authentication module themselves. Common authentication modes include:

Account and password authentication: Assign an account and password to each device, and authenticate after the connection is established. Once authenticated, data transmission is allowed; otherwise, the connection is closed.
Token authentication: The device obtains a token through account and password during the first connection and uses the token for authentication in subsequent connections. The token has a validity period and is updated regularly to ensure security.
OAuth authentication: The device obtains an access token from an OAuth server and presents that token when transmitting data. This suits scenarios that require third-party authorization; it is more common in web applications than in IoT platforms.
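
Token authentication can be sketched as follows. This is one possible token format, not a standard: IssueToken and VerifyToken are hypothetical helpers that sign "deviceID.expiry" with HMAC-SHA256 from the Go standard library, times are Unix seconds supplied by the caller, and device IDs are assumed not to contain a dot.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"errors"
	"strconv"
	"strings"
)

// ErrInvalidToken covers malformed, tampered, and expired tokens alike.
var ErrInvalidToken = errors.New("invalid or expired token")

// sign returns the base64url-encoded HMAC-SHA256 of payload under secret.
func sign(secret []byte, payload string) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(payload))
	return base64.RawURLEncoding.EncodeToString(mac.Sum(nil))
}

// IssueToken builds "deviceID.expiry.signature". The caller is expected to
// have checked the device's account and password before issuing a token.
func IssueToken(secret []byte, deviceID string, now, ttl int64) string {
	payload := deviceID + "." + strconv.FormatInt(now+ttl, 10)
	return payload + "." + sign(secret, payload)
}

// VerifyToken checks the signature and the expiry, then returns the device ID.
func VerifyToken(secret []byte, token string, now int64) (string, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return "", ErrInvalidToken
	}
	payload := parts[0] + "." + parts[1]
	// hmac.Equal compares in constant time to avoid leaking the signature.
	if !hmac.Equal([]byte(sign(secret, payload)), []byte(parts[2])) {
		return "", ErrInvalidToken
	}
	expiry, err := strconv.ParseInt(parts[1], 10, 64)
	if err != nil || now >= expiry {
		return "", ErrInvalidToken
	}
	return parts[0], nil
}
```

A device would request a fresh token shortly before the old one expires. Projects that need interoperability may prefer an established format such as JWT over a custom one.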

By designing and implementing security authentication mechanisms reasonably, the data security and communication reliability of the IoT platform can be effectively ensured.

Data Parsing Layer

The main function of the data parsing layer is to parse the data received over the various communication protocols and convert it into a human-readable format or a format that conforms to the system design. Parsing rules are usually written in scripting languages such as JavaScript, Lua, and Python. To keep parsing efficient and consistent, the program design follows two principles:

Unified design of return results: Ensure that all parsers return data in a consistent format for subsequent processing and analysis.
Abstract design of executors for different scripting languages: Provide a common interface that supports parsers in multiple scripting languages, making it easy to extend and maintain.

The unified return structure is defined as follows:

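type DataRowList struct {
	Time               int64     `json:"time"`                // Unix timestamp in seconds
	DeviceUid          string    `json:"device_uid"`          // unique device code on the server side
	IdentificationCode string    `json:"identification_code"` // physical device identification code
	DataRows           []DataRow `json:"data"`                // parsed signal rows
	Nc                 string    `json:"nc"`                  // raw data before parsing
}

type DataRow struct {
	Name  string `json:"name"`  // signal key
	Value string `json:"value"` // signal value
}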

The above program defines a structure named DataRowList and a structure named DataRow for unifying the return results of the data parsing layer.

The DataRowList structure contains the following fields:
- Time: A Unix timestamp in seconds indicating the time of the data.
- DeviceUid: The unique code of the device, which is the unique identifier on the software server side.
- IdentificationCode: The physical device identification code, used to further identify the device. This field matters most when the MQTT client subscribes with wildcards, because a single subscription then receives data from many devices.
- DataRows: A slice of DataRow structures containing multiple data rows.
- Nc: A string field holding the raw data, that is, the data before parsing.
The DataRow structure contains the following fields:
- Name: The signal key, indicating the name of the data.
- Value: The signal value, indicating the specific value of the data.

With these two structures, every parser returns data in the same format, which simplifies subsequent processing and analysis. The executor abstraction complements this: each scripting language runtime sits behind a common interface, so new languages can be added without changing the callers.
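
The executor abstraction can be sketched as follows. This is an illustration under stated assumptions: Executor, kvExecutor, and the Parse dispatcher are hypothetical names, the two structs are repeated so the example compiles on its own, and a small built-in parser for "key=value;key=value" payloads stands in for a JavaScript, Lua, or Python runtime.

```go
package main

import (
	"errors"
	"strings"
)

type DataRow struct {
	Name  string `json:"name"`
	Value string `json:"value"`
}

type DataRowList struct {
	Time               int64     `json:"time"`
	DeviceUid          string    `json:"device_uid"`
	IdentificationCode string    `json:"identification_code"`
	DataRows           []DataRow `json:"data"`
	Nc                 string    `json:"nc"`
}

// Executor is the common interface every script runtime would implement.
type Executor interface {
	Parse(deviceUID, raw string, now int64) (DataRowList, error)
}

// kvExecutor parses "key=value;key=value" payloads in plain Go. It only
// stands in for a real scripting runtime.
type kvExecutor struct{}

func (kvExecutor) Parse(deviceUID, raw string, now int64) (DataRowList, error) {
	out := DataRowList{Time: now, DeviceUid: deviceUID, Nc: raw}
	for _, pair := range strings.Split(raw, ";") {
		if pair == "" {
			continue // tolerate a trailing separator
		}
		name, value, ok := strings.Cut(pair, "=")
		if !ok {
			return DataRowList{}, errors.New("malformed pair: " + pair)
		}
		out.DataRows = append(out.DataRows, DataRow{Name: name, Value: value})
	}
	return out, nil
}

// executors maps a language tag to its runtime; new languages register here.
var executors = map[string]Executor{"kv": kvExecutor{}}

// Parse dispatches to the executor registered for lang.
func Parse(lang, deviceUID, raw string, now int64) (DataRowList, error) {
	ex, ok := executors[lang]
	if !ok {
		return DataRowList{}, errors.New("no executor for " + lang)
	}
	return ex.Parse(deviceUID, raw, now)
}
```

Callers only ever see DataRowList, so swapping the stand-in for an embedded script engine would not affect storage or analysis code.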

Data link process:

The device transmits raw data to the IoT platform through the communication protocol.
The IoT platform parses the raw data through the parser.
The parsed data is stored.
A response confirming receipt is returned.

User Management Layer

The user management layer is a module that any enterprise-level system needs to have. Its main functions include:

User Management: Provide functions for creating, deleting, modifying, and querying users to ensure the accuracy and completeness of user information.
Role Management: Define and manage roles in the system, assign permissions to different roles, and ensure that users can perform corresponding operations according to their roles.
Permission Management: Control user access to system resources in a fine-grained manner to ensure data security and compliance.
Organizational Structure Management: Manage the organizational structure of the enterprise, including departments, teams, etc., support complex organizational hierarchy relationships, and facilitate the allocation and management of permissions.
Audit Logs: Record user operations, provide audit and tracking functions, and ensure the transparency and traceability of the system.
Single Sign-On (SSO): Integrate single sign-on functionality to simplify the user login process and improve user experience.
Multi-Factor Authentication (MFA): Provide multi-factor authentication mechanisms to enhance system security and prevent unauthorized access.
User Activity Monitoring: Monitor user activities in real-time, detect and handle abnormal behaviors in a timely manner, and ensure the safe operation of the system.

For the implementation of this part, you can refer to an open-source permission management framework such as Casbin. Casbin is a powerful and efficient open-source access control library that supports multiple access control models, such as ACL, RBAC, and ABAC. For more information, please visit the official website: https://casbin.org/

Data Analysis Layer

The data analysis layer is a key module in the IoT platform. It analyzes and processes the data in the platform to support data-driven business decisions. Common analysis functions include:

Data Interval Warning: By setting the normal range of data, when the data exceeds or falls below this range, the system will automatically send a warning notification.
Multi-Dimensional Data Joint Warning: Combine multiple data dimensions for comprehensive analysis, set complex warning conditions, and trigger warnings when specific conditions are met.
Statistical Analysis: Perform statistical analysis on historical data, generate various statistical reports and charts, and help users understand the distribution and trends of the data.
Data Visualization: Visually display data through charts, dashboards, etc., to help users quickly understand the information behind the data.

Warning Design

When designing the data analysis layer, avoid having to redeploy the platform whenever a rule changes; rules should be handled through visual configuration wherever possible. Even when complex logic is needed, it should be expressed in a scripting language rather than compiled into the platform. The basic design relationships are as follows:

Signals need to have multiple interval alarm thresholds to achieve graded alarm capabilities.
Signal data needs to be able to temporarily retain a maximum of N entries to achieve multi-dimensional data joint analysis capabilities.
Script management should be done for multi-dimensional data joint warnings, and it should be associated with signals to achieve multi-dimensional data joint analysis capabilities.

Data execution link in the data interval warning scenario:

The raw data is parsed by the parsing script to obtain the data to be analyzed.
The data enters the interval alarm message queue, where each value is checked against its alarm ranges; if it falls within one, a notification is sent.
The parsed data from the first step is also placed in the temporary storage area for later joint analysis.
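
Graded interval alarms can be sketched as follows. This is a minimal illustration: Threshold and Grade are hypothetical names, each alarm band is treated as a closed range, and level 0 is assumed to mean "no alarm".

```go
package main

// Threshold describes one alarm band: a value inside [Min, Max] raises an
// alarm of the given Level. Higher levels are more severe.
type Threshold struct {
	Level    int
	Min, Max float64
}

// Grade returns the most severe level whose band contains value, or 0 when
// the value falls in no alarm band.
func Grade(bands []Threshold, value float64) int {
	level := 0
	for _, b := range bands {
		if value >= b.Min && value <= b.Max && b.Level > level {
			level = b.Level
		}
	}
	return level
}
```

For a temperature signal with a level 1 band of 60 to 80 and a level 2 band of 80 to 150, a reading of 70 grades as 1 and a reading of 95 grades as 2, so notifications can be routed differently per level.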

The data execution link of multi-dimensional data joint warning:

The raw data is parsed by the parsing script to obtain the data to be analyzed.
Find the joint alarm conditions corresponding to the signal and determine whether the conditions are met.
If the conditions are met, a message notification will be made.
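
A joint warning over several signals can be sketched as follows. This is an illustration under stated assumptions: History is a hypothetical in-memory version of the temporary storage area that keeps at most N recent values per signal, and OverheatRule is an invented example condition written in Go, where the design above would use a managed script bound to the signals.

```go
package main

// History keeps at most n recent values per signal.
type History struct {
	n    int
	data map[string][]float64
}

// NewHistory creates a store that retains the last n values of each signal.
func NewHistory(n int) *History {
	return &History{n: n, data: make(map[string][]float64)}
}

// Add appends a value and drops the oldest ones once more than n are held.
func (h *History) Add(signal string, v float64) {
	s := append(h.data[signal], v)
	if len(s) > h.n {
		s = s[len(s)-h.n:]
	}
	h.data[signal] = s
}

// Values returns the retained values of a signal, oldest first.
func (h *History) Values(signal string) []float64 {
	return h.data[signal]
}

// Last returns the newest value of a signal, if any.
func (h *History) Last(signal string) (float64, bool) {
	s := h.data[signal]
	if len(s) == 0 {
		return 0, false
	}
	return s[len(s)-1], true
}

// Rule is a joint condition evaluated over the retained history.
type Rule func(h *History) bool

// OverheatRule is a hypothetical joint condition: temperature above 80
// while coolant flow is below 5.
func OverheatRule(h *History) bool {
	t, okT := h.Last("temperature")
	f, okF := h.Last("flow")
	return okT && okF && t > 80 && f < 5
}
```

Each time new data arrives for a signal, the rules associated with that signal are evaluated, and a notification is sent for every rule that returns true.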

Statistical Design

Next, design the statistical content. Common statistics are triggered based on scheduled tasks. The execution logic of scheduled tasks is generally as follows:

Scheduled task trigger
Query the signal values to be aggregated within the time range
Write specific statistical calculation rules
Store the statistical results

Extract some core parameters based on the above logic:

Scheduled task execution time (CRON expression): The execution time of the scheduled task is the end time of the time range
Lead time: The start time of the time range is obtained by subtracting the lead time from the execution time of the scheduled task
Calculation script: Write specific statistical calculation rules through scripting languages

In this design, the CRON expression determines when each statistical task runs. The trigger itself is implemented with expiring (delayed) messages in the message queue rather than a distributed scheduling framework such as XXL-JOB, which keeps the deployment lightweight.
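
The relationship between execution time, lead time, and the queried range can be sketched as follows. This is a minimal illustration: Window, Sample, WindowFor, and Average are hypothetical names, times are Unix seconds, the range is treated as half-open (start included, end excluded) by assumption, and Average stands in for a calculation script.

```go
package main

// Window is the time range [Start, End) in Unix seconds.
type Window struct {
	Start, End int64
}

// WindowFor derives the range for one run: the execution time is the end of
// the range, and the start is the execution time minus the lead time.
func WindowFor(execTime, leadSeconds int64) Window {
	return Window{Start: execTime - leadSeconds, End: execTime}
}

// Sample is one stored signal value.
type Sample struct {
	Time  int64
	Value float64
}

// Average is an example calculation rule. It reports false when no sample
// falls inside the window.
func Average(samples []Sample, w Window) (float64, bool) {
	sum, n := 0.0, 0
	for _, s := range samples {
		if s.Time >= w.Start && s.Time < w.End {
			sum += s.Value
			n++
		}
	}
	if n == 0 {
		return 0, false
	}
	return sum / float64(n), true
}
```

An hourly task with a lead time of 3600 seconds therefore covers exactly the hour that ended at its execution time, and consecutive runs neither overlap nor leave gaps.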

Visualization Design

The previous designs have completed the accumulation of data, and the next step is to show the relevant data content to the user. Open source components can be selected for visualization, and the specific options are:

Data Screen: It is recommended to use the open-source ECharts component for data screen display. ECharts is a powerful data visualization library that supports various chart types and interactive functions, suitable for building complex data screens. Official website: https://echarts.apache.org/
Data Dashboard: It is recommended to use the open-source Grafana component for data dashboard display. Grafana is an open-source monitoring tool that supports multiple data sources and rich visualization plugins, suitable for real-time data monitoring and analysis. Official website: https://grafana.com/
Data Report: It is recommended to use the open-source JasperReports component for data report display. JasperReports is a powerful report generation tool that supports multiple data sources and complex report designs, suitable for generating reports in various formats. Official website: https://community.jaspersoft.com/
Custom Data Screen: DataV can be used for custom data screen display. DataV is Alibaba Cloud's data visualization platform (a commercial product rather than an open-source component) that provides rich visualization components and flexible configuration options, suitable for building personalized data screens. Official website: https://datav.aliyun.com/

Data Storage Layer

This section describes the data storage layer, which uses a different store for each kind of data:

Conventional Relational Data Storage: Use MySQL or another relational database. Relational databases handle structured data well and provide strong query and transaction support, so they fit conventional data such as user information and device information.
Signal Point Data Storage: Use InfluxDB or another time-series database. Time-series databases are designed for timestamped data such as sensor readings and logs, and provide efficient writes and time-range queries.
Alarm History Data Storage: Use MongoDB. MongoDB is a NoSQL document database with a flexible document model for unstructured and semi-structured data, which fits alarm history records.
Statistical Data Storage: Use MongoDB. Its flexible schema and scalability fit statistical results of varying shape, and its query and aggregation support helps with data analysis and report generation.

When designing the data storage layer, sharding (how data is partitioned) is an important consideration. Taking InfluxDB as an example, an enterprise-oriented project can use InfluxDB's organization concept: create one organization per customer, and create separate buckets under that organization to isolate data. Buckets are usually split by number of devices, so that each bucket holds the data of a bounded group of devices and stays manageable.
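
One way to split buckets by number of devices is to give each device a sequential index within its organization and derive the bucket name from that index. The sketch below only computes names; the `signals` prefix, the default of 1000 devices per bucket, and the function name are assumptions made for illustration, and creating the buckets in InfluxDB is not shown.

```python
def bucket_for_device(device_index: int, devices_per_bucket: int = 1000,
                      prefix: str = "signals") -> str:
    """Map a device's sequential index (0-based, per organization) to a bucket name."""
    if device_index < 0:
        raise ValueError("device_index must be non-negative")
    if devices_per_bucket <= 0:
        raise ValueError("devices_per_bucket must be positive")
    shard = device_index // devices_per_bucket
    # Zero-padded shard number keeps bucket names sortable.
    return f"{prefix}_{shard:04d}"
```

With the defaults, devices 0 to 999 share `signals_0000` and device 1000 starts `signals_0001`.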

MongoDB has no concept of organization, so developers need to implement an equivalent mechanism themselves. This can usually be achieved as follows:

Use a separate database for each organization.
Within each database, use a separate collection for each kind of data.

This design isolates data effectively: the data of different customers cannot be mixed up, and each organization can be managed and scaled independently.
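
The database-per-organization layout can be sketched as a small naming helper. The `org_` prefix, the collection names, and the function name are hypothetical; the helper only derives names and does not connect to MongoDB.

```python
import re
from typing import Tuple

# Hypothetical collection names, one per kind of data stored in MongoDB.
KNOWN_KINDS = {"alarm_history", "statistics"}


def mongo_namespace(org_id: str, kind: str) -> Tuple[str, str]:
    """Return (database_name, collection_name) for one organization's data."""
    if kind not in KNOWN_KINDS:
        raise ValueError(f"unknown data kind: {kind}")
    # Replace characters that are unsafe in database names. Two different
    # org ids could collide after this step, so a real system should key
    # databases on an id it controls.
    safe_org = re.sub(r"[^A-Za-z0-9_]", "_", org_id)
    if not safe_org:
        raise ValueError("org_id must not be empty")
    return f"org_{safe_org}", kind
```

For example, alarm history for the organization `acme-01` would live in database `org_acme_01`, collection `alarm_history`.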

Notification Layer

This section will analyze the notification layer in detail. The platform’s notifications are usually divided into two categories: in-system notifications (site messages) and out-of-system notifications. The most important components of the notification layer are as follows:

Message templates, which generally use the ${} format for variable replacement.
Message channels, which use Feishu, DingTalk, WeChat, SMS, email, etc. for message notifications.

Message Templates

Message templates are the core part of the notification system, defining the format and content of notifications. By using placeholders (such as ${}), specific content can be dynamically replaced when sending notifications. Common message templates include:

Alarm Notification Template: Used to send alarm information when the device is abnormal or the data exceeds the threshold.
System Notification Template: Used to send information such as system updates and maintenance.
User Notification Template: Used to send information such as user registration and password reset.

Example templates:

Alarm Notification: Device ${device_name} had an ${alert_type} alarm at ${timestamp}, and the current value is ${current_value}.
System Notification: Dear user, the system will be maintained at ${maintenance_time}, which may affect your use. Please prepare in advance.
User Notification: Hello, your account ${username} has been successfully registered. Please click the following link to activate your account: ${activation_link}
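
Python's standard `string.Template` class uses the same ${} placeholder syntax, so the replacement step can be sketched with it directly. The template wording and variable values below are illustrative, and `render` is a hypothetical helper name.

```python
from string import Template
from typing import Mapping

ALARM_TEMPLATE = Template(
    "Device ${device_name} triggered an ${alert_type} alarm at ${timestamp}; "
    "the current value is ${current_value}."
)


def render(template: Template, variables: Mapping[str, object]) -> str:
    """Fill every ${name} placeholder; raises KeyError if a variable is missing."""
    return template.substitute(variables)


message = render(ALARM_TEMPLATE, {
    "device_name": "pump-1",
    "alert_type": "overheat",
    "timestamp": "2024-01-01 08:00:00",
    "current_value": 91.5,
})
```

Using `substitute` rather than `safe_substitute` is a deliberate choice here: a missing variable fails loudly instead of sending a notification that still contains a raw placeholder.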

Message Channels

Message channels are the transmission media of the notification system, and different message channels are suitable for different notification scenarios. Common message channels include:

Feishu: Suitable for instant communication and notifications within the enterprise.
DingTalk: Suitable for instant communication and notifications within the enterprise.
WeChat: Suitable for a wide range of user groups, supporting personal and enterprise notifications.
SMS: Suitable for important emergency notifications to ensure that users can receive them in time.
Email: Suitable for formal notifications and long text content.
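
Because each channel differs only in how a message is delivered, the channels can sit behind one small dispatch interface. The sketch below is an assumption about how this might be structured, not the platform's actual design: `Notifier` and the sender signature are hypothetical, and real senders would call the Feishu, DingTalk, WeChat, SMS, or email services.

```python
from typing import Callable, Dict, List

# A sender takes (recipient, text) and delivers the message over one channel.
Sender = Callable[[str, str], None]


class Notifier:
    def __init__(self) -> None:
        self._channels: Dict[str, Sender] = {}

    def register(self, name: str, sender: Sender) -> None:
        """Make a channel available under the given name."""
        self._channels[name] = sender

    def notify(self, channels: List[str], recipient: str, text: str) -> Dict[str, bool]:
        """Send text over each requested channel; report success per channel."""
        results: Dict[str, bool] = {}
        for name in channels:
            sender = self._channels.get(name)
            if sender is None:
                results[name] = False  # channel not configured
                continue
            try:
                sender(recipient, text)
                results[name] = True
            except Exception:
                # One failing channel must not block the others.
                results[name] = False
        return results
```

Returning a per-channel result lets the caller fall back, for instance retrying over SMS when an urgent alarm could not be delivered by email.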

Operation and Maintenance Layer

The operation and maintenance layer is an indispensable part of the IoT platform, mainly responsible for the monitoring, management, and maintenance of the platform. The design of the operation and maintenance layer needs to consider the high availability, scalability, and security of the system. The following are the key components of the operation and maintenance layer:

Monitoring System
Backup and Recovery

Monitoring System

The monitoring system is the core of the operation and maintenance layer, responsible for real-time monitoring of the platform’s operating status, and timely detection and handling of abnormal situations. Common monitoring systems include:

Prometheus: An open-source system monitoring and alerting toolkit, suitable for monitoring various metric data. Official website: https://prometheus.io/
Grafana: An open-source visualization tool that can be integrated with monitoring systems such as Prometheus, providing rich charts and dashboards. Official website: https://grafana.com/

Note: Prometheus collects metrics by scraping them from the monitored application, so the application must be configured to expose its metrics; otherwise it cannot be monitored.
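
To illustrate what the application has to expose, the sketch below renders one metric in the Prometheus text exposition format using only the standard library. The metric name `iot_messages_total` and the function names are made up for this example; in practice an official Prometheus client library would normally generate this output and serve it over HTTP.

```python
from typing import Dict, List, Tuple


def _escape_label(value: str) -> str:
    # Backslash, double quote and newline must be escaped in label values.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name: str, help_text: str, metric_type: str,
                  samples: List[Tuple[Dict[str, str], float]]) -> str:
    """Render one metric family as Prometheus text exposition lines."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            label_text = ",".join(
                f'{key}="{_escape_label(val)}"' for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{label_text}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"
```

Prometheus would then be pointed at the endpoint that serves this text and scrape it at a fixed interval.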

Backup and Recovery

Backup and recovery are important safeguards of the operation and maintenance layer, ensuring that the system can be quickly restored in the event of a failure. Common backup and recovery strategies include:

Full Backup: Regularly perform full backups of the system to ensure data integrity.
Incremental Backup: Perform incremental backups of changed data to improve backup efficiency.
Disaster Recovery: Develop a disaster recovery plan to ensure that the system can be quickly restored in the event of a major failure.

Note: All of these backup strategies have to be implemented on the application side, which may require additional development.
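
As a rough illustration of the difference between full and incremental backups, the sketch below selects files by modification time. It assumes the data to back up is a directory of files (for example database dump files); the function name is hypothetical, and copying the selected files to backup storage is not shown.

```python
from pathlib import Path
from typing import List, Optional


def files_to_back_up(root: Path, last_backup_ts: Optional[float]) -> List[Path]:
    """Select files for a backup run.

    last_backup_ts is the Unix timestamp of the previous backup.
    None means a full backup: every file under root is selected.
    Otherwise only files modified after that time are selected (incremental).
    """
    selected = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        if last_backup_ts is None or path.stat().st_mtime > last_backup_ts:
            selected.append(path)
    return sorted(selected)
```

A scheduled job could call this with `None` once a week for the full backup and with the previous run's timestamp on the other days.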

