
2023-07-30

Technological Stack for Cloud Digital Twin

  1. Cloud Platform: To deploy the machine learning model in the cloud and handle data processing, you can choose from popular cloud platforms like Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), or IBM Cloud. These platforms provide the necessary infrastructure and services for hosting applications and managing resources.
  2. Machine Learning Framework: Select a machine learning framework that supports training and retraining of models in the cloud. Popular choices include TensorFlow, PyTorch, and scikit-learn. These frameworks provide tools and libraries for building, training, and deploying machine learning models.
  3. Time Series Database: Use a time series database to store and manage the streamed time series data. InfluxDB or TimescaleDB are commonly used databases for handling time series data efficiently.
  4. Web Application Framework: For creating the web application to display the data and inference results, you can use web application frameworks such as Django, Flask (Python-based), or Node.js (JavaScript-based). These frameworks enable you to develop web applications and provide REST APIs for communication between the web app and backend services.
  5. Real-time Data Streaming: For handling the real-time streaming of time series data from sensors to the cloud, consider using technologies like Apache Kafka or Amazon Kinesis. These platforms can handle data ingestion and processing at scale (a minimal ingest-to-inference sketch follows this list).
  6. Web Sockets: To facilitate real-time updates in the web application, use WebSockets. WebSocket technology enables bidirectional communication between the server and web browsers, allowing for real-time updates of the displayed data.
  7. Visualization Library: Utilize a data visualization library like D3.js or Plotly to create interactive and informative charts and graphs to display the time series data and inference results.
  8. Authentication and Authorization: Implement user authentication and authorization mechanisms in the web application to ensure secure access to the displayed data.
  9. Containerization: Consider containerizing your application using Docker to package all the components and dependencies, making it easier to deploy and manage across different environments.
  10. Continuous Integration and Continuous Deployment (CI/CD): Implement CI/CD pipelines to automate the testing, building, and deployment of the application, ensuring smooth updates and maintenance.
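
To make the cloud pipeline concrete, here is a minimal sketch of the ingest-to-inference path: consume readings from Kafka, run them through a placeholder model, and write the results to InfluxDB. The broker address, topic, bucket, token, message schema, and the `predict` function are all hypothetical; it assumes the kafka-python and influxdb-client packages.

```python
import json

from kafka import KafkaConsumer  # kafka-python package
from influxdb_client import InfluxDBClient, Point
from influxdb_client.client.write_api import SYNCHRONOUS

# Hypothetical broker, topic, and InfluxDB credentials -- adjust to your deployment.
consumer = KafkaConsumer(
    "sensor-readings",
    bootstrap_servers="kafka:9092",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)
influx = InfluxDBClient(url="http://influxdb:8086", token="my-token", org="my-org")
write_api = influx.write_api(write_options=SYNCHRONOUS)

def predict(values):
    """Placeholder for the trained model (TensorFlow, PyTorch, scikit-learn, ...)."""
    return sum(values) / len(values)  # stand-in inference

for message in consumer:
    reading = message.value  # assumed shape: {"sensor": "s1", "values": [21.3, 21.5]}
    score = predict(reading["values"])
    point = Point("inference").tag("sensor", reading["sensor"]).field("score", float(score))
    write_api.write(bucket="digital-twin", record=point)
```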

Technological Stack for Local Digital Twin

  1. Machine Learning Framework: Select a machine learning framework that supports training and retraining of models on the PC. Popular choices include TensorFlow, PyTorch, scikit-learn, or XGBoost. These frameworks provide tools and libraries for building, training, and deploying machine learning models locally.
  2. Time Series Database: Use a time series database to store and manage the streamed time series data on the PC. InfluxDB or TimescaleDB are commonly used databases for handling time series data efficiently.
  3. Data Preprocessing: Implement data preprocessing pipelines to process and clean the time series data collected from the sensors. Tools like Pandas and NumPy in Python can be helpful for data manipulation and preparation (see the preprocessing sketch after this list).
  4. Real-time Data Streaming: For handling the real-time streaming of time series data from sensors to the PC, consider using technologies like Apache Kafka or RabbitMQ. These platforms can handle data ingestion and processing in real-time.
  5. Web Application Framework: For creating the web application to display the data and inference results, you can use web application frameworks such as Django, Flask (Python-based), or Node.js (JavaScript-based). These frameworks enable you to develop web applications and provide REST APIs for communication between the web app and backend services.
  6. Web Sockets: To facilitate real-time updates in the web application, use WebSockets. WebSocket technology enables bidirectional communication between the server and web browsers, allowing for real-time updates of the displayed data.
  7. Data Visualization Library: Utilize a data visualization library like D3.js or Plotly to create interactive and informative charts and graphs to display the time series data and inference results. Grafana is another option for dashboarding.
  8. Local Hosting: The web application and other services can be locally hosted on the PC. You can use tools like NGINX or Apache to serve the web application.
  9. Authentication and Authorization: Implement user authentication and authorization mechanisms in the web application to ensure secure access to the displayed data.
  10. Containerization (Optional): Consider containerizing your application using Docker to package all the components and dependencies, making it easier to deploy and manage.
  11. Streaming Data Processing (Optional): For more complex data processing tasks, consider using stream processing frameworks like Apache Flink or Apache Spark Streaming.
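
As a concrete example of step 3, a minimal Pandas preprocessing sketch: regularize the sampling interval, mask implausible spikes, and fill short gaps. The timestamps, values, and the 50.0 outlier threshold are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical raw sensor frame: irregular timestamps, a gap, and one spike.
raw = pd.DataFrame(
    {"value": [21.4, 21.6, np.nan, 250.0, 21.9]},
    index=pd.to_datetime([
        "2023-07-30 10:00:03", "2023-07-30 10:00:58", "2023-07-30 10:02:01",
        "2023-07-30 10:03:02", "2023-07-30 10:04:00",
    ]),
)

clean = raw.resample("1min").mean()                 # regularize the sampling interval
clean.loc[clean["value"] > 50.0, "value"] = np.nan  # mask implausible spikes (threshold is an assumption)
clean = clean.interpolate(method="time")            # fill the short gaps that remain
print(clean)
```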

Summary of Technological Stack

  1. Machine Learning Framework: TensorFlow, PyTorch, scikit-learn, or XGBoost.
  2. Time Series Database: InfluxDB or TimescaleDB.
  3. Data Preprocessing: Pandas and NumPy, or Node-RED.
  4. Real-time Data Streaming: Apache Kafka or RabbitMQ.
  5. Web Application Framework: Django, Flask (Python-based), Node.js (JavaScript-based), or Streamlit.
  6. Web Sockets: WebSockets.
  7. Data Visualization Library: D3.js, Plotly, or Grafana.
  8. Local Hosting: NGINX or Apache.
  9. Authentication and Authorization.
  10. Containerization (Optional): Docker.
  11. Streaming Data Processing (Optional): Apache Flink or Apache Spark Streaming.
  12. Continuous Integration and Continuous Deployment (CI/CD).

Streamlit

Streamlit + streaming
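
A minimal sketch of the usual Streamlit streaming pattern: draw into an `st.empty()` placeholder inside a loop so the chart redraws as new samples arrive. The random samples stand in for a real sensor or broker read.

```python
import time

import numpy as np
import pandas as pd
import streamlit as st

st.title("Live sensor feed")
placeholder = st.empty()  # container that is redrawn on every iteration
history = pd.DataFrame(columns=["value"])

for _ in range(120):
    # Stand-in for reading a real sensor or consuming from a broker.
    sample = pd.DataFrame({"value": [np.random.randn()]},
                          index=[pd.Timestamp.now(tz="UTC")])
    history = pd.concat([history, sample])
    placeholder.line_chart(history)
    time.sleep(1.0)
```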

InfluxDB + Streamlit

  • InfluxDB is a high-performance, open-source time series database. It is designed to handle large amounts of time-stamped data and enables fast storage and retrieval of time series data for real-time analytics and monitoring purposes. InfluxDB uses a flexible data model that allows storing, querying, and aggregating data based on time intervals, making it well-suited for applications that require tracking and analyzing time series data, such as IoT sensor data, monitoring systems, and financial data.
  • Telegraf is a server-based agent for collecting and sending metrics and events from databases, systems, and IoT sensors. Telegraf is written in Go, compiles into a single binary with no external dependencies, and requires a minimal memory footprint.
  • Telegraf | InfluxData
  • Sample data | InfluxDB OSS 2.7 Documentation
  • Visualize Data with Streamlit and InfluxDB | InfluxData (a minimal sketch of this pairing follows)
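
A minimal sketch of the InfluxDB + Streamlit pairing described in the last link: query a Flux window from InfluxDB into a DataFrame and chart it. The URL, token, org, bucket, and measurement names are assumptions to adapt.

```python
import streamlit as st
from influxdb_client import InfluxDBClient

# Hypothetical connection details -- match them to your InfluxDB instance.
client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")

flux = '''
from(bucket: "sensors")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "temperature")
'''
df = client.query_api().query_data_frame(flux)

st.title("Temperature, last hour")
st.line_chart(df.set_index("_time")["_value"])
```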

Grafana + InfluxDB + Node-RED

Build Low-code Operational Digital Twins with Node-RED, InfluxDB and Grafana - YouTube

Streamlit + Kafka
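
A minimal sketch of feeding a Streamlit chart from a Kafka topic, assuming the kafka-python package and JSON messages like {"value": 1.2} on a hypothetical sensor-readings topic. A production app would consume in a background thread rather than blocking the script.

```python
import json

import pandas as pd
import streamlit as st
from kafka import KafkaConsumer  # kafka-python package

st.title("Kafka-fed dashboard")
placeholder = st.empty()

# Hypothetical broker and topic names.
consumer = KafkaConsumer(
    "sensor-readings",
    bootstrap_servers="localhost:9092",
    value_deserializer=lambda m: json.loads(m.decode("utf-8")),
)

history = []
for message in consumer:
    history.append(message.value["value"])
    placeholder.line_chart(pd.DataFrame({"value": history}))
```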

Review of Digital Twin + Predictive

(41) "Predictive Digital Twins: From physics-based modeling to scientific machine learning" Prof. Willcox - YouTube Karen Willcox

Framework

Project Examples

Herrera (almeaibedDigitalTwinAnalysis2021) uses a Model Predictive Controller for greenhouse control; code is available. See also (steindlSemanticMicroserviceFramework2021).

Quix: no time series DB; Kafka under the hood

Hono - Ditto

Eclipse Hono provides remote service interfaces for connecting large numbers of IoT devices and interacting with them in a uniform way, regardless of the device communication protocol. Hono supports devices communicating via common IoT protocols such as HTTP, MQTT, and AMQP. Eclipse Ditto is a framework that supports the IoT digital twin software pattern. Ditto's capabilities include: mirroring physical assets/devices, acting as a "single source of truth" for a physical asset, providing aspects and services around devices, and keeping the real and digital worlds in sync. Apache Kafka enables building data pipelines for real-time streaming applications; it is seamlessly horizontally scalable, fault tolerant, and fast. The three main capabilities of Kafka are to: 1. publish and subscribe to streams of records, 2. store these records in a fault-tolerant way, and 3. process them as they occur. InfluxDB is a time series database that is simple to set up and scale. Grafana is an analytics and monitoring solution that provides a plugin data source model and supports many of the most popular time series databases.
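
As a small illustration of Ditto's "single source of truth" role, a sketch that creates or updates a twin through Ditto's HTTP things API and reads the mirrored state back. Host, credentials, thing ID, and the feature layout are assumptions.

```python
import requests

# Hypothetical local Ditto instance and demo credentials.
DITTO = "http://localhost:8080"
AUTH = ("ditto", "ditto")
THING_ID = "org.example:pump-01"

twin = {"features": {"temperature": {"properties": {"value": 23.5, "unit": "C"}}}}

# Create or replace the twin so the digital state mirrors the device.
requests.put(f"{DITTO}/api/2/things/{THING_ID}", json=twin, auth=AUTH).raise_for_status()

# Read the mirrored state back -- the "single source of truth".
print(requests.get(f"{DITTO}/api/2/things/{THING_ID}", auth=AUTH).json())
```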

There are many technologies that can be used to realize bidirectional communication. For example, Ditto and RabbitMQ as mentioned above.

Many different technologies already exist for tackling the communication issues, e.g. RabbitMQ (https://www.rabbitmq.com/), Ditto (https://www.eclipse.org/ditto/), Apache Kafka (https://kafka.apache.org/), and RTI (https://www.rti.com/). In our case, we chose RabbitMQ to serve as a data broker because it is lightweight and easy to deploy on premise.
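
A minimal sketch of the RabbitMQ data-broker role using the pika client: declare a durable queue and publish a sensor reading to it. Broker host and queue name are assumptions.

```python
import json

import pika  # RabbitMQ client library

# Hypothetical broker host and queue name.
connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="sensor-data", durable=True)

reading = {"sensor": "s1", "value": 23.5}
channel.basic_publish(
    exchange="",              # default exchange routes directly by queue name
    routing_key="sensor-data",
    body=json.dumps(reading),
    properties=pika.BasicProperties(delivery_mode=2),  # persist the message
)
connection.close()
```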

Regarding data storage, we used InfluxDB for its ease of use, its bindings to different programming languages, and its simple visualization facilities.

Recent advances in tools for creating visual interfaces, such as Unity (https://unity.com/), Qt (https://www.qt.io/), Grafana, Dash, Gazebo, and so on, have made it easier to create intuitive and visual interfaces for a DT. One of the challenges in DT engineering is the rapid construction of these interfaces from a PT. Such interfaces should, for example, allow the user to: 1) selectively visualize the 3D PT and its state; 2) create dashboards to plot the most important data; 3) replay past behavior of the PT; 4) spawn new what-if simulations; 5) run optimizations; 6) display predictive maintenance results.
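
Of the listed tools, Dash is the most code-first; here is a minimal sketch of capability 2 (a dashboard plotting important data), with synthetic data standing in for a twin's database query.

```python
import numpy as np
import pandas as pd
import plotly.express as px
from dash import Dash, dcc, html

# Synthetic stand-in data; a real DT dashboard would query e.g. InfluxDB here.
df = pd.DataFrame({
    "time": pd.date_range("2023-07-30", periods=60, freq="min"),
    "value": np.sin(np.linspace(0.0, 6.0, 60)),
})

app = Dash(__name__)
app.layout = html.Div([
    html.H2("Pump temperature (hypothetical)"),
    dcc.Graph(figure=px.line(df, x="time", y="value")),
])

if __name__ == "__main__":
    app.run(debug=True)
```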

Kafka vs MQTT
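
The practical difference shows up even in a toy publisher: MQTT is the lightweight device-side hop, Kafka the durable, replayable backbone. A sketch publishing the same reading both ways, assuming paho-mqtt (1.x Client API) and kafka-python against local brokers; topic names are made up.

```python
import json

import paho.mqtt.client as mqtt  # paho-mqtt, 1.x Client API
from kafka import KafkaProducer  # kafka-python

reading = json.dumps({"sensor": "s1", "value": 23.5})

# MQTT: lightweight, device-facing hop to a local broker.
mqtt_client = mqtt.Client()
mqtt_client.connect("localhost", 1883)
mqtt_client.publish("factory/s1/temperature", reading)
mqtt_client.disconnect()

# Kafka: durable, replayable backbone for downstream consumers.
producer = KafkaProducer(bootstrap_servers="localhost:9092")
producer.send("sensor-readings", reading.encode("utf-8"))
producer.flush()
```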

IMPORTANT INTERESTING PROJECT

  • kaiwaehner/hivemq-mqtt-tensorflow-kafka-realtime-iot-machine-learning-training-inference: Real Time Big Data / IoT Machine Learning (Model Training and Inference) with HiveMQ (MQTT), TensorFlow IO and Apache Kafka - no additional data store like S3, HDFS or Spark required
  • Streaming Machine Learning with Tiered Storage
  • hivemq-mqtt-tensorflow-kafka-realtime-iot-machine-learning-training-inference/python-scripts/README.md at master · kaiwaehner/hivemq-mqtt-tensorflow-kafka-realtime-iot-machine-learning-training-inference
  • How to Build a Digital Twin for Thousands of IoT Devices with Apache Kafka and MongoDB - YouTube

TRY THIS: Hello from OpenTwins | OpenTwins (roblesOpenTwinsOpensourceFramework2023); repository: ertis-research/OpenTwins

Data Connectivity

| Protocol | Purpose |
| --- | --- |
| HTTP/HTTPS | Transmit data over the Internet from applications, websites, and so on |
| WebSocket | Transmit data over the Internet from applications, websites, and so on |
| DDS | Communication bus to connect intelligent machines |
| MQTT | Lightweight protocol for collecting data and publishing it to servers and subscribers |
| AMQP | Queue system to reliably transfer messages between applications |
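
For example, the WebSocket row is what enables the push-style dashboard updates mentioned earlier; a minimal client sketch using the Python websockets package, with a hypothetical endpoint and subscription message.

```python
import asyncio

import websockets  # the 'websockets' package

async def stream_readings():
    # Hypothetical endpoint exposed by the web application's backend.
    async with websockets.connect("ws://localhost:8000/ws/sensors") as ws:
        await ws.send('{"subscribe": "temperature"}')
        async for message in ws:  # server pushes updates; no polling needed
            print("update:", message)

asyncio.run(stream_readings())
```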

Cloud Platform

| Platform | Trial |
| --- | --- |
| AWS IoT | Free evaluation |
| IBM Watson IoT for Bluemix | 1-year free trial |
| PTC ThingWorx | 30-day trial, then free Lite version |
| Microsoft Azure IoT | 30-day trial |
