On this page

2023-07-30
Technological Stack for Cloud Digital Twin
Technological Stack for Local Digital Twin
Resume of Technological Stack
Streamlit
InfluxDB + Streamlit
Grafana + InfluxDB + Node-RED
Streamlit + Kafka
Review of Digital Twin + Predictive
Framework
Project Examples
Quix: no TimeDB, Kafka under the Hood
Hono - Ditto
Kafka vs MQTT
IMPORTANT INTERESTING PROJECT
Data Connectivity
Cloud Platform

2023-07-30

On this page

2023-07-30
Technological Stack for Cloud Digital Twin
Technological Stack for Local Digital Twin
Resume of Technological Stack
Streamlit
InfluxDB + Streamlit
Grafana + InfluxDB + Node-RED
Streamlit + Kafka
Review of Digital Twin + Predictive
Framework
Project Examples
Quix: no TimeDB, Kafka under the Hood
Hono - Ditto
Kafka vs MQTT
IMPORTANT INTERESTING PROJECT
Data Connectivity
Cloud Platform

2023-07-30

Technological Stack for Cloud Digital Twin

Cloud Platform: To deploy the machine learning model in the cloud and handle data processing, you can choose from popular cloud platforms like Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform (GCP), or IBM Cloud. These platforms provide the necessary infrastructure and services for hosting applications and managing resources.
Machine Learning Framework: Select a machine learning framework that supports training and retraining of models in the cloud. Popular choices include TensorFlow, PyTorch, and scikit-learn. These frameworks provide tools and libraries for building, training, and deploying machine learning models.
Time Series Database: Use a time series database to store and manage the streamed time series data. InfluxDB or TimescaleDB are commonly used databases for handling time series data efficiently.
Web Application Framework: For creating the web application to display the data and inference results, you can use web application frameworks such as Django, Flask (Python-based), or Node.js (JavaScript-based). These frameworks enable you to develop web applications and provide REST APIs for communication between the web app and backend services.
Real-time Data Streaming: For handling the real-time streaming of time series data from sensors to the cloud, consider using technologies like Apache Kafka or Amazon Kinesis. These platforms can handle data ingestion and processing at scale.
Web Sockets: To facilitate real-time updates in the web application, use WebSockets. WebSocket technology enables bidirectional communication between the server and web browsers, allowing for real-time updates of the displayed data.
Visualization Library: Utilize a data visualization library like D3.js or Plotly to create interactive and informative charts and graphs to display the time series data and inference results.
Authentication and Authorization: Implement user authentication and authorization mechanisms in the web application to ensure secure access to the displayed data.
Containerization: Consider containerizing your application using Docker to package all the components and dependencies, making it easier to deploy and manage across different environments.
Continuous Integration and Continuous Deployment (CI/CD): Implement CI/CD pipelines to automate the testing, building, and deployment of the application, ensuring smooth updates and maintenance.

Technological Stack for Local Digital Twin

Machine Learning Framework: Select a machine learning framework that supports training and retraining of models on the PC. Popular choices include TensorFlow, PyTorch, scikit-learn, or XGBoost. These frameworks provide tools and libraries for building, training, and deploying machine learning models locally.
Time Series Database: Use a time series database to store and manage the streamed time series data on the PC. InfluxDB or TimescaleDB are commonly used databases for handling time series data efficiently.
Data Preprocessing: Implement data preprocessing pipelines to process and clean the time series data collected from the sensors. Tools like Pandas and NumPy in Python can be helpful for data manipulation and preparation.
Real-time Data Streaming: For handling the real-time streaming of time series data from sensors to the PC, consider using technologies like Apache Kafka or RabbitMQ. These platforms can handle data ingestion and processing in real-time.
Web Application Framework: For creating the web application to display the data and inference results, you can use web application frameworks such as Django, Flask (Python-based), or Node.js (JavaScript-based). These frameworks enable you to develop web applications and provide REST APIs for communication between the web app and backend services.
Web Sockets: To facilitate real-time updates in the web application, use WebSockets. WebSocket technology enables bidirectional communication between the server and web browsers, allowing for real-time updates of the displayed data.
Data Visualization Library: Utilize a data visualization library like D3.js or Plotly to create interactive and informative charts and graphs to display the time series data and inference results. Grafana.
Local Hosting: The web application and other services can be locally hosted on the PC. You can use tools like NGINX or Apache to serve the web application.
Authentication and Authorization: Implement user authentication and authorization mechanisms in the web application to ensure secure access to the displayed data.
Containerization (Optional): Consider containerizing your application using Docker to package all the components and dependencies, making it easier to deploy and manage.
Streaming Data Processing (Optional): For more complex data processing tasks, consider using stream processing frameworks like Apache Flink or Apache Spark Streaming.

Resume of Technological Stack

Machine Learning Framework: TensorFlow, PyTorch, scikit-learn, or XGBoost.
Time Series Database: InfluxDB or TimescaleDB.
Data Preprocessing: Pandas and NumPy, Node-RED.
Real-time Data Streaming: Apache Kafka or RabbitMQ. .
Web Application Framework: Django, Flask (Python-based), or Node.js (JavaScript-based), Streamlit
Web Sockets: WebSockets.
Data Visualization Library: D3.js or Plotly. Grafana.
Local Hosting: NGINX or Apache.
Authentication and Authorization
Containerization (Optional): Docker.
Streaming Data Processing (Optional): Apache Flink or Apache Spark Streaming.
Continuous Integration and Continuous Deployment (CI/CD):

Streamlit

Streamlit is an open-source Python library that makes it easy to create interactive web applications for machine learning and data science projects. With Streamlit, you can quickly build and deploy web apps using simple Python scripts, without needing to have any web development experience. Streamlit provides a user-friendly interface for creating interactive visualizations, displaying data, and allowing users to interact with your models or analysis. It also supports real-time updates, so you can easily iterate and make changes to your app as you develop it. Overall, Streamlit is a powerful tool for creating and sharing data-driven web applications.
Alternative to Django or Flask
Examples

Streamlit + streaming

InfluxDB + Streamlit

InfluxDB is a high-performance, open-source time series database. It is designed to handle large amounts of time-stamped data and enables fast storage and retrieval of time series data for real-time analytics and monitoring purposes. InfluxDB uses a flexible data model that allows storing, querying, and aggregating data based on time intervals, making it well-suited for applications that require tracking and analyzing time series data, such as IoT sensor data, monitoring systems, and financial data.
Telegraf is a server-based agent for collecting and sending all metrics and events from databases, systems, and IoT sensors. Telegraf is written in Go and compiles into a single binary with no external dependencies, and requires a very minimal memory footprint.
Telegraf | InfluxData
Sample data | InfluxDB OSS 2.7 Documentation
Visualize Data with Streamlit and InfluxDB | InfluxData

Grafana + InfluxDB + Node-RED

(41) Build Low-code Operational Digital Twins with Node-RED, InfluxDB and Grafana - YouTube

Streamlit + Kafka

Conf42: Building a real-time analytics dashboard with Streamlit, Apache Pinot, and Apache Kafka Streamlit+Kafka+Pinot
Featured Series: Building Streaming Data Pipelines • Supertype Streamlit+Kafka+Spark
FOSDEM 2023 - Building A Real-Time Analytics Dashboard with Streamlit, Apache Pinot, and Apache Pulsar Stramlit+Pinot+Pulsar

Review of Digital Twin + Predictive

(41) "Predictive Digital Twins: From physics-based modeling to scientific machine learning" Prof. Willcox - YouTube Karen Willcox

Framework

Project Examples

Herrera (almeaibedDigitalTwinAnalysis2021, link, DOI, zolib) uses Model Predictive Controller for Green House control. Code is available. (steindlSemanticMicroserviceFramework2021, link, DOI, zolib)

Quix: no TimeDB, Kafka under the Hood

Hono - Ditto

Eclipse Hono provides remote service interfaces for connecting large numbers of IoT devices and interacting with them in a uniform way regardless of the device communication protocol. Hono supports devices communicating via common IoT protocols, as standard, such as HTTP, MQTT and AMQP. Eclipse Ditto is a framework that supports IoT Digital twins software pattern implementation. Ditto’s capabilities includes: mirrors physical assets/devices, acts as a “single source of truth” for a physical asset, provides aspects and services around devices, and keeps real and digital worlds in sync. Apache Kafka enables the building of data pipelines for real-time streaming applications. Kafka is seamlessly horizontally scalabe, false tolerant and high speed. The three main capabilities of Kafka is to, 1. publish and subscribe to stream of records, 2. store these records in fault tolerant way, and 3. process them as they occur. Influx DB is a real-time series database, which is simple to setup and scale. Grafana is an analytics and monitoring solution which provides plugin data source models and support for many of the most popular time series database

There are many technologies that can be used to realize bidirectional communication. For example, Ditto and RabbitMQ as mentioned above.

Many different technologies already exist for tackling the communication issues, i.e. RabbitMQ (https://www.rabbitmq.com/), Ditto(https://www.eclipse.org/ditto/), Apache Kafka(https://kafka.apache.org/), and https://www.rti.com/. In our case, we chose RabbitMQ to serve as a data broker because of its lightweight and easy deployment on premise.

Regarding data storage, we used InfluxDB for its ease of use, its bindings to different programming languages, and its simple visualization facilities.

Recent advances in tools for creating visual interfaces such as Unity(https://unity.com/), Qt(https://www.qt.io/). Grafana, Dash, Gazebo, and so on, have made it easier to create intuitive and visual interfaces for a DS. One of the challenges in DT engineering is the rapid construction of these interfaces from a PT. Such interfaces should, for example, allow the user to: 1) Selectively visualize the 3D PT and its state; 2) Create dashboards to plot the most important data; 3) Replay past behavior of the PT; 4) Spawn new what-if simulations; 5) Run optimizations; 6) Display predictive maintenance results;

Kafka vs MQTT

IMPORTANT INTERESTING PROJECT

kaiwaehner/hivemq-mqtt-tensorflow-kafka-realtime-iot-machine-learning-training-inference: Real Time Big Data / IoT Machine Learning (Model Training and Inference) with HiveMQ (MQTT), TensorFlow IO and Apache Kafka - no additional data store like S3, HDFS or Spark required Streaming Machine Learning with Tiered Storage hivemq-mqtt-tensorflow-kafka-realtime-iot-machine-learning-training-inference/python-scripts/README.md at master · kaiwaehner/hivemq-mqtt-tensorflow-kafka-realtime-iot-machine-learning-training-inference (41) How to Build a Digital Twin for Thousands of IoT Devices with Apache Kafka and MongoDB - YouTube

TRY THIS BELOW Hello from OpenTwins | OpenTwins (roblesOpenTwinsOpensourceFramework2023, link, DOI, zolib) ertis-research/OpenTwins

Data Connectivity


HTTP/HTTPS	WebSocket	DDS	MQTT	AMQP

---	---	---	---	---
Transmit data over Internet from applications, websites, and so on	Transmit data over Internet from applications, websites, and so on	Communication bus to connect intelligent machines	Lightweight protocol for collecting data and publishing it to servers and subscribers	Queue system to reliably transfer messages between

Cloud Platform


AWS IoT	IBM Watson IoT for Bluemix	PTC ThingWorx	Microsoft Azure IoT

---	---	---	---
Free Evaluation	1-year free trial	30-day trial, then free Lite version	30-day trial

Summary

Tags

Categories

2023-07-30

2023-07-30

Technological Stack for Cloud Digital Twin

Technological Stack for Local Digital Twin

Resume of Technological Stack

Streamlit

Streamlit + streaming

InfluxDB + Streamlit

Grafana + InfluxDB + Node-RED

Streamlit + Kafka

Review of Digital Twin + Predictive

Framework

Project Examples

Quix: no TimeDB, Kafka under the Hood

Hono - Ditto

Kafka vs MQTT

IMPORTANT INTERESTING PROJECT

Data Connectivity

Cloud Platform

Tags