Analysis of Engineering Systems
While designing an intelligent engineering system/smart machine with mechatronic components and software, engineers have a few options. They can use:
1. Numerical Techniques
2. Closed-form equations or physics models
3. Data-driven AI/ML models
Options 1 and 2 rely on engineering relationships established through physics and solve them with numerical or closed-form techniques. Option 3 is based purely on data. The first two methods can produce excellent results when the approximations in the physics modelling are handled properly. However, they grow increasingly complex as the machine grows more complex, and they demand deep domain knowledge. Even when all of that is available, the behaviour observed through measurable variables tends to drift from the predicted behaviour because of the approximations made in modelling. This poses challenges for AI specialists and cyber-physical system engineers trying to develop a smart machine. Using option 3 and the available data, they can instead build a model that generates synthetic data reflecting the behaviour of the system.
Do we need Synthetic Data for Engineering Systems?
Synthetic data, when designed properly, is well suited to understanding, designing, and controlling complicated smart machines. When designing such a system with hierarchical control, be it connected devices with a distributed control system (DCS) or autonomous engineering systems, software engineers need synthetic data for simulations and for building the right AI systems. Engineers need data for corner cases, state transitions, and impending failures. The engineering community has already seen how the autonomous car and drone industries leverage synthetic data. A similar effort is necessary for the rest of engineering systems if autonomous and semi-autonomous behaviours are to be built into smart machines to realize the dream of Industry 4.0.
Dynamic System and Synthetic Data
Most engineering systems emit time series through their observable and measurable I/O variables. Although a lot of research effort is spent on images, language, and communication networks, there is a relative dearth of research on generating synthetic time series for engineering machines with multi-dimensional I/O, which covers essentially all the machines and systems used in engineering.
Creating a synthetic data model that can predict the multi-dimensional, time-dependent I/O variables for different system parameters, and their transitions, is equivalent to creating a dynamic model of the system. That is exactly what is required to simulate corner cases and understand state transitions.
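One way to make this equivalence concrete is sketched below. The notation is an assumption of this sketch and does not come from any particular tool: x_t is the internal state, u_t the inputs, y_t the measured outputs, θ the system parameters (the condition variables), and w_t, v_t are noise terms.

```latex
\begin{aligned}
x_{t+1} &= f(x_t, u_t; \theta) + w_t, \\
y_t     &= g(x_t, u_t; \theta) + v_t,
\end{aligned}
\qquad
\text{synthetic data model: } \; p\left(y_{1:T} \mid u_{1:T}, \theta\right)
```

Under these assumptions, a generator that can sample from p(y_{1:T} | u_{1:T}, θ) for any θ plays the same role as rolling out f and g, which is why it can stand in for a dynamic model.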
A model of this type can be trained on existing time series data collected experimentally from the same class of machines. These sets of time series, each labelled by its unique parameter value (also called a condition or regression variable), are the main source for creating a data-based dynamic model of the system. Few tools available to software engineers can handle conditional time series. And when the condition variables are continuous, which is true for most engineering systems, the work done so far is simply not sufficient.
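As a rough illustration of what a set of time series labelled by a continuous condition variable might look like in code, here is a minimal Python sketch. Everything in it is hypothetical: the `ConditionedSeries` container, the `simulate_first_order` helper, and the choice of a first-order lag with time constant `tau` as a stand-in for experimental measurements.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class ConditionedSeries:
    """One experiment: a continuous condition value plus its I/O time series."""
    condition: float        # e.g. a time constant in seconds (hypothetical)
    inputs: list[float]     # u_t, one value per time step
    outputs: list[float]    # y_t, one value per time step


def simulate_first_order(tau, n_steps=50, dt=0.1, noise=0.01, seed=0):
    """Toy stand-in for a measured run: noisy step response of a first-order lag."""
    rng = random.Random(seed)
    alpha = math.exp(-dt / tau)   # discrete-time pole for time constant tau
    y, u = 0.0, 1.0               # unit step input
    inputs, outputs = [], []
    for _ in range(n_steps):
        y = alpha * y + (1.0 - alpha) * u
        inputs.append(u)
        outputs.append(y + rng.gauss(0.0, noise))   # add measurement noise
    return ConditionedSeries(condition=tau, inputs=inputs, outputs=outputs)


# The condition variable is continuous; here it is sampled at three values.
dataset = [simulate_first_order(tau, seed=i) for i, tau in enumerate([0.5, 1.0, 2.0])]
for run in dataset:
    print(f"tau={run.condition:.1f}  samples={len(run.outputs)}  final y={run.outputs[-1]:.2f}")
```

In a real project the simulated runs would be replaced by logged measurements, and a conditional generator would be trained to reproduce `outputs` given `inputs` and `condition`, including at condition values that were never measured.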
A Few Available Algorithms
Vector Autoregression (VAR) and Long Short-Term Memory (LSTM) networks can be used to predict multivariate time series. They are often used to synthesize time series, but their usage is limited to simple cases. A high level of success in real-life engineering time series synthesis is achieved by Generative Adversarial Network (GAN)-based algorithms that use LSTM for discriminators. Recurrent Conditional GAN (RCGAN) and Time Series GAN (TSGAN) are two good algorithms for synthetic time series generation. DoppelGANger (DG), which is an improvement over the previous algorithms, needs a special discussion in this context. Condition variables are handled in any Conditional GAN (CGAN) algorithm, but DG treats condition variables and I/O data with two interdependent discriminators. In a very crude sense, DG uses a two-step process: synthesizing individual time series for one condition, and then regressing over the various conditions to establish the relationship between their time series. This is very intuitive from a dynamic-system point of view, especially when the relationships are linear or mildly non-linear.
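The crude two-step view can be illustrated with a purely linear analogue. The sketch below is not DoppelGANger or any GAN. It swaps in a VAR(1) model fitted by least squares for each condition, followed by a linear regression of the fitted coefficients on the condition. The toy system `true_A`, the helpers `simulate` and `fit_var1`, and all numeric values are assumptions chosen for illustration.

```python
import numpy as np


def simulate(A, n_steps, rng, noise=0.05):
    """Roll out x[t+1] = A x[t] + e[t] from a random start."""
    x = np.empty((n_steps, A.shape[0]))
    x[0] = rng.normal(size=A.shape[0])
    for t in range(n_steps - 1):
        x[t + 1] = A @ x[t] + noise * rng.normal(size=A.shape[0])
    return x


def fit_var1(x):
    """Least-squares estimate of A in x[t+1] = A x[t] + e[t]."""
    past, future = x[:-1], x[1:]
    # lstsq solves past @ B = future, so B is the transpose of A.
    B, *_ = np.linalg.lstsq(past, future, rcond=None)
    return B.T


def true_A(c):
    """Hypothetical system whose dynamics depend linearly on condition c."""
    return np.array([[0.9 - 0.2 * c, 0.1], [-0.1, 0.8]])


rng = np.random.default_rng(0)
conditions = np.array([0.0, 0.5, 1.0, 1.5, 2.0])

# Step 1: fit one VAR(1) model per measured condition.
fitted = np.stack([fit_var1(simulate(true_A(c), 2000, rng)) for c in conditions])

# Step 2: regress every entry of A on the condition (A ≈ A0 + c * A1).
design = np.column_stack([np.ones_like(conditions), conditions])
coef, *_ = np.linalg.lstsq(design, fitted.reshape(len(conditions), -1), rcond=None)
A0, A1 = coef[0].reshape(2, 2), coef[1].reshape(2, 2)

# Synthesize a series for a condition that was never measured.
c_new = 0.75
A_new = A0 + c_new * A1
synthetic = simulate(A_new, 200, rng)
print("estimated A at c=0.75:\n", A_new.round(2))
```

This only works because the toy dynamics are linear in both the state and the condition. For strongly non-linear machines the per-condition model and the regression across conditions would both need something more expressive, which is where GAN-based approaches come in.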
Synthetic data for smart machines has not been widely explored, and the generalization properties of such models are not well researched. With the growth of AI in engineering domains such as industrial, construction, mining, and oil and gas (O&G), synthetic data models will be researched further, which will in turn improve AI deployment in connected and autonomous machines.
