Neural Networks for Steady-State Fluid Flow Prediction: Part 3
This article aims to give a broad overview of how neural networks, specifically fully convolutional neural networks, can learn to predict fluid flow around an obstacle from examples.
This series is divided into three parts.
Part 1: A data-driven approach to CFD
Part 2: Implementation details
Part 3: Results (this post)
In part 1, we gained a high-level overview of the data-driven approach to CFD and the steps that are needed to make it work. In the second part, we explored technical details of two essential steps of the data-driven approach to CFD: network architecture and accuracy measurements. In this final part, we will first discuss how many training samples might be needed and which type of activation function is most promising for the network. Then we will look at some visualizations of the network results, and finally we will discuss the performance of the data-driven approach.
Parameter studies
Generating a large number of simulation data samples is a computationally demanding subtask of the data-driven approach to CFD. Because of this computational load, finding the optimal amount of training data is important. The training set must be large enough to enable the neural network to generalize to unseen geometries and avoid over-fitting its parameters, but not so large that generating the data set takes unreasonably long.
Final validation loss over the number of training samples
The figure above shows the final validation loss for different numbers of training samples after 15000 training steps. While the final validation loss is larger than 0.1 for training sets smaller than 5000 samples, it converges towards 0.02 as the number of training samples increases. Increasing the number of training samples beyond 10000 does not reduce the final validation loss further. It follows that 10000 training samples are sufficient to obtain a minimal loss on both the validation data set and the training data set.
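The selection rule described above (take the smallest training set whose final validation loss is no longer meaningfully worse than the best one) can be sketched in a few lines. This is only an illustration: the function name `smallest_sufficient_size`, the tolerance `rel_tol`, and the numbers in `sizes` and `losses` are all hypothetical placeholders that merely mimic the trend in the figure; they are not measured values from the study.

```python
def smallest_sufficient_size(sizes, losses, rel_tol=0.05):
    """Return the smallest training set size whose final validation loss
    lies within rel_tol (relative) of the best loss in the study."""
    if not sizes or len(sizes) != len(losses):
        raise ValueError("sizes and losses must be non-empty and of equal length")
    best = min(losses)
    # Walk through the study from the smallest to the largest data set.
    for size, loss in sorted(zip(sizes, losses)):
        if loss <= best * (1.0 + rel_tol):
            return size
    return max(sizes)  # unreachable in practice, kept as a safe fallback


# Hypothetical placeholder values, NOT the measured results of the study.
sizes = [1000, 2500, 5000, 10000, 20000]
losses = [0.30, 0.15, 0.05, 0.02, 0.02]

chosen = smallest_sufficient_size(sizes, losses)
print(chosen)
```

With these placeholder values the rule picks the 10000-sample data set, because doubling the data to 20000 samples does not lower the loss any further.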
To determine the influence of the particular choice for activation functions, four activation function types were validated:
- Exponential Linear Units (ELU)
- Concatenated Exponential Linear Units (Concat ELU)
- Rectified Linear Units (ReLU)
- Concatenated Rectified Linear Units (Concat ReLU)
The figure below shows the progression of the validation loss as a function of the number of training steps for the four tested activation functions.
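For readers who have not met the concatenated variants before, here is a minimal NumPy sketch of the four functions. It assumes the commonly used definitions, in which the concatenated versions apply the base activation to both `x` and `-x` and stack the results along the channel axis (which doubles the number of channels). The network in this series may implement them differently, so treat the function names and the `axis` default as assumptions.

```python
import numpy as np


def relu(x):
    """Rectified Linear Unit: max(x, 0)."""
    return np.maximum(x, 0.0)


def elu(x, alpha=1.0):
    """Exponential Linear Unit: x for x > 0, alpha * (exp(x) - 1) otherwise."""
    # Clip to the non-positive part before expm1 to avoid overflow warnings.
    return np.where(x > 0, x, alpha * np.expm1(np.minimum(x, 0.0)))


def concat_relu(x, axis=-1):
    """Concatenated ReLU: stack relu(x) and relu(-x) along the channel axis."""
    return np.concatenate([relu(x), relu(-x)], axis=axis)


def concat_elu(x, axis=-1):
    """Concatenated ELU: stack elu(x) and elu(-x) along the channel axis."""
    return np.concatenate([elu(x), elu(-x)], axis=axis)


x = np.array([[-2.0, 0.0, 3.0]])  # one sample with three channels
print(relu(x).shape, concat_relu(x).shape)
```

Note that a layer following a concatenated activation sees twice as many input channels, which increases its parameter count compared to plain ReLU or ELU.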
Validation loss over the number of steps for different types of activation functions
Significant spikes in validation loss appear during the early stages of training, especially with the Concat ELU and ELU activation functions. Spikes are an unavoidable consequence of mini-batch gradient descent with the Adam optimizer: some mini-batches contain, by chance, unfavorable data that induces these spikes in the validation loss. Thus, smaller spikes in validation loss are present for all evaluated types of activation functions. The ReLU activation function was chosen because it shows no large spikes in validation loss during training and reaches the smallest overall validation loss towards the end of training, approximately 0.007.
Results
In the table below, the final losses and accuracies of the network are listed for grid resolutions of 64 x 64, 128 x 128, and 256 x 256 cells.
Accuracy table for different grid resolutions
Overall, the loss values increase with increasing grid resolution because the loss is not normalized by the number of grid cells. For the 64 x 64 grid, all measured accuracies are above 93%. In particular, the divergence accuracy is very high for all tested grid resolutions. It follows that the neural network predicts physically sensible solutions to the Navier-Stokes equations (NSE) for a given voxelized obstacle geometry.
The cell-based, drag-based, and max-flow-based accuracies decrease substantially with increasing grid resolution. One cause for this observation might be that higher grid resolutions need more than the 10000 training samples determined above for the neural network to achieve accuracies similar to the ones measured for the lowest grid resolution. More training data allows the neural network to better approximate the more detailed simulation data contained in higher-resolution training data. Another reason for the inferior prediction accuracies at higher grid resolutions might be the network architecture. At higher resolutions, the flow contains more features, which the network cannot estimate with the same accuracy as the features at lower resolutions. Adding more residual blocks to the network might improve the prediction accuracies.
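To see why an unnormalized loss grows with the grid resolution, consider the following small sketch. It assumes a summed squared-error loss over all cells and velocity components, which is an assumption made for illustration only (the actual loss is defined in part 2). The helper `constant_error_fields` is hypothetical and builds a prediction that is off by the same amount in every cell.

```python
import numpy as np


def summed_squared_error(pred, truth):
    """Unnormalized loss: grows with the number of grid cells."""
    return float(np.sum((pred - truth) ** 2))


def mean_squared_error(pred, truth):
    """Loss normalized by the number of entries: comparable across resolutions."""
    return float(np.mean((pred - truth) ** 2))


def constant_error_fields(n, error=0.1):
    """Hypothetical n x n velocity fields (2 components) with the same
    per-cell error everywhere, so only the grid size changes."""
    truth = np.zeros((n, n, 2))
    pred = truth + error
    return pred, truth


results = {}
for n in (64, 128, 256):
    pred, truth = constant_error_fields(n)
    results[n] = (summed_squared_error(pred, truth), mean_squared_error(pred, truth))
    print(n, results[n])
```

Even though the per-cell error is identical in all three cases, the summed loss quadruples each time the resolution doubles, while the mean stays constant. Loss values at different resolutions should therefore only be compared after such a normalization.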
The two images below show the ground truth vector field (obtained with a simulation) and prediction of the neural net (obtained with the data-driven approach to CFD). But which one is which?
Ground truth vector field (obtained with a simulation) and prediction of the neural net (obtained with the data-driven approach to CFD)
Visually, the vector fields of the simulation and the data-driven prediction do not differ substantially from each other at any grid resolution. The absolute error for each cell (shown in the figure below) highlights the regions of the domain where the difference between simulation and prediction is large. The line along the obstacle front (where the fluid flow hits the object) has the highest mismatch between simulation and prediction for all examined resolutions. This behavior can be explained by the high velocity gradient along the obstacle front, where the fluid is slowed down from free-flow velocity to zero. Neural networks tend to have difficulties where large gradients prevail. High absolute error also occurs at the border of the slipstream behind the obstacle, where the x-component of the flow velocity changes steeply along the y-axis. This holds equally for all grid resolutions. Additional prediction error is introduced within the obstacle itself. Here, the fluid flow velocity is zero by definition, but the neural network is not able to predict it as exactly zero. However, this error is smaller than the errors in areas of high velocity gradients.
Absolute error between ground truth and prediction
Block-shaped prediction artifacts are dominant in flow regions close to the outlet, outside the slipstream behind the obstacle. These artifacts might be caused by the convolution filters in the neural network, as the blocks might correspond to the input region that one convolutional layer passes on to the next layer. An insufficient number of training samples or insufficient convergence of the neural network weights during training might also contribute to the artifacts. Low prediction error is dominant in flow regions close to the inlet and in the slipstream of the obstacle. This can be explained by the fact that these are regions of uniform flow velocity. Close to the inlet, the flow velocity is very close to the inlet boundary condition velocity, and in the slipstream behind the obstacle the flow velocity is approximately zero. These simple flow characteristics are easy for the neural network to learn.
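An error map like the one in the figure can be computed directly from the two velocity fields. The sketch below assumes that the per-cell absolute error is the Euclidean norm of the difference between the predicted and the simulated velocity vector; the exact definition behind the figure is not stated here, so this is one plausible choice. The array layout `(height, width, 2)` and both function names are likewise assumptions.

```python
import numpy as np


def absolute_error_map(pred, truth):
    """Per-cell magnitude of the velocity difference.
    pred, truth: arrays of shape (height, width, 2) holding (u, v)."""
    return np.linalg.norm(pred - truth, axis=-1)


def worst_cell(error_map):
    """Grid index (row, column) of the cell with the largest error."""
    row, col = np.unravel_index(np.argmax(error_map), error_map.shape)
    return int(row), int(col)


# Tiny synthetic example: a perfect prediction except for one cell.
truth = np.zeros((4, 4, 2))
pred = truth.copy()
pred[1, 2] = [3.0, 4.0]

errors = absolute_error_map(pred, truth)
print(worst_cell(errors), errors.max())
```

Plotting `errors` as a heat map gives the kind of visualization discussed above, with the obstacle front and the slipstream border expected to stand out.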
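The "input region" mentioned above is usually called the receptive field of a layer. As a rough aid for judging whether the size of the blocks matches it, the standard recurrence for stacked convolutions can be evaluated as follows. The layer stacks in the example are hypothetical and are not the architecture from part 2.

```python
def receptive_field(layers):
    """Receptive field width (in input cells) of one output cell.

    layers: list of (kernel_size, stride) tuples, ordered from input to output.
    """
    field, jump = 1, 1  # jump = distance in input cells between adjacent outputs
    for kernel_size, stride in layers:
        field += (kernel_size - 1) * jump
        jump *= stride
    return field


# Hypothetical stacks, NOT the network described in this series.
three_plain_convs = [(3, 1), (3, 1), (3, 1)]
downsample_then_convs = [(3, 2), (3, 1), (3, 1)]

print(receptive_field(three_plain_convs))
print(receptive_field(downsample_then_convs))
```

A strided layer early in the stack enlarges the receptive field of every later layer, which is one reason why artifacts can appear as blocks several cells wide.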
Performance
To evaluate the advantages of the data-driven approach to CFD, not only the accuracy of the approach in relation to state-of-the-art CFD solvers must be determined, but also the time it takes to derive a solution to a posed problem.
Creating the 10000 samples of the external two-dimensional simulation data set takes approximately 13 hours on the machine used. In addition to the data set creation time, the training duration of the neural net must be considered. The table below lists the training durations for the different grid resolutions, the number of predictions per second once the neural net has been trained on the simulation data set, and the speedup in comparison to the OpenFOAM simulation.
Creating the minimum number of 10000 samples for the simulation data sets takes a considerable amount of time, especially for the three-dimensional simulations. The training process of the neural network is equally demanding in terms of required computational power and takes a similar amount of time until the neural network is sufficiently trained on the data set. It can be concluded that creating the necessary data sets with enough samples constitutes a substantial computational overhead for the data-driven approach to CFD.
However, once the neural network is trained on the simulation data set, a vast speedup in comparison to the traditional simulation-based approach to CFD can be observed. Because they have fewer input grid cells, the grids with the lowest resolution yield the largest speedups. Even the smallest speedup of about 57, which was achieved for the 256 x 256 grid, makes the prediction more than one order of magnitude faster than the state-of-the-art simpleFoam CFD solver. Even the lowest prediction rate is fast enough to allow for real-time flow field predictions in all tested simulation setups.
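The trade-off between the offline overhead and the per-prediction speedup can be made concrete with a little arithmetic. The sketch below uses only two numbers from the text, 13 hours for 10000 samples and a speedup of 57, and combines them under loudly stated assumptions: that the average simulation time per sample applies to the setup with the speedup of 57, and that the overhead consists of data set creation only (the training time is left out). Both function names are hypothetical.

```python
import math


def seconds_per_sample(total_hours, n_samples):
    """Average wall-clock time of one simulation in the data set."""
    return total_hours * 3600.0 / n_samples


def break_even_predictions(overhead_s, t_sim_s, speedup):
    """Number of flow fields after which the offline overhead is amortized,
    given that each prediction takes t_sim_s / speedup seconds."""
    t_pred_s = t_sim_s / speedup
    saved_per_prediction_s = t_sim_s - t_pred_s
    return math.ceil(overhead_s / saved_per_prediction_s)


t_sim = seconds_per_sample(total_hours=13, n_samples=10000)

# Assumption: overhead = data set creation only; training time is ignored here.
n_break_even = break_even_predictions(
    overhead_s=13 * 3600.0, t_sim_s=t_sim, speedup=57
)
print(round(t_sim, 2), n_break_even)
```

Under these assumptions one simulation takes about 4.68 seconds on average, and the overhead pays off after roughly ten thousand predictions. Including the training time would push the break-even point further out, which is why the approach suits setups that are queried many times.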
Summary
The aim of the proposed data-driven approach to CFD is to outperform traditional simulation-driven CFD in simulation setups the neural network was previously trained on. While there is substantial computational overhead in creating training samples and training the neural network on them, these calculations can be performed offline without user interaction and thus do not introduce any waiting time for engineers. Compared to existing approximation models in the domain of CFD, neural networks enable an efficient estimation of the entire velocity field. Furthermore, designers and engineers can directly apply the CNN approximation model in their design space exploration algorithms without training extra lower-dimensional surrogate models.
The flow prediction results show that the overall accuracy of the predictions deteriorates with increasing grid resolution. The reason for this unwanted behavior might be found in the lack of training samples and in the network design. While the number of training samples and the network architecture are sufficient to train the proposed neural network for low-resolution grids and two-dimensional fluid data, this setup is insufficient for higher grid resolutions, where more fluid flow information is encoded in the grid cells. More training samples provide a wider range of flow fields to learn from, and a deeper or wider network architecture with more parameters promises to find more detailed flow patterns in the samples and thus potentially increases the prediction accuracy.
The data-driven approach can provide immediate feedback for real-time design iterations at the early stages of design exploration. Immediate feedback allows the engineer or designer to explore designs during the creative design process without interrupting the creative process. Inference times of well below 0.01 seconds for the highest tested grid resolution are below the latency threshold of the human brain.
The data-driven approach to CFD is not intended as a replacement for existing CFD software. However, it can be understood as a tool for the first step of the product development process. In a second step, high-performance simulation-driven CFD can be employed to further refine the results of the data-driven approach to CFD in order to guarantee quantitative product simulation data with only a small margin of error. Machine-learning-based approaches to CFD may never aim to fully replace physics-based numerical solvers, but rather to serve as a good-enough approximation method for fluid flow behavior where time and computational resources for a high-resolution CFD simulation are scarce. While the accuracy of the data-driven approach to CFD can never reach that of numerical CFD solvers, suitable applications might include computer games, where very fast execution matters more than high accuracy.
Part 3 marks the end of the three-part series on Neural Networks for Steady-State Fluid Flow Prediction. I hope you enjoyed the journey as much as I did.
Thanks for reading! 👨💻 🎉