Computer Vision (CV) is a field of
artificial intelligence (AI) that focuses on methods for extracting data from
visual sources of information. CV commonly refers to tasks involving the
detection, tracking, and classification of objects in digital images. CV is
employed to address a wide range of scientific and business challenges,
including the development of AI for transportation, analysis of road traffic
and road conditions, automatic analysis of medical images (MRI, X-rays), defect
detection in manufacturing plants, automated waste sorting, text recognition,
barcode scanning, recognition of various objects, and more. Although CV
originated in the 1960s, its most significant advances came after 2012, when
the convolutional neural network AlexNet [1] decisively won the ImageNet
image recognition competition.
CV methods are frequently applied to
address scientific challenges in the analysis of experimental images. They
enable the enhancement of image quality, identification of specific objects in
images, object segmentation, noise reduction, and the application of various
filters. CV is employed to solve tasks such as: converting images to grayscale
and black-and-white based on a specified threshold, morphological
transformations, gradient computation, edge and contour detection, corner
detection, feature extraction, histogram equalization, template matching, image
segmentation, background subtraction, and more. Deep learning methods
significantly improve the quality of solutions for tasks such as image
classification, object recognition, segmentation, image restoration, and style
transfer.
This paper presents the results of a study
of the gas-dynamic flow generated by a pulsed discharge sliding along the
surface of a dielectric. The resulting flow was visualized using shadowgraphy
and recorded with a high-speed camera. The analysis of the large number of
acquired digital images was automated using a deep learning model,
namely a convolutional neural network. The task was to recognize certain flow
structures and measure their size.
Currently, the most common methods for
visualizing flows are shadowgraphy and schlieren techniques [2-3], which are
based on the phenomenon of light refraction. According to the well-known
Gladstone-Dale relation (1), the refractivity of a medium, n − 1 (where n is
the refractive index), is directly proportional to its density, ρ. This allows
for the visualization of density variations in a transparent medium (liquid or gas):
$$n - 1 = G(\lambda)\,\rho \qquad (1)$$
where G is the
Gladstone-Dale constant, and λ is the wavelength of the radiation.
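To give a sense of scale, the Gladstone-Dale relation can be evaluated numerically. The sketch below is only illustrative: the value of G for air (about 2.26e-4 m^3/kg in visible light) and the reference density of 1.204 kg/m^3 at 760 Torr are assumed typical values, not parameters reported in this study, and the chamber gas is assumed to be at the same temperature as the reference state.

```python
# Gladstone-Dale relation: n - 1 = G * rho
G_AIR = 2.26e-4  # m^3/kg, assumed typical value for air in visible light


def refractive_index(rho, G=G_AIR):
    """Return the refractive index n for a gas of density rho (kg/m^3)."""
    return 1.0 + G * rho


def density_at_pressure(p_torr, rho_ref=1.204, p_ref_torr=760.0):
    """Ideal-gas density at pressure p_torr, assuming the reference temperature."""
    return rho_ref * p_torr / p_ref_torr


rho_atm = density_at_pressure(760.0)     # atmospheric air (assumed reference)
rho_chamber = density_at_pressure(92.0)  # low-pressure case, e.g. 92 Torr

print("n - 1 at 760 Torr:", refractive_index(rho_atm) - 1.0)
print("n - 1 at 92 Torr: ", refractive_index(rho_chamber) - 1.0)
```

Under these assumptions the refractivity at 92 Torr is roughly eight times smaller than at atmospheric pressure, which is why density disturbances in a low-pressure chamber produce comparatively weak optical contrast.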
Based on the optical setup of shadowgraphy
or schlieren methods, the evolution of flow inhomogeneities can be recorded
using digital cameras and other equipment. The images of these inhomogeneities
are regions with brightness different from the background. In particular, a
shadowgraph image of a shock wave consists of alternating dark and light
stripes. The advent of modern high-speed digital cameras has enabled the
capture of flows at speeds of up to 10 000 000 frames per second and the
accumulation of large volumes of visual data. Consequently, there is a need for
automating the processing of the acquired image and video datasets. To address
this task, both classical computer vision algorithms [4] and contemporary
methods based on convolutional neural networks and deep learning [5] are
employed.
In this study, the object of investigation
was the non-stationary shockwave flow generated by a nanosecond surface
discharge of cylindrical shape. This flow includes a cylindrical shock (blast)
wave and a contact surface (CS) propagating from the discharge region. Flows of
this type arise, for example, in the case of the breakdown of a pulsed optical
discharge induced by focused laser radiation [6] near a solid wall. In [6], the
authors conducted experimental research and numerical modeling of the optical discharge
near a solid surface. The distance between the focus (discharge area) and the
wall was 5 mm. To simulate the flow, the Euler equations were numerically
solved, using a model of instantaneous energy deposition defined within a
sphere with a radius of 3 mm. It was demonstrated that the dynamics of the
resulting shockwave are similar to those of a shockwave generated by a point
source explosion. The authors also showed that the CS, which separates the
laser-heated gas from the air behind the shockwave front, becomes unstable and
develops Rayleigh-Taylor instabilities. Similar flows occur on larger scales,
such as in explosions. In [7], numerical simulations were conducted for a 16 kT
explosion at an altitude of 580 m above the ground. The energy deposition was
defined within a spherical region with a diameter of 2.4 m. Both the descending
and reflected shockwaves were simulated, including the resulting rising
mushroom-shaped cloud. It was demonstrated that the development of the
mushroom-shaped flow is primarily influenced by three factors: the flow behind
the shockwave front induced by pressure differences, the rise of the heated
region (with a density approximately 700 times lower than that of the
surrounding air for t < 10 s), and the motion of vortex rings behind the
reflected shockwave.
Similar
configurations were observed during the development of high-current nanosecond
surface discharges [8]. Mushroom-shaped vortex formations originating from
regions of increased energy deposition in the plasma sheets were visualized,
evolving from the bottom wall of the discharge chamber. Through manual image
processing, approximately 60 values of the upper coordinates of these
formations were obtained.
To process flow visualization and extract
quantitative information, classical computer vision methods are often applied.
These methods include edge detection using various algorithms [9-10], the Hough
transform for detecting straight lines, typically corresponding to shockwaves
and other types of gas dynamic discontinuities [4, 11]. Cross-correlation and
template matching algorithms are also used to track the movement of specific
flow structures [12]. The Particle Image Velocimetry (PIV) method is based on
these algorithms, involving seeding the flow with small particles, typically
around 1 μm in size, and illuminating selected flow sections with a laser
sheet. The velocity field is calculated by measuring the displacement of
particles between two consecutive frames.
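The displacement estimate underlying cross-correlation tracking can be illustrated in one dimension. The following is a minimal sketch in pure Python, not the algorithm of any particular PIV package; the two intensity profiles are synthetic, and the function name is hypothetical.

```python
def cross_correlation_shift(a, b, max_shift):
    """Return the integer shift s that maximizes sum(a[i] * b[i + s])."""
    best_shift, best_score = 0, float("-inf")
    n = len(a)
    for s in range(-max_shift, max_shift + 1):
        score = 0.0
        for i in range(n):
            j = i + s
            if 0 <= j < n:  # ignore samples shifted outside the window
                score += a[i] * b[j]
        if score > best_score:
            best_shift, best_score = s, score
    return best_shift


# Synthetic intensity profiles: the same bright feature, moved by 3 pixels.
frame1 = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
frame2 = [0, 0, 0, 0, 0, 1, 3, 1, 0, 0]

shift = cross_correlation_shift(frame1, frame2, max_shift=5)
print("displacement in pixels:", shift)
```

Dividing the displacement (converted to physical units) by the time between the two frames gives the local velocity estimate.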
Machine learning and deep learning methods
are increasingly being used for flow visualization. A detailed review on this
topic can be found in [5]. Convolutional neural networks such as ResNet, U-Net,
and IVD-Net have been successfully applied for vortex recognition [13]. A
supervised machine learning approach for filling in missing areas in PIV images
achieved high accuracy in reconstructing velocity fields in hidden flow regions
[14]. Synthetic PIV images were used for training the model, with corresponding
velocity vector outputs. Various machine learning methods were employed in
reference [15] to extract quantitative information from schlieren images of a
rarefied plasma channel. A neural network for detecting different types of
vortex wakes behind an oscillating airfoil was developed and
applied to classify three types of wakes: 2S, 2P + 2S, 2P + 4S [16].
Models for feature extraction corresponding to shockwaves from large flow data
sets, both experimental and numerical, were proposed in [17-18].
Flow visualization experiments were carried
out in a discharge chamber at pressures in the range p = 90 - 100 Torr. The
discharge chamber has a rectangular cross section of 24×48 mm². The
top and bottom surfaces are made of dielectric material, while the side walls
are made of quartz glass for optical access. Pairs of electrodes, each less
than 0.1 mm thick, are embedded in the upper and lower walls of the discharge
chamber, 30 mm apart. Surface discharges were produced simultaneously on both
the upper and lower walls of the discharge chamber. A voltage of 25 kV was
applied to the electrodes. The duration of the surface discharge current
sliding along the dielectric surface (plasma sheet) was up to 300 ns, and the
glow of the discharge plasma lasted up to 1 μs. In the pressure range
used, the plasma sheet exhibits non-uniform glow, with increased energy
deposition in 1-2 bright channels, both on the upper and lower surfaces of the
discharge chamber.
The schematic of the discharge chamber, the
discharge and the shock wave flow generated as a result of the pulsed energy
deposition are shown in Figure 1. In addition, Figure 1 shows an integral image
of the discharge glow taken with a digital camera. The investigated discharge
channels appeared as cylindrical formations of low temperature plasma, 30 mm in
length, on the dielectric surface [8].
Figure 1 – Experimental setup and integral photo of the pulsed surface
discharge. 1 – electrodes, 2 – quartz glass windows, 3 – blast wave,
4 – nanosecond surface discharge
The pulsed surface discharge under study
can be considered, from a gas dynamics perspective, as a cylindrically
symmetric explosion. An explosion can be defined as an event caused by the
release of a significant amount of energy in a very short period of time within
a small, localized volume [7]. The studied discharge fits this definition since
the duration of gas heating does not exceed several tens of nanoseconds, and
the volume of pulsed energy deposition is limited to a cylinder with a diameter
of up to 2 mm.
An optical setup was configured to visualize
the resulting flow using the shadowgraph method. A parallel beam of light was
directed along the discharge channel. The high-speed camera was triggered
before the discharge was initiated (t = 0) and captured
shadowgraph images at a rate of 124 000 frames per second. Discharge initiation
and camera activation were synchronized by a specially designed electrical
circuit.
The aim of the experiment was to obtain
shadowgraph frames of the flow generated by the discharges, including the
cylindrical shockwaves and the CS, and to construct x-t diagrams of their motion.
The study of the dynamics of the shock waves was carried out manually (as they
only appear in a few frames due to their high speed), while a neural network
was developed to automatically determine the size of the CS in a large number
of frames.
Figure 2 shows examples of shadowgraph
frames obtained at a low chamber pressure of p = 93 Torr and a frame rate of
124 000 frames per second. Shock waves from two discharge channels on the upper
and lower walls of the discharge chamber are visible in frames up to 55
μs. Subsequent frames show the evolution of the CS that separates the zone
of gas heated by the discharge from the air moving behind the shock wave.
Figure 2 – Shadowgraph images of the main flow stages. 1 – shock wave;
2 – contact discontinuity.
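The frame rate fixes the time resolution of the recording, which explains why the shock waves are captured in only a few frames. The short sketch below converts frame indices to time, assuming (for illustration only) that frame 0 coincides with the discharge.

```python
FRAME_RATE = 124_000  # frames per second, as stated in the text


def frame_time_us(index, frame_rate=FRAME_RATE):
    """Time of a frame in microseconds, assuming frame 0 is at t = 0."""
    return index * 1e6 / frame_rate


frame_interval_us = frame_time_us(1)

# Number of frames after t = 0 that fall within the first 55 microseconds,
# the interval in which the shock waves remain visible.
frames_with_shock = int(55 * FRAME_RATE / 1e6)

print(f"frame interval: {frame_interval_us:.2f} us")
print("frames within 55 us:", frames_with_shock)
```

With an interval of about 8 μs between frames, only about six frames fall inside the 55 μs window, whereas the CS evolution up to 1 ms spans more than a hundred frames.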
To automatically measure the size of the CS,
a dataset was annotated to train a convolutional neural network using the
well-known YOLOv8 architecture [19]. The dataset contained objects of a single
class, "plume", corresponding to the CS. The dataset consisted of 492
images with 984 annotations. The image size was 368x224 pixels. The image
dataset was divided into three groups: a training set (429 images, 87%), a
validation set (40 images, 8%) and a test set (23 images, 5%).
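A split with these proportions can be reproduced with a seeded shuffle. This is a minimal sketch: the file names and the seed are hypothetical, and the actual tool used to split the dataset is not specified in the text.

```python
import random


def split_dataset(items, n_val, n_test, seed=0):
    """Shuffle items reproducibly and split into train/validation/test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)
    val = items[:n_val]
    test = items[n_val:n_val + n_test]
    train = items[n_val + n_test:]
    return train, val, test


# Hypothetical file names standing in for the 492 annotated images.
image_names = [f"frame_{i:04d}.png" for i in range(492)]

train, val, test = split_dataset(image_names, n_val=40, n_test=23)
shares = [round(100 * len(s) / len(image_names)) for s in (train, val, test)]

print(len(train), len(val), len(test))  # subset sizes
print(shares)                           # rounded percentages
```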
The image dataset consisted of both
original images taken with a high-speed camera and augmented versions of these
images. The augmented images were created by applying blurring (up to 2.5
pixels) and adding random noise (up to 5% of pixels). Blurring was applied to
make the model more robust to the image blurring associated with the elongation of the CS
along the direction of the probe beam. Noise was added to make the model
more robust to artefacts in the images.
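The two augmentations can be sketched in pure Python as follows. This is an illustration under stated assumptions rather than the procedure actually used: "blurring up to 2.5 pixels" is interpreted here as a Gaussian blur with a standard deviation of up to 2.5 pixels, the noise is modelled as setting a random fraction of pixels to black or white, and all function names are hypothetical.

```python
import math
import random


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(img, kernel):
    """Convolve every row with the kernel, clamping indices at the edges."""
    radius = len(kernel) // 2
    out = []
    for row in img:
        n = len(row)
        out.append([
            sum(w * row[min(max(x + k - radius, 0), n - 1)]
                for k, w in enumerate(kernel))
            for x in range(n)
        ])
    return out


def transpose(img):
    return [list(col) for col in zip(*img)]


def gaussian_blur(img, sigma):
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    horizontal = blur_rows(img, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))


def add_noise(img, fraction, rng):
    """Set a random fraction of pixels to 0 or 255 (assumed noise model)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for idx in rng.sample(range(h * w), int(fraction * h * w)):
        out[idx // w][idx % w] = rng.choice((0, 255))
    return out


def augment(img, rng, max_blur=2.5, max_noise=0.05):
    """Apply a random blur (sigma <= max_blur) and noise (<= max_noise of pixels)."""
    sigma = rng.uniform(0.0, max_blur)
    # Skip the blur for very small sigma, where it has no visible effect.
    blurred = gaussian_blur(img, sigma) if sigma > 0.1 else [row[:] for row in img]
    return add_noise(blurred, rng.uniform(0.0, max_noise), rng)
```

A grayscale image is represented here as a list of rows of pixel values; calling `augment(image, random.Random(0))` returns a new image of the same size.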
A YOLOv8 convolutional neural network was
used to solve the CS recognition task. At the time of writing, YOLOv8 was one of the
most advanced object detection models, offering a good balance between recognition
accuracy and speed. It showed some of the best results on the COCO dataset
[20], as measured by the Mean Average Precision (mAP) metric, taking into
account the speed of inference (ms/image). A similar model, trained on another
set of shadowgraph images containing different flow structures, was used in a
previous work [21]. Transfer learning was applied to improve the recognition
quality, using weights from a model pre-trained on the COCO dataset. The
training process consisted of 95 epochs, and the model hyperparameters were set
to default values taken from the Ultralytics repository [22]. The key
hyperparameters had the following values: batch = 16, optimiser = SGD
(stochastic gradient descent), momentum = 0.937, weight_decay = 0.001.
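With the Ultralytics package, a training run with these settings might look like the sketch below. The dataset description file `plume.yaml` and the choice of the `yolov8n.pt` checkpoint are assumptions made for illustration; the text does not state which YOLOv8 model size was used. The package is imported lazily so that the configuration can be inspected without it being installed.

```python
# Hyperparameters quoted in the text; everything else is left at its default.
TRAIN_ARGS = {
    "epochs": 95,
    "batch": 16,
    "optimizer": "SGD",
    "momentum": 0.937,
    "weight_decay": 0.001,
}


def train_plume_detector(data_yaml="plume.yaml", weights="yolov8n.pt"):
    """Fine-tune COCO-pretrained YOLOv8 weights on a single-class dataset.

    data_yaml (dataset description) and weights (model size) are hypothetical
    placeholders, not values taken from the paper.
    """
    from ultralytics import YOLO  # requires the ultralytics package

    model = YOLO(weights)                      # load pre-trained weights
    model.train(data=data_yaml, **TRAIN_ARGS)  # transfer learning
    return model
```

Starting from pre-trained weights rather than random initialization is what implements the transfer learning step described above.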
The following metrics were used to evaluate
the model: accuracy, precision, recall, and F1-score, which is the
harmonic mean of precision and recall. These metrics are defined using formulas
(2-4) based on the values of TP (true positive – the model correctly classified
an object as belonging to the considered class), TN (true negative – the model
correctly determined that an object does not belong to the given class), FP
(false positive – the model incorrectly classified an object as belonging to
the given class), and FN (false negative – the model incorrectly determined
that an object does not belong to the given class).
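The standard definitions of these metrics in terms of TP, TN, FP, and FN can be sketched as follows. The counts in the example are hypothetical and are not results from this study.

```python
def classification_metrics(tp, fp, fn, tn=0):
    """Accuracy, precision, recall, and F1-score from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, fp=10, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In object detection, true negatives are usually not counted (there is no finite set of "correctly absent" boxes), which is why `tn` defaults to zero here.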
The calculation of object detection model
metrics is based on the measurement of overlap, quantified by the
Intersection over Union (IoU) metric: the ratio of the area of intersection
to the area of union of the predicted and ground truth regions in an image. A
threshold IoU value was set to determine how much the prediction should overlap
with the user-annotated ground truth region to be considered correct. If the IoU
was greater than the threshold, the prediction was considered correct (true
positive, TP); otherwise, it was considered incorrect (false positive,
FP). The primary metric used to evaluate the model was the mean average
precision (mAP), which is the mean of the average precision (AP) values over
all classes in the dataset. The AP for each class is determined as the
area under the precision-recall curve. In addition, the mAP50 metric, calculated
with an IoU threshold of 0.5, and the mAP50-95 metric, which is the mAP averaged
over the IoU thresholds from 0.5 to 0.95 in steps of 0.05, were calculated. A
confusion matrix, which is a table of the four combinations of predicted and
actual values, was also used to assess the performance of the model.
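The IoU criterion can be illustrated with a short sketch for axis-aligned bounding boxes given as (x_min, y_min, x_max, y_max). The two boxes in the example are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, truth, threshold=0.5):
    """A prediction counts as correct if its IoU exceeds the threshold."""
    return iou(pred, truth) > threshold


# IoU thresholds used for mAP50-95: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [0.5 + 0.05 * k for k in range(10)]

# Hypothetical ground truth and prediction, in pixels.
truth = (100, 50, 200, 150)
pred = (110, 60, 210, 160)

value = iou(pred, truth)
passed = sum(value > t for t in IOU_THRESHOLDS)
print(f"IoU = {value:.3f}, correct at {passed} of {len(IOU_THRESHOLDS)} thresholds")
```

A prediction shifted by 10 pixels in each direction is accepted at the loose threshold of 0.5 but rejected at the stricter ones, which is why mAP50-95 is always lower than or equal to mAP50.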
As a result of the model's predictions for
each frame, the coordinates and sizes of the recognized objects
were obtained. Since this work focused on the vertical size of the CS, the
heights (h) of the obtained bounding boxes were stored in an array along with
their corresponding timestamps (t), and the dependence h(t) was constructed. The
time dependence of the shock wave coordinate, y(t), measured
manually, was overlaid on the same axes. In some cases, the
results were approximated by polynomials whose degree was chosen
based on physical considerations.
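The conversion of per-frame detections into an h(t) series can be sketched as follows. The spatial scale of 0.1 mm per pixel and the example boxes are hypothetical, and frame 0 is assumed to coincide with the discharge; with Ultralytics, the box corners would typically be read from the `boxes.xyxy` attribute of each prediction result.

```python
FRAME_RATE = 124_000  # frames per second, as stated in the text
MM_PER_PIXEL = 0.1    # hypothetical spatial calibration


def height_time_series(detections, frame_rate=FRAME_RATE,
                       mm_per_pixel=MM_PER_PIXEL):
    """Convert (frame_index, box) pairs to sorted (t in ms, h in mm) pairs.

    Each box is (x_min, y_min, x_max, y_max) in pixels.
    """
    series = []
    for frame_index, (x0, y0, x1, y1) in detections:
        t_ms = frame_index * 1e3 / frame_rate
        h_mm = (y1 - y0) * mm_per_pixel
        series.append((t_ms, h_mm))
    return sorted(series)


# Hypothetical detections of the CS near the lower wall.
detections = [
    (124, (150, 180, 200, 224)),
    (62, (160, 200, 190, 224)),
]

h_of_t = height_time_series(detections)
for t_ms, h_mm in h_of_t:
    print(f"t = {t_ms:.2f} ms, h = {h_mm:.1f} mm")
```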
The plots for precision-confidence,
recall-confidence, precision-recall and F1-confidence are shown in
Figure 3. Figure 4 shows plots of some model metrics, including the most
important ones: precision, recall, mAP50 and mAP50-95. After 95 training epochs,
the mAP50 metric reached 0.887, while mAP50-95 reached 0.557. These are quite
good results for a relatively small training dataset. The model was able to
detect structures of interest in the flows, such as CS, with an accuracy of up
to 1 mm. The detection speed of the model was 13.9 frames/s on a server with
the following specifications: Intel(R) Xeon(R) CPU @ 2.00GHz, NVIDIA Tesla T4
16GB GPU, 16GB RAM.
Figure 3 – Model metrics: precision and recall
Figure 4 – Model metrics
The confusion matrix of the model (Figure
5) also shows good results. In 90% of cases the model correctly predicted the
position of the CS. In the remaining 10% of cases the overlap between the
predicted bounding box and the ground truth was below
the IoU threshold. The vast majority of errors occurred
during the later stages of the flow at t > 0.9 ms, whereas the research
objective was to study the flow during the early stages at times up to 1 ms.
Figure 5 – Confusion matrix
Figure 6 shows examples of model
predictions for frames within the time interval 93 μs to 539 μs.
The entire video record of
the flow consists of several hundred frames in which the model successfully recognized
the CSs. They were successfully identified on both the upper and lower walls of
the discharge chamber, including the initial phase of the flow.
Figure 6 – Contact discontinuity detection
The predicted coordinates and sizes of the
areas corresponding to the CS were used to automatically construct the
size-time dependence, h(t). The corresponding dependences for two different
experiments are shown in Figures 7-8. Each figure includes plots for both the
upper and lower CS produced by pulsed discharges on the upper and lower walls
of the discharge chamber. The x-t plots of the
resulting cylindrical shock waves are also shown. These points were measured manually due to
the limited number of images containing shock waves. In Figure 7, the x-t plots
of the shock wave motion are additionally shown in more detail, along with
the corresponding second-degree polynomial approximations.
Figure 7 – x-t plots of shock waves and contact
discontinuities at 92 Torr pressure. The left column corresponds to the upper
flow structures, the right column to the lower ones. The bottom row contains
the second order polynomial approximation.
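A second-degree polynomial approximation of manually measured x-t points, and the velocity obtained from its derivative, can be sketched in pure Python. The data points below are synthetic (generated from a known parabola) and do not reproduce the measurements; coordinates are assumed to be in mm and times in μs, so that 1 mm/μs corresponds to 1000 m/s.

```python
def polyfit2(ts, ys):
    """Least-squares fit y = c0 + c1*t + c2*t**2 via the normal equations."""
    s = [sum(t ** k for t in ts) for k in range(5)]                  # sums of t^k
    b = [sum(y * t ** k for t, y in zip(ts, ys)) for k in range(3)]  # sums of y*t^k
    # Augmented 3x3 system [A | b] with A[i][j] = sum t^(i+j).
    a = [[s[i + j] for j in range(3)] + [b[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def velocity(coeffs, t):
    """Derivative of the fitted polynomial at time t."""
    return coeffs[1] + 2 * coeffs[2] * t


# Synthetic shock wave positions (mm) at frame times (microseconds).
ts = [8.0, 16.0, 24.0, 32.0, 40.0]
ys = [0.5 + 0.45 * t - 0.0005 * t * t for t in ts]

coeffs = polyfit2(ts, ys)
v0 = velocity(coeffs, 0.0) * 1000.0  # mm/us -> m/s
print("coefficients:", coeffs)
print(f"initial velocity: {v0:.1f} m/s")
```

With only a handful of manually measured points per experiment, a low-order polynomial is a reasonable choice because higher degrees would fit measurement noise rather than the shock wave trajectory.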
The experiments were carried out under
identical conditions except for the pressure in the discharge chamber. In the
experiment corresponding to Figure 7, the pressure was 92 Torr. The graphs in
Figure 8 correspond to an experiment conducted at a pressure of 99 Torr.
Analysis of the plots shows that the speed of the shock waves and the size to
which the CS expands depend on the pressure. As the pressure changes, so do the
parameters of the discharge and the energy input localised in the plasma
cylinder zone. Differences in velocity and degree of gas heating lead to the
formation of shock waves of varying intensity. In these experiments, the main
factor influencing the development of the CS is the flow behind the shock wave
front. Estimates have shown that the influence of gravity (buoyancy) can be
neglected when analysing the mechanisms of CS development. Analysis of numerous
experimental data has shown that the flow diagrams on the upper and lower walls
of the discharge chamber are almost identical.
The difference in diagrams under similar
initial conditions is related to random fluctuations in discharge power, energy
redistribution between upper and lower discharges, instability, and the
significant extent of the CS.
From the data obtained, it can be seen that
the shock wave velocity changes little within the observation area and ranges
from 430 m/s to 480 m/s in different experiments. The CS expands to a
vertical size of 5 - 11 mm at times t = 0.4 - 0.9 ms. Both the final size of the CS
and its rate of expansion increase with the velocity of the
corresponding shock wave.
Figure 8 – x-t plots of shock waves and contact
discontinuities at 99 Torr pressure. The left column corresponds to the upper
flow structures, the right column to the lower ones.
Based on the x-t plots obtained, by solving
the inverse problem, it is possible to numerically calculate the values of the
initial energy input - the change in the internal energy of the gas in the
breakdown region. It is also possible to estimate what proportion of the
electrical energy was used to heat the gas and form the resulting flow.
Figure 9 shows examples of the automated
processing of several video recordings taken over a range of pressures from 92
to 99 Torr. Using the developed model, both the vertical and horizontal
dimensions of the CS were measured. It can be seen that as the pressure
increases, the vertical extent to which the CS expands decreases. Meanwhile,
the horizontal dimensions show little dependence on pressure within the range
considered.
Figure 9 – x-t plots of contact discontinuities in the 92-99 Torr pressure
range. Left – vertical size. Right – horizontal size.
In this study, shock wave flow generated in
air at pressures of 90-100 Torr using sliding surface discharges at an applied
voltage of 25 kV was investigated. Flow visualization was performed using the
shadowgraph method while probing along the discharge channel. Images were
recorded using a high-speed camera at a frame rate of 124 000 frames per
second. The aim of the study was to measure the x-t plots of the motion of the
resulting cylindrical shock wave and the contact surface over a large dataset
of shadowgraph images within a time interval of up to 1 ms.
A convolutional neural network based on the
YOLOv8 architecture was successfully trained and applied to detect the region
formed after the discharge. The model was trained on a dataset containing 492
shadowgraph images with 984 annotations. The dataset included both the original
images captured by the high-speed camera and their augmented versions created by
blurring the original images and adding artificial noise.
The training process consisted of 95 epochs,
after which the mAP50 metric reached 0.887, and the mAP50-95 reached 0.557. The
model's confusion matrix showed that in 90% of cases, the model accurately
predicted the position of the contact surface. In 10% of cases, the model made
errors: the overlap between the predicted contact surface bounding box and the
annotation was below the IoU threshold. Most errors
corresponded to the later stages of the flow at t > 0.9 ms, during blurring
and instability development of the contact surface, whereas the research aimed
to study the flow up to its cessation on timescales of up to 1 ms.
The trained computer vision model
made it possible to construct the time dependence of the vertical contact
surface size in experiments conducted at different pressures. In
addition, x-t plots and velocities of the resulting cylindrical shock waves
were measured. The shockwave velocity remained almost constant within the
observation region and varied from 430 m/s to 482 m/s over the pressure range
studied. The contact surface expanded over time scales of 0.4 to 0.8 ms to a
vertical size of 5 to 11 mm. It was shown that for the time scales considered
in this study, the influence of gravity (rising of the heated discharge region)
can probably be neglected, since the measured flow parameters at the bottom and
top of the discharge chamber were nearly identical. Thus, the primary cause of contact
surface development is the flow behind the shock front.
The study showed that
the application of advanced computer vision methods significantly accelerates
the processing of flow visualisations obtained in gas dynamics experiments and
speeds up the extraction of quantitative information, providing new physical
insights.
This study was supported by the Russian Science
Foundation (Grant No. 22-79-00054), https://rscf.ru/project/22-79-00054/.
1. Krizhevsky A., Sutskever I., Hinton G.E. ImageNet Classification with
Deep Convolutional Neural Networks // Advances in Neural Information Processing
Systems. 2012. Vol. 25.
2. Settles G.S., Hargather M.J. Review of Recent
Developments in Schlieren and Shadowgraph Techniques // Meas. Sci. Technol. 2017.
Vol. 28. No. 4.
3. Rienitz J. Schlieren Experiment 300 Years Ago //
Nature. 1975. Vol. 254. No. 5498. P. 293–295.
4. Automatic detection of oblique shocks and simple
waves in schlieren images of two-dimensional supersonic steady flows / G.
Cammi, A. Spinelli, F. Cozzi, A. Guardone // Measurement. 2021. Vol. 168.
5. Deep learning approaches in flow visualization /
C. Liu, R. Jiang, D. Wei, C. Yang, Y. Li, F. Wang, X. Yuan // Advances
in Aerodynamics. 2022. Vol. 4. No. 17.
6. Numerical and experimental study of a
micro-blast wave generated by pulsed-laser beam focusing / Z. Jiang, K.
Takayama, K.P.B. Moosad, O. Onodera, M. Sun // Shock Waves. 1998. Vol. 8.
P. 337–349.
7. Kim J.-H., Kim S. Simulation of Blast Wave
Propagation and Mushroom Cloud Formation by a Bomb Explosion // AIAA SciTech
Forum, 9–13 January 2017, Grapevine, Texas, 55th AIAA Aerospace Sciences
Meeting. 2017.
8. Shock wave interaction with a thermal layer
produced by a plasma sheet actuator / E. Koroteeva, I. Znamenskaya, D. Orlov,
N. Sysoev // Journal of Physics D: Applied Physics. 2017. Vol. 50.
9. Image Processing Techniques in Shockwave
Detection and Modeling / S. Cui, Y. Wang, X. Qian, Z. Deng // J. Signal Inform.
Process. 2013. Vol. 4. No. 3B. P. 109–113.
10. Srisha Rao M.V., Jagadeesh G. Visualization and
Image Processing of Compressible Flow in a Supersonic Gaseous Ejector // J.
Indian Inst. Sci. 2013. Vol. 93. No. 1. P. 57–66.
11. Znamenskaya I.A., Doroshchenko I.A. Edge Detection and Machine
Learning for Automatic Flow Structures Detection and Tracking on Schlieren and
Shadowgraph Images // J. Flow Vis. Image Process. 2021. Vol. 28. No. 4.
P. 1–26.
12. Gena A.W., Voelker C., Settles G.S. Qualitative
and Quantitative Schlieren Optical Measurement of the Human Thermal Plume //
Indoor Air. 2020. Vol. 30. No. 4. P. 757–766.
13. Berenjkoub M., Chen G., Günther T. Vortex Boundary
Identification Using Convolutional Neural Network // in Proc. 2020 IEEE
Visualization Conference (VIS), Salt Lake City, USA, October 25–30. 2020.
P. 261–265.
14. Morimoto M., Fukami K., Fukagata K. Experimental Velocity Data
Estimation for Imperfect Particle Images Using Machine Learning // Phys.
Fluids. 2021. Vol. 33. No. 8.
15. Machine learning methods for schlieren imaging of a plasma channel
in tenuous atomic vapor / G. Bíró, M. Pocsai, I. F. Barna, G. G.
Barnaföldi, J. T. Moody, G. Demeter // Optics & Laser Technology. 2023.
Vol. 159.
16. Colvert B., Alsalman M., Kanso E. Classifying Vortex Wakes Using
Neural Networks // Bioinspir. Biomim. 2018. Vol. 13. No. 2.
17. A Deep Learning Approach to Identifying Shock
Locations in Turbulent Combustion Tensor Fields / M. Monfort, T. Luciani, J.
Komperda, B. Ziebart, F. Mashayek, G. E. Marai. Cham: Springer, 2017.
P. 375–392.
18. Speed detection in wind-tunnels by processing
schlieren images / M. D. Manshadi, H. Vahdat-Nejad, M. Kazemi-Esfeh, M. Alavi
// Int. J. Eng. 2016. Vol. 29. No. 7. P. 962–967.
19. Real-Time Flying Object Detection with YOLOv8 /
D. Reis, J. Kupec, J. Hong, A. Daoudi // arXiv:2305.09972. 2023. P. 1–10.
20. Microsoft COCO: Common Objects in Context /
T.-Y. Lin et al. // arXiv:1405.0312. 2014. P. 1–15.
21. Doroshchenko I.A. Analysis of the experimental flow shadowgraph
images by computer vision methods // Numerical Methods and Programming
(Vychislitel'nye Metody i Programmirovanie). 2023. Vol. 24. No. 2.
P. 231–242.
22. YOLOv8 [Electronic resource] / Ultralytics // GitHub. 2023.
URL: https://github.com/ultralytics/ultralytics (accessed 29.06.2023).