Shadows pose a general problem in robot, machine, and computer vision. A shadow shares the movement and shape of the object that casts it, and as a consequence it causes confusion during object detection and obstacle avoidance. Although shadows can help interpret a visual scene, for example by revealing image intensity and object configuration ([1], [2]), they corrupt many applications such as object monitoring, object recognition, and image decomposition ([3]-[5]).
In the literature, shadows are classified as self-shadows and cast shadows: the projection of an object onto the background is denoted a cast shadow, whereas the shadow projected onto the object itself is referred to as a self-shadow ([6], [7], [8]).
Many approaches have been suggested for shadow detection and removal, addressing the problem of shadows in different applications (see [9], [10]).
In the studies of [11] and [12], the authors show the unfavourable influence of shadows on the visual scene of an agricultural robot and suggest image segmentation algorithms, ultrametric contour maps, and machine learning techniques for removing shadows from an image. Other researchers, on the other hand, have assessed the authenticity of an image by examining incorrect or removed shadows in edited images (see, for example, [13]). These studies have investigated the use of shadows for detecting forgeries by revealing inconsistencies in an image, such as a lack of coherence between the shadows and the light direction (see [14] and [15]).
Several works have concentrated on a fine-grained analysis of the colour distributions in an image. In [16], the authors combine edge classification with a conditional random field (CRF) to identify shadow edges in consumer photographs. A related method, presented in [17], considers a moving viewpoint by comparing variant and invariant shadow features. These features primarily comprise colour, texture, and edge information, which are then embedded into a segmentation pipeline that predicts the shadow status.
Many studies (for example, [18], [19], and [13]) use neural networks and learning-based algorithms for shadow detection and removal. These algorithms must be trained on both images that contain shadows and the corresponding shadow-free images (i.e., for a specific scene, two images need to be taken, one with a shadow and one without). Consequently, these algorithms require a training process with a large number of images covering different cases (see [20]).
As the shadow region in most cases involves gradual changes in luminance, some researchers use gradient-based algorithms for shadow removal (see [21]). Such an algorithm assumes that the intensity of the shadow region changes gradually, and as a consequence it fails if the shadow has a sharp texture [8].
Shadow detection and removal poses two main challenges: how to detect the shadow region accurately in a complex scene, and how to remove the shading while keeping the details and information of the region and without degrading the boundaries of the objects (see [22], [23]). In essence, shadows are a critical issue for object recognition systems and require further investigation and development.
In this work, the intensity distribution of an object together with its shadow is visualized and analyzed in order to eliminate the shadow's intensity effect using various types of data filtering. We propose an algorithm that handles real-world shadows and removes them. This algorithm consists of five consecutive processing stages. In the first stage, contrast information is detected using Gaussian derivative functions along the x and y directions. In the second stage, the contrast information is normalized to generate a balanced activity at each position relative to the neighboring activities. In the third stage, the boundary of the region of interest (ROI) for the target object is extracted. In the fourth and fifth stages, the interior of the detected object is highlighted and reconstructed, respectively. This algorithm can be used in different applications, such as robot and machine vision.
Unlike state-of-the-art approaches for shadow detection and removal ([13], [24], [25]), our algorithm does not require a training process on shadow and shadow-free image pairs. Our mechanism reconstructs a target object in a shadowless image through five stages using Gaussian functions with different shapes and orientations.
The rest of
the paper is structured as follows. Section 2 introduces the suggested
methodology for shadow detection and removal. The discussion and experimental
results of the proposed algorithm are presented in Section 3. Finally, Section
4 presents the conclusions and summarizes the contribution of this work.
In this work, we detect and eliminate the shadow in an image using five consecutive stages. The first stage extracts contrast information from the image by applying the first-order derivative of the Gaussian function along the x and y axes. This information is then normalized in the second stage. In the third stage, we extract the shadowless boundary of the target object. In the fourth stage, we highlight the enclosed region of the object contour. Finally, in the fifth stage, we use this region to reconstruct the object in the visual scene without any shadow. Fig. 1 presents the linear scheme of the suggested approach. In the following subsections, we provide detailed descriptions of each stage.
Fig. 1 The linear scheme of the suggested approach.
In this stage, the object in the visual scene is detected based on the contrast information of Gaussian derivative functions along the x and y directions. The contrast responses are obtained by convolving the image with the first-order derivatives of the Gaussian function in the 2D (x, y) space,

$$C_x = I * g_x, \qquad C_y = I * g_y,$$

where $I$ represents the input image, $*$ denotes the spatial convolution operator, and $g_x$ and $g_y$ denote the Gaussian derivative kernels along the x and y directions, respectively. Since an object edge is an elongated change in the spatial domain, we propose using the derivative of an elongated Gaussian kernel. The first stage of Fig. 1 shows the kernels of the 2D spatial derivative of the Gaussian function. The partial derivatives are thus calculated by

$$g_x = \frac{\partial}{\partial x}\,\Lambda_{\sigma_{x1},\,\sigma_{y1}}, \qquad g_y = \frac{\partial}{\partial y}\,\Lambda_{\sigma_{x2},\,\sigma_{y2}},$$

where $\Lambda_{\sigma_{x1},\,\sigma_{y1}}$ and $\Lambda_{\sigma_{x2},\,\sigma_{y2}}$ represent Gaussian functions with 2D spatial extents $(\sigma_{x1}, \sigma_{y1})$ and $(\sigma_{x2}, \sigma_{y2})$ in the x and y directions, respectively.
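For illustration, a minimal sketch of this stage in Python/NumPy, assuming the kernel size and spatial extents listed later in Table I; the function names and the exact discretization of the kernels are illustrative rather than taken from the paper:

```python
import numpy as np
from scipy.signal import convolve2d

def gaussian_derivative_kernels(size=40, sx1=3.0, sy1=5.0, sx2=5.0, sy2=3.0):
    """Build elongated first-order Gaussian derivative kernels g_x and g_y."""
    half = size // 2
    y, x = np.mgrid[-half:half, -half:half].astype(float)
    lam1 = np.exp(-(x**2 / (2 * sx1**2) + y**2 / (2 * sy1**2)))  # Lambda_{sx1, sy1}
    lam2 = np.exp(-(x**2 / (2 * sx2**2) + y**2 / (2 * sy2**2)))  # Lambda_{sx2, sy2}
    gx = -x / sx1**2 * lam1   # partial derivative of Lambda_{sx1, sy1} along x
    gy = -y / sy2**2 * lam2   # partial derivative of Lambda_{sx2, sy2} along y
    return gx, gy

def contrast_filtering(image):
    """First stage: contrast responses C_x = I * g_x and C_y = I * g_y."""
    gx, gy = gaussian_derivative_kernels()
    cx = convolve2d(image, gx, mode="same", boundary="symm")
    cy = convolve2d(image, gy, mode="same", boundary="symm")
    return cx, cy
```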
In order to suppress noise and achieve a balanced intensity response in the spatial domain, we normalize the response at each location based on its relevant neighbourhood (see [26], [27]). The normalization process is thus defined by

$$\hat{C}_{x,y} = \frac{C_{x,y}}{\varepsilon + \Lambda_{\sigma} * \left|C_{x,y}\right|},$$

where $\varepsilon$ represents a small value that prevents division by zero and $\Lambda_{\sigma}$ represents a fall-off function that weights the intensities of the spatial neighbourhood of the target position in the image based on the spatial extent $\sigma$.
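A minimal sketch of this normalization, assuming a divisive form (each response divided by the Λ-weighted neighbourhood activity plus ε) with the window size, σ, and ε values from Table I; the exact pooling used in the paper may differ:

```python
import numpy as np
from scipy.signal import convolve2d

def normalize_response(c, window_size=49, sigma=10.0, eps=0.3):
    """Second stage: divide each response by the pooled activity of its neighbourhood."""
    half = window_size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    lam = np.exp(-(x**2 + y**2) / (2 * sigma**2))   # fall-off window Lambda_sigma
    lam /= lam.sum()                                 # normalize the neighbourhood weights
    pooled = convolve2d(np.abs(c), lam, mode="same", boundary="symm")
    return c / (eps + pooled)                        # eps prevents division by zero
```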
To focus on the edge of the target object and suppress the rest of the information, we generate a binary image. Binarization is performed such that the edge of the target object is set to 1 (white) and the rest of the information (shadow, noise, and background) is set to 0 (black). The generated edge thus represents the shadowless boundary of the target object. Image binarization is given as

$$B_{i,j} = \begin{cases} 1, & \text{if } \hat{C}_{i,j} > T_1 \\ 0, & \text{otherwise,} \end{cases}$$

where $B$ represents the detected edge and $T_1$ represents a threshold value that separates the boundary of the shadowless object from the rest of the information in the image. This boundary is then delivered to the next stage, in which the region of interest (ROI) is blurred and highlighted.
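A minimal sketch of the binarization with T1 = 0.5 from Table I; combining the two normalized responses through their magnitude is an assumption:

```python
import numpy as np

def binarize_edges(cx_norm, cy_norm, t1=0.5):
    """Third stage: edge pixels of the target object -> 1, everything else -> 0."""
    magnitude = np.hypot(cx_norm, cy_norm)   # combined contrast magnitude
    return (magnitude > t1).astype(np.uint8)
```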
In this stage, the region representing the shadowless object, referred to as the region of interest (ROI), is highlighted. As mentioned previously, the binary image contains the edge of the shadowless object and ignores the rest of the information. The ROI therefore represents the interior of the generated boundary and thus the body of the shadowless object in the visual scene. The spatial ROI is blurred and highlighted based on the 2D bell-shaped function $g$. Consequently, the ROI is described by

$$R = B * g_{\sigma},$$

in which $*$ denotes the convolution operator, $g_{\sigma}$ is the 2D bell-shaped function, and $\sigma$ is its spatial extent. The blurred ROI is then delivered to the next stage, in which the target object is reconstructed in a shadowless image.
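A minimal sketch of this stage; filling the closed contour with binary_fill_holes to obtain the interior and using a Gaussian as the 2D bell-shaped function g are assumptions, and the value of sigma is illustrative:

```python
import numpy as np
from scipy.ndimage import binary_fill_holes, gaussian_filter

def highlight_roi(edge_image, sigma=10.0):
    """Fourth stage: fill the interior of the detected boundary and blur it (R = B * g)."""
    # The detected contour must be closed for the fill to capture the object body.
    interior = binary_fill_holes(edge_image).astype(float)
    return gaussian_filter(interior, sigma=sigma)   # 2D bell-shaped blur with extent sigma
```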
In this stage, the shadowless object and the image background are reconstructed in a new image. The new image $I^{\text{new}}$ thus combines the shadowless object and the image background, as defined by

$$I^{\text{new}}_{i,j} = \begin{cases} I_{i,j}, & \text{if } R_{i,j} > T_2 \\ I^{b}_{i,j}, & \text{otherwise,} \end{cases}$$

where $T_2$ denotes a threshold value and $(i, j)$ represent the spatial coordinates in the 2D space of the original image $I$ and the background image $I^{b}$.
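A minimal sketch of the reconstruction stage, assuming the blurred ROI is compared against T2 = 0.2 (Table I) to decide which pixels come from the original image and which from the background:

```python
import numpy as np

def reconstruct_shadowless(original, background, roi, t2=0.2):
    """Fifth stage: object pixels from the original image, the rest from the background."""
    return np.where(roi > t2, original, background)
```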
To investigate the intensity distribution in images that include shadow regions, Fig. 2 shows the intensities of three scenes. The first column (a1-a3) shows the original images with the shadow region. The second column (b1-b3) illustrates the foreground part of the images. Here, the foreground of an image is extracted by thresholding with a foreground separation value $T_f$, and the remaining background pixels are set to white (255). In our implementation, the value of the foreground separation is $T_f = 60$.
Fig. 2 Intensity distributions in different images. The first column (a1-a3) shows the original images with shadow regions. The second column (b1-b3) illustrates the foreground of the images. The third column (d1-d3) shows the shadowless object on a black background.
The intensity distributions of the corresponding images are shown in Fig. 3, where the x-axis represents the intensity values and the y-axis represents the number of pixels. These plots reflect the distribution of the intensities of the foreground images. The plots reveal that the intensity of the foreground object occupies only a small range of intensities, as highlighted in the third column of Fig. 2, whereas shadow and noise are spread over a larger range. Since the foreground image has a monochrome (white) background, the intensity value 255 covers the highest number of pixels. As a consequence, the target object can be separated from the shadows and noise, as defined by

$$O_{i,j} = \begin{cases} F_{i,j}, & \text{if } F_{i,j} < T_s \\ 0, & \text{otherwise,} \end{cases}$$

where $T_s$ represents the shadowless separation value. In our implementation, the value of $T_s$ is set to 60. To keep the shadowless object at the same size as in the original image, we represent it on a black background, as shown in the third column (d1-d3) of Fig. 2. The results reveal that in many cases the shadow intensity can be separated from the target object based on the distribution of the image intensity. However, the separation value of the shadowless object is a critical issue that depends on the lighting conditions, and thus different values may need to be selected for other databases.
Fig.3
The distribution of the intensities for the foreground images. The x-axis
represents the intensity values, and the y-axis represents the number of
pixels.
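A minimal sketch of the intensity-based separation described above, assuming an 8-bit grayscale foreground image (white background, 255) and the separation value of 60; the direction of the comparison (object pixels darker than the separation value) is an assumption based on the intensity plots:

```python
import numpy as np

def separate_shadowless_object(foreground, t_s=60):
    """Keep pixels below the shadowless separation value; place them on a black background."""
    return np.where(foreground < t_s, foreground, 0).astype(np.uint8)
```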
To evaluate the performance of our mechanism, we tested it on realistic scenarios that contain self-shadows (i.e., shadows linked with the objects). Since we plan to focus on human action recognition in future work, we used the KTH database [28]. The KTH database contains complex scenarios with shadows around the target objects at different locations, and it therefore provides a valuable benchmark for evaluating shadow detection and elimination methods. In our evaluation, different subjects in different scenes are addressed. In addition, this database has a relatively low resolution, which makes detecting shadows more difficult. Since the scenarios of the KTH database contain moving objects, we used a background subtraction technique to extract these objects [29].
Here, the per-pixel median of m consecutive frames is taken as the background of the scenario,

$$I_{x,y} = \operatorname{median}_{t}\{F^{t}_{x,y}\}, \qquad t \in \{1, 2, \ldots, m\},$$

where m denotes the number of frames used for background subtraction. Following the suggestion of [30], we restored the designated scene with a static background: $\forall\, (F_{x,y} - I_{x,y}) < d,\; I_{x,y} = F_{x,y}$, where d represents a small value that describes the disparity between the background I and the foreground F.
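A minimal sketch of this background handling, assuming the frames are stacked in a NumPy array and using d = 50 from Table I; reading the disparity test as an absolute difference is our assumption:

```python
import numpy as np

def median_background(frames):
    """Per-pixel median over m consecutive frames (frames: array of shape (m, H, W))."""
    return np.median(frames, axis=0)

def restore_static_background(frame, background, d=50):
    """Where the frame and background differ by less than d, restore the background from the frame."""
    static = np.abs(frame - background) < d   # the paper states (F - I) < d; abs() is our reading
    return np.where(static, frame, background)
```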
Table 1 presents the parameters of the proposed model. In our evaluation, the shortest scenario length in the dataset is used (see [31]), and consequently the number of frames used for background subtraction is m = 28. The size of the Gaussian derivative kernels (gx, gy) is 40 × 40, in which the spatial extents σx1, σy1 are 3 and 5 and the spatial extents σx2, σy2 are 5 and 3, respectively. The relative disparity between the foreground and background is d = 50. The values of the threshold parameters T1 and T2 are 0.5 and 0.2, respectively. In the normalization process, the size of the Gaussian window Λ in the spatial domain is 49 × 49, in which the value of σ is 10 and the value of ε is 0.3.
TABLE I Parameters used in our model.

Definition | Variable | Value
Number of frames used for background subtraction | m | 28
Size of the Gaussian derivative kernels | (gx, gy) | 40 × 40
Spatial extents | σx1, σy1, σx2, σy2 | 3, 5, 5, 3
Relative disparity between the foreground and background | d | 50
Threshold parameters | T1, T2 | 0.5, 0.2
Size of the Gaussian window | Λ | 49 × 49
Normalizing parameter | ε | 0.3
We examined the performance of our model by probing it with the KTH database. We selected several scenes from the KTH database with different subjects. As previously mentioned, our mechanism aims to eliminate the shadow from an image and to reconstruct it as a shadowless image. Here, we focus on scenes that contain shadows and select different indoor and outdoor scenarios. These scenarios include different scenes and moving objects accompanied by shadows cast in different directions. Fig. 4 shows four images from indoor scenarios of the KTH dataset. As described in Section 2, the shadow was detected and eliminated from each image through the proposed processing stages. Fig. 4 shows the results of these stages, in which each row represents one of the selected images (with its associated shadow) and each column shows the results of the corresponding stage. The first column ((a1-a4)-in) shows the original images that contain shadows, and the second column ((b1-b4)-in) presents the detected contour of the target object. The third column ((c1-c4)-in) shows the highlighted ROI. Finally, the fourth column ((d1-d4)-in) shows the reconstructed shadowless images. By way of illustration, the eliminated shadows are circled in the fourth column of Fig. 4. The results reveal that our mechanism can detect and eliminate the shadows from indoor images and reconstruct them as shadowless images. To verify the robustness of our mechanism for shadow removal from outdoor images, we tested it on several outdoor images from the KTH dataset, as shown in Fig. 5. We selected three scenarios containing shadows from the outdoor part of the database. These scenarios were probed through the suggested consecutive stages, and the results of each stage are shown in the corresponding column of Fig. 5. The first column ((a1-a3)-out) shows the original scenes of the selected scenarios, while the second and third columns ((b1-b3)-out) and ((c1-c3)-out) show the edge detection and the highlighted ROI, respectively. The shadowless images are reconstructed in the fourth column ((d1-d3)-out). Again, the regions of the eliminated shadows are circled in the fourth column of Fig. 5. The results show that our mechanism is able to detect and eliminate the shadows from outdoor scenes and reconstruct the images as shadowless images.
Fig. 4 Indoor images of the KTH dataset. The first column ((a1-a4)-in) shows the original images that contain shadows, and the second column ((b1-b4)-in) presents the detected contour of the target object. The third column ((c1-c4)-in) demonstrates the highlighted ROI. The fourth column ((d1-d4)-in) shows the reconstructed shadowless images, in which the regions of the eliminated shadows are circled.
In order to measure the accuracy of the shadow removal approach, we calculated the total error $E_{\text{total}}$, which is determined by

$$E_{\text{total}} = \frac{1}{n}\sum_{i=1}^{n} e_{i}, \qquad e_{i} = \frac{M_{i} - N_{i}}{M_{i}},$$

where n denotes the number of images, $e_{i}$ represents the shadow-removal error for image i, $N_{i}$ denotes the number of pixels freed from shadow in image i, and $M_{i}$ represents the number of shadow pixels in the corresponding original image i. The results show that the total error of our approach is 6%, which reflects its ability to detect and remove the shadow regions from indoor and outdoor scenarios.
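A minimal sketch of this error computation, under the assumption that the per-image error is the fraction of shadow pixels that remain after removal and that the total error averages these values over the n images:

```python
import numpy as np

def total_error(freed_pixels, shadow_pixels):
    """freed_pixels[i] = N_i, shadow_pixels[i] = M_i; returns the average per-image error."""
    n_i = np.asarray(freed_pixels, dtype=float)
    m_i = np.asarray(shadow_pixels, dtype=float)
    e_i = (m_i - n_i) / m_i   # fraction of shadow pixels left in image i (assumed form)
    return e_i.mean()         # E_total averaged over the n images
```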
Fig. 5 Outdoor images of the KTH dataset. The first column ((a1-a3)-out) shows the original images that contain shadows, and the second column ((b1-b3)-out) presents the detected contour of the target object. The third column ((c1-c3)-out) demonstrates the highlighted ROI. The fourth column ((d1-d3)-out) shows the reconstructed shadowless images, in which the regions of the eliminated shadows are circled.
We have presented a mechanism for
shadow detection and removal. The mechanism is organized in a cascade of five
consecutive processing stages. Contrast filtering is performed in the first
stage, followed by normalization of the obtained result in the second stage.
Next, edge extraction is conducted in the third stage to detect the contour of
the target object. In order to reconstruct the target object in a shadowless
image, the interior of the object is highlighted in the fourth stage and
reconstructed in the fifth stage. The results demonstrate that the proposed
processing stages enable the detection and elimination of shadow regions from
an image.
The distribution of the image intensity was investigated, and the potential for shadow elimination based on intensity distributions was described. In addition, the distribution of the image intensity can give an indication of the elements in the scene (i.e., object, shadow, and background). However, the distribution of the image intensity is affected by several factors, such as image resolution and lighting conditions. As a consequence, the intensity of the scene (object and shadow) differs from image to image depending on the adopted database. To demonstrate the robustness of the proposed stages in detecting and eliminating the shadows from images, we used the KTH database, which contains different scenarios (indoor and outdoor), and focused on the images that contain shadows. The results reflect the robustness of our approach for shadow detection and removal, with an error of 6%.
This approach can be extended to address shadow detection and removal in high-resolution colour images by analyzing and visualizing the distribution of the RGB intensities using different levels of data filtering. In addition, this approach can be applied in different applications and to scenes containing more than one object. Our current work exploits the generated shadowless images for human action recognition.
We would like to thank Prof. Heiko Neumann for his support and help. We would also like to thank the Institute of Neural Information Processing at Ulm University.
1.
K. Karsch et al., “Automatic Scene Inference for
3D Object Compositing.” arXiv, Dec. 24, 2019. doi: 10.48550/arXiv.1912.12297.
2.
T. Okabe, I. Sato, and Y. Sato, “Attached shadow
coding: Estimating surface normals from shadows under unknown reflectance and
lighting conditions,” in 2009 IEEE 12th International Conference on Computer
Vision, Kyoto: IEEE, Sep. 2009, pp. 1693–1700. doi: 10.1109/ICCV.2009.5459381.
3.
R. Cucchiara, C. Crana, M. Piccardi, A. Prati,
and S. Sirotti, “Improving shadow suppression in moving object detection with
HSV color information,” in ITSC 2001. 2001 IEEE Intelligent Transportation
Systems. Proceedings (Cat. No.01TH8585), Oakland, CA, USA: IEEE, 2001, pp.
334–339. doi: 10.1109/ITSC.2001.948679.
4.
L. I. Abdul-Kreem and H. Neumann, “Estimating
visual motion using an event-based artificial retina,” in Imaging and Computer
Graphics Theory and Applications, in Series of CCIS Communications in Computer
and Information Science. Computer Vision. Springer International Publishing
Switzerland, 2016, pp. 396–415.
5.
L. I. Abdul-Kreem and H. K. Abdul-Ameer, “Object
tracking using motion flow projection for pan-tilt configuration,” Int. J.
Electr. Comput. Eng. IJECE, vol. 10, no. 5, p. 4687, Oct. 2020, doi:
10.11591/ijece.v10i5.pp4687-4694.
6.
R. K. Sasi and V. K. Govindan, “Shadow Detection
and Removal from Real Images: State of Art,” in Proceedings of the Third
International Symposium on Women in Computing and Informatics - WCI ’15, Kochi,
India: ACM Press, 2015, pp. 309–317. doi: 10.1145/2791405.2791450.
7.
C. M., G. R., V. T., and A. C.B., “Cast and Self
Shadow Segmentation in Video Sequences using Interval based Eigen Value
Representation,” Int. J. Comput. Appl., vol. 142, no. 4, pp. 27–32, May 2016,
doi: 10.5120/ijca2016909752.
8.
. Xu, J. Liu, X. Li, Z. Liu, and X. Tang,
“Insignificant shadow detection for video segmentation,” IEEE Trans. Circuits
Syst. Video Technol., vol. 15, no. 8, pp. 1058–1064, Aug. 2005, doi:
10.1109/TCSVT.2005.852402.
9.
L. Hou, T. F. Y. Vicente, M. Hoai, and D.
Samaras, “Large Scale Shadow Annotation and Detection Using Lazy Annotation and
Stacked CNNs,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 43, no. 4, pp.
1337–1351, Apr. 2021, doi: 10.1109/TPAMI.2019.2948011.
10.
Q. Zheng, X. Qiao, Y. Cao, and R. W. H. Lau,
“Distraction-Aware Shadow Detection,” in 2019 IEEE/CVF Conference on Computer
Vision and Pattern Recognition (CVPR), Long Beach, CA, USA: IEEE, Jun. 2019,
pp. 5162–5171. doi: 10.1109/CVPR.2019.00531.
11.
W. Xu et al., “Shadow detection and removal in
apple image segmentation under natural light conditions using an ultrametric
contour map,” Biosyst. Eng., vol. 184, pp. 142–154, Aug. 2019, doi:
10.1016/j.biosystemseng.2019.06.016.
12.
H. K. Suh, J. W. Hofstee, and E. J. van Henten,
“Improved vegetation segmentation with ground shadow removal using an HDR
camera,” Precis. Agric., vol. 19, no. 2, pp. 218–237, Apr. 2018, doi:
10.1007/s11119-017-9511-z.
13.
J. Wang, X. Li, and J. Yang, “Stacked
Conditional Generative Adversarial Networks for Jointly Learning Shadow
Detection and Shadow Removal,” in 2018 IEEE/CVF Conference on Computer Vision
and Pattern Recognition, Salt Lake City, UT: IEEE, Jun. 2018, pp. 1788–1797. doi:
10.1109/CVPR.2018.00192.
14.
S. K. Yarlagadda et al., “Shadow Removal
Detection and Localization for Forensics Analysis,” in ICASSP 2019 - 2019 IEEE
International Conference on Acoustics, Speech and Signal Processing (ICASSP),
May 2019, pp. 2677–2681. doi: 10.1109/ICASSP.2019.8683695.
15.
Y. Ke, F. Qin, W. Min, and G. Zhang, “Exposing
Image Forgery by Detecting Consistency of Shadow,” Sci. World J., vol. 2014,
pp. 1–9, 2014, doi: 10.1155/2014/364501.
16.
D. Hutchison et al., “Detecting Ground Shadows
in Outdoor Consumer Photographs,” in Computer Vision – ECCV 2010, K.
Daniilidis, P. Maragos, and N. Paragios, Eds., in Lecture Notes in Computer
Science, vol. 6312. Berlin, Heidelberg: Springer Berlin Heidelberg, 2010, pp.
322–335. doi: 10.1007/978-3-642-15552-9_24.
17.
C. C. Newey, O. D. Jones, and H. M. Dee, “Shadow
detection for mobile robots: Features, evaluation, and datasets,” Spat. Cogn.
Comput., vol. 18, no. 2, pp. 115–137, Apr. 2018, doi: 10.1080/13875868.2017.1322088.
18.
R. Guo, Q. Dai, and D. Hoiem, “Paired Regions
for Shadow Detection and Removal,” IEEE Trans. Pattern Anal. Mach. Intell.,
vol. 35, no. 12, pp. 2956–2967, Dec. 2013, doi: 10.1109/TPAMI.2012.214.
19.
L. Qu, J. Tian, S. He, Y. Tang, and R. W. H.
Lau, “DeshadowNet: A Multi-context Embedding Deep Network for Shadow Removal,”
in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR),
Honolulu, HI: IEEE, Jul. 2017, pp. 2308–2316. doi: 10.1109/CVPR.2017.248.
20.
N. Inoue and T. Yamasaki, “Learning From
Synthetic Shadows for Shadow Detection and Removal,” IEEE Trans. Circuits Syst.
Video Technol., vol. 31, no. 11, pp. 4187–4197, Nov. 2021, doi:
10.1109/TCSVT.2020.3047977.
21.
Shao-Yi Chien, Shyh-Yih Ma, and Liang-Gee Chen,
“Efficient moving object segmentation algorithm using background registration
technique,” IEEE Trans. Circuits Syst. Video Technol., vol. 12, no. 7, pp.
577–586, Jul. 2002, doi: 10.1109/TCSVT.2002.800516.
22.
L. Zhang, Q. Zhang, and C. Xiao, “Shadow Remover:
Image Shadow Removal Based on Illumination Recovering Optimization,” IEEE
Trans. Image Process., vol. 24, no. 11, pp. 4623–4636, Nov. 2015, doi:
10.1109/TIP.2015.2465159.
23.
X. Fan et al., “Shading-aware shadow detection
and removal from a single image,” Vis. Comput., vol. 36, no. 10–12, pp.
2175–2188, Oct. 2020, doi: 10.1007/s00371-020-01916-3.
24.
X. Hu, C.-W. Fu, L. Zhu, J. Qin, and P.-A. Heng,
“Direction-Aware Spatial Context Features for Shadow Detection and Removal,”
IEEE Trans. Pattern Anal. Mach. Intell., vol. 42, no. 11, pp. 2795–2808, Nov.
2020, doi: 10.1109/TPAMI.2019.2919616.
25.
T. Wang, X. Hu, Q. Wang, P.-A. Heng, and C.-W.
Fu, “Instance Shadow Detection,” in 2020 IEEE/CVF Conference on Computer Vision
and Pattern Recognition (CVPR), Seattle, WA, USA: IEEE, Jun. 2020, pp.
1877–1886. doi: 10.1109/CVPR42600.2020.00195.
26.
L. I. Abdul-Kreem and H. Neumann, “Neural
Mechanisms of Cortical Motion Computation Based on a Neuromorphic Sensory
System,” PLOS ONE, vol. 10, no. 11, p. e0142488, Nov. 2015, doi:
10.1371/journal.pone.0142488.
27.
L. I. Abdul-Kreem and H. Neumann, “Bio-inspired
Model for Motion Estimation using an Address-event Representation,” in
Proceedings of the 10th International Conference on Computer Vision Theory and
Applications, Berlin, Germany: SCITEPRESS - Science and Technology
Publications, 2015, pp. 335–346. doi: 10.5220/0005311503350346.
28.
C. Schuldt, I. Laptev, and B. Caputo,
“Recognizing human actions: a local SVM approach,” in Proceedings of the 17th
International Conference on Pattern Recognition, 2004. ICPR 2004., Cambridge,
UK: IEEE, 2004, pp. 32-36 Vol.3. doi: 10.1109/ICPR.2004.1334462.
29.
C. Stauffer and W. E. L. Grimson, “Adaptive
background mixture models for real-time tracking,” in Proceedings. 1999 IEEE
Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No
PR00149), Fort Collins, CO, USA: IEEE Comput. Soc, 1999, pp. 246–252. doi:
10.1109/CVPR.1999.784637.
30.
L. I. Abdul-Kreem, “Computational architecture
of a visual model for biological motions segregation,” Netw. Comput. Neural
Syst., vol. 30, no. 1–4, pp. 58–78, Oct. 2019, doi:
10.1080/0954898X.2019.1655173.
31.
K. Schindler and L. van Gool, “Action snippets:
How many frames does human action recognition require?,” in 2008 IEEE
Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA:
IEEE, Jun. 2008, pp. 1–8. doi: 10.1109/CVPR.2008.4587730.