Overview of neural network object detection methods & models on the example of their use for lab animal observation

DOI:

https://doi.org/10.20535/SRIT.2308-8893.2025.4.05

Keywords:

object detection, neural network, neural layer, architecture, model, optimization, estimation, prediction, video, image, frame, background, foreground, experiment, comparison

Abstract

This article gives a brief overview of the most common baseline neural network models for object detection. The need to automate surveillance and observation processes continues to grow, and a key task in such processes is usually the detection of an object of interest for further analysis. Many basic object detection algorithms and approaches have been proposed in the past; however, most of them are limited in their applicability. In most cases, these limitations stem from the nature of the observed environment or from the reliance of the detection approach on specific object characteristics, such as color or basic shapes only. Object detection approaches based on neural networks have been developed to address these problems. This paper presents the foundations and central aspects of the most common neural network object detection models. An experiment demonstrates the features, advantages, and disadvantages of the studied methods as applied to the detection of lab animals during behavioral studies. Based on the results, conclusions and recommendations for the use of these methods are given.

Published

2025-12-29

Section

Methods, models, and technologies of artificial intelligence in system analysis and control