Multimodal system for skin cancer detection
DOI:
https://doi.org/10.20535/SRIT.2308-8893.2026.1.03
Keywords:
medical image classification, computer vision, gradient boosting, deep neural networks, clinical decision support systems
Abstract
Early melanoma detection is vital for timely diagnosis and effective treatment. Deep learning models trained on dermoscopic images have shown promise, but they depend on specialized equipment, which limits their use in broader clinical settings. This study introduces a multimodal melanoma detection system that works with conventional photographic images, making it more accessible and versatile. The system integrates image data with tabular metadata, such as patient demographics and lesion characteristics, to improve detection accuracy. It employs a multimodal neural network that combines image and metadata processing, and it supports a two-step model for cases with or without metadata. A three-stage pipeline further refines the predictions with boosting algorithms. To address the challenges of a highly imbalanced dataset, specific techniques were implemented to ensure robust training. An ablation study evaluated recent vision architectures, boosting algorithms, and loss functions, reaching a peak partial ROC AUC of 0.18068 (out of a maximum of 0.2) and a top-15 retrieval sensitivity of 0.78371. The results demonstrate that integrating photographic images with metadata in a structured, multi-stage pipeline yields significant performance improvements. The system advances melanoma detection by providing a scalable, equipment-independent solution suitable for diverse healthcare environments, bridging the gap between specialized and general clinical practice.
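The 0.2 ceiling quoted for the partial ROC AUC is easier to interpret with a small worked example. The sketch below assumes the metric is the area between the ROC curve and the line TPR = 0.8, which is the convention used by the ISIC 2024 challenge; the abstract itself states only the maximum value, so the cutoff is an assumption. The function names `roc_points` and `partial_auc_above_tpr` are hypothetical helpers written for illustration and are not taken from the paper.

```python
def roc_points(labels, scores):
    """Return ROC vertices (fpr, tpr), adding one vertex per distinct score."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Tied scores cross the threshold together, so they share one vertex.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def partial_auc_above_tpr(labels, scores, min_tpr=0.8):
    """Area between the ROC curve and the line TPR = min_tpr.

    The cutoff min_tpr=0.8 is an assumption; the largest possible
    value is 1 - min_tpr, i.e. 0.2 for the default.
    """
    area = 0.0
    points = roc_points(labels, scores)
    for (f0, t0), (f1, t1) in zip(points, points[1:]):
        if t1 <= min_tpr or f1 == f0:
            continue  # segment lies on or below the cutoff, or is vertical
        if t0 < min_tpr:
            # Segment crosses the cutoff: keep only the part above it.
            f0 = f0 + (min_tpr - t0) / (t1 - t0) * (f1 - f0)
            t0 = min_tpr
        # Trapezoid between the segment and the cutoff line.
        area += (f1 - f0) * ((t0 + t1) / 2 - min_tpr)
    return area


# Toy data (not from the paper): two benign (0) and two malignant (1) lesions.
labels = [0, 0, 1, 1]
perfect = partial_auc_above_tpr(labels, [0.1, 0.2, 0.8, 0.9])
uninformative = partial_auc_above_tpr(labels, [0.5, 0.5, 0.5, 0.5])
print(round(perfect, 5), round(uninformative, 5))  # prints 0.2 0.02
```

Under this assumed definition a perfect ranking reaches 0.2 and an uninformative one reaches 0.02, so the reported 0.18068 would correspond to roughly 90% of the attainable area.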