In recent years, the image analysis landscape has witnessed a paradigm shift with the emergence of the vision transformer as a better alternative to Convolutional Neural Networks (CNNs). Transformers process sequences globally, using self-attention to capture long-range features, while CNNs extract features locally through convolutional operations. We propose adopting the Swin Transformer as a backbone for calcification cluster detection in mammography and assess its efficacy through a comprehensive experimental study comparing transformer-based and CNN-based models. Our experiments, conducted on the large-scale mammography image database OMI-DB, show that the Swin Transformer architecture clearly outperforms its convolutional counterparts. The best-performing Swin backbone obtained a sensitivity of 80.67% at 0.1 false positives per image, an improvement of 3.34% over the best convolutional backbone. Our findings underscore the efficacy of transformer-based architectures for detecting clusters of calcifications in mammography, offering improved diagnostic accuracy in this field.
Transformer Models for Enhanced Calcifications Detection in Mammography
Cantone, Marco; Marrocco, Claudio (Supervision); Tortorella, Francesco (Supervision); Bria, Alessandro (Supervision)
2025-01-01

