Breast cancer remains a serious threat to global health, and creative approaches to early identification are needed to improve patient outcomes. This study investigates the potential of deep learning methods to improve the precision and efficiency of mammography interpretation for breast cancer detection. In this paper, a convolutional neural network (CNN) architecture, ResNet50, is built and trained on a sizable dataset of annotated mammograms. The CNN automatically identifies and extracts pertinent elements, such as microcalcifications, masses, and architectural distortions, that may be indicative of possible cancers. Through an iterative process of training and validation, the model learns to distinguish between benign and malignant cases, ultimately displaying a high level of discriminatory accuracy. The findings show that the deep learning model outperforms conventional mammography interpretation in terms of sensitivity and specificity for detecting breast cancer. Furthermore, the model's generalizability across a range of patient demographics and imaging technologies highlights its potential for use in real clinical settings. This study represents a significant step toward improving radiologists' capacity for early breast cancer detection. By lowering false positives, improving accuracy, and offering rapid analysis, our deep learning-based architecture holds promise for streamlining the screening procedure and potentially easing the difficulties brought on by radiologist shortages. By utilising cutting-edge technology to enable prompt and efficient detection, this study contributes to the international healthcare community's continuing efforts to improve breast cancer outcomes.
Early detection and prompt treatment have become essential elements in the fight against breast cancer-related mortality, and routine mammography screening has been central to attaining these goals. Mammography-based screening, however, is expensive and resource-intensive because it relies on the expertise of human professionals. A number of countries also face a shortage of radiologists, which worsens the current situation. A further significant drawback is the high frequency of erroneous results associated with mammographic exams. This circumstance gives rise to numerous problems, including unjustified emotional suffering for patients, unnecessary follow-up care, additional imaging studies, and occasionally invasive tissue-collection procedures such as needle biopsies.
The ongoing advancement of deep learning, in particular, has sparked the medical imaging community's interest in using these methods to improve the accuracy of cancer detection. Breast cancer is the second leading cause of cancer death among women in the United States. Screening mammography is a crucial tool for lowering mortality rates. Nevertheless, despite these benefits, screening mammography suffers from high rates of false positives and false negatives. In the United States, digital screening mammography has an average sensitivity and specificity of 86.9% and 88.9%, respectively. Since the 1990s, radiologists have used computer-assisted detection and diagnosis (CAD) software to improve the predictive accuracy of screening mammography. Unfortunately, the first generation of commercial CAD systems did not significantly increase performance, which led to a decade-long halt in development. The emergence of deep learning, however, has reignited interest in building deep learning tools for radiologists because of its exceptional performance in object recognition and other fields. Recent research shows that deep learning-based CAD systems can even outperform radiologists working independently when employed in support mode.
Detecting subclinical breast cancer with screening mammography can be difficult because of the small tumour size relative to the overall breast image. The majority of research has concentrated on classifying annotated lesions, with little attention paid to the full mammogram. This approach has drawbacks, especially when working with datasets that lack region-of-interest (ROI) annotations. Some studies have attempted to train neural networks on whole mammograms without depending on annotations, although it remains unknown whether these methods can localise clinically important abnormalities. Pre-training shows promise in addressing the need for huge training datasets: by initialising a classifier's weights with features learned from a different dataset, pre-training enables quicker and more accurate classification. In this work, we present an "end-to-end" method that uses a fully annotated dataset to classify local image patches. The whole-image classifier is then initialised with the patch classifier's weights and can be fine-tuned on datasets without ROI annotations. This approach improves breast cancer classification algorithms by utilising both larger unlabelled datasets and fully annotated datasets.
To put this strategy into practice, we used a sizable public database of digitised film mammography to build patch and full-image classifiers. These classifiers were then transferred to a smaller public full-field digital mammography (FFDM) database. Our study considers alternative network designs for building these classifiers and investigates various training methods. It highlights the benefits and drawbacks of different training strategies while also outlining an efficient workflow for creating full-image classifiers.
In particular, the RSNA, which plays a significant role in this field, organises the prestigious competition. The RSNA is a non-profit organisation that represents 31 radiological subspecialties from 145 countries worldwide. Through education, innovative research projects, and a continuous search for cutting-edge technological solutions, the organisation is dedicated to improving medical care and its delivery. In the face of these pressing problems and the pursuit of better techniques for breast cancer screening, machine learning expertise has the potential to serve as a catalyst: it can improve and streamline the review procedure radiologists use to evaluate screening mammograms. This could lessen the burden on healthcare systems and increase the efficiency of breast cancer detection, especially given the shortage of radiologists. Ultimately, such contributions have the potential to usher in a new era of accuracy, effectiveness, and improved patient outcomes in the study and treatment of breast cancer.
A) Review of Literature
Breast cancer detection, which relies primarily on mammography images, is developing quickly. This section provides a brief summary of recent developments in the field. Deep convolution and belief networks are used in [1] to categorize mass mammograms with a structured support vector machine (SSVM) and a conditional random field (CRF); the CRF performed better in terms of training and inference time. A full-resolution convolutional network (FrCN) evaluated with four-fold cross-validation on the INbreast X-ray mammogram dataset achieved an excellent F1 score of 99.24%, an accuracy of 95.96%, and a Matthews correlation coefficient (MCC) of 98.96%. Another study [2] proposed BDR-CNN-GCN for the MIAS dataset, combining a graph-convolutional network (GCN) with an 8-layer CNN to reach 96.10% accuracy. With 96.50% accuracy and 93.50% MCC, a modified YOLOv5 network in [3] successfully detected and classified breast tumors, outperforming YOLOv3 and Faster R-CNN. [4, 5] proposed the diversified features (DFeBCD) technique for classifying mammograms into normal and abnormal using an integrated classifier influenced by emotion learning, achieving an accuracy of 80.30%. To counter overfitting, [6] presented a deep CNN with transfer learning (TL), producing high accuracies on several datasets: INbreast (95.5%), DDSM (97.35%), and BCDR (96.67%). [7] utilized the lifting wavelet transform (LWT) for feature extraction, producing accuracies of 95.70% (MIAS) and 98.06% (DDSM) using moth flame optimization and an extreme learning machine (ELM).
The CNN Inception-v3 model in [8] attained 0.88 sensitivity, 0.87 specificity, and 0.946 AUC. Eight fine-tuned pretrained models combining CNNs and TL were proposed by [9] for categorization. Hybrid models using MobileNet, ResNet50, and AlexNet attained 95.6% accuracy [10]. Four CNN architectures (VGG19, InceptionV3, ResNet50, and VGG16) were trained on 5,000 images and evaluated on 1,007 images [11]. SVM's accuracy on the MIAS and DDSM databases was 96.30%. The SVM with the gray level co-occurrence matrix (GLCM) in [5, 12] reached 93.88% accuracy. On the DDSM and CBIS-DDSM datasets, AlexNet with data augmentation obtained 71.01% accuracy (87.2% when combined with an SVM). DenseNet extracted features and used FC layers for classification [6]. The DICNN technique in [13] combines morphological operations with a dilated semantic segmentation network to achieve 98.9% accuracy.
Despite these advances, problems remain, including tumor localization, memory complexity, processing time, and the data requirements of deep learning. To address them, a novel method for identifying and categorizing breast cancer is covered in the section that follows.

Mammography stands out as a proficient screening method in the field of medical imaging: its heightened sensitivity to breast calcifications makes it superior at recognizing micro-calcifications or clusters of calcifications. Mammography plays a crucial role in the early detection of these tiny calcifications, which act as precursors of malignant breast cancers. Low-dose X-ray mammography effectively picks up minute changes and anomalies such as structural deformations, bilateral asymmetry, nodules, densities, and, most significantly, calcifications. During a screening mammogram, which is recommended for women aged 40 and older, each breast is imaged twice. Additional mammograms are advised when worrisome regions appear. Diagnostic mammograms, by contrast, are focused investigations used for accurate diagnosis, concentrating on particular regions or developing abnormalities.
Ultrasound imaging, in contrast to mammography, produces monochrome images with poorer resolution; it excels at differentiating cysts from solid masses and reveals malignant areas as amorphous shapes with fuzzy edges. Magnetic Resonance Imaging (MRI), a non-invasive technology that uses magnetic fields and radio waves, excels in sensitivity and provides precise cross-sectional images, although it requires considerable time and money. Histopathology, the gold standard, makes microscopic tissue examination possible and provides the vital phenotypic data crucial for both diagnosis and treatment. Thermography, an emerging method, can identify breast irregularities by examining heat patterns, but further tests are still needed for a definitive diagnosis. Because of its aptitude for categorizing, identifying, and segmenting breast cancer, deep learning (DL) has become a game-changer. Identification and prognosis have taken advantage of DL's ability to handle highly dimensional and correlated data, and transfer learning, feature extraction, and generative adversarial networks have all been applied successfully. Despite the advancements, difficulties still exist, such as the scarcity of readily available biological datasets and privacy concerns. The journey toward earlier breast cancer identification and improved treatment approaches continues with the convergence of deep learning and medical imaging, offering the possibility of improved patient outcomes, as shown in Table 1.
Method | Algorithm / Approach | Accuracy | Advantages | Scope |
SSVM and CRF [16] | Structured support vector machine (SSVM) and conditional random field (CRF) | N/A | CRF outperformed SSVM in training and inference time | Breast cancer classification in mammograms using structured prediction techniques |
FrCN [4] | Full-resolution convolutional network (FrCN) | F1: 99.24%, Accuracy: 95.96%, MCC: 98.96% | High performance on X-ray mammograms | Breast cancer classification with a full-resolution convolutional network |
BDR-CNN-GCN [5] | Combination of a graph-convolutional network (GCN) and an 8-layer CNN (BDR-CNN-GCN) | Accuracy: 96.10% | Effective combination of GCN and CNN | Breast cancer classification using a combined GCN and CNN |
Modified YOLOv5 [14] | Modified YOLOv5 | Accuracy: 96.50%, MCC: 93.50% | Improved results compared to YOLOv3 and Faster R-CNN | Breast tumor detection and classification using modified YOLOv5 |
DFeBCD [15] | Diversified features (DFeBCD) | Accuracy: 80.30% | Classifier influenced by emotion learning | Categorization of mammograms into normal and abnormal using diversified features |
TL Deep-CNN [17] | Transfer learning (TL) with a deep CNN | Accuracy: INbreast (95.5%), DDSM (97.35%), BCDR (96.67%) | Improved performance on multiple datasets | Breast cancer detection and classification using transfer learning and a deep CNN |
LWT [18] | Lifting wavelet transform (LWT) | Accuracy: MIAS (95.70%), DDSM (98.06%) | Effective feature extraction using LWT | Feature extraction using the lifting wavelet transform |
Inception-v3 [11] | CNN Inception-v3 model | Sensitivity: 0.88, Specificity: 0.87, AUC: 0.946 | High sensitivity and AUC | Breast cancer detection using the CNN Inception-v3 model |
CNN TL [19] | CNN with transfer learning (TL) | N/A | Enhanced classification using transfer learning | Breast cancer classification using CNN and transfer learning |
Hybrid models [2] | Combination of MobileNet, ResNet50, and AlexNet | Accuracy: 95.6% | Improved classification accuracy | Hybrid classification model using multiple CNN architectures |
4 CNN architectures [20] | VGG19, InceptionV3, ResNet50, and VGG16 | N/A | Utilization of various CNN architectures | Breast cancer classification using different CNN architectures |
SVM [21] | SVM classifier | Accuracy: 96.30% | Effective classification using SVM | Breast cancer detection using an SVM classifier |
SVM and GLCM [22] | SVM classifier with the gray level co-occurrence matrix (GLCM) | Accuracy: 93.88% | Enhanced classification using GLCM and SVM | Breast cancer detection using SVM and the gray level co-occurrence matrix |
AlexNet and SVM [23] | AlexNet with an SVM classifier | Accuracy: 71.01% (87.2% with SVM) | Improved classification using data augmentation | Breast cancer classification using AlexNet and SVM |
DenseNet [11] | DenseNet deep learning framework | N/A | Feature extraction and classification using FC layers | Breast cancer classification using DenseNet |
DICNN [6] | Dilated semantic segmentation network with morphological operations | Accuracy: 98.9% | Effective combination of segmentation and morphological operations | Breast cancer classification using DICNN with morphological operations |
B) Dataset Description
The RSNA Breast Cancer Diagnosis dataset, shown in Figure 2, is a collection of medical images created to support the development of machine learning models for the early diagnosis of breast cancer. The dataset aims to advance medical image analysis while increasing the precision of breast cancer screening.
The RSNA Breast Cancer Diagnosis dataset's main goal is to facilitate the design and testing of deep learning models and other machine learning techniques for mammogram-based early diagnosis of breast cancer. Early detection of breast cancer signs is essential for effective treatment and better patient outcomes. Specialists in medical imaging can use this dataset to develop and benchmark machine learning methods. To protect patient privacy and data security, access and usage may need to follow data usage agreements and ethical standards (Table 2).
Attribute | Type | Class | Area |
Mammography Images | Medical Images | Healthy, Suspected Abnormal | Breast Tissue |
Annotations (ROIs) | Bounding Boxes, Masks | Suspected Abnormal | ROI within Images |
Because datasets and their properties can change over time, it is advisable to consult the official sources or documentation published by the RSNA for the most accurate and recent information. When using medical imaging collections, always follow ethical norms and secure the required permissions. The "Class" refers to the grouping of images based on health status, while "Area" refers to the precise region of breast tissue shown in the images as well as any regions of interest (ROIs) flagged as potentially abnormal. Table 3 describes the dataset attributes.
Attribute | Description |
site_id | Hospital's source ID number. |
patient_id | The patient's ID number. |
image_id | ID number for the picture. |
laterality | Specifies whether the breast in the image is on the left or right. |
view | Describes the image's orientation; a screening test normally includes two images of each breast. |
age | Age of the patient in years. |
implant | Whether the patient had breast implants is indicated. (site 1 only offers data at the patient level.) |
density | Breast tissue density is rated on a scale of A (least dense) to D (most dense). |
machine_id | A device ID number for the imaging system used. |
cancer | Whether malignant cancer was detected in the breast. (Target value; train only.) |
biopsy | Indicates whether a follow-up biopsy was performed on the breast. (Train only.) |
invasive | If the breast is cancer-positive, indicates whether the cancer was invasive. (Train only.) |
BIRADS | Ratings: 0 (follow-up required), 1 (rated negative for cancer), and 2 (rated normal). (Train only.) |
prediction_id | ID for the matched submission row; the same prediction ID is shared by several images. (Test only.) |
difficult_negative_case | Indicates whether the case was particularly challenging. (Train only.) |
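As a brief illustration of how this tabular metadata can be explored, the following Python sketch loads the training table and summarises a few of the columns from Table 3. The file name `train.csv` and the Kaggle-style layout are assumptions; adjust the path to the actual dataset release.

```python
import pandas as pd

# Hypothetical path; the RSNA release ships the metadata of Table 3 as a CSV.
df = pd.read_csv("train.csv")

# Peek at a few of the attributes described in Table 3.
print(df[["site_id", "patient_id", "image_id", "laterality", "view",
          "age", "density", "cancer"]].head())

# Example exploration: cancer prevalence by breast-density category (A-D).
print(df.groupby("density")["cancer"].mean())
```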
The system architecture envisioned for early cancer detection is shown in Figure 1. To improve diagnostic precision, it incorporates cutting-edge technologies such as deep learning and medical imaging. From the collection of medical images to automated analysis with state-of-the-art algorithms, the architecture emphasises seamless data flow. This streamlined procedure aims to enhance early diagnosis by quickly identifying anomalies. Through the synergy of these methodologies, the proposed system intends to improve patient outcomes by enabling rapid intervention and treatment.
The procedure begins with preprocessing the mammography images and pertinent metadata, such as patient age and breast tissue density, as shown in Figure 3. These preprocessed inputs are then fed to the CNN, which specialises in extracting complex features from medical images. The convolutional layers of the CNN analyse local patterns to assist in the early detection of anomalies. The ResNet50 architecture, renowned for its deep layers and residual connections, is used to capture more complex features. ResNet50 improves feature extraction, allowing the network to recognise small irregularities that might be a sign of cancer.
The combined outputs from the CNN and ResNet50 are then passed to the RNN, which can analyse sequential data. In this setting, the RNN examines data such as patient age and imaging-device history to identify temporal patterns and correlations. The RNN's capacity to model sequential dependencies enables a thorough understanding of the context of each case. The combined representation is then passed to the classification layer, which makes the final determination of the presence of cancer.
This integration of CNN, ResNet50, and RNN provides a comprehensive evaluation of the likelihood of breast cancer. The method combines image analysis, deep feature extraction, and sequential context modelling, boosting the chance of early cancer detection and providing doctors with better-informed insights for precise diagnosis and prompt intervention, as illustrated in Figure 4.
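The following is a minimal PyTorch sketch of such a CNN + ResNet50 + RNN fusion, written to mirror the data flow described above. It is an illustrative assumption rather than the exact implementation: the class name `HybridBreastCancerNet`, the use of a GRU for the sequential metadata, and all layer sizes are hypothetical.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class HybridBreastCancerNet(nn.Module):
    def __init__(self, meta_features=4, hidden=128):
        super().__init__()
        # ResNet50 backbone: deep image features (2048-d after global pooling).
        backbone = resnet50(weights=None)  # load pretrained weights in practice
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])
        # GRU over the sequential metadata (e.g. age, density, device history).
        self.rnn = nn.GRU(input_size=meta_features, hidden_size=hidden,
                          batch_first=True)
        # Classification layer fuses image and metadata representations.
        self.classifier = nn.Sequential(
            nn.Linear(2048 + hidden, 256), nn.ReLU(),
            nn.Linear(256, 1))  # single logit: cancer vs. no cancer

    def forward(self, image, meta_seq):
        img_feat = self.cnn(image).flatten(1)          # (B, 2048)
        _, h_n = self.rnn(meta_seq)                    # h_n: (1, B, hidden)
        fused = torch.cat([img_feat, h_n[-1]], dim=1)  # feature fusion
        return self.classifier(fused)                  # (B, 1) logit

# Example: two 3-channel mammograms plus length-5 sequences of 4 metadata values.
model = HybridBreastCancerNet()
logits = model(torch.randn(2, 3, 256, 256), torch.randn(2, 5, 4))
```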
A) Convolutional Neural Network (CNN)
A CNN is exceptional at capturing complex details, patterns, and spatial relationships in images. CNNs can recognise complicated patterns because they automatically learn feature hierarchies from raw pixel inputs, as shown in Figure 5. CNNs comprise convolutional, pooling, and fully connected layers. The convolutional layers carry out localised convolutions to identify edges, textures, and shapes. Subsequent pooling layers shrink the spatial dimensions while preserving crucial information. Fully connected layers then interpret the features gathered by earlier layers for classification or regression tasks.
Because they automatically extract pertinent features without manual feature engineering, CNNs are widely used in image recognition, object detection, and medical imaging. Their hierarchical structure closely resembles visual processing in the human brain, making them effective tools for tasks requiring advanced image analysis, such as the early diagnosis of diseases like breast cancer.
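As an illustration of this convolution-pooling-FC layer stack, here is a minimal sketch; the channel counts, kernel sizes, and 256 × 256 grayscale input are illustrative choices, not the network used in this work.

```python
import torch
import torch.nn as nn

# Minimal CNN: convolution -> pooling -> fully connected, as described above.
cnn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),   # edges, textures
    nn.MaxPool2d(2),                                          # halve spatial size
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),  # higher-level shapes
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 64 * 64, 2),   # classify a 256x256 input into two classes
)
out = cnn(torch.randn(1, 1, 256, 256))   # -> torch.Size([1, 2])
```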
Algorithm Steps:
Step 1: Data Collection and Preparation:
Step 2: Data Preprocessing:
Step 3: Data Annotation:
\(B_{i} = (x_{min}, y_{min}, x_{max}, y_{max})\)
Step 4: Model Selection:
Step 5: Model Architecture:
Customize the chosen architecture to include:
Step 6: Loss Function:
\(L_{total}(I_{i}, B_{i}, C_{gt}, \Delta B_{gt}) = L_{classification}(C(I_{i}; \theta), C_{gt}) + \lambda \times L_{regression}(R(I_{i}; \theta), \Delta B_{gt})\)
where \(C_{gt}\) is the ground-truth class label (1 for a suspicious lesion, 0 for normal tissue), \(\Delta B_{gt}\) is the ground-truth bounding-box offset, and \(\lambda\) is a hyperparameter that balances the two losses.
Step 7: Training:
\(\theta^{*} = \arg\min_{\theta} \frac{1}{N} \sum_{i} L_{total}(I_{i}, B_{i}, C_{gt}, \Delta B_{gt})\)
Step 8: Evaluation:
\(IoU(B_{pred}, B_{gt}) = \frac{\text{Area of Overlap}}{\text{Area of Union}}\)
Step 9: Fine-tuning and Optimization:
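As a concrete reading of Steps 6-8, the sketch below implements the combined loss and the IoU metric. Using binary cross-entropy for the classification term and smooth L1 for the box-offset regression is an assumption; the text does not name the component losses.

```python
import torch
import torch.nn.functional as F

def iou(box_pred, box_gt):
    """IoU for boxes given as (x_min, y_min, x_max, y_max) tensors (Step 8)."""
    x1 = torch.max(box_pred[..., 0], box_gt[..., 0])
    y1 = torch.max(box_pred[..., 1], box_gt[..., 1])
    x2 = torch.min(box_pred[..., 2], box_gt[..., 2])
    y2 = torch.min(box_pred[..., 3], box_gt[..., 3])
    overlap = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (box_pred[..., 2] - box_pred[..., 0]) * (box_pred[..., 3] - box_pred[..., 1])
    area_g = (box_gt[..., 2] - box_gt[..., 0]) * (box_gt[..., 3] - box_gt[..., 1])
    return overlap / (area_p + area_g - overlap)

def total_loss(class_logit, class_gt, offset_pred, offset_gt, lam=1.0):
    """L_total = L_classification + lambda * L_regression (Step 6)."""
    l_cls = F.binary_cross_entropy_with_logits(class_logit, class_gt)
    l_reg = F.smooth_l1_loss(offset_pred, offset_gt)
    return l_cls + lam * l_reg
```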
B) Recurrent Neural Network (RNN)
Recurrent neural networks (RNNs) are a subclass of artificial neural networks created specifically to process sequential data while keeping track of prior inputs. Unlike standard feedforward networks, RNNs have internal loops that allow them to maintain information across time steps, so they excel at tasks involving sequences or time-dependent patterns. Time-series analysis, speech recognition, and natural language processing all benefit greatly from this architecture. RNNs process input one step at a time, using both the most recent input and knowledge from earlier steps. Traditional RNNs, however, may suffer from vanishing gradients, which restrict their capacity to capture long-range dependencies. Variants such as Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU), which regulate information flow more effectively, were introduced to solve this problem. Their innate memory and capacity to learn temporal patterns make RNNs effective tools for tasks where sequential context plays a significant role, such as sentiment analysis, language generation, and medical data analysis, including early cancer detection.
1. Recurrent neural network (RNN) algorithm:
Step 1: Initialise the parameters:
Step 2: Initialise Hidden State:
\(activation(x) = \frac{1}{1 + e^{-x}}\)
Step 3: Loop across time steps:
\(h_{t} = activation(W_{input \rightarrow hidden} \times x_{t} + W_{hidden \rightarrow hidden} \times h_{t-1} + b_{hidden})\)
Step 4: Calculate Loss:
\(y_{t} = activation(W_{hidden \rightarrow output} \times h_{t} + b_{output})\)
Step 5: Backpropagation via Time (BPTT):
Step 6: Update Parameters:
\(\theta \leftarrow \theta - \eta \nabla_{\theta} L\), where \(\theta\) collects the weight matrices and bias vectors and \(\eta\) is the learning rate.
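The steps above correspond to the following vanilla-RNN forward pass, sketched in NumPy with the sigmoid activation from Step 2; the dimensions and random weights are illustrative only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # activation(x) = 1 / (1 + e^{-x})

def rnn_forward(xs, W_ih, W_hh, W_ho, b_h, b_o):
    """Vanilla RNN over a sequence (Steps 2-4): h_t feeds back into itself."""
    h = np.zeros(W_hh.shape[0])                 # Step 2: initial hidden state
    ys = []
    for x in xs:                                # Step 3: loop across time steps
        h = sigmoid(W_ih @ x + W_hh @ h + b_h)  # hidden-state update
        ys.append(sigmoid(W_ho @ h + b_o))      # Step 4: output at step t
    return np.array(ys), h

# Example: 5 time steps of 4-dim inputs, 8 hidden units, 1 output unit.
rng = np.random.default_rng(0)
ys, h = rnn_forward(rng.normal(size=(5, 4)),
                    rng.normal(size=(8, 4)), rng.normal(size=(8, 8)),
                    rng.normal(size=(1, 8)), np.zeros(8), np.zeros(1))
```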
C) CNN ResNet50 Model:
The ResNet50 model, shown in Figure 6 and also known as the Residual Network with 50 layers, is a well-known convolutional neural network design celebrated for its outstanding performance in deep learning applications, particularly image classification. ResNet50 solves the vanishing gradient problem that occurs in very deep networks by introducing the ground-breaking idea of residual connections. Because gradients can pass directly across network layers through these residual connections, deep architectures with 50 layers can be built without losing information. The residual function learning that occurs in each layer improves ResNet50's capacity to recognise complex features in input images. With skip connections and residual blocks, ResNet50 excels at capturing both low-level and high-level image features, allowing it to comprehend complicated patterns and achieve exceptional accuracy on a variety of visual recognition tasks. Its success has popularised "identity shortcut connections" in neural network topologies, considerably influencing deep learning and dramatically improving model performance across many applications.
1. ResNet50 Algorithm:
Step 1: Initialise Parameter:
Initialise the settings for the convolutional layers, including the bias vectors, weight matrices, and other parameters.
Step 2: Input Layer:
To extract fundamental features, run the input image through a first convolutional layer.
Step 3: Residual Blocks:
Complete the following for each of the 16 residual blocks.
\(X_{temp} = Conv_{1\times1}(X_{in})\)
\(X_{temp} = Conv_{3\times3}(X_{temp})\)
\(X_{temp} = Conv_{1\times1}(X_{temp})\)
\(X_{residual} = X_{in} + X_{temp}\)
\(X_{out} = ReLU(X_{residual})\)
Step 4: Global Average Pooling:
Use global average pooling to shrink the feature maps’ spatial dimensions.
\(X_{pooled}=GlobalAvgPooling(X_{out})\)
\(GlobalAvgPooling(X) = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} X(i, j)\)
Step 5: Fully Connected Layer:
Pass the pooled features to a fully connected layer for classification.
\(Output=FC(X_{pooled})\)
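Steps 3-5 can be read as the following PyTorch sketch of one bottleneck residual block followed by pooling and a fully connected head; the channel counts are illustrative, and batch normalisation is included as the standard (assumed) companion of each convolution.

```python
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    """One ResNet50-style residual block (Step 3): 1x1 -> 3x3 -> 1x1 convs
    with an identity shortcut; X_out = ReLU(X_in + X_temp)."""
    def __init__(self, channels=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels // 4, 1, bias=False),
            nn.BatchNorm2d(channels // 4), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels // 4, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels // 4), nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, channels, 1, bias=False),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(x + self.body(x))   # residual addition + ReLU

# Steps 4-5: global average pooling, then a fully connected classifier.
feat = Bottleneck()(torch.randn(1, 64, 56, 56))
pooled = feat.mean(dim=(2, 3))        # GlobalAvgPooling over H and W
logits = nn.Linear(64, 2)(pooled)     # FC layer for categorisation
```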
During the preparation stage of the experimentation, the images pulled from the dataset for testing were uniformly resized to 256 × 256 and 512 × 512 pixels. This resizing standardized the images and prepared them for further processing and analysis, and using the same image sizes across all models and architectures made consistent handling feasible. To boost performance, a variety of pre-trained models from the "timm" (PyTorch Image Models) package were employed, including the "resnet50", "efficientnet_b0", and "maxvit_nano_rw_256" models, alongside the CNN and RNN components. Pre-trained models have the benefit of extensive prior training on large datasets and can be further fine-tuned for specific image-recognition applications.
The suggested architecture was used to run these models efficiently and speed up the training process. The chosen pre-trained models were incorporated into this framework, which also simplified the control of optimisation approaches, the evaluation of performance, and the import and augmentation of additional data. With systematic execution and consistency, this framework ensured that the models were trained and evaluated on a solid basis. Together, the proposed framework, the various pre-trained models, and the resized images enabled a complete evaluation of how effectively these models performed under the conditions of the task at hand.
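A minimal sketch of this setup using the "timm" package is shown below; the adaptation to a single cancer logit and single-channel input is an assumption, since the text does not specify the head configuration.

```python
import timm
import torch

# Load the pretrained backbones named above and adapt them to one-logit
# cancer prediction on grayscale mammograms resized to 256x256.
for name in ["resnet50", "efficientnet_b0", "maxvit_nano_rw_256"]:
    model = timm.create_model(name, pretrained=True, num_classes=1, in_chans=1)
    x = torch.randn(2, 1, 256, 256)     # batch of two resized mammograms
    print(name, tuple(model(x).shape))  # -> (2, 1) logits per model
```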
Figure 7 depicts three separate cases of breast cancer: invasive cancer (a), non-invasive cancer (b), and no cancer (c). In the invasive case (a), the image shows a malignant growth that has invaded nearby tissues, possibly indicating an advanced stage. In the non-invasive case (b), aberrant cell proliferation is restricted to the breast ducts, a less aggressive form that can be effectively controlled with prompt intervention. In contrast, the no-cancer image (c) shows healthy breast tissue free of abnormal aberrations. Physicians and researchers can use this trio of representations to identify the visual differences associated with various cancer types and states and so deepen their understanding of breast cancer characteristics. Figure 7 is intended to help healthcare professionals sharpen their diagnostic abilities, enabling the accurate diagnosis and early intervention that are essential for better patient outcomes and improved breast cancer management strategies.
Figure 8 illustrates the idea of feature extraction by image segmentation. This method entails dividing an image into separate regions based on shared traits, such as colour or texture. The goal is to locate and isolate particular objects or regions of interest in the image. Sophisticated algorithms partition the image into segments, each representing a meaningful area.