Research Article | Volume 13, Issue 2 (April 2024) | Pages 18-27

Early Detection of Breast Cancer using Deep Learning in Mammograms

1. Department of Oncology, Krishna Vishwa Vidyapeeth, Karad, Maharashtra, India.
2. Department of Surgery, Krishna Vishwa Vidyapeeth, Karad, Maharashtra, India.
3. Department of General Medicine, Krishna Vishwa Vidyapeeth, Karad, Maharashtra, India.
Open Access | Under a Creative Commons license
Received: Nov. 22, 2023 | Accepted: Jan. 29, 2024 | Published: April 29, 2024

Abstract

Breast cancer still poses a serious threat to world health, demanding creative approaches to early identification in order to improve patient outcomes. This study investigates the potential of deep learning methods to improve the precision and effectiveness of mammography interpretation for the identification of breast cancer. In the proposed approach, a convolutional neural network (CNN) architecture, ResNet50, is created and trained on a sizable dataset of annotated mammograms. The CNN is designed to automatically identify and extract pertinent elements, such as microcalcifications, masses, and architectural distortions, that may be indicative of possible cancers. Through an iterative process of training and validation, the model develops the ability to distinguish between benign and malignant instances, finally displaying a high level of discriminatory accuracy. The findings show that the deep learning model outperforms conventional mammography interpretation in terms of sensitivity and specificity for detecting breast cancer. Furthermore, the model's generalizability across a range of patient demographics and imaging technologies highlights its potential for use in actual clinical settings. This study represents a significant step toward improving radiologists' capacity for early detection of breast cancer. By lowering false positives, improving accuracy, and offering rapid analysis, the proposed deep learning-based architecture holds promise for streamlining the screening procedure and easing the difficulties brought on by radiologist shortages. By utilising cutting-edge technology to enable prompt and efficient detection, this study contributes to the international healthcare community's continuing efforts to improve breast cancer outcomes.

Keywords
breast cancer, mammograms, deep learning, ResNet50, convolutional neural network

1. Introduction

Early detection and prompt treatment have become essential elements in the fight against breast cancer-related mortality, and these goals have largely been pursued through routine mammography screening. Mammography-based screening, however, is expensive and resource-intensive because it relies on the expertise of human professionals, and a number of countries face a worsening shortage of radiologists. A further significant drawback is the high frequency of erroneous results associated with mammographic exams. This circumstance gives rise to numerous issues, including unjustified emotional suffering for patients, unnecessary follow-up care, additional imaging studies, and occasionally the use of invasive tissue-collection techniques such as needle biopsies.

The ongoing advancement of deep learning, in particular, has sparked the interest of the medical imaging community in using these methods to improve the accuracy of cancer detection. Breast cancer is the second leading cause of cancer mortality among women in the United States. Screening mammography is a crucial tool for lowering mortality rates; nevertheless, it suffers from a high percentage of false positives and false negatives. The system architecture envisioned for early cancer detection is shown in Figure 1. To improve diagnostic precision, it incorporates cutting-edge technology such as deep learning and medical imaging. From the collection of medical images to automated analysis with state-of-the-art algorithms, the architecture places a strong emphasis on seamless data flow. The goal of this streamlined procedure is to enhance early diagnosis by quickly identifying any anomalies. Through the synergy of these methodologies, the suggested system aims to improve patient outcomes by enabling rapid intervention and treatment.

In the United States, digital screening mammography has an average sensitivity of 86.9% and an average specificity of 88.9%. Since the 1990s, radiologists have used computer-assisted detection and diagnosis (CAD) software to improve screening mammography's predictive accuracy. Unfortunately, the first generation of commercial CAD systems did not significantly increase performance, which led to a decade-long halt in development. The emergence of deep learning, however, has reignited interest in building tools for radiologists, owing to its exceptional performance in object recognition and related fields. Recent research shows that deep learning-based CAD systems can outperform radiologists when working independently and can further improve radiologists' performance when employed in support mode.

Figure 1: Proposed system architecture

Detecting subclinical breast cancer with screening mammography can be difficult because of the modest tumour size relative to the overall breast image. The majority of research has concentrated on classifying annotated lesions, with little attention paid to the full mammogram. This approach has drawbacks, especially when working with datasets that lack region-of-interest (ROI) annotations. Some studies have attempted to train neural networks on entire mammograms without depending on annotations, although it is as yet unknown whether these methods can locate clinically important abnormalities. Pre-training shows promise in addressing the need for huge training datasets: by initialising a classifier's weights with features learned from a different dataset, pre-training enables quicker and more accurate classification. In this work, we provide an "end-to-end" method that first classifies local image patches using a fully annotated dataset. The whole-image classifier is then initialised with the patch classifier's weights and can be fine-tuned on datasets without ROI annotations. This approach improves breast cancer classification by utilising both larger unlabelled datasets and fully annotated datasets.
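To make the initialisation step concrete, the following is a minimal PyTorch sketch, under the assumption of a toy backbone and hypothetical module names (PatchClassifier, WholeImageClassifier); the networks used in the study are far larger, and this only illustrates how patch-classifier weights can seed a whole-image model.

```python
import torch.nn as nn

class PatchClassifier(nn.Module):
    """Hypothetical patch-level classifier trained on ROI-annotated patches."""
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.backbone = nn.Sequential(           # shared convolutional feature extractor
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Linear(32, num_classes)   # patch-level decision

    def forward(self, x):
        feats = self.backbone(x).mean(dim=(2, 3))   # global average pool
        return self.head(feats)

class WholeImageClassifier(nn.Module):
    """Whole-mammogram classifier that reuses the trained patch backbone."""
    def __init__(self, patch_model: PatchClassifier, num_classes: int = 2):
        super().__init__()
        self.backbone = patch_model.backbone      # warm start from patch weights
        self.head = nn.Linear(32, num_classes)    # new head, tuned without ROI labels

    def forward(self, x):
        feats = self.backbone(x).mean(dim=(2, 3))
        return self.head(feats)

patch_model = PatchClassifier()
# ... train patch_model on the fully annotated patch dataset ...
image_model = WholeImageClassifier(patch_model)   # fine-tune on image-level labels
```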

To put this strategy into practice, we used a sizable public database of digitised film mammograms to create patch and whole-image classifiers. These classifiers were then transferred to the smaller public full-field digital mammography (FFDM) database. Our study considers alternative network designs for building these classifiers and investigates various training methods, highlighting the benefits and drawbacks of each while outlining an efficient workflow for creating whole-image classifiers.

In particular, the Radiological Society of North America (RSNA), which plays a significant role in this field, organises a prestigious screening mammography competition. The non-profit RSNA serves as a global representative for the field, representing 31 radiological subspecialties from 145 countries. Through education, innovative research projects, and a continuous search for cutting-edge technological solutions, the organisation is dedicated to improving medical care and its delivery. In the face of these pressing problems and the pursuit of better techniques for breast cancer screening, machine learning expertise has the potential to serve as a catalyst: it can improve and streamline the procedure radiologists use to review screening mammograms, lessen the burden on healthcare systems, and increase the efficiency of breast cancer detection, especially given the shortage of radiologists. Such contributions could usher in a new era of accuracy, effectiveness, and improved patient outcomes in the study and treatment of breast cancer.

A) Review of Literature

Breast cancer detection, which relies primarily on mammography images, is developing quickly. This section provides a brief summary of recent developments in the field. Deep convolution and belief networks are used in [1] to categorize mass mammograms with a structured support vector machine (SSVM) and a conditional random field (CRF); in terms of training and inference time, the CRF performed better. A full-resolution convolutional network (FrCN) was evaluated with four-fold cross-validation on the INbreast dataset of X-ray mammograms and achieved an excellent F1 score of 99.24%, accuracy of 95.96%, and a Matthews correlation coefficient (MCC) of 98.96%. Another study [2] proposed BDR-CNN-GCN for the MIAS dataset, combining a graph-convolutional network (GCN) with an 8-layer CNN to reach 96.10% accuracy. With 96.50% accuracy and 93.50% MCC, a modified YOLOv5 network in [3] successfully recognized and categorized breast tumors, outperforming YOLOv3 and Faster R-CNN. [4, 5] proposed the diversified features (DFeBCD) technique for classifying mammograms into normal and abnormal using an integrated classifier influenced by emotion learning, achieving an accuracy of 80.30%. To counter overfitting, [6] presented a deep CNN with transfer learning (TL), producing high accuracies on INbreast (95.5%), DDSM (97.35%), and BCDR (96.67%). [7] utilized the lifting wavelet transform (LWT) for feature extraction, producing accuracies of 95.70% (MIAS) and 98.06% (DDSM) with moth flame optimization and an extreme learning machine (ELM).

The CNN Inception-v3 model in [8] attained 0.88 sensitivity, 0.87 specificity, and 0.946 AUC. Eight fine-tuned pretrained models with CNN and TL were proposed by [9] for categorization, and hybrid models using MobileNet, ResNet50, and AlexNet attained 95.6% accuracy [10]. Four CNN architectures (VGG19, InceptionV3, ResNet50, and VGG16) were trained on 5,000 images and evaluated on 1,007 images [11]. SVM's accuracy on the MIAS and DDSM databases was 96.30%, and the combination of SVM with the gray level co-occurrence matrix (GLCM) in [5, 12] reached 93.88%. On the DDSM and CBIS-DDSM datasets, AlexNet combined with SVM and data augmentation obtained 71.01% accuracy (87.2% with SVM). DenseNet extracted features and classified them with fully connected (FC) layers [6]. The DICNN technique in [13] combines morphological operations with a dilated semantic segmentation network to achieve 98.9% accuracy.

Despite these advances, problems remain, including tumor localization, memory complexity, processing time, and the data requirements of deep learning. To address them, a novel method of identifying and categorizing breast cancer is covered in the section that follows.

Mammography stands out as a capable screening method in the field of medical imaging: its heightened sensitivity to breast calcifications makes it superior at recognizing micro-calcifications or clusters of calcifications. Mammography plays a crucial role in the early detection of these tiny calcifications, which act as precursors of malignant breast cancers. Low-dose X-ray mammography effectively picks up minute changes and anomalies such as structural deformations, bilateral asymmetry, nodules, densities and, most significantly, calcifications. During a screening mammogram, which is recommended for women 40 and older, each breast is imaged twice; additional mammograms are advised when worrisome regions appear. Diagnostic mammograms, by contrast, are focused investigations used for accurate diagnosis that concentrate on particular regions or developing abnormalities.

Ultrasound imaging, in contrast to mammography, produces monochrome images with lower resolution; it excels at differentiating cysts from solid masses and reveals malignant areas as amorphous shapes with fuzzy edges. Magnetic Resonance Imaging (MRI), a non-invasive technology that uses magnetic fields and radio waves, excels in sensitivity and provides precise cross-sectional images, although it requires time and money. Histopathology, the gold standard, makes microscopic tissue examination possible and provides vital phenotypic data crucial for both diagnosis and treatment. Thermography, an emerging method, can identify breast irregularities by examining heat patterns, but further tests are still needed for a definitive diagnosis.

Because of its aptitude for categorizing, identifying, and segmenting breast cancer, Deep Learning (DL) has become a game-changer. Identification and prognosis have taken advantage of DL's ability to handle highly dimensional and correlated data, and transfer learning, feature extraction, and generative adversarial networks have all been applied successfully. Despite these advancements, difficulties persist, such as the scarcity of readily available biological datasets and privacy concerns. The journey toward earlier breast cancer identification and improved treatment continues with the convergence of deep learning and medical imaging, offering the possibility of better patient outcomes, as shown in Table 1.

Table 1: Summary of related work using deep learning

| Method | Algorithm / Approach | Accuracy | Advantages | Scope |
|---|---|---|---|---|
| SSVM and CRF [16] | Structured support vector machine (SSVM) and conditional random field (CRF) | N/A | CRF outperformed SSVM in time | Breast cancer classification in mammograms using structured prediction techniques |
| FrCN [4] | Full-resolution convolutional network (FrCN) | F1: 99.24%, Accuracy: 95.96%, MCC: 98.96% | High performance on X-ray mammograms | Breast cancer classification with full-resolution convolutional network |
| BDR-CNN-GCN [5] | Combination of graph-convolutional network (GCN) and 8-layer CNN (BDR-CNN-GCN) | Accuracy: 96.10% | Effective combination of GCN and CNN | Breast cancer classification using combined GCN and CNN |
| Modified YOLOv5 [14] | Modified YOLOv5 | Accuracy: 96.50%, MCC: 93.50% | Improved results compared to YOLOv3 and RCNN | Breast tumor detection and classification using modified YOLOv5 |
| DFeBCD [15] | Diversified features (DFeBCD) | Accuracy: 80.30% | Influence of emotion learning on classification | Categorization of mammograms into normal and abnormal using diversified features |
| TL Deep-CNN [17] | Transfer learning (TL) with deep CNN | Accuracy: INbreast (95.5%), DDSM (97.35%), BCDR (96.67%) | Improved performance on multiple datasets | Breast cancer detection and classification using transfer learning and deep CNN |
| LWT [18] | Lifting wavelet transform (LWT) | Accuracy: MIAS (95.70%), DDSM (98.06%) | Effective feature extraction using LWT | Feature extraction using lifting wavelet transform |
| Inception-v3 [11] | CNN Inception-v3 model | Sensitivity: 0.88, Specificity: 0.87, AUC: 0.946 | High sensitivity and AUC with Inception-v3 | Breast cancer detection using CNN Inception-v3 model |
| CNN TL [19] | CNN with transfer learning (TL) | N/A | Enhanced classification using transfer learning | Breast cancer classification using CNN and transfer learning |
| Hybrid models [2] | Combination of MobileNet, ResNet50, and AlexNet | Accuracy: 95.6% | Improved classification accuracy | Hybrid classification model using multiple CNN architectures |
| 4 CNN architectures [20] | VGG19, InceptionV3, ResNet50, and VGG16 | N/A | Utilization of various CNN architectures | Breast cancer classification using different CNN architectures |
| SVM [21] | SVM classifier | Accuracy: 96.30% | Effective classification using SVM | Breast cancer detection using SVM classifier |
| SVM and GLCM [22] | SVM classifier with gray level co-occurrence matrix (GLCM) | Accuracy: 93.88% | Enhanced classification using GLCM and SVM | Breast cancer detection using SVM and gray level co-occurrence matrix |
| AlexNet and SVM [23] | AlexNet and SVM classifier | Data augmentation improved accuracy | Improved classification using data augmentation | Breast cancer classification using AlexNet and SVM |
| DenseNet [11] | DenseNet deep learning framework | N/A | Feature extraction and classification using FC layers | Breast cancer classification using DenseNet |
| DICNN [6] | Dilated semantic segmentation network with morphological operations | Accuracy: 98.9% | Effective combination of segmentation and SVM | Breast cancer classification using DICNN with morphological operations |

B) Dataset Description

The RSNA breast cancer diagnosis dataset, a collection of medical images summarised in Figure 2, was created to assist in the development of machine learning models for the early diagnosis of breast cancer. The goal of this dataset is to promote medical image analysis while increasing the precision of breast cancer screening.

  • Type: Mammography images, which are X-ray images of breast tissue, make up the majority of the collection.
  • Annotations: The regions of interest (ROIs) thought to contain anomalies suggestive of breast cancer are marked on the images; annotations frequently take the form of bounding boxes or masks delineating these ROIs.
Figure 2: Age-wise distribution of breast density among breast cancer patients in the RSNA dataset

The RSNA breast cancer diagnosis dataset's main goal is to make it easier to design and test deep learning models and other machine learning techniques for mammogram-based early diagnosis of breast cancer. Early detection of breast cancer signs is essential for effective treatment and better patient outcomes. Specialists in medical imaging can use this dataset to create and compare machine learning methods; to protect patient privacy and data security, access and usage may need to follow data usage agreements and ethical standards (Table 2).

Table 2: Summary of dataset

| Attribute | Type | Class | Area |
|---|---|---|---|
| Mammography Images | Medical Images | Healthy, Suspected Abnormal | Breast Tissue |
| Annotations (ROIs) | Bounding Boxes, Masks | Suspected Abnormal | ROI within Images |

Because datasets and their properties can change over time, it is advisable to consult the official sources or documentation published by RSNA for the most accurate and recent information. When using medical imaging collections, always follow ethical norms and secure the required permissions. The "Class" refers to the grouping of images based on their state of health, while "Area" refers to the precise region of breast tissue shown in the images, together with any regions of interest (ROIs) flagged as potentially abnormal. Table 3 describes the individual attributes of the dataset.

Table 3: Description of dataset

| Attribute | Description |
|---|---|
| site_id | ID number of the source hospital. |
| patient_id | The patient's ID number. |
| image_id | ID number for the image. |
| laterality | Specifies whether the breast in the image is the left or the right. |
| view | Describes the image's orientation; a screening exam normally includes two views of each breast. |
| age | Age of the patient in years. |
| implant | Indicates whether the patient had breast implants. (Site 1 only provides data at the patient level.) |
| density | Breast tissue density, rated on a scale of A (least dense) to D (most dense). |
| machine_id | ID number of the imaging device used. |
| cancer | Whether malignant cancer was detected in the breast. (Target value; train only.) |
| biopsy | Indicates whether a follow-up breast biopsy was performed. (Train only.) |
| invasive | If the breast is cancer-positive, indicates whether the cancer was invasive. (Train only.) |
| BIRADS | Ratings: 0 (follow-up necessary), 1 (cancer not present), and 2 (normal). (Train only.) |
| prediction_id | ID for the matched submission row; several images share the same prediction ID. (Test only.) |
| difficult_negative_case | Indicates whether the case was particularly challenging. (Train only.) |
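As a hedged sketch of working with this tabular metadata, assuming it ships as a CSV file (the file name train.csv follows the public Kaggle release and is an assumption here):

```python
import pandas as pd

# Load the per-image metadata table described in Table 3 (path is an assumption).
df = pd.read_csv("train.csv")

# Basic sanity checks on the columns documented above.
print(df[["patient_id", "image_id", "laterality", "view", "age", "density"]].head())
print("Cancer prevalence:", df["cancer"].mean())   # target column (train only)
print(df.groupby("density")["cancer"].mean())      # cancer rate by tissue density
```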

2. Proposed Methodology

As introduced above, Figure 1 shows the system architecture envisioned for early cancer detection; this section details how its components interact.

The first step in the procedure is preprocessing the mammography images and the pertinent metadata, such as patient age and breast tissue density, as shown in Figure 3. These preprocessed inputs are then fed to the CNN, which specialises in extracting complex features from medical images; its convolutional layers analyse local patterns to assist in the early detection of anomalies. The ResNet50 architecture, renowned for its deep layers and residual connections, is used for more complicated features. ResNet50 improves feature extraction, allowing the network to recognise small irregularities that might be a sign of cancer.
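A minimal preprocessing sketch along these lines is shown below; the 512×512 resolution matches one of the sizes used in the experiments, while the metadata scaling and the density coding (A-D mapped to 0-3) are illustrative assumptions:

```python
import torch
from PIL import Image
from torchvision import transforms

# Resize and normalise a mammogram; pack age and density as side metadata.
preprocess = transforms.Compose([
    transforms.Grayscale(num_output_channels=1),
    transforms.Resize((512, 512)),   # one of the resolutions used in the experiments
    transforms.ToTensor(),           # scales pixel values to [0, 1]
])

def load_case(image_path: str, age: float, density_code: int):
    """density_code: breast density A-D encoded as 0-3 (illustrative convention)."""
    image = preprocess(Image.open(image_path))              # (1, 512, 512) tensor
    meta = torch.tensor([age / 100.0, density_code / 3.0])  # crude [0, 1] scaling
    return image, meta
```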

The combined outputs from the CNN and ResNet50 are then passed to the RNN, which can analyse sequential data. In this setting, the RNN examines information such as patient age and imaging-device history to identify temporal patterns and correlations; its capacity to model sequential dependencies enables a thorough comprehension of the context of each case. The classification layer then receives the combined representation and makes the final determination of the presence of cancer.

Figure 3: Workflow of the deep learning algorithm for early detection of breast cancer

This integration of CNN, ResNet50, and RNN provides a comprehensive evaluation of the likelihood of breast cancer. The method combines image analysis, deep feature extraction, and sequential context modelling, boosting the possibility of early cancer detection and providing doctors with better-informed insights for precise diagnosis and prompt intervention, as illustrated in Figure 4.

Figure 4: Conceptual model of the proposed method
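The composition described above might be sketched as follows; this is an illustrative PyTorch assembly, not the authors' released code, and the LSTM cell, feature dimensions, and single-logit output are assumptions:

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class HybridDetector(nn.Module):
    """Illustrative CNN (ResNet50) + RNN fusion for cancer classification."""
    def __init__(self, meta_dim: int = 2, hidden: int = 64):
        super().__init__()
        backbone = resnet50(weights=None)
        backbone.conv1 = nn.Conv2d(1, 64, 7, stride=2, padding=3, bias=False)  # 1-channel input
        backbone.fc = nn.Identity()                  # expose 2048-d image features
        self.resnet = backbone
        self.rnn = nn.LSTM(input_size=2048 + meta_dim, hidden_size=hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, 1)       # cancer vs. no cancer

    def forward(self, images, meta):
        # images: (B, T, 1, H, W) sequence of views; meta: (B, T, meta_dim)
        b, t = images.shape[:2]
        feats = self.resnet(images.flatten(0, 1)).view(b, t, -1)
        seq = torch.cat([feats, meta], dim=-1)       # fuse image features with metadata
        out, _ = self.rnn(seq)                       # sequential context modelling
        return torch.sigmoid(self.classifier(out[:, -1]))  # probability from last step
```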

A) Convolutional Neural Network (CNN)

The CNN is exceptional at capturing complex details, patterns, and spatial relationships in images. CNNs can recognise complicated patterns because they automatically learn feature hierarchies from raw pixel inputs, as shown in Figure 5. CNNs comprise convolutional, pooling, and fully connected layers. The convolutional layers carry out localised convolutions to identify edges, textures, and forms; the pooling layers that follow shrink the spatial dimensionality while preserving crucial information; and the fully connected layers then interpret the features gathered by earlier layers for classification or regression tasks.

Because they automatically extract pertinent features without manual feature engineering, CNNs are frequently used in image recognition, object detection, and medical imaging. Their hierarchical structure closely resembles visual processing in the human brain, making them effective tools for tasks requiring advanced image analysis, such as the early diagnosis of diseases like breast cancer.

Figure 5: Architecture of the CNN

Algorithm Step:

Step 1: Data Collection and Preparation:

  • Collect a dataset of labelled images, denoted as \((I_{i}, B_{i})\), where \(I_{i}\) is the i-th image and \(B_{i}\) is the set of bounding-box coordinates for the suspicious region (lesion) in the i-th image.

Step 2: Data Preprocessing:

  • Resize: Transform each image \(I_{i}\) to a fixed size WxH (width x height).
  • Normalize: Normalize pixel values to the range [0, 1] or [-1, 1].
  • Augmentation: Define a set of augmentation functions \(A_{k}\) (e.g., cropping, rotation, flipping) and apply them to generate augmented images \(I'_{i}\) = \(A_{k}\)(\(I_{i}\)) for each original image \(I_{i}\).

Step 3: Data Annotation:

  • Each image \(I_{i}\) is annotated with the bounding box coordinates.

\(B_{i} = (x_{\min},y_{\min},x_{\max},y_{\max})\)

Step 4: Model Selection:

  • Choose a CNN architecture suitable for object detection, denoted as f(I; \(\theta\)), where I is the input image and \(\theta\) represents the model parameters.

Step 5: Model Architecture:

Customize the chosen architecture to include:

  • Classification head: produces class scores for the lesion and non-lesion classes, denoted as C(I; \(\theta\)).
  • Regression head: predicts bounding-box coordinate offsets \(\Delta B = (\Delta x_{\min},\Delta y_{\min},\Delta x_{\max},\Delta y_{\max})\) relative to the default box, denoted as R(I; \(\theta\)).

Step 6: Loss Function:

  • Define the total loss \(L_{total}\) as a combination of classification and regression losses:

\(L_{total}(I_{i},B_{i},C_{gt},\Delta B_{gt}) = L_{classification}(C(I_{i};\theta),C_{gt}) + \lambda \cdot L_{regression}(R(I_{i};\theta),\Delta B_{gt})\)

where \(C_{gt}\) is the ground-truth class label (1 for lesion, 0 for background), \(\Delta B_{gt}\) is the ground-truth bounding-box offset, and \(\lambda\) is a hyperparameter that balances the two losses.

Step 7: Training:

  • Minimize the average loss over the training dataset using stochastic gradient descent (SGD) or an optimizer like Adam:

\(\theta^{*} = \arg\min_{\theta} \frac{1}{N} \sum_{i} L_{total}(I_{i},B_{i},C_{gt},\Delta B_{gt})\)

Step 8: Evaluation:

  • For each image \(I_{i}\) in the validation/test dataset, calculate the Intersection over Union (IoU) between the predicted bounding-box coordinates \(B_{pred}\) and the ground truth \(B_{gt}\):

\(IoU(B_{pred},B_{gt})=\frac{\text{Area of Overlap}}{\text{Area of Union}}\)

Step 9: Fine-tuning and Optimization:

  • Adjust hyperparameters, model architecture, or collect additional data based on evaluation results to improve detection performance.
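Steps 6-8 can be sketched as follows, assuming a single-box detector whose two heads emit one classification logit and four box offsets; the choice of BCE and smooth-L1 losses and the value of \(\lambda\) are illustrative:

```python
import torch
import torch.nn as nn

cls_criterion = nn.BCEWithLogitsLoss()   # L_classification (lesion vs. background)
reg_criterion = nn.SmoothL1Loss()        # L_regression over box offsets
lam = 1.0                                # lambda balancing the two terms (assumed)

def total_loss(cls_logits, box_pred, cls_gt, box_gt):
    # Step 6: L_total = L_classification + lambda * L_regression.
    return cls_criterion(cls_logits, cls_gt) + lam * reg_criterion(box_pred, box_gt)

def iou(box_a, box_b):
    # Step 8: boxes given as (x_min, y_min, x_max, y_max) tensors.
    x1 = torch.max(box_a[..., 0], box_b[..., 0])
    y1 = torch.max(box_a[..., 1], box_b[..., 1])
    x2 = torch.min(box_a[..., 2], box_b[..., 2])
    y2 = torch.min(box_a[..., 3], box_b[..., 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_a = (box_a[..., 2] - box_a[..., 0]) * (box_a[..., 3] - box_a[..., 1])
    area_b = (box_b[..., 2] - box_b[..., 0]) * (box_b[..., 3] - box_b[..., 1])
    return inter / (area_a + area_b - inter)   # Area of Overlap / Area of Union
```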

B) Recurrent Neural Network (RNN)

Recurrent neural networks (RNNs) are a subclass of artificial neural networks created with the specific purpose of processing sequential data while keeping track of prior inputs. RNNs excel at tasks involving sequences or time-dependent patterns because, in contrast to standard feedforward networks, they have internal loops that allow them to maintain information across time steps. Time-series analysis, speech recognition, and natural language processing all benefit greatly from this architecture. RNNs process input one step at a time, employing both the most recent input and knowledge from earlier steps. Traditional RNNs, however, may experience vanishing-gradient issues, which restrict their capacity to capture long-range dependencies. To solve this problem, variants such as the Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU), which better regulate information flow, were introduced. With their innate memory and capacity to learn temporal patterns, RNNs are effective tools for tasks like sentiment analysis, language generation, and medical data analysis where sequential context plays a significant role, as in early cancer detection.

1. Recurrent neural network (RNN) algorithm:

Step 1: Initialise the parameters:

  • Initialise the weight matrices for the input-to-hidden connections (W_input_hidden) and the hidden-to-hidden connections (W_hidden_hidden), along with the bias vectors for the hidden units (b_hidden) and the output unit (b_output).

Step 2: Initialise Hidden State:

  • Initialise the hidden state (h) with zeros or small random values, using, for example, the sigmoid activation:

\(\mathrm{activation}(x)=\frac{1}{1+ e^{-x}}\)

Step 3: Loop across time steps:

  • Calculate the hidden state at time t from the current input and the previous hidden state:

    \(h_{t}=\mathrm{activation}(W_{input\_hidden} \times x_{t} + W_{hidden\_hidden}\times h_{t-1} + b_{hidden})\)

  • Compute the output: using the current hidden state, calculate the output at time t:

    \(y_{t}=\mathrm{activation}(W_{output\_hidden}\times h_{t}+b_{output})\)

Step 4: Calculate Loss:

  • Compute the loss between the target output \(y\) and the predicted output \(\hat{y}\).

Step 5: Backpropagation via Time (BPTT):

  • By backpropagating the error through the time steps, compute gradients for the parameters.

Step 6: Update Parameters:

  • Using the obtained gradients and an optimisation approach (such as gradient descent), update the weight matrices and bias vectors.

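A compact NumPy sketch of Steps 2-3 under these definitions (sigmoid activation, randomly initialised weights; all dimensions are placeholders):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # note the negative exponent

input_dim, hidden_dim, output_dim, steps = 4, 8, 1, 5
rng = np.random.default_rng(0)
W_input_hidden = rng.normal(scale=0.1, size=(hidden_dim, input_dim))
W_hidden_hidden = rng.normal(scale=0.1, size=(hidden_dim, hidden_dim))
W_output_hidden = rng.normal(scale=0.1, size=(output_dim, hidden_dim))
b_hidden, b_output = np.zeros(hidden_dim), np.zeros(output_dim)

h = np.zeros(hidden_dim)              # Step 2: initial hidden state
xs = rng.normal(size=(steps, input_dim))
for x_t in xs:                        # Step 3: loop across time steps
    h = sigmoid(W_input_hidden @ x_t + W_hidden_hidden @ h + b_hidden)
    y_t = sigmoid(W_output_hidden @ h + b_output)   # output at time t
```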

C) CNN ResNet50 Model:

The ResNet50 model (Figure 6), the Residual Network with 50 layers, is a convolutional neural network design known for its outstanding performance in deep learning applications, particularly image classification. ResNet50 addresses the vanishing-gradient issue that occurs in very deep networks by introducing the ground-breaking idea of residual connections. Because these residual connections let gradients pass directly across network layers, it is possible to build 50-layer architectures without losing information. The residual function learned in each layer improves the model's capacity to recognise complex features in input images. With skip connections and residual blocks, ResNet50 excels at capturing both low-level and high-level image features, allowing it to comprehend complicated patterns and achieve exceptional accuracy on a variety of visual recognition tasks. Its popularity has made "identity shortcut connections" increasingly common in neural network topologies, with a considerable impact on deep learning and model performance across applications.

Figure 6: Architecture of the ResNet50 model

1. ResNet50 Algorithm:

Step 1: Initialise Parameters:

Initialise the settings for the convolutional layers, including the bias vectors, weight matrices, and other parameters.

Step 2: Input Layer:

To extract fundamental features, run the input image through a first convolutional layer.

Step 3: Residual Blocks:

Complete the following for each of the 16 residual blocks.

  1. First Convolution Layer: To minimise dimensionality and improve network efficiency, apply a 1x1 convolution to the input feature maps in the first convolutional layer.

    \(X_{out}=Conv1\times1(X_{in})\)

  2. Main Convolutional Block: complete the bottleneck with a 3×3 convolution that captures spatial characteristics at different scales, followed by a 1×1 convolution that restores the channel dimension, so that each residual block applies three successive convolutions in total.

    \(X_{temp}=Conv3\times3(X_{out})\)

    \(X_{temp}=Conv1\times1(X_{temp})\)

  3. Skip Connection (Identity Shortcut): To construct a residual, add the original input to the main convolutional block’s output.

    \(X_{residual}=X_{in}+X_{temp}\)

  4. Activation Function: Apply an activation function on the residual (like ReLU, for example)

    \(X_{out}=ReLU(X_{residual})\)

Step 4: Global Average Pooling:

Use global average pooling to shrink the feature maps’ spatial dimensions.

\(X_{pooled}=\mathrm{GlobalAvgPooling}(X_{out})\)

\(\mathrm{GlobalAvgPooling}(X)=\frac{1}{H\times W}\sum_{i=1}^{H}\sum_{j=1}^{W}X(i,j)\)

Step 5: Fully Connected Layer:

For categorization, connect the pooled features to a fully connected layer.

\(Output=FC(X_{pooled})\)
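Two hedged sketches follow: a toy bottleneck residual block matching Steps 3.1-3.4, and the adaptation of torchvision's stock ResNet50 (which already implements the 16 residual blocks and global average pooling) to this task; the pretrained weights and single-logit head are assumptions about one reasonable configuration.

```python
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights

class BottleneckSketch(nn.Module):
    """Toy residual block: 1x1 reduce, 3x3, 1x1 expand, identity shortcut, ReLU."""
    def __init__(self, channels: int, bottleneck: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, bottleneck, 1), nn.ReLU(),                # Step 3.1: 1x1 reduce
            nn.Conv2d(bottleneck, bottleneck, 3, padding=1), nn.ReLU(),   # Step 3.2: 3x3
            nn.Conv2d(bottleneck, channels, 1),                           # Step 3.2: 1x1 expand
        )
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(x + self.body(x))   # Steps 3.3-3.4: residual + activation

# Fine-tuning a stock ResNet50 for the binary task (assumed configuration).
model = resnet50(weights=ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 1)   # Step 5: one cancer logit
```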

3. Results and Discussion

In the experimentation, during the preparation stage, the images pulled from the dataset for testing were uniformly resized to 256×256 and 512×512 pixels. This downsizing standardized the images and prepared them for further processing and analysis, and using the same image sizes across all models and architectures made consistent handling feasible. To boost performance, a variety of pre-trained models from the "timm" (PyTorch Image Models) package were employed alongside the CNN and RNN components, among them "resnet50", "efficientnet_b0", and "maxvit_nano_rw_256". Pre-trained models have the benefit of extensive training on large datasets and can be further customized for specific image recognition applications.
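For the timm-based backbones named above, model creation might look like the following sketch; the model names are taken verbatim from the text, while pretrained=True and num_classes=1 (a single cancer logit) are assumptions:

```python
import timm

# Instantiate the pretrained timm backbones mentioned in the experiments.
names = ["resnet50", "efficientnet_b0", "maxvit_nano_rw_256"]
models = {
    name: timm.create_model(name, pretrained=True, num_classes=1)  # one cancer logit
    for name in names
}
```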

The suggested architecture was used to run these models efficiently and speed up the training process. The chosen pre-trained models were incorporated into this framework, which also made it simpler to control optimisation approaches, evaluate performance, and load and augment additional data. With systematic execution and consistency, the experimental process ensured that the models were trained and evaluated on a solid basis. Together, the proposed framework, the various pre-trained models, and the resized images enabled a complete evaluation of how effectively these models performed on the task at hand.

Figure 7 depicts three separate cases: invasive cancer (a), non-invasive cancer (b), and no cancer (c). In the invasive case (a), the image shows a malignant growth that has invaded nearby tissues, possibly indicating an advanced stage. In the non-invasive case (b), aberrant cell proliferation is restricted to the breast ducts, a less aggressive form that can be effectively controlled with prompt intervention. The no-cancer image (c) shows a healthy composition of breast tissue free of abnormal aberrations. Physicians and researchers can use this trio of representations to identify visual differences among cancer types and states, gaining a greater understanding of the characteristics of breast cancer. The relevance of Figure 7 lies in helping healthcare professionals sharpen their diagnostic abilities, enabling the accurate diagnosis and early intervention that are essential for better patient outcomes and improved breast cancer management strategies.

Figure 7: Representation of (a) invasive cancer, (b) non-invasive cancer, and (c) no cancer

Figure 8 illustrates feature extraction by image segmentation. This method entails partitioning an image into separate areas based on shared traits, such as colour or texture, in order to locate and isolate particular objects or regions of interest. Sophisticated algorithms divide the image into segments that each represent a meaningful area.
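As an illustrative stand-in for such segmentation, the sketch below uses simple Otsu thresholding and connected-component analysis with scikit-image; the actual pipeline may rely on a learned model, and the file name is a placeholder:

```python
from skimage import filters, io, measure

image = io.imread("mammogram.png", as_gray=True)   # path is a placeholder
mask = image > filters.threshold_otsu(image)       # separate tissue from background
labels = measure.label(mask)                       # split the mask into regions
regions = measure.regionprops(labels)              # area, centroid, etc. per region
largest = max(regions, key=lambda r: r.area)       # e.g., keep the breast region
```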