Since its inception in 1943 by McCulloch and Pitts [13], the field of Artificial Neural Networks (ANNs) has witnessed a significant transformation, particularly notable in recent years. The seminal work by McCulloch and Pitts introduced a mathematical framework for ANNs, emphasizing the binary nature of neuronal activity and its representation through a threshold function. Neural networks, particularly through the use of non-linear activation functions in hidden layers, offer a robust mechanism for modeling the complex, non-linear relationships inherent in the training data, a task that is challenging for traditional algorithms. Among the various architectures developed, Convolutional Neural Networks (CNNs) [14] have gained prominence in tasks involving image data, such as in the field of medical imaging. The rapid advancements in deep learning, aligned with breakthroughs in biological and medical imaging technologies, have led to an unprecedented ability to generate, collect, and analyze voluminous datasets of medical images, thereby enhancing the capability of CNNs in feature detection and pattern analysis.
CNN models

Numerous established applications in the fields of medicine and biology have successfully utilized CNNs for tasks such as image classification and segmentation, similar to the approach we have taken. These networks are instrumental across a diverse array of challenges including, but not limited to, cancer detection, continual disease monitoring, the generation of customized treatment plans, and accurate disease diagnosis. Furthermore, CNNs have demonstrated versatility in handling a variety of data sources, including X-rays, computed tomography (CT) scans, magnetic resonance imaging (MRI), retinography, pathological anatomy slides, and even sequences from the human genome [15].
The subsequent examples represent merely a fraction of the applications employing CNNs in medical imaging, showcasing the breadth of ongoing research in this expansive field. These applications illustrate the potential of CNNs to classify a multitude of diseases, employ various imaging modalities, and utilize a diverse range of convolutional network architectures. In the realms of lung nodule classification, detection, and segmentation using CT scans, various deep learning approaches have been employed. These include 2D and 3D CNNs, various architectural designs, and often incorporate autoencoder structures. Such techniques have yielded diagnostic accuracies ranging from 84–95%, as reported in several studies [16,17,18,19,20,21]. Additionally, the MICCAI-BRATS (Brain Tumor Segmentation Challenge) annually conducts competitions for brain tumor segmentation, with outcomes and winners detailed on its website [22]. These challenges predominantly use MRI data and have led to significant developments, including a 3D DenseNet neural network aimed at determining the IDH genotype of gliomas, with an accuracy of 84.6% [23].
2D and 3D CNNs have been instrumental in classifying Alzheimer’s disease using MR imaging, as evidenced by various studies [24,25,26].
The ISLES (Ischemic Stroke Lesion Segmentation) initiative regularly offers challenges aimed at the segmentation of ischemic brain lesions, predominantly utilizing MR imaging as noted in [27], with the latest challenges accessible on their website [28]. Current research on machine learning for diagnosing acute cerebral ischemic lesions in NCCT primarily focuses on techniques analyzing brain hemisphere symmetry [5, 29, 30], segmentation with generative networks in contrast-enhanced CT [31], and, to a lesser extent, 2D [32] and 3D [2] CNNs. Regarding stroke analysis, several studies have implemented various artificial intelligence algorithms for the classification, segmentation, and diagnosis of strokes in NCCT and CT angiography images [5,6,7]. Mokli et al. [33] have conducted a review of the applications available on the market that utilize automatic and semi-automatic algorithms for image analysis in diagnosing acute cerebral infarction. Nevertheless, detailed information about the technical aspects of these algorithms, as well as specifics on the training and validation data, is often scant in the general descriptions provided on the applications' websites [34].
Goodness metrics

The metrics used to evaluate the performance of the networks were the accuracy and the confusion matrix. These are standard measures to assess the performance of any classifier, so they can be used to compare against other alternatives and to validate the efficiency of the solutions presented.
The accuracy definition is quite simple: it is the proportion of predictions the model gets right,

accuracy = correct predictions / total predictions.
However, this metric alone does not allow for a complete evaluation of the algorithm. A better way to evaluate the performance of a classifier is the confusion matrix, which allows other concepts to be assessed, as will be seen below. To calculate the confusion matrix, a set of predictions is made with the network in question on the test/evaluation set. Each row of the confusion matrix represents the actual class (positive/negative), and each column represents the class predicted by the model (positive/negative). Four counts are read from the confusion matrix:
True positives: the model predicts a value as true, and it really is true.
True negatives: the model predicts a value as false, and it really is false.
False positives: the model predicts a value as true, and it really is false.
False negatives: the model predicts a value as false, and it really is true.
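These four counts can be tallied directly from paired true and predicted labels. The sketch below illustrates the idea with toy labels (illustrative only, not data from this study):

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels, with 1 = positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

# Illustrative labels for a six-example test set
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(tp, tn, fp, fn)      # 2 2 1 1
print(round(accuracy, 3))  # 0.667
```

Accuracy falls out directly as the sum of the diagonal counts (TP + TN) over the total.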
From these definitions, two additional metrics can be considered: precision and sensitivity. Precision is the proportion of examples predicted as positive that are actually positive:

precision = TP / (TP + FP)

The sensitivity or true positive rate (recall) is the proportion of positive examples that are correctly detected by the classifier:

sensitivity = TP / (TP + FN)
Depending on the problem being considered, it can be of interest to minimize false positive cases, in which case precision could be used. On the other hand, sensitivity is the adequate metric if interest is focused on reducing false negative cases. In this specific case of stroke detection, it is very important to minimize false negatives, which correspond to a missed stroke diagnosis. In other words, special attention must be paid to sensitivity. Unfortunately, improving both metrics simultaneously is generally not possible, as increasing sensitivity tends to reduce precision and vice versa. Two other metrics can be deduced from the confusion matrix: specificity and the so-called F1-score, i.e., the harmonic mean of precision and sensitivity.
The F1-score can be used as a single reference metric, as it balances precision and sensitivity. Performance is considered excellent when the value is 1 (or close to it) and very poor when it is close to 0.
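As a sketch, all of the derived metrics above can be computed from the four confusion-matrix counts. The counts used here are illustrative, not results from this paper:

```python
def derived_metrics(tp, tn, fp, fn):
    """Compute precision, sensitivity, specificity, and F1 from counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # recall / true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return precision, sensitivity, specificity, f1

# Illustrative counts only
p, s, sp, f1 = derived_metrics(tp=80, tn=90, fp=10, fn=20)
print(round(p, 3), round(s, 3), round(sp, 3), round(f1, 3))
# 0.889 0.8 0.9 0.842
```

Note how a high precision (0.889) can coexist with a lower sensitivity (0.8); the F1-score (0.842) sits between the two as their harmonic mean.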
Existing solutions

To validate the work proposal, it is necessary not only to analyze the performance of the developed models, but also to compare them against existing solutions in order to validate the results obtained. These solutions are presented here.
Presently, there are five leading software platforms that incorporate machine learning algorithms for stroke detection, including Brainomix e-ASPECTS (Oxford, UK), Olea Medical (La Ciotat, France), Siemens Frontier (Erlangen, Germany), iSchemaView RAPID (California, USA), and Viz.ai (California, USA) [35].
Viz.ai specifically utilizes a convolutional neural network for the analysis of head and neck CT angiography to detect occlusions in major vessels of the anterior circulation [36].
Both e-ASPECTS and RAPID ASPECTS platforms are clinically certified for stroke diagnosis and segmentation in NCCT, with e-ASPECTS boasting the most validation studies and comparability with expert radiologists’ performance [33]. As medical software, these platforms fall under the category of medical devices and are subject to validation and certification by regulatory bodies such as the US FDA (Food and Drug Administration) and the EU’s MDR (Medical Device Regulation).
Frontier ASPECTS is not yet certified, as documented in [37]. Comparative studies reveal that while e-ASPECTS demonstrates high agreement with expert evaluations, Frontier shows only moderate agreement [38].
Both e-ASPECTS and RAPID ASPECTS endeavor to quantitatively evaluate focal ischemic damage using the ASPECTS scoring system. They process complete cranial CT scans in DICOM format and use heat maps to demarcate affected areas. The distinction lies in their respective processing pipelines: Brainomix applies random forest learning directly for classification and segmentation, whereas RAPID ASPECTS first removes the skull and cerebrospinal fluid, uses atlases to identify the 20 ASPECTS score regions, and then applies random forest learning for classification and segmentation.
In all cases, these solutions require a prior segmentation step to diagnose and classify ischemic stroke. Ideally, this classification should be carried out by a single algorithm capable of classifying based only on NCCT images.
NCCT scans often suffer from low contrast and noise, making detection challenging. As in the previously presented solutions, some works focus on the segmentation of NCCT images, as presented in [39] and [40], with results similar to those achieved by neuroradiologists [41]. In fact, there are recent developments that combine Transformers with CNNs [42]; again, the focus is on automatic image segmentation rather than detection. A similar work is presented in [43], which uses a combination of NCCT and CTA (CT angiography) images to train a multi-scale 3D CNN. The work shows that if NCCT imaging alone is used, the accuracy achieved is about 53% (0.53±0.09), very far from what is expected (near 100%); only by combining NCCT and CTA images are accuracies of 90% (0.90±0.04) achieved. The work in [44] uses a combination of NCCT images with computed tomography perfusion (CTP) images, with a pair of NCCT-CTP images for each case studied. The proposed ensemble model leverages three deep Convolutional Neural Networks (CNNs) to handle three end-to-end feature maps together with handcrafted features defined by distinct contra-lateral properties. The results obtained are promising, with 91.16% accuracy, but the three-CNN structure is complex. In all of the above works, a combination of different types of images is used; none relies exclusively on NCCT images.
This research sets as an initial requirement the use of only NCCT images with signs of early ischemia, which can be analyzed with neural networks in the same way segmentation techniques use the ASPECTS score [45].
Under this assumption, other works use only NCCT images. In [46], a two-stage CNN model is trained that effectively identifies and locates otherwise invisible ischemic strokes in non-contrast CT images, achieving up to 91.89% identification accuracy. In [47], another CNN is used that effectively detects and segments a specific type of thrombus (intra-arterial thrombi) in non-contrast CT scans. The CNN shows sensitivity and specificity similar to those of expert neuroradiologists (the paper compares the CNN results with the decisions of two radiologists), achieving 0.86 sensitivity and 0.65 specificity in thrombus detection. The CNN's accuracy is not reported, but it can be computed, giving a value of about 85%, considering a prevalence of 0.99 (a very high value).
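One plausible way to reconstruct that accuracy figure, assuming accuracy is obtained as the prevalence-weighted average of sensitivity and specificity (the sensitivity, specificity, and prevalence values are those cited above):

```python
def prevalence_weighted_accuracy(sensitivity, specificity, prevalence):
    """Accuracy as the prevalence-weighted average of sensitivity
    and specificity: acc = sens * prev + spec * (1 - prev)."""
    return sensitivity * prevalence + specificity * (1.0 - prevalence)

# Values reported in [47]: sensitivity 0.86, specificity 0.65,
# with an assumed prevalence of 0.99
acc = prevalence_weighted_accuracy(0.86, 0.65, 0.99)
print(round(acc, 4))  # 0.8579, i.e. roughly 85%
```

With such a high prevalence, the result is dominated by sensitivity, which is why accuracy lands near the 0.86 sensitivity figure.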
A more general ischemic stroke detector is presented in [48]. The CNN model achieved a success rate of over 90% in distinguishing ischemic stroke cases from healthy controls in medical images. In all the papers presented, the accuracy is below 92%. This fact suggests another research question: can pre-trained networks achieve better accuracy in detecting ischemic stroke by directly classifying NCCT images without previous segmentation?
Among pre-trained architectures, YOLO has been tested [49]. The YOLO algorithm is used in object identification tasks. The AC-YOLOv5 algorithm, which combines adaptive local region contrast enhancement with YOLOv5, effectively detects ischemic stroke in NCCT images, with high accuracy (91.7%) and recall (88.6%) rates. Again, accuracy remains under 92%.
Within the specific works using pre-trained networks, only one has been found [50]. This paper proposes a U-Net model based on a VGG-16 backbone that effectively detects and segments infarct core areas in NCCT scans of ischemic stroke patients. Similarity indices such as the Dice coefficient and IoU are used as metrics; these are explicitly segmentation metrics rather than classification metrics. The U-Net model achieves an Intersection over Union (IoU) score of 0.76 and a Dice coefficient of 0.79.
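For reference, the Dice coefficient and IoU compare a predicted binary mask against a ground-truth mask via their overlap. A minimal sketch with toy masks (flattened to 0/1 lists, not scan data):

```python
def dice_and_iou(pred, target):
    """Dice = 2|P∩T| / (|P| + |T|); IoU = |P∩T| / |P∪T|.
    pred, target: flat lists of 0/1 mask values of equal length."""
    inter = sum(p and t for p, t in zip(pred, target))
    union = sum(p or t for p, t in zip(pred, target))
    dice = 2.0 * inter / (sum(pred) + sum(target))
    iou = inter / union
    return dice, iou

# Toy 6-pixel masks for illustration
pred   = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]

dice_val, iou_val = dice_and_iou(pred, target)
print(round(dice_val, 3), round(iou_val, 3))  # 0.667 0.5
```

Both metrics reward overlap between the two masks; Dice is always at least as large as IoU for the same pair, which matches the 0.79 versus 0.76 figures reported in [50].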
After analyzing the research in the area of ischemic stroke detection using NCCT images, it can be concluded that the use of pre-trained networks is not widespread and that the accuracies obtained by CNNs do not exceed 92% in the case of non-pre-trained architectures. Therefore, this paper aims to show that it is possible to answer affirmatively the research question formulated above: pre-trained networks can achieve better accuracy in detecting ischemic stroke by directly classifying NCCT images without previous segmentation.
An additional objective is to propose an open CNN model that aids radiologists in decision-making, given that the algorithms of the applications discussed are either proprietary or do not use neural network techniques. Attaining this aim requires a specific dataset, the details of which are delineated in the following section.