Artificial Intelligence & Deep Learning in Retina

November 23, 2021

Since its introduction in 1959, artificial intelligence technology has evolved rapidly and has benefited research, industry, and medicine.

Deep learning, a form of artificial intelligence (AI), is used in ophthalmology for data analysis, segmentation, automated diagnosis, and the prediction of possible outcomes.

The combination of deep learning and optical coherence tomography (OCT) has proven reliable for detecting retinal diseases and has improved diagnostic performance for diseases of the eye's posterior segment.

At present, translating these algorithms into clinical practice faces several difficulties, such as inconsistent reporting metrics when analyzing data from multiple OCT devices, imaging protocols that are not standardized between devices, and limited graphics processing capabilities.

The increase in the number of retinal disease cases has produced an ever-growing demand for retinal image readers.

The development of artificial intelligence (AI) and deep learning analysis of retinal imaging can reduce the number of ophthalmologists needed for image interpretation and the time allocated for this procedure.

At the same time, AI may increase the efficiency of healthcare providers by establishing the correct and rapid diagnosis of retinal diseases.

Since Helmholtz’s pioneering invention of the ophthalmoscope in the 19th century, direct and indirect ophthalmoscopy have served as the standard methodology for the diagnostic assessment and management of retinal diseases.

As image acquisition instrumentation has evolved, technologies such as echography and ultrawidefield fundus photography have emerged to assist or, in some cases, supplant standard funduscopic examination.

Concurrently, ongoing advances in artificial intelligence (AI) have prompted great interest in automated retinal image analysis in clinical and research settings.

Ophthalmology, and specifically the field of retina, has the opportunity to capitalize on the offerings of AI given the myriad clinical data and multimodal imaging routinely performed.

This brief review will address how AI has been implemented in the field of retina and its potential applications in the screening and management of retinal diseases.


Artificial intelligence broadly refers to the field of computer science concerned with a computer’s ability to carry out complex tasks normally associated with human performance, including visual processing, pattern recognition, and decision making.

Machine learning (ML), a subset of AI, refers to computer systems that “learn” associations between the input and output data provided and subsequently adjust their own parameters to make better predictions about new data.

Deep learning is a subset of machine learning that distinguishes itself from traditional ML by the type of data it uses and the way the computer system learns.

Deep learning eliminates a considerable amount of the predefined data that is commonly employed in machine learning, which in turn permits the use of unstructured data and limits the dependence on human input.

Additionally, in deep learning, “hidden layers” are added between the input and output layers that permit a more intricate evaluation of the input data.

Thus, the deep learning system can independently formulate decisions or associations, often in ways that are not apparent to human experts.

Through the combination of input data and different weights and biases determined by the system, deep learning neural networks make novel yet accurate perceptions, classifications, and delineations of various data sets.
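
The role of hidden layers, weights, and biases described above can be illustrated with a minimal sketch of a forward pass through a tiny network. Everything here is an assumption made for illustration: the three input values, the layer sizes, and the hand-picked weights and biases are hypothetical and are not taken from any system discussed in this review. Real retinal image models are far larger and learn their parameters from data.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive values, zeroes out negatives."""
    return max(0.0, x)


def sigmoid(x):
    """Squashes any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each unit."""
    return [
        activation(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


def forward(features, layers):
    """Pass the input through each layer in turn and return the final output."""
    activations = features
    for weights, biases, activation in layers:
        activations = dense(activations, weights, biases, activation)
    return activations


# Hypothetical parameters: 3 inputs -> 2 hidden units -> 1 output score.
HIDDEN = ([[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]], [0.0, 0.1], relu)
OUTPUT = ([[1.2, -0.7]], [-0.1], sigmoid)

# Hypothetical input features (for example, three normalized image measurements).
score = forward([0.9, 0.1, 0.4], [HIDDEN, OUTPUT])[0]
print(f"output score: {score:.3f}")
```

During training, it is the numbers inside `HIDDEN` and `OUTPUT` that the system adjusts; the code structure itself stays the same.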


Given the increasing burden of diabetic retinopathy (DR) and its far-reaching public health and societal implications, DR screening has been a major focus of AI efforts.

Although DR is the leading cause of vision loss among working-age adults in the United States, only 40% of diabetic patients obtain their recommended annual screening.

Artificial intelligence screening tools may help close existing gaps in access and resources and better distinguish patients who have referable disease and require treatment from the wider population.

A 2016 pivotal study by Google Inc. highlighted the potential of deep learning in DR screening by showing that referable DR could be identified from a single fundus photo with a sensitivity of 97.5% and specificity of 98.5%.

Since then, autonomous AI systems have been developed and approved for DR screening. In April 2018, the IDx-DR (IDx) became the first US Food and Drug Administration (FDA)-approved fully autonomous AI-based medical device to detect “more than mild diabetic retinopathy” (mtmDR).

This system uses 45-degree fundus photos, which are uploaded and analyzed using cloud-based software for the detection of mtmDR. The images can be taken during the patient’s primary care appointment and the patient referred to an ophthalmologist if mtmDR is detected.

In the pivotal prospective study using this system, sensitivity was 87.2% and specificity was 90.7% for the detection of mtmDR.
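
As a reminder of how the two figures reported throughout this section are defined, the sketch below computes sensitivity, TP / (TP + FN), and specificity, TN / (TN + FP), from the four cells of a confusion matrix. The counts used are hypothetical round numbers chosen for illustration and are not data from any study cited here.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of diseased patients the test correctly flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of disease-free patients the test correctly clears."""
    return true_neg / (true_neg + false_pos)


# Hypothetical screening results (not from any cited study):
# 100 patients with disease, 200 patients without.
tp, fn = 90, 10   # diseased patients flagged / missed
tn, fp = 180, 20  # healthy patients cleared / flagged in error

print(f"sensitivity: {sensitivity(tp, fn):.1%}")
print(f"specificity: {specificity(tn, fp):.1%}")
```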

Additional systems, including the iGradingM (Medalytix/Emis Health), EyeArt (EyeNuk Inc.), and Retmarker (Retmarker Ltd.), have been approved in Europe for the automated screening of DR using fundus photography.

In the assessment of 102,856 fundus photos of 20,258 patients, the EyeArt and Retmarker systems achieved sensitivities of 93.8% and 85.0%, respectively, for referable retinopathy, and sensitivities of 99.6% and 97.9% for proliferative disease, respectively.

Both the Retmarker and EyeArt systems have been validated to have acceptable sensitivity to capture referable retinopathy when compared to human graders, potentially making them a cost-effective alternative to manual grading alone.

In the pivotal clinical study using the EyeArt system, a sensitivity of 96% and specificity of 88% were found for detection of mtmDR. This system gained FDA clearance in the United States for automated DR screening in August 2020.

Other imaging modalities, such as optical coherence tomography (OCT) and OCT angiography (OCTA), are actively being investigated for DR screening and management. Prognostic models have been developed to predict retinal response to anti-VEGF treatments in patients with macular edema through the analysis of OCT images.


Substantial challenges exist in the screening and diagnosis of retinopathy of prematurity (ROP) given the disease’s clinical variability and limited access to trained screening specialists.

Management decisions hinge on the location and stage of vascular findings, as well as the presence of plus disease.

Given that the Early Treatment for Retinopathy of Prematurity (ETROP) study identified plus disease as one of the most important parameters for identifying treatment-level ROP, great emphasis has been placed on identifying it on screening examinations.

Creating a standardized ROP screening system that is both reliable and repeatable has become a major goal, and computer-based image analysis stands to make a significant impact.

Multiple algorithms have shown promise for detecting plus or pre-plus disease, achieving accuracies of 95% and outperforming human ROP experts evaluating the same data set.

The DeepROP system incorporates both ROP zone and stage into its classification model and grades images as either normal, minor ROP, or severe ROP.

The i-ROP system is capable of categorizing fundus photos into type 1, type 2, and pre-plus ROP with probability scores of 0.96 and 0.91 for detecting type 1 ROP and clinically significant ROP, respectively.

The i-ROP score has been shown to be noninferior to human diagnosis when identifying vascular changes in pre-plus and plus disease.

These programs demonstrate that deep learning may minimize the interobserver variability that has challenged ROP screening and may play a continued role in screening, particularly in resource-limited settings.


Timely detection and treatment of age-related macular degeneration (AMD), specifically neovascular AMD, often leads to better visual outcomes.

In-office examinations, as well as home monitoring tools such as the Amsler grid and portable devices (Foresee Preferential Hyperacuity Perimeter; Reichert Technologies), have customarily been employed for detecting AMD progression.

Although AI DR screening relies primarily on fundus photography, AI systems in the context of AMD have focused heavily on OCT images.

Trained neural networks have demonstrated strong accuracy in differentiating OCT images of normal patients from those of patients with AMD, with a sensitivity and specificity of 92.64% in normal patients and 93.69% in AMD patients.

In the context of diagnosing exudative AMD, AI had upwards of 91.0% accuracy and 95.5% accuracy for predicting the need for injection treatment. Software has also reliably detected subretinal and sub-RPE fluid.

Fundus image models have also demonstrated promising results. The DeepSeeNet program outperformed retina specialists in accuracy (0.671 vs 0.599), sensitivity (0.590 vs 0.512), and specificity (0.930 vs 0.916) when classifying eyes based on the AREDS severity score.

The program has also demonstrated the capability to identify geographic atrophy with an accuracy comparable to human graders. Artificial intelligence models have also been used in a prognostic context in AMD.

A predictive model using OCT features and demographic factors of 495 fellow eyes from the HARBOR trial differentiated converting vs nonconverting eyes with a performance of 0.68 and 0.80 for the development of choroidal neovascularization and geographic atrophy, respectively.

Models have also been developed with the purpose of predicting visual acuity response to treatment with anti-VEGF therapy using baseline OCT images.


Sickle cell disease is one of the most common inherited genetic diseases with various ocular manifestations in the retina, warranting careful monitoring and treatment.

OCTA features such as blood vessel diameter and tortuosity, vessel perimeter index, foveal avascular zone area, contour irregularity, and parafoveal avascular density have been used to train algorithms to identify retinopathy with an average accuracy of 95%.

Algorithms have also been able to differentiate between mild sickle-cell retinopathy (stage II) and severe sickle-cell retinopathy (stage III) with an accuracy rate of 97%.
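
To make one of the OCTA features listed above more concrete, the following sketch computes vessel tortuosity as the arc-chord ratio: the length of the path along a vessel centerline divided by the straight-line distance between its endpoints. This is one common definition and is an assumption here; the studies summarized above may define tortuosity differently. The function name and the sample coordinates are hypothetical.

```python
import math


def tortuosity(centerline):
    """Arc-chord ratio of a vessel centerline given as (x, y) points.

    Returns 1.0 for a perfectly straight vessel and larger values
    for more winding vessels.
    """
    if len(centerline) < 2:
        raise ValueError("need at least two centerline points")

    # Arc length: sum of the distances between consecutive points.
    arc = sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(centerline, centerline[1:])
    )

    # Chord length: straight-line distance between the two endpoints.
    (x_start, y_start), (x_end, y_end) = centerline[0], centerline[-1]
    chord = math.hypot(x_end - x_start, y_end - y_start)
    if chord == 0:
        raise ValueError("endpoints coincide; arc-chord ratio is undefined")

    return arc / chord


# Hypothetical centerlines in arbitrary pixel units.
straight = [(0, 0), (3, 4)]
winding = [(0, 0), (1, 1), (2, 0)]
print(tortuosity(straight), tortuosity(winding))
```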

Machine learning tools have also been applied to screen for systemic risk factors and disease. Models have accurately determined a patient’s age, sex, smoking status, and systolic blood pressure from a single fundus photograph.

The same algorithm may also predict a patient’s 5-year risk of developing a major adverse cardiac event.


Although the diagnostic accuracy of AI programs is impressive, algorithms may have a relatively high false-positive rate, which may lead to unnecessary referrals. However, since these result in clinical examinations, no unnecessary treatments would occur.
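
The link between false positives and unnecessary referrals can be sketched with the positive predictive value (PPV), the share of positive screens that are true positives. The example below plugs in the sensitivity (87.2%) and specificity (90.7%) reported above for the IDx-DR pivotal study. The 10% disease prevalence is purely an assumption chosen for illustration and does not come from that study or any other source cited here.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a patient with a positive screen truly has disease."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Sensitivity and specificity as quoted in the text for mtmDR detection.
SENSITIVITY = 0.872
SPECIFICITY = 0.907

# ASSUMPTION: hypothetical prevalence of referable disease in the screened group.
ASSUMED_PREVALENCE = 0.10

ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, ASSUMED_PREVALENCE)
print(f"PPV at {ASSUMED_PREVALENCE:.0%} assumed prevalence: {ppv:.1%}")
```

Under these assumed inputs, roughly half of the positive screens would be true positives; the figure shifts substantially as the assumed prevalence changes, which is why specificity matters for the referral burden.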

Screening programs must have high sensitivity in order to be clinically safe. The specificity should be high enough to be clinically useful. Furthermore, the development of functional algorithms relies on the quality and abundance of source data.

Homogeneous data may lead to biases in the models, particularly when they are generalized and applied to underrepresented populations.

Thus, the training set used must be diverse and include various subsets of the population at large to develop an algorithm with widespread applicability.

Additionally, despite the common goal of accurately identifying diseases, there are currently no standardized protocols for image capture or image analysis algorithms, which inherently results in variability and limits usability.

Furthermore, obtaining images of sufficient quality for grading is paramount because systems are unable to assess images that fall below a certain quality threshold.

The National Institutes of Health through its collaborative community projects is working on these areas of unmet needs for various ophthalmic conditions.

Collaboration across countries and organizations, as well as extensive data sharing and open-source algorithms, will help ensure relevant and useful AI systems in the future for screening and for determining treatment prognosis in ophthalmic conditions.

As the capabilities of AI evolve, more commercially available products, not only for DR but also for other diseases, will begin to appear.

Together with current routine clinical practice, AI and deep learning offer potential avenues to improve clinical efficiency, expand access to care, and ultimately improve the overall quality of care.