A Multicenter Analysis of Subjectivity of Indirect Immunofluorescence Test in Antinuclear Antibody Screening
Vildan TURAN FARAŞAT1, Talat ECEMİŞ1, Yavuz DOĞAN2, Aslı Gamze ŞENER3, Gülfem TEREK ECE4, Pınar Erbay DÜNDAR5, Tamer ŞANLIDAĞ6
1Department of Medical Microbiology, Manisa Celal Bayar University, Faculty of Medicine, Manisa, Turkey
2Department of Medical Microbiology, Dokuz Eylül University, Faculty of Medicine, Izmir, Turkey
3Department of Medical Microbiology, Katip Çelebi University, Faculty of Medicine, Izmir, Turkey
4Department of Medical Microbiology, Izmir Medicalpark Hospital, Izmir, Turkey
5Department of Public Health, Manisa Celal Bayar University, Faculty of Medicine, Manisa, Turkey
6Department of Medical Microbiology, Manisa Celal Bayar University, Faculty of Medicine, Manisa, Turkey
Keywords: Antinuclear antibodies, autoimmune rheumatic diseases, indirect immunofluorescence, subjectivity
Objectives: This study aims to evaluate the interpretation of the antinuclear antibody (ANA)-indirect immunofluorescence (IIF) test results based on the interpreter-related subjectivity and to examine the inter-center agreement rates with the performance of each laboratory.
Patients and methods: The ANA-IIF testing was carried out in a total of 600 sera and evaluated by four laboratories. The inter-center agreement rates were detected. The same results given by the four centers were accepted as gold standard and the predictive values of each center were calculated.
Results: The inter-center agreement was reported for ANA-IIF test results from 392 of 600 (65.3%) sera, while 154 of 392 results were positive. Four study centers reported 213 (35.5%), 222 (37.0%), 266 (44.3%), and 361 (60.2%) positive test results, respectively. In terms of the patterns, the highest and lowest positive predictive values were 72.3% and 42.7%, respectively, while the highest and lowest negative predictive values were 99.6% and 61.5%, respectively. All four centers assigned the same semi-quantitative fluorescence intensity level (out of three levels) in 100 sera; 87% of these agreements were at 3(+), while the other two levels accounted for 6% and 7%. The highest predictive value for the highest fluorescence intensity of 3(+) was found to be 71.9%.
Conclusion: Significant differences may be observed among laboratories in terms of qualitative results, patterns, and semi-quantitative determination of the fluorescence intensity in the ANA-IIF testing, particularly at low fluorescence intensity levels and in those with speckled patterns. In case of any discrepancy between ANA-IIF test and clinical prediagnosis, the test should be repeated in another laboratory, if necessary.
Systemic autoimmune rheumatic diseases are a wide, heterogeneous group of diseases characterized by the occurrence of antinuclear antibodies (ANAs) directed against several intracellular targets. Therefore, detection of ANAs is of utmost importance in the diagnosis of these diseases.(1,2)
The term ANA is based on historical development in the understanding of this group of diseases; however, currently, this term indicates antibodies against both nuclear and cytoplasmic antigens. The indirect immunofluorescence (IIF) test has been used for detecting these antibodies for more than five decades and is considered the gold standard screening method owing to its high sensitivity.(3) By binding to cells from the human laryngeal epithelial carcinoma-2 (HEp-2) cell line, which are specifically used for this test, ANAs create images (patterns) which not only demonstrate the presence of ANAs but may also predict the prognosis for a specific ANA type, which makes the IIF test more than a screening test and superior to other screening modalities.(4) The ANA titers may be determined with IIF tests, and fluorescence intensity may be expressed semi-quantitatively as currently practiced in many laboratories.(5) Based on the currently established ANA disease testing algorithm, the second-line reflex tests including enzyme-linked immunosorbent assay (ELISA) using extractable nuclear antigens (ENAs), flow cytometry or immunoblotting techniques, which can detect multiple anti-ENA antibodies, are used to detect specific antibodies, when the initial screening tests produce ANA positivity.(6-8)
Although the ANA-IIF test has been considered the gold standard screening test, this technique involves a number of time-consuming manual procedures with standardization issues.(5,9) The major challenge and limitation of ANA-IIF testing is that it requires competent and experienced interpreters and is inherently subjective. Although subjectivity is an established fact in routine laboratory practice, the degree of subjectivity has been investigated in only a limited number of studies in the literature to date. Therefore, in this study, we aimed to evaluate the interpretation of the ANA-IIF test results based on the interpreter-related subjectivity and to examine the inter-center agreement rates with the performance of each laboratory.
Patients and Methods
This multicenter study was conducted between January 2016 and September 2017 in medical microbiology laboratories of four tertiary hospitals. The ANA-IIF analyses were assessed by specialists who were competent in reading and interpreting ANA-IIF testing. The interpreter-assessed positive/negative rates, patterns detected in the analyses and semi-quantitative analysis of fluorescence intensities were compared to each other. The results were interpreted and statistically analyzed.
The inclusion criteria for the laboratories were as follows: having staff microbiologists and technicians with a minimum of five-year experience in reading and interpreting ANA-IIF tests; and the inter-center distances not exceeding 50 km to protect test slides in transport.
A code unrelated to the above order was randomly assigned to each center as Center 1, Center 2, Center 3, and Center 4. The Serology Laboratory of Manisa Celal Bayar University, Faculty of Medicine performed and interpreted a total of 100 ANA-IIF tests every day. Test slides were delivered to other centers during the same day in containers with light and heat insulation. Tests were assessed by interpreters within the same day.
Test sera were selected independently of the clinical diagnosis. A total of 600 sera with positive and negative results were obtained from the centers and included in the study. All sera were collected at Manisa Celal Bayar University, Faculty of Medicine. The sera were stored at -70°C until analysis. The IIFT Mosaic Basic Profile 3A® (Euroimmun®, Luebeck, Germany) test containing both human HEp-2 cells and monkey liver cells was used in the ANA-IIF testing. Tests were conducted in sera diluted 1/100 in accordance with the manufacturer’s instructions.
For the assessment of ANA-IIF tests, the sera were randomly distributed into slides. All ANA-IIF tests were assessed in a double-blind fashion. The Guidelines for the Laboratory Diagnosis of Autoantibodies of the Society for Clinical Microbiologists of Turkey was used in the assessment of ANA patterns, and fluorescence microscopes of the same brand (EUROStar II®, Euroimmun®, Luebeck, Germany) were used in all study centers.(10) The fluorescence intensity was semi-quantitatively rated at three levels and expressed as 1(+), 2(+), 3(+).
Considering multiple patterns in sera with ANA positivity in the ANA-IIF test, the test results were divided into two categories based on the pattern diagnosed in the analysis: complete agreement and partial agreement. Agreement with at least one pattern was considered partial agreement, while agreement for all patterns was considered complete agreement, if multiple patterns were detected in the same sera sample.
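The complete/partial distinction described above can be expressed as a small set comparison. The following is a minimal sketch, not the procedure actually used in the study: the function name, the encoding of a negative result as an empty set, and the separate label for sera that are positive everywhere but share no pattern are all assumptions made for illustration. The pattern codes are the abbreviations listed under Table 1.

```python
from typing import AbstractSet, Sequence


def classify_agreement(reports: Sequence[AbstractSet[str]]) -> str:
    """Classify one serum from the pattern sets reported by each center.

    An empty set stands for a negative ANA result (an encoding assumed here).
    """
    if not reports:
        raise ValueError("at least one center report is required")
    if all(len(r) == 0 for r in reports):
        return "negative agreement"
    if any(len(r) == 0 for r in reports):
        # At least one center called the serum negative and another positive.
        return "no agreement"
    if all(r == reports[0] for r in reports):
        # Every center reported exactly the same pattern(s).
        return "complete agreement"
    shared = set(reports[0]).intersection(*reports[1:])
    if shared:
        # At least one pattern is common to all centers.
        return "partial agreement"
    # All centers positive but no pattern in common: the text does not say how
    # such sera were counted, so they get their own (hypothetical) label.
    return "positive, no shared pattern"


if __name__ == "__main__":
    # Four centers, one serum: all report NS, two add a second pattern.
    serum = [{"NS", "NH"}, {"NS"}, {"NS", "CS"}, {"NS"}]
    print(classify_agreement(serum))
```

Using sets makes the order in which a center lists its patterns irrelevant, which matches the definition of agreement on "at least one pattern".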
Statistical analysis was performed using the SPSS version 15.0 software (SPSS Inc., Chicago, IL, USA). The percentages of agreement were calculated for comparison. The ANA-IIF tests yielding the same results from all four centers were considered the gold standard to numerically express the performance of each center and to enable comparisons. Also, positive and negative predictive values were calculated for each center.(11)
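The consensus-based calculation can be sketched as follows. This is an illustrative reading of the method rather than the authors' SPSS procedure: it assumes that a center's positive predictive value is the number of sera called positive by all centers divided by that center's own number of positive calls, and likewise for negative calls. The function and variable names are hypothetical, and the data in the example are toy values, not study data.

```python
from typing import Dict, List


def predictive_values(calls: Dict[str, List[bool]]) -> Dict[str, Dict[str, float]]:
    """Compute per-center PPV and NPV against the all-center consensus.

    `calls` maps a center name to its positive (True) / negative (False)
    call for each serum, in the same serum order for every center.
    """
    centers = list(calls)
    n_sera = len(calls[centers[0]])
    if any(len(v) != n_sera for v in calls.values()):
        raise ValueError("every center must report the same number of sera")

    # One tuple per serum holding the call of every center.
    per_serum = list(zip(*(calls[c] for c in centers)))
    consensus_pos = sum(1 for row in per_serum if all(row))
    consensus_neg = sum(1 for row in per_serum if not any(row))

    result = {}
    for c in centers:
        n_pos = sum(calls[c])
        n_neg = n_sera - n_pos
        result[c] = {
            # Assumed definition: consensus count over the center's own count.
            "ppv": consensus_pos / n_pos if n_pos else float("nan"),
            "npv": consensus_neg / n_neg if n_neg else float("nan"),
        }
    return result


if __name__ == "__main__":
    # Toy data for three hypothetical centers and five sera.
    toy = {
        "A": [True, True, False, False, True],
        "B": [True, False, False, False, True],
        "C": [True, True, False, True, True],
    }
    for center, values in predictive_values(toy).items():
        print(center, values)
```

Under this definition a center that reports many positives not shared by the others lowers its own positive predictive value while its negative predictive value rises, which is the pattern the study describes for the center with the most positive calls.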
The four centers used a total of 17 pattern types, and 90 patterns were identified, either as single pattern types or as combinations of two, three, or four pattern types. The speckled pattern was the most widely reported pattern type for both single and multiple patterns (Table 1).
Table 1 (columns: Center 1, Center 2, Center 3, Center 4)
SP: Single pattern; MP: Multiple patterns; NS: Nuclear speckled; NDFS: Nuclear dense fine speckled; NH: Nuclear homogeneous; CR: Cytoplasmic reticular; C: Centromere; N: Nucleolar; CS: Cytoplasmic speckled; SLP: Scl-70 like pattern; CLF: Cytoplasmic fibrillar linear; CE: Centrosome; NDF: Nuclear dots few; CDFS: Cytoplasmic dense fine speckled; NM: Nuclear membrane; IB: Intracellular bridge; CFF: Cytoplasmic fibrillar filamentous; SF: Spindle fibers; CPS: Cytoplasmic polar speckled.
According to the ANA-IIF test results reported from each center, the highest rate of positive results was reported by Center 4 with 361 (60.2%) positive test results (Table 2). The lowest rate of positive results was reported by Center 2 with 213 (35.5%) positive test results.
Table 2 footnotes: * In total number of tests (n=600); ** Positive predictive value; *** Negative predictive value.
In the assessment of semi-quantitative analysis of the fluorescence intensity rated by the ANA-IIF tests, 3(+) fluorescence intensity was the most commonly reported level by three centers, while the highest rates of 1(+) or 2(+) fluorescence intensity were observed in Center 4.
In the assessment of the inter-center agreement for the analysis of the same sera samples by the ANA-IIF test, all four centers agreed (including partial agreements) on 392 (65.3%) sera, of which 154 (39.3%) tested positive for ANA and 238 (60.7%) tested negative for ANA (Table 3). The calculation of predictive values of the test results from each center revealed that Center 2 had the highest positive predictive value (72.3%) relative to the 154 positive results upon which a consensus was reached. In addition, Center 4 had the highest negative predictive value (99.6%), with a single non-agreement. These results are shown in Table 2.
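As a worked check, the reported extremes can be reproduced from the counts given in the text (600 sera; 154 consensus positives and 238 consensus negatives; 213 and 361 positive calls by Center 2 and Center 4). The denominators are an assumption: each center's own number of positive or negative calls. The assignment of the lowest negative predictive value (61.5%) to Center 2 is inferred from this arithmetic and is not stated explicitly in the text.

```latex
\begin{align*}
\mathrm{PPV}_{\text{Center 2}} &= \frac{154}{213} \approx 72.3\% \\
\mathrm{NPV}_{\text{Center 2}} &= \frac{238}{600 - 213} = \frac{238}{387} \approx 61.5\% \\
\mathrm{PPV}_{\text{Center 4}} &= \frac{154}{361} \approx 42.7\% \\
\mathrm{NPV}_{\text{Center 4}} &= \frac{238}{600 - 361} = \frac{238}{239} \approx 99.6\%
\end{align*}
```

The single serum separating 238 from 239 corresponds to the one non-agreement among the negative calls of Center 4.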
Table 3 column groups: Agreement on patterns of four centers (Agreement on positive results: Complete agreement, Partial agreement, Total; Agreement on negative results; Total); Agreement on fluorescence intensity of four centers (1(+), 2(+), 3(+), Total).
Table 3 footnotes: * In total positive results; ** In total; *** In total number of tests (n=600).
The same fluorescence intensity was reported by the four centers for 100 sera based on the semi-quantitative analysis, and 87 (87%) of these 100 complete agreements were at the 3(+) fluorescence intensity level (Table 3). Relative to the 100 sera upon which a consensus was reached, Center 2 had the highest overall predictive value with 46.7%. The predictive values for each of the three fluorescence intensity levels are shown in Table 4; the highest value, 71.9%, was observed in Center 2 at the 3(+) level.
Table 4 footnotes: * In total; ** Positive predictive value in fluorescence intensity category; *** Positive predictive value in overall agreement (n=100).
In addition, the following results were noted on the pattern agreement among the four centers:
1. A speckled pattern was diagnosed in 57 of 104 sera upon which a complete agreement was reached by all of four centers and 35 of 50 (70%) sera upon which a partial agreement was reached.
2. Of 46 sera considered positive by three of four centers, 19 (41.3%) were found to be negative for ANAs by a single center.
3. Of 94 sera considered negative by three of four centers, 91 (96.8%) sera were considered positive by Center 4, and 81 (89.0%) of these 91 sera showed nuclear and cytoplasmic speckled patterns (45 and 36 sera, respectively) of which 65 (71.4%) had 1(+) fluorescence intensity level and the rest had 2(+) fluorescence intensity level.
The most important feature expected to be found in a screening test is the ability to discriminate patients from healthy individuals and guide to an appropriate second-line reflex test, in case of a positive result. As with ANA screening using the ELISA, which has been increasingly used currently, a positive result may be adequate to guide to second-line reflex tests such as ELISA and immunoblotting, irrespective of the pattern; however, the ANA-IIF testing seems to be an indispensable natural array, as patterns detected in this test may predict certain antibodies, which are not covered by reflex tests (as seen with mitotic or nuclear membrane pattern), and ANA-IIF may allow the detection of new antibodies in the future.(7,8) The American College of Rheumatology has also emphasized that IIF-based tests should remain as the gold standard in ANA screening and alternative tests should exhibit similar performance to IIF test.(12)
In our study, the proportion of positive results reported by the study centers ranged from 35.5% (Center 2) to 60.2% (Center 4) (Table 2). A difference of 24.7 percentage points indicates that the results may differ qualitatively among different laboratories in about one of four sera. The other important issues here are the agreement rates among positive and negative results reported from the centers and the predictive values of these results. Agreement was reached by four centers in 392 (65.3%) of 600 ANA-IIF tests. Considering this consensus as the gold standard, serious differences were observed among the centers in performance rates, as shown in Table 2. Positive predictive values of the centers varied from 42.7% to 72.3%. A difference of about 30 percentage points was observed in the positive predictive values, while the difference was even larger in the negative predictive values, reaching about 38 percentage points (Table 2). These differences indicate the extent of subjectivity. In addition, Center 4 seems to increase the difference, and this should be investigated. When the differences in the predictive values were recalculated after excluding Center 4, the difference in positive predictive values and the difference in negative predictive values decreased to 14.4% and 9.8%, respectively, indicating a relative improvement in the subjectivity rates. However, we believe that the results obtained by excluding Center 4, an expert laboratory for the ANA-IIF test, were not more realistic. The percentage of non-agreement with the results reported from other centers was 57.3% in 361 positive results reported by Center 4, suggesting that Center 4 had difficulty in interpreting positive results. This is likely to result from the deceptive appearance of speckled patterns, as in many laboratories. It is not difficult to fall into this deceptive trap, particularly with borderline patterns.
The analyses of the patterns reported by the study centers indicated that Center 4 reported a nuclear speckled pattern in 179 positive tests and a cytoplasmic speckled pattern in 97 positive tests, while the closest values to these results were 77 and 38 for the same patterns, respectively, as shown in Table 1. Furthermore, Center 4 reported positive results for 91 of 94 sera which were considered negative for ANA by the other three centers, while 89% of these sera showed a speckled pattern and fluorescence intensity was 1(+) (borderline) in 71%, indicating that the mistake mainly originated from the speckled patterns and the center had difficulty in discriminating positive results from negative results. The difficulty in interpretation can be attributed to dusting on the fluorescence microscope and the photobleaching effect, which may be lessened by 30% using a monitor instead of the IIF microscope.(13) In addition, in overlap patterns, speckled patterns may overlap other patterns and the overlap pattern may remain unrecognized or may be confused with other patterns. The combination of the homogeneous pattern and nuclear speckled pattern may be frequently confused with the dense fine speckled 70 (DFS70) pattern. As we have mentioned in the results section, in our study, the speckled pattern rate was 54.9% in complete agreements, while the rate of speckled pattern was 70% in partial agreements, which may be the result of overlapping patterns. In general, the discrimination problem experienced by Center 4 is a common problem of many ANA-IIF interpreters.
In the literature, there is a limited number of studies on the subjectivity in ANA-IIF testing. In a study comparing two laboratories in the same region, the rate of agreement in 101 tests was found to be 42%.(14) In another study, it was reported that a computer-aided diagnosis system might help resolve drawbacks in the interpretation of ANA-IIF tests, while the mean agreement rate among three interpreters using this system was found to be moderate.(15) In a study using quality samples for external quality assessment, the inter-laboratory agreement rate varied between 92.7% and 99.5%.(16) The external quality samples may include marked patterns with high antibody levels, which may explain the higher agreement rates compared to our study results. In our study, sera were routinely obtained only from those known or suspected to be patients, and the study was conducted under routine laboratory conditions.
Although it is recommended to report antibody levels by titration for ANA-IIF tests, the semi-quantitative expression of the antibody levels, such as three or four positive fluorescence intensity levels, has been more widely used.(5,17) The semi-quantitative analysis is cost-effective and significantly decreases the workload of a laboratory; however, this technique is highly subjective. In our study, 3(+) fluorescence intensity was the most frequently reported result by the first three centers and the rates of 3(+) fluorescence intensity were higher than 50% for these three centers, while the rates of all three levels of fluorescence intensity were quite close to each other in Center 4. The rate of agreement for overall fluorescence intensity was also the lowest in Center 4 (27.7%). None of the centers was able to achieve 50% in overall positive predictive value for fluorescence intensity (Table 4). The assessment of each individual fluorescence intensity level revealed that at 3(+), which was the most intense level, the agreement rate was as high as 87%, corresponding to quite good agreement (Table 3). Predictive values of the centers varied between 64% and 72% at the 3(+) intensity level and these rates might be considered satisfactory; however, the rates were found to be lower than expected at the 1(+) and 2(+) intensity levels (Table 4). In the most recent study, the agreement rate at the 3(+) intensity level was similar to the rates in our study, while our results for lower fluorescence intensities were far lower than 43%, which was reported in the aforementioned study.(15) Based on our study results, it may be concluded that the semi-quantitative analysis of fluorescence intensity is not effective at lower intensity levels and the results obtained at these levels may not be of clinical relevance. Many laboratories use four levels of fluorescence intensity rather than three levels. This practice further increases the rates of non-agreement.
It is possible that the rate of non-agreement can be decreased and the results may be more meaningful, if the fluorescence intensity levels are categorized into high and low intensity levels alone, rather than into the three or four levels used today.
A significant challenge observed in our study was the lack of standardization in pattern nomenclature and reporting. It is of paramount importance to create a common terminology of patterns and reporting for both laboratories and clinicians. In routine practice, the lack of a precise categorization and evaluation level of patterns, and the way to express them still remain as major problems. To overcome this problem, an important initiative was launched in 2014 and classification and nomenclature standards were established by the International Consensus on ANA Patterns (ICAP) including categorization and nomenclature, and a reporting format was developed at the expert and competent level for pattern reporting.(18)
Although the expert-level classification and nomenclature of the ICAP was used in our study, this was not possible for some patterns, and routine practice was inevitably followed instead. The discrimination between the coarse and fine speckled patterns, which should be reported at the expert level, could not be established by certain centers and the staff working in these laboratories stated that they were not used to making such a discrimination. However, patients cannot be directed to specific reflex tests such as ribonucleoprotein/Smith, Sjögren’s syndrome type A, or Sjögren’s syndrome type B, when such discrimination is not made, and the greatest strength of the ANA-IIF screening test is then lost. One of the observations in this study was the use of antibody names for certain patterns instead of those in the nomenclature; for instance, Jo-1 pattern was used rather than cytoplasmic fine-speckled pattern. Routine laboratory experience and this study have demonstrated that detailed expert-level reporting currently leads to confusion. Evaluation difficulties are seen not only in our country but all over the world, and the number of pattern categories is generally reduced, for example to five.(19) It may be concluded that the establishment of this nomenclature and reporting may take time.
Microscopic assessments are subjective in nature, although this subjectivity can be minimized by training and experience. Specific properties of the ANA-IIF test further increase the subjectivity. Over the past decade, training in this field has gained importance in Turkey. Although many scientific disciplines have paid attention to ANA-IIF test training, these tests are still not considered an essential subject matter of a branch of science. The level of subjectivity observed in our study may be substantially reduced by increasing awareness of ANA-IIF test training and ensuring that competent healthcare professionals perform this test. Certification programs may be initiated for the evaluation of the ANA-IIF test and those who hold this certificate may be authorized to evaluate this test in the laboratory. The development and extensive use of ELISA ANA screening tests, which have been recently introduced to several laboratories, may be one of the solutions to overcome this issue. We believe that the use of automated systems is the ideal solution to alleviate interpreting and reporting issues of the ANA-IIF test while preserving its favorable properties. The optimal solution to decrease subjectivity seems to be the further development and wider use of these expensive systems, which have so far been introduced to a small number of laboratories and have not been operated at full capacity, in combination with visual ANA-IIF reading.(20)
In conclusion, although certain features such as excellent screening and guidance to reflex tests make the ANA-IIF tests indispensable, in this multicenter study, we found that the inter-center agreement may be as low as about 65%, and the differences in positive and negative predictive values may be as high as about 30% and 38%, respectively. In the assessment of ANA-IIF test reports, it should be kept in mind that semi-quantitative analysis is subjective and experience-dependent, particularly at low antibody levels, and the deceptive nature of the speckled patterns should be considered. In addition, the clinical relevance should be analyzed for the test results. It may be a rational approach to request a repeat test from another laboratory, in case of any suspicion.
The authors declared no conflicts of interest with respect to the authorship and/or publication of this article.
The authors received no financial support for the research and/or authorship of this article.
- Bonaguri C, Melegari A, Dall’Aglio P, Ballabio A, Terenziani P, Russo A, et al. An Italian multicenter study for application of a diagnostic algorithm in autoantibody testing. Ann N Y Acad Sci 2009;1173:124-9.
- Solomon DH, Kavanaugh AJ, Schur PH. Evidence-based guidelines for the use of immunologic tests: antinuclear antibody testing. Arthritis Rheum 2002;47:434-44.
- Copple SS, Sawitzke AD, Wilson AM, Tebo AE, Hill HR. Enzyme-linked immunosorbent assay screening then indirect immunofluorescence confirmation of antinuclear antibodies: a statistical analysis. Am J Clin Pathol 2011;135:678-84.
- Tozzoli R, Antico A, Porcelli B, Bassetti D. Automation in indirect immunofluorescence testing: a new step in the evolution of the autoimmunology laboratory. Auto Immun Highlights 2012;3:59-65.
- Damoiseaux J, von Mühlen CA, Garcia-De La Torre I, Carballo OG, de Melo Cruvinel W, Francescantonio PL, et al. International consensus on ANA patterns (ICAP): the bumpy road towards a consensus on reporting ANA results. Auto Immun Highlights 2016;7:1.
- Wieser M, Pohla-Gubo G, Hintner H. Antinuclear antibodies (ANA) diagnostic value of different methods for screening and differentiation. Clin Appl Immunol Rev 2001;3:201-6.
- Adams BB, Mutasim DF. The diagnostic value of anti-nuclear antibody testing. Int J Dermatol 2000;39:887-91.
- Op De Beeck K, Vermeersch P, Verschueren P, Westhovens R, Mariën G, Blockmans D, et al. Detection of antinuclear antibodies by indirect immunofluorescence and by solid phase assay. Autoimmun Rev 2011;10:801-8.
- Copple SS, Giles SR, Jaskowski TD, Gardiner AE, Wilson AM, Hill HR. Screening for IgG antinuclear autoantibodies by HEp-2 indirect fluorescent antibody assays and the need for standardization. Am J Clin Pathol 2012;137:825-30.
- Kaklıkkaya N, Kaşifoğlu N, Sarıbaş Z, Şener B, Taşkınoğlu T, Uyar NY. Otoantikorların laboratuvar tanısı rehberi (KLİMUD Yayın No: 9) Ankara: Çağhan Ofset; 2015. p. 40-84.
- Sackett DL, Haynes RB, Guyatt GH, Tugwell P, editors. Clinical Epidemiology: A Basic Science for Clinical Medicine. 2nd ed. New York: Little Brown; 1991. p. 163-7.
- American College of Rheumatology Position Statement: Methodology of Testing for Antinuclear Antibodies. (2009). Available at: http://www.rheumatology.org/practice/clinical/position/ana_position_stmt.pdf.
- Rigon A, Soda P, Zennaro D, Iannello G, Afeltra A. Indirect immunofluorescence in autoimmune diseases: assessment of digital images for diagnostic purpose. Cytometry B Clin Cytom 2007;72:472-7.
- Abeles AM, Gomez-Ramirez M, Abeles M, Honiden S. Antinuclear antibody testing: discordance between commercial laboratories. Clin Rheumatol 2016;35:1713-8.
- Rigon A, Infantino M, Merone M, Iannello G, Tincani A, Cavazzana I, et al. The inter-observer reading variability in anti-nuclear antibodies indirect (ANA) immunofluorescence test: A multicenter evaluation and a review of the literature. Autoimmun Rev 2017;16:1224-9.
- Pham BN, Albarede S, Guyard A, Burg E, Maisonneuve P. Impact of external quality assessment on antinuclear antibody detection performance. Lupus 2005;14:113-9.
- Fritzler MJ. The antinuclear antibody test: last or lasting gasp? Arthritis Rheum 2011;63:19-22.
- Chan EK, Damoiseaux J, Carballo OG, Conrad K, de Melo Cruvinel W, Francescantonio PL, et al. Report of the First International Consensus on Standardized Nomenclature of Antinuclear Antibody HEp-2 Cell Patterns 2014-2015. Front Immunol 2015;6:412.
- Tebo AE. Recent Approaches To Optimize Laboratory Assessment of Antinuclear Antibodies. Clin Vaccine Immunol 2017;24.
- Alsuwaidi M, Dollinger M, Fleck M, Ehrenstein B. The Reliability of a Novel Automated System for ANA Immunofluorescence Analysis in Daily Clinical Practice. Int J Rheumatol 2016;2016:6019268.