Abstract
Objectives
Establishing the reproducibility of expert-derived measurements on CTA exams of aortic dissection is clinically important and paramount for ground-truth determination for machine learning.
Methods
Four independent observers retrospectively evaluated CTA exams of 72 patients with uncomplicated Stanford type B aortic dissection and assessed the reproducibility of a recently proposed combination of four morphologic risk predictors (maximum aortic diameter, false lumen circumferential angle, false lumen outflow, and intercostal arteries). For the first inter-observer variability assessment, 47 CTA scans from one aortic center were evaluated by expert-observer 1 in an unconstrained clinical assessment without a standardized workflow and compared to a composite of three expert-observers (observers 2–4) who used a standardized workflow. A second inter-observer variability assessment, on 30 of the 47 CTA scans, compared observers 3 and 4 using a constrained, standardized workflow. A third inter-observer variability assessment was performed after specialized training and compared observers 3 and 4 in an external population of 25 CTA scans. Inter-observer agreement was assessed with intraclass correlation coefficients (ICCs) and Bland-Altman plots.
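As an illustration of the two agreement statistics named above, the following is a minimal sketch and not the authors' analysis code. The abstract does not state which ICC form was used, so the sketch assumes a two-way random-effects, absolute-agreement, single-measurement ICC, often written ICC(2,1). The function names `icc_2_1` and `bland_altman` and the example diameters are hypothetical.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: array of shape (n_subjects, k_raters).
    Assumption: the source does not say which ICC form was used.
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    # Sums of squares from a two-way ANOVA without replication
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)  # between subjects
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)  # between raters
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Limits of agreement are bias +/- 1.96 sample standard deviations
    of the paired differences.
    """
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    bias = d.mean()
    sd = d.std(ddof=1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Hypothetical maximum aortic diameters (mm) from two observers
    obs_a = [38.0, 41.5, 44.0, 36.5, 47.0]
    obs_b = [37.0, 42.0, 45.5, 36.0, 46.0]
    print("ICC(2,1):", icc_2_1(np.column_stack([obs_a, obs_b])))
    print("Bias and limits of agreement:", bland_altman(obs_a, obs_b))
```

The ICC summarizes how much of the total variance is attributable to differences between patients rather than between observers, while the Bland-Altman bias and limits express disagreement in the measurement's own units.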
Results
Pre-training ICCs of the four morphologic features ranged from 0.04 (−0.05 to 0.13) to 0.68 (0.49–0.81) between observer 1 and observers 2–4 and from 0.50 (0.32–0.69) to 0.89 (0.78–0.95) between observers 3 and 4. After training, ICCs improved, ranging from 0.69 (0.52–0.87) to 0.97 (0.94–0.99), and Bland-Altman analysis showed decreased bias and narrower limits of agreement.
Conclusions
Manual morphologic feature measurements on CTA images can be optimized, resulting in improved inter-observer reliability. This is essential for robust ground-truth determination for machine learning models.
Key Points
• Manual measurements of aortic CTA imaging features made in an unconstrained clinical fashion showed poor inter-observer reproducibility.
• A standardized workflow with standardized training resulted in substantial improvements, yielding excellent inter-observer reproducibility.
• Robust ground truth labels obtained manually with excellent inter-observer reproducibility are key to developing reliable machine learning models.
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00330-022-09056-z/MediaObjects/330_2022_9056_Fig1_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00330-022-09056-z/MediaObjects/330_2022_9056_Fig2_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00330-022-09056-z/MediaObjects/330_2022_9056_Fig3_HTML.png)
![](http://media.springernature.com/m312/springer-static/image/art%3A10.1007%2Fs00330-022-09056-z/MediaObjects/330_2022_9056_Fig4_HTML.png)
Abbreviations
- CT: Computed tomography
- CTA: Computed tomography angiography
- GRRAS: Guidelines for reporting reliability and agreement studies
- ICC: Intraclass correlation coefficient
- ROADMAP: Registry of aortic dissections to model adverse events and progression
- TEVAR: Thoracic endovascular aortic repair
- uTBAD: Uncomplicated acute Stanford type B aortic dissection
References
Hagan PG, Nienaber CA, Isselbacher EM et al (2000) The International Registry of Acute Aortic Dissection (IRAD): new insights into an old disease. JAMA 283:897–903
Howard DP, Banerjee A, Fairhead JF et al (2013) Population-based study of incidence and outcome of acute aortic dissection and premorbid risk factor control: 10-year results from the Oxford Vascular Study. Circulation 127:2031–2037
Landenhed M, Engstrom G, Gottsater A et al (2015) Risk profiles for aortic dissection and ruptured or surgically treated aneurysms: a prospective cohort study. J Am Heart Assoc 4:e001513
Nienaber CA, Clough RE (2015) Management of acute aortic dissection. Lancet 385:800–811
Fleischmann D, Afifi RO, Casanegra AI et al (2022) Imaging and Surveillance of Chronic Aortic Dissection: A Scientific Statement From the American Heart Association. Circ Cardiovasc Imaging. https://doi.org/10.1161/HCI.0000000000000075
MacGillivray TE, Gleason TG, Patel HJ et al (2022) The Society of Thoracic Surgeons/American Association for Thoracic Surgery clinical practice guidelines on the management of type B aortic dissection. J Thorac Cardiovasc Surg. https://doi.org/10.1016/j.jtcvs.2021.11.091
Spinelli D, Benedetto F, Donato R et al (2018) Current evidence in predictors of aortic growth and events in acute type B aortic dissection. J Vasc Surg 68:1925–1935 e1928
Chang CP, Liu JC, Liou YM, Chang SS, Chen JY (2008) The role of false lumen size in prediction of in-hospital complications after acute type B aortic dissection. J Am Coll Cardiol 52:1170–1176
Delsart P, Beregi JP, Devos P, Haulon S, Midulla M, Mounier-Vehier C (2014) Thrombocytopenia: an early marker of late mortality in type B aortic dissection. Heart Vessels 29:220–230
Evangelista A, Salas A, Ribera A et al (2012) Long-term outcome of aortic dissection with patent false lumen: predictive role of entry tear size and location. Circulation 125:3133–3141
Grommes J, Greiner A, Bendermacher B et al (2014) Risk factors for mortality and failure of conservative treatment after aortic type B dissection. J Thorac Cardiovasc Surg 148:2155–2160 e2151
Jonker FH, Trimarchi S, Rampoldi V et al (2012) Aortic expansion after acute type B aortic dissection. Ann Thorac Surg 94:1223–1229
Kudo T, Mikamo A, Kurazumi H, Suzuki R, Morikage N, Hamano K (2014) Predictors of late aortic events after Stanford type B acute aortic dissection. J Thorac Cardiovasc Surg 148:98–104
Sailer AM, van Kuijk SM, Nelemans PJ et al (2017) Computed tomography imaging features in acute uncomplicated Stanford type-B aortic dissection predict late adverse events. Circ Cardiovasc Imaging 10:e005709
Sueyoshi E, Nagayama H, Hayashida T, Sakamoto I, Uetani M (2013) Comparison of outcome in aortic dissection with single false lumen versus multiple false lumens: CT assessment. Radiology 267:368–375
Sueyoshi E, Sakamoto I, Hayashi K, Yamaguchi T, Imada T (2004) Growth rate of aortic diameter in patients with type B aortic dissection during the chronic phase. Circulation 110:II256–II261
Tanaka A, Sakakibara M, Ishii H et al (2014) Influence of the false lumen status on clinical outcomes in patients with acute type B aortic dissection. J Vasc Surg 59:321–326
Tolenaar JL, Froehlich W, Jonker FH et al (2014) Predicting in-hospital mortality in acute type B aortic dissection: evidence from International Registry of Acute Aortic Dissection. Circulation 130:S45–S50
Trimarchi S, Tolenaar JL, Jonker FH et al (2013) Importance of false lumen thrombosis in type B aortic dissection prognosis. J Thorac Cardiovasc Surg 145:S208–S212
Tolenaar JL, van Keulen JW, Jonker FH et al (2013) Morphologic predictors of aortic dilatation in type B aortic dissection. J Vasc Surg 58:1220–1225
Ueki C, Sakaguchi G, Shimamoto T, Komiya T (2014) Prognostic factors in patients with uncomplicated acute type B aortic dissection. Ann Thorac Surg 97:767–773 discussion 773
Kamman AV, Brunkwall J, Verhoeven EL, Heijmen RH, Trimarchi S, Trialists A (2017) Predictors of aortic growth in uncomplicated type B aortic dissection from the Acute Dissection Stent Grafting or Best Medical Treatment (ADSORB) database. J Vasc Surg 65:964–971 e963
Loewe C, Czerny M, Sodeck GH et al (2012) A new mechanism by which an acute type B aortic dissection is primarily complicated, becomes complicated, or remains uncomplicated. Ann Thorac Surg 93:1215–1222
Kamman AV, Jonker FHW, Sechtem U et al (2017) Predictors of stable aortic dimensions in medically managed acute aortic syndromes. Ann Vasc Surg 42:143–149
Kitamura T, Torii S, Oka N et al (2015) Impact of the entry site on late outcome in acute Stanford type B aortic dissection. Eur J Cardiothorac Surg 48:655–661 discussion 661-652
Quint LE, Liu PS, Booher AM, Watcharotone K, Myles JD (2013) Proximal thoracic aortic diameter measurements at CT: repeatability and reproducibility according to measurement method. Int J Cardiovasc Imaging 29:479–488
Regeer MV, van Rosendael PJ, Kamperidis V et al (2015) Effect of statins on aortic root growth rate in patients with bicuspid aortic valve anatomy. Int J Cardiovasc Imaging 31:1583–1590
Rudarakanchana N, Bicknell CD, Cheshire NJ et al (2014) Variation in maximum diameter measurements of descending thoracic aortic aneurysms using unformatted planes versus images corrected to aortic centerline. Eur J Vasc Endovasc Surg 47:19–26
Singh K, Jacobsen BK, Solberg S et al (2003) Intra- and interobserver variability in the measurements of abdominal aortic and common iliac artery diameter with computed tomography. The Tromso study. Eur J Vasc Endovasc Surg 25:399–407
Hahn LD, Mistelbauer G, Higashigaito K et al (2020) True and false lumen segmentation in uncomplicated type B aortic dissection using machine learning. Radiol Cardiothorac Imaging 2:e190179
Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33:159–174
van Hamersvelt RW, Willemink MJ, Takx RA et al (2014) Cardiac valve calcifications on low-dose unenhanced ungated chest computed tomography: inter-observer and inter-examination reliability, agreement and variability. Eur Radiol 24:1557–1564
Bland JM, Altman DG (1986) Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1:307–310
Kottner J, Audige L, Brorson S et al (2011) Guidelines for Reporting Reliability and Agreement Studies (GRRAS) were proposed. J Clin Epidemiol 64:96–106
Wichmann JL, Willemink MJ, De Cecco CN (2020) Artificial intelligence and machine learning in radiology: current state and considerations for routine clinical implementation. Invest Radiol 55:619–627
Willemink MJ, Koszek WA, Hardell C et al (2020) Preparing medical imaging data for machine learning. Radiology 295:4–15
Cayne NS, Veith FJ, Lipsitz EC et al (2004) Variability of maximal aortic aneurysm diameter measurements on CT scan: significance and methods to minimize. J Vasc Surg 39:811–815
Jaakkola P, Hippelainen M, Farin P, Rytkonen H, Kainulainen S, Partanen K (1996) Interobserver variability in measuring the dimensions of the abdominal aorta: comparison of ultrasound and computed tomography. Eur J Vasc Endovasc Surg 12:230–237
Lederle FA, Wilson SE, Johnson GR et al (1995) Variability in measurement of abdominal aortic aneurysms. Abdominal Aortic Aneurysm Detection and Management Veterans Administration Cooperative Study Group. J Vasc Surg 21:945–952
Plonek T, Berezowski M, Bochenek M et al (2019) A comparison of aortic root measurements by echocardiography and computed tomography. J Thorac Cardiovasc Surg 157:479–486
Freeman LA, Young PM, Foley TA, Williamson EE, Bruce CJ, Greason KL (2013) CT and MRI assessment of the aortic root and ascending aorta. AJR Am J Roentgenol 200:W581–W592
Elefteriades JA, Mukherjee SK, Mojibian H (2020) Discrepancies in measurement of the thoracic aorta: JACC review topic of the week. J Am Coll Cardiol 76:201–217
Acknowledgements
The authors thank Shannon G. Walters, RT, MS, from the Stanford 3D and Quantitative Imaging Laboratory for his support with the image processing workflow.
Funding
This study has received funding from the American Heart Association grant numbers 18POST34030192 (MJW) and 826389 (MC), and a research grant (5T32EB009035) from the National Institute of Biomedical Imaging and Bioengineering (DM).
Ethics declarations
Guarantor
The scientific guarantor of this publication is Dominik Fleischmann, MD.
Conflict of interest
Martin J. Willemink is a Junior Deputy Editor of European Radiology. They have not taken part in the review or selection process of this article.
The other authors of this manuscript declare no relationships with any companies whose products or services may be related to the subject matter of the article.
Statistics and biometry
One of the authors has significant statistical expertise.
Informed consent
Written informed consent was waived by the Institutional Review Board.
Ethical approval
Institutional Review Board approval was obtained.
Study subjects or cohorts overlap
Some of the subjects (n = 47, out of a total of 72) included in this study were part of the population (n = 83) included in reference #14 (Sailer et al, doi:10.1161/CIRCIMAGING.116.005709).
Methodology
• retrospective
• experimental
• multi-center study
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
ESM 1
(PDF 480 kb)
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Willemink, M.J., Mastrodicasa, D., Madani, M.H. et al. Inter-observer variability of expert-derived morphologic risk predictors in aortic dissection. Eur Radiol 33, 1102–1111 (2023). https://doi.org/10.1007/s00330-022-09056-z
DOI: https://doi.org/10.1007/s00330-022-09056-z