Using breath analysis to differentiate interstitial lung diseases

Non-invasive biomarkers could lead to better treatment selection

Publication information: L. Plantier et al. The use of exhaled air analysis in discriminating interstitial lung diseases: a pilot study, Respiratory Research. (2022) 23:12. DOI: 10.1186/s12931-021-01923-5

Disease Area: Respiratory – Interstitial Lung Disease (ILD), Idiopathic Pulmonary Fibrosis (IPF)

Application: Patient stratification

Sample medium: Breath

Analysis approach: GC-ToF-MS


  • Interstitial lung diseases (ILDs) lead to impaired lung function with strong associations with morbidity and mortality. Different treatments are appropriate for different ILDs, despite similar presentation of symptoms. Current clinical methods for differential diagnosis are invasive and burdensome.
  • Plantier et al. measured VOCs on breath of patients with ILD and produced models capable of discriminating ILD subtypes, including IPF, from each other and controls.
  • Non-invasive breath biomarkers have huge potential to help achieve better treatment selection for ILD patients and help-seekers.

Interstitial lung diseases (ILDs) are a group of similar chronic lung conditions in which varying degrees of lung inflammation lead to fibrosis; the most serious of these is idiopathic pulmonary fibrosis (IPF). These lung diseases are not particularly common (fewer than 20 cases per 100,000); however, they are severe and have strong associations with both morbidity and mortality – ‘if classified as a malignancy, IPF would rank as the eighth most prevalent cancer worldwide’. Diagnosis is difficult: it can take a few years and is often assigned only through the exclusion of other diseases, rather than through a definitive test result. Even once ILD is diagnosed, treatment is still not straightforward.

Unfortunately, while anti-inflammatory and immunosuppressive drugs are appropriate treatments for many less advanced forms of ILD, they are harmful for subjects with IPF, due to differences in their underlying pathogenic processes. Symptoms for all kinds of ILD are typically similar, and differential diagnosis is tricky, normally relying on a balance of data from clinical insight, imaging, bloodwork and alveolar lavage tests, and involving multidisciplinary professionals. Achieving a definitive diagnosis would be possible with an invasive lung biopsy – however, this is often not possible and would be a last resort for subjects already living with lung damage. Developing new, non-invasive and accurate diagnostic tests capable of differentiating conditions within the ILD family is of critical importance for earlier intervention and better treatment selection.

For this reason, Plantier et al. set out to identify volatile organic compounds (VOCs) in exhaled breath that might non-invasively discriminate between IPF and other forms of ILD. They also wanted to examine whether any of the VOCs identified were associated with disease severity.



Three groups of subjects were recruited for this study, which took place around Paris, France, in 2014 and 2015: 53 subjects with diagnosed IPF, 53 subjects with diagnosed connective tissue disease-associated interstitial lung disease (CTD-ILD) and 51 controls with no chronic lung disease.

Each subject provided a breath sample by exhaling into a 3L Tedlar bag until the bag was full. The VOCs within each bag were transferred onto sorbent tubes before being analyzed via gas chromatography-time-of-flight-mass spectrometry (GC-ToF-MS), in an attempt to detect, measure and identify specific VOCs. Identities were assigned to VOCs of interest through spectrum recognition with reference to the National Institute of Standards and Technology (NIST) library and an in-house database of pure compounds, before being validated by an experienced mass spectrometrist.
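To give a feel for what spectrum recognition against a reference library involves, the sketch below scores a measured mass spectrum against library entries using cosine similarity. This is a simplified stand-in: the summary above does not describe the matching algorithm actually used, and the spectra, entry names and function name here are all made up for illustration.

```python
import math


def cosine_match(spectrum, reference):
    """Cosine similarity between two spectra given as {m/z: intensity}."""
    shared = set(spectrum) & set(reference)
    dot = sum(spectrum[mz] * reference[mz] for mz in shared)
    norm_s = math.sqrt(sum(v * v for v in spectrum.values()))
    norm_r = math.sqrt(sum(v * v for v in reference.values()))
    return dot / (norm_s * norm_r)


# Made-up spectra for illustration only (not real library entries).
measured = {43: 100.0, 58: 40.0, 71: 15.0}
library = {
    "reference_a": {43: 100.0, 58: 38.0, 71: 12.0},
    "reference_b": {45: 100.0, 62: 55.0, 47: 30.0},
}

# Pick the library entry whose spectrum is most similar to the measurement.
best = max(library, key=lambda name: cosine_match(measured, library[name]))
print(best)
```

In practice a high-scoring match is only a candidate identity, which is why the study also relied on pure-compound references and review by a mass spectrometrist.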



By applying multivariate Random Forest modelling and analysis, 34 VOCs were observed that had potential to discriminate between IPF patients and healthy controls. These VOCs were able to discriminate between groups with 84.6% accuracy (sensitivity of 81.1% and specificity of 88.2%). The Receiver Operating Characteristic (ROC) curve generated by this model using the 34 VOCs had an area under the curve (AUC) of 91.2%. The five most strongly discriminating VOCs for this model were ethanol, heptane, benzaldehyde, dimethyl sulfide, and an unidentified molecule.
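As a quick consistency check, the reported accuracy can be recovered from the sensitivity, the specificity and the group sizes given earlier (53 IPF patients, 51 controls). The short Python sketch below assumes that accuracy is simply the proportion of correctly classified subjects; the function name is illustrative and does not come from the paper.

```python
def accuracy_from_rates(sensitivity, specificity, n_cases, n_controls):
    """Overall accuracy implied by sensitivity, specificity and group sizes.

    Assumes accuracy = (true positives + true negatives) / all subjects.
    """
    true_positives = sensitivity * n_cases
    true_negatives = specificity * n_controls
    return (true_positives + true_negatives) / (n_cases + n_controls)


# IPF versus controls, using the figures quoted in the text.
ipf_accuracy = accuracy_from_rates(0.811, 0.882, n_cases=53, n_controls=51)
print(f"{ipf_accuracy:.1%}")
```

These rates correspond to roughly 43 of 53 IPF patients and 45 of 51 controls being classified correctly.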


Figure 1. VOC profiling for IPF patients versus controls (left) and ILD patients versus controls (right), showing ROC curves (A) and PCA plots of Random Forest proximities (B). The distance between individual points expresses their similarity.

In contrast, just 11 VOCs seemed able to distinguish between CTD-ILD patients and healthy controls. These 11 VOCs combined to produce a model that could differentiate these groups with an accuracy of 77.5% (sensitivity of 76.5% and specificity of 78.4%). The ROC AUC was 83.9%. In this model, the four most significant VOCs were identified as 2-heptanone, 4-penten-ol, 2,5-dimethyl furan and ethanol.
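The ROC AUC values quoted for these models can be read as the probability that a randomly chosen patient receives a higher model score than a randomly chosen control, with ties counted as half. A minimal Python sketch of that interpretation follows; the scores are made-up illustrative values, not data from the study, and the function name is hypothetical.

```python
def roc_auc(case_scores, control_scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    A pair counts 1 if the case scores higher, 0.5 if the scores tie.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical classifier scores (e.g. Random Forest vote fractions).
cases = [0.9, 0.8, 0.4]
controls = [0.7, 0.3, 0.2]
print(round(roc_auc(cases, controls), 3))
```

On this scale, an AUC of 0.5 would mean the scores carry no discriminating information, and 1.0 would mean every patient scores above every control.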

Sixteen VOCs were found capable of differentiating IPF from CTD-ILD, with an accuracy of 76.9% (sensitivity of 75.5% and specificity of 78.4%) and a ROC AUC of 83.8%. All 16 of these VOCs were identified by Plantier et al.


Table 1. Putative chemical identities of all discriminatory VOCs for the comparison between IPF and CTD-ILD.

There was a significant positive association between functional impairment of the lungs (as measured by total lung capacity and the 6-minute walk distance) and all of the identified discriminatory VOCs, suggesting that these VOCs (and, by extension, breath testing in general) hold potential as non-invasive biomarkers for monitoring disease progression, not just for diagnosis.



Plantier et al. have reported specific VOC profiles that differentiate IPF and CTD-ILD from both healthy controls and each other. Supporting the potential clinical utility of these prospective biomarkers, the ILD-specific VOC profile was found to have strong positive associations with existing clinical tests of lung function.

Commenting on other recent breath studies of interest for discriminating ILD subtypes, Plantier et al. said:

‘Recently, Enose technology was shown to distinguish ILD patients from healthy controls and to discriminate between different ILD subgroups. Although this is a very promising result that indicates breath analysis might be useful for timely diagnosis of specific ILDs in the future, such Enose studies i) do not provide insight into the biological mechanisms of diseases and ii) generate device-specific data that are hardly translatable to other devices or technologies.’

GC-MS, as used by Plantier et al., remains the gold standard for breath analysis, for both discovery and validation studies.

This paper has reported a fairly large number of VOCs as potential biomarkers. While it is unlikely that they will all turn out to be relevant, identifying a larger group of candidates increases the possibility that some will be clinically useful – and provides numerous targets for further studies.

Future research that attempts to reconfirm or validate this study’s findings will need to include much larger cohorts of patients. Many of the ILD and IPF subjects, and some of the controls, were on medications which could have been confounding factors. Larger studies might also be able to analyze patient subgroups by age, sex and smoking history, which are other potential confounding factors in the present study’s results.


If you’d like to conduct further research into breath-based biomarkers for IPF, ILD or any other respiratory disease, we can help. Our in-house breath research platform, Breath Biopsy OMNI, has been optimized for both the reliable and reproducible collection of breath samples (even across multiple sites) and the detection and identification of biorelevant compounds from breath. Many of the VOCs that this paper and other recent studies have identified as potential biomarkers of respiratory disease can already be found in our HRAM Library – the list of VOCs which we can consistently identify with high confidence.