Mouse transcriptome reveals potential signatures of protection and pathogenesis in human tuberculosis

Although mouse infection models have been extensively used to study the host response to Mycobacterium tuberculosis, their validity in revealing determinants of human tuberculosis (TB) resistance and disease progression has been heavily debated. Here, we show that the modular transcriptional signature in the blood of susceptible mice infected with a clinical isolate of M. tuberculosis resembles that of active human TB disease, with dominance of a type I interferon response and neutrophil activation and recruitment, together with a loss in B lymphocyte, natural killer and T cell effector responses. In addition, resistant but not susceptible strains of mice show increased B cell, natural killer and T cell effector responses in the lung upon infection. Notably, the blood signature of active disease shared by mice and humans is also evident in latent TB progressors before diagnosis, suggesting that these responses both predict and contribute to the pathogenesis of progressive M. tuberculosis infection. The pathobiological validity of mouse models of mycobacterial infection is sometimes questioned. O’Garra and colleagues demonstrate that mice share transcriptomic modules with active human tuberculosis and a characteristic type I IFN signature.

TB results in over 1.3 million deaths annually 1, yet most individuals infected with M. tuberculosis remain asymptomatic. Latent TB infection (LTBI) is defined by an interferon (IFN)-γ-release assay (IGRA) specific for M. tuberculosis antigens, although some patients may have subclinical disease and may progress to active TB 2. Protective immune responses against M. tuberculosis include CD4+ T lymphocytes and the cytokines IL-12, IFN-γ, tumor necrosis factor (TNF) [3][4][5][6] and IL-1 7, but these factors do not explain why most individuals control infection, whereas a subset goes on to develop active TB. A blood transcriptional signature in patients with active TB has implicated type I IFN in TB pathogenesis [8][9][10][11][12][13][14][15][16]. Immunological heterogeneity in the blood transcriptome of a cohort of recent TB contacts has been observed, with a small proportion of contacts expressing a persistent blood TB signature and subsequently progressing to active disease (LTBI-progressors) 16, suggesting a host response evolving toward active disease 16.
How the immune response in blood 8,15 reflects that occurring at disease sites is poorly understood, and sampling the latter in humans is prohibitive. The mouse TB model, owing to the richness of genetic and immunological tools available, has been invaluable in defining immune responses in the lung influencing disease outcome after infection 4,5,17. However, a global systematic analysis to determine potential common pathways of protection or pathogenesis in different TB mouse models and human disease has not been reported. A role for type I IFN in TB pathogenesis [8][9][10][11][12][13][14][15][16] is supported by mouse TB models 6,18 with elevated and sustained levels of type I IFN resulting from: (1) infection of particular genetic strains of mice with clinical isolates of M. tuberculosis [19][20][21][22][23]; (2) infection of hosts with genetic mutations in regulators of type I IFN, such as Tpl2 (ref. 24), IL-1 (ref. 7) or ISG15 (ref. 25); (3) administration of adjuvants, for example poly(I)C 7,26; or (4) viral co-infection 27. Whether it is the genetic strain of mouse or the M. tuberculosis pathogen itself that results in an immune response most resembling human TB is unclear. Although the spectra of human 28 and mouse 29 TB disease do not completely overlap, comparison of human TB with genetically diverse backgrounds of mice has established points of similarity in their response to M. tuberculosis. Some mouse strains recapitulate key elements of the pathogenesis of human TB disease, at the level of induction of necrotic TB lesions in the lungs 29. Whether the global immune response to M. tuberculosis in susceptible mouse strains resembles that of TB in humans is as yet unclear.
Here, we report that the human blood TB type I IFN-inducible signature 8,16 is recapitulated in susceptible C3HeB/FeJ mice infected with different strains of M. tuberculosis. Increased expression of granulocyte-associated genes in blood from patients with active TB, TB-susceptible mice and LTBI-progressors before TB diagnosis suggested their role in early disease pathogenesis. Conversely, under-abundance of B cell, NK cell and effector T cell signatures in blood from human patients with TB 16, LTBI-progressors and TB-susceptible mice, and yet over-abundance in lungs of M. tuberculosis-infected C57BL/6J resistant mice, reinforced their role in early disease control. The translationally relevant knowledge dataset presented here on potential pathways of protection and pathogenesis in human TB is easily accessible using an online ShinyApp: https://ogarra.shinyapps.io/tbtranscriptome/.

Fig. 1 | Human TB blood transcriptional signature is preserved in blood of TB-susceptible mice. Blood modules of coexpressed genes, derived using WGCNA from human TB datasets by Singhania et al. 16, are shown for blood RNA-seq datasets from patients with TB from London (n = 21 biologically independent samples), South Africa (n = 16 biologically independent samples) (both compared to London controls; n = 12 biologically independent samples) and Leicester (n = 53 biologically independent samples) (compared to Leicester controls; n = 50 biologically independent samples; Supplementary Table 2); human blood modules were tested on blood RNA-seq datasets obtained from different genetic strains of mice (C57BL/6J, resistant; C3HeB/FeJ, susceptible) infected with low and high doses of M. tuberculosis laboratory strain H37Rv or M. tuberculosis clinical isolate HN878 (n = 4 biologically independent samples per group for H37Rv infection and n = 5 biologically independent samples per group for HN878 infection from one experiment per M. tuberculosis infection, as depicted in Supplementary Fig. 1a), compared to their respective uninfected controls (Supplementary Table 3). Fold enrichment scores derived using quantitative set analysis for gene expression (QuSAGE) are depicted, with red and blue indicating modules over- or under-abundant compared to controls. Color intensity of the dots represents the degree of perturbation, indicated by the color scale. Size of the dots represents the relative degree of perturbation, with the largest dot representing the highest degree of perturbation within the plot. Only modules with fold enrichment scores with a false discovery rate (FDR) P value <0.05 were considered significant and depicted here (left and middle). Module name indicates biological processes associated with the genes within the module (Supplementary Table 1). Cell types associated with genes within each module were identified using the mouse cell-type-specific signatures from Singhania et al. 31 (right). Cell-type enrichment was calculated using a hypergeometric test, with only FDR P value <0.05 considered significant and depicted here (right panel). Color intensity represents significance of enrichment. *** indicates modules in which granulocyte-associated genes were detected, as listed in Table 1.

Results
The peak transcriptomic response in M. tuberculosis-infected mice. To determine whether a mouse blood transcriptional TB signature resembles that of human disease, we tested the human blood modular transcriptional TB signature 16 on RNA-seq data from blood of different genetic inbred strains of mice, C57BL/6J (resistant) and C3HeB/FeJ (susceptible), infected with low and high doses of the M. tuberculosis laboratory strain H37Rv or the clinical isolate HN878 (refs. 21,22) (Supplementary Fig. 1a-c; Fig. 1 and Supplementary Tables 1-3). The human blood TB signature 16 was first tested on microarray data from blood of H37Rv-infected BALB/c mice at different time points after infection, to establish the peak transcriptomic response; immune signatures were barely detectable at 14 d and 21 d after infection, but were greatest by 138 d (Supplementary Fig. 2a and Supplementary Table 3). Analysis of blood microarray data from an independent study 30 showed that the blood signature in 129S2 and C57BL/6NCrl mice was again barely detectable at 14 d after H37Rv infection, being observed robustly by 21 d, which was the endpoint of that study 30 (Supplementary Fig. 2a). Upon testing a lung disease modular signature 31 on microarray data from lungs of H37Rv-infected BALB/c mice, we detected a peak response 56 d after infection, with the response only starting to be detected by 28 d after infection (Supplementary Fig. 2b and Supplementary Table 4). On the basis of these data, we tested the human blood transcriptional TB signature 16 and lung disease modular signature 31 on the blood and lungs, respectively, from C57BL/6J and C3HeB/FeJ mice infected with HN878, at 26-56 d after infection (Supplementary Fig. 3a,b and Supplementary Tables 3 and 4). The peak response chosen was approximately 42 d after infection, which best showed a robust signature in blood and lungs from HN878-infected C57BL/6J and C3HeB/FeJ mice (Supplementary Fig. 3a,b and Supplementary Tables 3 and 4; tissues from HN878-infected susceptible mice were collected 33-35 d after infection due to excessive pathology).

Blood transcriptional TB signature in mouse and humans.
Principal-component analysis (PCA) at the peak response depicted distinct global transcriptional signatures in blood of C57BL/6J (resistant) and C3HeB/FeJ (susceptible) mice, infected with low and high doses of H37Rv or HN878, with the largest distance from uninfected mice observed in HN878-infected C3HeB/FeJ mice (Supplementary Fig. 1). The human blood modular transcriptional TB signature 16 was recapitulated in blood of HN878-infected C3HeB/FeJ mice and high-dose HN878-infected C57BL/6J mice (Fig. 1; for annotation see Supplementary Table 1; for genes see Supplementary Tables 2 and 3). Two over-abundant (red) IFN-inducible modules (HB12 and HB23) in blood from patients with TB 16 showed a graded increase from the C57BL/6J to the C3HeB/FeJ mice infected with low to high-dose H37Rv, further increased in C57BL/6J then C3HeB/FeJ mice infected with low to high-dose HN878 (Fig. 1). Expression levels of IFN-inducible modules (HB12 and HB23) in blood of HN878-infected C3HeB/FeJ mice most closely resembled the profile in human TB (Fig. 1). Likewise, other over-abundant modules of human TB, including inflammasome (HB3), innate/hemopoietic mediators (HB5), innate immunity/pathogen recognition receptor (PRR)/complement (C') (HB8) and myeloid/C'/adhesion (HB14) modules, were over-abundant in the HN878-infected C3HeB/FeJ mice and, to a lesser extent, in high-dose HN878-infected C57BL/6J mice (Fig. 1). Under-abundance (blue) of the human TB T cell (HB4) and B cell (HB15) modules 16 was recapitulated in the blood of HN878-infected C3HeB/FeJ mice (Fig. 1). In keeping with this, cellular deconvolution analyses 31 of blood RNA-seq data from M. tuberculosis-infected mice showed a marked decrease in the percentages of B cell and CD4+ T cell fractions (Supplementary Fig. 1d).
Cell types associated with each module were identified by comparing cell-type-specific gene signatures using the mouse RNA-seq dataset from ImmGen Ultra Low Input (ULI) (ImmGen Consortium, GSE109125; www.immgen.org) as described 31 and analyzed against the mouse gene orthologs within each human blood TB module (Fig. 1). Cell-type-specific enrichment data validated modular annotation for the blood T cell (HB2 and HB4) modules, with enrichment for αβ and γδ T cells; the NK and T cell (HB21) module, with enrichment for αβ and γδ T cells and innate lymphocytes; and the B cell (HB15) module, with enrichment for B cells (Fig. 1). This approach also led to the discovery of previously unappreciated gene signatures, most strikingly, a dominance of granulocyte-associated genes within the inflammasome (HB3) and innate immunity/PRR/C' (HB8) modules (Fig. 1). This set of granulocyte-associated genes was highly expressed in blood from HN878-infected C3HeB/FeJ mice and human TB cohorts (Table 1; Supplementary Table 3 and Supplementary Table 2). Increased expression of granulocyte-associated genes in blood of HN878-infected C3HeB/FeJ mice was reinforced by data obtained from cellular deconvolution analyses 31 (Supplementary Fig. 1d).
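The cell-type enrichment step described above (a hypergeometric test followed by an FDR cutoff of 0.05) can be illustrated with a minimal sketch. This is not the authors' pipeline: the gene identifiers, the signature lists, the size of the gene universe and the use of the Benjamini-Hochberg procedure for the FDR are all assumptions made for illustration.

```python
from math import comb


def hypergeom_sf(overlap, module_size, signature_size, universe_size):
    """P(X >= overlap) when module_size genes are drawn from a universe that
    contains signature_size cell-type-specific genes."""
    upper = min(module_size, signature_size)
    total = comb(universe_size, module_size)
    tail = sum(
        comb(signature_size, k) * comb(universe_size - signature_size, module_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def cell_type_enrichment(module_genes, signatures, universe, alpha=0.05):
    """Return {cell type: adjusted P value} for signatures enriched in the module."""
    background = set(universe)
    module = set(module_genes) & background
    names = sorted(signatures)
    p_values = []
    for name in names:
        signature = set(signatures[name]) & background
        p_values.append(
            hypergeom_sf(len(module & signature), len(module), len(signature), len(background))
        )
    q_values = benjamini_hochberg(p_values)
    return {name: q for name, q in zip(names, q_values) if q < alpha}


# Toy data (hypothetical identifiers, not genes from the study).
universe = [f"g{i}" for i in range(100)]
signatures = {
    "granulocyte": [f"g{i}" for i in range(0, 10)],
    "B cell": [f"g{i}" for i in range(10, 20)],
}
module = [f"g{i}" for i in range(0, 8)] + ["g50", "g51"]

significant = cell_type_enrichment(module, signatures, universe)
print(significant)
```

In this toy case eight of the ten module genes fall in the granulocyte signature, so only that cell type passes the cutoff. The choice of gene universe strongly affects the P values, so in practice it would need to match the set of genes that could have been assigned to a module.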

Host and M. tuberculosis genetic differences drive lung TB signatures.
To determine the transcriptional response at the site of infection, RNA-seq data were obtained from the lungs of the same C57BL/6J and C3HeB/FeJ inbred strains of mice infected with H37Rv or HN878, used for the blood data from Fig. 1 (Supplementary Fig. 1a). PCA depicted distinct global transcriptional signatures for uninfected mice and the different strains of H37Rv- or HN878-infected mice, with the largest distance from uninfected controls observed in HN878-infected C3HeB/FeJ mice (Supplementary Fig. 4). The lung transcriptional response depicted a similar but more accentuated difference between the infected and uninfected groups than in blood (Supplementary Fig. 1c and Supplementary Fig. 4). A lung disease modular signature 31 was tested on the lung RNA-seq data from the different groups of infected mice, to identify coexpressed groups of genes across the lung (Fig. 2). The type I IFN/Ifit/Oas (L5) module was over-abundant in the lungs of H37Rv- and HN878-infected C57BL/6J and C3HeB/FeJ mice to similar levels, as shown by eigengene expression (Fig. 2a,b). Six modules (L10-L15), dominated by an over-abundance of granulocyte-, macrophage- and myeloid-specific genes, including modules with myeloid/granulocyte (L10) and IL-17 pathway/granulocytes (L11) function, showed the highest eigengene expression in the HN878-infected C3HeB/FeJ mice (Fig. 2a Table 4). However, this immunoglobulin h/k (L25) module was not changed in the lungs of high-dose HN878-infected C57BL/6J mice or C3HeB/FeJ mice (Fig. 2a,d), correlating with mice showing greater TB susceptibility (Supplementary Fig. 1b). The immunoglobulin h/k (L25) module was also highly abundant in the lungs of BALB/c mice infected with low-dose H37Rv, in keeping with its relatively resistant phenotype (Supplementary Fig. 2b).
The Ifng/Gbp/antigen presentation/C' (L7) and cytotoxic/T cells/innate lymphocytes (ILCs)/Tbx21/Eomes/B cells (L35) modules were over-abundant in the lung across both strains of H37Rv- or HN878-infected mice (Fig. 2a,e and Supplementary Fig. 3b) and H37Rv-infected BALB/c mice (Supplementary Fig. 2b), but less abundant in lungs from HN878-infected C3HeB/FeJ mice (Fig. 2a and Supplementary Fig. 3b), as shown quantitatively by eigengene profiles (Fig. 2e). Independent derivation and annotation yielded similar transcriptional modules across all samples from uninfected and M. tuberculosis-infected mice, resulting in 27 modules (ML1-ML27; Supplementary Fig. 5 and Supplementary Tables 5 and 6). The type I IFN/Stat2/Mx1 (ML2) and type I IFN signaling (ML21) modules were similarly over-abundant in the lungs of H37Rv- and HN878-infected C57BL/6J and C3HeB/FeJ mice (Supplementary Fig. 5a,b). Over-abundance of modules ML19 and ML27, enriched for granulocyte/macrophage-specific genes, showed highest eigengene expression in HN878-infected C3HeB/FeJ mice (Supplementary Fig. 5a,c), confirmed by cell-type-specific enrichment analysis (Supplementary Fig. 5a). The Ifng/Gbp/antigen presentation/C' (ML3) and T cell/NK/ILC/antigen-presenting cell (APC)/B cell (ML11) modules were over-abundant in lungs from both strains of H37Rv- or HN878-infected mice, although significantly less abundant in HN878-infected C3HeB/FeJ mice, as shown quantitatively by eigengene profiles (Supplementary Fig. 5a,d) and validated by cell-type-specific enrichment for T cells, dendritic cells, ILCs and B cells (Supplementary Fig. 5a). Thus, two complementary and independently derived modular tools revealed similar transcriptional signatures in the lungs of M. tuberculosis-infected susceptible mice, indicating increased type I IFN and granulocyte-associated responses and decreased IFN-γ, NK, T effector and B cell responses (Fig. 2 and Supplementary Fig. 5).
The over-abundance of inflammatory modules associated with granulocytes observed using the two independent modular approaches is in keeping with the more severe inflammation observed by hematoxylin and eosin (H&E) staining in the lungs of HN878-infected C3HeB/FeJ mice and high-dose HN878-infected C57BL/6J mice (Fig. 3 and Supplementary Fig. 6). This was accompanied by greater numbers of M. tuberculosis bacteria observed in the lungs of these mice by Ziehl-Neelsen (ZN) staining (multibacillary infections, Fig. 3 and Supplementary Fig. 6).
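The eigengene profiles used in this section summarize each module by the first principal component of all genes within the module. The following is a minimal sketch of that idea, not the authors' implementation: it standardizes each gene across samples, finds the leading component by power iteration, and aligns its sign with the mean module expression (a convention assumed here, as in common WGCNA usage). The expression values are invented toy numbers.

```python
import math


def standardize(row):
    """Center one gene's values across samples and scale them to unit variance."""
    n = len(row)
    mean = sum(row) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in row) / (n - 1))
    return [(x - mean) / sd if sd > 0 else 0.0 for x in row]


def module_eigengene(expr, n_iter=200):
    """Return one value per sample: the first principal component of the module.

    expr is a list of gene rows, each a list of per-sample expression values.
    Assumes at least one gene varies across samples.
    """
    z = [standardize(row) for row in expr]
    n_samples = len(z[0])
    mean_profile = [sum(row[j] for row in z) / len(z) for j in range(n_samples)]
    # Start from the mean profile; fall back to the first gene if it is flat.
    v = mean_profile if any(abs(x) > 1e-12 for x in mean_profile) else z[0]
    for _ in range(n_iter):
        # Power iteration on Z^T Z: project samples onto genes and back.
        scores = [sum(r[j] * v[j] for j in range(n_samples)) for r in z]
        v = [sum(z[i][j] * scores[i] for i in range(len(z))) for j in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    # Orient the component so that it rises with mean module expression.
    if sum(a * b for a, b in zip(v, mean_profile)) < 0:
        v = [-x for x in v]
    return v


# Toy module: three genes, three uninfected then three infected samples.
toy_module = [
    [1.0, 2.0, 1.0, 5.0, 6.0, 5.0],
    [2.0, 2.0, 3.0, 7.0, 8.0, 7.0],
    [0.0, 1.0, 0.0, 4.0, 4.0, 5.0],
]
eigengene = module_eigengene(toy_module)
print([round(x, 3) for x in eigengene])
```

Because every gene in the toy module is higher in the last three samples, the eigengene is negative for the first three samples and positive for the last three, which is the kind of contrast a box plot of eigengene expression by group would display.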
Fig. 2 | [...] 31 (L1-L38) tested in mouse lung TB samples from different genetic strains of mice (C57BL/6J, resistant; C3HeB/FeJ, susceptible) infected with low and high doses of M. tuberculosis laboratory strain H37Rv or the M. tuberculosis clinical isolate HN878 (n = 3 biologically independent samples per group for low-dose HN878 infection of C3HeB/FeJ and n = 5 biologically independent samples per group for all other groups, as depicted in Supplementary Fig. 1a), compared to their respective uninfected controls (Supplementary Table 4). Red and blue indicate modules over- or under-abundant compared to the controls. Color intensity of the dots represents the degree of perturbation, indicated by the color scale. Size of the dots represents the relative degree of perturbation, with the largest dot representing the highest degree of perturbation within the plot. Only modules with fold enrichment scores with FDR P value <0.05 were considered significant and depicted here. GCC, glucocorticoid; K channel, potassium channel; TM, transmembrane; Ubiq, ubiquitination. b-e, Box plots depicting the module eigengene expression (the first principal component for all genes within the module) are shown for uninfected (Uninf) and M. tuberculosis-infected (low dose; high dose) C57BL/6J and C3HeB/FeJ mice, for modules: type I IFN/Ifit/Oas (L5) (b); IL-17 pathway/granulocytes (L11), inflammation/IL-1 signaling/myeloid cells (L12), myeloid cells/Il1b/Tnf (L13) (c); immunoglobulin h/k-enriched (L25) (d); cytotoxic/T cells/ILCs/Tbx21/Eomes/B cells (L35) and Ifng/Gbp/antigen presentation (L7) (e).

Degree of preservation of lung modules in human and mouse blood. It is unclear to what extent the airway transcriptional signature is reflected in the blood during M. tuberculosis infection. Certain immune responses across a range of experimental models of disease are well preserved between lung and blood, some not preserved and others only discernible in blood with previous knowledge from the airway response 31. To address this question in TB, the mouse lung modular TB signature (Supplementary Fig. 5a) was tested on the RNA blood samples from the different cohorts of human patients with TB and from the different mouse TB models (Supplementary Fig. 7a). The mouse lung modules showed significant preservation in human and mouse blood, as assessed by Z summary scores, indicating the degree of preservation, with scores >10 considered as strongly preserved (Supplementary Fig. 7b,c). Type I IFN-associated modules (ML2 and ML21) (Supplementary Fig. 5a) were over-abundant in human and mouse blood, being most over-abundant in HN878-infected susceptible mice (Supplementary Fig. 7a). The lung type I IFN/Stat2/Mx1 module (ML2) was the most highly preserved module in human blood (Supplementary Fig. 7b) and the second-most-preserved module in mouse blood (Supplementary Fig. 7c), and the type I IFN signaling module (ML21) stood out as the third-most-preserved module in both human and mouse blood (Supplementary Fig. 7b,c). The lung Ifng/Gbp/antigen presentation/C' module (ML3) was weakly over-abundant in the blood of human patients with TB and in the blood of M. tuberculosis-infected mice (Supplementary Fig. 7a) to a lesser extent, but highly preserved in both human and mouse blood (Supplementary Fig. 7b,c). The overall increased abundance of the Ifng/Gbp/antigen presentation/C' module (ML3) was largely attributable to over-expression of genes such as GBP/Gbp genes and C' genes (Supplementary Fig. 7a; Supplementary Tables 2 and 3, and ShinyApp). However, the Ifng gene itself, although upregulated in the blood of M. tuberculosis-infected resistant mice, was barely upregulated in the blood of HN878-infected susceptible mice and IFNG was downregulated in the blood from patients with TB (Supplementary Tables 2 and 3 and ShinyApp).
The lung macrophage/granulocyte modules (ML19 and ML27) and myeloid cell signaling module (ML10) were also over-abundant in blood of patients with active TB and most over-abundant in the blood of HN878-infected susceptible mice (Supplementary Fig. 7a). While lung ML19 and ML10 modules were highly preserved in both human and mouse blood, the ML27 module was only highly preserved in mouse blood and to a much lesser extent in human blood (Supplementary Fig. 7b,c). Lung modules associated with T, NK and B cells (ML11 and ML13) were under-abundant in HN878-infected susceptible mouse blood and blood from all human TB cohorts (Supplementary Fig. 7a), with ML11 being highly preserved in both human and mouse blood (Supplementary Fig. 7b,c). These findings regarding the preservation of over- or under-abundant lung modules in the blood from human patients with TB and TB-susceptible mouse models (Supplementary Fig. 7) are in keeping with the transcriptional signatures observed on testing human blood TB modules on blood from humans and mouse models of TB (Fig. 1).

Modular gene networks in human versus mouse TB.
We further interrogated the changes in gene expression of the key modules, HB3, HB15 and HB21, between the blood and lungs of resistant and susceptible mice infected with the different strains of M. tuberculosis, compared to human blood. To do so, we examined the expression of the top 50 'hub' genes with high intramodular connectivity within the mouse data, on human blood from patients with TB and blood and lungs from mice infected with M. tuberculosis (Fig. 4). In keeping with our current findings that granulocyte-specific genes are upregulated within the originally named inflammasome human blood TB module (HB3) 16, granulocyte-specific genes were among the 50 'hub' genes within that module (now 'inflammasome/granulocyte'; Fig. 4). The granulocyte-specific genes Cd177, Elane, Mmp8, Mpo, Ncf1, Camp, Lcn2, S100a6 and Ltf (Fig. 4 and Supplementary Fig. 8a; ShinyApp), which have been associated with neutrophil recruitment and activation 32, were most highly differentially expressed in blood from patients with TB and M. tuberculosis-infected susceptible mice. Expression of these genes in mouse blood and lungs revealed a graded increase from the C57BL/6J to the C3HeB/FeJ mice infected with low to high-dose H37Rv, with a further increase observed in C57BL/6J to the C3HeB/FeJ mice infected with low to high-dose HN878 (Fig. 4 and Supplementary Fig. 8a). The 50 'hub' genes within the innate immunity/PRR/C' module (HB8) also showed enrichment for granulocyte-specific genes, including Mmp9, Alox5ap, Ncf2, Mxd1, S100a8 and S100a9, also associated with neutrophil activation (Supplementary Fig. 8b), and were most highly expressed in blood from human patients with TB and blood and lung from HN878-infected C3HeB/FeJ mice (Fig. 4 and Supplementary Fig. 8b; ShinyApp). Increased expression of these neutrophil-specific genes in the lungs of TB-susceptible HN878-infected mice was mirrored by the increased numbers of neutrophils detected in the lungs of these mice by immunohistochemistry (Fig. 5 and Supplementary Fig. 6), confirming the H&E data (Fig. 3 and Supplementary Fig. 6). Collectively, these data support a major role for neutrophils in human TB pathogenesis, similar to the previously reported role for neutrophils in TB-susceptible strains of mice [33][34][35].
The top 50 'hub' genes within the human B cell module (HB15), including Cd19, Pax5, Spib, Cd79 and Cd22, were downregulated in the blood of human patients with TB and M. tuberculosis-infected mice (Fig. 4 and Supplementary Fig. 8c; ShinyApp). Most of the B cell-specific top 'hub' genes were upregulated in the lungs of H37Rv-infected mice, but strikingly downregulated in the lungs of high-dose HN878-infected C57BL/6J and C3HeB/FeJ mice (Fig. 4 and Supplementary Fig. 8c; ShinyApp). This difference in expression of B cell-specific genes between the lungs of relatively TB-resistant and TB-susceptible mouse models was mirrored by differences in the numbers of B cells detected by B cell-specific immunofluorescent staining of lungs from these mice (Fig. 5 and Supplementary Fig. 6). While vastly increased numbers of B cells were observed in the lungs of H37Rv-infected mice, with accompanying formation of B cell follicles, these were practically absent in the lungs of C57BL/6J mice infected with high-dose HN878 and HN878-infected C3HeB/FeJ mice (Fig. 5 and Supplementary Fig. 6). These data support a possible role for B cells in protection against M. tuberculosis infection, as has previously been proposed 36,37.

Fig. 4 | Gene networks of specific TB modules in human blood from patients with TB and blood and lung from M. tuberculosis-infected mice.
Differential expression of genes from human blood modules inflammasome/granulocytes (HB3), B cells (HB15) and NK and T cells (HB21), depicting the top 50 'hub' network of genes with high intramodular connectivity found within the mouse data (mouse genes most connected with all other genes within the module), is shown for data from blood from patients with TB (Leicester cohort) and blood and lungs from mice infected with M. tuberculosis, each against their respective controls. An enlarged representative network showing human gene names is shown for human blood (top) and an enlarged representative network showing mouse gene names is shown for blood samples from C3HeB/FeJ mice infected with high-dose HN878 (bottom). Each gene is represented as a circular node with edges representing the correlation between the gene expression profiles of two respective genes. Color of the node represents log2 fold change of the gene for human blood TB samples or mouse blood and lung samples from M. tuberculosis-infected mice compared to respective controls.
In keeping with the under-abundance of the human blood NK and T cells module (HB21), the top 50 'hub' genes in this module were downregulated in the blood of patients with active TB (Fig. 4 and Supplementary Fig. 8d) as previously reported 16. Although upregulated in the blood and lungs of H37Rv-infected C57BL/6J and C3HeB/FeJ mice and HN878-infected C57BL/6J mice, the majority of these 50 'hub' genes were downregulated in the blood and either minimally or not upregulated in the lungs from HN878-infected C3HeB/FeJ mice (Fig. 4 and Supplementary Fig. 8d). These included Tbx21, Gzma, Eomes, Cd8a, Nfatc2, Fasl, Nkg7, Klrd1, Klrg1, Ifng and Runx3, reflecting downregulation of effector T cells and NK cells in the blood of patients with TB and HN878-infected susceptible C3HeB/FeJ mice (Fig. 4 and Supplementary Fig. 8d; ShinyApp). Minimally altered gene expression was mirrored by a decrease in CD3+ T cells in HN878-infected C3HeB/FeJ mouse lungs, as shown by immunofluorescence (Fig. 5 and Supplementary Fig. 6), reflecting an absence of activated effector T cells required for protection against M. tuberculosis infection [4][5][6].
Heat maps of the top 50 'hub' genes from the human blood TB modules interferon/PRR (HB12) and interferon/C'/myeloid (HB23) demonstrated a large number of genes that were over-expressed in human blood from the London and Leicester TB cohorts and were similarly over-expressed in mouse blood from HN878-infected C3HeB/FeJ mice (Supplementary Fig. 9). In contrast, many of these type I IFN-inducible genes in the HB12 module, including Il1rn, Ifit1, Ifit2, Oas2 and Stat2, were upregulated to a lower extent in the blood of H37Rv-infected C57BL/6J mice compared to HN878-infected C3HeB/FeJ mice (Supplementary Fig. 9). The majority of the top 50 'hub' genes from the interferon/PRR (HB12) and interferon/C'/myeloid (HB23) human modules were upregulated in the lungs of all M. tuberculosis-infected mice, with the highest expression observed in the lungs from HN878-infected C3HeB/FeJ mice (Supplementary Fig. 9).
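The 'hub' genes discussed in this section are those with high intramodular connectivity, that is, the genes most connected with all other genes within a module, where connections reflect correlation between expression profiles. A minimal sketch of such a ranking follows. It assumes an unsigned adjacency of the form |Pearson r| raised to a soft-threshold power; the power of 6 and the gene names and values are hypothetical, and the study's actual network settings are not given in this text.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def intramodular_connectivity(expr, power=6):
    """Sum of adjacencies |r|**power between each gene and all others in the module.

    expr maps gene name -> list of per-sample expression values.
    The soft-threshold power is an assumed, illustrative value.
    """
    genes = list(expr)
    connectivity = {g: 0.0 for g in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            adjacency = abs(pearson(expr[gi], expr[gj])) ** power
            connectivity[gi] += adjacency
            connectivity[gj] += adjacency
    return connectivity


def top_hub_genes(expr, n_top=50, power=6):
    """Genes ranked by decreasing intramodular connectivity (ties broken by name)."""
    k = intramodular_connectivity(expr, power)
    return sorted(k, key=lambda g: (-k[g], g))[:n_top]


# Toy module with hypothetical genes: three co-expressed, one unrelated.
toy_module = {
    "gene_a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0, 10.0, 13.0],
    "gene_c": [1.0, 3.0, 2.0, 5.0, 4.0, 6.0],
    "gene_d": [3.0, 1.0, 4.0, 1.0, 5.0, 2.0],
}
hubs = top_hub_genes(toy_module, n_top=3)
print(hubs)
```

With `n_top=50` the same function would return the top 50 genes of a full-size module. Raising the correlation to a power suppresses weak correlations, so the weakly correlated `gene_d` contributes almost nothing to the connectivity of the other genes and ranks last.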

Blood signatures reflect the extent of lung pathology in TB.
Correlation between the whole-blood TB signature and the extent of lung radiographic burden of human disease has been reported 8. A quantitative measure of the transcriptional signature, determined using the molecular distance to health, showed a graded increase in the signature across patients categorized with no disease to minimal, moderate and advanced disease 8. Here we show that the extent of the blood modular signatures associated with type I IFN-inducible genes (HB12 and HB23), shown quantitatively by eigengene expression, positively correlated with the extent of lung pathology assessed by the combined relative lesion burden and percentage of tissue affected scores in the TB mouse models (Fig. 6a). The type I IFN-associated blood modular signature was lowest in the more resistant mouse models of TB, increasing with different levels of lung pathology, peaking in the HN878-infected C3HeB/FeJ mice (Fig. 6a). Similarly, the level of the type I IFN-associated blood modular signatures in human TB, here shown by eigengene expression, also positively correlated with the radiographic extent of lung disease in patients with different degrees of disease (Fig. 6b). The neutrophil-associated (HB3 and HB8) blood modular signatures, likewise, showed an increased eigengene expression in the blood of mice in the different TB models, correlating with an increased lung neutrophil score (Fig. 6c) and the most severe lung lesions, as assessed histopathologically (Fig. 6a). The neutrophil-associated modular blood signature was highest in the HN878-infected C3HeB/FeJ mice, correlating with the highest lung neutrophil score (Fig. 6c). Although the neutrophil lung score was similarly high in the high-dose HN878-infected C57BL/6J mice, the blood neutrophil-associated modular signature remained low in these mice (Fig. 6c). The blood neutrophil-associated signature in human TB also positively correlated with the radiographic extent of lung disease in patients with TB (Fig. 6d), again supporting a role for neutrophils in TB pathogenesis.
In contrast to the increased type I IFN and neutrophil-associated blood modular signatures in TB, the blood B cell (HB15) and NK and T cell (HB21) modular signatures showed a decrease in the blood of M. tuberculosis-infected mice showing advanced lung disease, specifically the HN878-infected C3HeB/FeJ mice and, to a lesser extent, the high-dose HN878-infected C57BL/6J mice (Fig. 6e). This decreased blood signature in advanced disease correlated with a decrease in the lung lymphocyte score, which in the more resistant mice had increased on infection (Fig. 6e). In human TB, these blood B cell (HB15) and NK and T cell (HB21) modular signatures showed a similar decrease in the blood, inversely correlating with the extent of lung radiographic disease (Fig. 6f).

Fig. 6 | [...] Supplementary Fig. 1a) (a,c,e); and for human blood samples from the London TB cohort divided into healthy control (no X-ray; n = 12 biologically independent samples) and patients with TB, grouped according to the radiographic extent of disease as: no disease (n = 21 biologically independent samples), minimal (n = 7 biologically independent samples), moderate (n = 6 biologically independent samples) or advanced (n = 8 biologically independent samples), as described by Berry et al. 8 [...]
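The relationships described in this section pair a per-sample module eigengene with an ordinal pathology or radiographic score. The text does not state which correlation statistic was used, so the sketch below assumes a Spearman rank correlation, which suits ordinal scores; the eigengene values and scores are invented toy numbers.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy values: blood module eigengene per animal and a lung pathology score.
eigengene_values = [-0.4, -0.3, -0.1, 0.1, 0.3, 0.5]
pathology_scores = [0, 1, 1, 2, 4, 5]

rho = spearman(eigengene_values, pathology_scores)
print(round(rho, 3))
```

A positive rho corresponds to the pattern reported for the type I IFN and neutrophil-associated modules, and a negative rho to the pattern reported for the B cell and NK and T cell modules.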

Modular blood signatures in LTBI-progressors.
We next set out to determine whether the type I IFN- (HB12 and HB23), neutrophil- (HB3 and HB8), B cell- (HB15) and NK and T cell- (HB21) associated modular signatures, determined in human active TB and susceptible mouse models of TB, could be detected during early M. tuberculosis infection of humans. To this end, we analyzed RNA-seq data from the blood of recent contacts of patients with active TB who were subsequently shown to progress to active TB (LTBI-progressors), patients with active TB and healthy controls (IGRA− and IGRA+ contacts who did not progress to TB) 16 (Fig. 7). The interferon/PRR (HB12) and interferon/C'/myeloid (HB23) blood modular signatures, shown quantitatively by eigengene expression, were increased in the blood of LTBI-progressors to the same level as in patients with active TB compared to healthy controls (Fig. 7a). As shown for the London cohort (Fig. 6b), the type I IFN-associated modular signatures also correlated with the radiographic extent of lung disease in this independent cohort (Fig. 7a). Type I IFN-inducible genes in these modular signatures included STAT1, STAT2, IRF9, OAS1, OASL, IFITM1, ISG15 and IL1RN, which were expressed at the same level in the blood of LTBI-progressors and patients with active TB (Fig. 8a; ShinyApp). Again, the degree of expression of these individual genes positively correlated with the extent of radiographic signs of disease, being already increased in the blood of patients with minimal disease (Fig. 8a). We also observed increased expression of type I IFN-inducible genes (Fig. 8a) in an independent cohort of LTBI-progressors compared to individuals with LTBI who remained healthy 38 .
Notably, expression of the neutrophil-associated (HB3 and HB8) modular signatures was also increased to high levels in the blood of LTBI-progressors, to the same level as seen in blood of patients with active TB, as compared to healthy controls (Fig. 7b). The extent of these neutrophil-associated blood signatures positively correlated with the radiographic signs of lung disease (Fig. 7b). Confirming the contribution of genes associated with neutrophil activation and recruitment, CD177, NCF1, NCF2, LRG1, MMP9, S100A8, S100A9 and ALOX5AP were upregulated in the blood of LTBI-progressors from both cohorts compared to controls to a similar level as in the blood of patients with active TB, their level of expression correlating with increased signs of radiographic lung disease (Fig. 8b). The increased expression of genes associated with neutrophil activation and recruitment in the blood of patients with TB with minimal radiographic signs of disease and LTBI-progressors (Fig. 8b) points to an unappreciated role for neutrophils in early disease.
The expression of the B cell- (HB15) and NK and T cell- (HB21) associated modular signatures was decreased in the blood of LTBI-progressors to the same extent as in active TB (Fig. 7c) compared to controls, again correlating with increased radiographic signs of disease (Fig. 7c). Expression of the NK and T cell-specific genes IFNG, EOMES, TBX21, GZMA, KLRD1 and NKG7 was similarly decreased in the blood of LTBI-progressors in both cohorts compared to healthy controls and in the blood of patients with minimal signs of disease, although further decreased in those with advanced signs of radiographic disease (Fig. 8c). Since T cell and NK cell genes convey protection against M. tuberculosis infection [4][5][6]39,40 , their loss may contribute to progression to active TB. Collectively our findings predict that a dominance of gene expression associated with a type I IFN response and neutrophil activation and recruitment, together with a loss of NK and effector T cell responses, early after infection with M. tuberculosis, may contribute to progression to active TB.

Fig. 8 (panel titles): Genes in type I IFN modules HB12 and HB23; Genes in neutrophil modules HB3 and HB8; Genes in NK and T cell module HB21. (Legend, partial: … Table 7).)

Discussion
Here we show that the IFN-inducible human blood TB transcriptional signature 16 is recapitulated in blood from M. tuberculosis HN878-infected TB-susceptible C3HeB/FeJ mice, whereas this signature is minimal in blood from M. tuberculosis H37Rv-infected resistant C57BL/6J mice. Combining our modular signature data with cell-type-specific signatures 31 we reveal an increase in neutrophil-associated genes in the blood of TB-susceptible mice and patients with TB. Genes associated with type I IFN responses and with neutrophil recruitment and activation were increased in LTBI-progressors before diagnosis, suggesting an unappreciated role for neutrophils in early disease. Decreased B cell, NK and T cell signatures of human active TB 8,16 were observed in the blood of infected TB-susceptible mice and LTBI-progressors, whereas these were upregulated on infection in the lungs of TB-resistant mice, suggesting that their early loss contributes to progression to active TB.
Neutrophils are abundant in the lung lesions of M. tuberculosis-infected susceptible mice, contributing to TB pathogenesis 33,34 , whereas lesions of infected resistant mice contain only scattered neutrophils and are instead dominated by lymphocytes and macrophages 41 . M. tuberculosis-infected neutrophils have been detected within inflammatory lung granulomas of patients with active TB 42,43 . We herein reveal low levels of a neutrophil-associated signature in lungs of M. tuberculosis-infected C57BL/6J mice, which was maximally increased in HN878-infected susceptible C3HeB/FeJ mice. This was validated by histological analysis, although S100A9 neutrophil staining was lost due to the necrotic nature of the lesions. This led to our discovery of increased expression of neutrophil-associated genes within the over-abundant human TB blood modules, originally annotated as 'inflammasome' and 'innate immunity/PRR/C'' 16 , which we now rename as 'inflammasome/granulocyte' and 'innate immunity/PRR/C'/granulocyte'. Previous studies showed no change by flow cytometry in neutrophil numbers in the blood of patients with active TB 8 , suggesting that the over-abundance of this granulocyte-associated signature of activation and recruitment may be attributable to a subset of activated neutrophils, which has circulated to the blood from the lung. Whether these neutrophils are carriers of M. tuberculosis to the blood in human TB, where the bacteria have recently been detected in early disease 44 , remains to be investigated. The granulocyte-associated signature was also increased in blood from LTBI-progressors before diagnosis, suggesting a previously unappreciated role for neutrophils in early progression to disease.
The type I IFN-associated signature widely reported in the blood of patients with active TB [8][9][10][11][12][13][14][15][16] was also present in blood from M. tuberculosis-infected mice, with the highest levels observed in the more susceptible models, correlating with more severe lung pathology. The type I IFN-inducible signature resulted from the host genetic background and the dose and strain of M. tuberculosis, possibly explaining differing reports regarding the role of type I IFN in TB pathogenesis 18,[20][21][22]24,27,45,46 . The enhanced type I IFN-associated signature in the C3HeB/FeJ mice is in keeping with a recent report that B6.Sst1S congenic mice, which carry the C3H 'sensitive' allele of the Sst1 locus that renders them highly susceptible to M. tuberculosis infection 47 , exhibit markedly increased type I IFN signaling, which contributes to their high TB susceptibility via induction of the IL-1 receptor antagonist (IL-1Ra) 48 . We show that Il1rn gene expression is increased in mouse blood and lung on infection, correlating with increasing susceptibility to TB in C3HeB/FeJ mice infected with HN878, an M. tuberculosis strain reported to enhance type I IFN induction and TB pathogenesis 21,22 . The IL1RN gene was highly expressed in blood from patients with TB, but also in the LTBI-progressors, along with other type I IFN-inducible genes, such as OAS1, IFITM1 and ISG15, suggesting that type I IFN-inducible genes may contribute to early TB pathogenesis. Genes of the complement cascade were also upregulated in the blood from LTBI-progressors, in keeping with previous reports 15,49 .
Upregulation of both type I and II IFN has been reported before diagnosis of patients with TB 15 . However, we herein report that in patients with TB, the IFNG gene itself is downregulated in the blood, alongside a number of key molecules, including TBX21, EOMES, GZMA, GZMB, NKG7 and KLRD1, suggesting a loss of the protective effector function of T cells and NK cells 5,6,39,40 . This decrease was also observed in LTBI-progressors, suggesting that the decreased expression of IFNG and other genes associated with effector and cytotoxic functions early after M. tuberculosis infection may contribute to disease progression. This supports reports that IFN-γ, cytotoxic effector molecules and NK cells are important for protection against M. tuberculosis infection in both mouse models and human disease 5,6,39,40 . In keeping with this, genes associated with effector and cytotoxic NK and T cell responses (Nkg7, Klrd1, Gzma, Gzmb and Tbx21), as well as Ifng, were upregulated in the blood and lungs of M. tuberculosis-infected TB-resistant C57BL/6J mice but drastically reduced in the blood and lungs from HN878-infected susceptible C3HeB/FeJ mice, similar to the pattern in blood from LTBI-progressors and patients with active TB. Decreased IFNG expression in blood of patients with TB and decreased Ifng expression in the blood and lungs of susceptible mice parallel the increase in neutrophils, supporting previous reports that IFN-γ regulates neutrophil function 35 , thus limiting lung inflammation and TB exacerbation.
Our findings of a decrease in the B cell-associated modular expression in the blood of M. tuberculosis-infected susceptible mice are in keeping with reports on the reduction in abundance of total B cells in human TB 8,40 , largely driven by a reduction in circulating naive B cells 40 . This under-abundance of the B cell-associated module was also observed in blood from LTBI-progressors, although maximal in patients with TB and susceptible mice with advanced signs of lung disease. Reduction in peripheral B cells could be due to preferential sequestration of these cells at the site of infection or diminished output of B cells from the bone marrow 36,37 . Our data support a combination of both, depending on the extent of the disease. The top 50 interacting 'hub' genes in the B cell-associated module showed increased expression in the lungs from M. tuberculosis-infected resistant mice, but were decreased in lungs from HN878-infected susceptible mice, as verified by histopathology. B cells at the site of infection could be contributing to control of M. tuberculosis infection in the resistant mice, as has been proposed elsewhere 36,37 .
Using a combination of mouse TB models and human TB cohorts we provide data to suggest that dominance of a type I IFN response and neutrophil activation and recruitment, together with a loss of B cell, NK and T cell effector responses may contribute to the pathogenesis of progressive M. tuberculosis infection. Mouse models of TB have been employed for decades as tools for elucidating mechanisms of host resistance and pathogenesis. While failing to recapitulate many of the features of clinical TB and in several cases protective vaccine responses, they have been remarkably useful in identifying both effector and regulatory responses that have emerged to be important in human infection and disease. The data reported here comparing the host transcriptomic responses of M. tuberculosis-infected mice and humans offer further compelling characterization and validation of the mouse model for further mechanistic studies and suggest a peripheral signature associated with progression to clinical disease in TB.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41590-020-0610-z.

Cellular deconvolution. Deconvolution analysis for quantification of relative levels of distinct cell types on a per sample basis was carried out on normalized counts using CIBERSORT 58 . CIBERSORT estimates the relative subsets of RNA transcripts using linear support vector regression. Mouse cell signatures for 25 cell types were obtained using ImmuCC 59 and grouped into 9 representative cell types on the basis of the application of ImmuCC cellular deconvolution analysis to the sorted cell RNA-seq samples from the ImmGen ULI RNA-seq dataset (ImmGen Consortium: GSE109125; www.immgen.org) as previously described 31,60,61 (Supplementary Fig. 1d).
Module generation. Human blood modules were previously determined in human TB 16 . Weighted gene coexpression network analysis was performed to identify lung modules, using the package WGCNA 62 in R. Modules were constructed across all control and infected samples, using log2 RNA-seq expression values. The lung modules were constructed using the 10,000 most-variable genes across all lung samples. A signed weighted correlation matrix containing pairwise Pearson correlations between all genes across all samples was computed using a soft threshold of β = 22 to reach a scale-free topology. Using this adjacency matrix, the topological overlap measure was calculated, which measures the network interconnectedness 63 and is used as input to group highly correlated genes together using average linkage hierarchical clustering. The WGCNA dynamic hybrid tree-cut algorithm 64 was used to detect network modules of coexpressed genes with a minimum module size of 20 and deep split of 1. Lung modules were numbered ML1-ML27 and human blood modules previously found in human TB 16 were numbered HB1-HB23. An additional 'gray' module was identified in lung modules (Supplementary Table 6, module titled NA), consisting of genes that were not coexpressed with any other genes. This gray module was not considered in any further analysis. To create gene interaction networks, hub genes with the highest intramodular connectivity and a minimum correlation of 0.75 were calculated, with a cutoff of 50 hub genes, and exported into Cytoscape v.3.4.0 for visualization.
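As an illustration of the soft-thresholding step described above, the following minimal Python sketch computes a signed adjacency from pairwise Pearson correlations. It assumes the standard signed-network transformation, a_ij = ((1 + r_ij)/2)^β, with β = 22 as stated in the text; the analysis itself was run with the WGCNA package in R, and the function names and toy expression values here are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def signed_adjacency(expr, beta=22):
    """Signed soft-threshold adjacency for every gene pair.

    expr maps a gene name to its log2 expression values across samples.
    Assumed transformation: a_ij = ((1 + r_ij) / 2) ** beta.
    """
    genes = list(expr)
    adjacency = {}
    for g in genes:
        for h in genes:
            r = pearson(expr[g], expr[h])
            r = max(-1.0, min(1.0, r))  # guard against rounding outside [-1, 1]
            adjacency[(g, h)] = ((1.0 + r) / 2.0) ** beta
    return adjacency

# Hypothetical toy data: four genes measured in four samples.
toy_expr = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.5],   # almost perfectly correlated with gene_a
    "gene_c": [4.0, 3.0, 2.0, 1.0],   # perfectly anti-correlated with gene_a
    "gene_d": [1.0, 3.0, 2.0, 4.0],   # moderately correlated with gene_a (r = 0.8)
}
adj = signed_adjacency(toy_expr)
```

With a high power such as β = 22, only strongly positively correlated pairs keep an adjacency near 1. In the toy data, a correlation of 0.8 already maps to an adjacency of roughly 0.1, and anti-correlated pairs map to 0, which is what distinguishes a signed network from an unsigned one.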
To test either human blood modules in mouse data or mouse lung modules in human data, human Ensembl gene IDs were translated into mouse gene IDs using BioMart to extract mouse ortholog genes (Supplementary Table 8).

Modular annotation.
Lung modules were assessed for enrichment of biological pathways and processes using IPA (QIAGEN Bioinformatics), Metacore (Thomson Reuters) and careful manual annotation, by checking cell-type-specific enrichment and individual read counts. Significantly enriched canonical pathways and upstream regulators were obtained from IPA (top 5). Modules were assigned names based on representative biological processes from pathways and processes from all methods (Supplementary Tables 5 and 6).

Module enrichment analysis.
Fold enrichment for the WGCNA modules was calculated using QuSAGE 65 , with the Bioconductor package qusage v.2.4.0 in R, to identify the modules of genes over- or under-abundant in a dataset compared to the respective control group, using log2 expression values. The qusage function was used with the n.points parameter set to 2 (ref. 15 ). Only modules with enrichment scores with FDR P values <0.05 were considered significant and plotted using the ggcorrplot function in R. Eigengene profiles, which are representative expression profiles for a given module in a particular dataset, were generated using the moduleEigengenes function from the WGCNA package and were plotted using the ggplot2 package 57 .
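To make the notion of an eigengene profile concrete, the sketch below computes a module eigengene as the first principal component of the standardized gene-by-sample expression matrix, with its sign aligned to the average expression of the module. This is a simplified illustration in Python under stated assumptions, not the WGCNA implementation: it uses plain power iteration, assumes no gene has constant expression, and the function names and toy values are hypothetical.

```python
import math

def standardize(values):
    """Center a gene's profile and scale it to unit sample variance."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]  # assumes sd > 0

def module_eigengene(module_expr, iterations=200):
    """Return one value per sample: the first principal component of the module.

    module_expr is a list of rows, one row of log2 expression values per gene.
    """
    z = [standardize(row) for row in module_expr]
    n_samples = len(z[0])
    # Start from the first gene's profile; a constant start vector would be
    # annihilated because every standardized row sums to zero.
    v = list(z[0])
    for _ in range(iterations):
        # One power-iteration step on Z^T Z: w = Z^T (Z v)
        scores = [sum(zi * vi for zi, vi in zip(row, v)) for row in z]
        w = [sum(scores[g] * z[g][s] for g in range(len(z)))
             for s in range(n_samples)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Align the sign with the average standardized expression of the module.
    average = [sum(row[s] for row in z) / len(z) for s in range(n_samples)]
    if sum(a * b for a, b in zip(average, v)) < 0:
        v = [-x for x in v]
    return v

# Hypothetical toy module: three genes, three control then three infected samples.
toy_module = [
    [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],
    [2.0, 2.1, 1.9, 4.0, 4.2, 3.8],
    [0.5, 0.4, 0.6, 2.0, 2.2, 1.9],
]
eigengene = module_eigengene(toy_module)
```

In this toy module all three genes rise on infection, so the eigengene takes lower values in the first three samples than in the last three, summarizing the module with a single value per sample.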
Cell-type-specific enrichment. Cell-type enrichment analysis to identify overrepresented cell types in blood and lung modules was performed as previously described 31 using a hypergeometric test, using the phyper function in R. The P values were corrected for multiple testing using the p.adjust function in R, using the Benjamini-Hochberg method, to obtain FDR-corrected P values.
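The enrichment test and multiple-testing correction described above can be sketched as follows. This is a minimal Python illustration of an upper-tail hypergeometric test and the Benjamini-Hochberg adjustment, assuming enrichment is scored as the probability of observing at least the observed overlap; the analysis itself used phyper and p.adjust in R, and the function names and numbers here are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(overlap, module_size, signature_size, universe_size):
    """P(X >= overlap) when module_size genes are drawn from universe_size genes,
    of which signature_size belong to the cell-type signature."""
    total = comb(universe_size, module_size)
    upper = min(module_size, signature_size)
    hits = sum(
        comb(signature_size, i)
        * comb(universe_size - signature_size, module_size - i)
        for i in range(overlap, upper + 1)
    )
    return hits / total

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg FDR-adjusted P values, returned in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for position, index in enumerate(order):
        rank = n - position  # rank of this P value in ascending order
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted

# Hypothetical example: a 4-gene module shares 3 genes with a 5-gene
# cell-type signature, in a universe of 20 genes.
p_enrichment = hypergeom_upper_tail(3, 4, 5, 20)

# Hypothetical raw P values for four module/cell-type pairs.
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.20])
```

The adjustment works from the largest P value downward and keeps a running minimum, so the adjusted values stay monotonic in the raw values and never exceed 1.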
Method for use of online web application. An online web application, https://ogarra.shinyapps.io/tbtranscriptome/, accompanies the manuscript to visualize the findings of the study. The app is subdivided into four distinct pages that can be accessed through the tabs displayed on the top of the page, with a customized sidebar for user input on each page.
Tab 1: Expression table allows the user to visualize read counts, either as raw counts or log2-normalized expression values, in either the mouse blood TB, mouse lung TB or human blood TB (Leicester, London or South Africa) datasets. Each row represents a different gene and each column represents a sample in the corresponding dataset. The user can download the dataset into a spreadsheet file format.
Tab 2: Average expression table allows the user to visualize the average read counts by group, either as raw counts or log2-normalized expression values, in either the mouse blood TB, mouse lung TB or human blood TB (Leicester, London or South Africa) datasets. Each row represents a different gene and each column represents a group in the corresponding dataset. The user can download the dataset into a spreadsheet file format.
Tab 3: Gene expression allows the user to visualize the expression of individual genes, either as raw or log2-normalized expression values, in either the mouse blood TB, mouse lung TB or human blood TB (Leicester, London or South Africa) datasets. Each dot represents the expression value for the chosen gene in one sample.
Tab 4: Module profiles allows the user to visualize the expression profile (eigengene from the WGCNA R package) of a module they can select, either from human blood TB modules (HB1-HB23) 16 , mouse lung TB modules (ML1-ML27) derived de novo in this study or mouse lung disease modules (L1-L38) 31 . Each dot represents the eigengene value for the chosen module in one sample. A table below the plot displays all genes present within that module.
Reporting Summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The materials, data and any associated protocols that support the findings of this study are available from the corresponding author upon request. The RNA-seq datasets have been deposited in the NCBI Gene Expression Omnibus database with the primary accession number GSE140945 (TB mouse blood and lung). Publicly available datasets used in this study include GSE107995 (human TB datasets from Singhania et al. 16 ) and GSE79362 (human TB dataset from Zak et al. 38 ).