
An integrated tumor, immune and microbiome atlas of colon cancer

Sep 01, 2023

Nature Medicine volume 29, pages 1273–1286 (2023)

The lack of multi-omics cancer datasets with extensive follow-up information hinders the identification of accurate biomarkers of clinical outcome. In this cohort study, we performed comprehensive genomic analyses on fresh-frozen samples from 348 patients with primary colon cancer, encompassing RNA, whole-exome, deep T cell receptor and 16S bacterial rRNA gene sequencing on tumor and matched healthy colon tissue, complemented with tumor whole-genome sequencing for further microbiome characterization. A type 1 helper T cell, cytotoxic gene expression signature, called the Immunologic Constant of Rejection (ICR), captured the presence of clonally expanded, tumor-enriched T cell clones and outperformed conventional prognostic molecular biomarkers, such as the consensus molecular subtype and the microsatellite instability classifications. Quantification of genetic immunoediting, defined as a lower number of neoantigens than expected, further refined its prognostic value. We identified a microbiome signature, driven by Ruminococcus bromii, associated with a favorable outcome. By combining the microbiome signature and the Immunologic Constant of Rejection, we developed and validated a composite score (mICRoScore), which identifies a group of patients with an excellent survival probability. The publicly available multi-omics dataset provides a resource for a better understanding of colon cancer biology that could facilitate the discovery of personalized therapeutic approaches.

Although a substantial amount of research has been conducted on biomarkers for primary colon cancer, current clinical guidelines in the United States and Europe (including the National Comprehensive Cancer Network and European Society for Medical Oncology guidelines) rely only on tumor-node-metastasis staging and on the detection of DNA mismatch repair (MMR) deficiency or microsatellite instability (MSI), in addition to standard clinicopathological variables, to determine treatment recommendations1,2. MSI is caused by somatic or germline defects in MMR genes and leads to the accumulation of somatic mutations and neoantigens, with consequent immune recognition and a high density of tumor-infiltrating lymphocytes3.

The strength of the in situ adaptive immune reaction, as captured for example by assessing the density and spatial distribution of T cells (Immunoscore), is associated with a reduced risk of relapse and death independently of other clinicopathological variables, including MSI status4,5.

However, despite the overwhelming evidence of the prognostic effect of the Immunoscore and other immune-related parameters in colon cancer6,7, the research community has noted a lack of association between gene expression-based estimates of immune response and patient survival in The Cancer Genome Atlas (TCGA) colon adenocarcinoma (COAD) cohort8,9,10. TCGA, owing to the richness and curation of its genomic data, represents the preeminent dataset for omics analyses; however, the collection of comprehensive clinical data, including survival outcomes, was neither a primary objective of TCGA nor a practical possibility given its worldwide scope and time constraints11. Consequently, the limited patient follow-up data associated with TCGA-COAD and other TCGA datasets have hampered statistically rigorous survival analyses11. In addition, TCGA did not include dedicated assays for T cell receptor (TCR) repertoire analysis or for microbiome characterization, which was subsequently performed using bulk DNA and RNA sequencing (RNA-seq) data, and it includes only a few healthy solid tissue (for example, healthy colon) samples12,13. Moreover, because TCGA initially focused on cataloging the genomic and molecular changes occurring in cancer cells, sample inclusion criteria based on strict tumor purity cutoffs were imposed14, potentially biasing the population toward tumor samples that are less rich in immune or stromal cells.

[...] 0.1% in the tumor, which are at least 32 times higher in the tumor compared to normal) are highlighted. i, Correlation of proportion of tumor-enriched T cell clones in the tumor (in percent) with ICR score. Pearson's r and P value of the correlation are indicated in the plot. All P values are two-sided.

[...] >12 per Mb. Overall P value is calculated by log-rank test. c, Scatter-plot of ICR score by genetic immunoediting (GIE) value for ICR-high and ICR-low samples. Number of samples in each quadrant is indicated in the graph. Gray area delineates ICR scores from 5–9. d, Kaplan–Meier for OS by IES. Censor points are indicated by vertical lines and corresponding table of number of patients at risk in each group is included below the Kaplan–Meier plot. Overall P value is calculated by log-rank test. e, Violin plot of IES by productive TCR clonality (immunoSEQ) (left) and MiXCR-derived TCR clonality (right). Spearman correlation statistics are indicated above each plot. Significance within ICR low and high is indicated. Center line, box limits and whiskers represent the median, interquartile range and 1.5× interquartile range, respectively. P values are two-sided, n reflects the independent number of samples.

[...] > 2) (Fig. 5c and annotated in Supplementary Table 5). No major difference in α diversity (the variety and abundance of species within an individual sample) was observed between tumor and healthy samples (Extended Data Fig. 7b) and only a modestly reduced microbial diversity was observed in ICR-high versus ICR-low tumors (Extended Data Fig. 7b). Selenomonas and Selenomonas 3 were the taxa most significantly increased in ICR-high versus -low tumors (Fig. 5e, Extended Data Fig. 7c and Supplementary Table 6). In terms of survival analysis, the highest number of nominally significant associations was obtained using tumor data (rather than healthy colon data) and OS as the end point (Extended Data Fig. 7d and Supplementary Table 7).

[...] >20-fold coverage of at least 99% of targeted exons and >70-fold in at least 81% targeted exons. In healthy samples, sequencing achieved >20-fold coverage of at least 94% of targeted exons and >30-fold in at least 84% targeted exons. Adaptor trimming was performed using the tool trimadap (v.0.1.3). ConPair was run to evaluate concordance and estimate contamination between matched tumor–normal pairs. In eight of the pairs a mismatch was detected and for five pairs, a potential contamination was indicated. HLA typing data were used to validate these results. All potential mismatches and contaminations were excluded, retaining 281 patients for data analysis.

[...] >2 µg) and sample selection was exclusively based on DNA availability. TCR sequencing was performed using extracted DNA of 114 primary tissue samples and ten matched healthy colon tissues with sufficient DNA available.

[...] >0.1% were defined as tumor-enriched sequences, as previously implemented by Beausang et al.75. The fraction of tumor-enriched TCR sequences in the tumor was calculated by dividing the number of productive templates of tumor-enriched sequences by the total number of productive templates per tumor sample. Pearson's correlation coefficient between the fraction tumor-enriched TCR sequences and ICR score was calculated.

[...] >1% in the general population. After these technical exclusion criteria, biological filters were applied, including selection of nonsynonymous mutations (frame shift deletions, frame shift insertions, inframe deletions, inframe insertions, missense mutations, nonsense mutations, nonstop mutations, splice site and translation start site mutations). The resulting number of variants/mutations per Mb (capture size is 40 Mb) per sample is referred to as the nonsynonymous TMB.
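The nonsynonymous TMB definition above is simple arithmetic, and a minimal Python sketch may make it concrete. The 40 Mb capture size comes from the text and the 12 per Mb High/Low boundary comes from the figure captions. The function names are hypothetical, and treating a value of exactly 12 per Mb as Low is an assumption, because the captions specify only >12 and <12.

```python
CAPTURE_SIZE_MB = 40.0   # exome capture size stated in the text
TMB_HIGH_CUTOFF = 12.0   # per-Mb boundary used in the figure captions


def nonsynonymous_tmb(n_nonsynonymous_variants, capture_size_mb=CAPTURE_SIZE_MB):
    """Return nonsynonymous mutations per Mb for one sample."""
    if n_nonsynonymous_variants < 0:
        raise ValueError("variant count cannot be negative")
    if capture_size_mb <= 0:
        raise ValueError("capture size must be positive")
    return n_nonsynonymous_variants / capture_size_mb


def tmb_category(tmb, cutoff=TMB_HIGH_CUTOFF):
    """Label a TMB value as High (> cutoff) or Low.

    Assumption: a value exactly equal to the cutoff is labeled Low.
    """
    return "High" if tmb > cutoff else "Low"


if __name__ == "__main__":
    # Hypothetical counts of filtered nonsynonymous variants per sample.
    for sample, count in {"sample_A": 600, "sample_B": 400}.items():
        tmb = nonsynonymous_tmb(count)
        print(sample, tmb, tmb_category(tmb))
```

The variant count passed in is assumed to be the count remaining after the technical and biological filters described above.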
Next, to identify the most frequently mutated genes in our cohort that might play a role in cancer, we excluded variants that are predicted to be tolerated according to SIFT annotation or benign according to PolyPhen (polymorphism phenotyping). Finally, all artifact genes, which are typically encountered as bystander mutations in cancer that are mutated for example as a consequence of a high homology of sequences in the gene, were excluded76. The OncoPlot function from ComplexHeatmap (v.2.1.2) was used to visualize the most frequent somatic mutations.

[...] >5% of the tumor samples) with frequencies detected in previously published datasets containing colon cancer samples (TCGA-COAD and NHS-HPFS) as well as reported cancer driver genes32 or colon oncogenic mediators38. First, we extracted genes with a nonsynonymous mutation frequency >5% in the AC-ICAM cohort. Subsequently, only genes that are likely involved in cancer development, as described in the section ‘Cancer-related gene annotation’, were retained. All artifact genes (mutations typically encountered as bystander mutations in cancer that are mutated for example as a consequence of a high homology of sequences in the gene) were excluded. Genes that have previously been reported as colon cancer oncogenic mediators38 or cancer driver genes for colorectal cancer (COADREAD)32 were also excluded. Finally, only genes with a mutation frequency <5% in the NHS-HPFS colon cancer cohort37 and <5% in TCGA-COAD36 were maintained. As a final filter, only genes that had a nonsynonymous mutation frequency of at least twofold in AC-ICAM compared to TCGA-COAD were labeled as potentially new in colon cancer.

[...] > 0.4) or MSS (MANTIS score ≤ 0.4).

[...] > 500 nM, were used as criteria to infer neoantigens. Predicted neoantigens were used to calculate the GIE value. We calculated the GIE value by taking the ratio between the number of observed versus the number of expected neoantigens. The expected number of neoantigens was based on the assumption of a linearity between TMB and the number of neoantigens. We therefore assumed that samples that have a lower frequency of neoantigens than expected (lower GIE values) display evidence of immunoediting. A higher frequency of neoantigens than expected indicates a lack of immunoediting; see the calculations section for details.

[...] >60× coverage per sample. The median (across samples) of the average target coverage (per sample) was 76× (range of 50–92).

[...] > ±0.3. Clusters among the networks (groups of at least three correlated genera using the cutoffs specified above) were defined via a fast greedy clustering algorithm. All co-occurrence networks were made using the R package ‘NetCoMI (v.1.1.0) – Network Construction and Comparison for Microbiome Data’84 and visualized using Cytoscape (v.3.9.1).

[...] >0) and ‘low-risk’ (<0) groups as performed in the training set. Therefore, no cutoff optimization occurred in the validation phase.

[...] >2 μg). Securing additional funds allowed us to perform WGS and 16S rRNA sequencing and to expand the WES and TCR analyses to any sample with sufficient DNA available. No specific power calculation was performed at that time and the targeted sample size was based on the estimated number of samples that could be retrieved from LUMC (n = 400), which compared favorably with the sample size of similar studies in the field.

[...] >90% to detect a 10% mutational frequency in 90% of genes86.

[...] >80% for an HR of 0.5 with a two-sided α of 0.05. With 154 OS events in the whole cohort, our study has a power of 90% for an HR of 0.59 (assuming two groups of equal size) and a power of 90% for an HR of 0.57 (assuming groups with unequal sample size, 2:1) with a two-sided α of 0.05.
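The Methods above describe the GIE value as the ratio of observed to expected neoantigens, with the expected count derived from an assumed linear relation between TMB and neoantigen number. The calculations section referred to there is not part of this excerpt, so the following Python sketch is only one plausible reading. The least-squares line through the origin, the function names and the cutoff of 1 for "fewer than expected" are all assumptions, not the authors' confirmed procedure.

```python
def fit_slope_through_origin(tmb, neoantigens):
    """Least-squares slope for neoantigens ~ slope * TMB (no intercept).

    Assumption: the text states linearity only; the no-intercept
    form is chosen here for simplicity.
    """
    sxx = sum(t * t for t in tmb)
    if sxx == 0:
        raise ValueError("at least one TMB value must be non-zero")
    return sum(t * n for t, n in zip(tmb, neoantigens)) / sxx


def gie_values(tmb, observed_neoantigens):
    """Return observed / expected neoantigen ratios, one per sample."""
    if len(tmb) != len(observed_neoantigens):
        raise ValueError("inputs must have the same length")
    slope = fit_slope_through_origin(tmb, observed_neoantigens)
    ratios = []
    for t, observed in zip(tmb, observed_neoantigens):
        expected = slope * t
        if expected <= 0:
            raise ValueError("expected neoantigen count must be positive")
        ratios.append(observed / expected)
    return ratios


def shows_immunoediting(gie, cutoff=1.0):
    """Assumed rule: fewer neoantigens than expected means GIE < 1."""
    return gie < cutoff


if __name__ == "__main__":
    # Hypothetical cohort: TMB per Mb and predicted neoantigen counts.
    tmb = [1.0, 2.0, 4.0]
    neo = [10, 20, 20]
    for t, g in zip(tmb, gie_values(tmb, neo)):
        print(t, round(g, 3), shows_immunoediting(g))
```

In this toy cohort the third sample carries fewer neoantigens than its TMB would predict, so its ratio falls below 1 and it would be read as showing evidence of immunoediting under the assumed cutoff.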

[...] 0.1% in the tumor, that are at least 32 times more abundant in the tumor compared to the normal.

[...] >12/Mb) versus Low (<12/Mb) TMB. b, Same as a, but only including ICR Medium. c, Kaplan–Meier curves for OS by GIE status. d, Same as c in ICR Medium patients. Overall P value is calculated by log-rank test and P value corresponding to HR is calculated using Cox proportional hazard regression (a-d). e, Stacked bar charts of mutational load category (top) and MSI status (bottom) per IES. f, Kaplan–Meier curves for OS (left) and PFS (right) stratified by AJCC pathological stage (I, II, III) within IES4. Stratification was not performed for stage IV due to the limited number (n = 2). g, Stacked bar chart of distribution of AJCC Pathological Tumor Stage by IES. h, Multivariate Cox proportional hazards model for OS including IES (ordinal, IES1, IES2, IES3, IES4) and AJCC Pathological Tumor Stage (ordinal, Stage I, II, III, IV). P values corresponding to HR calculated by Cox proportional hazard regression analysis are indicated. i, Violin plot represents TCR clonality as determined by MiXCR in ICR Medium samples. Center line, box limits, and whiskers represent the median, interquartile range and 1.5× interquartile range, respectively. P value calculated by unpaired, two-sided t-test. j, Results of the multiple linear regression model showing the respective contributions of productive TCR clonality (X1) and (X2) for prediction of IES (Y). Corresponding significance of the effects are indicated in the scatter-plots (left). k, Local Polynomial Regression Fitting of productive TCR clonality by IES (ordinal variable). The gray band reflects the 95% confidence interval for predictions of the local polynomial regression model. All P values are two-sided; n reflects the independent number of samples in all panels. Overall Survival (OS). Tumor Mutational Burden (TMB). Genetic Immunoediting (GIE). ImmunoEditing Score (IES).

[...] > 0). d, Concordance index of optimal multivariate Cox regression model per dataset. The cross-validation performance highlights the mean concordance of 10 different folds with the optimal hyperparameters (gamma and lambda), that is, the same parameters as the optimal model. e, Forest plot with HR (center), corresponding 95% confidence intervals (error bars), and P value calculated by Cox proportional hazard regression analysis for OS, using: 1) the 16S MBR score in AC-ICAM, 2) WGS R. bromii abundance, 3) PCR-based R. bromii abundance, 4) 16S Ruminococcus 2 relative abundance and 5) MBR score calculated using WGS data. f, Heat map of Spearman correlation between the relative abundance of the MBR classifier taxa in tumor samples and immune traits. Only correlations with an FDR > 0.1 are visualized. An additional row is added for Ruminococcus 2 showing all correlations, unfiltered for FDR. * The taxonomical order is indicated between brackets, as family was unassigned. g, Kaplan–Meier curve for PFS in AC-ICAM, with all patients stratified by mICRoScore High vs Low. HR and P value are calculated using Cox proportional regression. h, AJCC pathological stage within the mICRoScore High group in AC-ICAM and within TCGA-COAD. i, Kaplan–Meier curve for PFS in AC-ICAM, with all patients with ICR High stratified by mICRoScore. Overall P value is calculated by log-rank test and P value corresponding to HR is calculated using Cox proportional hazard regression. Overall Survival (OS), Progression-Free Survival (PFS). All P values are two-sided; n reflects the independent number of samples in all panels.