Abstract
Single-cell RNA-seq is a powerful tool for revealing heterogeneity in cancer cells. However, each of the current single-cell RNA-seq platforms has inherent advantages and disadvantages. Here, we show that combining the different single-cell RNA-seq platforms can be an effective approach to obtaining complete information about expression differences and a sufficient cellular population to understand transcriptional heterogeneity in cancers. We demonstrate that it is possible to estimate missing expression information. We further demonstrate that even in the cases where precise information for an individual gene cannot be inferred, the activity of given transcriptional modules can be analyzed. Interestingly, we found that two distinct transcriptional modules, one associated with the Aurora kinase gene and the other with the DUSP gene, are aberrantly regulated in a minor population of cells and may thus contribute to the possible emergence of dormancy or eventual drug resistance within the population.
Similar content being viewed by others
Introduction
Recent developments in single-cell sequencing technologies have opened the possibility of analyzing individual single cells. A number of reports have demonstrated that single-cell analysis provides pivotal information for elucidating cellular plasticity and diversity within a given population of cells in vitro and in vivo. There are a number of potential applications for scRNA-seq analysis. Among them, cancer is supposed to be one of the most important targets to analyze1. Cancer is a complex cellular ecosystem that consists of various cell types, including cancer cells, cancer-associated fibroblasts, tumor-infiltrating leukocytes and vascular cells2. In cancer, no two individual cells are identical, as they are located in varying micro-environmental conditions and interact with different cells. Even among clonal cancer cells, diverse phenotypic features are frequently observed in each individual cell during the clonal evolution of cancers3,4,5. It is important to focus on the molecular diversity of cancer cells to understand the mechanisms underlying the emergence of drug-resistant and metastasizing cells. Detailed knowledge of such intra-tumor heterogeneity would provide crucial information for understanding the eventual development of drug-resistant cells or metastatic dissemination in cancer and also generate potential opportunities for novel pharmaceutical interventions6,7. Most cancers acquire resistance to anti-cancer drugs, including gefitinib, one of the most well-characterized molecular-targeting anti-cancer drugs for EGFR in lung adenocarcinomas, after a period of drug treatment8,9,10. Several drug resistance-acquiring mutations, such as the T790M mutation in the EGFR gene, have been reported11. These drug-resistant mutations emerge in a small number of cells and spread to the population12. Before drug resistance is fixed in the form of genomic mutations, transcriptomic diversity is considered to be the precedent for the initial repertoire, allowing cells to survive during initial selection13. Details of the molecular process that eventually leads to drug resistance remain mostly elusive despite several pioneering studies14. It has been difficult to examine cellular diversity using conventional methods, which subject groups of cells to assays in bulk. Limited amounts of data represent cancer cell heterogeneity, even with the most powerful datasets describing cancers, such as the TCGA, ICGC and COSMIC databases15,16,17,18,19.
It is expected that single-cell analysis will provide substantial novel insights into the identification and characterization of rare cells among cancer cells20,21,22,23,24. Such single-cell analyses were initiated with single-cell RNA-seq (scRNA-seq) analysis. In scRNA-seq, single cells are separated by micro-pipetting, laser capture micro-dissection, FACS, micro-fluidic and micro-droplet-based methods25,26,27,28,7C inset table, Fisher’s exact test, p = 0.39).
Module activities in other cell lines
We examined whether the high activities of the DUSP1-AURKA gene modules were unique to gefitinib-treated PC9 cells. Using the micro-chamber datasets for the other cell lines, we conducted a similar module analysis. First, we performed clustering analysis using entire genes and found that the cells were separated depending on their originating cell types (Sup. Fig. S8A). Of note, the cells were not separated between untreated and gefitinib-treated cells in H1975 and H2228, which gefitinib should not affect (Sup. Fig. S8B). Distinct patterns were observed for the PC9 and II-18 cells, among which the untreated and gefitinib-treated cells were distinct. H1650 cell clustering was marginal, perhaps reflecting their partial response to gefitinib. When we clustered each of the cell lines by the “magenta” module (the DUSP1 gene module) in PC9 cells, we did not identify clear outlier cells in the gefitinib-treated II-18 cells. Several cells had high expression levels of AURKA and DUSP1 genes (s_255 and s_274 cells; indicated by arrows; Fig. 8A). However, their expression levels were less significant than those of the outlier PC9 cell, s_062 (top panel; Fig. 8A).
Instead, when we conducted a similar analysis for II-18 cells, we identified a distinct module, “red.” This module consisted of 132 genes (Sup. Table S8) with the SOX4 gene as a core. Clustering using this module identified two outlier cells (s_252 and s_247 cells) (Fig. 8B) that exhibited high levels of ME-turquoise and CD44 expression. Analysis using the micro-droplet dataset revealed that such cells represented 0.06% of the entire population (Sup. Fig. S9). SOX4 has been reported to act as a tumor suppressor gene, depending on the context, by promoting cell cycle arrest and apoptosis53,54,55. CD44 is known as a cancer stem or cancer progenitor marker in several tumors56,57. Another report indicates that cells expressing CD44 show stem-cell-like properties58. Aberrant activation of this module was not observed in PC9. Cancer cells may utilize some common modules for survival but may more frequently use unique modules, depending on their original transcriptomic status.
Possible biological relevance of outlier cells
Using data from clinical samples, we investigated the potential phenotypic significance of the outlier cells with high expression levels of DUSP1 and AURKA genes. For this purpose, we used the TCGA dataset9, which provides transcriptome information as well as clinical information for 506 lung adenocarcinoma patients. We divided the patients into two groups based on AURKA and DUSP1 expression levels (Fig. 9A, inset table). Again, there seemed to be no or little direct correlation between the expression levels of these genes in the clinical samples. Twenty-nine cases showed high expression levels of both genes. We compared the overall survival times of the patients depending their expression levels of AURKA and DUSP1 genes. We observed that the patients with high expression levels of both AURKA and DUSP1 genes showed a poor prognosis compared with cases with normal or low expression levels of either gene (Fig. 9B). Activation of AURKA and DUSP1 genes may have a favorable effect on the survival of lung cancer cells. Outlier cells with modules highly related to AURKA and DUSP1 would have survival advantages, particularly under severe conditions, and may contribute to the development of small populations of more malignant cells for which anti-cancer drugs are less effective.
Conclusions
In this study, we first evaluated the representative two analytical platforms, the micro-chamber and the micro-droplet methods, which are used for scRNA-seq. We found that the datasets obtained from those different platforms have the similar nature, although the respective platforms have their unique advantages and disadvantages. To make the most use of the advantages of these two platforms, we attempted to combine the datasets generated from two different platforms in the later sections. In the first part of the paper, we generated datasets from the micro-chamber and the micro-droplet platforms. In either of the platforms, the datasets consisted of the gene expression information of each individual single cell. Such a separation was possible even though the two platforms identify the single cell based on the different methods. Namely, for the micro-chamber platform, we collected the individual cells by separating the cells into different chambers. With the micro-droplet platform, we separated the cells by confining the cells into micro-droplets, where the mRNAs of each cell were labeled with distinct barcodes in individual droplets. As a result, even though the number of the analyzed cells and the sequencing depth per cell differed between the two platforms, we could compare the expression information between the data at the individual cell level. Detailed technical evaluations and comparisons of these platforms revealed that both methods were highly reproducible and concordant. However, the difference in sequence depths, which depends on the number of cells subjected to the analysis with a given sequencing cost, caused distinct features of the datasets. Indeed, both methods had inherent advantages and disadvantages. Namely, the micro-chamber system enabled us to examine the detailed character of each cell in terms of its gene expression information. However, the feasible number of subjected cells is too small to detect a rare population of cells and estimate the frequency and variance of those cells if they are detected. Conversely, the micro-droplet platform examined a much larger number of cells. We considered that the micro-droplet platform to be still useful, despite the expression information from a given single cell being relatively poor. This is the only currently available platform that can analyze >5,000 cells at the same time. Without this platform, it would be essentially impossible to characterize a population of cells regarding the gene expression information of each individual cell. Therefore, that the micro-chamber method should be complementarily used with the micro-droplet method. Namely, the former has an advantage in the sequencing depth and rich expression information for each single cell, while the latter has an advantage in the population analysis of the cells. However, the limited sequencing depth for each cell makes interpretation of the gene expression data on its own difficult.
In the second part of this paper, we demonstrated to address these limitations by integrating the data from the two platforms. First, we provided a statistical inference originating from the micro-chamber dataset that could predict the missing values in the micro-droplet dataset. Second, we identified a minor population of cells using a transcriptional module-based approach with the micro-chamber dataset. Further analyses using the micro-droplet dataset revealed the frequency and divergence of such cells in the entire population. In particular, we identified two modules with the AURKA gene and the DUSP1 gene as their cores. Interestingly, simultaneous activation of those genes was associated with the poorest prognosis in clinical samples. We believe that single-cell analysis would provide indispensable information for further analysis of the molecular basis underlying the emergence of such cancer cells.
Further detailed evaluations are clearly needed to validate the clinical relevance of the observed heterogeneity of cancer cells. Diverse in vivo microenvironments should further impose complicated factors on cellular gene expression. Several methods to monitor single-cell transcriptomes in vivo are being developed. However, the resolution and precision of the data are still limited. Taking various advantages of the cell lines, we believe that this work should provide a first step towards a thorough understanding of the diverse nature of cancer.
Materials and Methods
Cell culture
PC9 and II-18 cells were acquired from the RIKEN Bio Resource Center (catalog number RCB4455 and RCB2093), and H1650, H1975 and H2228 were acquired from the American Type Culture Collection (catalog numbers CRL5883, CRL5908 and CRL5953). The cells were grown in RPMI-1640 medium (Wako, 189–02145) with 10% fetal bovine serum (FBS), MEM Non-Essential Amino Acid Solution (catalog number M7145, Sigma-Aldrich, St. Louis, MO) and penicillin and streptomycin in an incubator maintained at 37 °C with 5% CO2. For gefitinib (CAS 184475-35-2, Santa Cruz Biotechnology) treatment, the drug was added to the culture medium at a final concentration of 1 μM. Twenty-four hours after the drug treatment, the cells were harvested. For the untreated control, DMSO was added to the culture medium in place of gefitinib. For each experiment, 106 cells were harvested and separated using bead-seq and a Chromium Single Cell 3’ (10× Genomics, version 1).
Single-cell RNA-seq with the micro-chamber system
We prepared libraries according to Matsunaga et al.31 and utilized the HiSeq. 2500 platform (Illumina) with 50-base single-end reads. For the PC9 replicate samples, we performed 35-base single-end reads. To remove ribosomal RNA, the generated RNA-seq tags were mapped to rmRNA, and unmapped reads were removed. Trimmed reads were aligned to the human reference genome (UCSC hg19) by TopHat/Bowtie. Using our Perl script, RNA-seq tag counts were calculated as reads per kilobase RNA per million mapped tags (RPKM)59.
Single cell RNA-seq with the micro-droplet system
Using Chromium Single Cell 3′, libraries were prepared according to the manufacturer’s instructions. We used a HiSeq. 2500 Rapid run platform to generate 50-base paired-end reads. RNA-seq tags from the Chromium experiments were aligned using Cell Ranger software. Using our Perl script, sequences with low quality and PCR duplicates were removed. Trimmed reads were sorted based on their cell barcode, and only cell barcodes with >5 k tags were selected. Using our Perl script, RNA-seq tag counts were calculated as parts per million mapped tags (ppm).
Correlation analysis between two platforms
When the values of the results from the two different platforms, which have distinct numbers of cells and sequencing depths, were compared, the statistical significance of the difference was evaluated by the indicated methods. For the correlation analysis at the cell to cell level, we selected the individual cells having the largest the second largest and the third largest number of their sequence tags and designates them as “top1”, “top 2” and “top3” cells, respectively, for each of the platforms.
Cell cycle analysis of PC9 cells
As shown in Fig. 5A at the top left, we used 44 PC9 DMSO-treated cells and 20 cell cycle-regulated genes (four genes per phase) to refine the cell state; CCNE1, E2F1, CDC6 and PCNA were used for G1/S phase, RFC4, DHFR, RRM2, and RAD51 for S phase, CDC2, TOP2A, CCNF and CCNA2 for G2 phase, STK15, BUB1, CCNB1 and PLK1 for G2/M phase, and PTTG1, RAD21, VFGFC and CDKN3 for M/G1 phase. These gene sets were obtained from Whitefield et al.60. The expression levels (RPKM) of each gene in the gene set of each single cell were calculated and scaled. To order the cells, we compared the average scores of five phases.
For the micro-droplet datasets, we first attempted to draw a heatmap using same method as for the micro-chamber dataset. However, we cannot draw the heatmap as in Fig. 5A top due to the absence of values in the micro-droplet datasets. To overcome those problems, we analyzed the cell cycle of each cell based on the method previously reported by Macosko et al.33. As shown at the bottom left of Fig. 5A, we used 5,166 PC9 DMSO-treated cells and 603 genes from Macosko et al. From those genes, we excluded genes with a low correlation to the cell state (r < 0.2). Twenty-one genes correlated with the G1/S phase, 14 genes with the S phase, 39 genes with the G2/M phase, 51 genes with the M phase, and 19 genes with the M/G1 phase remained (a total of 144 genes, Sup. Table 2). We calculated the expression levels of these genes and averaged the normalized (log2(PPM+1)) values in each phase. We scaled these scores and obtained a phase-specific score for 5,166 cells. Next, we compared the pattern of the phase-specific scores to nine potential patterns to determine the cell phases and ordered the cells according to their phases. Of the 5,166 cells, 2,812 cells were grouped into five phases. In contrast, the other 2,354 cells were estimated to be intermediate between G1/S and S, G2/M and M, and M/G1 and M phases. With the ordered datasets, we ran the R package “gplots” and used the “heatmap2” routine included in this package61.
At the top right of Fig. 5A, we used the same datasets as in the heatmap. At the bottom left of Fig. 5A, we used the 2,812 cells that had been grouped into five phases. We did not use the cells estimated to be intermediate between phases. To generate a two-dimensional projection, we reduced the dimensionality of those two datasets by principal component analysis (PCA)62. We represented individual cells by running R package “ggplot2” to draw figures63.
MAPK Analysis of PC9 cells
To determine the expression levels of genes included in the MAPK/ERK pathway in Fig. 5B, we mapped the tag counts of each gene in the illustration43. We used the top1 cell, the cell with the largest number of mapped reads per cell, from each platform to color the figures64.
Estimation of missing values
To estimate missing values, we combined gene expression data from two systems. We used 232 DMSO-treated cells and 210 gefitinib-treated cells from the present study. Genes with an average RPKM >10 across different cells were selected from the micro-chamber data sets. We also used the micro-droplet system datasets as predictors and to construct predictive models. The base-10 logarithms of all the expression levels were processed, and a pseudo value of 0.01 was used for values that were missing before the logarithms of the values were taken. There were 4,901 and 4,845 genes in the micro-chamber system with RPKM >10 for the DMSO- and gefitinib-treated cells, respectively. The expression levels of the genes in the micro-chamber system were encoded as explanatory variables, and the other genes that were not consistently among the explanatory variables were encoded as response variables. LASSO regression was then performed65. The response functions of LASSO were subsequently employed with the micro-droplet system datasets to predict gene expression levels.
To validate the estimation, we used the gene expression levels of the micro-droplet system dataset that were not missing and compared them with the values that had been estimated according to the computational method (Fig. 6A and D). The global correlation coefficients were determined by calculating Pearson’s r between the experimental values and predicted values of all the cells.
All the R programs were executed using R version 3.3.1, and the R package “glmnet” was employed to perform the Lasso regression. The parameter lambda in the Lasso regression was set to the 10th value of the lambda list in “glmnet” R package, and other parameters were set to their default values66.
Module-based single-cell analysis
We ran R package “WGCNA” and estimated co-expression network modules. First, we used 66 cells (DMSO-treated and gefitinib-treated PC9 cells)44. We clustered the samples and detected and removed five outlier cells with low expression levels (<5 RPKM) for more than 5000 genes. We removed genes that were not expressed much more than 5 RPKM in at least one cell. Based on the scRNA-seq data from 61 PC9 cells, we identified 71 modules and listed the genes included in those modules and the ME value of each cell. To evaluate the characteristics of these modules, we also conducted an eigengene network analysis and gene ontology (GO) enrichment analysis, which are included in the WGCNA package. We repeated the same process for the other four cell lines: II-18, H1650, H1975, and H2228. Figures were generated based on the identified modules (Sup. Table S9).
To create Fig. 7A, we used 61 PC9 cells (44 DMSO-treated and 17 gefitinib-treated cells) and the expression levels of genes included in the module “lightsteelblue1”. First, we rearranged the cells in the MElightsteelblue1 value order and represented the treatment (DMSO or gefitinib) and MElightsteelblue1 value for each cell with a bar plot. We then transformed the expression level of the gene in the module “lightsteelblue1” to a log2(RPKM+0.01) value and drew a heatmap. We used heatmap.2, which is included in the R package “ggplots.” In the right margin, we show the expression levels of four genes, the top3 module genes and AURKA, and the MEmagenta value for each cell with a bar plot.
To create Fig. 7C, we used the expression levels of the genes included in the module “magenta.” We projected 9,544 cells based on their PC scores onto a two-dimensional map using t-Distributed Stochastic Neighbor Embedding (t-SNE)67. Cells were clustered into two clusters based on the k-means score and colored by treatment, orange for DMSO and blue for gefitinib.
To create Fig. 8, we gathered data from 429 cells (Sup. Table 5) and applied a hierarchal clustering based on the genes included in the modules “II-18-red” (top) and “magenta (PC9 module)” (bottom).
Survival analysis
To analyze the TCGA dataset, we downloaded the RNA-seq v2 data and clinical information for the TCGA lung adenocarcinoma (TCGA-LUAD) dataset from the NCI Genomic Data Commons using TCGA-Assembler v2.0.1 (the data downloaded on 2017/03/09)68. We obtained 506 cases with both RNA-seq and clinical data. As the RNA-seq dataset, we downloaded the dataset by TCGA-Assembler with the following options; assayPlatform = “gene.normalized_RNAseq” and cancerType = “LUAD.” We transformed the expression levels to their log2(expression + 1) values. In the present study, we denoted expression “high” when its level was > average + 0.5 s.d. and “low” when its level was < average −0.5 s.d. We download the clinical dataset by TCGA-Assembler with the option; cancerType = “LUAD.” The data for overall survival for each case were extracted from clinical patient and follow-up files. Kaplan-Meier analysis with the log-rank test was conducted using the survival package in R69.
Availability of data and material
The sequence data from this study have been submitted to DNA Data Bank of Japan under accession number DRA005922- DRA005929.
References
Navin, N. E. The first five years of single-cell cancer genomics and beyond. Genome Res. 25, 1499–1507 (2015).
Pienta, K. J., McGregor, N., Axelrod, R. & Axelrod, D. E. Ecological Therapy forCancer: Defining Tumors Using an Ecosystem Paradigm Suggests New Opportunities for Novel Cancer Treatments. Transl. Oncol. 1, 158–164 (2008).
Hu, M. & Polyak, K. Microenvironmental regulation of cancer development. Curr. Opin. Genet. Dev. 18, 27–34 (2008).
Quail, D. & Joyce, J. Microenvironmental regulation of tumor progression and metastasis. Nat. Med. 19, 1423–1437 (2013).
Elenbaas, B. & Weinberg, R. A. Heterotypic Signaling between Epithelial Tumor Cells and Fibroblasts in Carcinoma Formation. Exp. Cell Res. 264, 169–184 (2001).
McGranahan, N. & Swanton, C. Biological and therapeutic impact of intratumor heterogeneity in cancer evolution. Cancer Cell 27, 15–26 (2015).
Gawad, C., Koh, W. & Quake, S. R. Single-cell genome sequencing: current state of the science. Nat. Rev. Genet. 17, 175–188 (2016).
Sharma, S. V., Bell, D. W., Settleman, J. & Haber, D. A. Epidermal growth factor receptor mutations in lung cancer. Nat. Rev. Cancer 7, 169–181 (2007).
Collisson, E. A. et al. Comprehensive molecular profiling of lung adenocarcinoma. Nature 511, 543–550 (2014).
Chong, C. R. & Jänne, P. A. The quest to overcome resistance to EGFR-targeted therapies in cancer. Nat. Med. 19, 1389–1400 (2013).
Kobayashi, S. et al. EGFR Mutation and Resistance of Non–Small-Cell Lung Cancer to Gefitinib. N. Engl. J. Med. 352, 786–792 (2005).
Shaffer, S. M. et al. Rare cell variability and drug-induced reprogramming as a mode of cancer drug resistance. Nature 546, 431–435 (2017).
Takahashi, T. et al. Genomic and transcriptomic analysis of imatinib resistance in gastrointestinal stromal tumors. Genes, Chromosom. Cancer 56, 303–313 (2017).
Hata, A. N. et al. Tumor cells can follow distinct evolutionary paths to become resistant to epidermal growth factor receptor inhibition. Nat. Med. 22, 262–269 (2016).
Chang, K. et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat. Genet. 45, 1113–1120 (2013).
Zhang, J. et al. International cancer genome consortium data portal-a one-stop shop for cancer genomics data. Database 2011, 1–10 (2011).
Forbes, S. A. et al. COSMIC: Exploring the world’s knowledge of somatic mutations in human cancer. Nucleic Acids Res. 43, D805–D811 (2015).
Forbes, S. A. et al. The Catalogue of Somatic Mutations in Cancer (COSMIC), https://doi.org/10.1002/0471142905.hg1011s57 (2009).
Hudson (Chairperson), T. J. et al. International network of cancer genome projects. Nature 464, 993–998 (2010).
Tirosh, I. et al. Single-cell RNA-seq supports a developmental hierarchy in human oligodendroglioma. Nature 539, 309–313 (2016).
Ramsköld, D. et al. Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nat. Biotechnol. 30, 777–782 (2012).
Venteicher, A. S. et al. Decoupling genetics, lineages, and microenvironment in IDH-mutant gliomas by single-cell RNA-seq. Science (80−.). 355, eaai8478 (2017).
Ting, D. T. et al. Single-cell RNA sequencing identifies extracellular matrix gene expression by pancreatic circulating tumor cells. Cell Rep. 8, 1905–1918 (2014).
Suzuki, A. et al. Single-cell analysis of lung adenocarcinoma cell lines reveals diverse expression patterns of individual cells invoked by a molecular target drug treatment. Genome Biol. 16, 66 (2015).
Gierahn, T. M. et al. Seq-Well: portable, low-cost RNA sequencing of single cells at high throughput. Nat. Methods 14, 395–398 (2017).
Hashimshony, T., Wagner, F., Sher, N. & Yanai, I. CEL-Seq: Single-Cell RNA-Seq by Multiplexed Linear Amplification. Cell Rep. 2, 666–673 (2012).
Wilson, N. K. et al. Combined Single-Cell Functional and Gene Expression Analysis Resolves Heterogeneity within Stem Cell Populations. Cell Stem Cell 16, 712–724 (2015).
Nichterwitz, S. et al. Laser capture microscopy coupled with Smart-seq. 2 for precise spatial transcriptomic profiling. Nat. Commun. 7, 12139 (2016).
Hu, P., Zhang, W., **n, H. & Deng, G. Single Cell Isolation and Analysis. Front. Cell Dev. Biol. 4, 1–12 (2016).
Wang, Y. & Navin, N. E. Advances and Applications of Single-Cell Sequencing Technologies. Mol. Cell 58, 598–609 (2015).
Matsunaga, H. et al. A highly sensitive and accurate gene expression analysis by sequencing (‘bead-seq’) for a single cell. Anal. Biochem. 471, 9–16 (2015).
Wu, A. R. et al. Quantitative assessment of single-cell RNA-sequencing methods. Nat. Methods 11, 41–46 (2013).
Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of individual cells using nanoliter droplets. Cell 161, 1202–1214 (2015).
Klein, A. M. et al. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell 161, 1187–1201 (2015).
Zheng, G. X. Y. et al. Massively parallel digital transcriptional profiling of single cells. Nat. Commun. 8, 14049 (2017).
Sos, M. L. et al. PTEN Loss Contributes to Erlotinib Resistance in EGFR-Mutant Lung Cancer by Activation of Akt and EGFR. Cancer Res. 69, 3256–3261 (2009).
Pao, W. et al. Acquired resistance of lung adenocarcinomas to gefitinib or erlotinib is associated with a second mutation in the EGFR kinase domain. PLoS Med. 2, 0225–0235 (2005).
Koivunen, J. P. et al. EML4-ALK Fusion Gene and Efficacy of an ALK Kinase Inhibitor in Lung Cancer. Clin. Cancer Res. 14, 4275–4283 (2008).
Soda, M. et al. Identification of the transforming EML4–ALK fusion gene in non-small-cell lung cancer. Nature 448, 561–566 (2007).
Pak, M., Shin, D., Lee, C. & Lee, M. Significance of EpCAM and TROP2 expression in non-small cell lung cancer. World J. Surg. Oncol. 10, 53 (2012).
Shvartsur, A. & Bonavida, B. Trop2 and its overexpression in cancers: regulation and clinical/therapeutic implications. Genes Cancer 6, 84–105 (2015).
Smith, S. L. et al. Overexpression of aurora B kinase (AURKB) in primary non-small cell lung carcinoma is frequent, generally driven from one allele, and correlates with the level of genetic instability. Br. J. Cancer 93, 719–729 (2005).
Cell signaling Technology. MAPK/Erk in Growth and Differentiation Signaling Pathway. Available at: http://www.cellsignal.com/common/content/content.jsp?id=pathways-mapk-erk&pathway=MAPK/Erk in Growth and Differentiation.
Langfelder, P. & Horvath, S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9, 559 (2008).
Chen, J. et al. AURKA upregulation plays a role in fibroblast-reduced gefitinib sensitivity in the NSCLC cell line HCC827. Oncol. Rep. 33, 1860–1866 (2015).
Zhong, N. et al. Silencing Aurora-A with siRNA inhibits cell proliferation in human lung adenocarcinoma cells. Int. J. Oncol. https://doi.org/10.3892/ijo.2016.3605 (2016).
Zhou, X. et al. Gefitinib Inhibits the Proliferation of Pancreatic Cancer Cells via Cell Cycle Arrest. Anat. Rec. Adv. Integr. Anat. Evol. Biol. 292, 1122–1127 (2009).
Lawan, A., Shi, H., Gatzke, F. & Bennett, A. M. Diversity and specificity of the mitogen-activated protein kinase phosphatase-1 functions. Cell. Mol. Life Sci. 70, 223–237 (2013).
Jeffrey, K. L., Camps, M., Rommel, C. & Mackay, C. R. Targeting dual-specificity phosphatases: manipulating MAP kinase signalling and immune responses. Nat. Rev. Drug Discov. 6, 391–403 (2007).
Moncho-Amor, V. et al. DUSP1/MKP1 promotes angiogenesis, invasion and metastasis in non-small-cell lung cancer. Oncogene 30, 668–678 (2011).
Kidger, A. M. & Keyse, S. M. The regulation of oncogenic Ras/ERK signalling by dual-specificity mitogen activated protein kinase phosphatases (MKPs). Semin. Cell Dev. Biol. 50, 125–132 (2016).
Kesarwani, M. et al. Targeting c-FOS and DUSP1 abrogates intrinsic resistance to tyrosine-kinase inhibitor therapy in BCR-ABL-induced leukemia. Nat. Med. 23, 472–482 (2017).
Hur, W. et al. SOX4 overexpression regulates the p53-mediated apoptosis in hepatocellular carcinoma: Clinical implication and functional analysis in vitro. Carcinogenesis 31, 1298–1307 (2010).
Pan, X. et al. Induction of SOX4 by DNA damage is critical for p53 stabilization and function. Proc. Natl. Acad. Sci. USA 106, 3788–93 (2009).
Vervoort, S. J., van Boxtel, R. & Coffer, P. J. The role of SRY-related HMG box transcription factor 4 (SOX4) in tumorigenesis and metastasis: friend or foe? Oncogene 32, 3397–3409 (2013).
McFarlane, S. et al. CD44 increases the efficiency of distant metastasis of breast cancer. Oncotarget 6, 11465–76 (2015).
Du, L. et al. CD44 is of functional importance for colorectal cancer stem cells. Clin. Cancer Res. 14, 6751–6760 (2008).
Leung, E. L.-H. et al. Non-Small Cell Lung Cancer Cells Expressing CD44 Are Enriched for Stem Cell-Like Properties. PLoS One 5, e14062 (2010).
Mortazavi, A., Williams, B. A., McCue, K., Schaeffer, L. & Wold, B. Map** and quantifying mammalian transcriptomes by RNA-Seq. Nat. Methods 5, 621–8 (2008).
Whitfield, M. L. et al. Identification of Genes Periodically Expressed in the Human Cell Cycle and Their Expression in Tumors. Mol. Biol. Cell 13, 1977–2000 (2002).
Warnes, G. R. et al. gplots: Various R Programming Tools for Plotting Data (2016).
Jolliffe, I. T. Principal Component Analysis, Second Edition. Encycl. Stat. Behav. Sci. 30, 487 (2002).
Wickham, H. ggplot2: Elegant Graphics for Data Analysis. (Springer-Verlag, New York, 2009).
Suzuki, A. et al. DBTSS as an integrative platform for transcriptome, epigenome and genome sequence variation data. Nucleic Acids Res. 43, D87–D91 (2015).
Tibshirani, R. Regression selection and shrinkage via the lasso. Journal of the Royal Statistical Society B 58, 267–288 (1996).
Friedman, J., Hastie, T. & Tibshirani, R. Regularization Paths for Generalized Linear Models via Coordinate Descent. J. Stat. Softw. 33 (2010).
Maaten, vander Visualizing Data using t-SNE. J. Mach. Learn. Res. 9, 2579–2605 (2008).
Zhu, Y., Qiu, P. & Ji, Y. TCGA-Assembler: open-source software for retrieving and processing TCGA data. Nat. Methods 11, 599–600 (2014).
Therneau, T. M. & Grambsch, P. M. Modeling Survival Data: Extending the Cox Model. (Springer New York, 2000). https://doi.org/10.1007/978-1-4757-3294-8.
Acknowledgements
The authors are grateful to K. Imamura, Y. Ishikawa, Y. Kuze, T. Horiuchi, H. Wakaguri and S. Shimazu for technical support. This work was supported by JST CREST Grant Number JPMJCR15G3, Platform Project for Supporting Drug Discovery and Life Science Research (Basis for Supporting Innovative Drug Discovery and Life Science Research (BINDS)) from AMED under Grant Number JP17am0101104, and JSPS KAKENHI Grant Number 16H06279.
Author information
Authors and Affiliations
Contributions
A.S. and Y.S. designed and directed the study. Y.K., M.H., H.M. and M.S. performed the drug response experiments. Y.L. analyzed the lack of estimation study. K.A., T.K., H.T., S.S. and K.T. contributed to scientific discussions for designing the study and writing the manuscript. Y.K., A.S., K.T. and Y.S. wrote the manuscript. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Competing Interests
The authors declare no competing interests.
Additional information
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Electronic supplementary material
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Kashima, Y., Suzuki, A., Liu, Y. et al. Combinatory use of distinct single-cell RNA-seq analytical platforms reveals the heterogeneous transcriptome response. Sci Rep 8, 3482 (2018). https://doi.org/10.1038/s41598-018-21161-y
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-018-21161-y
- Springer Nature Limited