Junhyong Kim

Dr. Junhyong Kim

Christopher H. Browne Distinguished Professor of Biology

Co-Director, Penn Program in Single Cell Biology
Secondary Professor, Computer and Information Science


304G Lynch Laboratory



Ellison Medical Foundation Senior Scholar Award in Aging

John Simon Guggenheim Fellow

Yale Senior Faculty Research Excellence Award

Sloan Foundation Young Investigator Award


Ph.D., SUNY at Stony Brook, 1992

Research Interests

My lab works at the interface of Genomics, Computational/Mathematical Biology, and Evolution. We employ both quantitative modeling and experimental methods to understand the fundamental mechanisms that govern dynamical processes in an organism and the evolution of such processes. We employ a wide variety of systems and collaborate broadly with colleagues from different disciplines. Some of the projects being carried out in my lab are:

Single cell functional genomics: Recent advances in genomic measurements have revealed surprising diversity in individual cell states at both DNA and RNA levels. My lab has been engaged in a close collaboration with Dr. Jim Eberwine (Penn) to characterize the transcriptome at the level of individual cells. A concept that we have advocated is that the cells of a multicellular individual should not be seen as uniform functional units of a tissue; rather, tissues should be seen as functionally coherent assemblies arising from ecologies of cells. That is, an organ might be seen as a manifestation of communities of cells, whose ecological interactions characterize the system-level phenotype. If this is an organizing principle across species, then the characterization of single cells, their diversity and ecology, will be an imperative for understanding the multicellular individual. Jim and I previously published some of the first single cell whole transcriptome measurements using microarray technologies and also investigations of the cell phenotype using direct transfers of whole transcriptomes.


In a joint project with the Eberwine lab supported by a Roadmap grant from NIMH, we are carrying out single cell transcriptome analysis of human brain cells and human cardiomyocytes. The goal is to characterize the single cell diversity of electrically excitable human cells and follow up with functional studies of expression manipulation in ex vivo conditions. The long-term goal of this project is to characterize the full diversity of human brain cells and the role of individual variation in establishing whole brain system-level function. We previously utilized whole RNA phototransfection, in a technique we called TIPeR, to effect a phenotype change from neurons to astrocytes and from fibroblasts to cardiomyocytes. We are applying these techniques in this project to explore the function of cells at the level of single cell manipulations.

In another project, supported by the Ellison Medical Foundation, we are asking why some cells show aging or degeneration while others seem normal. Our idea is that all cells have natural single cell variation in their molecular states due to both physiological dynamics and functionally neutral expression drift. However, when the cell’s environment changes, say by aging or plaques, some of the neutral variation may become deleterious, leading to heterogeneous degeneration. To address this question, we are carrying out single cell RNAseq, single molecule in situ, and phenotypic characterization of cells under stress.

Through direct work in my lab and through collaborations, we have obtained single cell transcriptome data from mice, rats, Drosophila, zebrafish, C. elegans, and Planaria, amassing more than 800 single cell transcriptome datasets. These datasets are ideal for comparative studies and for developing models of transcriptome evolution at the single cell level. Fig. A shows a 3D projection of the single cell transcriptomes of five different mouse cell types, showing large-scale structured variability within a cell type. We are currently working on a control theory model to explain the single cell variability of each cell type as well as the dynamics that govern stable sets between cell types. We are especially interested in mechanisms controlling low-expressed genes in individual cells. We are also collecting data from different developmental time points across multiple mammalian species to understand how the single cell variation interacts with tissue-level function and differentiation decisions.

Evolution of mammalian neurons: We currently have two related projects in which we are investigating the evolution of neuronal function in mice and rats, with plans to expand to other rodent species. In the first project, we are dissecting the mechanism of RNA localization in hippocampal neurons and the evolutionary divergence of the set of dendritic RNAs between mice and rats. Again with the Eberwine lab, we previously discovered a novel role in rats for a SINE element called the ID element. We found that these elements confer dendritic localization when inserted into certain transcripts. Interestingly, the localization mechanism does not seem to be conserved in mice (Fig. B). We are investigating the subsequent divergence of the dendritic transcriptome and potential functional consequences. In a related project, we found a novel form of functional non-coding RNA consisting of intronic sequences that are retained in a minor isoform of RNA. These retained intronic sequences found in the cytoplasm have been shown to affect neuronal function. We have used RNAseq of micro-dissected dendrites to characterize the extent of retained introns and found a surprisingly broad distribution of retained introns throughout the transcriptome. These intron-retaining forms seem to be minor isoforms comprising 5-10% of the transcripts of a locus and also seem evolutionarily labile in their distribution across species and strains. We are now starting a new project to understand the evolutionary dynamics of these retained introns and how they might contribute to the evolution of neuronal function.

Computational Algorithms and Quantitative Biology: Over the last 25 years, my lab has been actively engaged in developing novel algorithms, statistical analyses, and quantitative models of biological data. We have developed novel algorithms for evolutionary tree estimation (e.g., horizontal transfer estimates), transmembrane protein annotation (used to clone Drosophila odorant receptors), gene expression networks (First-Order Conditional Independence network), gene expression temporal ordering (graph-theoretic principal curves), and time-series visualization, among others. In particular, we have been interested in developing novel tools to help understand the RNA biology of single cells and have developed methods for miRNA annotation, RNA structure classification without folding, RNA isoform detection, etc. We are constantly engaged in method development, but our efforts are motivated by our biological work. For example, to analyze transcriptome data from my lab, we have developed methods for annotating SNPs from next-generation sequence data, using temporal patterns for gene regulatory module estimation, using minimal transcription factor networks to estimate cell determination factors, and many other techniques. Each of these methods was developed to meet a research need in the lab.

I have also been interested in applying geometric techniques and frameworks for biological modeling and data analysis. For example, I helped to introduce the idea of using algebraic geometric concepts (Groebner basis and algebraic varieties) to the analysis of graph-theoretic biological data models. Using similar analyses, we’ve also analyzed gene expression data for effective dimensions of variation. In particular, motivated by our observations in genomics data, I have been developing new ideas about mathematical models of system-level function and the evolution of system-level function. Fig. C shows a projection of whole transcriptome data from three different strains of yeast, measured at two different stages of the cell cycle for responses to temperature perturbations. A striking feature of this dataset is that while the number and kinds of genes that respond to temperature change are different in each strain/stage, the direction and pattern of response to temperature are quantitatively similar. Analysis of the major direction of variation in response to temperature suggests correlated axes separated by angles of 9 to 20 degrees. This suggests that there is a coordinated system-level control that is not based on the identity of individual genes but on a more global structure. We are developing a new geometric projection model for the control effects on genes and the evolutionary dynamics of such system-level control processes.
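To make the angle comparison described above concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the analysis actually used in the lab: it assumes all strains/stages are measured on a shared set of genes, and it swaps in a simple per-gene difference of mean expression between two temperatures as the "response direction" rather than a principal-axis estimate. The function names and the toy numbers are hypothetical.

```python
import math


def mean_profile(samples):
    """Per-gene mean over a list of equal-length expression vectors."""
    n = len(samples)
    return [sum(column) / n for column in zip(*samples)]


def response_vector(low_temp, high_temp):
    """Per-gene change in mean expression from low to high temperature."""
    low = mean_profile(low_temp)
    high = mean_profile(high_temp)
    return [h - l for l, h in zip(low, high)]


def angle_degrees(u, v):
    """Angle between two response vectors in the shared gene space."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("response vector has zero length")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))


if __name__ == "__main__":
    # Toy, made-up values for three genes and two replicates (not real data).
    strain_a = response_vector(
        [[1.0, 2.0, 3.0], [1.2, 2.1, 2.9]],  # low temperature
        [[2.0, 2.0, 1.0], [2.2, 1.9, 1.1]],  # high temperature
    )
    strain_b = response_vector(
        [[0.5, 1.0, 4.0], [0.7, 1.1, 3.8]],
        [[1.4, 1.2, 2.1], [1.6, 1.0, 2.3]],
    )
    print(angle_degrees(strain_a, strain_b))
```

A small angle between two such vectors means the two strains/stages shift expression in a similar direction, even if different individual genes dominate each response.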




Courses Taught

Introduction to Computational Biology and Biological Modeling (BIOL 437)

Selected Publications

Daugharty, E., Goodman, A., and J. Kim 2012. Pervasive antisense transcription is conserved in budding yeast. Mol. Biol. Evol., doi:10.1093/molbev/mss240.

Eberwine, J., Lovatt, D., Buckley, P., Dueck, H., Francis, C., Kim, T.K., Lee, J., Lee, M., Miyashiro, K., Morris, J., Peritz, T., Schochet, T., Spaethling, J., Sul, J.-Y., and J. Kim 2012. Quantitative biology of Single Cells. J. Roy. Soc. Interface, doi:10.1098/rsif.2012.0417.

Kim, J. and Eberwine, J. 2010. RNA as the state memory of cellular phenotype. Trends in Cell Biology, doi:10.1016/j.tcb.2010.03.003.

Sul, J.-Y., Wu, C.K., Zeng, F., Jochems, J., Lee, M.T., Kim, T.K., Peritz, T., Buckley, P., Cappelleri, D.J., Maronski, M., Kim, M., Kumar, V., Meaney, D., Kim, J., and Eberwine, J. 2009. Transcriptome transfer produces a predictable cellular phenotype. PNAS USA, doi:10.1073/pnas.0902161106.

Rifkin, S. A., Houle, D., Kim, J. and White, K.P. 2005. A mutation accumulation assay reveals extensive capacity for rapid gene expression evolution. Nature, 438: 220-223.

Ge, F., Wang, L.S., Kim, J. 2005. Cobweb of life revealed by genome-scale estimates of horizontal gene transfer. PLOS Biology, 3(10):e316.

Kim, J. Computers are from Mars, Organisms are from Venus: An interrelationship guide to Biology and Computer Science. IEEE Computer, July 2002.

Kim, J. 2000. Slicing hyperdimensional oranges: The geometry of phylogenetic estimation. Mol. Phyl. Evol. 17(1): 58-75.