Abstract
Personality traits can offer considerable insight into the biological basis of individual differences. However, existing approaches toward understanding personality across species rely on subjective criteria and limited sets of behavioral readouts, which result in noisy and often inconsistent outcomes. Here we introduce a mathematical framework for describing individual differences along dimensions with maximum consistency and discriminative power. We validate this framework in mice, using data from a system for high-throughput longitudinal monitoring of group-housed male mice that yields a variety of readouts from across the behavioral repertoire of individual animals. We demonstrate a set of stable traits that capture variability in behavior and gene expression in the brain, allowing for better-informed mechanistic investigations into the biology of individual differences.
Data availability
The RNA-seq data for this project have been deposited to the NCBI’s Sequence Read Archive (SRA) under the following accession number: PRJNA542512. The datasets generated during and/or analyzed during the current study are available from the corresponding author upon request.
Code availability
All the code used in the Matlab LDA implementation, including a demonstration of its use on the results from the original cohort of mice (n = 168), is publicly available at the following link: https://orenforkosh.github.io/IdentityDomains/. The color-based video tracking system will be made available upon request. Likewise, the self-similarity tests implemented in Matlab and the R code used in the RNA-seq data analysis will be made available upon reasonable request.
References
Eysenck, H. J. The Structure of Human Personality (Methuen & Co., 1953).
McCrae, R. R. & Costa, P. T. Personality in Adulthood: A Five-Factor Theory Perspective (Guilford Press, 2002).
Shemesh, Y. et al. High-order social interactions in groups of mice. eLife 2, e00759 (2013).
Shemesh, Y. et al. Ucn3 and CRF-R2 in the medial amygdala regulate complex social dynamics. Nat. Neurosci. 19, 1489–1496 (2016).
Shoval, O. et al. Evolutionary trade-offs, Pareto optimality, and the geometry of phenotype space. Science 336, 1157–1160 (2012).
Gallagher, T., Bjorness, T., Greene, R., You, Y. J. & Avery, L. The geometry of locomotive behavioral states in C. elegans. PLoS One 8, e59865 (2013).
Krömer, S. A. et al. Identification of glyoxalase-I as a protein marker in a mouse model of extremes in trait anxiety. J. Neurosci. 25, 4375–4384 (2005).
Butcher, J. N. Minnesota Multiphasic Personality Inventory. in The Corsini Encyclopedia of Psychology (eds Weiner, I. B. & Craighead, W. E.) https://doi.org/10.1002/9780470479216.corpsy0573 (2010).
McCrae, R. R., Costa, P. T., Del Pilar, G. H., Rolland, J.-P. & Parker, W. D. Cross-cultural assessment of the five-factor model. J. Cross Cult. Psychol. 29, 171–188 (1998).
Triandis, H. C. & Suh, E. M. Cultural influences on personality. Annu. Rev. Psychol. 53, 133–160 (2002).
Rothbart, M. K. Measurement of temperament in infancy. Child Dev. 52, 569–578 (1981).
Hart, Y. et al. Inferring biological tasks using Pareto analysis of high-dimensional data. Nat. Methods 12, 233–235 (2015).
Hinrich, J. L. et al. Archetypal analysis for modeling multisubject fMRI data. IEEE J. Sel. Top. Signal Process. 10, 1160–1171 (2016).
Rabiner, L. R. A tutorial on hidden Markov models and selected applications in speech recognition. Proc. IEEE 77, 257–286 (1989).
Duda, R. O., Hart, P. E. & Stork, D. G. Pattern Classification (Wiley, 2012).
Breiman, L., Friedman, J., Stone, C. J. & Olshen, R. A. Classification and Regression Trees (CRC Press, 1984).
De Vries, H., Stevens, J. M. G. & Vervaecke, H. Measuring and testing the steepness of dominance hierarchies. Anim. Behav. 71, 585–592 (2006).
Leger, M. et al. Object recognition test in mice. Nat. Protoc. 8, 2531–2537 (2013).
Franklin, K. B. J & Paxinos, G. The Mouse Brain in Stereotaxic Coordinates (Academic Press, 1997).
Andrews, S. et al. FastQC: a quality control tool for high throughput sequence data. Babraham Bioinformatics http://www.bioinformatics.babraham.ac.uk/projects/fastqc (2010).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet J. 17, 10–12 (2011).
Bray, N. L., Pimentel, H., Melsted, P. & Pachter, L. Near-optimal probabilistic RNA-seq quantification. Nat. Biotechnol. 34, 525–527 (2016).
Love, M. I., Huber, W. & Anders, S. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol. 15, 550 (2014).
Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47 (2015).
Hoffman, G. E. & Schadt, E. E. variancePartition: interpreting drivers of variation in complex gene expression studies. BMC Bioinformatics 17, 483 (2016).
Acknowledgements
The authors thank N. Eren, I. Couzin and C. Wotjak for their assistance, advice and constructive criticism. They thank M. Engel for her technical assistance with the RNA-seq experiment. Thanks are also given to J. Keverne for professional English editing, formatting and scientific input. Their thanks also go to O. Maoz for his unique insights into the mathematics and their interpretation. Finally, the authors would like to extend special thanks to the recently passed Chaya Tannor for fascinating discussions on human personality. A.C. receives financial support from serving as the Vera and John Schwartz Family Professorial Chair at the Weizmann Institute and as the head of the Max Planck Society–Weizmann Institute of Science Laboratory for Experimental Neuropsychiatry and Behavioral Neurogenetics. This work is supported by the following grants and agencies (to A.C.): an FP7 Grant from the European Research Council (260463); the Israel Science Foundation (1565/15); the ERANET Program; the Chief Scientist Office of the Israeli Ministry of Health; the Federal Ministry of Education and Research (01KU1501A); Roberto and Renata Ruhman; Bruno and Simone Licht; the I-CORE Program of the Planning and Budgeting Committee and The Israel Science Foundation (grant no. 1916/12); the Nella and Leon Benoziyo Center for Neurological Diseases; the Henry Chanoch Krenter Institute for Biomedical Imaging and Genomics; the Perlman Family Foundation, founded by Louis L. and Anita M. Perlman; the Adelis Foundation; the Marc Besen and the Pratt Foundation; and the Irving I. Moskowitz Foundation. S.K. is supported by the International Max Planck Research School for Translational Psychiatry (IMPRS-TP).
Author information
Contributions
O.F. and S.K. designed the experiments, analyzed the results and wrote the manuscript. C.T. contributed to the design of the behavioral experiments. M.N., C.F. and P.M.K. assisted in experiments. S.R. performed the preprocessing of the RNA-seq data and contributed to the final analyses. U.A., S.A. and Y.S. contributed to the manuscript. A.C. supervised and supported the project.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Peer review information Nature Neuroscience thanks Ann Kennedy and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Integrated supplementary information
Supplementary Figure 1 Individual differences and consistencies.
(a) Behavioral readout structure. Hierarchical clustering and cross-correlations of the 60 behavioral readouts for n = 168 mice. Behavioral readouts tend to cluster based on whether they are independent (related to one mouse) or pairwise (derived from the locations of two mice). (b) Some behavioral parameters were consistent within individuals over time (for example, contact rate), some could discriminate between individuals (for example, number of chases: F(1,643) = 43.7, p = 7.8 × 10^-11; time near food or water: F(1,668) = 136.9, p = 6.7 × 10^-29), while others could discriminate between different times (for example, contact duration: F(1,567) = 74.9, p = 5.1 × 10^-17). Several parameters satisfied both conditions (for example, mean speed: interaction effect, F(1,658) = 3.9, p = 4.9 × 10^-2; wall distance: identity effect, F(1,662) = 7.5, p = 6.3 × 10^-3; day effect, F(1,662) = 5.9, p = 1.5 × 10^-2). ***p < 0.001, *p < 0.05; all tests were performed on n = 168 individuals using a two-way ANOVA. In the box plots, boxes represent the 25%, 50% (median) and 75% quantiles, and whiskers span from minima to maxima.
Supplementary Figure 2 Between-within variability ratio.
Identity domain (ID) components ranked by their Fisher-Rao coefficient. Four components had a Fisher-Rao score below 4, indicating a greater contribution of between-individual than within-individual variability.
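As a rough illustration of a between/within variability ratio of the kind this figure ranks components by, the sketch below computes the ratio of between-individual to within-individual sums of squares for one-dimensional scores. It is not the authors' Matlab implementation: the function name `fisher_rao_ratio`, the toy data and the choice of unnormalized sums of squares (no degrees-of-freedom correction) are all assumptions made for illustration.

```python
from collections import defaultdict


def fisher_rao_ratio(scores, ids):
    """Between-individual over within-individual sum of squares.

    scores: one score per observation (e.g. one component score per day).
    ids:    the individual each observation belongs to.
    Assumption: unnormalized sums of squares; the published analysis
    may scale these terms differently.
    """
    groups = defaultdict(list)
    for score, ind in zip(scores, ids):
        groups[ind].append(float(score))

    n_total = sum(len(values) for values in groups.values())
    grand_mean = sum(sum(values) for values in groups.values()) / n_total

    between = 0.0
    within = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        between += len(values) * (mean - grand_mean) ** 2
        within += sum((x - mean) ** 2 for x in values)

    # Undefined when every individual is perfectly constant (within == 0).
    return between / within


# Hypothetical toy data: two mice, three daily scores each.
ratio = fisher_rao_ratio(
    [1.0, 1.2, 0.8, 5.0, 5.2, 4.8],
    ["m1", "m1", "m1", "m2", "m2", "m2"],
)
```

A large ratio means that individuals differ from one another much more than each individual varies from day to day, which is the property a discriminant component is chosen to maximize.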
Supplementary Figure 3 Validation of the identity domains (IDs) in a second dataset from a different setup.
(a) Alternative social arena (50 x 70 cm) with different locations and types of objects compared with the arena shown in Fig. 1b. (b) IDs 1-4 show intermediate to strong correlations between the original and replication datasets (ρ denotes the Pearson correlation coefficient between the sets). Each point represents the score of a mouse tested in the original setup using either the original or alternative projection matrix. Alternative IDs were computed using a projection matrix estimated from the behaviors of mice in the alternative setup (n = 208 individuals). In order to make use of the same projection matrix, the number of behaviors in the original groups was restricted to the same 37 that could be collected from the alternative setup.
Supplementary Figure 4 Identity domain (ID) stability over a short timescale.
IDs were stable over experimental time, such that average ID scores for experimental days 1 through 4 could predict the corresponding scores for each animal on day 5.
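One simple way to quantify this kind of short-term stability is to correlate each animal's mean score over days 1 through 4 with its score on day 5. The sketch below does this with a Pearson correlation; the caption does not state which measure of prediction was used, so the choice of Pearson correlation, the function names and the data layout (five daily scores per mouse) are assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def early_vs_last_day(scores_by_mouse):
    """Correlate the day 1-4 mean with the day 5 score across mice.

    scores_by_mouse: one list of five daily scores per mouse
    (hypothetical layout, chosen for this illustration).
    """
    early = [sum(days[:4]) / 4 for days in scores_by_mouse]
    last = [days[4] for days in scores_by_mouse]
    return pearson(early, last)


# Hypothetical toy data: three mice whose day 5 score tracks their earlier mean.
r = early_vs_last_day([
    [1, 1, 1, 1, 2],
    [2, 2, 2, 2, 4],
    [3, 3, 3, 3, 6],
])
```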
Supplementary Figure 5 ID score change over time with respect to self or others.
ID stability during aging was tested by comparing the ID scores of individuals measured once as juveniles (4-5 weeks old) and once more during adulthood (15-16 weeks old). Depicted here is the change in ID scores relative to one’s own initial score versus relative to the scores of all other individuals (p-values were computed using a one-sided permutation test; n = 32 individuals). Points in the shaded region represent greater individual changes, whereas points in the unshaded region represent changes that were larger relative to other individuals than to oneself.
Supplementary Figure 6 Group shuffle diagram.
Mice were observed in the social boxes over 4 days and re-grouped on day 5 such that no mouse was familiar with any of its new conspecifics (n = 64, 16 groups).
Supplementary Figure 7 Principal component analysis (PCA) on the initial set of behaviors.
In order to compare how LDA performs relative to a better-known and more commonly used dimensionality reduction method, PCA was performed on the same initial dataset as used to generate the IDs. (a) The percent variance of the behavioral data explained by each principal component (PC). (b) Correlations between scores on each PC and an abbreviated list of behavioral readouts. (c) As with the IDs, the stability of PC scores was tested before and after mixing the mouse groups such that all individuals were unfamiliar to one another. Only the first principal component remained stable after the mix (one-sided permutation test, n = 64 individuals). (d) Scores on the first four PCs were used as predictors of transcriptomic variance in RNA-sequencing data from three different brain regions. This analysis directly mimicked the equivalent analysis performed using the four IDs (PC scores from day 1, 200 shuffled PC score sets; randomization test with n = 32 individuals). The top four PCs did not carry more overall transcriptomic information than would be expected by chance.
Supplementary Figure 8 High-anxiety (HAB) versus normal-anxiety (NAB) mouse model.
(a) Selective breeding for high versus normal anxiety-like behavior levels (HAB/NAB) was performed for more than 40 generations, starting with outbred CD-1 mice (ref. 7). Selection was based on results of the elevated plus-maze test (% time in the open arm). After the animals of each respective genotype were weaned, they were mixed into groups of three NABs and one HAB each. (b) The power of the identity domain (ID) scores to detect genotype was tested directly using the area under the receiver operating characteristic curve of a model predicting genotype based on IDs 1-4. The area under the curve of this model was compared against a distribution created from 200 trials with shuffled ID scores.
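The shuffle comparison in panel (b) can be sketched as follows. This is a simplified illustration, not the published analysis: it assumes a single predicted score per animal (standing in for the output of a model built on IDs 1-4), computes the area under the ROC curve with the rank-based (Mann-Whitney) formula, and builds a null distribution by shuffling the scores 200 times. The function names, the seed and the toy data are hypothetical.

```python
import random


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparisons; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def shuffle_test(scores, labels, n_shuffles=200, seed=0):
    """Compare the observed AUC with AUCs obtained from shuffled scores."""
    rng = random.Random(seed)
    observed = auc(scores, labels)
    shuffled = list(scores)
    null = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # breaks the link between score and label
        null.append(auc(shuffled, labels))
    # One-sided p-value with the usual +1 correction.
    p_value = (1 + sum(a >= observed for a in null)) / (1 + n_shuffles)
    return observed, null, p_value


# Hypothetical toy data: label 1 marks the high-anxiety animals.
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.8, 0.9]
toy_labels = [0, 0, 0, 0, 1, 1]
observed, null, p_value = shuffle_test(toy_scores, toy_labels)
```

An observed AUC that sits in the upper tail of the shuffled distribution suggests that the scores carry information about genotype beyond what chance assignment would produce.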
Supplementary Figure 9 Correlations between identity domains (IDs) and their contributing behavioral readouts.
The readouts are separated into individual (based on the movements of a single mouse) and pairwise (based on the movements of a mouse and one other member of its group).
Supplementary information
Supplementary Video 1
Tracking multiple individuals in a semi-naturalistic environment. A representative segment taken from a video recording of the social arena with a group of four fur-dyed mice. Overlaid on the video are illustrations of tracked mouse locations and the layout and components of the social arena.
About this article
Cite this article
Forkosh, O., Karamihalev, S., Roeh, S. et al. Identity domains capture individual differences from across the behavioral repertoire. Nat Neurosci 22, 2023–2028 (2019). https://doi.org/10.1038/s41593-019-0516-y
DOI: https://doi.org/10.1038/s41593-019-0516-y
- Springer Nature America, Inc.