Abstract
This paper focuses on methods to study patterns of collaboration in co-authorship networks at the mesoscopic level. We combine qualitative methods (participant interviews) with quantitative methods (network analysis) and demonstrate the application and value of our approach in a case study comparing three research fields in chemistry. A mesoscopic level of analysis means that, in addition to the basic analytic unit of the individual researcher as a node in a co-author network, we base our analysis on the observed modular structure of co-author networks. We interpret the clustering of authors into groups as bibliometric footprints of the basic collective units of knowledge production in a research specialty. We find two types of coauthor-linking patterns between author clusters that we interpret as representing two different forms of cooperative behavior: transfer-type connections due to career migrations or one-off services rendered, and stronger, dedicated inter-group collaboration. Hence the generic coauthor network of a research specialty can be understood as the overlay of two distinct types of cooperative networks between groups of authors publishing in that specialty. We show how our analytic approach exposes field-specific differences in the social organization of research.
Notes
We call these ‘seed’ groups as we regard them as entry points into scientific communities, and plan to extend our field studies following links (of cooperation or competition) of these seed labs.
It is worth noting that these specialties do not fall squarely into single sub-disciplines but typically unite the efforts of researchers who identify with different sub-disciplines. For field B, which we label here as ‘synthetic chemistry’, these are mainly organic chemists, inorganic and organo-metallic chemists as well as polymer chemists. For field A, which we label here as belonging to the sub-discipline ‘physical chemistry’, these are physical chemists as well as experimental and theoretical physicists, often with a background in atomic and molecular physics, but also in nuclear physics. For field C these are indeed mostly physical chemists.
When comparing this number to other co-author networks in the literature, remember that this number is calculated for a reduced co-author network (after excluding one-time authors). Hence this number will overestimate the relative size of the giant component for the unreduced network.
Available from Martin Rosvall’s home page at http://www.tp.umu.se/~rosvall/code.html.
Our field studies confirm that in areas of synthetic chemistry the task of having to find someone with the instrumental equipment to conduct certain measurements on your sample is very common.
Institutional Review Board, http://en.wikipedia.org/wiki/Institutional_review_board.
References
Acedo, F., Barroso, C., Casanueva, C., & Galán, J. (2006). Co-authorship in management and organizational studies: An empirical and network analysis. Journal of Management Studies, 43(5), 957.
Adams, J., Gurney, K., & Jackson, L. (2008). Calibrating the zoom—A test of Zitt’s hypothesis. Scientometrics, 75(1), 81–95.
Althouse, B., West, J., Bergstrom, T., & Bergstrom, C. (2008). Differences in impact factor across fields and over time. arXiv preprint arXiv:0804.3116.
Barabasi, A., Jeong, H., Neda, Z., et al. (2002). Evolution of the social network of scientific collaborations. Physica A: Statistical Mechanics and its Applications, 311(3–4), 590–614.
Bassecoulard, E., Lelu, A., & Zitt, M. (2007). A modular sequence of retrieval procedures to delineate a scientific field: From vocabulary to citations and back. In Proceedings of 11th international conference on scientometrics and informetrics (ISSI 2007), Madrid, Spain, pp. 25 ff.
Batagelj, V., & Mrvar, A. (2003). Analysis and visualization of large networks. In M. Juenger & P. Mutzel (Eds.), Graph drawing software (pp. 77–103). Berlin: Springer.
Caruana, R., Elhawary, M., Nguyen, N., & Smith, C. (2006). Meta clustering. In Proceedings of the sixth international conference on data mining (ICDM’06).
Crane, D. (1972). Invisible colleges: Diffusion of knowledge in scientific communities. Chicago: The University of Chicago Press.
Fortunato, S., & Castellano, C. (2007). Community structure in graphs. arxiv: 0712.271.
Freeman, L. C. (1978). Centrality in social networks—Conceptual clarification. Social Networks, 1, 215–239.
Fry, J., & Talja, S. (2007). The intellectual and social organization of academic fields and the shaping of digital resources. Journal of Information Science, 33(2), 115.
Glänzel, W., & de Lange, C. (1997). Modelling and measuring multilateral co-authorship in international scientific collaboration. Part II. A comparative study on the extent and change of international scientific collaboration links. Scientometrics, 40(3), 605–626.
Glänzel, W., & de Lange, C. (2002). A distributional approach to multinationality measures of international scientific collaboration. Scientometrics, 54(1), 75–89.
Guimera, R., Sales-Pardo, M., & Amaral, L. (2007). Classes of complex networks defined by role-to-role connectivity profiles. Nature Physics, 3(1), 63–69.
Knorr Cetina, K. (1999). Epistemic cultures—How the sciences make knowledge. Cambridge: Harvard University Press.
Kretschmer, H. (1994). Coauthorship networks of invisible colleges and institutionalized communities. Scientometrics, 30(1), 363–369.
Kretschmer, H. (2004). Author productivity and geodesic distance in bibliographic co-authorship networks, and visibility on the Web. Scientometrics, 60(3), 409–420.
Leclerc, M., & Gagné, J. (1994). International scientific cooperation: The continentalization of science. Scientometrics, 31(3), 261–292.
Leydesdorff, L., & Wagner, C. (2008). International collaboration in science and the formation of a core group. Journal of Informetrics, 2(4), 317–325.
Liberman, S., & Wolf, K. B. (1998). Bonding number in scientific disciplines. Social Networks, 20(3), 239–246.
Lievrouw, L. A. (1990). Reconciling structure and process in the study of scholarly communication. In C. L. Borgman (Ed.), Scholarly communication and bibliometrics. London: Sage Publications.
Liu, X., Bollen, J., Nelson, M., & Van de Sompel, H. (2005). Co-authorship networks in the digital library research community. Information Processing and Management, 41(6), 1462–1480.
Moed, H., Burger, W., Frankfort, J., & Van Raan, A. (1985). The application of bibliometric indicators: Important field-and time-dependent factors to be considered. Scientometrics, 8(3), 177.
Mogoutov, A., & Kahane, B. (2007). Data search strategy for science and technology emergence: A scalable and evolutionary query for nanotechnology tracking. Research Policy, 36(6), 893–903.
Morris, S. (2005). Manifestation of emerging specialties in journal literature: A growth model of papers, references, exemplars, bibliographic coupling, cocitation, and clustering coefficient distribution. Journal of the American Society for Information Science and Technology, 56(12), 1250–1273.
Morris, S., Goldstein, M., & Deyong, C. (2007). Manifestation of research teams in journal literature: A growth model of papers, authors, collaboration, coauthorship, weak ties, and Lotka’s law. Journal of the American Society for Information Science and Technology, 58(12), 1764–1782.
Nepusz, T., Petróczi, A., Négyessy, L., & Bazsó, F. (2008). Fuzzy communities and the concept of bridgeness in complex networks. Physical Review E, 77(1), 16107.
Newman, M. (2001a). The structure of scientific collaboration networks. PNAS, 98(2), 404–409.
Newman, M. (2001b). Scientific collaboration networks. I. Network construction and fundamental results. Physical Review E, 64(1), 16131.
Newman, M. (2004). Fast algorithm for detecting community structure in networks. Physical Review E, 69, 066133.
Palla, G., Derenyi, I., Farkas, I., & Vicsek, T. (2005). Uncovering the overlapping community structure of complex networks in nature and society. Nature, 435(7043), 814–818.
Radicchi, F., et al. (2004). Defining and identifying communities in networks. PNAS, 101(9), 2658–2663.
Rosvall, M., & Bergstrom, C. (2007). An information-theoretic framework for resolving community structure in complex networks. PNAS, 104(18), 7327.
Schaeffer, S. (2007). Graph clustering. Computer Science Review, 1(1), 27–64.
Seglen, P., & Aksnes, D. (2000). Scientific productivity and group size: A bibliometric analysis of Norwegian microbiological research. Scientometrics, 49(1), 125–143.
Sen, P. (2006). Complexities of social networks: A physicist’s perspective. arXiv preprint physics/0605072.
Snyder, H., & Bonzi, S. (1998). Patterns of self-citation across disciplines (1980–1989). Journal of Information Science, 24(6), 431.
Wagner, C. S., & Leydesdorff, L. (2005). Network structure, self-organization, and the growth of international collaboration in science. Research Policy, 34, 1608–1618.
Whitley, R. (2000). The intellectual and social organization of the sciences. Oxford: Clarendon Press.
Zitt, M., Bassecoulard, E., & Okubo, Y. (2000). Shadows of the past in international cooperation: Collaboration profiles of the top five producers of science. Scientometrics, 47(3), 627–657.
Zitt, M., Ramanana-Rahary, S., & Bassecoulard, E. (2005). Relativity of citation performance and excellence measures: From cross-field to cross-scale effects of field-normalisation. Scientometrics, 63(2), 373–401.
Zuccala, A. (2006). Modeling the invisible college. Journal of the American Society for Information Science and Technology, 57(2), 152–168.
Acknowledgments
We are indebted to our field study participants. Further, this research has been made possible through financial support by the National Science Foundation through grants IIS-738543 SGER: Advancing the State of eChemistry, DUE-0840744 NSDL Technical Network Services: A Cyberinfrastructure Platform for STEM Education, and NSF award 0404553. Support also came from Microsoft Corporation for the project ORE-based eChemistry. We are grateful to those that make our work so much more effective by making neat tools and algorithms available on the Web, such as Martin Rosvall (infomap clustering code), Vladimir Batagelj and Andrej Mrvar (pajek), Michael Weseman (plot), and Peter Mcaster (OmniGraffle extensions for pie charts).
Appendices
Appendix 1: Data
Guided by our informants, we developed lexical queries to retrieve from Web of Science the literature covering a specific specialty. Some queries consist of a single term, others are 2–3 lines long and use Boolean operators. Our informants validated the data sets, checking whether crucial authors whom they expected to see were actually represented, and whether out-of-scope topics had been accidentally introduced. The data was then further processed to eliminate articles with only one author or unknown authors (indicated by ‘Anon’ in the WoS records), or with only a single reference, as the latter records typically do not refer to original research articles or reviews.
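The record filter described above can be sketched as follows. This is a minimal illustration, not the processing code used in the study: the record layout (a dict with an `authors` list and an `n_refs` count) and the function name `keep_record` are hypothetical, and treating records with zero references like records with a single reference is an assumption.

```python
def keep_record(record):
    """Return True if a bibliographic record passes the three filters.

    `record` is a hypothetical dict: {"authors": [str, ...], "n_refs": int}.
    """
    authors = record["authors"]
    # Drop single-author (or author-less) records: they add no co-author links.
    if len(authors) < 2:
        return False
    # Drop records with unknown authors, marked 'Anon' in the source data.
    if any(name.strip().lower() == "anon" for name in authors):
        return False
    # Drop records with a single reference (assumption: zero references too).
    if record["n_refs"] <= 1:
        return False
    return True


records = [
    {"authors": ["Smith, J", "Lee, K"], "n_refs": 25},
    {"authors": ["Smith, J"], "n_refs": 30},
    {"authors": ["Anon", "Lee, K"], "n_refs": 12},
    {"authors": ["Smith, J", "Lee, K"], "n_refs": 1},
]
filtered = [r for r in records if keep_record(r)]
```

Only the first sample record survives the filter; the other three each fail one of the criteria.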
This study is part of a larger dissertation research project. To ensure the anonymity of the human participants in the field study part of this research, in accordance with the IRB-approved protocol (see footnote 6) for the dissertation research, we cannot disclose the lexical queries that would provide access to the exact data sets discussed in this report. If access to the networks built from the data used in this study is needed for a particular research or validation purpose, we are happy to work out a solution that retains the anonymity of our study participants.
Data field A
Time span: 1991–2008, number of records: 53,947 (retrieved on 22 Jun 2009)
Data field B
Time span: 1991–2008, number of records: 12,817 (retrieved on 14 Nov 2008)
Data field C
Time span: 1987–2008, number of records: 30,636 (retrieved on 17 Dec 2008).
We derive the geographical affiliation of a cluster at continent level from the country affiliation listed in the Web of Science record for each publication. We represent each cluster by all the publications that any of its authors has co-authored. Then we determine the country affiliation that is most often listed for papers published by cluster authors. If the second-placed country is listed at least 50% as many times as the most often listed country, and if these two countries belong to different continents, we assign a mixed, two-continent geographical affiliation.
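The assignment rule can be illustrated with a short sketch. It rests on several assumptions that the text does not specify: each country is counted once per paper, the country-to-continent lookup `CONTINENT` is a hypothetical toy table, and the function name `cluster_affiliation` is ours.

```python
from collections import Counter

# Hypothetical lookup table; a real analysis would need a complete mapping.
CONTINENT = {
    "USA": "North America",
    "Germany": "Europe",
    "France": "Europe",
    "Japan": "Asia",
}


def cluster_affiliation(paper_countries, continent_of=CONTINENT, ratio=0.5):
    """Return a tuple with one or two continents for a cluster.

    `paper_countries` holds one list of countries per paper of the cluster.
    """
    # Count each country at most once per paper (assumption).
    counts = Counter(c for countries in paper_countries for c in set(countries))
    ranked = counts.most_common(2)
    if not ranked:
        return None
    top_country, top_count = ranked[0]
    first = continent_of[top_country]
    if len(ranked) == 2:
        second_country, second_count = ranked[1]
        second = continent_of[second_country]
        # Mixed affiliation: runner-up reaches 50% of the leader's count
        # and lies on a different continent.
        if second_count >= ratio * top_count and second != first:
            return tuple(sorted((first, second)))
    return (first,)
```

For example, a cluster whose papers list the USA three times and Germany twice would receive the mixed affiliation `("Europe", "North America")` under these assumptions, since 2 is at least 50% of 3.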
Appendix 2: Methods
In Guimera et al. (2007) each node of a clustered network was assigned to one out of seven node roles. The seven node roles are defined by the values of two parameters that quantify how a node is connected to the other nodes in its cluster and how the links going outside the cluster are distributed.
The first parameter, z, is the within-cluster degree of a node, normalized by subtracting the average within-cluster degree of the nodes in its cluster and then dividing the result by the standard deviation. Nodes with z < 2.5 are called non-hubs, while nodes with z ≥ 2.5 are called hubs. In other words, hubs are nodes whose connectivity inside their own cluster lies at least 2.5 standard deviations above the cluster average.
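As a minimal sketch of this normalization, the following function computes z for every node from an edge list and a node-to-cluster mapping. The function name, the input layout, the use of the population standard deviation, and the convention z = 0 for clusters with zero spread are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def within_cluster_z(edges, cluster_of):
    """Map each node to its normalized within-cluster degree z.

    `edges` is an iterable of (u, v) pairs; `cluster_of` maps node -> cluster id.
    """
    # Within-cluster degree: count only links whose endpoints share a cluster.
    internal = {node: 0 for node in cluster_of}
    for u, v in edges:
        if cluster_of[u] == cluster_of[v]:
            internal[u] += 1
            internal[v] += 1

    # Group nodes by cluster so that mean and deviation are taken per cluster.
    members = {}
    for node, cluster in cluster_of.items():
        members.setdefault(cluster, []).append(node)

    z = {}
    for nodes in members.values():
        degrees = [internal[n] for n in nodes]
        mu = mean(degrees)
        sigma = pstdev(degrees)  # population standard deviation (assumption)
        for n in nodes:
            # Convention (assumption): z = 0 when all degrees are equal.
            z[n] = (internal[n] - mu) / sigma if sigma > 0 else 0.0
    return z
```

In a five-node star cluster, the center has within-cluster degree 4 and the leaves degree 1, which gives a mean of 1.6, a standard deviation of 1.2, and therefore z = 2.0 for the center: well connected, but still below the hub threshold of 2.5.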
The second parameter is the participation coefficient P that quantifies how a node distributes its outside links among the clusters. For a node i in a network of N clusters it is defined as follows:

$$P_i = 1 - \sum_{s=1}^{N} \left( \frac{k_{is}}{k_i} \right)^2$$

where s = 1, …, N indexes the clusters, $k_{is}$ is the number of links of node i to nodes in cluster s, and $k_i$ is the total degree of node i.
From this definition it follows that P is normalized for each node so that its value lies between 0 and 1. A node whose links all fall within a single cluster has P = 0, whereas a node whose links are spread evenly over many clusters has a P close to 1.
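The behavior of P at its extremes is easy to check numerically. The sketch below computes P for a single node from the list of its neighbors; the function name, the input layout, and the convention P = 0 for isolated nodes are assumptions for illustration.

```python
from collections import Counter


def participation_coefficient(neighbors, cluster_of):
    """Compute P = 1 - sum over clusters s of (k_is / k_i)^2 for one node.

    `neighbors` lists the node's neighbors; `cluster_of` maps node -> cluster id.
    """
    k_i = len(neighbors)  # total degree of the node
    if k_i == 0:
        return 0.0  # convention for isolated nodes (assumption)
    # k_is: number of links from the node into each cluster s.
    links_per_cluster = Counter(cluster_of[n] for n in neighbors)
    return 1.0 - sum((k_is / k_i) ** 2 for k_is in links_per_cluster.values())


cluster_of = {"b": 0, "c": 0, "d": 1, "e": 1, "f": 2, "g": 3}

p_one_cluster = participation_coefficient(["b", "c"], cluster_of)
p_two_clusters = participation_coefficient(["b", "c", "d", "e"], cluster_of)
p_four_clusters = participation_coefficient(["b", "d", "f", "g"], cluster_of)
```

With all links in one cluster the result is 0; with links split evenly over two clusters it is 0.5; with one link into each of four clusters it is 0.75. In general, links spread evenly over m clusters give P = 1 − 1/m.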
Non-hubs are classified into 4 types based on their P values.
- P ≤ 0.05: Ultra-peripheral (R1) with its connections restricted to its own cluster
- 0.05 < P ≤ 0.62: Peripheral (R2) with limited outside connectivity
- 0.62 < P ≤ 0.8: Satellite connector (R3) with good connectivity with a number of outside clusters
- 0.8 < P: Kinless (R4) with outside connectivity distributed evenly among all clusters
Similarly, hubs are categorized into 3 types based on their P values.
- P ≤ 0.30: Provincial hub (R5) that is well-connected only inside its own cluster
- 0.30 < P ≤ 0.75: Connector hub (R6) that connects to many clusters
- 0.75 < P: Global hub (R7) that connects to most clusters
We have used the same set of threshold values to assign roles to nodes.
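The threshold scheme above translates directly into a lookup function. The following is a minimal sketch using the thresholds listed in this appendix; the function name `assign_role` and the string labels are our own choices for illustration.

```python
def assign_role(z, p):
    """Return the role label R1..R7 for a node with parameters z and P."""
    if z < 2.5:
        # Non-hubs: four roles by participation coefficient.
        if p <= 0.05:
            return "R1"  # ultra-peripheral
        if p <= 0.62:
            return "R2"  # peripheral
        if p <= 0.80:
            return "R3"  # satellite connector
        return "R4"      # kinless
    # Hubs (z >= 2.5): three roles by participation coefficient.
    if p <= 0.30:
        return "R5"      # provincial hub
    if p <= 0.75:
        return "R6"      # connector hub
    return "R7"          # global hub
```

For instance, a node with z = 3.0 and P = 0.5 falls into R6 (connector hub), while a node with the same P but z = 1.0 falls into R2 (peripheral).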
Cite this article
Velden, T., Haque, Au. & Lagoze, C. A new approach to analyzing patterns of collaboration in co-authorship networks: mesoscopic analysis and interpretation. Scientometrics 85, 219–242 (2010). https://doi.org/10.1007/s11192-010-0224-6