Background

Bird vocalizations are social signals that serve diverse functions, including mate attraction, territory defense, and social interaction with conspecifics or other species [1]. There are three basic mechanisms by which animals encode information in vocalizations [2]: song and syllable repertoire [3,4,5], frequency parameters, and temporal parameters [2, 6]. The first mechanism is used by open-ended learners among songbirds, which have extremely large song repertoires [4]. The second mechanism encodes information through simple changes in frequency and amplitude within syllables or notes [7, 8], a ubiquitous strategy among vertebrates and many groups of invertebrates. The third mechanism encodes information by changing the temporal distribution of vocalizations, such as temporal characteristics and delivery rate [9,10,11], to express behavioral motivations [6, 12]. Therefore, meaning can be conveyed not only by acoustic communication consisting of diverse types of syllables and elements [13,14,15,16] but also by simple vocalizations: referential alarm calls, for example, can indicate categories of predators or even predators’ behaviors [17,18,19]. For instance, the noisy miner (Manorina melanocephala) emits ‘aerial’ alarm calls (high-frequency) in response to airborne raptors and produces ‘chur’ alarm calls (low-frequency and broad bandwidth) in response to terrestrial or perched raptors [20].

However, little is known about the referential functions of vocal behaviors in Rallidae, which appear to have more stereotyped and simpler vocalizations characterized by smaller repertoire sizes [2, 21]. Rallidae often gather in groups and have complex life history traits, such as breeding displays, alarm behaviors, and agonistic behaviors involving the broadcasting of loud calls during the breeding season, suggesting that the acoustic component of social interactions plays an important role in breeding interactions [6, 22,23,24]. Vocalizations of Rallidae are mainly “calls” that are uttered during courtship, mate attraction, territory guarding, and parent-offspring communication (e.g., when parents lead newly hatched chicks to feeding areas) [22]. Although Rallidae are subject to similar acoustic selective pressures and inhabit the same habitats as other birds with complex vocalizations, how they express complex behavioral motivations using much simpler vocal types remains unclear [2].

A few recent studies have shown that Rallidae can use all three of these basic mechanisms (repertoire size, frequency modulation, and temporal modulation) to encode information. Diverse types of information have been observed to be encoded in the vocal signals of rails [23, 25], crakes [2, 26] and corncrakes [6]. Modulation of the acoustic characteristics of the small vocal repertoire permits various types of information relating to breeding, species recognition and social signaling to be encoded. For example, information carried by the small repertoire of a single call type in petrels plays a role in social interactions, such as burrow defense and female mate choice, and acoustic parameters of energy quartiles, call duration, and syllable or phrase rate encode individual identity [22]. However, these studies seldom considered the extent to which individual variation contributes to the structural differences in vocalizations that encode behavioral motivations.

In this study, we used common coots to study how Rallidae encode social interaction information such as mate attraction, territorial advertisement and individual signatures using single-syllable call types. Coots are good models for studying vocal communication because they have a relatively small repertoire of innate calls, they normally breed in wetlands where visibility is often restricted by dense vegetation, and vocalizations are known to play an important role in their social behavior [27]. They are highly territorial and produce loud advertisement calls consisting of a long series of identical, single-syllable notes throughout the daytime during the breeding season, which indicates the significance of vocal communication for successful breeding [28, 29]. Aggressive behaviors consisting of chasing or fighting accompanied by a long series of loud, identical, single-syllable calls are frequently observed during the breeding season, suggesting that such simple calls encode multiple types of information, such as physical quality (body size) or motivation, and play an essential role in territory defense. Common coot parents produce sharp calls when leading nestlings to search for food, indicating a key role of vocalizations in parent-offspring communication [30].

Although previous work has described the vocal repertoires and displays of the American coot (F. americana) [27], these preliminary studies were descriptive and did not use detailed spectrogram-based acoustic analysis of the complete repertoire to study the diverse behavioral contexts associated with social interactions. Here, we provide a comprehensive overview of how the simple calls of the common coot encode diverse behavioral motivations by considering both vocal structure and the acoustic environment (i.e., natural habitat factors such as vegetation) in which these vocalizations are produced. Specifically, we addressed 2 questions related to the functions of acoustic signaling: (1) Are different call types used in different behavioral contexts, such as aggression, courtship, foraging, or parent-nestling communication? and (2) Are acoustic parameters in the frequency or temporal domains modulated in ways that permit a single call type to express diverse behavioral motivations, and to what extent does individual signature contribute to the acoustic variation? We first identified and described various acoustic structures and their behavioral contexts under natural conditions to parse variation in the acoustic structure of calls according to different behavioral contexts during the breeding season. Second, we analyzed a general situation in which a specific type of vocalization was used to summarize how the common coot expresses information with a single call. Additionally, we evaluated different hypotheses for the relationship between the structure and presumed function of vocalizations among acoustic environments. By studying how a “simple repertoire” functions during breeding, we aimed to broaden our understanding of how diverse behavioral motivations are encoded in relatively simple systems [13].

Results

Different call types emitted under various behavioral contexts

ANOVA revealed that call types a, b, c and d were significantly different acoustically (Table 1, Fig. 1). Call a had the longest duration (Table 1; ANOVA, a-c: F = 72.311, n = 23 and 4, P < 0.001; a-d: F = 72.311, n = 23 and 4, P < 0.001) and the highest number of harmonics of the 4 calls, and it was emitted during 8 of the behaviors observed in this study. Call b was produced when leaving the nest or communicating with nestlings. Call c was the shortest in duration (Table 1; ANOVA, a-c: F = 72.311, n = 23 and 4, P < 0.001; c-d: F = 72.311, n = 4 and 4, P < 0.05) and had the longest intervals between syllables (Table 1; ANOVA, a-c: F = 14.294, n = 23 and 4, P < 0.001) with no harmonics; it was recorded when returning to the nest or in the nest. Call d had the highest maximum frequency (Table 1; ANOVA, a-d: F = 7923.200, n = 23 and 4, P < 0.001; c-d: F = 7923.200, n = 3 and 4, P < 0.001) and was only heard during foraging on open water or in the nest.

Table 1 Results of one-way ANOVA showing significant differences among all 4 types of calls. n is the number of individuals. Paired comparisons between each of the 2 call types were subjected to least-significant difference tests
Fig. 1 Acoustic comparisons among all 4 call types of adult common coot showing variation among different calls. Significant differences between any 2 types of calls are indicated by a line and * above the bar

A single call type expresses multiple behavioral motivations

Subtypes a1–a5 and a8 had higher frequency parameters (peak frequency, maximum frequency, and maximum/minimum frequency of F0) and longer durations with much faster syllable production than a6 and a7 (Fig. 2; Supplemental Table S1). According to the LMM, only T contributed significantly (estimate ± SE = −2.178 ± 0.912, t = −2.387, P < 0.05) to the classification of calls a1 and a3–a7 (Table 2). The means of the frequency and temporal variables of calls b8, b9, c5, c6, d3 and d6 are shown in Table 3.

Fig. 2 Acoustic variation among different call subtypes of a that were produced under 8 different behavioral contexts

Table 2 The effects of PF, F0, F0max, F0min and T on variation in calls a1 and a3–a7 classified by behavioral context
Table 3 Means of acoustic parameters of calls b8, b9, c5, c6, d3 and d6, which are shown ± SD. n is the number of individuals

Discussion

Rallidae have a simple vocal apparatus, and their simple syringeal anatomy is thought to constrain vocal complexity and limit the diversification of call types within their vocal repertoire [31]. Nevertheless, they vocalize extensively with their small repertoires, and these vocalizations have important functions during breeding [2]. Our study supported these ideas, as only 4 different call types (a, b, c, and d) were recorded across 9 different behaviors, all of which consisted of a long series of repeated single-syllable sounds.

Despite a small repertoire of vocalizations, the common coot expressed diverse behavioral motivations. Specifically, the common coot modified the vocal structures of its simple acoustic system in 3 ways. First, the common coot produced acoustically different call types in different behavioral contexts; these were clearly distinguished by DFA, and the minimum frequency of the fundamental frequency (F0min) and duration of syllable (T) contributed the most to the acoustic divergence between call types. Call c had the shortest duration and was produced while in the nest and when returning to the nest. Previous studies have shown that North American rails have high-frequency alarm calls that are characterized by short pulses, and the note duration of the alarm call of the king rail (Rallus elegans) is short, making it difficult for predators to detect [25, 32]. Because short notes are superior for avoiding detection by predators and can enrich information relating to direction and distance [33], the short call c is often favored during parental interactions in common coots while in the nest or when returning to the nest. Common coots emitted call d when they were foraging on the water surface or in the nest; d had a significantly higher frequency than the other 3 calls. Because the location of a caller can be easily detected by signal receivers through high-frequency calls [13, 19], d may be used to determine the location of mates and serve as a general contact call. Call b is the only call that we recorded for parent-offspring communication, and it had the lowest fundamental frequency (F0) among all call types. According to the acoustic adaptation hypothesis, dense habitats favor the use of calls with lower frequencies, as low-frequency calls experience less acoustic degradation in dense habitats than high-frequency calls [34, 35]. Call b was used in dense reeds, where its lower frequency would be advantageous for transmission. Therefore, the behavioral contexts and the acoustic environment in which the call is produced both shape the vocal structures of common coot calls. The closely related American coot has been shown to encode information in similar ways: different call types are used for individual recognition, courtship, and alarm signals during nest/territory defense and for communication between mates and between parents and offspring [28].

Common coots can also send information by changing the frequency and temporal parameters within a single call type. Call a was the most commonly used and was emitted in 8 of the 9 behaviors that were noted. Except for a5 and a8 (which had few recordings, Supplemental Table S2), the subtypes of call a were correctly classified in the DFA, and the maximum frequency of the fundamental frequency (F0max) and interval of syllables (TI) contributed the most to the classification. To modify frequency, common coots used a4, which had the highest F0max, when chasing and fighting with intruders, and used a6 and a7, which had the lowest F0max, while in the nest and when searching for nest materials on open water. Increases in frequency have been observed during arousal in many vertebrates, including birds and mammals, as a way of expressing urgency [36,37,38,39]. The results of our study support this idea, given that confronting intruders, the main threat to breeding adults, is more urgent than contacting mates while in the nest or searching for nest materials. Call b was produced when individuals left the nest and during parent-offspring communication, and its much lower fundamental frequency (F0) when parents call to their chicks may represent an adaptation to dense habitat, as the use of low-frequency calls by adults to contact chicks is favored in complex acoustic environments [13].

Temporal distributions also enrich the ways by which common coots can express behavioral motivations. Subtypes a6 and a7 had the longest TI, which makes sense given that interactions in the nest and searching for nest materials on open water are generally some of the more peaceful and slower activities that common coots engage in. Under the more urgent behavioral contexts of a1–a4 (courtship, copulation, foraging, and chasing and fighting, respectively), TI is shorter, and thus a1–a4 are delivered much faster. This finding suggests that common coots encode urgent situations by decreasing the temporal intervals between syllables and thus increasing the speed of syllable output. Call d emitted during foraging had a significantly shorter TI than call d emitted in the nest. Thus, we inferred that the former d functions as a contact call for mates and/or as territorial advertisement, both of which are activities that have a greater sense of urgency than activities in the nest. This temporal modification depending on the degree of behavioral urgency has also been observed in many other animal taxa, including mammals [40] and songbirds [6, 41], but has only been documented in a few cases in Rallidae. For example, the spotted crake (Porzana porzana) lengthens its between-call intervals to signal aggressive motivation [2]. Corncrake (Crex crex) calls consist of 2 syllables separated by 2 intervals (I1 and I2); although I1 is generally similar to I2, males can produce calls that have a longer I2 than I1, which conveys aggressive motivation to other males. That is, specific information can be encoded by the temporal pattern [6, 41].
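The link between a shorter TI and faster syllable output can be illustrated with a small calculation. The sketch below is a minimal illustration, assuming that TI is the silent gap between the end of one syllable and the start of the next, so that one syllable cycle lasts T + TI; the function name and the numerical values are hypothetical and are not measurements from this study.

```python
def syllable_rate(duration_s, interval_s):
    """Syllables per second in a series of identical syllables.

    Assumes one cycle = syllable duration (T) + silent interval (TI).
    """
    return 1.0 / (duration_s + interval_s)


# Hypothetical values (seconds), not measured data from this study.
T = 0.10
calm_TI = 0.40      # long interval, e.g. a calm context
urgent_TI = 0.15    # short interval, e.g. an urgent context

calm_rate = syllable_rate(T, calm_TI)
urgent_rate = syllable_rate(T, urgent_TI)
print(f"calm: {calm_rate:.2f} syllables/s, urgent: {urgent_rate:.2f} syllables/s")
```

With T held constant, any reduction in TI raises the syllable rate, which is the pattern described for a1–a4 relative to a6 and a7.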

Vocal individuality also contributes to divergence in acoustic parameters [42,43,44]. Nevertheless, even when variation in call a among individuals was accounted for in our study (LMM), one parameter, duration of syllable (T), still differed significantly among behavioral contexts, which indicates that T is specifically used to express distinct behavioral motivations.

In this study, we classified call types and subtypes mainly through analyses of acoustic traits and spectrogram measurements. Differences were greater among call types a, b, c and d than among subtypes of call a, and the spectrograms of the subtypes of a were visually similar. However, some studies have indicated that even extremely similar vocalizations can be classified as different call types because they are produced in different behavioral contexts and encode contrasting functions; examples include the surprisingly similar hawk and mobbing alarm calls of superb fairy-wrens (Malurus cyaneus) [18] and the aggressive and affiliative trills of the Java sparrow (Lonchura oryzivora) [45]. Thus, the common methods we used to classify call types may not be applicable to Rallidae with simple calls, and call type classification should consider not only differences in acoustic parameters but also function and behavioral context. Playback experiments are needed to further test the function and classification of these call types and to consider how sex or individual differences lead to vocal variation [46]. A playback experiment simulating territorial intrusion in the spotted crake reported that males can lengthen their between-call intervals to show aggressive motivation [2]. Finally, according to the LMM, we found that both individuality and behavioral context contribute to variation in the acoustic traits of different call types in common coots.

However, our study has obvious limitations, and some conclusions may go beyond what the limited sample size and study design can support. First, this is a largely descriptive study that explores the relationship between behavioral context and acoustic parameters, and it is restricted to spectrogram analysis and behavioral observation without testing the responses of signal receivers. Second, the sample sizes of calls b, c, d and some subtypes of call a, such as a2 and a8, are critically low, with only a few individuals (only one in some cases), which hindered the analysis of divergent functions of different call types. Third, differences in acoustic parameters between male and female common coots, which may have contributed to the acoustic variation, were not tested in this study (although an LMM was conducted) because the sex of each individual could not be determined from morphology in the field. Therefore, further experiments, such as call manipulation or playback experiments, are needed to shed light on the specific information-encoding mechanisms of Rallidae.

Conclusions

In sum, we provided the first detailed spectral analysis of common coot vocalizations, which indicated that common coots produce a few vocal types that contain various types of information under different behavioral contexts. The findings of this study on the common coot, a member of the Rallidae, support the results of recent studies suggesting that, even when individual vocal signatures are considered, the vocal repertoire, acoustic structure, and temporal distribution of sounds provide three basic mechanisms by which vocalizations can encode information in species of Rallidae [2, 6]. This study also broadens our perspective on how birds express complex functions using relatively simple acoustic signals, thereby increasing our understanding of the origin and evolution of small vocal repertoires. Our detailed spectrogram analysis of common coot vocalizations provides a foundation for future playback experiments to determine how subtle changes in calls modulate the information that the calls contain. Similar to the American coot, the vocalizations of the common coot play an important role in social behaviors [27, 28]; thus, anthropogenic sources of noise should be mitigated near the breeding areas of common coots to avoid disturbing their reproductive activities [47].

Methods

Recording protocol and behavioral contexts definition

In this study, we studied a population of common coots at Anbanghe National Reserve in Heilongjiang, China (46.8853°N–47.0650°N, 131.1033°E–131.5400°E). Common coots are strongly territorial and produce loud, brief, and sharp sounds all day during the breeding season, and intruders are expelled from territories immediately upon their entry. The stability of their territories thus permitted many individuals to be recorded while ensuring that different individuals were discriminated and identified. Recordings taken > 100 m apart were assumed to come from different individuals based on estimated territory size, and individuals were marked with numbers (see also Supplemental Table S2). The nest sites of common coots can be approached closely to make high-quality vocal recordings through an artificial corridor for tourists within the national reserve; we therefore opportunistically recorded calling common coots at an estimated distance of 10–30 m from the focal birds along the artificial corridor between 05:00–10:00 h and 13:00–17:00 h from April to June 2008. Vocalizations of breeding adults were recorded using a portable recorder (Lotoo L-200, Beijing, China) and a directional microphone (AZDEN SGM 1X, Tokyo, Japan) held by hand at a height of approximately 1.5 m. The duration of each recording did not exceed 2 min (except b9, at 5′47″), and recordings were made at 16-bit resolution and a sampling rate of 22.05 kHz (for calls a and c) or 44.1 kHz (for calls b and d), which have been demonstrated previously to be sufficient for the extraction of the acoustic parameters we measured [48].
For behavioral observation, we observed and defined the behaviors of common coots in the breeding season at an estimated distance of 10–30 m from focal birds (the same distance used for call recording, which was synchronized with behavioral observation) and filmed these behaviors using a camera (Panasonic DMC-FZ18GK, Osaka, Japan) to ensure that the behaviors displayed while target call types were broadcast were also noted. In total, we noted 9 types of behaviors that were displayed when common coots produced calls (Table 4).

Table 4 List of behavioral contexts, descriptions of behaviors, and call types produced under these contexts

Vocal analysis

The vocalizations were analyzed in Avisoft-SASLab Pro 4.52 (Avisoft Bioacoustics Inc., Berlin, Germany); the waveforms and spectrograms for analyses were created from WAV sound files using an FFT length of 512 points, a Hamming window, a frame size of 50%, and an overlap of 75%. Calls that were undisturbed by other sounds (e.g., man-made noise, vocalizations of anurans or other bird species in the habitat of common coot) and possessed a high signal-to-noise ratio (S/N) were selected for analysis. We classified the vocalizations into different call types (i.e. syllables) according to spectrogram characteristics and then classified these call types further by the behavioral contexts in which they were used; simple elements separated by noticeable time intervals on the spectrograms are defined as syllables [49], which correspond to calls in common coots. To do so, we measured 6 variables: (1) peak frequency (PF), (2) fundamental frequency (F0), (3) maximum frequency of fundamental frequency (F0max), (4) minimum frequency of fundamental frequency (F0min), (5) duration of syllable (T), and (6) interval of syllables (TI) (Table 5; Fig. 3); these variables were chosen following previous similar studies [2, 25, 50, 51]. We identified 4 different types of calls consisting of repeated, single-syllable calls from 61 recordings of 30 breeding adults (see also Supplemental Table S2), which were designated a (46 recordings; 517 calls; 23 individuals), b (2 recordings; 215 calls; 2 individuals), c (8 recordings; 59 calls; 4 individuals), and d (5 recordings; 18 calls; 4 individuals), for 809 calls in total. The 4 types of calls are easily distinguishable through visual observation of the spectrograms (Fig. 4). Call a was emitted under 8 different behaviors and was thus the most frequently used among the 4 call types. The numbers 1–8 were used to refer to the different behavioral contexts in which a was emitted (a1, a2, a3, a4, a5, a6, a7, and a8; Supplemental Table S1, Fig. S1). Subtypes a2 and a8 were excluded from statistical analyses because only a few syllables from one individual were recorded for each of these behavioral contexts. Calls b, c, and d were each emitted under only 2 behaviors. The same numbering was applied to b, c, and d (b8, b9, c5, c6, d3, and d6).
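As a rough guide to what these spectrogram settings imply, the sketch below computes the nominal frequency and time resolution for a 512-point FFT with 75% overlap at the two sampling rates used for recording. It is a minimal sketch under stated assumptions: the function name is hypothetical, the formulas are the generic ones (bin width = sampling rate / FFT length; time step = hop size / sampling rate), and the 50% frame-size setting of Avisoft-SASLab Pro, which may further change the effective window length, is not modeled.

```python
def spectrogram_resolution(sample_rate_hz, fft_length=512, overlap=0.75):
    """Return (bin width in Hz, time step in ms, Nyquist frequency in Hz).

    Generic short-time Fourier transform arithmetic; not a reproduction
    of the internal calculations of any particular software package.
    """
    bin_width_hz = sample_rate_hz / fft_length
    hop_samples = fft_length * (1.0 - overlap)   # samples between frames
    time_step_ms = 1000.0 * hop_samples / sample_rate_hz
    nyquist_hz = sample_rate_hz / 2.0            # highest representable frequency
    return bin_width_hz, time_step_ms, nyquist_hz


# Sampling rates reported in the recording protocol.
for rate in (22050, 44100):
    df, dt, nyq = spectrogram_resolution(rate)
    print(f"{rate} Hz: bin width {df:.1f} Hz, "
          f"time step {dt:.2f} ms, Nyquist {nyq:.0f} Hz")
```

Under these assumptions, doubling the sampling rate doubles the frequency bin width and halves the time step, so frequency and temporal measurements from calls recorded at 22.05 kHz and 44.1 kHz are taken at different nominal resolutions.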

Table 5 Definitions of the vocal parameters measured for each call type
Fig. 3 Sonogram of a common coot call to demonstrate how parameters were measured for each syllable. The parameters included peak frequency (PF), fundamental frequency (F0), maximum frequency of fundamental frequency (F0max), minimum frequency of fundamental frequency (F0min), duration of syllable (T), and interval of syllables (TI). Amplitude spectra of the left syllable show high energy in the third harmonic (PF)

Fig. 4 Four types of calls (a, b, c, and d) of adult common coots in the breeding season are shown in (A)–(D), respectively

Statistical analysis

We tested whether the common coot used distinct call modes by examining acoustic structure and context-dependent variation in their vocalizations during the breeding season. First, the Shapiro-Wilk test was used to test the normality of all variables. The parameters of all call types approximated a normal distribution; thus, one-way analysis of variance (ANOVA) was used to test for significant differences in the parameters of call types a, c and d, followed by least-significant difference (LSD) tests for pairwise comparisons among the 3 call types. The F0max and F0min of call b were not measured because the edge of the harmonic was vague, and call b was not analyzed because of its small sample size (n = 2). Potential discrimination among calls from different contexts was tested using discriminant function analysis (DFA) to classify different call types (a, c and d) by their behavioral context. The DFA (Table 6; Fig. 5) classified them clearly by F0min (explaining 70.0% of the total variance) in function 1 and T (explaining 92.2% of the total variance) in function 2.
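The logic of a one-way ANOVA followed by LSD comparisons can be sketched in a few lines. The example below is an illustrative reimplementation in plain Python, not the SPSS procedure used in this study: the function names are hypothetical, the data are made-up values in arbitrary units, and P values (which would come from the F and t distributions) are deliberately not computed.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within, MSE) for a dict: label -> values."""
    values = [x for g in groups.values() for x in g]
    grand_mean = mean(values)
    k, n_total = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2
                     for g in groups.values())
    ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)
    df_between, df_within = k - 1, n_total - k
    mse = ss_within / df_within                 # pooled within-group variance
    f_stat = (ss_between / df_between) / mse
    return f_stat, df_between, df_within, mse


def lsd_t_statistics(groups, mse):
    """Pairwise t statistics using the pooled MSE from the ANOVA (LSD test)."""
    result = {}
    for (a, xa), (b, xb) in combinations(groups.items(), 2):
        se = sqrt(mse * (1 / len(xa) + 1 / len(xb)))
        result[(a, b)] = (mean(xa) - mean(xb)) / se
    return result


# Made-up values in arbitrary units; NOT data from this study.
calls = {"a": [1.0, 2.0, 3.0], "c": [2.0, 3.0, 4.0], "d": [5.0, 6.0, 7.0]}
f_stat, df_b, df_w, mse = one_way_anova(calls)
pairwise_t = lsd_t_statistics(calls, mse)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
for pair, t in pairwise_t.items():
    print(pair, f"t = {t:.3f}")
```

The LSD test reuses the pooled error term from the ANOVA for every pair, which is why it is usually run only after a significant overall F and offers no further correction for multiple comparisons.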

Table 6 Discriminant function analysis (DFA) of call types a, c, and d. Eigenvalues, percent variance, and the standardized canonical discriminant function coefficients of functions and parameters
Fig. 5 Discriminant function analysis (DFA) indicating that calls a, c, and d can be separated completely by the first 2 discriminant functions

Among the 4 call types, call type a was the most commonly used during various behaviors; thus, one-way analysis of variance was also performed to test for significant differences in the parameters of call a1 and calls a3–a7; calls a2 and a8 were not analyzed because of their small sample sizes (n = 1). We used the DFA to determine whether acoustic variation in call type a was associated with different behavioral purposes. According to the results of the DFA, calls a1–a5 and a8 were distinct from a6–a7 by F0max and TI (explaining 91.6 and 90.3% of the total variance, respectively) in function 1, and by F0max (explaining 93.8% of the total variance) in function 2 (Table 7; Fig. 6). These statistical analyses were performed in IBM SPSS ver. 23 for Windows (SPSS Inc., Chicago, USA). However, we did not test for significant differences in the parameters of b, c, and d under different behaviors because of the small sample sizes (b8, b9, c5, c6, d3, and d6; only one individual in some cases, with only a few syllables). Finally, we ran a linear mixed model (LMM) with call types (a1 and a3–a7) as the response variable; PF, F0, F0max, F0min and T (but not TI) as fixed effects; and individual identity (ID) as a random effect using the ‘lmer’ function of the ‘lmerTest’ R package [52] in R ver. 4.0.5 (The R Foundation for Statistical Computing, Vienna, Austria, http://www.r-project.org); a2 and a8, each with only one individual, were dropped from the model. Because the edge of the harmonic of b and c5 was vague, we did not measure their F0max and F0min. Data are presented as mean ± SD. P < 0.05 and P < 0.01 were regarded as statistically significant and highly significant, respectively.
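To illustrate how two parameters such as F0max and TI can separate two groups of calls, the sketch below implements a two-group Fisher linear discriminant in plain Python. This is a simplified stand-in for the multi-group DFA run in SPSS, not a reproduction of it: the function names, the group labels, and all data points are hypothetical.

```python
from statistics import mean


def fisher_discriminant(group1, group2):
    """Two-group Fisher discriminant for 2-D points: w = Sw^-1 (m1 - m2).

    Returns the weight vector w and the midpoint threshold on the scores.
    """
    def centroid(points):
        return [mean(p[0] for p in points), mean(p[1] for p in points)]

    def scatter(points, m):
        sxx = sum((p[0] - m[0]) ** 2 for p in points)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
        syy = sum((p[1] - m[1]) ** 2 for p in points)
        return sxx, sxy, syy

    m1, m2 = centroid(group1), centroid(group2)
    a1, b1, c1 = scatter(group1, m1)
    a2, b2, c2 = scatter(group2, m2)
    a, b, c = a1 + a2, b1 + b2, c1 + c2      # pooled scatter [[a, b], [b, c]]
    det = a * c - b * b
    d0, d1 = m1[0] - m2[0], m1[1] - m2[1]
    # Multiply the inverse of the 2x2 pooled scatter matrix by (m1 - m2).
    w = [(c * d0 - b * d1) / det, (-b * d0 + a * d1) / det]
    threshold = sum(wi * (x + y) / 2 for wi, x, y in zip(w, m1, m2))
    return w, threshold


def classify(point, w, threshold):
    """Return 1 if the point scores above the threshold, else 2."""
    score = w[0] * point[0] + w[1] * point[1]
    return 1 if score > threshold else 2


# Hypothetical (F0max in kHz, TI in s) pairs; NOT data from this study.
urgent = [(1.6, 0.20), (1.7, 0.25), (1.5, 0.22), (1.8, 0.18)]
calm = [(1.1, 0.60), (1.0, 0.70), (1.2, 0.55), (0.9, 0.65)]

w, threshold = fisher_discriminant(urgent, calm)
correct = sum(classify(p, w, threshold) == 1 for p in urgent)
correct += sum(classify(p, w, threshold) == 2 for p in calm)
print(f"weights = {w}, threshold = {threshold:.3f}")
print(f"{correct} of {len(urgent) + len(calm)} training points classified correctly")
```

Classifying the same points used to fit the discriminant gives an optimistic estimate of accuracy; with samples as small as those available for some subtypes, a leave-one-out check would be a more cautious choice.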

Table 7 Discriminant function analysis (DFA) of calls a1–a8. Eigenvalues, percent variance, and standardized canonical discriminant function coefficients of functions and parameters
Fig. 6 Results of discriminant function analysis (DFA) of different call subtypes of a under 8 different behaviors