Introduction

Reputation was an essential asset in the trade of illiterate people in the premodern era1, and it still plays a crucial role in markets and communities, making reputation management a central part of marketing and public relations. In a variety of social contexts, starting from early childhood, we also evaluate others based on third-party interactions2 and adjust our own behaviour to earn good reputations from others3. Although some studies suggest the existence of social evaluation in species other than humans4, Homo sapiens seems to have a unique capability to use information about other members of society through rumour and gossip.

Evolutionary biologists argue that the ability of social evaluation helps us extend the range of cooperation beyond kinship by rewarding cooperators and punishing defectors in a social dilemma5,6,7,8,9,10,11. A classical example of a social dilemma is the donation game, in which a player's cooperation benefits his or her co-player by an amount b at a cost c, where \(0<c<b\). The following payoff matrix defines the game:

$$\begin{aligned} \left( \begin{array}{c|cc} & C & D \\ \hline C & b-c & -c \\ D & b & 0 \end{array} \right), \end{aligned}$$
(1)

where we abbreviate cooperation and defection as C and D, respectively. As is clearly seen from this payoff matrix, choosing D is the rational choice for each player whereas mutual cooperation is better for both, hence a dilemma. The players can escape from mutual defection through direct reciprocity if the game is repeated12,13,14,15,16,17,18,19, but the price is that they have to remember the past and repeat the interaction with sufficiently high probability, which is sometimes infeasible. The basic idea of indirect reciprocity is that even a single encounter between two persons can be enough if that experience is reliably transferred, in the form of reputation, to those who will interact with these players in the future. In other words, the problem is how to store, transmit, and retrieve information on each other's past behaviour in a distributed manner9,20. Experiments show that the notion of indirect reciprocity provides a useful explanation for cooperative human behaviour21,22.
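
To make the payoff structure concrete, the following minimal sketch in Python evaluates Eq. (1); the function name is ours, and the continuous degrees of cooperation anticipate the formulation introduced later in the text.

```python
def donation_payoffs(x, y, b=2.0, c=1.0):
    """Payoffs of the donation game, Eq. (1), for two players who cooperate
    to degrees x and y (0 = full defection, 1 = full cooperation)."""
    pi_1 = b * y - c * x  # player 1 receives b*y from player 2 and pays c*x
    pi_2 = b * x - c * y
    return pi_1, pi_2

print(donation_payoffs(1, 1))  # (1.0, 1.0): mutual cooperation yields b - c
print(donation_payoffs(0, 1))  # (2.0, -1.0): defecting against a cooperator pays
```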

For this mechanism to work, we need two rules as a social norm: one is an assessment rule, which assigns a reputation to a player based on his or her action toward another player; the other is a behavioural rule, which prescribes an action, C or D, given the players' reputations. An early idea was a norm called Image Scoring, which judges the donor's C and D as good and bad, respectively6. According to this norm, cooperation can thrive when

$$\begin{aligned} q > c/b, \end{aligned}$$
(2)

where q is the probability of knowing someone's reputation23. On the one hand, this condition seems natural because it parallels Hamilton's rule for kin selection, with q replacing genetic relatedness. On the other hand, Eq. (2) does not tell us what is essentially required of a norm to promote cooperation, and we need a broader perspective on the structure of social norms.

According to Kandori's formalism24, Image Scoring is an example of 'first-order' assessment rules because its judgement depends only on the donor's action. A 'second-order' assessment rule also takes the recipient's reputation into account, and a 'third-order' assessment rule additionally refers to the donor's reputation. The number of possible third-order rules thus amounts to \(2^{2^3} = 256\). On the other hand, the number of behavioural rules is \(2^{2^2} = 16\) because a behavioural rule prescribes an action depending on the reputations of the donor and recipient. Among the \(2^{2^3 + 2^2} = 4096\) combinations, we have the leading eight25,26, the eight pairs of an assessment rule \(\alpha\) and a behavioural rule \(\beta\) that make the cooperative equilibrium evolutionarily stable against every mutant with \(\beta ' \ne \beta\) (Table 1).

Table 1 Leading eight.

The situation becomes complicated when reputations are not globally shared in the population: misjudgement does occur in the presence of error, and some players may even have their own private rules of assessment31,32,33,34. Then, strict social norms such as 'Judging' and 'Stern Judging' completely fail to tell whether other players are good or bad, although they successfully induce cooperation when reputation is always public information35,36. Communication rounds can be introduced to resolve disagreements10, or one may need empathy or prudence in judgement to alleviate the problem37,38, but these remedies imply the intrinsic instability of the reputation mechanism in its pure sense. We also point out that most existing models assume binary dynamic variables, although reputation is not really a simple dichotomy between good and bad, and some actions cannot be classified as either cooperation or defection39,40.

In this work, we thus wish to investigate indirect reciprocity by taking reputations and actions as continuous variables. By doing so, we can naturally deal with the continuous dynamics between the existing norm and its close variants by means of analytic tools. We also expect that this formulation can be used to address the problems of error and incompleteness: The idea is that perception error will effectively replace a binary reputation by a probabilistic mixture between good and bad, just as a binary action can be replaced by a probabilistic mixture of cooperation and defection in the presence of implementation error. Although the number of possible social norms expands to infinity, we will restrict ourselves to local-stability analysis by assuming that mutants appear from a small neighbourhood of the existing social norm.

Analysis

Let us imagine a large population and denote the number of players by N. The basic setting is that a random pair of players is picked to play the donation game [Eq. (1)]. In our model, the player chosen as the donor decides his or her degree of cooperation toward the co-player, a number between zero (full defection) and one (full cooperation), based on their reputations. Let \(m_{ij}\) denote player j's reputation from the viewpoint of player i. Player i also has a behavioural rule \(\beta _i (m_{ii}, m_{ij})\), which determines how much he or she will donate to j. Note that \(m_{ij}\) and the values of \(\alpha _{i}\) and \(\beta _i\) for any i and j all lie inside the unit interval. A player k observing the interaction between i and j applies his or her own assessment rule \(\alpha _k (m_{ki}, \beta _i, m_{kj})\). With observation probability \(q > 0\), the reputation that k assigns to i will be updated on average as follows:

$$\begin{aligned} m_{ki}^{t+1} = (1-q) m_{ki}^t + \frac{q}{N-1} \sum _{j \ne i} \alpha _k \left[ m_{ki}^t, \beta _i \left( m_{ii}^t, m_{ij}^t \right) , m_{kj}^t \right] \end{aligned}$$
(3)

where the superscripts have been used as time indices. Equation (3) is to be analysed in this section. Before proceeding, let us note two points: First, as a deterministic equation, Eq. (3) does not include error explicitly. If the probability of error is low, Eq. (3) will nevertheless describe the dynamics for most of the time, and the main effect of error will be to perturb the output of \(\alpha\) or \(\beta\) by a small amount at a point in time, say, \(t=0\). Second, from a mathematical point of view, it is preferable to treat both diagonal and off-diagonal elements on an equal footing as in Eq. (3), which implies that one has to observe even the self-reputation \(m_{ii}\) probabilistically. If that sounds unrealistic, we may alternatively assume that donors and recipients update their self-reputations with probability one. However, it is a reasonable guess that the difference between these two settings becomes marginal when N is large enough, and this guess is indeed verified by numerical calculation (not shown).
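
As an illustration of Eq. (3), the following sketch iterates the deterministic update for a homogeneous population; the rule functions implement the continuous Simple Standing norm of Eq. (5) below, and all names are ours rather than taken from any published code.

```python
import numpy as np

def alpha_ss(x, y, z):
    # Eq. (5a): continuous Simple Standing assessment
    return y * z - z + 1.0

def beta_ss(x, y):
    # Eq. (5b): cooperate to the degree of the recipient's reputation
    return y

def step(m, q=0.4):
    """One synchronous update of the image matrix m[k, i] following Eq. (3)."""
    N = m.shape[0]
    new_m = np.empty_like(m)
    for k in range(N):
        for i in range(N):
            acc = sum(alpha_ss(m[k, i], beta_ss(m[i, i], m[i, j]), m[k, j])
                      for j in range(N) if j != i)
            new_m[k, i] = (1 - q) * m[k, i] + q * acc / (N - 1)
    return new_m

rng = np.random.default_rng(0)
N = 20
m = np.ones((N, N))
m[rng.random((N, N)) < 0.2] = 0.9  # perturb some reputations at t = 0
for _ in range(100):
    m = step(m)
print("mean deviation:", np.mean(1.0 - m))
```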

Throughout this work, \(\alpha\) and \(\beta\) are assumed to be C\(^2\)-differentiable. In addition, we will focus on the cases where the system has a fixed point characterized by

$$\begin{aligned} \alpha (1,1,1)&=1 \end{aligned}$$
(4a)
$$\begin{aligned} \beta (1,1)&=1 \end{aligned}$$
(4b)

because otherwise the norm would not sustain cooperation among well-reputed players from the start. As concrete examples of \(\alpha\) and \(\beta\), let us extend the leading eight to deal with continuous variables by applying the trilinear (bilinear) interpolation to \(\alpha\) (\(\beta\)) in Table 1. If we consider L3 (Simple Standing), for instance, it is described by

$$\begin{aligned} \alpha _\text {SS}(x,y,z)&= yz - z + 1 \end{aligned}$$
(5a)
$$\begin{aligned} \beta _\text {SS}(x,y)&= y. \end{aligned}$$
(5b)

If we define \(A_\xi \equiv \left. \partial \alpha / \partial \xi \right| _{(1,1,1)}\) and \(B_\lambda \equiv \left. \partial \beta / \partial \lambda \right| _{(1,1)}\) with \(\xi \in \{x,y,z\}\) and \(\lambda \in \{x,y\}\), all of the leading eight have \(A_y = B_y = 1\), together with \(A_x = B_x = 0\), and these values are related to the basic properties of the leading eight: being nice, retaliatory, apologetic, and forgiving26.
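
For illustration, the sketch below (our own construction) rebuilds the continuous Simple Standing rule from its binary corner values in Table 1 via trilinear interpolation and checks the slopes numerically, confirming \(A_x = A_z = 0\) and \(A_y = 1\).

```python
from itertools import product

def trilinear(corners):
    """Continuous assessment rule interpolating 8 binary corner values,
    given as a dict mapping (x, y, z) in {0,1}^3 to a value in [0, 1]."""
    def alpha(x, y, z):
        return sum((x if cx else 1 - x) * (y if cy else 1 - y)
                   * (z if cz else 1 - z) * corners[(cx, cy, cz)]
                   for cx, cy, cz in product((0, 1), repeat=3))
    return alpha

# Simple Standing (L3): only defection (y = 0) against a well-reputed
# recipient (z = 1) is judged bad; all other corners are good.
ss_corners = {c: 0.0 if (c[1] == 0 and c[2] == 1) else 1.0
              for c in product((0, 1), repeat=3)}
alpha_ss = trilinear(ss_corners)  # equals yz - z + 1, i.e., Eq. (5a)

h = 1e-6  # backward differences at (1, 1, 1); exact for a multilinear function
A_x = (alpha_ss(1, 1, 1) - alpha_ss(1 - h, 1, 1)) / h
A_y = (alpha_ss(1, 1, 1) - alpha_ss(1, 1 - h, 1)) / h
A_z = (alpha_ss(1, 1, 1) - alpha_ss(1, 1, 1 - h)) / h
print(A_x, A_y, A_z)  # ~0, 1, 0
```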

Below, we will examine two aspects of stability. The first is the recovery of full cooperation from disagreement in a homogeneous population where everyone uses the same \(\alpha\) and \(\beta\)36. Starting from \(m_{ij}=1\) for every i and j, the dynamics of Eq. (3) will be investigated within the framework of linear-stability analysis. The second aspect is stability against mutant norms, for which we have to check the long-term payoff difference between the resident and mutant norms in a stationary state. We again start this analysis from a nearly homogeneous population in which only one individual considers using a slightly different norm. Although private assignment of reputation is allowed, such a deviation will remain unrealised if no one has a reason to deviate from the prevailing norm, considering that the deviation would only decrease his or her own payoff. In this sense, homogeneity serves as a self-consistent assumption in the second part of the stability analysis.

Recovery from disagreement

To understand the time evolution of disagreement in a homogeneous population with common \(\alpha\) and \(\beta\), let us rewrite Eq. (3):

$$\begin{aligned} m_{ki}^{t+1} = (1-q) m_{ki}^t + \frac{q}{N-1} \sum _{j \ne i} \alpha \left[ m_{ki}^t, \beta \left( m_{ii}^t, m_{ij}^t \right) , m_{kj}^t \right] , \end{aligned}$$
(6)

where \(\alpha _k = \alpha\) and \(\beta _i = \beta\) in this homogeneous population. Initially, everyone starts with a good reputation, which can be perturbed by error. To see whether the magnitude of the perturbation grows with time, we set \(m_{ki}^t \equiv 1-\varepsilon _{ki}^t\) and expand the above equation to first order in \(\varepsilon\) as follows:

$$\begin{aligned} 1-\varepsilon _{ki}^{t+1} &= (1-q) \left( 1-\varepsilon _{ki}^t \right) + \frac{q}{N-1} \sum _{j \ne i} \alpha \left[ 1-\varepsilon _{ki}^t, \beta \left( 1-\varepsilon _{ii}^t, 1-\varepsilon _{ij}^t \right) , 1-\varepsilon _{kj}^t \right] \end{aligned}$$
(7)
$$\begin{aligned} &\approx (1-q) \left( 1-\varepsilon _{ki}^t \right) + \frac{q}{N-1} \sum _{j \ne i} \alpha \left[ 1-\varepsilon _{ki}^t, 1- \left( B_x\varepsilon _{ii}^t + B_y \varepsilon _{ij}^t \right) , 1-\varepsilon _{kj}^t \right] \end{aligned}$$
(8)
$$\begin{aligned} &\approx (1-q) \left( 1-\varepsilon _{ki}^t \right) + \frac{q}{N-1} \sum _{j \ne i} \left\{ 1- \left[ A_x \varepsilon _{ki}^t + A_y \left( B_x\varepsilon _{ii}^t + B_y \varepsilon _{ij}^t \right) + A_z \varepsilon _{kj}^t \right] \right\} , \end{aligned}$$
(9)

or, equivalently,

$$\begin{aligned} \varepsilon _{ki}^{t+1} &\approx (1-q) \varepsilon _{ki}^t + \frac{q}{N-1} \sum _{j \ne i} [{A_x} \varepsilon _{ki}^t + {A_y} ({B_x} \varepsilon _{ii}^t + {B_y} \varepsilon _{ij}^t) + {A_z} \varepsilon _{kj}^t] \end{aligned}$$
(10)
$$\begin{aligned} &= (1-q + q{A_x}) \varepsilon _{ki}^t + q {A_y} {B_x} \varepsilon _{ii}^t + \frac{q}{N-1} \sum _{j \ne i} [{A_y} {B_y} \varepsilon _{ij}^t + {A_z} \varepsilon _{kj}^t], \end{aligned}$$
(11)

which leads to

$$\begin{aligned} \frac{d}{dt} \varepsilon _{ki} \approx -q(1-{A_x}) \varepsilon _{ki} + q {A_y} {B_x} \varepsilon _{ii} + \frac{q}{N-1} \sum _{j \ne i} [{A_y} {B_y} \varepsilon _{ij} + {A_z} \varepsilon _{kj}], \end{aligned}$$
(12)

if time is regarded as a continuous variable. This is a linear system governed by an \(N^2 \times N^2\) matrix. In principle, we can find the stability of the origin, as well as the speed of convergence toward it, by calculating the eigenvalues. By attempting this calculation for \(N=2\) to 5 with a symbolic-algebra system41, we see the following pattern in the eigenvalue structure:

$$\begin{aligned} \Lambda _1^{(N^2-2N+1)} &= q\left( -1+{A_x}- \frac{1}{N-1}{A_z} \right) \end{aligned}$$
(13)
$$\begin{aligned} \Lambda _2^{(N-1)} &= q(-1+{A_x}+{A_z}) \end{aligned}$$
(14)
$$\begin{aligned} \Lambda _3^{(N-1)} &= q\left( -1+{A_x}-\frac{1}{N-1}{A_z}+{A_y} {B_x} -\frac{1}{N-1}{A_y} {B_y} \right) \end{aligned}$$
(15)
$$\begin{aligned} \Lambda _4^{(1)} &= q(-1 + {A_x} + {A_z} + {A_y} {B_x} + {A_y} {B_y}), \end{aligned}$$
(16)

where each superscript on the left-hand side denotes the multiplicity of the corresponding eigenvalue. Based on this observation, we conjecture that this pattern is valid for general N. A sufficient condition for recovery in this first-order calculation is that the largest eigenvalue is negative. The largest eigenvalue is the last one, \(\Lambda _4^{(1)}\), because all the derivatives are non-negative. In other words, the first-order perturbation analysis gives a sufficient condition for local recovery as

$$\begin{aligned} Q \equiv -1 + {A_x} + {A_z} + {A_y} ({B_x} + {B_y}) < 0. \end{aligned}$$
(17)
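
The conjectured spectrum can be checked numerically: the following sketch (ours, not the symbolic computation used in the text) assembles the \(N^2 \times N^2\) matrix of Eq. (12) and compares its largest eigenvalue with \(\Lambda_4\) of Eq. (16).

```python
import numpy as np

def jacobian(N, q, Ax, Ay, Az, Bx, By):
    """Matrix of the linearized dynamics, Eq. (12), acting on the flattened
    perturbation vector eps[k, i]."""
    J = np.zeros((N * N, N * N))
    idx = lambda k, i: k * N + i
    for k in range(N):
        for i in range(N):
            r = idx(k, i)
            J[r, idx(k, i)] += -q * (1 - Ax)
            J[r, idx(i, i)] += q * Ay * Bx
            for j in range(N):
                if j != i:
                    J[r, idx(i, j)] += q * Ay * By / (N - 1)
                    J[r, idx(k, j)] += q * Az / (N - 1)
    return J

N, q = 6, 0.4
Ax, Ay, Az, Bx, By = 0.2, 0.9, 0.1, 0.2, 0.8
lam = np.linalg.eigvals(jacobian(N, q, Ax, Ay, Az, Bx, By)).real
print(np.max(lam))                          # numerically largest eigenvalue
print(q * (-1 + Ax + Az + Ay * (Bx + By)))  # Lambda_4 of Eq. (16)
```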

Suppression of mutants

To analyse the effect of a mutant norm, we look at the long-time behaviour of Eq. (3). That is, for given sets of rules \(\{ \alpha _i \}\) and \(\{ \beta _i \}\), we assume that the image matrix \(\{ m_{ij} \}\) converges to a stationary state as \(t \rightarrow \infty\), satisfying

$$\begin{aligned} m_{ki} = \frac{1}{N-1} \sum _{j \ne i} \alpha _k \left[ m_{ki}, \beta _i (m_{ii}, m_{ij}), m_{kj} \right] . \end{aligned}$$
(18)

Note that q only affects the speed of convergence to stationarity: it is an irrelevant parameter as long as we work with a stationary state, in contrast to Eq. (2), where q appears as an essential condition for indirect reciprocity. In the donation game with benefit b and cost c [Eq. (1)], player j's expected payoff can be computed as

$$\begin{aligned} \pi _j = \frac{1}{N-1} \left[ b \sum _{i \ne j} \beta _i (m_{ii}, m_{ij}) - c \sum _{i \ne j} \beta _j (m_{jj}, m_{ji}) \right] . \end{aligned}$$
(19)
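
The following sketch shows how Eqs. (18) and (19) can be evaluated numerically; the fixed-point iteration and all names are ours.

```python
import numpy as np

def stationary_image(alphas, betas, N, tol=1e-12, max_iter=10_000):
    """Fixed-point iteration of the stationarity condition, Eq. (18)."""
    m = np.ones((N, N))
    for _ in range(max_iter):
        new_m = np.empty_like(m)
        for k in range(N):
            for i in range(N):
                vals = [alphas[k](m[k, i], betas[i](m[i, i], m[i, j]), m[k, j])
                        for j in range(N) if j != i]
                new_m[k, i] = np.mean(vals)
        if np.max(np.abs(new_m - m)) < tol:
            return new_m
        m = new_m
    return m

def payoff(j, m, betas, b=2.0, c=1.0):
    """Player j's expected payoff in the stationary state, Eq. (19)."""
    N = m.shape[0]
    gain = sum(betas[i](m[i, i], m[i, j]) for i in range(N) if i != j)
    cost = sum(betas[j](m[j, j], m[j, i]) for i in range(N) if i != j)
    return (b * gain - c * cost) / (N - 1)
```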

For the sake of simplicity, let us assume that every player with index 1 to \(N-1\) has the same rules and equal reputation, so that player \(i=1\) is representative of the resident population. The situation is then effectively reduced to a two-body problem between players 0 and 1. By assumption, the system initially starts from a fully cooperative state where everyone has a good reputation, i.e., \(m_{11} = \beta (1,1) = \alpha (1,1,1)= 1\). The rules used by the resident population will be denoted by \(\alpha \equiv \alpha _1\) and \(\beta \equiv \beta _1\) without subscripts. Now, the focal player 0 attempts a slightly different norm, defined by \(\alpha _0 (x,y,z) = \alpha (x,y,z) - \delta (x,y,z)\) and \(\beta _0(x,y) = \beta (x,y) - \eta (x,y)\) with \(|\delta | \ll 1\) and \(|\eta | \ll 1\). We assume that the introduction of \(\delta\) and \(\eta\) causes only small changes in the image matrix: only the elements related to the focal player will be affected because the residents can still assign \(m_{11}=1\) to each other when the mutant occupies a negligible fraction of the population, i.e., \(N \gg 1\). Therefore, if mutation leads to \(m_{00} = 1-\varepsilon _{00}\), \(m_{01} = 1-\varepsilon _{01}\), and \(m_{10} = 1-\varepsilon _{10}\) with \(\varepsilon _{ij} \ll 1\), by expanding Eq. (18) to linear order in the perturbation (see Methods), we obtain

$$\begin{aligned} \varepsilon _{00} &= \frac{(1-{A_x}+{A_y} {B_y})\delta _1 + (1-{A_x}-{A_z}){A_y} \eta _1}{(1-{A_x}-{A_z})(1-{A_x}-{A_y} {B_x})} \end{aligned}$$
(20)
$$\begin{aligned} \varepsilon _{01} &= \frac{\delta _1}{1-{A_x}-{A_z}} \end{aligned}$$
(21)
$$\begin{aligned} \varepsilon _{10} &= \frac{({B_x}+{B_y})\delta _1 + (1-{A_x}-{A_z})\eta _1}{(1-{A_x}-{A_z})(1-{A_x}-{A_y} {B_x})} {A_y}, \end{aligned}$$
(22)

where \(\delta _1 \equiv \delta (1,1,1) \ge 0\) and \(\eta _1 \equiv \eta (1,1) \ge 0\), provided that

$$\begin{aligned} {A_x} + {A_z} &< 1 \end{aligned}$$
(23)
$$\begin{aligned} {A_x} + {A_y} {B_x} &< 1. \end{aligned}$$
(24)

We can now calculate the focal player 0’s payoff as follows:

$$\begin{aligned} \pi _0 &= \frac{1}{N-1} \left[ b \sum _{i \ne 0} \beta _i (m_{ii}, m_{i0}) - c \sum _{i \ne 0} \beta _0 (m_{00}, m_{0i}) \right] \end{aligned}$$
(25)
$$\begin{aligned} &= b \beta (m_{11}, m_{10}) - c\beta _0 (m_{00}, m_{01}) \end{aligned}$$
(26)
$$\begin{aligned} &\approx b \left( 1- {B_y} \varepsilon _{10} \right) - c \left( 1-{B_x} \varepsilon _{00} - {B_y} \varepsilon _{01} - \eta _1 \right) . \end{aligned}$$
(27)

If we plug Eqs. (20), (21), and (22) here, the payoff change \(\Delta \pi _0 \equiv \pi _0 - (b-c)\) is given as

$$\begin{aligned} \Delta \pi _0 = -\frac{b{A_y} {B_y} - c(1-{A_x})}{1-{A_x} -{A_y} {B_x}} \left[ \left( \frac{{B_x}+{B_y}}{1-{A_x}-{A_z}} \right) \delta _1 + \eta _1 \right] , \end{aligned}$$
(28)

and we require this quantity to be negative for any small positive \(\delta _1\) and \(\eta _1\). Here, it is worth stressing that the signs of \(\delta _1\) and \(\eta _1\) are determined because we start from a fully cooperative state with \(m_{ij}=1\): For other states where \(\delta\) and \(\eta\) can take either sign, the first-order terms should vanish so that the second-order terms can determine the sign of \(\Delta \pi _0\). In this respect, the payoff analysis is greatly simplified by choosing the specific initial state. Because of Eqs. (23) and (24), the negativity of Eq. (28) is reduced to the following inequality:

$$\begin{aligned} \frac{b}{c} > \frac{1-{A_x}}{{A_y} {B_y}}, \end{aligned}$$
(29)

which, together with Eqs. (23) and (24), characterizes a condition for a social norm to stabilize cooperation against local mutants, as an alternative to Eq. (2). This result is intuitively plausible because cooperation will be unstable if one does not lose reputation by decreasing the degree of cooperation (i.e., \({A_y} \approx 0\)) or if no punishment is imposed on an ill-reputed player (i.e., \({B_y} \approx 0\)).
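
For a quick numerical check of Eqs. (28) and (29), the sign of \(\Delta \pi _0\) flips as b/c crosses the predicted threshold; the parameter values below are merely illustrative (they match the Simple Standing variant used in the next section).

```python
def delta_pi0(b, c, Ax, Ay, Az, Bx, By, delta1, eta1):
    """First-order payoff change of a local mutant, Eq. (28)."""
    prefactor = -(b * Ay * By - c * (1 - Ax)) / (1 - Ax - Ay * Bx)
    return prefactor * ((Bx + By) / (1 - Ax - Az) * delta1 + eta1)

# With A_x = A_z = B_x = 0 and A_y = B_y = 0.9, Eq. (29) predicts the
# threshold b/c > 1/0.81 ~ 1.235:
for b in (1.2, 1.3):
    print(b, delta_pi0(b, 1.0, 0.0, 0.9, 0.0, 0.0, 0.9, 0.02, 0.0))
# b = 1.2 leaves the mutant better off; b = 1.3 suppresses it.
```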

Two remarks are in order. First, whether mutation occurs in a single individual or in a finite fraction of the population does not alter the final result in this first-order calculation. Suppose that the population is divided into two groups with fractions p and \(1-p\), respectively. One group has \(\alpha\) and \(\beta\), and the other group has \(\alpha +\delta\) and \(\beta +\eta\). Then, the payoff difference between two players, each from a different group, is still the same as Eq. (28) (see Methods). Therefore, if an advantageous mutation occurs with \(p \ll 1\), the mutants remain better off than the residents until they take over the whole population, i.e., \(p \rightarrow 1\). In this sense, our condition determines not only the initial invasion but also the fixation of a mutant norm, as long as it is a close variant of the resident one. Second, one could ask what happens if a mutant differs only in the slopes while keeping \(\delta _1=\eta _1=0\). Equation (28) does not answer this question because it is based on the assumption that the terms \(\left. \partial \delta / \partial \xi \right| _{(1,1,1)} \varepsilon _{ij}\) and \(\left. \partial \eta / \partial \lambda \right| _{(1,1)} \varepsilon _{ij}\), where \(\xi \in \{ x, y, z\}\) and \(\lambda \in \{x, y\}\), are all negligibly small in the first-order calculation. However, even if these derivatives are taken into consideration, we find that \(\delta _1\) or \(\eta _1\) must still be positive to make a finite payoff change. In other words, the basic form of Eq. (28) remains useful, although the coefficients include correction terms. The performance of such a 'slope mutant' will be checked numerically at the end of the next section.

Results

In this section, we will numerically check the continuous-reputation system in the presence of inhomogeneity, noise, and incomplete information. More specifically, the simulation code allows each player i to adopt a different set of \(\alpha _i\) and \(\beta _i\) to simulate an inhomogeneous population. The outputs of \(\alpha _i\) and \(\beta _i\) can be affected by random-number generation to simulate a noisy environment where misperception and misimplementation occur, and every interaction between a pair of players updates only part of the reputation system, parametrized by the observation probability q, because information is incomplete.

Our numerical simulation code is based on a publicly available one36 but has been modified to handle continuous variables. To simulate the dynamics of a society of N players, we work with an \(N \times N\) image matrix \(\{ m_{ij} \}\) whose elements are all set to one at the beginning. Every player starts with zero payoff, i.e., \(\pi _i = 0\) initially. In each round, we randomly pick two players, say, i and j, so that i is the donor and j is the recipient of the donation game [Eq. (1)], which has \(b=2\) and \(c=1\) unless otherwise noted. Every other member of the population, say, k, independently observes the interaction with probability q and updates \(m_{ki}\) according to his or her own assessment rule \(\alpha _k\). Although the above analyses are applicable to any norm defined by \(\alpha\) and \(\beta\) as long as Eq. (4) holds, we will focus on Simple Standing as a representative example of successful norms. Misperception may occur with probability e, whereby \(m_{ki}\) becomes a random number drawn from the unit interval. Implementation error is simulated in a similar way by setting the output of \(\beta\) to a random number between zero (defection) and one (cooperation) with probability \(\gamma\). This process is repeated for M rounds, during which every player's payoff is accumulated. Equation (18) suggests that q only affects the convergence rate toward a stationary state. For this reason, we fix this parameter at \(q=0.4\) throughout the simulation unless otherwise mentioned. Note also that we have deliberately made this parameter low enough to violate the inequality in Eq. (2).
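
This protocol can be condensed into the following sketch; it is a simplified reimplementation for illustration, not the actual code used to produce the figures, and it lets every player, including the donor, observe probabilistically, in line with Eq. (3).

```python
import numpy as np

def simulate(alphas, betas, N=50, M=10_000, q=0.4, e=0.1, gamma=0.1,
             b=2.0, c=1.0, seed=0):
    """Monte Carlo sketch of the protocol described above. alphas[i] and
    betas[i] are player i's continuous assessment and behavioural rules."""
    rng = np.random.default_rng(seed)
    m = np.ones((N, N))   # image matrix: everyone starts well-reputed
    pi = np.zeros(N)      # accumulated payoffs
    for _ in range(M):
        i, j = rng.choice(N, size=2, replace=False)  # donor i, recipient j
        action = betas[i](m[i, i], m[i, j])
        if rng.random() < gamma:                     # implementation error
            action = rng.random()
        pi[i] -= c * action
        pi[j] += b * action
        for k in range(N):  # every player observes with probability q
            if rng.random() < q:
                m[k, i] = alphas[k](m[k, i], action, m[k, j])
                if rng.random() < e:                 # perception error
                    m[k, i] = rng.random()
    return m, pi

# Example: a homogeneous population of Simple Standing players.
alpha_ss = lambda x, y, z: y * z - z + 1.0
beta_ss = lambda x, y: y
m, pi = simulate([alpha_ss] * 50, [beta_ss] * 50)
print("average reputation:", m.mean())
```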

Figure 1

Recovery from disagreement after M rounds have elapsed in a population of size \(N=50\) with common \(\alpha\) and \(\beta\). Initially, we randomly pick \(20\%\) of the image-matrix elements and change them to 0.9, whereas the rest remain 1's, and the simulation has been repeated over \(10^3\) independent samples without error, i.e., \(e=\gamma =0\). In this log-log plot, the vertical axis shows the average difference from the state of perfect reputation, represented by the average of \(\varepsilon _{ki} \equiv 1-m_{ki}\). We have tested three norms, which all have \(\alpha (1,1,1)=1\) and \(\beta (1,1)=1\) but differ in their local slopes there: the first norm has \(({A_x}, {A_y}, {A_z}) = (0.2,0.9,0.1)\) and \(({B_x}, {B_y}) = (0.2,0.8)\), which together yield \(Q \equiv -1 + {A_x} + {A_z} + {A_y} ({B_x} + {B_y})>0\) [Eq. (17)]. The next one is Simple Standing with \(({A_x}, {A_y}, {A_z}) = (0,1,0)\) and \(({B_x}, {B_y}) = (0,1)\), which has \(Q=0\). The last one, for \(Q<0\), is a variant of Simple Standing with \(({A_x}, {A_y}, {A_z}) = (0,0.9,0)\) and \(({B_x}, {B_y}) = (0,0.9)\).

To see the effect of Q on recovery [Eq. (17)], we have tested three norms one by one in a homogeneous population with \(e=\gamma =0\) (Fig. 1). All these norms have \(\alpha (1,1,1)=1\) and \(\beta (1,1)=1\) in common, but their local slopes differ so as to make Q positive, zero, or negative. The first norm under consideration has \(({A_x}, {A_y}, {A_z}) = (0.2,0.9,0.1)\) and \(({B_x}, {B_y}) = (0.2,0.8)\), which together make \(Q>0\). If some members of the population initially have slightly imperfect reputations, they fail to recover under such a norm. If \(Q<0\), on the other hand, the recovery indeed takes place within a finite time scale. Although Simple Standing violates Eq. (17) by having \(Q=0\), our simulation shows that it recovers reputation with the aid of higher-order terms, albeit through a slow process with a diverging time scale. Among the leading eight, L1, L3 (Simple Standing), L4, and L7 (Staying) fall into this category of \(Q=0\), whereas the other four, i.e., L2 (Consistent Standing), L5, L6 (Stern Judging), and L8 (Judging), have positive Q. The difference between these two groups is whether \({A_z} = \alpha _{1C1} - \alpha _{1C0} = 1-\alpha _{1C0}\) is zero or one: if a well-reputed player has to risk his or her own reputation in helping an ill-reputed co-player, i.e., \(\alpha _{1C0}=0\), it means \({A_z}=1\) and \(Q>0\), so we can conclude that the initial state of \(m_{ki} \approx 1\) will not be recovered. According to an earlier study on the leading eight36, the latter four with \(Q>0\) have long recovery times from a single disagreement in reputation. Although that result was not derived from a continuum formulation, it is qualitatively consistent with ours.

As for the effect of mutation in assessment rules, let us consider the following scenario: one half of the population has adopted Simple Standing [Eq. (5)], whereas the other half are 'mutants' using a different assessment rule \(\alpha _\text {SS} - \delta\) with

$$\begin{aligned} \delta (x,y,z) = \delta _1 (2yz - 2z + 1), \end{aligned}$$
(30)

where \(\delta _1\) is a small number, say, \(\delta _1 = 0.02\) in numerical calculation. Such a half-and-half configuration is used because the payoff difference [Eq. (28)] is unaffected by the fraction of mutants, p (see Methods). Figure 2a shows that the level of cooperation is still high if \(e \ll 1\), and the cooperation rate of Simple Standing in the continuous form converges to \(100\%\) in a monomorphic population (not shown). Furthermore, we see that the mutants are worse off than the players of Simple Standing, i.e., \(\pi _0 < \pi _1\), as expected.

Figure 2

Stationary states of a population with \(N=50\) players, reached from an initial condition with \(m_{ij}=1\) for every i and j. In each case, the mutant norm differs from the resident one by \(\delta _1 = 0.02\) and occupies one half of the population (\(p=0.5\)). The game is defined by Eq. (1) with \(b=2\) and \(c=1\). (a) Average payoffs over \(5\times 10^4\) samples when the resident norm is Simple Standing. Everyone can observe each interaction with probability \(q=0.4\), and perception error and implementation error occur with probabilities \(e = 0.1\) and \(\gamma = 0.1\), respectively. Inset: convergence of the payoff difference \(\Delta \pi _0 \equiv \pi _0 - \pi _1\) as M increases. If \(M = gN\) with a sufficiently large constant \(g \gtrsim O(10)\), the mutants obtain lower payoffs than Simple Standing, making \(\Delta \pi _0 < 0\). This result has no significant dependence on N. (b) Payoff advantage of the mutants with respect to the residents as a function of b/c, averaged over \(5\times 10^4\) samples per data point, when \(M=10^4\). The resident norm, a variant of Simple Standing, has \(\alpha (1,1,1)=\beta (1,1)=1\) and \({A_x} = {A_z} = {B_x} = 0\) but \({A_y} = {B_y} = 0.9\) as in Fig. 1. Implementation error occurs with probability \(\gamma =0.1\), and the results are qualitatively the same for any small \(\gamma\). The stars on the horizontal line indicate the threshold values predicted by the first-order and second-order calculations, respectively. In both panels, the shaded areas represent error bars.

From a theoretical viewpoint, an important question is how quickly the mutants' payoff difference \(\Delta \pi _0 \equiv \pi _0 - \pi _1\) becomes negative: although we have argued that the inequality holds for Simple Standing, the calculation is based on several assumptions. In particular, one could say that Eq. (3) corresponds to \(M \propto N^2\) because it seems to assume that everyone meets every other player with a weighting factor of \(1/(N-1)\). If \(M \propto N^2\) were required, however, it would pose a serious obstacle to applying such a norm to a society where the number of interactions grows only linearly with N. Fortunately, the inset of Fig. 2a shows that \(M \propto N\) indeed suffices to make \(\Delta \pi _0\) negative. One could also point out that the payoff difference should be \(\Delta \pi _0 = - \delta _1\) according to Eq. (28), whereas the result in Fig. 2a has a smaller magnitude. Part of the reason is that Eq. (28) does not take perception error into account, so the numerical value recovers the predicted order of magnitude as \(e \rightarrow 0\). In addition, Eq. (28) is based on a first-order approximation, and a higher-order calculation reproduces the observed value with greater precision (see Methods).

An important prediction of our analysis is the threshold of b/c that makes a local mutant worse off than the resident population [Eq. (29)]. In Fig. 2b, we directly check Eq. (29) by measuring payoffs in equilibrium in a population of size \(N=50\). A variant of Simple Standing is chosen as the resident norm, which occupies \(p=0.5\) of the population with \(\alpha (1,1,1)=\beta (1,1)=1\) and \({A_x} = {A_z} = {B_x} = 0\). The only difference from Simple Standing is that \({A_y} = {B_y} = 0.9\), and the reason for this variation is that the first-order perturbation for the leading eight develops a spurious singularity when p is finite (see Methods). When perception is free from error, i.e., \(e=0\), the results do not depend on the observation probability q, as expected from stationarity [Eq. (18)], and the threshold value is consistent with the first- and second-order calculations [the stars in Fig. 2b]. When \(e>0\), on the other hand, the threshold is pushed upward, implying that cooperation becomes harder to stabilize because of the perception error. In addition, we now see that incomplete information with \(q<1\) can shift the threshold further in combination with positive e. We have also varied the value of \(\gamma\), but it does not change the average behaviour in the above results. Overall, the point of Fig. 2b is that our analysis does capture the correct picture.

Figure 3

Payoff difference between the resident population using Simple Standing and its 'slope mutant', which has the same \(\beta\) and \(\alpha (1,1,1)=1\) but different slopes \({A_x}\), \({A_y}\), and \({A_z}\). Each point denotes a randomly generated mutant obtained through trilinear interpolation among \(\alpha (1,1,1)=1\) and seven random values \(\alpha (0,0,0), \alpha (0,0,1), \ldots , \alpha (1,1,0)\) within the unit interval. The mutant norm occupies \(10\%\) of the whole population, whose size is \(N=100\). The horizontal axis shows the mutant's Q-value [Eq. (17)], and the vertical axis shows its payoff difference \(\Delta \pi _0\) with respect to the resident norm after a sufficiently long time, e.g., \(M/N \sim O(10^3)\). As before, the game is defined with \(b=2\) and \(c=1\), and the observation probability is \(q=0.4\). Perception error and implementation error occur with probabilities \(e=0.1\) and \(\gamma =0.1\), respectively.

Finally, we numerically check the effect of a 'slope mutant', which has \(\alpha (1,1,1)=1\) as a fixed point and the same behavioural rule as Simple Standing but differs in the slopes \({A_x}\), \({A_y}\), and \({A_z}\). To be more specific, let us assume that a mutant norm occupies \(10\%\) of the population whereas the rest are using Simple Standing. The values of \(\alpha (x,y,z)\) at the vertices of the three-dimensional unit cube are randomly drawn from the unit interval, except for \(\alpha (1,1,1)=1\). Then, trilinear interpolation is used to construct the continuous assessment rule. According to our simulation (Fig. 3), the performance of the mutant norm is strongly correlated with its Q-value [Eq. (17)]. Recall that the expression for Q was derived in the context of recovery from small disagreement in a homogeneous population. Figure 3 nevertheless suggests that it can also serve as a useful indicator of whether a minority of 'slope mutants' will be competitive with the resident norm, even when the difference between their assessment rules is not necessarily small.
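
Such a slope mutant can be generated as follows; for a trilinear rule, each slope at (1, 1, 1) equals a difference of two corner values, so Q is available without numerical differentiation (the sketch and its names are ours).

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(1)

def random_slope_mutant():
    """Random continuous assessment rule with the corner alpha(1,1,1) = 1
    fixed and the seven other corners drawn uniformly from [0, 1]."""
    corners = {c: rng.random() for c in product((0, 1), repeat=3)}
    corners[(1, 1, 1)] = 1.0
    def alpha(x, y, z):
        return sum((x if cx else 1 - x) * (y if cy else 1 - y)
                   * (z if cz else 1 - z) * v
                   for (cx, cy, cz), v in corners.items())
    return alpha, corners

alpha, corner = random_slope_mutant()
# For a trilinear rule, the slopes at (1,1,1) are corner-value differences:
A_x = corner[(1, 1, 1)] - corner[(0, 1, 1)]
A_y = corner[(1, 1, 1)] - corner[(1, 0, 1)]
A_z = corner[(1, 1, 1)] - corner[(1, 1, 0)]
B_x, B_y = 0.0, 1.0  # the mutant keeps Simple Standing's behavioural rule
Q = -1 + A_x + A_z + A_y * (B_x + B_y)  # Eq. (17)
print(Q)
```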

Summary and discussion

In summary, we have studied indirect reciprocity with private, noisy, and incomplete information by extending the binary variables for reputation and behaviour to continuous ones. The extension to a continuum is an idealization because it would impose an excessive cognitive burden to keep track of others' reputations without discretization; nonetheless, this abstraction allows us to overcome the fact that the sharp dichotomy between good and bad is often found insufficient in reporting an assessment42,43,44. In particular, this formulation makes it possible to check the role of sensitivity to new information in judging others and adjusting our own behaviour. That is, according to Eq. (29), the benefit-cost ratio of cooperation must be higher to stabilize the cooperative initial state if reputation is insensitive to observed behaviour (low \({A_y}\)) or if the level of cooperation is insensitive to the recipient's reputation (low \({B_y}\)). At the same time, in contrast to the well-known condition for indirect reciprocity akin to Hamilton's rule [Eq. (2)], we have observed that incompleteness of information, controlled by \(q<1\), mainly affects the convergence toward a stationary state without altering the overall conclusion. This approach sheds light on the difference among the leading eight in their recovery speeds from a single disagreement. Our analysis has identified the key factor \(\alpha _{1C0}\) in Table 1, i.e., how to assign reputation to a well-reputed donor who chooses C against an ill-reputed recipient: if this choice is regarded as good according to \(\alpha _{1C0}=1\), making the assessment function \(\alpha (x,y,z)\) insensitive to z, the recovery can take place smoothly. As a result, we conclude that \(\alpha\) should respond to the donor's defection (\({A_y}>0\)) but not necessarily to the players' reputations (e.g., \({A_x} = {A_z} = 0\)). A recent study also argues that helping an ill-reputed player should be regarded as good to maintain stable cooperation45. Such an understanding of indirect reciprocity in terms of sensitivity is important because, as usual, information processing through reputation involves a trade-off between robustness and sensitivity: one could underestimate new information and fail to adapt, or overestimate it and fail to distinguish the signal from noise. In practice, the best way of assessment seems to be updating little by little upon the arrival of new information46, and such a possibility is already incorporated in this continuum formulation.

It should be emphasized that our analysis has focused on local perturbations to the existing norm. Therefore, our inequalities cannot be interpreted as a condition for evolutionary stability against every possible mutant. Moreover, although \(\Delta \pi _0\) is found to be independent of p in our analysis, one should keep in mind that this results from a first-order theory, and higher-order corrections generally show dependence on p. If a mutant is sufficiently different from the resident, the first-order theory fails and the payoff difference may well depend on p. For instance, if we think of a population consisting of L1 and L8 (Table 1), we see that L1 is better off only when it comprises the majority of the population (not shown). Having said that, our local analysis can nevertheless provide a necessary condition that will also hold for stronger notions of stability. We also believe that this locality assumption is usually plausible in reality, considering that a social norm is a complex construct that combines expectation and action in a mutually reinforcing manner and thus resists all but small changes47. An empirical analysis shows that even orthographic and lexical norms change so slowly that it takes centuries unless a formal institution intervenes48. Another restriction in our analytic approach is that the mutation is assumed to have positive \(\delta _1\), so that the mutant is not fully content with the initial cooperative state. If two norms have \(\delta _1=0\) in common and differ only in slopes at the initial state, the first-order perturbation does not give a definite answer about their dynamics. Having positive \(\delta _1\) can be interpreted from a myopic player's point of view as follows: a selfish player in a cooperating population may feel tempted to devalue others' cooperation and to reduce his or her own cost of cooperating toward them. If our condition is met, however, such behaviour will eventually be punished by the social norm.

"Maturity of mind is the capacity to endure uncertainty," says a maxim. Although one lesson of life is that we have to accept the grey area between good and bad, reputation is still something that can easily be driven to extremes, and what is worse, it often goes in a different direction for each observer. Despite the theoretical achievements of indirect reciprocity, its real difficulties are thus manifested in the problems of private assessment, noise, and incomplete information. Our finding suggests that we can get a better grip on indirect reciprocity by preparing reputational and behavioural scales with finer gradations, which may be thought of as a form of systematic deliberation to protect each other's reputation from rash judgement.

Methods

Linear-order corrections

Equation (18) in the large-N limit is written as follows:

$$\begin{aligned} m_{00} &= \frac{1}{N-1} \sum _{j\ne 0} \alpha _0 [m_{00}, \beta _0 (m_{00}, m_{0j}), m_{0j}] = \alpha _0 [m_{00}, \beta _0 (m_{00}, m_{01}), m_{01}] \end{aligned}$$
(31)
$$\begin{aligned} m_{01} &= \frac{1}{N-1} \sum _{j\ne 1} \alpha _0 [m_{01}, \beta _1 (m_{11}, m_{1j}), m_{0j}] \approx \alpha _0 [m_{01}, \beta _1 (m_{11}, m_{11}), m_{01}] \end{aligned}$$
(32)
$$\begin{aligned} m_{10} &= \frac{1}{N-1} \sum _{j\ne 0} \alpha _1 [m_{10}, \beta _0 (m_{00}, m_{0j}), m_{1j}] = \alpha _1 [m_{10}, \beta _0 (m_{00}, m_{01}), m_{11}] \end{aligned}$$
(33)
$$\begin{aligned} m_{11} &= \frac{1}{N-1} \sum _{j\ne 1} \alpha _1 [m_{11}, \beta _1 (m_{11}, m_{1j}), m_{1j}] \approx \alpha _1 [m_{11}, \beta _1 (m_{11}, m_{11}), m_{11}]. \end{aligned}$$
(34)

With \(m_{00} = 1-\varepsilon _{00}\), \(m_{01} = 1-\varepsilon _{01}\), and \(m_{10} = 1-\varepsilon _{10}\), Eq. (32) becomes

$$\begin{aligned} 1-\varepsilon _{01} &= \alpha _0 (1-\varepsilon _{01}, 1, 1-\varepsilon _{01}) = \alpha (1-\varepsilon _{01}, 1, 1-\varepsilon _{01}) - \delta (1-\varepsilon _{01}, 1, 1-\varepsilon _{01}) \end{aligned}$$
(35)
$$\begin{aligned} &\approx \alpha (1,1,1) - {A_x} \varepsilon _{01} - {A_z} \varepsilon _{01} - \delta (1,1,1) = 1 - {A_x} \varepsilon _{01} - {A_z} \varepsilon _{01} - \delta _1, \end{aligned}$$
(36)

where \(A_\xi \equiv \left. \partial \alpha / \partial \xi \right| _{(1,1,1)}\), as defined in the main text, and \(\delta _1 \equiv \delta (1,1,1)\). Thus, we have

$$\begin{aligned} \varepsilon _{01} \approx \left( 1-{A_x} - {A_z} \right) ^{-1} \delta _1. \end{aligned}$$
(37)

Likewise,

$$\begin{aligned} \beta _0 (1-\varepsilon _{00}, 1-\varepsilon _{01}) &= \beta (1-\varepsilon _{00}, 1-\varepsilon _{01}) - \eta (1-\varepsilon _{00}, 1-\varepsilon _{01}) \end{aligned}$$
(38)
$$\begin{aligned} &\approx 1 - {B_x} \varepsilon _{00} - {B_y} \varepsilon _{01} - \eta _1, \end{aligned}$$
(39)

where \(B_\lambda \equiv \left. \partial \beta / \partial \lambda \right| _{(1,1)}\) and \(\eta _1 \equiv \eta (1,1)\). Using this expression, we obtain from Eq. (33) the following:

$$\begin{aligned} 1-\varepsilon _{10} &= \alpha \left( 1-\varepsilon _{10}, 1-{B_x} \varepsilon _{00} - {B_y} \varepsilon _{01} -\eta _1, 1 \right) \end{aligned}$$
(40)
$$\begin{aligned} &\approx 1- {A_x} \varepsilon _{10} - {A_y} \left( {B_x} \varepsilon _{00} + {B_y} \varepsilon _{01} + \eta _1 \right) , \end{aligned}$$
(41)

which means

$$\begin{aligned} \varepsilon _{10} &= \frac{{A_y}}{1-{A_x}} ({B_x} \varepsilon _{00} + {B_y} \varepsilon _{01} + \eta _1 ) \end{aligned}$$
(42)
$$\begin{aligned} &= \frac{{A_y}}{1-{A_x}} \left[ {B_x} \varepsilon _{00} + {B_y} (1-{A_x} - {A_z})^{-1} \delta _1 + \eta _1 \right] . \end{aligned}$$
(43)

To get a closed-form expression for this, we need \(\varepsilon _{00}\) in addition to \(\varepsilon _{01}\) [Eq. (37)]. Thus, from Eq. (31), we derive

$$\begin{aligned} 1-\varepsilon _{00} &\approx \alpha \left[ 1-\varepsilon _{00}, \beta _0 (1-\varepsilon _{00}, 1-\varepsilon _{01}), 1-\varepsilon _{01} \right] - \delta _1 \end{aligned}$$
(44)
$$\begin{aligned} &\approx \alpha \left( 1-\varepsilon _{00}, 1-{B_x} \varepsilon _{00} - {B_y} \varepsilon _{01} - \eta _1, 1-\varepsilon _{01} \right) - \delta _1 \end{aligned}$$
(45)
$$\begin{aligned} &\approx 1 - {A_x} \varepsilon _{00} - {A_y} \left( {B_x} \varepsilon _{00} + {B_y} \varepsilon _{01} + \eta _1 \right) - {A_z} \varepsilon _{01} - \delta _1, \end{aligned}$$
(46)

which gives

$$\begin{aligned} \varepsilon _{00} &= \frac{1}{1-{A_x} - {A_y} {B_x}} \left[ ({A_y} {B_y} + {A_z}) \varepsilon _{01} + {A_y} \eta _1 + \delta _1 \right] \end{aligned}$$
(47)
$$\begin{aligned} &= \frac{1}{1-{A_x} - {A_y} {B_x}} \left[ \frac{{A_y} {B_y} + {A_z}}{1-{A_x} - {A_z}} \delta _1 + {A_y} \eta _1 + \delta _1 \right] , \end{aligned}$$
(48)

where we have used Eq. (37). By substituting Eq. (48) into Eq. (43), we can write \(\varepsilon _{10}\) explicitly.
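
The algebra above amounts to solving three linear equations, which can be reproduced symbolically; the following sketch (ours, using SymPy) recovers Eqs. (20)-(22) from the linearized conditions in Eqs. (36), (41), and (46).

```python
import sympy as sp

Ax, Ay, Az, Bx, By, d1, e1 = sp.symbols('A_x A_y A_z B_x B_y delta_1 eta_1')
e00, e01, e10 = sp.symbols('e00 e01 e10')

# Linearized fixed-point conditions, Eqs. (36), (41), and (46):
eqs = [sp.Eq(e01, (Ax + Az) * e01 + d1),
       sp.Eq(e10, Ax * e10 + Ay * (Bx * e00 + By * e01 + e1)),
       sp.Eq(e00, Ax * e00 + Ay * (Bx * e00 + By * e01 + e1) + Az * e01 + d1)]
sol = sp.solve(eqs, [e00, e01, e10])
print(sp.simplify(sol[e01]))  # delta_1/(1 - A_x - A_z), cf. Eq. (21)
print(sp.simplify(sol[e10]))  # cf. Eq. (22)
```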

Finite fraction of mutants

If a mutant norm occupies a finite fraction p, Eqs. (31) to (34) are generalized to

$$\begin{aligned} m_{00} &= p \alpha _0 [m_{00}, \beta _0(m_{00}, m_{00}), m_{00}] + {\bar{p}} \alpha _0 [m_{00}, \beta _0(m_{00}, m_{01}), m_{01}] \end{aligned}$$
(49)
$$\begin{aligned} m_{01} &= p \alpha _0 [m_{01}, \beta _1(m_{11}, m_{10}), m_{00}] + {\bar{p}} \alpha _0 [m_{01}, \beta _1(m_{11}, m_{11}), m_{01}] \end{aligned}$$
(50)
$$\begin{aligned} m_{10} &= p \alpha _1 [m_{10}, \beta _0(m_{00}, m_{00}), m_{10}] + {\bar{p}} \alpha _1 [m_{10}, \beta _0(m_{00}, m_{01}), m_{11}] \end{aligned}$$
(51)
$$\begin{aligned} m_{11} &= p \alpha _1 [m_{11}, \beta _1(m_{11}, m_{10}), m_{10}] + {\bar{p}} \alpha _1 [m_{11}, \beta _1(m_{11}, m_{11}), m_{11}], \end{aligned}$$
(52)

where \({\bar{p}} \equiv 1-p\). Through linearisation, the above equations are rewritten as

$$\begin{aligned} 1-\varepsilon _{00} &\approx p [1-{A_x} \varepsilon _{00} - {A_y} ({B_x} \varepsilon _{00} + {B_y} \varepsilon _{00} + \eta _1) - {A_z} \varepsilon _{00} - \delta _1] \\ &\quad + {\bar{p}} [1-{A_x} \varepsilon _{00} - {A_y} ({B_x} \varepsilon _{00} + {B_y} \varepsilon _{01} + \eta _1) - {A_z} \varepsilon _{01} - \delta _1] \end{aligned}$$
(53)
$$\begin{aligned} 1-\varepsilon _{01} &\approx p [1-{A_x} \varepsilon _{01} - {A_y} ({B_x} \varepsilon _{11} + {B_y} \varepsilon _{10}) - {A_z} \varepsilon _{00} - \delta _1] \\ &\quad + {\bar{p}} [1-{A_x} \varepsilon _{01} - {A_y} ({B_x} \varepsilon _{11} + {B_y} \varepsilon _{11}) - {A_z} \varepsilon _{01} - \delta _1] \end{aligned}$$
(54)
$$\begin{aligned} 1-\varepsilon _{10} &\approx p [1-{A_x} \varepsilon _{10} - {A_y} ({B_x} \varepsilon _{00} + {B_y} \varepsilon _{00} + \eta _1) - {A_z} \varepsilon _{10}] \\ &\quad + {\bar{p}} [1-{A_x} \varepsilon _{10} - {A_y} ({B_x} \varepsilon _{00} + {B_y} \varepsilon _{01} + \eta _1) - {A_z} \varepsilon _{11}] \end{aligned}$$
(55)
$$\begin{aligned} 1-\varepsilon _{11} &\approx p [1-{A_x} \varepsilon _{11} - {A_y} ({B_x} \varepsilon _{11} + {B_y} \varepsilon _{10}) - {A_z} \varepsilon _{10}] \\ &\quad + {\bar{p}} [1-{A_x} \varepsilon _{11} - {A_y} ({B_x} \varepsilon _{11} + {B_y} \varepsilon _{11}) - {A_z} \varepsilon _{11}]. \end{aligned}$$
(56)

After some algebra, we find

$$\begin{aligned} \varepsilon _{00} &= \frac{\delta _1 \left\{ {A_x}^2+{A_x} ({A_y} {B_x}+{A_z}-2)-{\bar{p}}{A_y}^2 {B_x} {B_y}-{\bar{p}}{A_y}^2 {B_y}^2+{A_z} [{A_y} (p{B_x}-{\bar{p}}{B_y})-1]-{A_y} {B_x}+1\right\} }{(1-{A_x}-{A_z}) (1-{A_x}-{A_y} {B_x}) (1-{A_x}-{A_y} {B_x}-{A_y} {B_y}-{A_z})} \\ &\quad +\frac{{A_y} \eta _1 (1-{A_x}-{A_z}) (1-{A_x}-{A_y} {B_x} -{\bar{p}}{A_y} {B_y}-{\bar{p}}{A_z})}{(1-{A_x}-{A_z}) (1-{A_x}-{A_y} {B_x}) (1-{A_x}-{A_y} {B_x}-{A_y} {B_y}-{A_z})} \end{aligned}$$
(57)
$$\begin{aligned} \varepsilon _{01} &= \frac{{A_y} \eta _1 p (1-{A_x}-{A_z}) ({A_y} {B_y}+{A_z})}{(1-{A_x}-{A_z}) (1-{A_x}-{A_y} {B_x}) (1-{A_x}-{A_y} {B_x}-{A_y} {B_y}-{A_z})} \\ &\quad +\frac{\delta _1 \left[ {A_x}^2+{A_x} (2 {A_y} {B_x}+{A_y} {B_y}+{A_z}-2)+{A_y}^2 {B_x}^2+{A_y}^2 {B_x} {B_y} p+{A_y}^2 {B_x} {B_y}+{A_y}^2 {B_y}^2 p \right] }{(1-{A_x}-{A_z}) (1-{A_x}-{A_y} {B_x}) (1-{A_x}-{A_y} {B_x}-{A_y} {B_y}-{A_z})} \\ &\quad +\frac{\delta _1 \left\{ {A_z} [{A_y} (p{B_x}+{B_x}+p{B_y})-1]-2 {A_y} {B_x}-{A_y} {B_y}+1\right\} }{(1-{A_x}-{A_z}) (1-{A_x}-{A_y} {B_x}) (1-{A_x}-{A_y} {B_x}-{A_y} {B_y}-{A_z})} \end{aligned}$$
(58)
$$\begin{aligned} \varepsilon _{10} &= \frac{{A_y} (1-{A_x}-{A_y} {B_x}-{\bar{p}}{A_y} {B_y}-{\bar{p}}{A_z}) [\eta _1 (1-{A_x}-{A_z})+({B_x} +{B_y}) \delta _1]}{(1-{A_x}-{A_z}) (1-{A_x}-{A_y} {B_x}) (1-{A_x}-{A_y} {B_x}-{A_y} {B_y}-{A_z})} \end{aligned}$$
(59)
$$\begin{aligned} \varepsilon _{11} &= \frac{{A_y} p ({A_y} {B_y}+{A_z}) [\eta _1 (1-{A_x}-{A_z})+({B_x} +{B_y}) \delta _1]}{(1-{A_x}-{A_z}) (1-{A_x}-{A_y} {B_x}) (1-{A_x}-{A_y} {B_x}-{A_y} {B_y}-{A_z})}, \end{aligned}$$
(60)

from which one can reproduce the previous results [Eqs. (20) to (22)] by taking the limit of \(p \rightarrow 0\). The denominators seem to require another inequality in addition to Eqs. (23) and (24), that is,

$$\begin{aligned} {A_x} + {A_z} + {A_y} ({B_x} + {B_y}) < 1, \end{aligned}$$
(61)

which is equivalent to Eq. (17). Recall that the continuous versions of the leading eight always have \({A_y} = {B_y} = 1\) and \({A_x} = {B_x} = 0\) in common, which means that they all violate this inequality. However, in practice, no singularity arises for Simple Standing if higher-order corrections are included, and even the second-order calculation agrees moderately well with numerical results.

The payoff earned by a mutant is calculated as

$$\begin{aligned} \pi _0 &= b [p \beta _0 (m_{00}, m_{00}) + (1-p) \beta _1(m_{11}, m_{10})] \\ &\quad - c [p \beta _0 (m_{00}, m_{00}) + (1-p) \beta _0(m_{00}, m_{01})] \end{aligned}$$
(62)
$$\begin{aligned} &\approx b [p (1-{B_x} \varepsilon _{00} - {B_y} \varepsilon _{00} - \eta _1) + (1-p)(1-{B_x} \varepsilon _{11} - {B_y} \varepsilon _{10})] \\ &\quad - c [p (1-{B_x} \varepsilon _{00} - {B_y} \varepsilon _{00} - \eta _1) + (1-p) (1-{B_x} \varepsilon _{00} - {B_y} \varepsilon _{01} - \eta _1)], \end{aligned}$$
(63)

whereas a resident player earns

$$\begin{aligned} \pi _1 &= b [p \beta _0 (m_{00}, m_{01}) + (1-p) \beta _1(m_{11}, m_{11})] \\ &\quad - c [p \beta _1 (m_{11}, m_{10}) + (1-p) \beta _1(m_{11}, m_{11})] \end{aligned}$$
(64)
$$\begin{aligned} &\approx b [p (1-{B_x} \varepsilon _{00} - {B_y} \varepsilon _{01} - \eta _1) + (1-p)(1-{B_x} \varepsilon _{11} - {B_y} \varepsilon _{11})] \\ &\quad - c [p(1-{B_x} \varepsilon _{11} - {B_y} \varepsilon _{10}) + (1-p) (1-{B_x} \varepsilon _{11} - {B_y} \varepsilon _{11})]. \end{aligned}$$
(65)

If we plug Eqs. (57) to (60) here, the payoff difference \(\Delta \pi _0 = \pi _0 - \pi _1\) becomes identical to Eq. (28) with no dependence on p.
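
This p-independence can also be verified symbolically; the sketch below (ours, using SymPy) solves the linearized system of Eqs. (53)-(56) and checks that the derivative of \(\pi _0 - \pi _1\) with respect to p vanishes.

```python
import sympy as sp

Ax, Ay, Az, Bx, By, d1, e1, p = sp.symbols('A_x A_y A_z B_x B_y delta_1 eta_1 p')
e00, e01, e10, e11 = sp.symbols('e00 e01 e10 e11')
pb = 1 - p

# Linearized stationarity conditions, Eqs. (53)-(56):
eqs = [
    sp.Eq(e00, p * (Ax*e00 + Ay*(Bx*e00 + By*e00 + e1) + Az*e00 + d1)
             + pb * (Ax*e00 + Ay*(Bx*e00 + By*e01 + e1) + Az*e01 + d1)),
    sp.Eq(e01, p * (Ax*e01 + Ay*(Bx*e11 + By*e10) + Az*e00 + d1)
             + pb * (Ax*e01 + Ay*(Bx*e11 + By*e11) + Az*e01 + d1)),
    sp.Eq(e10, p * (Ax*e10 + Ay*(Bx*e00 + By*e00 + e1) + Az*e10)
             + pb * (Ax*e10 + Ay*(Bx*e00 + By*e01 + e1) + Az*e11)),
    sp.Eq(e11, p * (Ax*e11 + Ay*(Bx*e11 + By*e10) + Az*e10)
             + pb * (Ax*e11 + Ay*(Bx*e11 + By*e11) + Az*e11)),
]
s = sp.solve(eqs, [e00, e01, e10, e11])

b, c = sp.symbols('b c')
# Payoffs from Eqs. (63) and (65):
pi0 = (b * (p * (1 - Bx*s[e00] - By*s[e00] - e1) + pb * (1 - Bx*s[e11] - By*s[e10]))
       - c * (p * (1 - Bx*s[e00] - By*s[e00] - e1) + pb * (1 - Bx*s[e00] - By*s[e01] - e1)))
pi1 = (b * (p * (1 - Bx*s[e00] - By*s[e01] - e1) + pb * (1 - Bx*s[e11] - By*s[e11]))
       - c * (p * (1 - Bx*s[e11] - By*s[e10]) + pb * (1 - Bx*s[e11] - By*s[e11])))
print(sp.simplify(sp.diff(pi0 - pi1, p)))  # 0, i.e., independent of p
```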

Second-order corrections

We assume that \(\delta\) and \(\eta\), their partial derivatives, and the \(\varepsilon _{ij}\)'s are small parameters of the same order of magnitude. The second-order perturbation for \(\beta _1\) can thus be written as follows:

$$\begin{aligned} \beta _1 (m_{11}, m_{1j}) &= \beta (1-\varepsilon _{11}, 1-\varepsilon _{1j}) \end{aligned}$$
(66)
$$\begin{aligned} &\approx 1 - {B_x} \varepsilon _{11} - {B_y} \varepsilon _{1j} + \frac{1}{2} {B_{xx}} \varepsilon _{11}^2 + {B_{xy}} \varepsilon _{11} \varepsilon _{1j} + \frac{1}{2} {B_{yy}} \varepsilon _{1j}^2 \end{aligned}$$
(67)
$$\begin{aligned} &\equiv 1 - \kappa _1. \end{aligned}$$
(68)

Here, we write \(\kappa _1 \equiv \kappa _1^{(1)} + \kappa _1^{(2)}\), where \(\kappa _1^{(1)} \equiv {B_x} \varepsilon _{11} + {B_y} \varepsilon _{1j}\) and \(\kappa _1^{(2)} \equiv - \left( \frac{1}{2} {B_{xx}} \varepsilon _{11}^2 + {B_{xy}} \varepsilon _{11} \varepsilon _{1j} + \frac{1}{2} {B_{yy}} \varepsilon _{1j}^2 \right)\) are first- and second-order corrections, respectively, and \(B_{\mu \nu } \equiv \left. \partial ^2 \beta /\partial \mu \partial \nu \right| _{(1,1)}\). Likewise,

$$\begin{aligned} \beta _0 (m_{00}, m_{0j}) &= \beta (m_{00}, m_{0j}) - \eta (m_{00}, m_{0j}) \end{aligned}$$
(69)
$$\begin{aligned} &= \beta (1-\varepsilon _{00}, 1-\varepsilon _{0j}) - \eta (1-\varepsilon _{00}, 1-\varepsilon _{0j}) \end{aligned}$$
(70)
$$\begin{aligned} &\approx \left( 1 - {B_x} \varepsilon _{00} - {B_y} \varepsilon _{0j} + \frac{1}{2} {B_{xx}} \varepsilon _{00}^2 + {B_{xy}} \varepsilon _{00} \varepsilon _{0j} + \frac{1}{2} {B_{yy}} \varepsilon _{0j}^2 \right) \\ &\quad - \left( \eta _1 - \eta _x \varepsilon _{00} - \eta _y \varepsilon _{0j} \right) \end{aligned}$$
(71)
$$\begin{aligned} &\equiv 1 - \kappa _0, \end{aligned}$$
(72)

where \(\kappa _0 \equiv \kappa _0^{(1)} + \kappa _0^{(2)}\) with \(\kappa _0^{(1)} \equiv {B_x} \varepsilon _{00} + {B_y} \varepsilon _{0j} + \eta _1\) and \(\kappa _0^{(2)} \equiv -\left( \frac{1}{2} {B_{xx}} \varepsilon _{00}^2 + {B_{xy}} \varepsilon _{00} \varepsilon _{0j} + \frac{1}{2} {B_{yy}} \varepsilon _{0j}^2 \right) - (\eta _x \varepsilon _{00} + \eta _y \varepsilon _{0j})\).

The second-order perturbation for \(\alpha _1\) is also straightforward:

$$\begin{aligned} \alpha _1 [m_{1i}, \beta _i (m_{ii}, m_{ij}), m_{1j}] &\approx \alpha (1-\varepsilon _{1i}, 1-\kappa _i, 1-\varepsilon _{1j}) \end{aligned}$$
(73)
$$\begin{aligned} &\approx 1 - {A_x} \varepsilon _{1i} - {A_y} \kappa _i - {A_z} \varepsilon _{1j} + \frac{1}{2} {A_{xx}} \varepsilon _{1i}^2 + \frac{1}{2} {A_{yy}} \left( \kappa _i^{(1)} \right) ^2 + \frac{1}{2} {A_{zz}} \varepsilon _{1j}^2 \\ &\quad + {A_{xy}} \varepsilon _{1i} \kappa _i^{(1)} + {A_{yz}} \kappa _i^{(1)} \varepsilon _{1j} + {A_{zx}} \varepsilon _{1i} \varepsilon _{1j}, \end{aligned}$$
(74)

where \(A_{\mu \nu } \equiv \left. \partial ^2 \alpha / \partial \mu \partial \nu \right| _{(1,1,1)}\), and similarly,

$$\begin{aligned} \alpha _0 [m_{0i}, \beta _i (m_{ii}, m_{ij}), m_{0j}] &\approx \alpha (1-\varepsilon _{0i}, 1-\kappa _i, 1-\varepsilon _{0j}) - \delta (1-\varepsilon _{0i}, 1-\kappa _i, 1-\varepsilon _{0j}) \end{aligned}$$
(75)
$$\begin{aligned} &\approx \left[ 1 - {A_x} \varepsilon _{0i} - {A_y} \kappa _i - {A_z} \varepsilon _{0j} + \frac{1}{2} {A_{xx}} \varepsilon _{0i}^2 + \frac{1}{2} {A_{yy}} \left( \kappa _i^{(1)} \right) ^2 + \frac{1}{2} {A_{zz}} \varepsilon _{0j}^2 \right. \\ &\qquad \left. + {A_{xy}} \varepsilon _{0i} \kappa _i^{(1)} + {A_{yz}} \kappa _i^{(1)} \varepsilon _{0j} + {A_{zx}} \varepsilon _{0i} \varepsilon _{0j} \right] \\ &\quad - \left( \delta _1 - \delta _x \varepsilon _{0i} - \delta _y \kappa _i^{(1)} - \delta _z \varepsilon _{0j} \right) . \end{aligned}$$
(76)