
1 Introduction

Blockchain, a secure and trustless decentralised database, records information in linked “blocks”. Because this data structure is inherently resistant to tampering, the 1st-generation approaches emerged first in cryptocurrency applications. For example, Bitcoin [1] is a decentralised ledger for recording transactions, characterised by decentralisation, anonymity and transparency [2]. As a superset of ledgering, generic computations are supported by the 2nd-generation Blockchain. Ethereum [3] is a typical approach that supports Turing-complete computations through smart contracts [4], thus enabling Decentralised Applications (DApps) [5]. In other words, DApps are software that relies on smart contracts operating through Blockchain as backend services.

Compared with other Blockchain platforms that support DApps (e.g. EOS, Steem, POA and xDai), Ethereum hosts the majority of DApps (2486 out of 2667 DApps in total as of 21 May 2019), which cover various aspects of life such as energy, healthcare, finance, entertainment and insurance. However, more than 21% of the DApps on the Ethereum platform have been broken or abandoned. The high demand for DApps and their insufficient success ratio motivate us to further investigate the performance of Ethereum. The experimental results and insights presented in this paper aim to guide DApp developers in better designing their products.

In the literature, a few works have evaluated the performance of Ethereum through private-nets. Specifically, Aldweesh et al. [7] tested private chains using both Parity and Geth clients; Pongnumkul et al. [8] evaluated Hyperledger Fabric and Ethereum, also through private-nets. Although existing works have studied the performance of Ethereum through theoretical analysis or private-net experiments whose results do not generalise to real-world deployments, to the best of the authors’ knowledge, no work has tested the performance of Ethereum while considering practical use cases and constraints. To close this gap, we select Ethereum test-nets for comparative evaluations for the following reasons:

  • Some test-nets, e.g. Ropsten, act like the main-net and use the same consensus scheme (Proof-of-Work, PoW).

  • Kovan and Rinkeby use Proof of Authority (PoA) [9] consensus schemes, which are a potential direction for the evolution of Ethereum.

In this paper, three test-nets (Ropsten, Kovan and Rinkeby) with popular consensus schemes are selected for evaluation. The experiments are designed to measure:

  • Account balance query latency: The access delay is obtained by querying an account balance multiple times; it is the time interval between the point at which a query is sent and the point at which the balance is received.

  • Block generation time: The transaction confirmation speed is assessed by measuring the time taken to generate a block. To avoid the effects of Ethereum Blockchain state rollbacks, we continuously measure the time taken to generate 12 consecutive blocks.

  • End-to-end transaction acceptance latency: Four levels of transactions (TXs) loads are considered: 1 single TX; 25 concurrent TXs; 50 concurrent TXs; 100 concurrent TXs.

The structure of the paper is as follows. Section 2 reviews related work on Ethereum. The evaluation methodology is presented in Sect. 3. We present the experimental results in Sect. 4 and discuss them in Sect. 5. Finally, Sect. 6 concludes this paper.

2 Related Work

2.1 Ethereum Overview

Ethereum [10] borrows heavily from the Bitcoin protocol and its Blockchain design, but adapts it to support applications beyond money by improving on the concepts of scripting and online meta-protocols.

From the perspective of ledgering transactions, Ethereum differs from Bitcoin, which relies on the Unspent Transaction Output (UTXO) scheme [1], an inefficient way of estimating account balances. Ethereum instead introduces accounts, which together form the state of the whole network. State transitions are defined as value transfers or data records between accounts (including smart contracts).

A transaction is defined as a signed data package that stores a message to be sent from an account; it contains the recipient, the sender’s signature, the number of tokens and the data to send. In addition, Ethereum introduces gas, which is used to pay the miners who execute transactions and the smart contracts inside them.
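
As an illustration only, the account model and the gas payment can be sketched as follows. This is a simplified toy model rather than Ethereum's actual state-transition function: the `Transaction` fields, the `apply_transaction` helper and the account names are hypothetical, and signature checks, nonces and gas limits are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    sender: str
    recipient: str
    value: int      # number of tokens to transfer (smallest unit)
    gas_used: int   # gas consumed by executing the transaction
    gas_price: int  # price the sender pays per unit of gas


def apply_transaction(state: dict, tx: Transaction, miner: str) -> dict:
    """Return a new state after moving the value and paying the gas fee to the miner."""
    fee = tx.gas_used * tx.gas_price
    if state.get(tx.sender, 0) < tx.value + fee:
        raise ValueError("insufficient balance")
    new_state = dict(state)  # the previous state is left untouched
    new_state[tx.sender] = new_state.get(tx.sender, 0) - tx.value - fee
    new_state[tx.recipient] = new_state.get(tx.recipient, 0) + tx.value
    new_state[miner] = new_state.get(miner, 0) + fee
    return new_state


state = {"alice": 1_000_000, "bob": 0}
tx = Transaction("alice", "bob", value=100, gas_used=21_000, gas_price=2)
state = apply_transaction(state, tx, miner="miner")
print(state)  # {'alice': 957900, 'bob': 100, 'miner': 42000}
```

The state is a plain mapping from account to balance, so a balance lookup is a single read rather than a scan over unspent outputs.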

Both Ethereum and Bitcoin store their entire transaction histories in their respective networks. The difference is that Ethereum embeds smart contracts in transactions, and these contracts enable Turing-complete programming to support feature-rich applications. Transactions are grouped into ‘blocks’, and every block is chained to its previous block. Before a transaction can be added to the ledger, it must be validated through a consensus algorithm, e.g. PoW or PoA.
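
The chaining of blocks can be illustrated with a minimal sketch, assuming that each block stores the SHA-256 hash of its predecessor. The block layout and the helper names (`block_hash`, `append_block`, `is_valid`) are hypothetical and far simpler than Ethereum's real block format; the sketch only shows why altering an earlier block breaks the links that follow it.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block's full content, including the link to its predecessor."""
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()


def append_block(chain: list, transactions: list) -> None:
    """Append a block that references the hash of the current last block."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "prev_hash": prev, "transactions": transactions})


def is_valid(chain: list) -> bool:
    """Check that every block still points at the hash of the block before it."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1]) for i in range(1, len(chain))
    )


chain = []
append_block(chain, ["alice->bob:5"])
append_block(chain, ["bob->carol:2"])
append_block(chain, ["carol->dave:1"])
valid_before = is_valid(chain)

chain[1]["transactions"] = ["bob->mallory:2"]  # tamper with a middle block
valid_after = is_valid(chain)
print(valid_before, valid_after)  # True False
```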

In theory, the block gas limit in Ethereum is usually set to 7,999,992, and each transaction without smart contracts costs 21,000 gas. This means Ethereum can store and execute around 380 pure transactions in each block. Because the block generation time is around 14.37 s, this gives a transaction speed of approximately 26.4 TX/s.
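
The estimate can be reproduced with a short calculation from the figures quoted above. The 15 s case is an added assumption, included only to show how sensitive the result is to the block time; the constant and function names are illustrative.

```python
BLOCK_GAS_LIMIT = 7_999_992   # block gas limit quoted in the text
GAS_PER_PLAIN_TX = 21_000     # gas cost of a transaction without smart contracts

# Whole transactions that fit into one block.
tx_per_block = BLOCK_GAS_LIMIT // GAS_PER_PLAIN_TX


def throughput(block_time_s: float) -> float:
    """Upper bound on plain transactions per second for a given block time."""
    return tx_per_block / block_time_s


print(tx_per_block)                 # 380
print(round(throughput(14.37), 2))  # 26.44
print(round(throughput(15.0), 2))   # 25.33
```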

2.2 Ethereum Test Network (Test-Net)

Test networks [12] allow individual developers and companies to test their business logic by deploying smart contracts before delivering their products to the Ethereum main-net. The test-nets provide the same services without the need to exchange monetary tokens. We cover the three most popular test-nets:

  • Ropsten: Ether on Ropsten is mined following the same scheme as on the main-net. Ropsten resembles the current main-net the most because it uses the same PoW consensus algorithm.

  • Kovan: Different from the main-net, Kovan supports the Parity client only and uses PoA as the consensus algorithm. Thus, Kovan cannot be considered a very accurate simulation of the current main-net. Despite this, it is immune to spam attacks, reliable and stable, so it is convenient for public testing.

  • Rinkeby: Rinkeby shares the advantages of Kovan but supports the Geth client only.

2.3 Distributed Consensus

PoW (Proof of Work) and PoA (Proof of Authority) are the main consensus algorithms used in Ethereum and its test-nets. The details of PoW and PoA are listed below:

  • PoW: In Bitcoin and Ethereum, PoW is employed to confirm transactions and to prevent cyber-attacks on the network. The principle of PoW is to allocate the accounting rights and monetary rewards according to the computation power contributed by each node [14]. The workload serves as the safeguard in PoW. Each new block is connected to the previous block, and all nodes trust the longest chain. Tampering with the Blockchain is therefore difficult, because the cost of redoing the work can exceed the gains from tampering. In this way, PoW protects the safety of the Blockchain.

  • PoA: PoA consensus can be viewed as an efficient variant of Proof of Stake (PoS) in which known validators confirm transactions. It also includes a governance-based penalty system to punish malicious actors. In practice, because no mining process is involved, PoA can provide faster transaction rates than PoW.
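
The contrast between the two schemes can be sketched with a toy example. The PoW part assumes a simple SHA-256 puzzle (a digest with a given number of leading zero hex digits), which is not Ethereum's actual mining algorithm (Ethash); the PoA part assumes a plain round-robin over a fixed list of known validators. The names `mine` and `poa_validator` are hypothetical.

```python
import hashlib
from itertools import count


def mine(block_data: str, difficulty: int):
    """Try nonces until the SHA-256 digest starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest


def poa_validator(validators: list, step: int) -> str:
    """Round-robin choice of the known validator allowed to seal the block at `step`."""
    return validators[step % len(validators)]


# PoW: the number of attempts needed is unpredictable and grows with the difficulty.
nonce, digest = mine("prev_hash|tx_batch", difficulty=3)
print(nonce, digest[:12])

# PoA: no puzzle is solved; the sealing right simply rotates among validators.
schedule = [poa_validator(["v0", "v1", "v2"], s) for s in range(5)]
print(schedule)  # ['v0', 'v1', 'v2', 'v0', 'v1']
```

In the PoW case the work cannot be skipped, which is what makes rewriting history costly; in the PoA case the block producer is known in advance, which is why block times can be scheduled.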

2.4 Existing Works of Performance Evaluation on Ethereum

According to the authors’ survey, only two papers experimentally evaluate the performance of Ethereum. Specifically, Aldweesh et al. [7] deploy private chains using both Parity and Geth clients with different consensus algorithms. The results demonstrate that Parity handles concurrent transactions significantly better than Geth. Different from our work, the work in [7] focuses on the intrinsic performance of Ethereum but does not study Ethereum operating in the wild, i.e. with time-varying transaction burdens and numbers of users. In other words, its results cannot reflect the performance of the main-net. Pongnumkul et al. [8] comparatively evaluated Hyperledger Fabric and Ethereum by building private-nets. Their experimental results indicate that Ethereum achieves neither higher throughput nor lower latency but is able to handle a larger number of concurrent transactions. The paper lacks a discussion of why the performance of Ethereum is more unstable, and the test results captured from private-nets cannot reflect the performance of the main-net.

Beyond experimental evaluations, Xu et al. [6] proposed a taxonomy for comparing blockchains and blockchain-based systems, thus assisting the design and assessment of their impact on software architectures. Macdonald et al. [13] discussed how the blockchain can be used outside of Bitcoin, then presented a comparison of five general-purpose blockchain platforms: Ethereum, IBM Open Blockchain, Intel Sawtooth Lake, BlockStream Sidechain Elements and Eris. Recently, Maple et al. [11] introduced a format for outlining a generic blockchain anatomy, ranging from permissions to consensus, which can be referenced when assessing blockchain solution architectures to assist in the design and implementation of business logic.

In this paper, to capture the performance of Ethereum in a more reliable and generalisable manner, we select test-nets that are popular and have strong links to the main-net. The experimental results aim to provide a better understanding of Ethereum’s constraints and to give advice on DApp development.

3 Evaluation Methodology

This section presents the experimental design and tests used for evaluating Ethereum through its test-nets (Ropsten, Kovan and Rinkeby). The performance of each test-net is evaluated in three aspects: 1) account balance query latency, 2) block generation time and 3) end-to-end transaction acceptance latency. Each test is repeated 10 times to improve reliability.

Steps for Evaluating Transaction Acceptance Latency:

  1. User Operation: A set of N (1, 25, 50 or 100) signed transactions, i.e. TX = {TX_0, TX_1, TX_2, …, TX_{N−1}}, is created.

  2. User Operation: The user node sends the created transactions to an Ethereum network and captures the current time point t.

  3. Ethereum Internal Process: Ethereum’s P2P network distributes the transactions to miners. The consensus process (e.g. PoW mining or PoA validation) confirms the valid transactions and broadcasts them to all the nodes in Ethereum.

  4. User Operation: The user node continuously queries the confirmation status of each transaction through the Etherscan API until all transactions in TX are confirmed. This is because Ethereum never returns the status of a transaction without being queried.

  5. User Operation: When a transaction TX_n ∈ TX is confirmed, the user node captures the current time T_n, then calculates and records the transaction time of TX_n, Δt_n = T_n − t_n, where t_n = t is the send time captured in Step 2.

  6. User Operation: If any transactions remain unconfirmed, GOTO Step 4.
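
The steps above can be sketched as a polling loop. This is a minimal sketch under stated assumptions: the network calls are injected as hypothetical callables (`send`, `is_confirmed`) because the exact Etherscan requests are not specified here, and the send time is captured immediately before the batch is submitted. The demo runs against a simulated network with a fake clock, not a real test-net.

```python
import time
from typing import Callable, Dict, Hashable, Iterable


def measure_acceptance_latency(
    txs: Iterable[Hashable],
    send: Callable[[Hashable], None],
    is_confirmed: Callable[[Hashable], bool],
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
    poll_interval: float = 1.0,
) -> Dict[Hashable, float]:
    """Return the latency T_n - t for every transaction in the batch."""
    batch = list(txs)          # Step 1: the N signed transactions
    t = clock()                # Step 2: send time shared by the whole batch
    for tx in batch:
        send(tx)
    pending = set(batch)
    latency: Dict[Hashable, float] = {}
    while pending:             # Steps 4 and 6: poll until all are confirmed
        for tx in list(pending):
            if is_confirmed(tx):
                latency[tx] = clock() - t  # Step 5: T_n - t
                pending.remove(tx)
        if pending:
            sleep(poll_interval)
    return latency


# Simulated network: hypothetical confirmation times in seconds on a fake clock.
now = [0.0]
confirm_at = {"tx0": 3.0, "tx1": 5.0}


def advance(seconds: float) -> None:
    now[0] += seconds


latencies = measure_acceptance_latency(
    confirm_at,
    send=lambda tx: None,
    is_confirmed=lambda tx: now[0] >= confirm_at[tx],
    clock=lambda: now[0],
    sleep=advance,
)
print(latencies)  # {'tx0': 3.0, 'tx1': 5.0}
```

Because the status is only observed when polled, the measured latency is rounded up to the next poll, so the polling interval bounds the measurement resolution.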

Steps for Evaluating Ethereum Block Generation Time:

  1. User Operation: The user node sends a query to check the block number N (i.e. the total number of confirmed blocks at the initial time) of the Ethereum network, then records it with the current time point t_N.

  2. User Operation: The user node continuously queries the block number n at the current time point t_n until the block number has increased by 12.

  3. User Operation: If a change of block number is detected, e.g. n = N + 1, the user node calculates and records the time interval between the generation of two adjacent blocks, i.e. Δt_n = t_n − t_N. Then, it updates N to n and t_N to t_n, so that the block generation time is measured for each block.

  4. User Operation: If no changes are detected, GOTO Step 2.
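
A minimal sketch of this procedure is given below, assuming a hypothetical `get_block_number` callable that returns the latest block number of the network under test. The demo uses a simulated chain with a fake clock; the block times in it are illustrative values, not measurements.

```python
import time
from typing import Callable, List


def measure_block_times(
    get_block_number: Callable[[], int],
    n_blocks: int = 12,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
    poll_interval: float = 0.5,
) -> List[float]:
    """Record the interval between consecutive block-number changes."""
    start = last = get_block_number()  # Step 1: block number N ...
    t_last = clock()                   # ... and its time point t_N
    intervals: List[float] = []
    while last - start < n_blocks:     # Step 2: until the number has grown by n_blocks
        n = get_block_number()
        t_n = clock()
        if n > last:                   # Step 3: record t_n - t_N, then update N and t_N
            # If polling misses a block, this interval spans more than one block.
            intervals.append(t_n - t_last)
            last, t_last = n, t_n
        else:                          # Step 4: nothing new, poll again
            sleep(poll_interval)
    return intervals


# Simulated chain: blocks appear at these (hypothetical) times on a fake clock.
now = [0.0]
block_times = [2.0, 5.0, 6.0]


def advance(seconds: float) -> None:
    now[0] += seconds


intervals = measure_block_times(
    get_block_number=lambda: 100 + sum(1 for b in block_times if b <= now[0]),
    n_blocks=3,
    clock=lambda: now[0],
    sleep=advance,
    poll_interval=1.0,
)
print(intervals)  # [2.0, 3.0, 1.0]
```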

Steps for Evaluating Account Balance Query Latency:

  1. User Operation: The user node sends a query to get the balance for its account and captures the current time point t.

  2. User Operation: When the queried balance is received, the user node captures the current time point T, and then calculates and records the time interval Δt = T − t.

  3. User Operation: The process is repeated 10 times to perform balance queries.
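
The balance query measurement can be sketched in the same style. The `query_balance` callable is a hypothetical stand-in for the real request to a test-net, and the demo replaces it with a local function that merely sleeps for a short time.

```python
import statistics
import time
from typing import Callable, List


def measure_query_latency(
    query_balance: Callable[[], object],
    repeats: int = 10,
    clock: Callable[[], float] = time.perf_counter,
) -> List[float]:
    """Return one latency sample (T - t) per balance query."""
    samples: List[float] = []
    for _ in range(repeats):   # Step 3: repeat the query
        t = clock()            # Step 1: time point when the query is sent
        query_balance()
        T = clock()            # Step 2: time point when the balance is received
        samples.append(T - t)
    return samples


def fake_balance_query() -> int:
    """Stand-in for a network request; sleeps briefly and returns a dummy balance."""
    time.sleep(0.001)
    return 42


samples = measure_query_latency(fake_balance_query)
print(len(samples), min(samples), statistics.mean(samples), max(samples))
```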

4 Experimental Results and Analysis

The experimental results first present the block generation time and transaction acceptance latency for each test-net separately. The last sub-section then presents the balance query latencies for the three test-nets.

Ropsten

Figure 1 shows that the time to produce blocks varies significantly (from 8.25 to 19.16 s for 12 blocks, and between 2 and 53 s per block). Figures 2, 3, 4 and 5 present the transaction acceptance times under different loads (1, 25, 50 and 100 concurrent transactions): the transaction time is between 21 and 418 s for 1 transaction, between 52 and 750 s for 25 transactions, between 31 and 463 s for 50 transactions, and between 4.79 and 186 s for 100 transactions.

Fig. 1. Time of generating 12 blocks in Ropsten

Fig. 2. Time of 1 transaction in Ropsten

Fig. 3. Time of 25 concurrent transactions in Ropsten

Fig. 4. Time of 50 concurrent transactions in Ropsten

Fig. 5. Time of 100 concurrent transactions in Ropsten

In PoW, the average time interval between two generated blocks is predefined (15 s). However, due to the randomness of hitting a hash that meets the condition (to generate a block), the actual time intervals between two blocks can be very different. The randomness of block generation speed potentially leads to longer transaction queues even when the Blockchain is not congested [15]. This may be the cause of the fluctuations in transaction acceptance rates (Fig. 2, 3, 4 and 5). Furthermore, transaction acceptance rates depend on the transaction generation speed of the network and on miners’ policies for including a transaction in a block.
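
The effect of this randomness can be illustrated with a small simulation. It assumes that block discovery is memoryless, so that the intervals between blocks follow an exponential distribution with a mean of 15 s; this is a common modelling assumption and not a fit to the measurements in this paper. The function name and the seed are arbitrary.

```python
import random
import statistics


def simulate_block_intervals(n_blocks: int, mean_interval_s: float = 15.0, seed: int = 0):
    """Draw inter-block times from an exponential distribution with the given mean."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / mean_interval_s) for _ in range(n_blocks)]


# Long run: the mean stays close to the target while single intervals spread widely.
intervals = simulate_block_intervals(10_000)
print(round(statistics.mean(intervals), 2))
print(round(min(intervals), 2), round(max(intervals), 2))

# Short run of 12 blocks, the window length used in the experiments.
window = simulate_block_intervals(12, seed=1)
print(round(sum(window), 2), [round(x, 1) for x in window])
```

Under this model the standard deviation of a single interval equals its mean, so individual intervals far below or far above 15 s are expected even when the long-run average is on target.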

Kovan

Figure 6 shows that the total times for generating 12 blocks are similar (about 143 s); however, there is a large variation for each block (from 2 to 20 s, in Fig. 6 (a)). Figures 7, 8, 9 and 10 show the transaction acceptance latencies for different numbers of concurrent transactions (1, 25, 50 and 100). The transaction acceptance time is between 12 and 34 s for 1 transaction, between 5 and 29 s for 25 concurrent transactions, between 6.8 and 30 s for 50 concurrent transactions, and between 6.22 and 19.33 s for 100 concurrent transactions.

Fig. 6. Time of generating 12 blocks in Kovan

Fig. 7. Time of 1 transaction in Kovan

Fig. 8. Time of 25 concurrent transactions in Kovan

Fig. 9. Time of 50 concurrent transactions in Kovan

Fig. 10. Time of 100 concurrent transactions in Kovan

Figure 6 shows that block generation time in Kovan is more stable than in Ropsten. Kovan employs an authority round PoA algorithm (Aura) [9] instead of PoW. PoA is much more stable in block generation because the authority to generate a block is assigned to a node at more stable time intervals. The end-to-end transaction acceptance time is also more stable than that of Ropsten, as shown in Fig. 7, 8, 9 and 10.

Rinkeby

Figure 11 shows that the time to produce 12 blocks is relatively stable (around 180 s in total for 12 blocks, 13–16 s per block). Figures 12, 13, 14 and 15 show the transaction acceptance latencies for different transaction concurrencies (1, 25, 50 and 100). The time to accept the transactions is between 6 and 31 s for 1 transaction, between 11 and 27 s for 25 concurrent transactions, between 13.88 and 50 s for 50 transactions, and between 2.55 and 18 s for 100 transactions.

Fig. 11. Time of generating 12 blocks in Rinkeby

Fig. 12. Time of 1 transaction in Rinkeby

Fig. 13. Time of 25 concurrent transactions in Rinkeby

Fig. 14. Time of 50 concurrent transactions in Rinkeby

Fig. 15. Time of 100 concurrent transactions in Rinkeby

Like Kovan, Rinkeby uses PoA instead of PoW to avoid wasting computational resources, which leads to stable transaction times. Notably, the consensus scheme used by Rinkeby is Clique PoA. According to Fig. 11, Rinkeby shows better stability than Kovan, which uses Aura PoA. However, the block generation interval is configured to 15 s. This causes the end-to-end transaction times, as shown in Fig. 12, 13, 14 and 15, to be longer than those in Kovan.

Balance Query Latency

Figures 16, 17 and 18 show the account balance query times in Ropsten, Kovan and Rinkeby. The time to check the balance is between 0.173 and 0.375 s in Ropsten, between 0.215 and 0.225 s in Kovan, and between 0.177 and 0.259 s in Rinkeby. The results show that the balance query latencies in all the test-nets are of the same order of magnitude, with Kovan showing the smallest variation and Ropsten the largest.

Fig. 16. Time of account balance query in Ropsten

Fig. 17. Time of account balance query in Kovan

Fig. 18. Time of account balance query in Rinkeby

5 Discussion

The experimental results quantify the impacts of consensus schemes on transaction performance such as throughput and latency. Consequently, we summarise the certainty and randomness of the Ethereum Blockchain. For PoW (Ropsten), although the average block generation time is around 15 s with high variance, the time from submitting a transaction to its confirmation by miners can be more than 400 s, i.e. longer than the time to generate 10 new blocks. For PoA (Kovan and Rinkeby), the stability of block generation time is dramatically improved; however, the time cost of confirming a single transaction can be larger than the time of generating 3 new blocks (>45 s). As we can see, in the experiments with 100 concurrent transactions on all three test-nets, the end-to-end latencies are slightly lower than in the others. A possible reason is that 100 transactions make up a higher proportion of the total transactions than 1, 25 or 50 transactions do, which brings a higher probability of recording more transactions in the most recent block and fewer in upcoming blocks, reducing the average latencies. However, the reductions are trivial and have little impact on total latencies.

In short, the results demonstrate that the current Ethereum approaches with PoA and PoW consensus schemes suffer from long transaction times and instability. In other words, the existing on-chain consensus schemes can hardly support applications (DApps) with frequent data submission requirements and low-latency constraints.

6 Conclusion and Future Work

In this paper, we present the performance evaluation results of Ethereum from different perspectives through three sets of experiments on Ropsten, Kovan and Rinkeby. The results show that the account balance query time is quite low and stable, which means that the time to access the latest Blockchain status and data is not the bottleneck. Nevertheless, for Ropsten, the transaction latencies fluctuate significantly across experimental groups (with different transaction loads). This pattern also appears in the block generation time of Ropsten, i.e. the time to generate a new block is unstable. Note that Ropsten employs the same operating scheme (e.g. the PoW consensus algorithm) as the Ethereum main-net and can therefore be considered the best approximation of it. In other words, the instability of Ropsten reflects the volatile performance of the current Ethereum.

In conclusion, we consider that the current Ethereum approach is not powerful enough to support applications (i.e. DApps) with low-latency constraints and high-frequency transaction submission requirements. For example, online games, which require a timely consensus on consistency among players, cannot be supported by existing Ethereum approaches due to the inherently large time intervals (>10 s) of block generation. In future work, we are going to investigate how off-chain transactions can tackle this problem, aiming at reducing transaction latencies and better supporting high-frequency transaction submission.

We also plan to evaluate the performance of Ethereum for online game applications using real game datasets, focusing especially on consistency. In addition, we will extend the proposed experiments to the Ethereum main-net to capture more comprehensive results.