1 Introduction

Big data represents the leading edge of innovation, competition, and productivity [1]. A multitude of advanced analytical algorithms and applications harness big data to pioneer novel theories and technologies, such as artificial intelligence and edge computing. In the midst of the big data surge, the processes of data sharing and exchanging occur ubiquitously and continuously. Such sharing and exchanging take place between specific entities, be they individuals, devices, or databases. These entities relay information amongst themselves, with the mechanisms of transmission spanning electronic methods or specialized systems [2]. Notably, while data exchanging entails a bidirectional transfer, data sharing is a unidirectional process. In recent decades, the paradigm of hosting, sharing, and exchanging data in the cloud has emerged as the predominant design choice. This has led to the rise of third-party platforms as the preferred means for participants in data sharing and exchanging. For instance, Amazon introduced the “Amazon Web Services (AWS) Data Exchange”, a platform that allows customers to tap into third-party data sources within the AWS marketplace. This service ensures reliable access for customers on an unprecedented scale and doubles as a streamlined tool for data ingestion and utilization [3].

Data sharing and exchanging offer a plethora of benefits, including fee-less transactions, tamper resistance, enhanced services, high transparency, and real-time engagement for all involved parties. A pertinent example is Google Drive’s collaboration with WhatsApp [4], allowing users to back up their chat histories and media to the cloud, ensuring data portability and recovery without transaction fees. Nevertheless, this paradigm faces a multitude of challenges:

  • A predominant challenge in data sharing and exchanging concerns the willingness of ordinary individuals to engage and share their data resources. Concomitant issues of privacy, security, and costs (e.g., energy consumption, network bandwidth) might deter participants, especially if the rewards are not deemed sufficient. Thus, crafting mechanisms to incentivize participation becomes a pressing priority.

  • There exists an inherent trade-off between data privacy and accessibility. For cloud-based data-sharing and exchanging platforms, striking a balance between security and efficiency becomes pivotal during mechanism design.

  • As the digital market for data sharing and exchanging evolves, devising an equitable data pricing strategy emerges as a new challenge. The quest for an efficient digital market necessitates mechanisms that price data transparently while safeguarding data privacy.

Given these challenges, designing incentive-based mechanisms stands out as a pivotal research area within the realm of data sharing and exchanging.

In recent years, the design of incentive mechanisms has become increasingly prevalent in crowdsensing applications within the realm of computing. One illustrative case is Waze, a crowdsourcing-based navigation application. The platform introduced a mechanism termed “Awazeing Race” to motivate both existing and prospective users to engage with the Major Traffic Event (MTE) tool in the Waze Map Editor (WME). This initiative was aimed at enhancing the volume of user-contributed MTEs and closures, thereby enriching the overall Waze experience for local users. A review of pertinent literature reveals that incentive mechanisms in computing can be broadly classified into three categories: entertainment, service, and monetary incentives [5]. Entertainment-centric incentives predominantly employ location-based mobile games to spur participation [6,7,8]. Service-oriented incentives, on the other hand, leverage the promise of enhanced service benefits as a motivational strategy. For instance, in GPS applications, users not only consume data but also contribute to its generation, driven by the aspiration for superior service quality [9, 10].

Monetary-based incentives have emerged as a prevalent strategy to motivate mobile sensors to participate. Within this domain, price determination and the criteria for winner selection have piqued the interest of numerous researchers. For instance, ride-sharing apps like Uber use dynamic pricing algorithms that incentivize drivers (mobile sensors) by increasing fares during peak demand times, effectively balancing supply and demand [11]. Nevertheless, the intricacies of designing incentive mechanisms escalate when applied to the data-sharing and exchanging process. Our analysis reveals that, relative to the aforementioned categories, monetary incentives garner more extensive attention in computing research. As delineated in Table 2, a significant fraction of researchers have gravitated towards leveraging game theory algorithms in computing to pursue objectives such as utility maximization [12,13,14], profit maximization [15,16,17], and social welfare maximization [13, 18,19,20,21]. Concurrently, there are studies employing economic incentives to attain analogous goals. Notwithstanding this proliferation, it is noteworthy that only a limited number of researchers have delved into the holistic design of incentive mechanisms within the entire data-sharing and exchanging platform. A real-world example is the use of cashback rewards by credit card companies to encourage consumers to share transaction data, which is then utilized for personalized marketing and data analysis [22].

We decompose the data sharing and exchanging process into four principal components: data creation, data storage, data access, and data privacy preservation. Contrary to the classifications of earlier researchers, we posit that it’s redundant to segregate incentive mechanisms into entertainment-based, service-based, and money-based categories. Instead, an amalgamation of both monetary and non-monetary incentives is imperative to galvanize holistic participation in the data-sharing and exchanging ecosystem. For instance, on such a platform, integrating service-based with monetary incentives can be an efficacious strategy. This would entail providing participants with both service credits and direct monetary rewards. Notably, even though providers in the data-sharing and exchanging paradigm might concurrently serve as requesters, the allure of service credits remains undiminished, proving invaluable when they seek access to future data resources. Take Microsoft Azure, for instance, which provides credits to users who contribute to its machine learning datasets, encouraging a reciprocal data-sharing ecosystem [23].

In the ensuing sections of this survey, we commence by presenting a preliminary definition of data sharing, data exchanging, and the underlying incentive mechanisms. Subsequently, we delve into a thorough review and discourse on the associated incentive mechanisms and optimization algorithms that underpin the life cycles of data sharing and exchanging. Ultimately, we shed light on the prevailing challenges and opportunities encompassing data creation, storage, access, and privacy preservation in the context of data exchange and sharing.

Our primary contributions to this domain can be distilled as follows:

  • We put forth a nuanced taxonomy of the incentive-driven processes in data sharing and exchanging, predicated on its lifecycle. Concurrently, we encapsulate the challenges inherent to each phase.

  • Our discourse extends to a meticulous examination of incentive mechanisms pivotal to data sharing and exchanging. Although we bifurcate these mechanisms into monetary and non-monetary classifications, our stance diverges from preceding researchers; we advocate for a synergistic integration of both categories to stimulate greater participation in data sharing and exchanging.

  • For the first time, we systematically deconstruct the lifecycle of data sharing and exchanging into its quartet of elements: data creation, data storage, data access, and data privacy preservation. Each segment is underpinned by a comprehensive exploration to serve as a point of reference. Additionally, we provide an exhaustive analysis of the nuances of privacy preservation spanning the entire lifecycle.

  • We highlight five emergent research trajectories in the ambit of incentive mechanisms for data sharing and exchanging. These span computational efficiency, trustworthiness, data privacy and security, data management system intricacies, and data quality, among others. Within each avenue, we discern current lacunae and prospective directions. A notable proposition is the conceptualization of a system rooted in blockchain technology for data sharing and exchange, synergized with diverse incentive mechanisms. The integration of such mechanisms with deep learning algorithms, we posit, will pave the way for the next generation of incentive-centric data-sharing and exchanging frameworks.

The remainder of this paper is organized in a systematic fashion. Section 2 delineates the preliminary definitions central to our discussion. In Sect. 3, we present a review of the pertinent existing literature. Section 4 offers a comprehensive analysis of the lifecycle associated with data sharing and exchanging. Challenges inherent to the field are highlighted in Sect. 5, while potential research avenues are explored in Sect. 6. Finally, Sect. 7 elucidates the research opportunities present within various incentive-based data-sharing and exchanging applications.

2 Preliminary definition

2.1 Data sharing

Data sharing occurs among n distinct entities, which can be represented by the list \(\mathbb {B}=(\beta _1, \beta _2, \beta _3,..., \beta _n)\). This sharing can take various forms: it can transpire between individuals with extensive databases, between individuals and public organizations, or between public and private entities. We can denote a collection of datasets as \(\mathbb {D}=(d_1, d_2, d_3,..., d_m)\). In the context of data sharing, an entity represented by \(\beta _i\) can access the dataset represented by \(d_j\) provided they obtain the requisite authority from another entity. Database access grants have traditionally been utilized as a mechanism for data sharing. To access such a granted database, user accounts must belong to one or more user groups, upon which authorization to the databases is conferred. For instance, the SQL GRANT statement is a widely recognized mechanism employed across various database systems, such as SQL Server [24].
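
To make this group-based grant logic concrete, the following database-agnostic Python sketch models it in miniature; the user, group, and dataset names are purely illustrative, and a real system would enforce this inside the database engine (e.g., via SQL GRANT statements):

```python
# Group-based access grants in miniature: users belong to groups, and
# authorization to a dataset is conferred on groups rather than individuals.
user_groups = {"alice": {"analysts"}, "bob": {"engineers"}}
grants = {"d_1": {"analysts"}, "d_2": {"analysts", "engineers"}}

def can_access(user: str, dataset: str) -> bool:
    """An entity may access a dataset iff one of its groups holds a grant on it."""
    return bool(user_groups.get(user, set()) & grants.get(dataset, set()))

print(can_access("alice", "d_1"))  # True: analysts hold a grant on d_1
print(can_access("bob", "d_1"))    # False: engineers hold no grant on d_1
```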

Cloud-based data sharing has become a ubiquitous method in the contemporary era of data dissemination. Cloud storage and computing serve as pivotal elements in the domains of data sharing and exchange. These cloud infrastructures utilize standard protocols to provide access to a myriad of configurable resources, encompassing applications, storage solutions, networks, servers, and various services. The concept of using the cloud for data sharing can trace its origins to an internal document of Compaq in 1996. This idea matured over the subsequent decade, culminating in the advent of cloud computing. In 2006, Amazon took a significant step in this direction by launching the Elastic Compute Cloud (EC2) to bolster its Amazon Web Services. Following suit, Google introduced the Google App Engine in 2008. The 2010s witnessed the emergence of sophisticated Smarter Computing frameworks, subsequent to IBM’s unveiling of IBM SmartCloud. Within these advanced data sharing and computing architectures, cloud-based data sharing is an integral component [25].

Utilizing cloud-based platforms for data sharing offers a multitude of advantages, notably reducing costs and infrastructural management overheads. Users benefit from the “pay-as-you-go” model, incurring costs only for data processing and storage, be it in a public or private cloud. Furthermore, cloud services are renowned for their scalability, adeptly adjusting to varying demands, expediting development tasks, and delivering efficient computing solutions [26, 27]. Data sharing via the cloud empowers entities to seamlessly access data remotely [28]. Numerous applications leverage cloud data-sharing capabilities, enhancing quality of life and productivity. For instance, Google Docs [29] furnishes a collaborative environment for users to disseminate diverse data types like documents and images. Similarly, DocuSign [30] facilitates the sharing of documents requiring signatures. However, with the exponential proliferation of IoT devices and the advent of 5G technology, the demands on data sharing have intensified. Critical questions arise, such as the cloud’s capacity to process and store voluminous data from myriad IoT devices, and whether latency issues can be effectively managed for time-sensitive applications like autonomous vehicles. Addressing these concerns, recent years have seen the emergence of cutting-edge edge computing paradigms, including fog computing (FC) [31], mobile edge computing (MEC) [32], and mobile cloud computing (MCC) [33]. Shifting data sharing to the edge has proven increasingly efficient and prevalent. These avant-garde distributed computing frameworks share a core principle: rather than relying on centralized cloud resources, they harness computational power closer to the end-users, typically through smart or edge devices. Such a configuration optimizes data sharing by processing data at the edge, markedly reducing transmission times. A practical example of MEC is the deployment of edge servers by telecom operators to provide low-latency gaming experiences on mobile devices. MCC’s real-world application can be seen in services like iCloud, which seamlessly integrate edge devices with cloud storage to optimize data accessibility and processing [34].

2.2 Data exchanging

Data exchange, while falling under the umbrella of data sharing, is distinct in its bidirectional nature. In this paradigm, entities engaged in the data-sharing ecosystem reciprocally exchange their resources to fulfill their respective data requirements.

Table 1 Data exchanging highlights

Similar to data sharing, data exchanging takes place among n entities and is denoted by the list \(\mathbb {B}=(\beta _1, \beta _2, \beta _3,..., \beta _n)\). These exchanges can involve various entities, from individuals with vast databases to interactions between individuals and public organizations, and between public and private organizations. The datasets involved are represented as \(\mathbb {D}=(d_1, d_2, d_3,..., d_m)\). When data sharing occurs, an entity \(\beta _i\) can access a dataset \(d_j\) if it obtains the authority from another entity. However, in data exchanging, there is a defined set of goals or targets, represented as \(\mathbb {T}=(t_1, t_2, t_3,..., t_i)\). To realize a data exchanging target \(t_i\), an appropriate incentive schema should be in place to spur it to completion. Data sharing historically took place in centralized databases and the cloud. However, the trend is increasingly shifting towards decentralization. Table 1 elucidates the significant milestones in the evolution of languages used for data exchange.

Data exchange enables the transfer of data between various systems and organizations while maintaining its integrity and meaning, ensuring that no modifications or alterations are made to the content [35]. This process often involves incentives to foster participation. The data requesters can compensate the data owners through various means, including monetary rewards or alternative data resources. Several algorithms, drawing from fields such as economics and game theory, have been developed to determine the optimal compensation or reward in the data-exchanging scenario. A practical example of this mechanism in action is the platform Airbnb [36]. Airbnb, a peer-to-peer service for people to list, discover, and book accommodations around the world, embodies the essence of data exchange. Property owners list their homes for travelers to rent, essentially sharing their data (property details, availability, price, etc.) with potential guests. In return, they receive monetary compensation when travelers book their spaces. Simultaneously, Airbnb uses optimization and ranking algorithms to gauge the success of each property listing. Properties that fare well, receive positive reviews, or fit specific criteria are then prioritized and given more visibility in the platform’s search results, benefiting the homeowners further. This system of rewards, both in visibility and monetary compensation, exemplifies the principles of data exchange.

Fig. 1 Lifecycle of data sharing and exchanging

Participants in data exchange might be hesitant to share their data if they believe the data they receive in return falls short of their expectations. A significant incident occurred in 2021 with the Microsoft Exchange Server data breach, in which attackers gained access to user emails and passwords, eroding trust in secure data exchange [37]. Participants might have believed that it wasn’t worthwhile to share their data without adequate incentives. Thus, establishing a fair data valuation and incentivizing participants to engage in the exchange becomes paramount. Within the sphere of data pricing, numerous factors require optimization. Challenges such as determining the right price point and allocating value based on data quality are pivotal. Consequently, the development of robust and precise pricing algorithms is integral to the success of data exchange.

2.3 Data sharing and exchanging life-cycle

The process of data sharing and exchanging hinges on four primary components: data creation, data storage, data access, and data privacy preservation. The genesis of this process, data creation or collection, involves pivotal decisions regarding the nature, method, and volume of data collection. Central questions include: What kind of data should be collated? What are the optimal methods for its collection? How extensive should the data pool be? Once created, the data’s preservation requires both secure and efficient storage solutions. Data access, the subsequent phase, revolves around granting permissions to various stakeholders involved in the data sharing and exchange process. Meanwhile, data privacy preservation is not merely an isolated component but an omnipresent factor throughout the data lifecycle, ensuring the integrity and confidentiality of shared data. The entire lifecycle of data sharing and exchange can be visualized in Fig. 1.

The distinction between data exchanging and data sharing lies in the transactional nature of the former. Data exchange embodies a two-way mechanism characterized by a reciprocal trading process. Hence, in the context of data exchange, it becomes paramount to motivate multiple entities to actively engage in the exchange while ensuring that the process remains robustly secure. These considerations underscore the key research themes in this realm.

2.4 Incentive mechanisms

Incentive mechanisms have traditionally played a pivotal role in the realm of human resources management, acting as catalysts to drive employee motivation, performance, and overall achievement [38]. A notable example can be seen in Google’s work environment, which has garnered a reputation for being exceptionally gratifying. The tech giant has ingeniously embedded incentive-driven strategies into its human resource management framework. Through an intricate system of incentives, ranging from peer bonuses to performance-based rewards, Google has fostered an organizational climate ripe with trust. This ecosystem not only promotes collaboration and teamwork but also empowers employees within similar departments to synergize their efforts and aid one another [39]. As the digital landscape evolved, especially with the proliferation of the Internet of Things (IoT) and the ubiquity of big data, these incentive mechanisms have found their application extended to the domain of data science.

The surge in mobile device usage has catalyzed the development of a myriad of mobile crowdsensing applications. These tools harness the power of collective intelligence, leveraging mobile users to share data for various sensing tasks in a crowdsourced fashion. Challenge.gov stands as a testament to this trend—a digital platform where the public collaboratively addresses pressing issues faced by federal agencies. Through this platform, innovative solutions are crowd-sourced, facilitating more informed and effective governmental decisions [40]. Existing research classifies incentive mechanisms within crowdsensing applications into three primary categories [5]:

  • Entertainment-based mechanisms: These are designed to pique user interest by integrating elements of fun and engagement. Specifically, they encourage participation through location-based mobile games. Such gamified mechanisms have been explored and discussed in various studies, highlighting their effectiveness in promoting user engagement in crowdsensing tasks [6,7,8].

  • Service-based mechanisms: Such incentives offer tangible service benefits to users in return for their participation, capitalizing on the mutual relationship where both the provider and the participant stand to gain. A prime example can be observed in GPS applications. Here, users, while benefiting from the service, also act as data providers. The underlying principle is that a collective effort from all users ensures a more refined and accurate service [9, 10].

  • Money-based mechanisms: Monetary rewards remain a tried-and-true incentive. Within the realm of crowdsensing, the intricacies lie in determining the appropriate pricing strategy and selecting winners. These components are pivotal and have piqued the interest of many researchers aiming to optimize and refine monetary incentive systems.

In summary, as the digital landscape grows more interconnected, the potential of mobile crowdsensing applications continues to expand. Harnessing this potential effectively necessitates the design and implementation of robust incentive mechanisms that cater to a diverse user base.

In recent times, the burgeoning field of blockchain technology has emerged as a pivotal solution for safeguarding data privacy. Numerous scholars have delved into the realm of incentive mechanisms within the blockchain environment. Broadly, these mechanisms can be categorized into two predominant types: those rooted in game theory and external incentives [41].

Consensus algorithms, intrinsic to blockchain operations, necessitate incentives to galvanize miners to compute the hash functions, subsequently facilitating the creation of new transactions. The overarching objective of achieving consensus within blockchain networks is to ensure a unanimous agreement among all participating nodes. This process empowers even the untrusted nodes, enabling them to select an individual or a cluster of nodes responsible for instigating new transactions. Various incentive strategies have been formulated in the blockchain context, including but not limited to, Proof of Work (PoW), Proof of Stake (PoS), and Zero-Knowledge Proof.
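
As a minimal illustration of the computational puzzle that PoW incentives reward, the sketch below searches for a nonce whose SHA-256 hash meets a difficulty target; the block data and difficulty value are illustrative assumptions:

```python
import hashlib

def proof_of_work(block_data: str, difficulty: int = 4) -> int:
    """Find a nonce whose hash has `difficulty` leading zero hex digits."""
    target = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(target):
            return nonce  # the miner presenting this nonce earns the reward
        nonce += 1

print(proof_of_work("tx-batch-001"))  # first nonce satisfying the target
```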

Beyond blockchain, federated learning has emerged as another privacy-preserving paradigm whose effectiveness hinges on well-designed incentives. During the COVID-19 pandemic, federated learning was instrumental at UCSF in developing AI models to predict the need for supplemental oxygen in patients, leveraging data across 20 hospitals without compromising patient privacy [42].

The intricate process underlying federated learning is illustrated in Fig. 2.

Fig. 2 Federated learning process

Consequently, introducing incentives in federated learning becomes imperative to counteract potential challenges posed by selfish nodes and participants of suboptimal quality.

3 Existing data sharing and exchanging incentive mechanisms

This section categorizes the prevailing incentives in data sharing and exchanging into two distinct types: monetary and non-monetary incentives. As depicted in Fig. 4, these incentives structure the landscape of existing research. Notably, from our analysis, a synergistic approach combining both monetary and non-monetary incentives could be more effective in motivating participants to actively engage in the data-sharing and exchanging processes.

3.1 Monetary incentives

In the realm of computing, the emphasis predominantly falls on monetary incentives, as evidenced by a majority of the research in this domain. As illustrated in Table 2, we have encapsulated 24 notable studies from the computing sector. These studies provide insights into the performance metrics, types of mechanisms employed, applications, and optimization objectives of each respective paper. A discernible trend from our analysis indicates that game theory remains the quintessential algorithmic approach for designing incentive mechanisms. A majority of these works pivot around objectives of utility maximization and social cost minimization during the formulation of their optimization strategies.

3.1.1 Game theory-based incentives

Numerous game theory algorithms have prominently featured in the incentive mechanisms for data sharing and exchanging. According to a survey by Liang et al. [43], the dominant algorithms in this realm include the Stackelberg game, non-cooperative game, bargaining game, and the Vickrey-Clarke-Groves (VCG) game.

Table 2 Related works for monetary incentives

In the Stackelberg game, the decision-making process is divided into two periods: the leader commits to its quantity first, and the follower responds. Detailed formulations of the game can be found in Machado’s work [60]. Each node n within the network selects its respective quantity, denoted as \(\mathcal {Q}_n\), with an associated production cost of \(\varsigma _n \mathcal {Q}_n\). For a scenario involving one leader and one follower in the Stackelberg game, the demand curve is defined as:

$$\begin{aligned} P(\mathcal {Q}_1+\mathcal {Q}_2)=a-b(\mathcal {Q}_1+\mathcal {Q}_2) \end{aligned}$$
(1)

The profit of node n, denoted \(\Pi _n(\mathcal {Q}_1,\mathcal {Q}_2)\), can be calculated by:

$$\begin{aligned} \Pi _n(\mathcal {Q}_1,\mathcal {Q}_2)=P(\mathcal {Q}_1+\mathcal {Q}_2)\mathcal {Q}_n-\varsigma _n \mathcal {Q}_n \end{aligned}$$
(2)

In the second period, the maximum profit or revenue can be defined as:

$$\begin{aligned} \max _{\mathcal {Q}_2}\Pi _2=(P(\mathcal {Q}_1+\mathcal {Q}_2)-\varsigma )\mathcal {Q}_2=(a-b(\mathcal {Q}_1+\mathcal {Q}_2)-\varsigma )\mathcal {Q}_2 \end{aligned}$$
(3)

In the initial period, the leader maximizes its profit by anticipating the follower’s reaction function \(R_2(\mathcal {Q}_1)\):

$$\begin{aligned} \max _{\mathcal {Q}_1}\Pi _1=(P(\mathcal {Q}_1+R_2(\mathcal {Q}_1))-\varsigma )\mathcal {Q}_1=(a-b(\mathcal {Q}_1+R_2(\mathcal {Q}_1))-\varsigma )\mathcal {Q}_1 \end{aligned}$$
(4)
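
To make Eqs. (1)-(4) concrete, the following minimal Python sketch computes the closed-form outcome of the one-leader, one-follower game under a common unit cost \(\varsigma\); the parameter values are illustrative:

```python
def follower_best_response(q1: float, a: float, b: float, cost: float) -> float:
    # Second period: maximize (a - b*(q1 + q2) - cost) * q2 over q2,
    # giving the reaction function R2(q1) from the first-order condition.
    return max((a - cost - b * q1) / (2 * b), 0.0)

def leader_quantity(a: float, b: float, cost: float) -> float:
    # First period: substitute R2(q1) into the leader's profit and maximize over q1.
    return max((a - cost) / (2 * b), 0.0)

a, b, cost = 10.0, 1.0, 2.0                  # illustrative demand/cost parameters
q1 = leader_quantity(a, b, cost)             # leader commits first: 4.0
q2 = follower_best_response(q1, a, b, cost)  # follower reacts: 2.0
price = a - b * (q1 + q2)                    # resulting market price: 4.0
print(q1, q2, price)
```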

The increasing demand for more efficient and advanced data-sharing and exchanging mechanisms has led researchers to explore various game theory models. The Stackelberg game, in particular, has been at the forefront of such explorations due to its effectiveness in handling hierarchical decision-making processes. A closer look at recent literature sheds light on its widespread application across multiple domains: Li et al. [13] ventured into the domain of WiFi-based indoor localization systems. Recognizing the challenges of constructing a radio map via conventional site surveys, they turned to crowdsourcing as a remedy. Mobile users were incentivized to contribute their indoor trajectories. Employing a two-stage Stackelberg game, the authors ensured the dual goals of maximizing mobile users’ utility while ensuring profitability for the crowdsourcing platform. Such applications are shaping the future of data sharing and exchanging mechanisms.

The Vickrey-Clarke-Groves (VCG) mechanism stands out in the realm of mechanism design for its truth-inducing properties, encouraging participants to reveal their genuine valuations. This mechanism ensures an outcome that optimizes social welfare. In VCG, each winner’s payment \(\rho _i\) is the difference between the total cost incurred by the others when verifier i does not participate and their total cost when verifier i joins. It can be defined as:

$$\begin{aligned} \rho _i=\sum _{{\nu _j}\ne {\nu _i}}\zeta _j(W^*_{-i})-\sum _{{\nu _j}\ne {\nu _i}}\zeta _j(W^*_{i}) \end{aligned}$$
(5)
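
As a hedged illustration of Eq. (5), the sketch below computes VCG payments for a simple reverse auction in which the k lowest-cost verifiers win; the bids and the selection rule are illustrative assumptions rather than a mechanism from the cited works:

```python
def vcg_payments(bids: dict, k: int) -> dict:
    """Pay each winner the externality it imposes on the others (Eq. 5)."""
    def cheapest(pool: dict) -> list:
        return sorted(pool, key=pool.get)[:k]

    winners = cheapest(bids)
    payments = {}
    for i in winners:
        others = {v: c for v, c in bids.items() if v != i}
        cost_without_i = sum(others[w] for w in cheapest(others))  # others' cost if i abstains
        cost_with_i = sum(bids[w] for w in winners if w != i)      # others' cost when i wins
        payments[i] = cost_without_i - cost_with_i
    return payments

# Four verifiers bid their costs; the two cheapest win and are each paid 8.0.
print(vcg_payments({"v1": 3.0, "v2": 5.0, "v3": 8.0, "v4": 10.0}, k=2))
```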

Several research endeavors have adopted the VCG mechanism within the context of blockchain ecosystems. Notably, these studies predominantly center around the allocation of computational resources between miners and edge service providers. For instance, Jiao et al. [18] formulated an auction game between edge computing service providers and miners requiring computational resources. Through their proposed auction mechanism, they managed to optimize social welfare. Furthermore, their methodology ensures individual rationality, truthfulness, and computational efficiency. In another study, Gu et al. [19] leveraged the VCG auction mechanism to address issues related to storage transactions. Implementing their model on the Ethereum platform, they were able to demonstrate that their approach fosters secure, efficient, and cost-effective resource trading.

The VCG mechanism has also been embraced in a myriad of domains including edge computing, wireless networks, crowdsourcing, and crowdsensing, among others. For instance, in the sphere of mobile crowdsensing, Li et al. [20] leveraged the VCG mechanism. Their theoretical algorithms aimed to enhance the efficiency of platforms while making them more appealing for prospective participants. Similarly, Zhou et al. [21] pioneered a novel framework within the crowdsensing domain. Their methodology combined the rewarding potential of the VCG mechanism with edge computing to alleviate computational traffic and workload. Moreover, they integrated advanced deep learning algorithms, such as Convolutional Neural Networks (CNN), to sieve out spurious and irrelevant information that could be disseminated by inauthentic participants. Their empirical case study further reinforced the robustness of their proposed framework. Liu [64], venturing into the realm of ridesharing systems, harnessed the VCG mechanism to conceive a cost-sharing architecture. He meticulously devised two VCG-centric mechanisms tailored for both rudimentary and intricate scenarios. His model notably underscored the potential of minimizing societal costs. Lastly, Borjigin et al. [65] melded VCG algorithms into their innovative multiple-Walrasian auction mechanism, particularly for the valuation service of trees in the network function virtualization market. Their primary objective in utilizing the VCG mechanism was to accentuate and maximize societal effectiveness.

In non-cooperative games, players act independently, making decisions based on predictions of other players’ strategies and payoffs, with the aim of identifying a Nash Equilibrium [66]. Such games are characterized by four fundamental components: players, actions, strategies, and payoffs. Assume a set of players \(\mathbb {P}=\{\rho _1,\rho _2, ..., \rho _n\}\) participates in the game, and a set of strategies \(\mathbb {S}=\{\phi _1,\phi _2, ..., \phi _m\}\) represents how a player will act in every possible distinguishable circumstance. The payoffs are the utilities of the players: if the utility of player i is denoted \(\mu _i(\phi _i, \overrightarrow{\phi }_{-i})\), then the other players’ strategies are \(\overrightarrow{\phi }_{-i}=\{\phi _1, \phi _2, ..., \phi _{i-1}, \phi _{i+1}, ..., \phi _m \}\). To find the optimal utility, player i’s strategy \({\phi _i}^*\) must be the best response to the strategies specified for the other \(n-1\) players. The Nash Equilibrium can be defined as follows:

$$\begin{aligned} {\phi _i}^*= \mathop {\textrm{argmax}}\limits _{\phi _i}\mu _i(\phi _i, \overrightarrow{\phi }_{-i}) \end{aligned}$$
(6)
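
A pure-strategy Nash Equilibrium satisfying Eq. (6) can be located by best-response iteration, sketched below on an illustrative 2x2 game with a prisoner's-dilemma-style payoff structure (convergence is only guaranteed when a pure equilibrium exists):

```python
import numpy as np

U1 = np.array([[3, 0], [5, 1]])  # mu_1(phi_1, phi_2): rows are player 1's strategies
U2 = np.array([[3, 5], [0, 1]])  # mu_2(phi_1, phi_2): columns are player 2's strategies

s1, s2 = 0, 0
for _ in range(100):
    n1 = int(U1[:, s2].argmax())  # player 1's best response to phi_2
    n2 = int(U2[n1, :].argmax())  # player 2's best response to phi_1
    if (n1, n2) == (s1, s2):
        break                     # fixed point: mutual best responses
    s1, s2 = n1, n2
print(s1, s2)  # (1, 1): the defect/defect equilibrium of this game
```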

Zhang et al. [45] introduced a game-theoretic model tailored to enhance the outcomes of the non-cooperative equilibria observed in crowdsourcing applications. Their research identified a delicate balance between social welfare and non-cooperative equilibria. In response, they developed incentive mechanisms rooted in non-cooperative games, pinpointing an optimized solution that maximizes social welfare. Zhan et al. [14] highlighted that as the Internet of Things (IoT) continues to evolve, federated learning emerges as an adept solution to address issues related to network bandwidth, storage, and most pertinently, privacy. Yet, the federated learning landscape is devoid of robust incentive mechanisms, primarily due to the challenges posed by the reluctance to share information and the complexities of contribution evaluation. Addressing this, they introduced a two-tiered incentive mechanism, with the latter stage anchored in a non-cooperative game. This mechanism aimed to galvanize edge nodes, motivating them to more actively and efficiently participate in the training process. Hossain et al. [67] utilized a non-cooperative game approach to address the challenge of resource constraints within a vehicular edge computing setting. In their model, each vehicle autonomously devises its strategy, determining whether to offload a task to a multi-access edge computing server or a cloud server, with the objective of optimizing its benefits.

A bargaining game pertains to a scenario wherein players negotiate to decide the division of benefits derived from cooperation. An illustrative example of this is the negotiation between a seller and a buyer over the price of an automobile. Let the set of players’ strategies be \(\mathbb {S}=\{\phi _1,\phi _2, ..., \phi _m\}\). For any two players, where \(\phi _i\) is the seller’s strategy and \(\phi _j\) the buyer’s, the seller determines a selling price \({\phi _i}^*\) with expected utility \({\mu _i}^*\), and the buyer likewise determines his or her utility \({\mu _j}^*\). If \({\mu _i}^*>{\mu _j}^*\), there is disagreement between the two players, and negotiations must continue. When \({\mu _i}^*\le {\mu _j}^*\), the bargain is struck, and the price strategy \(({\phi _i}^*,{\phi _j}^* )\) is the Nash Equilibrium of this game [43].
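
As a simple illustration of this negotiation, the sketch below reduces each party's utility to a reservation price and, when agreement is possible, splits the surplus evenly (the symmetric Nash bargaining outcome); the automobile figures are illustrative:

```python
def bargain(seller_reserve: float, buyer_reserve: float):
    """Return the agreed price, or None while disagreement persists."""
    if seller_reserve > buyer_reserve:
        return None  # the seller asks more than the buyer will pay: keep negotiating
    return (seller_reserve + buyer_reserve) / 2  # split the cooperative surplus evenly

print(bargain(8000.0, 10000.0))   # 9000.0: the equilibrium price for the car
print(bargain(12000.0, 10000.0))  # None: no mutually beneficial trade exists
```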

Recent research has delved into the application of bargaining games in various sectors: Magerkurth et al. [47] crafted a multi-stage bargaining game tailored for crowdfunding platforms. Their primary objective was to navigate the challenges of crowdfunding benefit allocation, with the ultimate goal of optimizing social welfare. In another study, Lu et al. [48] advanced an incentive mechanism that integrated a bargaining game. Recognizing the constraints of non-cooperative games, they introduced a two-sided rating protocol. Through systematic rating, they devised strategies anchored on intrinsic parameters, aiming for the pinnacle of social welfare maximization. Wang et al. [49] ingeniously melded a Nash bargaining game with deep reinforcement learning methodologies, focusing on enhancing communication in heterogeneous vehicular networks. The core of their approach lies in optimizing the network’s overall performance, striving for the zenith of total reward maximization. Kim [68], on the other hand, conceived a resource management model for pervasive edge computing infrastructure, founded on a bargaining game. He embarked on a comprehensive exploration of the allocation challenges related to computation and communication resources, offering solutions via his proposed model.

3.1.2 Demand and supply model-based incentives

The challenge of determining appropriate reward pricing in incentive mechanisms is perennial. A renowned economic model, known as the demand and supply model, offers insight into determining the price associated with data sharing and exchanging. The demand and supply model elucidates the interplay between data owners and data requesters. At a specific point, an equilibrium price emerges when the quantity demanded aligns with the supply. Such an equilibrium enables efficient resource allocation. Let’s consider the subsequent equations for demand and supply functions. In these equations, P symbolizes the price corresponding to each quantity:

$$\begin{aligned} Q_d=a-bP \end{aligned}$$
(7)
$$\begin{aligned} Q_s=-c+dP \end{aligned}$$
(8)
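
Setting \(Q_d=Q_s\) in Eqs. (7)-(8) yields the equilibrium price \(P^*=(a+c)/(b+d)\); the sketch below solves for it with illustrative coefficients:

```python
def equilibrium(a: float, b: float, c: float, d: float):
    """Solve a - b*P = -c + d*P for the market-clearing price and quantity."""
    p_star = (a + c) / (b + d)
    q_star = a - b * p_star  # demanded quantity equals supplied quantity here
    return p_star, q_star

print(equilibrium(a=100.0, b=2.0, c=20.0, d=4.0))  # (20.0, 60.0)
```
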
Fig. 3 Demand and supply model

In a recent study by Ma et al. [51], a time and location correlation incentive mechanism was introduced for deep data collection in crowdsourcing networks. They established a metric termed “Quality of Information Satisfaction Degree” (QoISD) to assess the adequacy of collected sensing data. By designing two demand-based incentive mechanisms, they aimed to optimize the QoISD and the associated rewards. Simulations affirmed their method’s efficacy, reducing costs and enhancing QoISD. Sun et al. [69] proposed a dynamic digital twin-based incentive mechanism for resource allocation in aerial-assisted Internet of Vehicles. This two-stage algorithm adeptly handles fluctuating resource supply and demands, ensuring efficient resource scheduling and allocation. Meanwhile, Esfandiari et al. [70] leveraged demand-supply theory to counteract nodes’ selfish behaviors in disruption-tolerant networks, enhancing criteria such as delivery ratio, delay, dropped messages, and overhead ratio (Fig. 3).

3.1.3 Cost model-based incentives

The cost model allows for the determination of the final price of a product by taking into account the total production cost and adding the intended profit margin. When applied to incentive mechanisms, this model provides a means to establish the appropriate reward or price. Let the desired income be represented by \(\eta\), the total cost be \(\varsigma\), and a predefined profit percentage be \(\rho\). The relationship between the cost and income can then be expressed as follows:

$$\begin{aligned} \eta =\varsigma (1+\rho ) \end{aligned}$$
(9)
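
Eq. (9) amounts to straightforward cost-plus pricing, as the one-function sketch below shows with illustrative figures:

```python
def cost_plus_price(total_cost: float, profit_margin: float) -> float:
    """Cost model: desired income eta = total cost * (1 + profit percentage rho)."""
    return total_cost * (1 + profit_margin)

print(cost_plus_price(total_cost=250.0, profit_margin=0.20))  # 300.0
```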

Cost models offer a straightforward and cost-effective approach when compared to other economic-based models. The implementation of a cost model as an incentive mechanism results in efficient computation due to its relative simplicity. However, it’s important to note that cost models have limitations as they tend to overlook elements like competition and replacement costs. They primarily consider internal factors while neglecting external ones, as highlighted in prior research [43, 71].

Cheng et al. [54] identified a challenge in the context of crowdsourcing platforms, particularly when these platforms sent location-based requests to workers. The challenge revolved around optimizing the assignment of workers to tasks. To address this challenge, they devised three effective heuristic methods: the greedy approach, g-divide and conquer, and cost model-based adaptive algorithms. Experimental results demonstrated the efficiency and effectiveness of these methods in maximizing workers’ rewards within a limited budget. Xue et al. [72] applied both public and private cost models for rational miners in a Bitcoin mining pool. They introduced a Budget Feasible Reward Optimization (BFRO) model aimed at maximizing the reward function while adhering to budget constraints. To solve the BFRO problem, they developed a budget-feasible reverse auction mechanism.

3.1.4 Competition model-based incentives

Competition-based models assist organizations in formulating their pricing strategies by taking into account the pricing strategies of their competitors. In contrast to the cost model, competition-based models consider an external factor: competition within the market. Prices or rewards are determined by assessing market information. In these models, participants establish their prices by benchmarking against similar tasks, aiming to align with a leader’s pricing decisions, which are then followed by others.
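
A hedged sketch of this benchmarking logic follows; the median benchmark and the positioning factor are illustrative modeling choices, not drawn from the cited works:

```python
def competition_based_price(competitor_prices: list, positioning: float = 0.97) -> float:
    """Benchmark against the market: follow the prevailing (median) price,
    scaled by a positioning factor (<1 undercuts the leader, >1 signals premium)."""
    ranked = sorted(competitor_prices)
    market_benchmark = ranked[len(ranked) // 2]
    return market_benchmark * positioning

print(competition_based_price([9.5, 10.0, 10.5, 11.0, 12.0]))  # ~10.185
```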

Dong et al. [55] employed a competition-based model to establish QoE-ensured pricing in mobile networks. They combined game theory and the competition model to depict social behavior and understand the relationships among devices, service organizers, and users. Damien et al. [56] highlighted the common implementation of cooperation and competition modes in crowdsourcing platforms. They introduced a hybrid model called “coopetition,” which blends both approaches. Their experiments demonstrated that the hybrid model outperformed the two traditional ones. Ghasemi et al. [73] designed a competition-based pricing strategy for the cloud market environment. Their experimental results showcased a significant increase in profits for providers compared to other pricing policies discussed in previous literature.

Fig. 4 Related incentive mechanisms

3.2 Non-monetary incentives

Non-monetary incentives in previous research can be categorized into two main types: entertainment-based incentives and service-based incentives. One study [77] employed a service-based incentive mechanism in a global crowdsourcing platform named “mClerk,” through which low-income workers could gain new employment opportunities; users of mClerk both sent and received tasks via SMS. Huang et al. [82] highlighted security concerns in traditional cloud data management systems and introduced a secure multi-owner data sharing management scheme named “Mona,” which utilized group signature and dynamic broadcast encryption techniques to enhance security. The majority of research in data management systems has concentrated on structured data and cloud-based systems.

4 Data sharing and exchanging lifecycle

4.1 Data creation

Before data sharing and exchange can occur, it is imperative to ensure high data quality during the data creation step. Data quality plays a critical role in determining the effectiveness and efficiency of data sharing and exchange processes. In recent years, an increasing number of researchers have considered data quality as a significant parameter when designing incentive mechanisms. For example, Yang et al. [83] observed that data quality was often overlooked in the mobile crowdsensing domain. To address this issue, they integrated quality estimation and monetary incentives into their model to support data sharing. Additionally, they employed an unsupervised learning approach to quantify data quality and implemented outlier detection techniques to filter out anomalous data. Similarly, Luo et al. [84] identified limitations in using data mining techniques to control data quality. They introduced a cross-validation approach to identify a validating crowd capable of verifying the contributions made by sensor data providers. Furthermore, they employed weighted oversampling methods and privacy-aware trust algorithms to enhance the services of mobile crowdsensing systems. However, it’s worth noting that many researchers continue to rely on traditional machine learning methods for data quality filtering; a generic illustration of such outlier-based filtering appears in the sketch below.
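
The following minimal sketch uses a robust, median-based outlier test to drop anomalous contributions; the readings and threshold are illustrative, and the approach is a generic stand-in rather than the cited authors' algorithms:

```python
import numpy as np

def filter_outliers(readings: np.ndarray, threshold: float = 3.5) -> np.ndarray:
    """Remove readings whose modified z-score (median/MAD based) exceeds the threshold."""
    median = np.median(readings)
    mad = np.median(np.abs(readings - median))
    if mad == 0:
        return readings  # no spread: nothing can be flagged
    modified_z = 0.6745 * np.abs(readings - median) / mad
    return readings[modified_z <= threshold]

data = np.array([21.0, 20.5, 22.1, 21.3, 95.0])  # one implausible sensor reading
print(filter_outliers(data))  # the 95.0 contribution is filtered out
```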

Data processing and data transformation are crucial steps aimed at converting raw data into meaningful and structured information. When creating or collecting data, it’s essential to establish a standardized data structure for efficient big data management. This involves tasks such as data format conversion, data cleaning, and factor extraction, among others. To facilitate data creation and integration, unified data processing and transformation formats become essential. These unified data integration frameworks can significantly reduce the time spent on data wrangling and help save costs. For instance, Ma et al. [85] introduced a novel graph-based data integration framework built upon a unified conceptual model. They applied this framework to address a real-world refueling problem and demonstrated improved precision and recall results. Given the diversity of data types in the realm of big data, some researchers have developed data integration frameworks tailored for unstructured data. Williams et al. [86] designed an image data integration platform for bioimages sourced from various channels, including high-content screening, multi-dimensional microscopy, and digital pathology. They also established a computational resource for remote access to their system, enabling users to re-analyze the data. Nevertheless, unified data integration frameworks may still face challenges, such as data security concerns and process efficiency optimization.

4.2 Data storage

Within the data storage process, several key components play critical roles: data backups, data replication, data deduplication, and cloud storage. Data backups serve as a crucial means of ensuring data protection and mitigating costs in the event of data loss. Some organizations still employ tape backup as their method of choice for safeguarding against data loss. This involves storing data on magnetic media. However, it’s important to note that tape backups can be vulnerable to corruption. Even when organizations opt for cloud storage or other backup solutions, the possibility of disasters leading to system shutdowns remains a concern. In modern data management practices, a secure approach involves the combination of full backups and partial backups. This strategy enhances data protection and resilience against data loss scenarios. A full backup corresponds to a specific moment in time, involving the capture of a comprehensive system image, which is then stored on a secondary device. In contrast, partial backups encompass differential and incremental methods. However, regardless of the traditional backup strategies implemented, a persistent risk remains: the potential for system corruption [87, 88].
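
The three strategies differ only in which reference point defines “changed”. A simplified sketch, keyed on modification times (real systems track block- or file-level change journals), follows:

```python
from datetime import datetime

def files_to_back_up(files: dict, last_full: datetime, last_backup: datetime, mode: str) -> set:
    """Select files for the next backup run under the three classic strategies."""
    if mode == "full":
        return set(files)                                        # complete point-in-time image
    if mode == "differential":
        return {f for f, m in files.items() if m > last_full}    # changed since the last full backup
    if mode == "incremental":
        return {f for f, m in files.items() if m > last_backup}  # changed since the last backup of any kind
    raise ValueError(f"unknown mode: {mode}")

files = {"a.txt": datetime(2024, 1, 5), "b.txt": datetime(2024, 1, 9)}
print(files_to_back_up(files, datetime(2024, 1, 1), datetime(2024, 1, 7), "incremental"))  # {'b.txt'}
```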

Data replication The key distinction between backups and replication lies in the accessibility of replicas, which are more readily available to production systems.

Data deduplication is a vital data cleaning process in data storage, serving to mitigate data redundancy and optimize storage space utilization. The primary objective of data deduplication algorithms is to enhance the efficiency of databases by eliminating redundancies without compromising data accuracy or integrity. In recent research, there has been a significant focus on developing secure data deduplication mechanisms. For instance, Fan et al. [89] introduced a hybrid data deduplication mechanism tailored for cloud storage systems, addressing security concerns associated with the deduplication process. Their experimental results demonstrated the effectiveness of their approach in resolving security issues within data deduplication. Similarly, Rashid et al. [90] proposed a two-level data deduplication framework designed for cloud storage systems. The framework comprised two tiers: the enterprise level and the cloud storage provider level. At the enterprise level, data deduplication was performed, and the deduplicated data was stored in the cloud. Subsequently, at the cloud storage provider level, duplicate data was systematically removed to optimize storage space while ensuring data security and control. The authors showcased the advantages of their framework in terms of security, control, space efficiency, and reduced storage costs.
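
A minimal sketch of the hash-based core shared by such schemes follows (illustrative only; the cited frameworks add encryption and tiering on top of this idea):

```python
import hashlib

def deduplicate(chunks: list) -> tuple:
    """Content-addressed deduplication: store each unique chunk once and
    keep an ordered list of fingerprints to reconstruct the original stream."""
    store, recipe = {}, []
    for chunk in chunks:
        fingerprint = hashlib.sha256(chunk).hexdigest()
        store.setdefault(fingerprint, chunk)  # a repeated chunk is never stored twice
        recipe.append(fingerprint)
    return store, recipe

store, recipe = deduplicate([b"alpha", b"beta", b"alpha"])
print(len(store), len(recipe))  # 2 unique chunks backing 3 logical references
```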

Cloud storage and edge storage Data storage is an essential method for preserving data, and much research attention has been devoted to develo** incentive mechanisms for this purpose. Conventional data storage relies on established mechanisms for accessing multiple configurable resources. Over the past few decades, numerous researchers have dedicated their efforts to enhancing cloud storage systems through various incentive mechanisms. However, with the advent of 5G technology, the Internet of Things (IoT), and the proliferation of big data, cloud-based data storage has exhibited certain limitations. Cloud computing, in its current form, lacks some crucial functionalities required to cope with the surging volumes of big data effectively. These shortcomings include challenges related to low latency and jitter, ensuring high availability, and scalability. Consequently, several transformative changes are poised to impact our daily lives. Key questions arise, such as “Can services be delivered closer to end-users through distributed computing?” “Can your smartphone serve as your primary data repository?” “Can your vehicle monitor machine health, facilitate software updates, and identify real-time maintenance issues promptly?” “What if smart edge devices could offer deterministic latency and support time-sensitive applications while analyzing real-time and streaming data at the edge?” These questions present formidable challenges for data storage as we design incentive mechanisms for data sharing and exchange.

In response to these challenges, recent studies have focused on the development of edge data storage and processing solutions aimed at addressing the aforementioned questions. Ge et al. [91] investigated the data caching resource allocation problem in a fog radio access network environment. They employed a Stackelberg game to incentivize the data providers to participate in the resource allocation process, and applied a simple search method to solve the optimization problem of data caching resource allocation. Alioua et al. [92] developed an incentive mechanism of edge caching for the Internet of Vehicles. Their incentive mechanism focused on the economic side of caching by considering the competitive cache-enablers market. They employed a Stackelberg game between the data provider and the multiple mobile network operators and found a Nash equilibrium to reduce the caching cost. However, we still have many opportunities to improve data storage in the data sharing and data exchanging process.

4.3 Data access

In the context of incentive mechanisms for data sharing and exchanging, “data access” is a broad concept that encompasses the authorization to access the data. This area comprises several critical data access components, including identity and authentication, access control, encryption, and trust management.

Identity and authentication is a term used to describe the process of granting different parties access to the data. In the past, authentication protocols were primarily designed for single-server environments, which are ill-suited for the new architecture of big data and IoT environments. Around 2015, an increasing amount of sensitive data, such as healthcare records, began to transition into digital formats. Consequently, many researchers began develo** more efficient authentication schemes to safeguard E-healthcare databases. For instance, both Wu et al. and Jiang et al. concentrated on devising three-factor authentication protocols to mitigate various types of attacks [93, 94]. Recognizing the limitations of single-server authentication schemes, some researchers began to explore the creation of inter-cloud identity management systems like OpenID and SAML, which offer Single-Sign-On (SSO) authentication capabilities.

Access control is employed to prevent unauthorized entities from accessing devices and sharing or exchanging data. Historically, the majority of research has been concentrated on designing access control systems for the cloud. However, as edge computing architectures have evolved, there have been relatively few developments in edge access control mechanisms. Yu et al. designed an access control system by leveraging techniques from various encryption schemes, establishing efficient fine-grained data access control [95]. Additionally, they introduced a novel framework for access control within the healthcare domain in a cloud computing environment [96].

Encryption has been a popular research topic for many years. However, traditional encryption methods like the Triple Data Encryption Algorithm (TDEA), also known as the Triple Data Encryption Standard (3DES), have their limitations. They require devices to have prior knowledge of information recipients’ identities and share credentials, which may not be feasible in many data-sharing and exchanging scenarios where recipients are often unknown. To address these challenges, encryption methods tailored for data sharing and exchanging environments have been developed, providing solutions for scenarios where traditional algorithms fall short. For instance, Attribute-Based Encryption (ABE) is one such encryption algorithm that involves a key authority between a data sender and recipient [97]. This approach offers more flexibility and adaptability in complex data-sharing and exchanging systems.

Encryption methods have evolved to address the security needs of various computing environments, including centralized cloud servers and emerging edge paradigms. Researchers have proposed innovative encryption schemes to protect data in these diverse settings. In centralized cloud environments, Wu et al. combined hierarchical identity-based encryption (HIBE) with ciphertext-policy attribute-based encryption (CP-ABE) to create an efficient encryption scheme for sharing confidential data [98]. Li et al. extended this approach to safeguard healthcare data on cloud servers, utilizing attribute-based encryption (ABE) techniques to encrypt patients’ personal health record (PHR) files [99].

With the advent of edge computing paradigms, encryption methods have been adapted to suit these environments. Alrawais et al. introduced an efficient key exchange protocol based on CP-ABE and digital signature techniques in fog computing environments, achieving improved performance in terms of confidentiality, authentication, verifiability, and access control [128, 129]. Yet, even these sophisticated frameworks are not without their constraints. Crucially, as we advance these systems, considerations surrounding data privacy and security cannot be sidelined. Moreover, there’s an evident merit in harnessing deep learning methodologies. By doing so, the framework could potentially auto-adjust its structure, accommodating the multifaceted nature of data.

Augmenting the data management system with capabilities for real-time data processing is another pivotal aspect. The rapid proliferation of big data and the IoT has ushered in an era where numerous applications are tethered to the immediacy of data sharing and exchange. Paradigmatic instances include autonomous vehicles [130], emergency fire response [131], and medical emergency services [132]. Yet, it’s evident that real-time data sharing in these sectors is fraught with limitations. Take, for instance, the unfortunate event of a vehicular accident. Present protocols necessitate a phone call and awaiting police intervention, a process that inadvertently prolongs the accident’s aftermath and could potentially delay critical medical interventions. Hence, it becomes paramount for the data management system to evolve, equipping itself with the agility to seamlessly handle real-time data streams.

6.3 Employ artificial intelligence algorithms to improve data quality

Data quality directly impacts the efficiency and accuracy of data sharing and exchange processes [133]. Leveraging artificial intelligence techniques to discern and eliminate inauthentic and subpar data is an invaluable research trajectory.

While many researchers have hitherto applied conventional machine learning and deep learning algorithms to this domain [134, 135], relying solely on these traditional methods to sieve out low-quality data can be resource-intensive. As a remedy, federated learning emerges as a promising paradigm: by adopting a distributed approach to deep learning, computation resources can be conserved. Additionally, the presence of counterfeit data undermines data quality, diminishing the efficacy of data sharing and exchange; devising mechanisms to detect such spurious data is an avenue worth exploring in future research. Integrating anomaly detection systems into federated learning architectures can significantly bolster the integrity of data quality: utilizing the distributed topology of the network, these systems are adept at pinpointing and segregating questionable data entries, enhancing the robustness of the dataset across the collective nodes. Moving beyond traditional deep learning methodologies, reinforcement learning presents a compelling alternative. It can streamline the model development process by eliminating the extensive requirement for precompiled training and testing datasets, and its algorithms are designed to adapt to evolving data trends autonomously, providing a scalable and adaptive solution for maintaining data quality amidst the complexities of expansive network environments. A minimal sketch of the federated aggregation step is shown below.
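
The sketch implements federated averaging with illustrative local weights and client sizes:

```python
import numpy as np

def federated_average(client_weights: list, client_sizes: list) -> np.ndarray:
    """FedAvg: combine local model weights, weighted by local dataset size,
    so that raw data never leaves the participating nodes."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_weights, client_sizes))

# One aggregation round across three edge nodes holding different amounts of data
local_models = [np.array([0.2, 0.5]), np.array([0.4, 0.3]), np.array([0.1, 0.6])]
print(federated_average(local_models, client_sizes=[100, 300, 600]))  # [0.2 0.5]
```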

6.4 Using distributed data storage techniques to ensure data security and privacy

In contrast to centralized data storage solutions, distributed storage methods have gained increasing prominence. Recently, blockchain has emerged as a widely researched approach for data storage. Blockchain, characterized as a decentralized, digital, secure, and transparent ledger for cryptographic data transactions, has begun to revolutionize numerous sectors [137, 138], including cyber-physical systems [139], education [136], supply chain management [140], and crowdsourcing and crowdsensing [141], among others.

Figure 6 provides an illustrative depiction of how blockchain can enhance the data sharing and exchange paradigm [118]. The figure delineates n entities, represented as \(\mathbb{B} = (\beta_1, \beta_2, \beta_3, \ldots, \beta_n)\). Each of these entities can assume the role of either a data provider or a requester. The datasets, denoted by \(\mathbb{D} = (d_1, d_2, d_3, \ldots, d_m)\), constitute the content intended for sharing and exchange between entities. To safeguard data privacy throughout these operations, various encryption techniques, represented by \(\varepsilon\), are invoked. To spur entities' participation, an incentive algorithm, \(\Gamma\), is integrated. This algorithm may encompass both monetary and non-monetary rewards. Considering the data storage and processing phases, a suite of distributed data storage and processing strategies can be implemented to bolster data privacy and efficiency. We advocate for the incorporation of blockchain and smart contracts as robust mechanisms for secure data storage.
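As a minimal, self-contained illustration of this design, the toy ledger below hash-chains data-sharing transactions between entities \(\beta_i\) over datasets \(d_j\); the record fields, including a reward standing in for the incentive \(\Gamma\), are illustrative assumptions rather than a production smart-contract design.

```python
import hashlib, json, time

# Minimal sketch of a hash-chained ledger recording data-sharing transactions
# between entities (the beta_i) over datasets (the d_j). Field names and the
# reward field are illustrative assumptions.

def block_hash(block: dict) -> str:
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

chain = [{"index": 0, "prev": "0" * 64, "tx": None, "ts": 0.0}]  # genesis block

def record_share(provider: str, requester: str, dataset: str, reward: float) -> None:
    tx = {"provider": provider, "requester": requester,
          "dataset": dataset, "reward": reward}
    block = {"index": len(chain), "prev": block_hash(chain[-1]),
             "tx": tx, "ts": time.time()}
    chain.append(block)

record_share("beta_1", "beta_2", "d_3", reward=5.0)
record_share("beta_2", "beta_4", "d_1", reward=2.5)

# Tamper evidence: each block commits to its predecessor's hash.
ok = all(chain[i]["prev"] == block_hash(chain[i - 1]) for i in range(1, len(chain)))
print("chain valid:", ok)
```

Because each block commits to the hash of its predecessor, altering any recorded transaction invalidates every subsequent block, which is precisely the tamper-evidence property the paradigm relies on.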

Fig. 7 Reinforcement learning in data sharing and exchanging

6.5 Design authentication and encryption mechanisms

Efficient and private authentication schemes [142, 143] have become paramount in the realm of data sharing and exchange. Historically, authentication protocols within the cloud ecosystem were tailored primarily to single-server environments. This model, however, is increasingly incongruent with the emergent architectures of 5G and the IoT [144], which champion distributed service environments. With the introduction of cloud data sharing in 2009 came an uptick in user growth and a pronounced demand for shared services, leading researchers to pursue robust trust and security authentication schemes that seamlessly link cloud users to services. Established frameworks such as the SSL Authentication Protocol (SAP) [145] were often perceived as cumbersome and unintuitive by a vast swath of users.

Recognizing the constraints of single-server authentication systems, several scholars veered towards the development of inter-cloud identity management solutions. This journey saw the emergence of protocols such as OpenID [146] and SAML [147], both championing Single Sign-On (SSO) authentication. Yet these systems inherently hinge on third-party intermediaries, potentially ushering in unforeseen security vulnerabilities. Consequently, crafting efficient, private authentication schemes suited to a distributed service environment remains an ongoing academic challenge. In today's data exchange ecosystems, the vast majority of IoT devices require users to establish personal accounts, often divulging sensitive information, which makes the dual challenge of guaranteeing user anonymity while ensuring efficient authentication evident. Several burgeoning research opportunities have been identified:

  • Synergies: Most of the cutting-edge research delving into efficient, privacy-centric authentication [148, 149] is anchored in the domains of cloud computing and mobile cloud computing. As the locus of future data sharing and exchange is likely to shift towards edge computing, it is imperative to discern potential collaborations between mobile cloud computing and other edge paradigms.

  • Security vs. privacy trade-offs: In the process of devising novel authentication protocols, it becomes essential to strike an equilibrium between security and privacy. For instance, within the paradigm of lightweight authentication [150], the assurance of rigorous user anonymity takes precedence. Furthermore, for devices tethered to batteries, the nexus between energy conservation and security emerges as a captivating area of research; a minimal sketch of this lightweight, anonymity-aware direction follows this list.
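As a sketch of that direction, the example below pairs a per-epoch pseudonym with a single HMAC challenge-response, assuming a key pre-shared out of band; the pseudonym derivation is an illustrative assumption, not an established protocol.

```python
import hmac, hashlib, secrets

# Minimal sketch of a lightweight challenge-response login using a pre-shared
# key and a rotating pseudonym, so the device authenticates without revealing
# a stable identity. The pseudonym scheme is an illustrative assumption.

def new_pseudonym(key: bytes, epoch: int) -> str:
    """Unlinkable per-epoch identifier derived from the shared key."""
    return hmac.new(key, f"pseudonym|{epoch}".encode(), hashlib.sha256).hexdigest()[:16]

def respond(key: bytes, challenge: bytes) -> bytes:
    """Device proves key possession with a single cheap HMAC (battery friendly)."""
    return hmac.new(key, challenge, hashlib.sha256).digest()

# Server and device share a key provisioned out of band.
key = secrets.token_bytes(32)
epoch = 42

challenge = secrets.token_bytes(16)                            # server -> device
pid, tag = new_pseudonym(key, epoch), respond(key, challenge)  # device -> server

expected = respond(key, challenge)                             # server recomputes
print("pseudonym:", pid, "| authenticated:", hmac.compare_digest(tag, expected))
```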

Historically, the bulk of research efforts has been funneled into crafting access control systems [151] that integrate seamlessly with cloud computing. By contrast, research exploring access control mechanisms within the context of edge computing has been sparse. The development and implementation of pragmatic access control algorithms tailored to the edge environment therefore stand out as a promising research trajectory. Given the anticipated surge in edge devices in the near future, a new set of challenges centered on efficient access model identification and the optimization of finite resources comes to the fore, especially for battery-dependent devices.

In the nascent stages of trust management within cloud computing, Service Level Agreements (SLAs) [152] emerged as the foundational technique. However, these were not universally consistent across cloud providers, leading to potential trust issues. Most scholarly work on trust management has historically been rooted in centralized services; around 2016, a modest shift was observed, with more research gravitating toward distributed computing services. Trust management, viewed through the lens of distributed computing within data sharing and exchange, thus emerges as a potential research frontier.

Regarding encryption mechanisms integral to data sharing and exchange processes, it’s evident that a majority of these mechanisms are tailored for cloud-based data sharing and fog data sharing ecosystems. However, the landscape is replete with opportunities to architect more efficient encryption algorithms that dovetail with mobile edge computing and cloud computing paradigms. While a significant chunk of the academic community is engrossed with the CP-ABE algorithms [153], alternative encryption strategies that seamlessly integrate with CP-ABE, such as fully homomorphic encryption (FHE) [154] and ciphertext policy attribute-based proxy re-encryption (CP-ABPRE) [155], also hold promise. As a result, refining encryption mechanisms in the data sharing and exchange space remains a paramount academic pursuit.
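To make the homomorphic direction concrete, the following is a deliberately insecure toy implementation of the Paillier cryptosystem, an additively homomorphic scheme whose ciphertext-level computation FHE generalizes to arbitrary functions; the tiny hardcoded primes are for illustration only and must never be used in practice.

```python
import math, random

# Toy Paillier cryptosystem with small hardcoded primes (insecure; for
# illustration of the additive homomorphism only).
p, q = 293, 433
n = p * q
n2 = n * n
g = n + 1                                       # standard simplification g = n + 1
lam = math.lcm(p - 1, q - 1)                    # lambda = lcm(p-1, q-1)
mu = pow((pow(g, lam, n2) - 1) // n, -1, n)     # mu = L(g^lambda mod n^2)^-1 mod n

def encrypt(m: int) -> int:
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c: int) -> int:
    return ((pow(c, lam, n2) - 1) // n * mu) % n

c1, c2 = encrypt(20), encrypt(22)
assert decrypt((c1 * c2) % n2) == 42            # Enc(a) * Enc(b) decrypts to a + b
print("homomorphic sum recovered:", decrypt((c1 * c2) % n2))
```

The final assertion checks the defining property: multiplying two ciphertexts yields an encryption of the sum of the underlying plaintexts, allowing limited computation on data without ever decrypting it.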

6.6 Use reinforcement learning to solve the un-shared decision problem

In data sharing and exchanging processes, it is commonplace for multiple nodes to collaboratively complete a task. For instance, within the federated learning paradigm [183], distributed nodes jointly train a shared model, yet each node must decide on its own whether to share its local data or updates. Because these decisions are not shared among nodes, each participant must learn a sharing policy from the rewards it observes, a setting to which reinforcement learning is naturally suited (see Fig. 7).
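A minimal sketch of this idea, framed as stateless Q-learning (a two-armed bandit) over the binary share/withhold decision, is given below; the reward, cost, and exploration values are illustrative assumptions, not calibrated to any real platform.

```python
import random

# Minimal sketch: a node learns via stateless Q-learning whether to share
# (action 1) or withhold (action 0) its local updates, given that the
# platform pays an incentive for sharing but sharing incurs a cost.

reward_if_share, cost_of_sharing = 1.0, 0.4   # platform incentive vs. sharing cost
q = [0.0, 0.0]                                # Q-values for withhold / share
alpha, epsilon = 0.1, 0.2                     # learning rate, exploration rate

for step in range(2000):
    if random.random() < epsilon:             # explore occasionally
        action = random.randrange(2)
    else:                                     # otherwise act greedily
        action = 0 if q[0] >= q[1] else 1
    # Environment: sharing yields the incentive minus its cost (plus noise);
    # withholding yields nothing.
    if action == 1:
        r = reward_if_share - cost_of_sharing + random.gauss(0, 0.05)
    else:
        r = 0.0
    q[action] += alpha * (r - q[action])      # one-step update toward the reward

policy = "share" if q[1] > q[0] else "withhold"
print(f"Q(withhold)={q[0]:.3f}  Q(share)={q[1]:.3f}  -> learned policy: {policy}")
```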

7.5 Automotive industry

The advent of data sharing and exchange has been instrumental in advancing the automotive sector, especially in the realm of autonomous vehicle technology. The capability for real-time data exchange is essential, enabling vehicular and infrastructural intercommunications to enhance safety and traffic management. Notwithstanding the advantages, the integrity of Vehicle-to-Everything (V2X) communication systems is of paramount concern. It is imperative to establish robust incentive mechanisms to motivate all parties, including vehicle owners and manufacturers, to contribute data, while concurrently ensuring stringent cybersecurity protocols. Strategic partnerships between automakers and cybersecurity enterprises have culminated in the fortification of V2X systems, with Tesla and BMW emerging as pioneers in this technological integration [184].

Nevertheless, the exchange of data within the automotive industry raises significant privacy concerns. Such data could potentially disclose an individual’s location, daily patterns, and private conversations, especially with the presence of camera recording devices in vehicles. It is, therefore, essential to implement stringent data security protocols that bolster user confidence in sharing their information. Furthermore, sophisticated incentive algorithms can be employed to refine data sharing and exchange systems, ensuring the automotive industry’s advancement without compromising personal privacy.

7.6 Financial services

Data sharing has revolutionized the financial sector, enabling banks and fintech entities to offer bespoke services. For example, Plaid [185] is a platform that links bank accounts to financial apps, streamlining transactions and enhancing user experiences. Similarly, Yodlee [186] offers data aggregation and analytics services, providing insights to both consumers and financial institutions. Nonetheless, the sector is navigating a labyrinth of rigorous data privacy and security regulations. Incentive-driven data-sharing frameworks have the potential to catalyze the secure exchange of data, aligning with regulatory mandates such as GDPR and CCPA. The drive towards open banking, propelled by API technology, has facilitated the creation of novel applications that elevate service delivery, ranging from unified financial interfaces to advanced fraud detection mechanisms [187].

However, breaches of financial records pose considerable risks. Stakeholders are often reticent to share financial data without assurance of robust security measures. Additionally, within financial data-sharing platforms, participants may also be competitors from different institutions, naturally cautious about disclosing customer data without substantial incentives and algorithms to safeguard their clients’ information. Consequently, incentive-based data sharing and exchange platforms are of paramount importance in the financial sector, balancing competitive interests with collaborative imperatives.

8 Conclusion

In this comprehensive survey, we explored various incentive mechanisms and optimization algorithms related to data sharing and exchanging, offering foundational definitions and related concepts. We segmented the lifecycle of data sharing and exchanging into four distinct parts, presenting in-depth insights on associated works within each category. Among the challenges identified in the design of incentive mechanisms, two primary concerns stand out in the majority of incentive-based applications: the challenge of motivating different users, especially competitors, to engage in data sharing and exchanging; and the imperative to protect sensitive user data. Addressing the former, combining both monetary and non-monetary incentives appears to be an effective approach to stimulate user participation in the sharing process. For ensuring data security, the integration of tailored encryption algorithms and the use of distributed techniques, such as blockchain-based storage and federated learning, emerge as sound strategies. In scenarios where data quality is paramount, deep learning presents a potential solution to both identify fake users and anticipate user behavior. In our rapidly evolving digital landscape, the crafting of trustworthy, efficient, and economical incentive mechanisms for data sharing and exchanging holds significant importance across numerous domains.