1 Introduction

In recent years, crowdsourcing has attracted extensive attention from both industry and academia. It helps solve tasks that are intrinsically easier for humans than for computers by leveraging the intelligence of a large group of people [1]. Such tasks are typically computer-hard: they cannot be effectively addressed by existing machine-based approaches, with examples including entity resolution, sentiment analysis, and image recognition [2]. Currently, there are many successful crowdsourcing platforms, such as Upwork, CrowdFlower, and Amazon Mechanical Turk (AMT). On these platforms, requesters publish tasks, which are then accepted and performed by workers. Existing studies classify crowdsourcing tasks along multiple dimensions. Bhatti et al. [3] classify tasks as micro-, complex, creative, and macro-tasks. Microtasks can be performed in a short amount of time by individual workers. Complex tasks require skills, knowledge, and computational effort to solve and can usually be decomposed into smaller sub-tasks. Creative tasks are related to idea generation, creative design, or co-creation. Finally, macro-tasks are non-decomposable tasks that cannot be divided into smaller subproblems; they require expert knowledge and skills and often involve collaboration among workers. In this paper, we focus on microtasks, which are independent and can be completed by a single worker in a short amount of time. These tasks have fine granularity and do not require workers to have specific expertise. There are many examples of microtask crowdsourcing, such as translating text fragments, reading verification codes, rating, and ranking. Microtasks are simple for humans but difficult for computers. In practice, however, workers in microtask crowdsourcing systems are not completely reliable: they may make mistakes or even deliberately submit wrong answers.
To obtain a more reliable answer, each task is usually assigned to more than one worker, each of whom performs the task independently (called a redundancy-based strategy). The answers given by different workers are then aggregated to infer the correct answer (called the truth) of each task. Determining how to effectively infer the truth of each task is a fundamental problem, called Truth Inference, that has been extensively studied in crowdsourcing. Specifically, we focus on truth inference for binary microtasks, that is, tasks with only yes/no choices, which have important application value in crowdsourcing. For example, for a query such as "Do the two videos belong to the same theme?", the expected answer is of the form "yes/no", where yes is denoted by 1 and no by 0.
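The redundancy-based strategy above can be illustrated with a minimal sketch: each binary task is answered independently by several workers, and the answers are aggregated into a single inferred truth. The task and worker data below are purely illustrative, and the aggregation shown is the simplest possible rule (majority), not the method proposed in this paper.

```python
from collections import Counter

# Hypothetical toy data: each binary task is redundantly assigned to
# three workers; 1 = "yes", 0 = "no".
answers = {
    "t1": [1, 1, 0],   # three independent worker answers for task t1
    "t2": [0, 0, 0],
    "t3": [1, 0, 1],
}

def aggregate_by_majority(task_answers):
    """Infer the truth of each task as its most frequent answer."""
    return {t: Counter(a).most_common(1)[0][0] for t, a in task_answers.items()}

print(aggregate_by_majority(answers))  # {'t1': 1, 't2': 0, 't3': 1}
```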

Truth inference is significant for controlling the quality of crowdsourcing and is a crucial issue for crowdsourcing platforms seeking correct answers. The truth inference algorithms proposed in existing works fall into two main categories: direct computation and optimization methods. Direct computation methods estimate task truth directly from worker answers without modeling workers or tasks. Optimization methods first model the worker or task, then define an optimization function to express the relationship between worker responses and task truth, and finally derive an iterative procedure to compute the parameters collectively. In optimization methods, the task truth and other parameters are typically computed iteratively until convergence using the EM algorithm, a classical and effective method for estimating the values of unknown variables. However, two limitations of EM hinder its effectiveness in this scenario: EM-based algorithms are highly dependent on their initialization parameters, and EM-based maximum likelihood estimation yields only locally optimal results, often getting stuck in undesirable local optima [4].
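The EM-style iteration described above can be sketched under the simplest "one-coin" worker model, where each worker is characterized by a single accuracy parameter: the E-step re-estimates the posterior truth of each task from the weighted worker answers, and the M-step re-estimates each worker's accuracy against the current truths. This is a generic illustration of the optimization category, not the specific model used in this paper; the initialization value 0.8 is an arbitrary assumption (and, as noted above, EM's result depends on it).

```python
def em_truth_inference(answers, n_iters=20):
    """EM truth inference under a one-coin worker model.

    answers: dict mapping (task, worker) -> 0/1 label.
    Returns (inferred truths, estimated worker accuracies).
    """
    tasks = {t for t, _ in answers}
    workers = {w for _, w in answers}
    accuracy = {w: 0.8 for w in workers}  # arbitrary init; EM is sensitive to it
    truth = {}
    for _ in range(n_iters):
        # E-step: posterior probability that each task's truth is 1
        for t in tasks:
            score1 = score0 = 1.0
            for (ti, w), a in answers.items():
                if ti != t:
                    continue
                p = accuracy[w]
                score1 *= p if a == 1 else 1 - p
                score0 *= p if a == 0 else 1 - p
            truth[t] = score1 / (score1 + score0)
        # M-step: re-estimate each worker's accuracy against current truths
        for w in workers:
            num = den = 0.0
            for (t, wi), a in answers.items():
                if wi != w:
                    continue
                num += truth[t] if a == 1 else 1 - truth[t]
                den += 1
            accuracy[w] = num / den
    return {t: int(p >= 0.5) for t, p in truth.items()}, accuracy
```

On a small example where one worker consistently disagrees with the others, the iteration downweights that worker and converges to the majority-consistent truths.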

Global optimization is the ideal outcome, but it is difficult to achieve. The most intuitive way to obtain the globally optimal result is to compute the likelihood of every possible mapping from tasks to answers and select the one with the global maximum likelihood, i.e., the most likely true answers. Consider a simple example: with just 50 binary tasks, the total number of task-answer sequences is \(2^{50}\), an exceedingly large number. In the context of large-scale crowdsourcing, the number of required computations grows exponentially with the number of tasks and workers. Such globally optimal quality management techniques are therefore often intractable.
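The exponential blow-up can be made concrete with a brute-force sketch that scores every candidate task-answer sequence under a one-coin worker model (the accuracies and data here are hypothetical). The enumeration is feasible only for a handful of tasks; the loop body runs \(2^{n}\) times.

```python
from itertools import product

def brute_force_best(answers, accuracy, tasks):
    """Score every possible 0/1 assignment and keep the most likely one.

    answers: dict mapping (task, worker) -> 0/1 label.
    accuracy: dict mapping worker -> probability of answering correctly.
    """
    best_seq, best_like = None, -1.0
    for seq in product([0, 1], repeat=len(tasks)):   # 2**n candidate sequences
        truth = dict(zip(tasks, seq))
        like = 1.0
        for (t, w), a in answers.items():
            p = accuracy[w]
            like *= p if a == truth[t] else 1 - p
        if like > best_like:
            best_seq, best_like = truth, like
    return best_seq

# Already at 50 tasks the candidate space is far beyond enumeration:
print(2 ** 50)  # 1125899906842624
```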

In this paper, although global optimization cannot be achieved, we are not satisfied with the locally optimal results derived by traditional optimization methods. We compromise between the inaccessible global optimum and the local optimum by performing further truth discovery on the locally optimal results derived from optimization-based methods, and propose an iterative optimization method to obtain an approximately globally optimal solution. By modeling worker quality, we construct a likelihood function to capture the relationship between worker quality and task truth. We then prune the local optimum, which is derived by iteratively converging the likelihood optimization function under the EM framework, to construct the dominance ordering model (DOM). As a result, we narrow the search scope and reduce the mapping space. Finally, a cut-point neighbor detection algorithm is designed to iteratively search for the response with the maximum likelihood based on our model until convergence [5], approaching the optimal solution without incurring large-scale computation.

To sum up, the main contributions of this paper include the following four points:

  1. We present three worker quality evaluation models for obtaining locally optimal results using maximum likelihood estimation: a worker quality confusion matrix model, a worker quality probability parameter model, and a dynamic worker quality evaluation model;

  2. We construct a pruning-strategy-based dominance ordering model (DOM) from the locally optimal results, which is composed of worker responses and worker categories (i.e., task-response sequences) and reduces the space of potential task-response sequences while retaining the dominant sequences;

  3. We propose a cut-point neighbor detection algorithm on the constructed DOM to find, by iterative search, the task-response sequence with the maximum likelihood within the model;

  4. We perform extensive experiments comparing the results obtained by our algorithm with the locally optimal results obtained by the EM algorithm on a variety of metrics. The experimental results show that our algorithm significantly outperforms EM-based algorithms on both simulated and real-world data.
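Contribution 1 refers to a confusion matrix model of worker quality. As a hedged sketch of the general idea (not the paper's specific formulation), each worker can be described by a 2x2 matrix whose entry `cm[u][v]` is the probability that the worker answers `v` when the truth is `u`; the likelihood of a candidate task-response sequence is then the product of these entries over all answers. The worker name and probabilities below are illustrative only.

```python
# Illustrative 2x2 confusion matrix for a hypothetical worker "w1":
# row = true label, column = answered label.
worker_cm = {
    "w1": [[0.9, 0.1],   # truth 0: P(answer 0), P(answer 1)
           [0.2, 0.8]],  # truth 1: P(answer 0), P(answer 1)
}

def likelihood(answers, truths, cms):
    """Likelihood of a task-response sequence under confusion matrices.

    answers: dict mapping (task, worker) -> 0/1 label.
    truths:  dict mapping task -> candidate 0/1 truth.
    cms:     dict mapping worker -> 2x2 confusion matrix.
    """
    like = 1.0
    for (t, w), a in answers.items():
        like *= cms[w][truths[t]][a]
    return like
```

Note that the one-coin probability parameter model is the special case where both diagonal entries equal a single accuracy p.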

The remainder of this paper is organized as follows. Section 2 discusses related work. Section 3 defines our concepts and illustrates the notation with an example. Section 4 describes static and dynamic estimation models of worker quality. We describe the iterative optimization method in Sect. 5 and present our experimental results in Sect. 6. Finally, we conclude our work in Sect. 7.

2 Related Work

In existing crowdsourcing, it is common to assign the same task to multiple workers and aggregate the answers given by different workers to infer the truth for each task. Since the crowd (the workers) may produce low-quality or even noisy answers, the problem of truth inference has been widely studied to tolerate low-quality workers and infer high-quality results from noisy answers [6].

To solve this problem, a simple and straightforward idea is majority voting (MV), which treats the truth of each task as the answer given by the majority of workers [7,

Fig. 1: Overall procedure of the proposed approach