1 Introduction

Suppose you are at the helm of a small manufacturing company looking for improvements in the product packaging process. The old packaging machines are no longer up to the task, while there are five new top models on the market. The challenge? Choosing the best one, with a large investment at stake—in the order of hundreds of thousands of euros—and a major impact on the company’s operations. To solve this complex decision, four experienced engineers and technicians provide assessments based on their technical expertise and (not always very in-depth) information on the five models; each of them, from his/her own perspective, formulates a preference ranking of the five models (e.g., the third model is preferred to the first, which in turn is preferred to the fourth, and so on). But how can these individual evaluations be combined into a collective decision on the most suitable packaging machine? This is the challenge of the so-called ranking-aggregation problem, in which data and science are used to guide towards a well-informed choice. Specifically, this ancient and widespread problem has three characteristic elements (Spohn 2009; Reich 2010; Saari 2011):

  1. A set of objects to be prioritised according to a subjective attribute, i.e., a feature whose perception may depend on the person perceiving the stimulus, his/her technical knowledge and personal taste.

  2. A set of experts formulating preference rankings of the objects of interest. Experts may be regarded as equally important or arranged in a hierarchy of importance, depending on their competence in the evaluation they are asked to carry out.

  3. A collective judgement concerning the objects, resulting from the aggregation of expert rankings through a suitable aggregation technique. In the scientific literature, depending on the field and historical period, one can also encounter alternative expressions such as “collective/consensus judgement/assessment/evaluation”, which can, however, be considered interchangeable.

Traditional fields in which the problem is very popular are social choice, psychometrics, economics, and multi-criteria decision making (MCDM), with relevant contributions from eminent scientists (e.g., de Borda, Pareto, Samuelson, Arrow, Thurstone, Kendall, etc.) (Spohn 2009; Saari 2011; Arrow 2012; Bana e Costa 2012; Köksalan et al. 2013; Franceschini et al. 2022). For instance, despite differences in terminology and application context, typical MCDM applications share the theoretical and methodological foundations of the ranking-aggregation problem. They begin with a finite number of alternatives (analogous to objects), each represented by its performance across multiple criteria (analogous to experts, though not necessarily flesh-and-blood subjects). The objective is to identify the best alternative(s), akin to achieving a collective judgment (Belton and Stewart 2002; Zeleny 1976).

Due to its great generality, cross-disciplinary nature and multiplicity of potential applications, the ranking-aggregation problem has become of interest to many other scientific disciplines and operational contexts, including manufacturing. Some of the many possible manufacturing applications are as follows:

  • Production management, regarding the selection of the most appropriate production system on the basis of productivity, flexibility or other performance attributes (Chuu 2009; Chatterjee and Chakraborty 2014; Nestic et al. 2019; Hakimi-Asl et al. 2018; Qin et al. 2020);

  • Procurement, in which some managers have to identify the most appropriate suppliers or materials for a certain manufacturing system (Giachetti 1998; Yu and Hou 2016);

  • Conceptual design, regarding the opinions of different designers about alternative design concepts, from the perspective of specific technical features (Franceschini and Maisano 2019);

  • Quality control, regarding the prioritization of defects on manufactured parts, aggregating expert judgments by visual inspection (Franceschini and Maisano 2018b);

  • Reliability engineering, regarding the aggregation of the opinions of maintenance/reliability experts on the criticality of (potential) failures in production equipment (Geramian et al. 2019);

  • Customer-driven design, regarding the opinions of a panel of (service/product) customers on the degree of importance of a set of customer needs (Nahm et al. 2013);

  • Analysis of market demand, regarding the opinions of marketing experts about the most appropriate actions for the promotion of a new product/service (Franceschini and Maisano 2018a).

The analyst’s focus is often directed to the aggregation technique, which can be interpreted as a “black box” transforming input data (i.e., experts’ rankings and importance hierarchy) into output data (i.e., collective judgement) (Franceschini et al. 2022). However, this may lead to overlooking other important methodological aspects that characterise the ranking-aggregation problem, such as preliminary assessment of the degree of concordance among experts, verification of the consistency and robustness of output data, etc.

Aimed at scientists and practitioners in the manufacturing field, this work provides a set of useful tools to tackle the ranking-aggregation problem in a practical and effective manner, addressing the following research question: “How can the ranking-aggregation problem be effectively handled in the manufacturing field and what methodologies and tools can enhance the plausibility and robustness of the solution obtained through the expert-ranking aggregation?”. It is hypothesized that manufacturing scientists are often unfamiliar with the problem of interest, even though they may occasionally have to deal with it. Therefore, the article tries to bridge this knowledge gap by providing a relatively straightforward and effective operational methodology. The innovative aspect of this work lies in the integration, within the proposed methodology, of tools that are individually available in the scientific literature but are combined here in an organic manner. Furthermore, the proposed methodology is flexible, adapting to problems with different characteristics and incorporating different tools interchangeably. It is also iterative, including intermediate verifications that allow for adjustments and corrections while addressing the ranking-aggregation problem.

The remainder of this work is organised in three sections. Section 2 briefly introduces a real-world case study, concerning cobot-assisted manual (dis)assembly, which accompanies the description of the proposed methodology. Section 3, which is the heart of the article, provides a step-by-step description of the assisted operational methodology, based on three phases: (i) problem formulation, (ii) collection of expert rankings, and (iii) collective judgment and validation. Section 4 summarises the original contributions of this work, its implications, limitations and insights for future research.

2 Case study

A company in the automotive industry reconditions different types of electrical components, mainly starter motors and alternators. Although the operations required are mostly manual and specific to each component, they can be divided into the following groups:

  • Disassembly

    • Disassembling any external coverings and shells (to access the internal parts);

    • Removal of electrical connectors and cables;

    • Unfastening bolts, screws, and other fasteners;

    • Separation of any electronic circuits (from the motherboard or main body);

    • Extraction of internal components (sensors, relays, transistors, capacitors, diodes, etc.).

  • Reconditioning

    • Identification of parts to be replaced or repaired;

    • Repairing/replacing these parts;

    • Intermediate testing.

  • Reassembly

    • Mounting internal components (repaired or replaced) in their respective housings;

    • Reconnecting electrical cables and connectors;

    • Fastening with bolts, screws, or other fastening elements;

    • Ensuring that electrical connections are securely fastened;

    • Reassembling external shells and coverings;

    • Testing and diagnostics to verify the proper functioning of the reconditioned unit;

    • Cleaning, polishing, and final marking.

Because of the wide variety of components and the complexity of (dis)assembly and repair operations, the company has been assisting human operators with collaborative robots, or simply cobots (see Fig. 1), which are particularly useful for assisting manual operations that require great precision, dexterity and strength (Gervasi et al. 2022). Cobots are extremely versatile for multiple tasks, such as (i) picking up, clamping and handing over the tools and parts to be machined/assembled, (ii) supporting dimensional inspection, online quality control, etc., and (iii) guiding less experienced operators, like virtual tutors.

Fig. 1 Cobot-supported operator for a manual assembly task

The current market includes a relatively wide range of cobot models that could be adapted to the context of interest. The company management decided to identify the most appropriate cobot on the basis of the programming-practicality attribute, which is crucial in making task preparation faster and easier, while reducing the level of technical skills required of operators (El Zaatari et al. 2019). The following five cobot models were selected from those at the forefront of the market, as they all (i) have a similar payload (around 5–10 kg), (ii) are designed for precision assembly and machining applications, and (iii) are relatively cost-effective:

  • (o1) Techman Robot TM5-700;

  • (o2) ABB GoFa10;

  • (o3) Universal Robots UR10E;

  • (o4) Yaskawa Motoman HC10DTP Classic;

  • (o5) Kinova Link 6.

In order to carry out a comprehensive evaluation, the company set up a panel of eight experts (mostly engineers, technicians and external consultants) from different technical areas and with diverse and complementary skills, a brief description of which follows:

  • (e1) Industrial-automation expert with in-depth skills in industrial process design and optimization;

  • (e2) Electrical engineer with comprehensive knowledge of electrical components and the technical specifications required for their safe assembly and disassembly;

  • (e3) Artificial-vision specialist capable of integrating advanced vision systems onto cobots, for precise recognition and positioning of electrical components;

  • (e4) Ergonomics expert with skills to define ergonomic and intuitive interaction modes with cobots for operators;

  • (e5) Robot-programming specialist with significant experience in both traditional industrial robots and collaborative robots;

  • (e6) Workplace-safety expert with in-depth knowledge of safety regulations and protocols for safe human-machine collaboration;

  • (e7) Maintenance expert with skills for planning and managing preventive and corrective maintenance activities on cobots;

  • (e8) Quality engineer with relevant experience to ensure that the robot-assisted assembly/disassembly process complies with quality standards.

3 Assisted operational methodology

This section illustrates an assisted operational methodology for tackling the ranking-aggregation problem in a practical, comprehensive and critical manner. The flowchart in Fig. 2 summarises the proposed methodology, which is divided into three operational phases illustrated in the corresponding subsections: “problem formulation”, “collection of expert rankings” and “collective judgment and validation”. The multiple feedback loops denote the iterative nature of the proposed procedure, which includes several intermediate verifications, with possible in-progress corrections and adjustments.

Fig. 2 Flow chart summarising the assisted operational methodology for ranking aggregation

3.1 Problem formulation

First, the specific problem and its characteristics should be identified clearly and unambiguously; on this basis, the specific ranking-aggregation problem can be formulated. With reference to the case study, the cobot models are the n = 5 objects (o1–o5, cf. Sect. 2) that will be evaluated in terms of programming practicality, i.e., the attribute of interest. This attribute encompasses a range of desiderata, many of which are related to subjective perceptions, as detailed below:

  • An intuitive and easy-to-learn programming language will reduce development time and programming errors.

  • The user interface of the teach pendant should be intuitive and user-friendly to simplify and speed up the programming and control phase of the cobot.

  • It would be desirable to be able to programme and simulate the behaviour of the cobot even offline, without necessarily being connected to it.

  • The cobot should be integrable with external sensors (such as cameras, force sensors, etc.), so as to be more versatile for complex tasks.

  • The cobot programming should include advanced safety features to avoid accidents and ensure safe collaboration between the cobot and human operators.

  • Some cobots support the use of third-party programming languages, such as Python or C++, making the import of external routines more versatile.

  • Tutorials, documentation and technical support should help make operator learning quicker and easier.

As seen in Sect. 2, the m = 8 experts (e1–e8) are technicians, engineers and external consultants who formulate their individual preference rankings of the cobot models. In general, when selecting experts, (at least) two aspects must be taken into account:

  1. The greater the number of experts formulating their individual rankings, the higher the statistical relevance of the problem output (Friedman 1940; Kendall 1962; Gibbons and Chakraborti 2010). Unfortunately, practical constraints may limit the number of available experts (e.g., they should have a high level of technical expertise). Pragmatically, it would be desirable for m to be no less than 5–6 in order for the results of the study to be relevant (Franceschini et al. 2022).

  2. It may sometimes be appropriate to have a hierarchy of importance of experts, for instance by giving greater weight to those with greater technical expertise. This hierarchy can be constructed in different ways, typically by associating each expert with a weight or defining an importance ranking (Gibbons and Chakraborti 2010; Leo Kumar 2019). In the case study, the technical competences of the experts are notably different and, at the same time, complementary, with no clear superiority of one over the others (cf. Sect. 2). For this reason, all these experts are regarded as equally important. From a practical point of view, this choice simplifies the handling of the problem and broadens the range of applicable aggregation techniques (cf. Sect. 3.3).

Next, the type of expert rankings can be determined depending on several factors, such as the goal of the problem (e.g., identifying the best/worst object(s), drawing up a complete ranking, etc.), the data-collection strategy (e.g., through focus groups, personal telephone/street interviews, online forms, etc.), the literacy level of experts, etc. Complete rankings—i.e., rankings in which experts order all objects by linking them with strict preference (“oi ≻ oj”) or indifference relationships (“oi ~ oj”)—represent a classic scenario, although their formulation requires some effort, especially if the number of objects is large (Lagerspetz 2016). On the other hand, incomplete rankings are more “digestible” for experts, because they can accommodate possible hesitations or doubts; for instance, incomplete rankings may include only a small number of top or bottom objects (e.g., the three most/least preferred), omit an object with which the expert is not familiar, or contain incomparability relationships between objects (“oi || oj”) (Chen et al. 2012). Given the relatively small number of objects, in the present case experts are asked to formulate complete rankings of all five objects. In Sect. 3.2 we will illustrate a way to formulate complete rankings indirectly, through a simplified response mode.

Subsequently, the collective-judgment type must be defined according to the “desirable” properties for the specific problem. There is a wide range of possibilities: rankings, scalings on different scale types (e.g., interval, ratio), clusterings, scorings, or collective judgments designating only the winner/loser object, etc. For the sake of simplicity, in the case study the expected collective judgment is represented by a complete ranking. The analyst must be aware that the choice of input/output data for the problem has implications for both the subsequent formulation of rankings (in Sect. 3.2) and the choice of aggregation technique (in Sect. 3.3).

3.2 Collection of expert rankings

This stage begins with a detailed explanation of the problem to experts, who need to understand exactly which objects are to be evaluated, the attribute against which the evaluation is to be made, and how to formulate individual rankings. In order to make this formulation less laborious, especially when the number of objects being compared is large, experts can formulate ratings of the objects, which can then be converted into a complete ranking (see example in Fig. 3).

Fig. 3 Example of the conversion of judgments (on five objects: o1–o5) from (a) a five-level rating scale to (b) a (complete) ranking (Franceschini et al. 2022). The procedure was applied to expert e1 and can be extended to the other seven experts
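
As a minimal illustration of this rating-to-ranking conversion, the sketch below assumes that higher ratings denote stronger preference; the rating values are hypothetical and not those of Fig. 3.

```python
# Hypothetical ratings by one expert on a five-level scale (5 = most preferred).
# Objects receiving the same rating end up tied (indifferent) in the ranking.
ratings = {"o1": 4, "o2": 2, "o3": 5, "o4": 2, "o5": 4}

def ratings_to_ranking(ratings):
    """Group objects by rating level and order the groups from most to least preferred."""
    tiers = {}
    for obj, r in ratings.items():
        tiers.setdefault(r, []).append(obj)
    # One tier per rating level, best rating first; objects within a tier are tied.
    return [sorted(tiers[r]) for r in sorted(tiers, reverse=True)]

print(ratings_to_ranking(ratings))
# [['o3'], ['o1', 'o5'], ['o2', 'o4']]  i.e., o3 ≻ (o1 ~ o5) ≻ (o2 ~ o4)
```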

Returning to the case study, Fig. 4 reports the resulting (complete) expert rankings, which include relationships of strict preference (“oi ≻ oj”) and indifference (“oi ~ oj”) between objects. At this stage, it must be ensured that the experts’ rankings are formulated consistently with the expected type; if necessary, the formulation must be corrected/revised (see feedback loop from block 2.3 in Fig. 2).

Fig. 4 (a) Complete rankings of n = 5 objects, formulated by m = 8 experts; (b) corresponding rank table. Ti is a correction factor for ties (cf. Eq. 1)

Looking more closely at the rankings of the eight experts in the case study, it should come as no surprise that they sometimes differ from each other, since they are often based on complementary perspectives. For example, experts e4 and e7 seem to express two radically different evaluations of object o1. Probably influenced by their different backgrounds and training, these experts have developed very different perceptions of the cobot model o1, resulting in evaluations in opposite directions.

3.2.1 Concordance among expert rankings

Evaluating the concordance among expert rankings is a preliminary check of the plausibility of input data, which is useful to prevent difficulties such as excessive heterogeneity in the selection of experts, poor understanding of the problem, errors in the formulation of rankings, or other potential obstacles to achieving consensus. The scientific literature includes various statistical indicators, which can be used depending on the problem characteristics (Agresti 2010; Gibbons and Chakraborti 2010; Sato and Tan 2023). Since the present case is characterized by complete expert rankings with equally-important experts, Kendall’s W and Spearman’s ρ can be used (Franceschini et al. 2022).

W, known as the coefficient of concordance, is a multivariate statistic that applies at the level of expert rankings and is related to the dispersion of the ranks associated with each object (Ross 2009; Franceschini et al. 2022). This measure belongs to the range [0, 1], where 1 indicates perfect concordance and 0 indicates independence (Legendre 2010).

Returning to the case study, each ranking can be translated into a set of ranks—that is, a permutation of the integers {1, 2, 3, 4, 5} (or average ranks in the presence of ties)—which are then organized into a so-called rank table, i.e., a matrix of size m × n, with row and column labels designating experts and objects (see Fig. 4b). In the case of tied objects—i.e., pairs of objects with indifference relationships, e.g., “oi ~ oj”—we conventionally use the average ranks that each set of tied objects would occupy if a preference could be expressed (Gibbons and Chakraborti 2010); for example, in a ranking where objects o1 and o3 are tied for 3rd and 4th place (e.g., see the ranking by e6 in Fig. 4a), the average rank of (3 + 4)/2 = 3.5 would be assigned to both.
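
For illustration, the conversion from (tiered) rankings to rows of the rank table can be sketched as follows; rankings are encoded as ordered lists of tiers, with tied objects sharing a tier, and the two example rankings are purely illustrative rather than the full data of Fig. 4a.

```python
# Each ranking is an ordered list of tiers; objects within a tier are tied.
# Example: "o3 ≻ (o1 ~ o5) ≻ o2 ≻ o4"  ->  [["o3"], ["o1", "o5"], ["o2"], ["o4"]]
OBJECTS = ["o1", "o2", "o3", "o4", "o5"]

def to_average_ranks(tiers, objects=OBJECTS):
    """Assign each object the average of the rank positions its tier occupies."""
    ranks, position = {}, 1
    for tier in tiers:
        avg = position + (len(tier) - 1) / 2     # mean of the positions taken by the tier
        for obj in tier:
            ranks[obj] = avg
        position += len(tier)
    return [ranks[o] for o in objects]           # one row of the m x n rank table

# Two illustrative rankings (not the complete data set of the case study)
rank_table = [
    to_average_ranks([["o3"], ["o1", "o5"], ["o2"], ["o4"]]),  # o1, o5 tied at 2nd/3rd -> 2.5
    to_average_ranks([["o2"], ["o5"], ["o1", "o3"], ["o4"]]),  # o1, o3 tied at 3rd/4th -> 3.5
]
print(rank_table)   # [[2.5, 4.0, 1.0, 5.0, 2.5], [3.5, 1.0, 3.5, 5.0, 2.0]]
```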

W is defined as:

$$W = \frac{{\mathop \sum \nolimits_{j = 1}^{n} \left( {R_{j} - \overline{R}} \right)^{2} }}{{\left[ {m^{2} \cdot n \cdot \left( {n^{2} - 1} \right) - m \cdot \mathop \sum \nolimits_{i = 1}^{m} T_{i} } \right]/12}},$$
(1)

where

n the number of objects (i.e., 5 here);

m the number of experts (i.e., 8 here);

Rj the column total related to the j-th column of the rank table;

\(\overline{R}=m\cdot \left(n+1\right)/2\) the average column total (i.e., 24 here);

\({T}_{i}={\sum }_{k=1}^{{g}_{i}}\left({t}_{k}^{3}-{t}_{k}\right)\) a correction factor for ties, in which tk is the number of tied ranks in the k-th group of tied ranks (where a group is a set of objects sharing the same average rank) and gi is the number of groups of ties (ranging from 1 to n) in the set of ranks of expert i. This correction factor ensures that, in the case of perfectly concordant rankings with ties (i.e., when all rankings coincide), W = 1 (or 100%) is obtained (Gibbons and Chakraborti 2010).

With reference to the case study (cf. expert rankings and related object ranks in Fig. 4), W = 22.4% is obtained, denoting a relatively low level of concordance. Not surprisingly, a significance test of the null hypothesis of independence between rankings yields the following test statistic:

$$Q = W \cdot m \cdot \left( {n - 1} \right) = 7.2 < \chi_{n - 1,\alpha }^{2} = 9.49,$$
(2)

where \({\chi }_{n-1,\alpha }^{2}={\chi }_{4,5\%}^{2}\) is the critical value of a chi-square (\({\chi }^{2}\)) distribution with n − 1 degrees of freedom, at a conventional significance level of α = 5%. Equation (2) indicates that the null hypothesis cannot be rejected with a confidence level of 1 − α = 95% (Ross 2009; Gibbons and Chakraborti 2010).
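
For illustration, a minimal computational sketch of Eq. (1) and of the test statistic in Eq. (2) is reported below; it assumes a rank table in which ties are already expressed as average ranks, and the data are hypothetical rather than the case-study rank table of Fig. 4b.

```python
import numpy as np
from scipy.stats import chi2

def kendalls_w(ranks):
    """Kendall's W with tie correction (Eq. 1); ranks is an m x n array
    (rows = experts, columns = objects), with ties expressed as average ranks."""
    ranks = np.asarray(ranks, dtype=float)
    m, n = ranks.shape
    Rj = ranks.sum(axis=0)                      # column totals
    R_bar = m * (n + 1) / 2                     # average column total
    T = 0.0                                     # sum of the tie-correction factors T_i
    for row in ranks:
        _, counts = np.unique(row, return_counts=True)
        T += sum(t**3 - t for t in counts if t > 1)
    return np.sum((Rj - R_bar) ** 2) / ((m**2 * n * (n**2 - 1) - m * T) / 12)

# Hypothetical rank table (m = 4 experts, n = 5 objects), not the case-study data
ranks = [
    [2.5, 4, 1, 5, 2.5],
    [3.5, 1, 3.5, 5, 2],
    [2, 4, 1, 5, 3],
    [1, 3, 2, 5, 4],
]
m, n = len(ranks), len(ranks[0])
W = kendalls_w(ranks)
Q = W * m * (n - 1)                             # test statistic of Eq. (2)
critical = chi2.ppf(0.95, df=n - 1)             # chi-square critical value for alpha = 5%
print(f"W = {W:.3f}, Q = {Q:.2f}, critical value = {critical:.2f}")
```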

To further investigate the reasons for this low inter-expert concordance, the bivariate perspective of Spearman’s correlation coefficient related to each possible pair of rankings (ρ) can be considered. Table 1 contains the ρ coefficients between all the possible \(\left(\begin{array}{c}m\\ 2\end{array}\right)=\frac{m\cdot \left(m-1\right)}{2}=28\) pairs of expert rankings under consideration (Ross 2009).

Table 1 Spearman’s ρ correlation table for the expert rankings in Fig. 4a
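
For illustration, the pairwise ρ coefficients can be computed directly from the rank table, e.g., with scipy’s spearmanr function, which handles ties through average ranks; the rank table below is again hypothetical.

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical m x n rank table (rows = experts, columns = objects)
ranks = np.array([
    [2.5, 4, 1, 5, 2.5],
    [3.5, 1, 3.5, 5, 2],
    [2, 4, 1, 5, 3],
    [1, 3, 2, 5, 4],
])

# spearmanr correlates the columns of its input, so the table is transposed
# to obtain the m x m correlation matrix between experts
rho, _ = spearmanr(ranks.T)
m = ranks.shape[0]
for i in range(m):
    for j in range(i + 1, m):                   # the m*(m-1)/2 pairs of experts
        print(f"rho(e{i + 1}, e{j + 1}) = {rho[i, j]:+.2f}")
```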

Rather pronounced negative correlations (i.e., ρ ≤ −0.4) between certain pairs of expert rankings stand out. Curiously, they often involve the ranking by e5, denoting a sort of “counter-trend” with respect to the other rankings. Upon brief investigation of the reasons for this counter-trend, it turns out that e5 misunderstood the ranking construction, formulating it in the sense of reverse preference; therefore, the correct ranking should be “o3≻(o1 ~ o2 ~ o5)≻o4” instead of “o4≻(o1 ~ o2 ~ o5)≻o3” (see feedback loop from block 2.7 in Fig. 2). After this correction, the new value of W is almost twice as high as the initial one (i.e., W = 40.2% versus 22.4%) and the significance test in Eq. (2) results in \(Q=12.9\ge {\chi }_{n-1,\alpha }^{2}=9.49\), which leads to rejecting the null hypothesis and considering the new level of concordance as statistically significant. Simultaneously, the relatively large negative ρ values for e5 are “reabsorbed” (see Table 2, containing the new ρ values).

Table 2 Spearman’s ρ correlation table for the expert rankings in Fig. 4, after the correction of the ranking by e5 (i.e., “o3≻(o1 ~ o2 ~ o5)≻o4” instead of “o4≻(o1 ~ o2 ~ o5)≻o3”)
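
In terms of ranks, correcting a ranking formulated in the reverse sense of preference simply amounts to mapping each (average) rank r onto n + 1 − r; a minimal sketch for the e5 ranking follows.

```python
# Reversing a ranking expressed as average ranks: r -> n + 1 - r.
# Ranks (o1..o5) of the misunderstood ranking "o4 ≻ (o1 ~ o2 ~ o5) ≻ o3":
n = 5
reversed_ranks = [3, 3, 5, 1, 3]
corrected = [n + 1 - r for r in reversed_ranks]
print(corrected)   # [3, 3, 1, 5, 3], i.e., "o3 ≻ (o1 ~ o2 ~ o5) ≻ o4"
```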

As exemplified, the concordance analysis can be useful in pointing out possible anomalies and “pitfalls” in the formulation of expert rankings (Franceschini et al. 2022).

3.3 Collective judgment and validation

At this point, the ranking-aggregation problem needs to be solved by applying an appropriate aggregation technique and, subsequently, verifying the plausibility of the resulting output.

3.3.1 Ranking aggregation

This is the heart of the ranking-aggregation problem and implies some knowledge of state-of-the-art aggregation techniques. Without any claim to exhaustiveness, Table 3 simply recalls some possible aspects to be taken into account when selecting the aggregation technique (Franceschini et al. 2022).

Table 3 Aspects to consider when selecting the aggregation technique, with reference to a specific ranking-aggregation problem (Franceschini et al. 2022)

For an overview of the aggregation techniques, we refer the reader to relevant surveys and extensive reviews (Figueira et al. 2005; Reich 2010; Herrera-Viedma et al. 2014). For example, Table 4—adapted from (Franceschini et al. 2022)—classifies nine different aggregation techniques according to the aspects listed in Table 3. It can be noted that some techniques are suited to situations with few objects/experts, while others—which can be defined as more “parsimonious” (Kabirifar et al. 2023; Corrente et al. 2024)—are also suitable for situations with a relatively large number of objects/experts. The summary in Table 4 is evidently partial and not intended to be comprehensive. In future research, we aim to provide a more comprehensive overview in this regard. Here we just point out that (i) aggregation techniques are all inherently imperfect (Arrow 2012), (ii) their success depends not only on their efficacy, accuracy, and scientific rigour but also on their simplicity of use (Oukil 2019; Sarwar et al. 2021), and (iii) in general it would be good to avoid "falling in love" with one technique and—when possible—use multiple techniques simultaneously (cf. concept of wisdom of crowds) (Franceschini et al. 2022).

Table 4 Synthetic comparison among nine aggregation techniques illustrated in (Franceschini et al. 2022), according to the aspects in Table 3. The first and fourth techniques will be used for the case study

In line with this consideration, two relatively simple aggregation techniques are applied to the problem of interest (a computational sketch of both is provided after the list):

  • Borda count (BC). For each expert ranking, the first object accumulates one point, the second two points, and so on (Borda 1781; Saari 2011). In case of ties, the average ranks described in Sect. 3.2 can be used. The collective score (BC) of an object can be calculated by cumulating the scores related to each ranking; in this sense, the BC method implements the concept of “average rank position”. BC is used in various contexts, such as engineering design, the “RoboCup” robot soccer competition, the “Eurovision” song contest, etc. (Dym et al. 2002; Franceschini et al. 2022).

  • Best of the best (BoB). For each expert ranking, the most preferred object obtains one point. In case of a tie between leading objects, the point is split equally, dividing it by the number of tied objects (e.g., ½ if there are 2 tied objects, 1/3 if there are 3, and so on). In some contexts, the BoB method is also referred to as “Plurality Voting” or “First Past the Post” (Blais 2008).
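
The sketch below applies both techniques, assuming that the expert rankings have already been converted into an m × n rank table of average ranks; the data are hypothetical and not those of Fig. 5a.

```python
import numpy as np

# Hypothetical m x n rank table (average ranks; rows = experts, columns = objects)
ranks = np.array([
    [2.5, 4, 1, 5, 2.5],
    [3.5, 1, 3.5, 5, 2],
    [2, 4, 1, 5, 3],
    [1, 3, 2, 5, 4],
])
objects = ["o1", "o2", "o3", "o4", "o5"]

# Borda count: cumulate the (average) rank positions; a lower total is better.
# Equal totals would correspond to indifference in the collective ranking.
bc_scores = ranks.sum(axis=0)
bc_order = [objects[i] for i in np.argsort(bc_scores)]

# Best of the best: one point per expert, split equally among tied leading objects
bob_scores = np.zeros(len(objects))
for row in ranks:
    leaders = np.flatnonzero(row == row.min())   # most preferred object(s) of this expert
    bob_scores[leaders] += 1 / len(leaders)
bob_order = [objects[i] for i in np.argsort(-bob_scores)]   # a higher score is better

print("BC totals :", dict(zip(objects, bc_scores)), "->", " > ".join(bc_order))
print("BoB points:", dict(zip(objects, bob_scores)), "->", " > ".join(bob_order))
```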

Figure 5a, b respectively show the results of the application of the BC and BoB techniques to the expert rankings (after the correction of the ranking by e5, cf. Sect. 3.2.1). These two aggregation techniques—which are simple and well suited to complete rankings by equally-important experts—here result in two similar collective rankings (see the bottom of Fig. 5). Both techniques lead to the same “trio” of most suitable cobot models: o3 (Universal Robots UR10E), followed by o5 (Kinova Link 6) and then o1 (Techman Robot TM5-700).

Fig. 5 (a) Expert rankings, (b) scoring/ranking resulting from the application of the Borda count (BC), and (c) scoring/ranking resulting from the application of the Best of the Best (BoB)

3.3.2 Consistency analysis

Every aggregation technique surely provides a result; but how does one know whether it is plausible? Certainly, the rationale of the aggregation technique represents a conceptual guarantee that it is capable of producing reasonable results. However, the aggregation technique that most consistently reflects the expert rankings cannot be identified ex ante, but only ex post and on a case-by-case basis (Chiclana 2002; Arrow 2012; McComb et al. 2017).

Studies have focused on the concept of consistency of the collective judgment with respect to the input data, defined as “the ability of a collective judgment to reflect the rankings of experts, while taking the importance hierarchy into account” (Franceschini et al. 2022). Among the available tools to assess the degree of consistency of the solution to a certain ranking-aggregation problem, p-indicators are very versatile, as they can be adapted to a variety of contexts, such as those in which expert rankings are (i) not necessarily complete, (ii) formulated by equally-important experts, or (iii) characterized by an importance hierarchy of experts (Franceschini et al. 2022). In general, p-indicators can be divided into two families:

  • pj, indicators of local consistency, which are based on the comparison of each j-th expert’s ranking with the collective judgement.

A preliminary operation for determining pj is the construction of a “paired-comparison table”, in which each ranking (i.e., those from the experts and the one deduced from the collective judgment) is transformed into a set of paired-comparison relationships (see symbols “≻” and “~” in Tables 6a, 7a). Next, a “consistency table”—which turns the paired-comparison relationships of each expert into scores, according to the scoring system in Table 5—is constructed; the conventional assignment of 0.5 points in the case of weak consistency is justified by the fact that this is the intermediate case between full consistency (with score 1) and inconsistency (with score 0) (Franceschini et al. 2022). The consistency table also reports the sum of the scores (xj) obtained by each j-th expert ranking. Tables 6b and 7b exemplify two consistency tables related to the case study of interest, for the two aggregation techniques (BC and BoB respectively). Tables 6c and 7c show that both techniques result in collective rankings that are generally consistent with the individual expert rankings. The least consistent expert rankings (i.e., those with lower pj values) appear to be those formulated by e4 and e6, although the difference is small.

Table 5 Scoring system used in the construction of the “consistency table”
Table 6 (a) Paired-comparison table, (b) consistency table, and (c) p-indicators related to the BC technique (which resulted in the collective ranking o3≻o5≻o1≻o2≻o4, cf. Fig. 5b)
Table 7 (a) Paired-comparison table, (b) consistency table, and (c) p-indicators related to the BoB technique (which resulted in the collective ranking o3≻o5≻o1≻(o2 ~ o4), cf. Fig. 5c)

Next, for each j-th expert, the proportion of “consistent” paired comparisons can be calculated as:

$${p}_{j}=\frac{{x}_{j}}{\left(\begin{array}{c}n\\ 2\end{array}\right)}=\frac{{x}_{j}}{10},$$
(3)

where

xj the total score related to the j-th expert;

\(\left(\begin{array}{c}n\\ 2\end{array}\right)=\frac{n!}{2!\cdot \left(n-2\right)!}=\frac{n\cdot \left(n-1\right)}{2}\) the overall number of paired comparisons (i.e., 10 here).

  • p, i.e., the indicator of global consistency. In the case of equally-important experts, the pj values are aggregated through the arithmetic average (Franceschini et al. 2022):

    $$p = \frac{1}{m} \cdot \mathop \sum \limits_{j = 1}^{m} p_{j} ,\quad p \in \left[ {0,1} \right].$$
    (4)

In this particular case, the two aggregation techniques result in two relatively close p-values: i.e., 75.0% for BC and 73.8% for BoB (see Tables 6c and 7c). This confirms that both techniques yield collective rankings that are relatively consistent with the input data (and vice versa), with a slight predominance of BC over BoB. In the case of non-equally-important experts and/or incomplete expert rankings, the formulation of p-indicators is more complex (Franceschini et al. 2022).
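
A minimal sketch of the computation of pj (Eq. 3) is given below. It assumes that the scoring convention of Table 5 assigns 1 point when the expert and collective relations coincide, 0.5 points when one of the two is an indifference and the other a strict preference (weak consistency), and 0 points when the strict preferences are opposite; rankings are expressed as average ranks and the example data are hypothetical.

```python
from itertools import combinations

OBJECTS = ["o1", "o2", "o3", "o4", "o5"]

def relation(ranks, a, b):
    """Return '>', '<' or '~' for the pair (a, b), given a dict of average ranks."""
    if ranks[a] < ranks[b]:
        return ">"
    if ranks[a] > ranks[b]:
        return "<"
    return "~"

def p_local(expert_ranks, collective_ranks, objects=OBJECTS):
    """Local consistency p_j of one expert ranking w.r.t. the collective ranking (Eq. 3)."""
    score = 0.0
    pairs = list(combinations(objects, 2))       # the C(n, 2) paired comparisons
    for a, b in pairs:
        r_e = relation(expert_ranks, a, b)
        r_c = relation(collective_ranks, a, b)
        if r_e == r_c:
            score += 1.0                         # full consistency (assumed convention)
        elif "~" in (r_e, r_c):
            score += 0.5                         # weak consistency (assumed convention)
        # opposite strict preferences: 0 points
    return score / len(pairs)                    # p_j = x_j / C(n, 2)

# Hypothetical example: one expert ranking vs. a collective ranking
expert = {"o1": 2.5, "o2": 4, "o3": 1, "o4": 5, "o5": 2.5}   # o3 ≻ (o1 ~ o5) ≻ o2 ≻ o4
collective = {"o1": 3, "o2": 4, "o3": 1, "o4": 5, "o5": 2}   # o3 ≻ o5 ≻ o1 ≻ o2 ≻ o4
print(f"p_j = {p_local(expert, collective):.2f}")
# The global indicator p (Eq. 4) is the arithmetic mean of the p_j values over the m experts.
```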

Besides the p-indicators, another tool for assessing consistency is \({W}_{k}^{\left(m+1\right)}\), i.e., an indicator inspired by Kendall’s W (cf. Eq. 1), which is nothing more than W itself applied to (m + 1) rankings consisting of: (i) the m expert rankings, and (ii) the collective ranking obtained after the application of a given aggregation model (k) to those expert rankings. Consistency between the collective ranking and the expert rankings is assessed in relative terms, by comparing \({W}_{k}^{\left(m+1\right)}\) with the traditional W. \({W}_{k}^{\left(m+1\right)}\ge W\) denotes consistency (or positive consistency) between the collective ranking and the m rankings, while \({W}_{k}^{\left(m+1\right)}<W\) denotes inconsistency (or negative consistency) (Franceschini and Maisano 2021). The latter situation can occur when a collective ranking somehow conflicts with the m rankings. To make the consistency assessment easier, another synthetic indicator can be used:

$$b_{k}^{\left( m \right)} = \frac{{W_{k}^{{\left( {m + 1} \right)}} }}{{W^{\left( m \right)} }},\quad b_{k}^{(m)} \in ]0, + \infty ].$$
(5)

For a specific set of m rankings, \({b}_{k}^{\left(m\right)} \ge 1\) indicates that the aggregation model (k) provides a somehow consistent collective ranking (positive consistency), while \({b}_{k}^{\left(m\right)}<1\) indicates that it provides a somehow inconsistent collective ranking (negative consistency). Table 8 exemplifies the calculation of indicators \({W}_{k}^{\left(m+1\right)}\) and \({b}_{k}^{\left(m\right)}\) for the case study, considering the BC and BoB aggregation techniques respectively. Positive consistency is observed for both techniques, with a slight predominance of BC over BoB (e.g., consider the \({b}_{k}^{(m)}\) value of 1.13 for BC versus 1.12 for BoB), confirming the result obtained through p-indicators.

Table 8 W, \({W}_{k}^{\left(m+1\right)}\), and \({b}_{k}^{\left(m\right)}\) indicators for the collective rankings resulting from the application of the BC and BoB aggregation techniques to the problem of interest
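
For illustration, both indicators can be obtained by re-applying the W computation of Eq. (1) to the rank table extended with the collective ranking; the sketch below restates the kendalls_w function of Sect. 3.2.1 and uses hypothetical data.

```python
import numpy as np

def kendalls_w(ranks):
    """Kendall's W with tie correction (Eq. 1), as in the sketch of Sect. 3.2.1."""
    ranks = np.asarray(ranks, dtype=float)
    m, n = ranks.shape
    Rj = ranks.sum(axis=0)
    T = sum(sum(t**3 - t for t in np.unique(row, return_counts=True)[1] if t > 1)
            for row in ranks)
    return np.sum((Rj - m * (n + 1) / 2) ** 2) / ((m**2 * n * (n**2 - 1) - m * T) / 12)

# Hypothetical data: m = 4 expert rankings plus one collective ranking (average ranks)
expert_ranks = np.array([
    [2.5, 4, 1, 5, 2.5],
    [3.5, 1, 3.5, 5, 2],
    [2, 4, 1, 5, 3],
    [1, 3, 2, 5, 4],
])
collective_ranks = np.array([[2, 4, 1, 5, 3]])

W_m = kendalls_w(expert_ranks)                                    # W^(m)
W_m1 = kendalls_w(np.vstack([expert_ranks, collective_ranks]))    # W_k^(m+1)
b_k = W_m1 / W_m                                                  # Eq. (5)
print(f"W = {W_m:.3f}, W_k^(m+1) = {W_m1:.3f}, b_k^(m) = {b_k:.2f}")
# b_k >= 1 denotes positive consistency of aggregation model k; b_k < 1, negative consistency
```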

3.3.3 Robustness of the solution

The formulation of rankings is often affected by inherent variability, which can “propagate” to the variability of the output (Saltelli et al. 2006). Only very few aggregation techniques associate the resulting collective judgment with a corresponding estimate of variability (Franceschini and Maisano 2020). In general, it may be useful to perform a sensitivity analysis to assess the robustness of the solution against small variations in the input data (Saltelli et al. 2006). An example of sensitivity analysis follows.

Table 9 contains three sets of expert rankings: (i) the initial one (cf. Fig. 5a) and (ii, iii) two additional ones, obtained by applying small distortions to the initial one. These distortions can be achieved automatically in multiple ways. In the present case, the procedure described in the following four steps was adopted (a code sketch is provided after the list).

  1. Each expert ranking is translated into a scoring corresponding to the average ranks of individual objects. For example, the ranking by e5, i.e., o3≻(o1 ~ o5)≻o2≻o4, is translated into the scores (s) o1 = 2.5, o2 = 4, o3 = 1, o4 = 5, o5 = 2.5 (cf. Fig. 5b).

  2. Next, the score (s) of each object is distorted by adding to it an error (ε) given by a zero-mean random variable, uniformly distributed within the interval [−1, +1], i.e., \(\varepsilon \sim U(-1,+1)\). Translating this into a formula:

$$s^{\prime} = s + \varepsilon ,$$
(6)

where \({s}^{\prime}\) is the resulting distorted score. The above interval of variability is in line with the idea of small (positive or negative) variations of the input rankings.

Table 9 Set of rankings used for sensitivity analysis
  3. Next, the score (\(s^\prime\)) of each object is rounded to the nearest integer, resulting in the new score:

    $$s^{\prime\prime} = round\left( {s^\prime } \right),$$
    (7)

    where round(·) is an operator that rounds a certain score to the nearest integer.

For example, applying the distortion in Eq. 6 to the scores (s) at step one, we get the scoring (\(s^\prime\)): o1 = 2.1, o2 = 4.6, o3 = 1.6, o4 = 4.8, o5 = 2.6; then, applying the rounding in Eq. 7, we get the new scoring (\({s}^{{\prime}{\prime}}\)): o1 = 2, o2 = 5, o3 = 2, o4 = 5, o5 = 3.

  4. Subsequently, the set of \({s}^{{\prime}{\prime}}\) scores is translated into an “additional” ranking, with relationships of strict preference (“≻”) and indifference (“~”), similarly to the transformation from rating to ranking in Fig. 3. Returning to the above example, the \({s}^{{\prime}{\prime}}\) scores at the previous step are transformed into the (additional) ranking (o3 ~ o1)≻o5≻(o2 ~ o4). The procedure was extended to all initial rankings and repeated twice, resulting in the two additional sets of rankings in Table 9(ii), (iii).
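
The four-step distortion can be sketched as follows; the scores are those of the e5 ranking mentioned at step 1, while the random draws (and hence the resulting additional ranking) are of course not those used here.

```python
import numpy as np

rng = np.random.default_rng(0)                  # fixed seed, only to make the sketch reproducible
objects = ["o1", "o2", "o3", "o4", "o5"]
s = np.array([2.5, 4, 1, 5, 2.5])               # step 1: scores of e5 (average ranks)

s_prime = s + rng.uniform(-1, 1, size=s.size)   # step 2: s' = s + eps, eps ~ U(-1, +1)  (Eq. 6)
s_second = np.round(s_prime)                    # step 3: s'' = round(s')                (Eq. 7)

# Step 4: translate s'' back into a tiered ranking (equal scores -> indifference)
tiers = [[o for o, v in zip(objects, s_second) if v == level]
         for level in sorted(set(s_second))]    # lower score = more preferred

print("s''    :", dict(zip(objects, s_second)))
print("ranking:", " ≻ ".join("(" + " ~ ".join(t) + ")" if len(t) > 1 else t[0] for t in tiers))
```

Repeating the procedure for every expert ranking yields an additional set of rankings, to be aggregated as in Sect. 3.3.1.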

For each set (initial and additional), the collective scoring/ranking was determined by applying the BC and BoB aggregation techniques (see results in Table 10). Next, the average dispersion in the rank position of individual objects can be used as a proxy for the robustness of the resulting collective rankings (see Table 11; a minimal sketch of this calculation is given after Table 11). In this specific case, BC provides somewhat more robust results than BoB (i.e., a lower mean standard deviation of 0.44 against 0.60). However, both solutions appear relatively robust (i.e., mean standard deviation lower than 1), therefore no revision of the aggregation techniques seems necessary (cf. feedback loop from block 3.6 in Fig. 2).

Table 10 Rank tables and collective scorings/rankings resulting from sensitivity analysis
Table 11 Results of sensitivity analysis, in terms of mean standard deviation of the objects’ rank positions
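
As a minimal sketch of the robustness proxy, the mean standard deviation of the objects’ rank positions can be computed as follows; the collective rank positions below are hypothetical and not those of Table 10.

```python
import numpy as np

# Hypothetical collective rank positions of the n = 5 objects across the three sets
# (initial + two distorted), for one aggregation technique; rows = sets, columns = objects
collective_ranks = np.array([
    [3, 4, 1, 5, 2],
    [3, 5, 1, 4, 2],
    [2, 4, 1, 5, 3],
])
std_per_object = collective_ranks.std(axis=0, ddof=1)   # sample std of each object's rank position
print("std per object        :", np.round(std_per_object, 2))
print("mean standard deviation:", round(std_per_object.mean(), 2))
```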

4 Conclusion

This paper focused on the ranking-aggregation problem, highlighting its significance due to the variety of potential applications in the field of manufacturing. By adopting a pragmatic approach based on a case study, the paper has elucidated a sequential and iterative operational methodology to address the problem at various levels:

  • Checking the plausibility of expert rankings in terms of concordance, through multivariate and bivariate statistical measures;

  • Guiding the aggregation-technique selection, depending on the desired types of input and output data;

  • Evaluating the consistency and robustness of the resulting collective judgment.

The case study has demonstrated that approaching the problem systematically necessitates multiple iterations and corrections at the aforementioned levels. Notably, the application of the aggregation technique is just one component of the proposed methodology, with various verifications and corrections required prior to the aggregation phase.

This study not only enhances the understanding of the complexity of the ranking-aggregation problem but also provides practical tools to tackle it in a structured and efficient manner. The outcomes hold value for both scientists and practitioners in the manufacturing domain who encounter decision-making challenges related to ranking aggregation. It is worth mentioning that these actors may not have the extensive expertise needed to deal with the ranking-aggregation problem comprehensively; thus, the proposed procedure helps to fill this gap.

The proposed methodology can be considered modular in that it is able to combine several practical tools interchangeably; however, to avoid excessive length, the discussion provided in this paper was limited to exemplifying a few specific tools (e.g., ρ and W as indicators of concordance of experts’ rankings, and p-indicators as measures of consistency between input and output data). Furthermore, the authors acknowledge that the choice of the aggregation technique remains perhaps the most delicate aspect, which was only marginally addressed in this paper. Future research plans include establishing an extensive taxonomy of aggregation techniques and analytical tools to facilitate their selection for specific problems. It is envisaged to create a step-by-step procedure that, based on the problem’s characteristics specified by the user, will guide the selection of appropriate aggregation techniques tailored to the specific case.