This page describes the general theory behind the voting system for A' Design Award and Competitions.
Design Competition Voting System - How to derive real criteria weights for a design competition using the preference ordering votes
Abstract: In order to have intertemporally comparable results for design competitions, we should focus on criteria-based voting with a weight assigned to each criterion. However, until now, predetermined weights have been used, and these predetermined weights do not rest on solid grounds; they were selected by simple reasoning. On the other hand, by reverse-engineering preference orderings, we could derive the real criteria weights that jurors use when reflecting in action during the voting process of a design competition. To do so, we run a real design competition where the jury is asked to vote twice: first by ordering the designs by preference, and second by criteria-based voting. We aim to gather the following information: 1. What are the real criteria weights that jury members use when they rank designs in a preference order? 2. What other criteria should be considered when voting for designs? 3. How can we use the real criteria weights to improve the voting processes of design competitions? Finally, we would like to run a survey to collect further information about the fundamental rules that govern the voting mechanism in a design competition. This article explains how we could design an established, fair and well-founded voting system for a design competition.
Before we can develop such a system, we should see what actually happens in a real-life case. To do so, we will discuss a hypothetical design competition with 4 entries; the number of submissions is set to 4 for ease of demonstration.
Because there are only 4 submissions, the ranking board should look like the following:
First place (1st) is the obvious winner, while second (2nd), third (3rd) and fourth (4th) places represent other possible outcomes, such as awarded, mention, runner-up or not awarded.
Assume that we have 4 different submissions or designs (D1, D2, D3, D4) to be voted on. The number of possible orderings is 4! = 4 x 3 x 2 x 1 = 24. The general rule is n! (n factorial): in mathematics, the factorial of a positive integer n, denoted by n!, is the product of all positive integers less than or equal to n.
Assume that we have 4 different jury members (J1, J2, J3, J4), and also assume that each jury member has a different preference order from the others. If P is the Preference Order Mentality of a jury member, we can state the following: P(Jx) ≠ P(Jy) for all x ≠ y; in other words, P(J1), P(J2), P(J3) and P(J4) are all distinct.
Now that we have both the submissions and the jury, we could run a jury session to determine the winners. In most cases, entries are ranked so that there is a winner, a second place, a third place, a fourth place and so on. But this is not always the case; in some design competitions only the winners are ranked, and the rest are discarded.
The most common way to order entries in a design competition is collective preference ordering, where all entries are ranked together by up or down votes, with many jury members acting at once through discussion, physical displacement and reordering of designs, and constant dialogue. In this way many entries are ranked within a short amount of time. This is a hive-mind process where individual jury members lose their distinct personalities and act as a community, forming the Community Jury (J-C). This is an efficient way to rank designs, but it comes with its own issues, especially:
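The count of 24 can be checked by simply enumerating the orderings. The following is a minimal sketch; the variable names are illustrative and not part of the voting system itself.

```python
from itertools import permutations
from math import factorial

designs = ["D1", "D2", "D3", "D4"]

# Every possible complete ranking of the four designs, best first.
orderings = list(permutations(designs))

print(len(orderings))           # 24
print(factorial(len(designs)))  # 24
print(orderings[0])             # ('D1', 'D2', 'D3', 'D4')
```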
Issues with Collective Preference Ordering (Community Jury)
We can improve this voting system for design competitions by instead asking jury members to vote independently of each other. This is actually a fairly common way of ranking that is also used in international sports competitions. But what we suggest at this step differs: instead of directly giving scores to each design, let's stay focused on the ordering, so that we can understand how the scores are given.
At this step, each jury member orders the design submissions on their own by casting preference votes. (In some competitions, jurors have a specific number of "votes" to distribute among the designs.)
How do we define the winner in this case? If we consider all the jury members as equals, we could use an established strategy: the Multiple Winner Borda Count. We start by assigning a score to each rank; let's say that 1st is 4 points, 2nd is 3 points, 3rd is 2 points and 4th is 1 point. Then, for each of the submissions, we sum these points.
We see that for this voting session, the jury has selected D3 as the winner. But we still have some issues to solve: the jury members are in fact not equals when we consider their skill sets; they have different backgrounds that give them better judgment on different aspects of a design. For instance, jury members coming from industry are more likely to know whether a design is easy to manufacture, and jury members coming from the academic sphere are more likely to know whether a design has been done before.
How can we reflect the different skill sets of jury members in the voting? The answer is simple in principle: we can weight the votes of different jury members for each criterion or checkpoint. Criteria or checkpoints are the specific aspects that we consider when evaluating a design project; for example, the ergonomics of a design could be one criterion, and ease of manufacturing or innovative use of material could be others. Although the answer is simple, its application requires further study.
So the question now becomes: what criteria can we use when evaluating projects? For each product category the criteria should be different; for a graphic design, for example, we cannot talk about ergonomics. We could have many classifications; for instance, just for industrial design we might have 32 categories if we used the Locarno Classification. However, a better approach could be to rely, instead of on categories, on criteria that are present in any type of design.
Given these criteria, we could vote on any type of design project. We will have a total score for each design using the following formula:
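As a rough illustration, a plain Borda count over independent ballots can be sketched as below, with the rank points running 4, 3, 2, 1. The ballots are hypothetical and chosen only for demonstration (each juror has a different ordering); they are not the votes of any real session, and the function name is an assumption.

```python
# Hypothetical ballots: each juror lists the designs from most to least preferred.
ballots = {
    "J1": ["D3", "D1", "D2", "D4"],
    "J2": ["D1", "D3", "D4", "D2"],
    "J3": ["D3", "D2", "D1", "D4"],
    "J4": ["D2", "D3", "D1", "D4"],
}

def borda_scores(ballots):
    """Sum rank points per design: with N designs, rank r earns N + 1 - r points."""
    scores = {}
    for ranking in ballots.values():
        n = len(ranking)
        for position, design in enumerate(ranking, start=1):
            scores[design] = scores.get(design, 0) + (n + 1 - position)
    return scores

scores = borda_scores(ballots)
# Highest point total first.
ranking = sorted(scores, key=scores.get, reverse=True)

print(scores["D1"], scores["D2"], scores["D3"], scores["D4"])  # 11 10 14 5
print(ranking)  # ['D3', 'D1', 'D2', 'D4']
```

With these illustrative ballots, D3 collects the most points even though not every juror placed it first.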
Total Score (TS) = A x Wa + B x Wb + C x Wc + D x Wd + E x We + F x Wf + G x Wg + H x Wh + I x Wi + Z x Wz
The following is an example vote from a single jury member, where all the criteria have equal weights.
Now let's demonstrate the weights as well: the Criteria Score (CS) is calculated as the criterion point times the weight of the criterion (Y x Wy).
If we were to repeat this for each jury member, we would then have a table such as the following:
Ranking is made in descending order of score: the design with the highest total score (TS) becomes the 1st. But now we have another important question: what are the weights of these criteria? How can we calculate a correct, real score if we do not know the weights? In the above example, we gave equal weights to each criterion.
This is intriguing, because normally the weights for criteria are given at the beginning of a design competition, predetermined by some experienced jury members, consultants, or the organizer. However, the truth is that these predetermined criteria weights rarely reflect the true preferences of jury members. The ranking by preference ordering is therefore usually different from the ranking by criteria voting. The aim is to find the correct weights for the criteria such that both systems give very similar results; the ranking by preference ordering should be similar to the ranking by criteria voting.
There is indeed a way to find these weights by reverse-engineering the preferences. To do so, we need the jury members to evaluate each design twice, using both methods. Afterwards, we can run a regression analysis to find out what the weights are. The weights could be computed analytically, or found by brute force, trying millions of possibilities within seconds using a computer algorithm. We aim for the preference ordering rank of each design to be similar to its criteria voting rank.
The analysis could be done for each jury member, to see their personal preferences, or globally, to understand the community preferences of all jury members. We need to consider one more thing: we cannot let the total number of submissions greatly affect the preference score. The standard formula for the preference score is Preference Score (PS) = Total Submissions (N) + 1 - Preference Order of Design. Instead, we can use a modified formula that yields a normalized score even if the number of submissions varies. To do so, we come up with the following formula:
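The Total Score formula can be sketched in a few lines. The criterion letters follow the formula above; the points are hypothetical, and the 0 to 10 scale per criterion is an assumption made here so that ten equally weighted criteria give a maximum of 100.

```python
CRITERIA = ["A", "B", "C", "D", "E", "F", "G", "H", "I", "Z"]

def total_score(points, weights):
    """TS = sum over criteria of (criterion point x criterion weight)."""
    return sum(points[c] * weights[c] for c in CRITERIA)

# Hypothetical points from a single juror for one design (assumed 0-10 scale).
points = {"A": 8, "B": 6, "C": 7, "D": 9, "E": 5,
          "F": 6, "G": 7, "H": 8, "I": 4, "Z": 10}

equal_weights = {c: 1.0 for c in CRITERIA}
# A hypothetical unequal weighting: criterion A counts double, Z counts half.
custom_weights = dict(equal_weights, A=2.0, Z=0.5)

print(total_score(points, equal_weights))   # 70.0
print(total_score(points, custom_weights))  # 73.0
```

The same points produce a different total as soon as the weights change, which is why the choice of weights matters for the final ranking.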
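The brute-force route can be sketched as a small grid search. This is only an illustration under several assumptions: it uses three criteria (A, B, C) instead of the full set to keep the search tiny, the points and the preference order are hypothetical, weights are tried in steps of one tenth, and the objective (counting pairs of designs ranked in the wrong order) is one possible choice rather than a prescribed one.

```python
from itertools import combinations, product

# Hypothetical points given by one juror on criteria A, B, C.
points = {
    "D1": [9, 3, 5],
    "D2": [4, 9, 6],
    "D3": [7, 6, 8],
    "D4": [6, 5, 4],
}
# The same juror's preference ordering, most preferred first.
preference_order = ["D3", "D1", "D2", "D4"]

def total_score(design, weights):
    return sum(p * w for p, w in zip(points[design], weights))

def disagreements(weights):
    """Count design pairs whose criteria-based order contradicts the preference order."""
    count = 0
    for better, worse in combinations(preference_order, 2):
        if total_score(better, weights) <= total_score(worse, weights):
            count += 1
    return count

# Weights are expressed in tenths (integers 0..10) and must sum to 10, i.e. to 1.0.
candidates = [w for w in product(range(11), repeat=3) if sum(w) == 10]
consistent = [w for w in candidates if disagreements(w) == 0]

print(len(candidates))  # 66
print(len(consistent))  # 9
print(consistent[0])    # (2, 0, 8)
```

Even in this tiny example several weight vectors reproduce the juror's ordering exactly, so a single ballot narrows the weights down to a region rather than to one answer.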
Modified Preference Score (MPS) = PS / Max(PS) * Max(TS).
Above, if the maximum total score is 100, then we have normalized, modified preference ordering scores for each of the designs.
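A minimal sketch of both formulas follows, assuming a maximum total score of 100 as in the sentence above; the function names are illustrative. Max(PS) is taken to be N, the score of the top-ranked design.

```python
def preference_score(order, n_submissions):
    """PS = N + 1 - preference order (1 = most preferred)."""
    return n_submissions + 1 - order

def modified_preference_score(order, n_submissions, max_total_score=100):
    """MPS = PS / Max(PS) * Max(TS), where Max(PS) = N."""
    ps = preference_score(order, n_submissions)
    max_ps = preference_score(1, n_submissions)
    return ps / max_ps * max_total_score

scores_4 = [modified_preference_score(order, 4) for order in range(1, 5)]
scores_10 = [modified_preference_score(order, 10) for order in range(1, 11)]

print(scores_4)  # [100.0, 75.0, 50.0, 25.0]
```

Whether there are 4 or 10 submissions, the top-ranked design receives the maximum total score and the others are scaled proportionally, so scores from competitions of different sizes share one scale.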
Now we try to find the best weights such that total scores from criteria voting would be equal or similar to the modified preference ordering scores by preference voting.
Of course the above equation is unsolvable as it stands: there are not enough submissions to determine the weights. With 10 unknown weights in the Total Score formula, we need at least 10 submissions for the equation to be solvable. Furthermore, with only the minimum number of submissions the numbers would still not make sense, as the modified preference ordering score is itself somewhat biased. Instead, with a statistically significant number of votes, our aim is to find consistency in the following way:
where S1 > S2 > S3 > S4, and we could use the modified preference ordering score as a reference. Given all the above information, our test is as follows:
The jury votes twice: 1. by Preference Ordering, and 2. by Criteria Voting. We then try to find consistent criteria weights that provide results similar to the preference orderings.
You might ask why we try to match the criteria votes to the preference ordering values. The reason is to have intertemporally comparable results in the end: if we ran only preference ordering at every competition, the implicit criteria weights would differ each time, and the results of different runs could not be compared.
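One way to picture this test is to score a candidate set of weights on two things: whether the resulting total scores keep the juror's order (S1 > S2 > S3 > S4), and how far they sit from the modified preference scores. The sketch below is an illustration only; the data, the three-criteria reduction, the 0 to 10 point scale, and the mean absolute gap measure are all assumptions.

```python
# Hypothetical data: one juror, four designs, three criteria scored 0-10.
points = {
    "D1": [9, 3, 5],
    "D2": [4, 9, 6],
    "D3": [7, 6, 8],
    "D4": [6, 5, 4],
}
preference_order = ["D3", "D1", "D2", "D4"]  # most preferred first
MAX_TS = 100  # assumed maximum total score

def mps(design):
    """Modified preference score: PS / Max(PS) * Max(TS)."""
    n = len(preference_order)
    ps = n + 1 - (preference_order.index(design) + 1)
    return ps / n * MAX_TS

def total_score(design, weights):
    return sum(p * w for p, w in zip(points[design], weights))

def evaluate(weights):
    """Return (order is consistent, mean absolute gap between TS and MPS)."""
    scores = [total_score(d, weights) for d in preference_order]
    consistent = all(a > b for a, b in zip(scores, scores[1:]))
    gap = sum(abs(s - mps(d)) for s, d in zip(scores, preference_order))
    return consistent, gap / len(preference_order)

# Two hypothetical weightings; each sums to 10 so that the maximum TS is 100.
print(evaluate((4, 1, 5)))  # (True, 16.5)
print(evaluate((3, 3, 4)))  # (False, 21.25)
```

The first weighting keeps the juror's order while still differing from the reference scores, which shows why the ordering check and the score gap are worth looking at separately.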