By Jorge Campos
When annotators label the same data, adjudication is the process to resolve inconsistencies among the different versions and to promote a final version to the gold standard (master).
In a multi-user annotation environment, adjudication is usually a manual and lengthy process, especially in NLP: it involves comparing each version and resolving the conflicts one by one.
tagtog now supports automatic adjudication to accelerate this process. Today we show when and how to use it.
If you want to know more about the different types of methods for adjudication you can read this post: The adjudication process in collaborative annotation
Requirements
There are two major constraints you need to be aware of before automating this process:
Overlap in the annotated data
There must be overlap in the annotated data: a portion of the data items must be annotated by more than one annotator, so that their annotations are comparable.
In tagtog, you choose the degree of overlap when you distribute the dataset among users. For instance, you can require that three different users annotate each document.
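tagtog handles this distribution for you when you configure the project; as a rough illustration of the idea, here is a minimal round-robin sketch (the `distribute` helper and its behavior are assumptions for illustration, not tagtog's actual algorithm):

```python
from itertools import cycle

def distribute(doc_ids, members, overlap=3):
    """Assign each document to `overlap` distinct members, round-robin,
    so that every document is annotated by more than one person."""
    assert overlap <= len(members), "need at least `overlap` members"
    picker = cycle(members)
    plan = {}
    for doc in doc_ids:
        assignees = []
        while len(assignees) < overlap:
            member = next(picker)
            if member not in assignees:  # keep the assignees distinct
                assignees.append(member)
        plan[doc] = assignees
    return plan

docs = ["doc1", "doc2", "doc3"]
members = ["jorge", "member01", "member02", "member03"]
print(distribute(docs, members))
```

Cycling through the member list keeps the workload roughly balanced while guaranteeing the required overlap per document.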
Clear and well-structured guidelines
If there is room for ambiguity, the annotators' judgments might differ in identical scenarios, lowering the inter-annotator agreement (IAA).
Create and maintain your guidelines at tagtog. You can add pictures and format your rules using Markdown.
It all boils down to the complexity of your schema and domain. Before starting the production work, it is usually required to go over several annotation sessions to refine the guidelines.
We calculate the IAA automatically while users annotate. After each session, you can monitor progress and decide when to start the bulk of your work, which usually means reaching high IAA metrics first.
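To make the idea of agreement concrete, here is one common way to score a pair of annotators: F1 over exact entity spans. This is a minimal sketch of the concept, not necessarily the exact formula tagtog uses:

```python
def pairwise_f1(ann_a, ann_b):
    """F1 agreement between two annotators' entity sets.

    ann_a, ann_b: sets of (start, end, entity_type) tuples
    over the same document.
    """
    if not ann_a and not ann_b:
        return 1.0  # both produced no annotations: trivial agreement
    overlap = len(ann_a & ann_b)  # spans matching exactly in offsets and type
    precision = overlap / len(ann_a) if ann_a else 0.0
    recall = overlap / len(ann_b) if ann_b else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

a = {(0, 10, "SoftSkill"), (15, 30, "TechnicalSkill")}
b = {(0, 10, "SoftSkill"), (40, 55, "TechnicalSkill")}
print(pairwise_f1(a, b))  # 0.5: they agree on one of two spans each
```

Exact-span matching is strict; partial-overlap variants are also common, but the intuition is the same.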
Automatic adjudication
In an environment that meets the conditions described above, it makes sense to move towards automation. Interacting with annotators is still recommended, and manual resolution of inconsistencies is always more accurate; however, that process hardly scales.
We would like to introduce automatic adjudication based on IAA. For each annotation task, it chooses the annotations from the best available annotator.
The merging strategy picks, for each annotation task, the annotations of the user with the highest IAA for that task across all the documents.
A sample project
Let's create a new project and edit the settings.
Entities
Create two entity types (Settings > Entities):
- SoftSkill: annotate soft skills in job offers.
- TechnicalSkill: annotate technical skills in job offers.
Guidelines
In certain cases, the annotation tasks we created might confuse the team. Let's be more specific in the project guidelines (Settings > Guidelines).
Members
We add some collaborators to help us tag.
Work distribution
In this example, 3 different annotators will annotate each document. Let's set up the distribution settings.
Now we are ready to import some text.
Documents
For this sample project, I have added the requirements of 6 job offers. As tagtog supports Markdown, I have imported formatted text, so the content is more engaging for the annotation process.
Each time a document is imported, it is assigned automatically to 3 members.
When users enter the project, they are automatically redirected to the TODO filter. This filter only shows the documents in their queue.
Once all users have annotated and confirmed their versions of the assigned documents, we have enough information to calculate the agreement among annotators (IAA). The platform crunches the numbers for us.
You can always check the IAA at the Metrics section in your project.
The matrices in Fig. 6 show the agreement between each possible pair of annotators for each annotation task. For example, member03 and jorge agree in 84.41% of the cases for the annotation task technicalSkill, whereas they agree in 74.27% of the cases for the annotation task softSkill.
These numbers reveal:
- jorge is the annotator with the highest IAA average for the annotation task technicalSkill
- member03 is the annotator with the highest IAA average for the annotation task softSkill
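The selection step above can be sketched as follows: average each annotator's pairwise IAA scores per task and take the top scorer. The `best_annotator_per_task` helper and the input shape are assumptions for illustration; tagtog computes this internally:

```python
def best_annotator_per_task(pairwise_iaa):
    """Pick the annotator with the highest average pairwise IAA per task.

    pairwise_iaa: {task: {(annotator_a, annotator_b): score}}
    Returns {task: best annotator for that task}.
    """
    best = {}
    for task, pairs in pairwise_iaa.items():
        totals, counts = {}, {}
        for (a, b), score in pairs.items():
            for member in (a, b):  # a pair's score counts for both members
                totals[member] = totals.get(member, 0.0) + score
                counts[member] = counts.get(member, 0) + 1
        averages = {m: totals[m] / counts[m] for m in totals}
        best[task] = max(averages, key=averages.get)
    return best

# Scores for jorge/member03 taken from the example above; the
# member02 pairs are made-up numbers to complete the matrices.
iaa = {
    "technicalSkill": {("jorge", "member03"): 0.8441,
                       ("jorge", "member02"): 0.90,
                       ("member02", "member03"): 0.70},
    "softSkill": {("jorge", "member03"): 0.7427,
                  ("jorge", "member02"): 0.60,
                  ("member02", "member03"): 0.80},
}
print(best_annotator_per_task(iaa))
# {'technicalSkill': 'jorge', 'softSkill': 'member03'}
```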
Automatic adjudication
If the quality metrics meet the requirements, an admin can start the adjudication process. tagtog does the dirty work.
The adjudication process doesn't choose one specific user's version. Instead, it integrates the best annotations for each annotation task, and the resulting annotations are promoted to the master version.
What has just happened? tagtog integrated the technicalSkill annotations from jorge (top annotator for the annotation task) and the softSkill annotations from member03 (top annotator for the annotation task).
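The merge that just happened can be sketched like this. `adjudicate` is a hypothetical helper for illustration, not tagtog's API; it builds the master version of one document by taking each task's annotations from that task's top annotator:

```python
def adjudicate(doc_annotations, best_per_task):
    """Merge one document's annotations into a master version.

    doc_annotations: {annotator: [(start, end, task), ...]}
    best_per_task:   {task: top annotator for that task}
    """
    master = []
    for task, annotator in best_per_task.items():
        # keep only this annotator's annotations for the task they won
        master.extend(ann for ann in doc_annotations.get(annotator, [])
                      if ann[2] == task)
    return sorted(master)

annotations = {
    "jorge":    [(5, 20, "technicalSkill"), (30, 42, "softSkill")],
    "member03": [(5, 18, "technicalSkill"), (30, 44, "softSkill")],
}
best = {"technicalSkill": "jorge", "softSkill": "member03"}
print(adjudicate(annotations, best))
# [(5, 20, 'technicalSkill'), (30, 44, 'softSkill')]
```

Note that the master can mix annotators: jorge's technicalSkill spans sit next to member03's softSkill spans in the same document.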
Now we can either leave this version as final or assign it for review.
👏👏👏 if you like the post, and want to share it with others! 💚🧡