Overlapping text annotations

When it comes to text annotation, sometimes you need to annotate entities or fragments of text contained within others or simply overlapping others.

One of our goals at tagtog is to allow users to quickly train machine learning models. Often your training data is not great in quality or quantity. However, you can still get quick results with acceptable accuracy, and in some cases that is enough to solve your problem. From there you can iterate.

Overlapping annotations increase flexibility and allow you to make the most of your data.

Let’s take a closer look at this type of annotation.

For example, take Toyota Corolla, where Toyota is a Company and Toyota Corolla is a Vehicle Model.

One annotation containing another. Each entity type is represented by one color.
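One common way to store such annotations is as character offsets plus an entity type. The sketch below is only an illustration under that assumption, not tagtog's actual data format; the `Annotation` class and the `contains` helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    label: str   # entity type, e.g. "Company"


def contains(outer: Annotation, inner: Annotation) -> bool:
    """Return True if `inner` lies entirely within `outer`."""
    return outer.start <= inner.start and inner.end <= outer.end


text = "Toyota Corolla"
company = Annotation(0, len("Toyota"), "Company")
model = Annotation(0, len(text), "Vehicle Model")

# The Vehicle Model annotation contains the Company annotation.
print(text[company.start:company.end], "is inside",
      text[model.start:model.end], ":", contains(model, company))
```

Because each annotation carries its own offsets, nothing prevents one from sitting inside another.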

The user still needs to read the text, so the point is to visualize the annotations without disturbing the reading. We did that without increasing the line height or the spacing between words.

That was easy! Let’s move on to the next example.

Three entities annotated; two of the annotations overlap.

Visually, as in the previous case, we didn’t break the text structure, which keeps the text comfortable to read. The annotations are:

  • Reddit as a Company
  • Reddit closed $50 million in funding as Investment
  • $50 million in funding at a $500 million valuation as Valuation
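To make the difference between containment and partial overlap concrete, here is a minimal sketch. The sentence is reconstructed from the fragments listed above and is an assumption, as are the `(start, end, label)` tuple layout and the helper names; none of this is tagtog's API.

```python
# Sentence reconstructed from the annotated fragments (an assumption).
text = "Reddit closed $50 million in funding at a $500 million valuation"


def span(fragment, label):
    """Build a (start, end, label) tuple from the first match of `fragment`."""
    start = text.index(fragment)
    return (start, start + len(fragment), label)


company = span("Reddit", "Company")
investment = span("Reddit closed $50 million in funding", "Investment")
valuation = span("$50 million in funding at a $500 million valuation", "Valuation")


def overlaps(a, b):
    """True if the two spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def contains(outer, inner):
    """True if `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def partially_overlaps(a, b):
    """True if the spans overlap but neither contains the other."""
    return overlaps(a, b) and not contains(a, b) and not contains(b, a)


print(partially_overlaps(investment, valuation))  # Investment and Valuation cross
print(contains(investment, company))              # Company sits inside Investment
print(overlaps(company, valuation))               # Company and Valuation are disjoint
```

Investment and Valuation share the fragment "$50 million in funding" while each also covers text the other does not, which is the case a strictly nested annotation scheme cannot express.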

Most text annotation tools out there do not support such annotations, which can make the process of creating an annotated corpus stricter, slower and more expensive.

You can also handle other scenarios, such as several annotations on the exact same text span. This is convenient when a span represents more than one concept. For example:

Sample of customer feedback. Two annotations (first in pink, second in yellow) within the same span represent two entities.

In this case the span brake adjuster represents a Vehicle Part and a Failing Part.
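If annotations are stored as independent `(start, end, label)` records, two labels on the same span need no special handling. The sketch below groups labels by span; the feedback sentence and the `labels_by_span` helper are hypothetical and only illustrate the idea.

```python
from collections import defaultdict

# Hypothetical customer feedback sentence, used only for illustration.
text = "The brake adjuster failed after two months."

fragment = "brake adjuster"
start = text.index(fragment)
end = start + len(fragment)

# Two annotations that cover exactly the same characters.
annotations = [
    (start, end, "Vehicle Part"),
    (start, end, "Failing Part"),
]


def labels_by_span(annotations):
    """Group entity labels by their (start, end) span."""
    grouped = defaultdict(list)
    for s, e, label in annotations:
        grouped[(s, e)].append(label)
    return dict(grouped)


for (s, e), labels in labels_by_span(annotations).items():
    print(text[s:e], "->", labels)
```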

Summing up:

Using overlapping or contained annotations can help you focus on the value of your data and reduce your dependence on rigorous guidelines, hierarchies and other painful steps that are often not required to get results.

And YES, it is possible to display these annotations and make the annotation journey a pleasant and efficient one.

With this text annotation tool you can generate training data at scale. You can use it for free at tagtog.net. Send us your feedback!

The text annotation tool to train #AI. Easy. 🔗tagtog.net
