By Jorge Campos
With spaCy v3 just released, we wanted to write a short piece on how to integrate tagtog and spaCy.
tagtog is a multi-user text annotation tool designed to build high-quality data efficiently. spaCy is an open-source library for advanced Natural Language Processing (NLP) in Python.
This example uses spaCy to automatically generate NER (Named-Entity Recognition) annotations and display these annotations directly in tagtog.
First, we create a project in tagtog and define a few entity types in the project settings. Bear in mind that these types should map to those used by your model. We will use spaCy's en_core_web_sm model for English to extract entities representing people, organizations, and money. We review the model's label scheme and find these labels: PERSON, ORG, and MONEY. In the web application, we create matching entity types.
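To make the mapping between the model's labels and the tagtog entity types explicit, here is a minimal sketch. The IDs `e_1`, `e_2`, and `e_3` are placeholders (an assumption for illustration); copy the real entity type IDs from your own project settings. The helper names `model_ner_labels` and `unmapped_labels` are ours, not part of spaCy or tagtog.

```python
# Hypothetical tagtog entity type IDs: replace them with the IDs
# shown in your own project settings.
SPACY_TO_TAGTOG = {
    "PERSON": "e_1",
    "ORG": "e_2",
    "MONEY": "e_3",
}


def model_ner_labels(model_name="en_core_web_sm"):
    """Return the labels the model's NER component can predict.

    spaCy is imported lazily, so the mapping check below can run
    without the model installed.
    """
    import spacy

    nlp = spacy.load(model_name)
    return set(nlp.get_pipe("ner").labels)


def unmapped_labels(mapping, model_labels):
    """Return the mapped labels that the model does not know about."""
    return sorted(set(mapping) - set(model_labels))


# Usage, assuming spaCy and en_core_web_sm are installed:
#   missing = unmapped_labels(SPACY_TO_TAGTOG, model_ner_labels())
#   if missing:
#       print("Labels not in the model scheme:", missing)
```

Entities with labels outside this mapping (for example GPE or DATE) can simply be skipped when building the annotations.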
Second, we want to upload text annotated by this model. To do that, we first transform the annotations produced by the spaCy model into annotations tagtog can digest. Below you can find a Python code snippet that does the following:
- Given a sample text, it forwards it to the en_core_web_sm model.
- It transforms the model response into annotations.
- It pushes the text and annotations (a pre-annotated document) to tagtog using the tagtog API.
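The transformation step in the list above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the field names follow our reading of tagtog's ann.json format, `s1v1` is assumed to be the part ID of a single plain-text document, and the class IDs are placeholders. Please check all of them against the tagtog documentation and your project. The function names `to_annjson` and `spacy_entities` are our own.

```python
def to_annjson(entities, class_ids, pipeline="en_core_web_sm"):
    """Build a tagtog-style ann.json dict.

    `entities` is a list of (text, start_char, label) tuples and
    `class_ids` maps model labels to tagtog entity type IDs.
    """
    annotations = []
    for text, start_char, label in entities:
        class_id = class_ids.get(label)
        if class_id is None:
            continue  # label has no entity type in the project, skip it
        annotations.append({
            "classId": class_id,
            "part": "s1v1",  # assumed part ID for a plain-text document
            "offsets": [{"start": start_char, "text": text}],
            "confidence": {
                "state": "pre-added",
                "who": ["ml:" + pipeline],
                "prob": 1,
            },
            "fields": {},
            "normalizations": {},
        })
    return {
        "anncomplete": False,  # a human still has to review the document
        "sources": [],
        "metas": {},
        "entities": annotations,
        "relations": [],
    }


def spacy_entities(text, model_name="en_core_web_sm"):
    """Run the spaCy model and return (text, start_char, label) tuples."""
    import spacy  # imported lazily: only needed when calling the model

    nlp = spacy.load(model_name)
    return [(ent.text, ent.start_char, ent.label_) for ent in nlp(text).ents]
```

Keeping the transformation independent of spaCy objects makes it easy to reuse with your own model: anything that yields text, character offset, and label will do.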
The sample text:
Paypal Holdings Inc (PYPL) President and CEO Daniel Schulman Sold $2.7 million of Shares
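To push the pre-annotated document, a sketch along these lines should work. Everything specific here is an assumption to verify against the current tagtog API documentation: the endpoint URL, the `output` and `format` parameter values, and the convention of sending the text and its annotations as two files that share a base name. `build_request` and `push_document` are hypothetical helper names, and the credentials and project names in the usage comment are placeholders.

```python
import json

# Assumed endpoint of the tagtog documents API; check the current docs.
TAGTOG_API_URL = "https://www.tagtog.net/-api/documents/v1"


def build_request(text, annjson, owner, project, doc_name="doc1"):
    """Prepare the query parameters and the two files for the upload."""
    params = {
        "owner": owner,
        "project": project,
        "output": "null",
        "format": "default-plus-annjson",  # plain text plus its ann.json
    }
    files = [
        (doc_name + ".txt", text),
        (doc_name + ".ann.json", json.dumps(annjson)),
    ]
    return params, files


def push_document(text, annjson, username, password, owner, project):
    """Upload the text and its annotations using HTTP basic auth."""
    import requests  # imported lazily: only needed for the upload itself

    params, files = build_request(text, annjson, owner, project)
    response = requests.post(
        TAGTOG_API_URL,
        params=params,
        auth=(username, password),
        files=files,
    )
    response.raise_for_status()
    return response


# Usage, with placeholder credentials and an ann.json dict built beforehand:
#   text = ("Paypal Holdings Inc (PYPL) President and CEO "
#           "Daniel Schulman Sold $2.7 million of Shares")
#   push_document(text, annjson, "my_user", "my_password",
#                 owner="my_user", project="my_project")
```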
🪄 Now you can find the annotated text in your project.
Please reach out with any questions!
Feel free to sign up at tagtog and try this sample with your own model. It's free.
👏👏👏 if you like the post, and want to share it with others! 💚🧡