Software & Data:

   For our text annotation efforts, we use the tagtog annotation tool, which lets us enrich various types of texts (including PDFs) with annotation layers. Tagtog provides a user interface designed to speed up the annotation process, along with functionality to inspect, compare and merge annotations. We warmly recommend checking out their tool!

   skweak is a weak supervision toolkit for NLP built around a simple idea: instead of annotating texts by hand, we define a set of labelling functions that automatically label our documents, and then aggregate their results to obtain a labelled version of our corpus. skweak can be applied to both sequence labelling and text classification, and comes with a complete API that makes it possible to create, apply and aggregate labelling functions with just a few lines of code.
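
   To make the idea of labelling functions and aggregation concrete, here is a minimal sketch in plain Python. It deliberately does not use skweak's own API: the function names, the label names and the majority-vote aggregation are all assumptions chosen for illustration, and majority voting is only a simple stand-in for a real aggregation model.

```python
from collections import Counter
from typing import Callable, List, Optional

# Hypothetical label names for a toy text classification task.
POSITIVE, NEGATIVE = "POS", "NEG"

# A labelling function maps a text to a label, or to None when it abstains.
LabellingFunction = Callable[[str], Optional[str]]


def lf_positive_words(text: str) -> Optional[str]:
    """Vote POS if the text contains a word from a small positive lexicon."""
    words = set(text.lower().split())
    return POSITIVE if words & {"great", "excellent", "love"} else None


def lf_negative_words(text: str) -> Optional[str]:
    """Vote NEG if the text contains a word from a small negative lexicon."""
    words = set(text.lower().split())
    return NEGATIVE if words & {"terrible", "awful", "hate"} else None


def lf_exclamation(text: str) -> Optional[str]:
    """Noisy heuristic: a trailing exclamation mark suggests enthusiasm."""
    return POSITIVE if text.strip().endswith("!") else None


def aggregate(text: str, lfs: List[LabellingFunction]) -> Optional[str]:
    """Apply every labelling function and return the majority label.

    Returns None when all functions abstain. On a tie, the label that
    was voted first wins.
    """
    votes = [label for lf in lfs if (label := lf(text)) is not None]
    if not votes:
        return None
    return Counter(votes).most_common(1)[0][0]


LFS: List[LabellingFunction] = [lf_positive_words, lf_negative_words, lf_exclamation]

corpus = [
    "I love this tool!",
    "The interface is terrible",
    "It arrived on Tuesday",
]

# Label the whole corpus without any manual annotation.
labelled = [(text, aggregate(text, LFS)) for text in corpus]
for text, label in labelled:
    print(f"{label}\t{text}")
```

   Each labelling function is allowed to be noisy or to abstain; the aggregation step is what turns their combined votes into a single label per document. Texts on which every function abstains stay unlabelled (None here).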