Corpus Create Annotation Schema By BioTML Tagger

Select Option

To perform a new NER (Named Entity Recognition) annotation based on a machine learning model created with the BioTML plugin, start by loading a Corpus containing the documents you wish to annotate to the Clipboard.
Select the Corpus, right-click it, and choose Corpus -> NER -> BioTML Tagger -> Annotate with Model.


Corpus Process NER BioTML Tagger.png


Load BioTML Model File For Annotation

A window appears in which you can select a BioTML NER model file. Select a compatible BioTML model file (either a .zip or a .gz file) and press Next. The BioTML model will then start loading.
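
If you are unsure whether a file is really a .zip or .gz archive before selecting it, the container type can be checked from its leading bytes. The following is a minimal sketch, not part of @Note2 or BioTML: the function name `detect_container` is hypothetical, and the check only identifies the archive format, not whether the contents are a valid BioTML model.

```python
import gzip
import io
import zipfile

# Standard leading signatures of the two archive formats.
ZIP_MAGIC = b"PK\x03\x04"   # local file header of a non-empty zip archive
GZIP_MAGIC = b"\x1f\x8b"    # gzip member header


def detect_container(data: bytes) -> str:
    """Return 'zip', 'gzip', or 'unknown' based on the leading bytes.

    Hypothetical helper for illustration; it does not validate the model.
    """
    if data.startswith(ZIP_MAGIC):
        return "zip"
    if data.startswith(GZIP_MAGIC):
        return "gzip"
    return "unknown"


# Build two tiny in-memory archives with placeholder content and classify them.
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "w") as archive:
    archive.writestr("model.bin", b"placeholder")

print(detect_container(zip_buffer.getvalue()))
print(detect_container(gzip.compress(b"placeholder")))
```

For a file on disk, reading the first four bytes with `open(path, "rb").read(4)` is enough input for this check.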


Load BioTML Tagger.png


Select BioTML Model Configurations To Annotate

The selected model contains the NER classes to annotate and the model properties, which can be selected in the search table.

The NER BioTML Tagger allows you to define options for annotations, as follows:

In the NLP System Tokenizer option, you can select the natural language processing system used for tokenization of all documents to annotate.

In the Number of running Threads option, you can select the number of processing threads that the NER BioTML Tagger algorithm will be able to use.

In the Normalization option, you can select the @Note2 normalization system, which adds white space between word delimiters to increase entity recognition accuracy. Note that this option changes the original offsets of the text.
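
To see why normalization changes offsets, consider a minimal sketch that pads delimiters with spaces and keeps a map back to the original character positions. This is an illustration only: the delimiter set and the function names `normalize_with_offsets` and `to_original_span` are assumptions, and the actual @Note2 normalization system may behave differently.

```python
import re

# Assumed delimiter set, for illustration only.
DELIMITERS = re.compile(r"[()\[\]{},;:/\-]")


def normalize_with_offsets(text):
    """Pad each delimiter with spaces.

    Returns the normalized text and a list that gives, for every character
    of the normalized text, the index of the original character it came
    from (None for inserted spaces).
    """
    out = []
    origin = []
    for index, char in enumerate(text):
        if DELIMITERS.fullmatch(char):
            out.extend([" ", char, " "])
            origin.extend([None, index, None])
        else:
            out.append(char)
            origin.append(index)
    return "".join(out), origin


def to_original_span(origin, start, end):
    """Map a [start, end) span in the normalized text to original offsets."""
    indices = [o for o in origin[start:end] if o is not None]
    if not indices:
        raise ValueError("span covers only inserted white space")
    return indices[0], indices[-1] + 1


text = "IL-2(human)"
normalized, origin = normalize_with_offsets(text)
start = normalized.find("human")
print(repr(normalized))
print((start, start + 5), "->", to_original_span(origin, start, start + 5))
```

In this example, "human" starts at offset 5 in the original text but at offset 9 after normalization, so annotation offsets produced on normalized text do not line up with the original document unless they are mapped back.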


Select Classes And Settings for NER Annotation By BioTML Tagger.png


The configuration is complete when you press Ok. You can cancel the operation at any time by clicking the corresponding button.

Performing the operation

The NER BioTML Tagger operation will now start, and a progress window will appear showing the execution status. The operation can take from a few minutes to several hours, depending on the number of documents, the document size, the number of threads assigned, the number of classes in the model, and the features that the BioTML model uses automatically.
When the process ends, a new NER Process object is added to the Clipboard and can be viewed through the Corpus Process View.
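
The effect of the thread setting on run time can be pictured with a minimal sketch in which documents are handed out to a fixed pool of worker threads. This is not the BioTML implementation: `tag_document` is a hypothetical stand-in that only counts tokens, and `tag_corpus` is an assumed name used to show how work is distributed.

```python
from concurrent.futures import ThreadPoolExecutor


def tag_document(document):
    """Hypothetical stand-in for tagging one document: counts its tokens."""
    return len(document.split())


def tag_corpus(documents, n_threads):
    """Process documents with at most n_threads workers.

    Results are returned in the same order as the input documents.
    """
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(tag_document, documents))


documents = ["IL-2 binds the IL-2 receptor", "p53 regulates apoptosis", ""]
print(tag_corpus(documents, n_threads=2))
```

With more threads, more documents are in progress at the same time, which is why the total duration depends on both the corpus size and the number of threads chosen in the configuration step.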