Corpus Create Annotation Schema By NER Lexical Resources

From Anote2Wiki
Revision as of 16:30, 9 January 2012 by Anote2Wiki


When one or more Corpus objects are available in the clipboard, named entity recognition (NER) can be run on them. Named Entity Recognition with Lexical Resources is a native Corpus operation, reached by right-clicking the Corpus in the Clipboard:

Corpus -> NER -> Lexical Resources

[Screenshot: Corpus Process NER ANote.png]


A wizard is presented for configuring the NER process. The first step is to select the Publication Set on which NER will be performed. Once the desired Publication Set is selected, the user presses the Next button.



In the next step, a dictionary must be selected for the NER. A new dictionary can also be imported at this point (how to import dictionaries will be described later in this document). After the dictionary has been chosen, the list of its possible classes is presented. The user selects the classes to annotate by moving them from the left list to the right list.
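
To make the role of the dictionary and the selected classes concrete, here is a minimal Python sketch of dictionary-based NER in general. It is not @Note's implementation: the `Annotation` type, the `annotate` function, and the sample terms and class names are all hypothetical, and the matching strategy (case-insensitive, longest term first) is an assumption of this sketch.

```python
import re
from typing import NamedTuple


class Annotation(NamedTuple):
    start: int          # character offset where the match begins
    end: int            # character offset just past the match
    text: str           # matched text as it appears in the document
    entity_class: str   # class assigned by the dictionary


def annotate(text, dictionary, selected_classes):
    """Tag dictionary terms in `text`, keeping only the selected classes.

    `dictionary` maps each term to its class (hypothetical format).
    """
    terms = {term.lower(): cls for term, cls in dictionary.items()
             if cls in selected_classes}
    if not terms:
        return []
    # Try longer terms first so "glucose oxidase" wins over "glucose".
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in ordered) + r")\b",
        re.IGNORECASE,
    )
    return [Annotation(m.start(), m.end(), m.group(0), terms[m.group(0).lower()])
            for m in pattern.finditer(text)]


# Hypothetical dictionary and class selection, for illustration only.
sample_dictionary = {
    "glucose": "Compound",
    "glucose oxidase": "Enzyme",
    "E. coli": "Organism",
}
sample_text = "Glucose oxidase from E. coli converts glucose."
for annotation in annotate(sample_text, sample_dictionary, {"Enzyme", "Compound"}):
    print(annotation)
```

Because "Organism" is left out of the selected classes here, terms of that class are ignored, which mirrors leaving a class in the left list of the wizard.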



In the last step, a set of complementary classes is presented, from which the user can choose those to be annotated. These classes are defined by manually compiled lists of terms. The available options are:

  • Biology-related Verbs;
  • Laboratory Techniques;
  • Physiological States;
  • Predefined Expert Hand Rules.


In the same window, the user chooses whether to annotate abstracts or full texts.
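
The choices collected across the three wizard steps can be pictured as a single configuration record. The Python sketch below is only an illustration of that idea: `NerConfig`, `ComplementaryClass`, `TextScope`, the field names, and the default scope are hypothetical and do not correspond to any documented @Note type. The check that at least one class is selected is likewise an assumption of this sketch.

```python
from dataclasses import dataclass
from enum import Enum


class ComplementaryClass(Enum):
    # The four options listed in the last wizard step.
    BIOLOGY_RELATED_VERBS = "Biology-related Verbs"
    LABORATORY_TECHNIQUES = "Laboratory Techniques"
    PHYSIOLOGICAL_STATES = "Physiological States"
    PREDEFINED_EXPERT_HAND_RULES = "Predefined Expert Hand Rules"


class TextScope(Enum):
    ABSTRACTS = "abstracts"
    FULL_TEXTS = "full texts"


@dataclass(frozen=True)
class NerConfig:
    publication_set: str                   # step 1: Publication Set
    dictionary: str                        # step 2: chosen dictionary
    classes: frozenset                     # step 2: classes moved to the right list
    complementary: frozenset = frozenset() # step 3: complementary classes
    scope: TextScope = TextScope.ABSTRACTS # step 3: default is arbitrary here

    def __post_init__(self):
        # Assumption of this sketch, not documented tool behaviour.
        if not self.classes and not self.complementary:
            raise ValueError("select at least one class to annotate")


# Hypothetical names, for illustration only.
config = NerConfig(
    publication_set="My Publication Set",
    dictionary="My Dictionary",
    classes=frozenset({"Enzyme", "Compound"}),
    complementary=frozenset({ComplementaryClass.LABORATORY_TECHNIQUES}),
    scope=TextScope.FULL_TEXTS,
)
print(config)
```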


After all the configuration has been done, the Execute button (gear icon) must be pressed. The NER operation then starts, and a small window appears to indicate that it is running. The NER operation may take a few minutes.



When the process is finished, a new Ner Box List object is added to the clipboard. This object contains a list of items of the datatype ANoteNerBox, each being the result of one NER operation. The Ner Box List exists because NER can be run with different configurations (e.g. distinct dictionaries), and each configuration yields a distinct NerBox.
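
The relationship between the Ner Box List and its boxes can be sketched as a simple container that gains one entry per NER run. The Python below is only a rough model under that assumption: the `NerBox` and `NerBoxList` classes, their fields, and the sample data are hypothetical and are not the actual ANoteNerBox datatype.

```python
from dataclasses import dataclass, field


@dataclass
class NerBox:
    configuration: str                      # e.g. which dictionary was used
    annotations: list = field(default_factory=list)


@dataclass
class NerBoxList:
    boxes: list = field(default_factory=list)

    def add_result(self, configuration, annotations):
        """Store the outcome of one NER run as its own box."""
        box = NerBox(configuration, list(annotations))
        self.boxes.append(box)
        return box


# Two runs with distinct (hypothetical) dictionaries give two boxes.
ner_box_list = NerBoxList()
ner_box_list.add_result("dictionary A", [("glucose", "Compound")])
ner_box_list.add_result("dictionary B", [("E. coli", "Organism")])

for box in ner_box_list.boxes:
    print(box.configuration, len(box.annotations))
```

Keeping one box per configuration means the results of different dictionaries stay separate and can be compared side by side.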