Corpus Create Annotation Schema By NER Lexical Resources

From Anote2Wiki
Revision as of 18:33, 4 December 2012 by Hcosta (talk | contribs)

Select Option

The user can run a new NER (Named Entity Recognition) process based on Lexical Resources on a Corpus loaded to the Clipboard. The process can also be based on previous settings (an NER process already performed, with the same resources and the same options).

After selecting the Corpus, the user must right-click it and select Corpus -> NER -> Lexical Resources.

Corpus Process NER ANote.png

New Configuration or Load Configuration

A wizard will be presented. The first panel offers two options: create a new process (New Configuration) or load the configuration from a process that has already been performed (Load Configuration). The user must select New Configuration and press the Next button.

NER ANote Wizard1.png

Resources Selection

In the next panel, the user must select the lexical resources. Dictionaries, lookup tables, rule sets and ontologies can be added to the NER process. Each resource type has its own tab; by switching tabs, the user can select resources of different types. After selecting the lexical resources, the user must press the Next button.

NER ANote Wizard1a.png
NER ANote Wizard1b.png

Select Class (For each resource)

In the next panel, the user can filter by class for each lexical resource previously selected.

NER ANote Wizard2.png
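The effect of the class filter can be pictured with a minimal sketch. It assumes a lexical resource is simply a list of (term, class) pairs; the sample terms, the class names and the `filter_by_class` helper are all hypothetical and do not reflect the tool's internal data model.

```python
# Hypothetical in-memory stand-in for a lexical resource: (term, class) pairs.
dictionary = [
    ("glucose", "Compound"),
    ("hexokinase", "Enzyme"),
    ("Escherichia coli", "Organism"),
]


def filter_by_class(resource, selected_classes):
    """Keep only the terms whose class was ticked for this resource."""
    return [(term, cls) for term, cls in resource if cls in selected_classes]


# Only terms of the selected classes would take part in the NER process.
filtered = filter_by_class(dictionary, {"Compound", "Enzyme"})
print(filtered)
```

Under these assumptions, terms of the unselected class ("Organism") are simply left out of the matching step.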

Pre-Processing

Stop Words

Next, the Stop Words panel appears. Here the user can select a list of stop words (Lexical Words Set - Lexical Resources) to be used by the NER algorithm. Stop words are important because they prevent the algorithm from annotating common English words as entities (removing false positive annotations). After selecting the options, press the Next button.

NER ANote Wizard3.png
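The role of stop words can be illustrated with a minimal sketch of dictionary-based matching. This is not the tool's actual algorithm: the `annotate` function, the sample text and the assumption that the dictionary happens to contain the common word "can" are all hypothetical.

```python
import re


def annotate(text, terms, stop_words=frozenset()):
    """Return sorted (start, end, matched_text) tuples for dictionary terms
    found in text, skipping any term that is in the stop word list."""
    annotations = []
    for term in terms:
        if term.lower() in stop_words:
            continue  # common word: never annotate it as an entity
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            annotations.append((match.start(), match.end(), match.group()))
    return sorted(annotations)


text = "The cell can use glucose."
terms = ["can", "glucose"]  # hypothetical dictionary entries

without_stop_words = annotate(text, terms)
with_stop_words = annotate(text, terms, stop_words={"can"})
print(without_stop_words)
print(with_stop_words)
```

Without a stop word list, the common word "can" is annotated as an entity (a false positive); with the list, only "glucose" is annotated.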


Normalization Option

In the last panel, the user can select a Normalization option. The Normalization option adds white space around word delimiters, which can increase entity recognition. This option changes the original text offsets.

NER ANote Wizard4.png
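A minimal sketch of why normalization shifts offsets is given below. The delimiter set and the `normalize` function are assumptions made for illustration; the page does not list which delimiters the tool actually uses.

```python
import re

# Assumed set of word delimiters; the real set used by the tool is not documented here.
DELIMITERS = re.compile(r"([()\[\]{},;:/])")


def normalize(text):
    """Surround each delimiter with spaces, then collapse repeated white space."""
    spaced = DELIMITERS.sub(r" \1 ", text)
    return re.sub(r"\s+", " ", spaced).strip()


original = "glucose/fructose(sugars)"
normalized = normalize(original)

print(normalized)
print(original.find("fructose"), normalized.find("fructose"))
```

In this example "fructose" starts at offset 8 in the original text but at offset 10 after normalization, so annotations produced on normalized text refer to positions in the normalized text, not the original.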

Once the configuration is complete, the Ok button must be pressed. At any time, the user can cancel the operation by clicking the Cancel button.

Performing

The NER operation will start and a small window will appear, indicating that the operation is running. The NER operation can take from a few minutes to several hours, depending on the number of documents, the document size and the total number of resource terms.

When the process finishes, a new NER Process object will be added to the Corpus Process View.