Difference between revisions of "Corpus Create Annotation Schema By linnaeus Tagger"
Revision as of 15:50, 25 June 2014
Select Option
To perform a new NER (Named Entity Recognition) process based on the Linnaeus Tagger, start by loading a Corpus to the Clipboard.
Select the Corpus, right-click it, and choose Corpus -> NER -> Linnaeus Tagger.
New Configuration or Load Configuration
A wizard will be presented to configure the process. The first step offers two options: create a new process (New Configuration) or load the configuration of an NER process that was already performed (Load Configuration). To start a new process, select New Configuration and press the Next button.
Resources Selection
In the next panel, select the lexical resources. Dictionaries, lookup tables, rule sets and ontologies can be added here for use in the NER process. Each tab lists the existing resources of one type, from which you can select. When all lexical resources are selected, press Next.
[Figure: NER_Linnaeus_Tagger_Wizard1b.png]
Rule: Partial Match with Dictionaries
Using Rules, it is possible to associate Rule annotations with Dictionary Terms (only for the Dictionaries selected in this step). To do so, select the option Partial Match with Dictionaries on the Rules tab, as shown in the following figure.
Example

Dictionary Terms:
relA  Gene  [List Synonyms][List External Ids]
relB  Gene  [List Synonyms][List External Ids]
ppGpp  Compound  [List Synonyms][List External Ids]

Rule:
AA(.*)?\b

If the rule is applied to the following text segment:
A AArelA gene for some organism interacts with AArelB.

Results
* Without using this option
relA (Annotated by Partial Match Rule)
relB (Annotated by Partial Match Rule)
* Using Partial Match with Dictionaries
relA (Annotated by Partial Match Rule) + associated with dictionary term relA (List Synonyms and External Ids)
relB (Annotated by Partial Match Rule) + associated with dictionary term relB (List Synonyms and External Ids)
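The effect of the option in the example above can be sketched in a few lines of Python. This is only an illustration, not the tagger's actual implementation: the `DICTIONARY` structure, the `annotate` function and its parameter name are hypothetical. The sketch also uses the pattern `AA(\w+)` rather than the page's `AA(.*)?\b`, because in Python's `re` module a greedy `.*` would run to the end of the line; how the tool itself applies the rule is not stated here.

```python
import re

# Hypothetical in-memory dictionary: term -> class, synonyms, external ids.
DICTIONARY = {
    "relA": {"class": "Gene", "synonyms": [], "external_ids": []},
    "relB": {"class": "Gene", "synonyms": [], "external_ids": []},
    "ppGpp": {"class": "Compound", "synonyms": [], "external_ids": []},
}

# Assumed stand-in for the rule: the captured group is the annotated text.
RULE = re.compile(r"AA(\w+)")

TEXT = "A AArelA gene for some organism interacts with AArelB."


def annotate(text, partial_match_with_dictionaries):
    """Annotate rule matches; optionally link each one to a dictionary term."""
    annotations = []
    for match in RULE.finditer(text):
        term = match.group(1)
        annotation = {
            "text": term,
            "start": match.start(1),
            "end": match.end(1),
            "dictionary_term": None,  # plain rule annotation by default
        }
        # With the option on, a rule annotation whose text equals a
        # dictionary term is associated with that term.
        if partial_match_with_dictionaries and term in DICTIONARY:
            annotation["dictionary_term"] = term
        annotations.append(annotation)
    return annotations


if __name__ == "__main__":
    for option in (False, True):
        print(option, annotate(TEXT, option))
```

With the option off, both annotations carry no dictionary link; with it on, each annotation can reach the synonyms and external ids of its dictionary term.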
Select Class and Case Sensitivity
PreProcessing
Advanced Options
Normalization
In the last panel, you can select a Normalization option that adds white space around word delimiters, which can increase entity recognition accuracy. Note that this option changes the original offsets of the text.
The configuration is complete once you press Ok. You can cancel the operation at any time by clicking the corresponding button.
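To see why normalization changes offsets, consider the following minimal sketch. It assumes that normalization simply pads each delimiter character with a space on both sides; the delimiter set and the function name `normalize_with_offsets` are illustrative assumptions, not the tool's actual behaviour.

```python
def normalize_with_offsets(text, delimiters="()[]{},;:/-"):
    """Pad delimiters with spaces and record where each character came from.

    Returns the normalized text and a list in which entry i is the index
    of normalized character i in the original text, or None for an
    inserted space.
    """
    out = []
    offsets = []
    for index, char in enumerate(text):
        if char in delimiters:
            out.append(" ")
            offsets.append(None)
            out.append(char)
            offsets.append(index)
            out.append(" ")
            offsets.append(None)
        else:
            out.append(char)
            offsets.append(index)
    return "".join(out), offsets


if __name__ == "__main__":
    original = "relA/relB"
    normalized, offsets = normalize_with_offsets(original)
    position = normalized.find("relB")
    # The same entity starts at a different position after normalization;
    # the offsets list maps it back to the original text.
    print(repr(normalized), position, offsets[position])
```

Keeping such a mapping is one way to translate annotation positions back to the original text when they are needed there.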
Performing
The NER Linnaeus Tagger operation will now start, and a progress window will appear showing its execution. The operation can take from a few minutes to several hours, depending on the number of documents, their size, and the size of the resources.
When the process ends, a new NER Process object will be added to the clipboard and can be visualized through the Corpus Process View.