Difference between revisions of "Corpus Relation Extraction"

From Anote2Wiki
(Manual Curation (from Other Process))
 
(22 intermediate revisions by 3 users not shown)
Line 1: Line 1:
 +
__TOC__
 
[[Category:HOWTOs]]
 
[[Category:HOWTOs]]
  
 +
== Operation ==
 +
To perform a '''Relation Extraction''' (RE) process based on Natural Language Processing algorithms, right-click a Corpus data-type object and select the option '''Corpus -> RE -> Relation Extraction'''.
 +
If the Corpus is not in the clipboard, you should begin by  [[Corpora_Load_Corpus|loading the Corpus to the Clipboard]].
  
The user can perform a new RE('''Relation Extraction''') after [[Load_Corpus|loading Corpus to Clipboard]] based in Natutal Language Processing.
 
  
For that user must select Corpus, pressing '''right mouse button''' an select '''Corpus -> RE-> Relation Extraction'''
+
[[Image:Corpus_Process_RE.png|1500px|center]]
 +
 
 +
 
 +
A new wizard is presented that allows you to configure the RE process.
  
[[Image:Corpus_Process_RE.png|1500px|center]]
+
== Entity Process Selection ==
 +
The first panel enables the '''selection''' of the '''process''' that contains the annotated entities, and shows some statistics and process properties. After selecting the process, press '''Next''' to continue.
  
A new wizard will be presented. This allows to configure the RE.
 
The first panel enables the '''selection''' of the '''processes''' that contains entities schema. Here the user can select the process that contains entities and at same time view some statics and process properties. After select Process press '''Next button''' to continue.
 
  
 
[[Image:RE1.png|800px|center]]
 
[[Image:RE1.png|800px|center]]
  
Next Panel is for '''POS-Tagger selection''', Here user select witch POS-Tagger will be used, and some information about POS_TAgger origin will be present. After choose POS-Tagger must press '''Next Button''' to continue.
+
== POS-Tagger selection ==
 +
The next panel allows for '''POS-Tagger selection'''. Here, you select which POS-Tagger will be used, and some information about its origin and properties will be presented. After choosing the desired POS-Tagger press '''Next''' to continue.
 +
 
  
 
[[Image:RE2.png|800px|center]]
 
[[Image:RE2.png|800px|center]]
  
Proceeding, next panel is for '''Relation Extraction Model Selection'''. The user select the Relation model and panel show the result of model in relation extraction ( Panel Image Above). After selcted the best model the user must press '''Next Button'''
+
== Relation Extraction Model Selection ==
 +
The following panel allows for '''Relation Extraction Model Selection'''. Select the most adequate model (the panel below shows the expected type of results). After selecting the best model press '''Next'''.
 +
 
  
 
[[Image:RE3.png|800px|center]]
 
[[Image:RE3.png|800px|center]]
  
Proceeding, next panel is for choose '''Verb List Filter option'''. User can select a list of filter verbs (Lexical Words). This option, when selected, permits remove list of verbs from Relation Clues. For example, normally common English verbs like be, are not good indicators for Relations.
+
== Advanced Options ==
 +
The following panel presents the '''Relation Extraction Advanced Options'''. Here, you can filter relations by relation type or choose advanced model options.
 +
 
 +
=== Relation Types ===
 +
 
 +
You can select the relation types that will be extracted:
 +
 
 +
 
 +
[[Image:RE3_2.png|800px|center]]
 +
 
 +
 
 +
=== Advanced Model Options ===
 +
 
 +
You can also select one of the advanced relation model options:
 +
* Using a maximum word distance between verbs (clues) and annotated entities in the sentence;
 +
* Keep Only Relations where verbs are associated only with nearest entities
 +
* Keep Only relations where the entities are only associated to the nearest verb
 +
* Without constraints
 +
 
 +
 
 +
[[Image:RE3_3.png|800px|center]]
 +
 
 +
=== Manual Curation (from Other Process) ===
 +
 
 +
You can activate the possibility to merge manually curated entities and relations from another previous process in the RE Process.
 +
 
 +
The manual process selection is made in the combo box denoted in the section highlighted in blue.
 +
 
 +
The manual curation details (ordered by document) are also present in the section highlighted in red.
 +
 
 +
 
 +
[[File:RE3_4.png|800px|center]]
 +
 
 +
 
 +
Notes : Annotation Types Meaning
 +
 
 +
ENTITYUPDATE: Entities whose class was changed (Violet)
 +
 
 +
ENTITYREMOVE: Entities that were removed (Red)
 +
 
 +
ENTITYADD: Entities that were added (green)
 +
 
 +
RELATIONUPDATE: Relation whose annotated entities were changed (Violet)
 +
 
 +
RELATIONREMOVE: Relation that was removed (Red)
 +
 
 +
RELATIONADD: Relation that was added (green)
 +
 
 +
== Verb List Filter ==
 +
The next panel allows you to choose whether a '''Verb List''' will be used to filter the results, and to define this list. You can select a list of verbs (a Lexical Words object) that will not be used to create relations. Typically, this is used to avoid relations with common English verbs (e.g. be, do).
 +
 
  
 
[[Image:RE4.png|800px|center]]
 
[[Image:RE4.png|800px|center]]
  
Proceeding, next panel is for choose '''Verb List Addition option'''. User can select a list of addition verbs (Lexical Words). This option, when selected, permits adding list of verbs that normally are not co-notated as verbs in POS-Taggers to addition as Relation Clue. After the user must press '''Ok Button'''.
+
== Verb List Added ==
 +
The next panel allows you to choose an additional '''Verb List'''. You can select a list of verbs (a Lexical Words object) to add as relation clues, i.e. to complement the verb list used internally by @Note.
 +
Once this option is selected, press '''Ok'''.
 +
 
  
 
[[Image:RE5.png|800px|center]]
 
[[Image:RE5.png|800px|center]]
  
RE operation will start and a small window will appear, indicating the execution of the operation. The REoperation will take a few minutes or hours, depending of corpus size.
+
== Result ==
 +
The RE operation will start and a progress window is shown, indicating the execution of the operation. The RE operation will take a few minutes or hours, depending on corpus size.
  
When the process finishing, a new '''RE Process''' object will be added to the [[Corpus_Load_Process|''Corpus Process View'']].
+
When the process ends, a new '''RE Process''' object will be added to the [[Corpus_Load_Process|''Corpus Process View'']].

Latest revision as of 17:22, 16 January 2015

Operation

To perform a Relation Extraction (RE) process based on Natural Language Processing algorithms, right-click a Corpus data-type object and select the option Corpus -> RE -> Relation Extraction. If the Corpus is not in the clipboard, begin by loading the Corpus to the Clipboard.


Corpus Process RE.png


A new wizard is presented that allows you to configure the RE process.

Entity Process Selection

The first panel enables the selection of the process that contains the annotated entities, and shows some statistics and process properties. After selecting the process, press Next to continue.


RE1.png

POS-Tagger selection

The next panel allows for POS-Tagger selection. Here, you select which POS-Tagger will be used, and some information about its origin and properties is presented. After choosing the desired POS-Tagger, press Next to continue.


RE2.png

Relation Extraction Model Selection

The following panel allows for Relation Extraction Model Selection. Select the most suitable model (the panel below shows the expected type of results), then press Next.


RE3.png

Advanced Options

The following panel presents the Relation Extraction Advanced Options. Here, you can filter relations by relation type or choose advanced model options.

Relation Types

You can select the relation types that will be extracted:


RE3 2.png


Advanced Model Options

You can also select one of the advanced relation model options:

  • Use a maximum word distance between verbs (clues) and annotated entities in the sentence
  • Keep only relations where verbs are associated only with the nearest entities
  • Keep only relations where entities are associated only with the nearest verb
  • Without constraints
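
To make the effect of these options more concrete, the following Python sketch applies each constraint to the verbs and entities of one tokenized sentence. It is only an illustration and not @Note's implementation: the (text, token index) representation, the function names, the `max_distance` parameter and the reading of "nearest entities" as the closest entity on each side of the verb are all assumptions.

```python
# Hypothetical representation: every verb (clue) and entity is a
# (text, token_index) pair taken from a single sentence.

def within_distance(verbs, entities, max_distance):
    """Pair each verb with the entities at most max_distance tokens away."""
    return [
        (verb, entity)
        for verb in verbs
        for entity in entities
        if abs(verb[1] - entity[1]) <= max_distance
    ]


def nearest_entities_per_verb(verbs, entities):
    """For each verb, keep only the closest entity on its left and on its right."""
    pairs = []
    for verb in verbs:
        left = [e for e in entities if e[1] < verb[1]]
        right = [e for e in entities if e[1] > verb[1]]
        if left:
            pairs.append((verb, max(left, key=lambda e: e[1])))
        if right:
            pairs.append((verb, min(right, key=lambda e: e[1])))
    return pairs


def nearest_verb_per_entity(verbs, entities):
    """For each entity, keep only the pair formed with its closest verb."""
    if not verbs:
        return []
    return [
        (min(verbs, key=lambda v: abs(v[1] - entity[1])), entity)
        for entity in entities
    ]


# Made-up sentence: "ProteinA activates ProteinB and strongly inhibits ProteinC"
verbs = [("activates", 1), ("inhibits", 5)]
entities = [("ProteinA", 0), ("ProteinB", 2), ("ProteinC", 6)]

print(within_distance(verbs, entities, max_distance=1))
print(nearest_entities_per_verb(verbs, entities))
print(nearest_verb_per_entity(verbs, entities))
```

In this made-up sentence, the nearest-entities option still links "inhibits" to ProteinB (its closest entity on the left), whereas the nearest-verb option attaches ProteinB only to "activates". Selecting no constraints corresponds to keeping every verb and entity combination in the sentence.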


RE3 3.png

Manual Curation (from Other Process)

You can enable the merging of manually curated entities and relations from a previous process into the RE Process.

The manual process is selected in the combo box in the section highlighted in blue.

The manual curation details (ordered by document) are shown in the section highlighted in red.


RE3 4.png


Notes: Meaning of the Annotation Types

ENTITYUPDATE: Entities whose class was changed (violet)

ENTITYREMOVE: Entities that were removed (red)

ENTITYADD: Entities that were added (green)

RELATIONUPDATE: Relations whose annotated entities were changed (violet)

RELATIONREMOVE: Relations that were removed (red)

RELATIONADD: Relations that were added (green)
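
The type names above follow a regular pattern: a target (ENTITY or RELATION) followed by an action (UPDATE, REMOVE or ADD), with the color determined by the action. The small Python sketch below spells out that pattern; the function name and the dictionary are hypothetical and are not part of @Note.

```python
# Colors as listed in the notes above, keyed by the action suffix of the type name.
ACTION_COLORS = {"UPDATE": "violet", "REMOVE": "red", "ADD": "green"}


def describe(annotation_type):
    """Split a type such as 'ENTITYADD' into (target, action, color)."""
    for target in ("ENTITY", "RELATION"):
        if annotation_type.startswith(target):
            action = annotation_type[len(target):]
            return target, action, ACTION_COLORS[action]
    raise ValueError(f"unknown annotation type: {annotation_type}")


print(describe("RELATIONREMOVE"))
```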

Verb List Filter

The next panel allows you to choose whether a Verb List will be used to filter the results, and to define this list. You can select a list of verbs (a Lexical Words object) that will not be used to create relations. Typically, this is used to avoid relations with common English verbs (e.g. be, do).
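
Conceptually, this filter discards every candidate relation whose clue verb appears in the selected list. The Python sketch below illustrates the idea under stated assumptions: relations are plain dictionaries, the verb is compared by its lemma, and the function name and sample data are invented for the example. How @Note matches verbs internally is not described here.

```python
def filter_relations(relations, excluded_verbs):
    """Drop the relations whose clue verb (lemma) is in the excluded list."""
    excluded = {verb.lower() for verb in excluded_verbs}
    return [rel for rel in relations if rel["lemma"].lower() not in excluded]


# Made-up candidate relations for illustration only.
relations = [
    {"lemma": "be", "left": "GeneA", "right": "ProteinB"},
    {"lemma": "inhibit", "left": "ProteinB", "right": "GeneC"},
]

# Only the relation built on "inhibit" survives the filter.
kept = filter_relations(relations, ["be", "do"])
print(kept)
```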


RE4.png

Verb List Added

The next panel allows you to choose an additional Verb List. You can select a list of verbs (a Lexical Words object) to add as relation clues, i.e. to complement the verb list used internally by @Note. Once this option is set, press Ok.
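
One way to picture the effect of an added verb list is shown in the Python sketch below. It assumes, purely for illustration, that clues are found from Penn Treebank-style verb tags (tags starting with "VB") and that any word in the added list also counts as a clue even when the POS-Tagger labels it differently. The function name, the tag convention and the sample tokens are assumptions, not @Note's actual mechanism.

```python
def find_clues(tagged_tokens, added_verbs=()):
    """Return the indexes of tokens treated as relation clues.

    A token is a clue if its tag starts with "VB" (assumed tag set)
    or if its word appears in the user-supplied added verb list.
    """
    added = {verb.lower() for verb in added_verbs}
    return [
        index
        for index, (word, tag) in enumerate(tagged_tokens)
        if tag.startswith("VB") or word.lower() in added
    ]


# Made-up tagger output in which the verb was mis-tagged as a plural noun.
tokens = [("ProteinA", "NN"), ("upregulates", "NNS"), ("GeneB", "NN")]

print(find_clues(tokens))                   # no clue found without the added list
print(find_clues(tokens, ["upregulates"]))  # the added verb is now a clue
```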


RE5.png

Result

The RE operation then starts, and a progress window is shown while it runs. Depending on the corpus size, the operation can take from a few minutes to several hours.

When the process ends, a new RE Process object will be added to the Corpus Process View.