Difference between revisions of "Workflow : Information Extraction From Query"
Latest revision as of 17:00, 8 September 2014
Operation
The Information Extraction From Query Workflow lets you run a set of tasks over the results of a query: Corpus creation (mandatory) and, optionally, NER and RE processes.
To run the workflow, select a Query object on the Clipboard, right-click it and choose Workflow -> Information Extraction (From query), or click the Workflow Information Extraction button in the Query View.
Select Query Publications
The first step of the workflow is to select the candidate publications used to create a Corpus. If you have already selected publications in the Query View, that selection will already be present. After selecting the desired publications, press Next.
Select Steps
The next step is to choose the tasks that the workflow will execute. Corpus creation is mandatory; NER and RE processes can optionally be applied to this Corpus in subsequent operations. After selecting the tasks, press Next to continue.
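As a rough illustration of how the step selection shapes the rest of the wizard, the following sketch lists the pages a user would pass through for a given choice of optional tasks. The names `StepSelection` and `wizard_pages` are hypothetical and are not part of the tool; the page titles are taken from the section headings of this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepSelection:
    """Optional tasks chosen in 'Select Steps' (hypothetical model).

    Corpus creation is always executed, so it has no flag here.
    """
    ner: bool = False
    re: bool = False


def wizard_pages(selection: StepSelection) -> list[str]:
    """Return the wizard pages visited for the given selection, in order."""
    pages = ["Select Query Publications", "Select Steps", "Create Corpus"]
    if selection.ner:
        pages.append("Select NER Process")
    if selection.re:
        pages.append("Select RE Process")
    pages += ["Processing", "Workflow Report"]
    return pages


if __name__ == "__main__":
    for page in wizard_pages(StepSelection(ner=True, re=True)):
        print(page)
```

With no optional task selected, the sketch goes straight from Create Corpus to Processing, which matches the behaviour described in the Create Corpus section.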
Create Corpus
(Mandatory)
The next step is to configure the corpus creation: enter the name of the Corpus and select its type.
Corpus Type
Abstract : Only publications with abstracts will be considered.
Full Text : Only publications with full text / PDF will be considered.
Retrieve PDF : Only publications with full text / PDF will be considered, and a Journal Retrieval Process will be launched to collect all selected documents.
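The effect of the three corpus types can be pictured as a filter over the selected publications. The sketch below is only an illustration: `CorpusType`, `Publication`, `eligible` and `launches_retrieval` are hypothetical names, and it assumes that for Retrieve PDF the full-text check is applied to whatever is available once the retrieval process has run.

```python
from dataclasses import dataclass
from enum import Enum


class CorpusType(Enum):
    ABSTRACT = "Abstract"
    FULL_TEXT = "Full Text"
    RETRIEVE_PDF = "Retrieve PDF"


@dataclass
class Publication:
    # Hypothetical minimal record; the tool stores much more than this.
    title: str
    has_abstract: bool
    has_full_text: bool


def eligible(publications, corpus_type):
    """Return the publications that the given corpus type would consider."""
    if corpus_type is CorpusType.ABSTRACT:
        return [p for p in publications if p.has_abstract]
    # Full Text and Retrieve PDF both require full text / PDF.
    return [p for p in publications if p.has_full_text]


def launches_retrieval(corpus_type):
    """Only Retrieve PDF starts a Journal Retrieval Process."""
    return corpus_type is CorpusType.RETRIEVE_PDF


if __name__ == "__main__":
    pubs = [
        Publication("A", has_abstract=True, has_full_text=False),
        Publication("B", has_abstract=True, has_full_text=True),
    ]
    print([p.title for p in eligible(pubs, CorpusType.FULL_TEXT)])
```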
If you selected only the corpus creation task, processing of your data starts after you click OK.
When this process finishes, you can see the results in the Workflow Report.
Otherwise, if other steps are included, press Next to proceed to the NER configuration.
Select NER Process
(Optional)
The next step is the configuration of an NER Process. Using the combo box (in blue), select the NER process configuration that is most suitable. For each NER Process, the specific settings appear under the combo box (area in orange).
NER - Based in Lexical Resources
You can select an NER process based on lexical resources. The configuration has two panels: Basic and Advanced.
Basic
Here, you can select one or more dictionaries to use as resources in the NER. For each dictionary, you can filter by class, i.e. select which classes of that dictionary will be used.
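To make the idea of a class filter concrete, here is a minimal sketch of dictionary-based tagging with a class filter. It is not the tool's implementation: the dictionary is modelled as a plain term-to-class mapping, the function names `filter_dictionary` and `tag` are hypothetical, and matching is a simple case-insensitive whole-word search.

```python
import re


def filter_dictionary(dictionary, allowed_classes):
    """Keep only the terms whose class was selected for this dictionary."""
    return {term: cls for term, cls in dictionary.items() if cls in allowed_classes}


def tag(text, dictionary):
    """Return (start, end, matched_text, class) tuples, sorted by position."""
    annotations = []
    for term, cls in dictionary.items():
        # Whole-word, case-insensitive match of the dictionary term.
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            annotations.append((match.start(), match.end(), match.group(0), cls))
    return sorted(annotations)


if __name__ == "__main__":
    # Illustrative entries only; real dictionaries come from the lexical resources.
    dictionary = {"glucose": "Compound", "lacZ": "Gene", "E. coli": "Organism"}
    selected = filter_dictionary(dictionary, {"Compound", "Gene"})
    for annotation in tag("E. coli expresses lacZ when glucose is absent.", selected):
        print(annotation)
```

In this sketch, terms of a class that was filtered out (here Organism) are simply never annotated, even though they are present in the dictionary.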
Advanced Option
Expert users can configure some advanced options. These options are based on the NER - Lexical Resources operation; see NER - Lexical Resources for details.
ABNER
Chemical Tagger
Linnaeus Tagger
If your workflow ends with the NER process, data processing starts after you click OK. When the process finishes, you can see the results in the Workflow Report.
Otherwise, if an RE process is also to be run, press Next to proceed to its configuration.
Select RE Process
(Optional)
The next step is the configuration of an RE Process. Using the combo box (in blue), select the RE process that best serves your needs. For each RE Process, the specific settings appear under the combo box (area in the orange box).
RE Based in POS-Tagging
You can select an RE process based on natural language processing. The configuration has two panels: Basic and Advanced.
Basic
Here, you can select the relation model.
Advanced
Expert users can configure some advanced options. These options are based on the RE operation detailed in Relation Extraction.
RE Cooccurrence
Bio Event Extraction
Processing
As the process runs, you can check its status by looking at the progress bar.
Workflow Report
When the process finishes, a Workflow Report is shown.
Pre-Configuration
Default settings can be changed through the Preferences option in the Settings menu, by selecting Workflow -> Query Node.