Difference between revisions of "Workflow : Information Retrieval and Extraction"

From Anote2Wiki
(Select Steps)
(Pre-Configuration)
 
(33 intermediate revisions by 2 users not shown)
Line 4: Line 4:
 
== Operation ==
 
== Operation ==
  
The Information Retrieval and Extraction Workflow allows you to set up some of the most important tasks in @Note, including the Journal Retrieval (mandatory), Journal Crawling, Corpus creation, NER Process and RE Processes (optional).  
+
The Information Retrieval and Extraction Workflow allows you to set up some tasks in @Note, including the Journal Retrieval (mandatory) and Journal Crawling, Corpus creation, NER and RE Processes (optionally).  
 +
 
 
To run the workflow, you must select the option '''Workflow ->  Information Retrieval and Extraction''' on the Menu Bar.
 
To run the workflow, you must select the option '''Workflow ->  Information Retrieval and Extraction''' on the Menu Bar.
  
Line 13: Line 14:
 
== Select Steps ==
 
== Select Steps ==
  
The next step is to determine the tasks that will be executed by the workflow.  
+
The next step is to determine the tasks that will be executed by the workflow. The Pubmed Search is mandatory but Corpus Creation and NER and RE processes can also be applied to this Corpus in subsequent operations. After selecting the tasks, press '''Next''' to continue.
  
After selecting the tasks, press Next to continue.
 
  
 
[[File:Workflow_2.png|800px|center]]
 
[[File:Workflow_2.png|800px|center]]
 +
  
 
== PubMed Search ==
 
== PubMed Search ==
 +
 +
(Mandatory)
 +
 +
The next step is to select PubMed search options. You can restrict the search to a specific organism or keywords and can also select the name of an author, a journal, the type of article, if the article is present in PubMed Central or Medline, if full text is available or select a publication date range.
 +
 +
 +
[[File:Workflow3.png|center|800px]]
 +
 +
 +
If you have selected only the task of Pubmed Search, the processing of your data will start after clicking ok. When this process finishes, you can see the results in the [[Workflow_Report]]
 +
 +
Otherwise, if other steps are included, press next to proceed to Corpus Creation Configuration.
  
 
== Create Corpus ==
 
== Create Corpus ==
  
The next step are to configure the corpus creation, here you have to select the name of the Corpus and select the type of corpus.
+
(Optional)
 +
 
 +
The next step is to configure the corpus creation, where you have to select the name of the Corpus and its type.
  
 
=== Corpus Type ===
 
=== Corpus Type ===
Line 31: Line 46:
 
''' Full Text :''' Only publications with full Text / PDF will be considered.
 
''' Full Text :''' Only publications with full Text / PDF will be considered.
  
''' Retrieve PDF :''' Only publications with full Text / PDF will be considered, and a Journal Retrieval Process to all select document are launched (after configuration steps)
+
''' Retrieve PDF :''' Only publications with full Text / PDF will be considered, and a Journal Retrieval Process will be launched to collect all selected documents
 +
 
  
 
[[File:Workflow4.png|center|800px]]
 
[[File:Workflow4.png|center|800px]]
 +
  
 
[[File:Workflow4b.png|center|800px]]
 
[[File:Workflow4b.png|center|800px]]
  
If you just select corpus creation your workflows process starts after clinking '''ok button'''. When process over you can see results in [[Workflow_:_Information_Extraction_From_Query#Workflow_Report|Workflow_Report]]
 
  
Otherwise press '''next button ''' to proceed to NER configuration.
+
If this is the last task, the processing of your data will start after clicking '''ok'''.
 +
When this process finishes, you can see the results in the [[Workflow_:_Information_Extraction_From_Query#Workflow_Report|Workflow_Report]]
 +
 
 +
Otherwise, if other steps are included, press '''next''' to proceed to NER configuration.
  
 
== Select NER Process ==
 
== Select NER Process ==
Line 45: Line 64:
 
(Optional)
 
(Optional)
  
The next step is configuration a NER Process. Using combo box the must select NER process that servers your efforts.  
+
The next step is the configuration of an NER Process. Using the combo box (in blue), select the NER process configuration that is more suitable. For each NER Process, the specific settings appear under the combo box (in the area delimited by the orange box)
 +
 
 +
 
 +
[[File:Workflow_NER_Selection.png|center|800px]]
  
=== NER - Based in Lexical Resources ===
 
  
You can select a NER Based in lexical resources. The configuration have two panel Basics an Advanced:
+
===[[Corpus_Create_Annotation_Schema_By_NER_Lexical_Resources|NER - Based in Lexical Resources]] ===
  
==== Basic Option ====
+
You can select an NER process based in lexical resources. The configuration has two panels: Basic and Advanced:
 +
 
 +
==== Basic ====
 +
 
 +
Here, you can select one or more dictionaries to use as resources in the NER. For each dictionary, you can filter for classes, i.e. select which classes will be associated to the dictionary.
  
Here you must select one or many dictionaries to run NER.
 
  
 
[[File:Workflow5.png|center|800px]]
 
[[File:Workflow5.png|center|800px]]
  
==== Advance Option ====
 
  
Expert User can configure some advance options. This options are based in NER - Lexical Resource [[Corpus_Create_Annotation_Schema_By_NER_Lexical_Resources|NER - Lexical Resources]]
+
==== Advanced Option ====
 +
 
 +
Expert users can configure some advanced options. These options are based in the operation NER - Lexical Resources: check details in [[Corpus_Create_Annotation_Schema_By_NER_Lexical_Resources|NER - Lexical Resources]]
 +
 
  
 
[[File:Workflow5b.png|center|800px]]
 
[[File:Workflow5b.png|center|800px]]
  
If you just select NER PRocess, workflows process starts after clinking '''ok button'''. When process over you can see results in Workflow_Report
 
  
Otherwise press '''next button''' to proceed to RE configuration.
+
=== [[Corpus_Create_Annotation_Schema_By_Abner|ABNER]] ===
 +
 
 +
=== [[Corpus_Create_Annotation_Schema_By_Chemical_Tagger|Chemical Tagger]] ===
 +
 
 +
=== [[Corpus_Create_Annotation_Schema_By_linnaeus_Tagger|Linnaeus Tagger]] ===
 +
 
 +
 
 +
If your workflow terminates in the NER process, the data processing will start after clicking '''ok'''.
 +
When the process finishes, you can see the results in the [[Workflow_:_Information_Extraction_From_Query#Workflow_Report|Workflow_Report]]
 +
 
 +
Otherwise, if an RE process will be conducted, press '''next''' to proceed to its configuration.
  
 
== Select RE Process ==
 
== Select RE Process ==
Line 71: Line 106:
 
(Optional)
 
(Optional)
  
The next step is configuration a REProcess. Using combo box the must select Re process that servers your efforts.  
+
The next step is the configuration of an REProcess. Using the combo box select the RE process that best serves your needs (in blue).  For each RE Process, the specific settings appear under the combo box (in the area within the orange box).
 +
 
  
=== RE Based in POS-Tagging ===
+
[[File:Workflow_RE_Selection.png|center|800px]]
  
You can select a RE Based Natural Language processing. The configuration have two panel Basics an Advanced.
 
  
==== Basic Option ====
+
=== [[Corpus_Relation_Extration|RE Based in POS-Tagging]] ===
 +
 
 +
You can select an RE Based in Natural Language processing. The configuration has two panels: Basic an Advanced.
 +
 
 +
==== Basic ====
 +
 
 +
Here, you can select the relation model.
  
Here you must select the relation model
 
  
 
[[File:Workflow6.png|center|800px]]
 
[[File:Workflow6.png|center|800px]]
  
==== Advance Option ====
 
  
Expert User can configure some advance options. This options are based in RE [[Corpus_Relation_Extraction|Relation Extraction]]
+
==== Advanced ====
 +
 
 +
Expert users can configure some advanced options. These options are based in the RE operation detailed in: [[Corpus_Relation_Extraction|Relation Extraction]]
 +
 
  
 
[[File:Workflow6b.png|center|800px]]
 
[[File:Workflow6b.png|center|800px]]
 +
 +
 +
=== [[Corpus_Relation_Coocorrence_Extration|RE Cooccurrence]] ===
 +
=== [[BioEvent_Extraction|Bio Event Extraction]] ===
  
 
== Processing ==
 
== Processing ==
  
As the process proceeds the user can see the status of activity looking to progress bar
+
As the process proceeds you can check the status of activity looking at the progress bar
 +
 
  
 
[[File:Workflow7.png|center]]
 
[[File:Workflow7.png|center]]
 +
  
 
== Workflow Report ==
 
== Workflow Report ==
  
As a result of the process appears a [[Workflow_report]]
+
As a result of the process, a [[Workflow_report]] is shown.
 +
 
 +
== Pre-Configuration ==
 +
 
 +
Default settings can be changed using the Preferences options in the Settings Menu and selecting Workflow -> General Node
 +
 
 +
 
 +
[[File:Workflow_General_Preconfiguration.png|center]]

Latest revision as of 17:09, 8 September 2014

Operation

The Information Retrieval and Extraction Workflow lets you set up several tasks in @Note: Journal Retrieval (mandatory) and, optionally, Journal Crawling, Corpus creation, and NER and RE Processes.

To run the workflow, you must select the option Workflow -> Information Retrieval and Extraction on the Menu Bar.


Workflow1.png


Select Steps

The next step is to determine the tasks that will be executed by the workflow. The PubMed Search is mandatory; Corpus Creation and the NER and RE processes are optional and can also be applied to the resulting corpus in subsequent operations. After selecting the tasks, press Next to continue.


Workflow 2.png
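
The step selection described above can be modelled in a few lines of Python. This is only an illustrative sketch, not @Note's own code: the function name `plan_workflow`, the step order and the button logic are assumptions taken from how this page describes the wizard (the PubMed Search is always included, and the last configured step is confirmed with ok while earlier ones use next).

```python
# Step names follow this page; the ordering and button logic are an
# illustrative model of the wizard, not @Note's actual implementation.
STEP_ORDER = ["PubMed Search", "Create Corpus", "NER Process", "RE Process"]
MANDATORY = {"PubMed Search"}


def plan_workflow(selected):
    """Return (step, button) pairs in execution order.

    The mandatory step is always added. The last configured step is
    confirmed with "ok"; every earlier step is left with "next".
    """
    chosen = set(selected) | MANDATORY
    unknown = chosen - set(STEP_ORDER)
    if unknown:
        raise ValueError(f"unknown steps: {sorted(unknown)}")
    steps = [s for s in STEP_ORDER if s in chosen]
    last = len(steps) - 1
    return [(s, "ok" if i == last else "next") for i, s in enumerate(steps)]


if __name__ == "__main__":
    for step, button in plan_workflow(["Create Corpus", "NER Process"]):
        print(f"{step}: press {button}")
```

The sketch does not check dependencies between optional steps (for example, whether an NER Process needs a corpus), because this page does not state them.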


PubMed Search

(Mandatory)

The next step is to select PubMed search options. You can restrict the search to a specific organism or to keywords. You can also filter by author name, journal, article type, whether the article is present in PubMed Central or Medline, whether full text is available, and publication date range.


Workflow3.png
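
For readers who know PubMed's own query syntax, the options above roughly correspond to a search term built from PubMed field tags. The sketch below is an assumption about that correspondence, not a description of what @Note sends: the function `build_pubmed_term` and its parameters are hypothetical, and mapping the organism option to the MeSH tag `[mh]` is a guess. The tags themselves (`[au]`, `[ta]`, `[pt]`, `[dp]`, `[sb]`) are standard PubMed search tags.

```python
def build_pubmed_term(keywords=None, organism=None, author=None, journal=None,
                      article_type=None, full_text=False,
                      date_from=None, date_to=None):
    """Combine the selected options into one PubMed search term.

    Hypothetical helper: each option that is set contributes one clause,
    and all clauses are joined with AND.
    """
    parts = []
    if keywords:
        parts.append(f"({keywords})")
    if organism:
        # Assumption: organisms are matched as MeSH terms.
        parts.append(f'"{organism}"[mh]')
    if author:
        parts.append(f'"{author}"[au]')
    if journal:
        parts.append(f'"{journal}"[ta]')
    if article_type:
        parts.append(f'"{article_type}"[pt]')
    if full_text:
        parts.append("full text[sb]")
    if date_from and date_to:
        # Publication date range, e.g. 2010:2014 or 2010/01/01:2014/12/31.
        parts.append(f"{date_from}:{date_to}[dp]")
    return " AND ".join(parts)


if __name__ == "__main__":
    print(build_pubmed_term(keywords="biofilm", author="Smith J",
                            date_from="2010", date_to="2014"))
```

The function only builds the string; no request is sent to PubMed.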


If PubMed Search is the only task selected, the processing of your data will start after clicking ok. When this process finishes, you can see the results in the Workflow_Report.

Otherwise, if other steps are included, press next to proceed to the Corpus Creation configuration.

Create Corpus

(Optional)

The next step is to configure the corpus creation, where you have to select the name of the Corpus and its type.

Corpus Type

Abstract : Only publications with abstracts will be considered.

Full Text : Only publications with full text / PDF will be considered.

Retrieve PDF : Only publications with full text / PDF will be considered, and a Journal Retrieval Process will be launched to collect all selected documents.
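
The three corpus types can be read as a filter over the search results. The following sketch restates the rules above in Python. The record fields `abstract` and `has_full_text`, the enum `CorpusType` and the function `select_publications` are hypothetical names introduced for illustration; they are not part of @Note.

```python
from enum import Enum


class CorpusType(Enum):
    ABSTRACT = "Abstract"
    FULL_TEXT = "Full Text"
    RETRIEVE_PDF = "Retrieve PDF"


def select_publications(publications, corpus_type):
    """Return (kept publications, whether a Journal Retrieval is launched).

    Each publication is a dict with the hypothetical keys "abstract"
    (a string, possibly empty) and "has_full_text" (a bool).
    """
    if corpus_type is CorpusType.ABSTRACT:
        kept = [p for p in publications if p.get("abstract")]
        return kept, False
    # Full Text and Retrieve PDF apply the same filter; only Retrieve PDF
    # additionally launches a Journal Retrieval Process.
    kept = [p for p in publications if p.get("has_full_text")]
    return kept, corpus_type is CorpusType.RETRIEVE_PDF


if __name__ == "__main__":
    pubs = [
        {"id": 1, "abstract": "Some abstract.", "has_full_text": False},
        {"id": 2, "abstract": "", "has_full_text": True},
    ]
    kept, retrieval = select_publications(pubs, CorpusType.RETRIEVE_PDF)
    print([p["id"] for p in kept], retrieval)
```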


Workflow4.png


Workflow4b.png


If this is the last task, the processing of your data will start after clicking ok. When this process finishes, you can see the results in the Workflow_Report.

Otherwise, if other steps are included, press next to proceed to NER configuration.

Select NER Process

(Optional)

The next step is the configuration of an NER Process. Using the combo box (in blue), select the NER process configuration that is most suitable. For each NER Process, the specific settings appear under the combo box (in the area delimited by the orange box).


Workflow NER Selection.png


NER - Based in Lexical Resources

You can select an NER process based on lexical resources. The configuration has two panels: Basic and Advanced.

Basic

Here, you can select one or more dictionaries to use as resources in the NER. For each dictionary, you can filter by class, i.e. select which classes will be associated with the dictionary.


Workflow5.png
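
To make the idea of dictionaries with a class filter concrete, here is a minimal dictionary-lookup tagger in Python. It is a sketch of the general technique only and does not reproduce @Note's NER: the function `dictionary_ner`, the data layout and the example terms are all assumptions, and matching is a plain case-insensitive whole-word search.

```python
import re


def dictionary_ner(text, dictionaries, allowed_classes=None):
    """Tag dictionary terms found in text.

    dictionaries: {dictionary_name: {term: class_name}}
    allowed_classes: {dictionary_name: set of class names}; a dictionary
        missing from this mapping keeps all of its classes.
    Returns a list of (start, end, matched_text, class_name), sorted by start.
    """
    allowed_classes = allowed_classes or {}
    annotations = []
    for name, terms in dictionaries.items():
        keep = allowed_classes.get(name)
        for term, cls in terms.items():
            if keep is not None and cls not in keep:
                continue  # class filtered out for this dictionary
            pattern = r"\b" + re.escape(term) + r"\b"
            for m in re.finditer(pattern, text, flags=re.IGNORECASE):
                annotations.append((m.start(), m.end(), m.group(0), cls))
    return sorted(annotations)


if __name__ == "__main__":
    dictionaries = {
        "biochem": {"lactase": "enzyme", "lactose": "compound",
                    "glucose": "compound"},
    }
    text = "Lactase converts lactose into glucose."
    # Keep only the "compound" class of the "biochem" dictionary.
    for ann in dictionary_ner(text, dictionaries, {"biochem": {"compound"}}):
        print(ann)
```

With the class filter in place, the enzyme entry is ignored even though its term appears in the text, which mirrors the per-dictionary class selection described above.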


Advanced

Expert users can configure some advanced options. These options are based on the NER - Lexical Resources operation; for details, see NER - Lexical Resources.


Workflow5b.png


ABNER

Chemical Tagger

Linnaeus Tagger

If your workflow ends with the NER process, the data processing will start after clicking ok. When the process finishes, you can see the results in the Workflow_Report.

Otherwise, if an RE process is included, press next to proceed to its configuration.

Select RE Process

(Optional)

The next step is the configuration of an RE Process. Using the combo box (in blue), select the RE process that best serves your needs. For each RE Process, the specific settings appear under the combo box (in the area within the orange box).


Workflow RE Selection.png


RE Based in POS-Tagging

You can select an RE process based on natural language processing. The configuration has two panels: Basic and Advanced.

Basic

Here, you can select the relation model.


Workflow6.png


Advanced

Expert users can configure some advanced options. These options are based on the RE operation detailed in Relation Extraction.


Workflow6b.png


RE Cooccurrence

Bio Event Extraction

Processing

As the process runs, you can check its status on the progress bar.


Workflow7.png


Workflow Report

As a result of the process, a Workflow_report is shown.

Pre-Configuration

Default settings can be changed by opening the Preferences options in the Settings Menu and selecting Workflow -> General Node.


Workflow General Preconfiguration.png