Difference between revisions of "Corpora"

From Anote2Wiki
(Created page with "__TOC__ == About PubMed Retrieval Plug-in == This plugin was developed in collaboration with the SING group at University of Vigo - Spain. Information Retrieval plug-in di...")
 
Line 1:
  __TOC__
- == About PubMed Retrieval Plug-in ==
+ == About Corpora Plug-in ==
+ Central @Note2 plug-in that defines the Corpora (Corpus Set) perspective and enables the integration of extraction processes (such as entity extraction or relation extraction).
- This plug-in was developed in collaboration with the SING group at the University of Vigo, Spain.
+ Each Corpus, a set of documents, can have processes run on it. Each Corpus Process holds the annotations (entity/event annotations) for a document.
+ The plug-in already includes some standard views for annotated documents.
- The Information Retrieval plug-in is divided into two main parts:
- * Publication searching in PubMed, combining keywords and specific fields. The result is a set of publications with information such as title, abstract and journal. These publications are organized in queries.
- * Publication retrieval, which tries to get the PDF file for a publication indexed by PMID. Article crawling is limited to free-access articles.
- To structure this plug-in, Queries (publication sets) were defined. The publications in a Query can be classified (Publication Query Relevance). This step can be important, for example, in corpus creation or in building training data for automatic classification systems.
  == User HowTOs ==

Revision as of 15:30, 4 January 2012

About Corpora Plug-in

Central @Note2 plug-in that defines the Corpora (Corpus Set) perspective and enables the integration of extraction processes (such as entity extraction or relation extraction). Each Corpus, a set of documents, can have processes run on it, and each Corpus Process holds the annotations (entity/event annotations) for a document. The plug-in already includes some standard views for annotated documents.
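
The relationship between a Corpus, its processes and their annotations can be pictured with a small sketch. This is illustrative Python, not the plug-in's own code: the class names `Corpus`, `CorpusProcess` and `Annotation`, their fields, and the sample sentence are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Annotation:
    """Hypothetical entity or event annotation on a span of one document."""
    document_id: str
    kind: str    # "entity" or "event"
    start: int   # character offset where the span begins
    end: int     # character offset where the span ends (exclusive)
    label: str


@dataclass
class CorpusProcess:
    """Hypothetical extraction process run over a corpus, with its annotations."""
    name: str
    annotations: List[Annotation] = field(default_factory=list)

    def annotations_for(self, document_id: str) -> List[Annotation]:
        # Return only the annotations that belong to the given document.
        return [a for a in self.annotations if a.document_id == document_id]


@dataclass
class Corpus:
    """Hypothetical corpus: a set of documents plus the processes applied to it."""
    name: str
    documents: Dict[str, str] = field(default_factory=dict)  # document id -> text
    processes: List[CorpusProcess] = field(default_factory=list)


# Made-up example: one document and one entity-extraction process.
corpus = Corpus("demo", documents={"doc1": "BRCA1 interacts with RAD51."})
ner = CorpusProcess("entity extraction")
ner.annotations.append(Annotation("doc1", "entity", 0, 5, "gene"))
corpus.processes.append(ner)

for ann in ner.annotations_for("doc1"):
    print(ann.label, corpus.documents[ann.document_id][ann.start:ann.end])
```

Keeping annotations on the process rather than on the document lets several processes annotate the same corpus independently, which matches the description above.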

User HowTOs

MVC AIBench Model:

Data-types:

PublicationManager: The plug-in's main data-type. It holds information about all PubMed searches already performed, the proxy and database settings given in the configuration file, and the directory for publication PDF files.

QueryInformationRetrievalExtension: Represents a Query. It contains the database ID, date, keywords, organism, matching publications, available abstracts and other generic query properties.
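
As a rough picture of what these two data-types hold, here is a minimal Python sketch. It is not the plug-in's code: the field names, types and sample values are assumptions derived only from the descriptions above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Query:
    """Hypothetical stand-in for QueryInformationRetrievalExtension."""
    database_id: int
    date: str
    keywords: str
    organism: str = ""
    pmids: List[str] = field(default_factory=list)           # matching publications
    abstracts: Dict[str, str] = field(default_factory=dict)  # PMID -> abstract text

    @property
    def matching_publications(self) -> int:
        return len(self.pmids)

    @property
    def available_abstracts(self) -> int:
        # Count only publications of this query whose abstract is stored.
        return sum(1 for pmid in self.pmids if pmid in self.abstracts)


@dataclass
class PublicationManager:
    """Hypothetical stand-in for the plug-in's main data-type."""
    database: str                # database setting from the configuration file
    pdf_directory: str           # directory where publication PDF files are kept
    proxy: Optional[str] = None  # proxy setting, if any
    queries: List[Query] = field(default_factory=list)


# Made-up values, for illustration only.
manager = PublicationManager(database="example_db", pdf_directory="pdfs/")
query = Query(database_id=1, date="2012-01-04", keywords="example keywords",
              pmids=["111", "222"], abstracts={"111": "Example abstract."})
manager.queries.append(query)
print(query.matching_publications, query.available_abstracts)
```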

Operations:

AddFileToPublicationManagerOperation: Manually adds a PDF file for a publication.

AddPublicationToQueryOperation: Adds a new publication to a QueryInformationRetrievalExtension.

ExitOperation: Publication Manager exit operation.

InitReferenceManager: Initializes the PublicationManager plug-in.

JournalRetrivalListDocs: Operation that, given a list of PMIDs, tries to retrieve the corresponding PDF files.

PubmedSearchOperation: Operation that searches PubMed given the query details.

SelectRelevance: Changes a document's relevance for a query.

UpdateQueryOperation: Updates a QueryInformationRetrievalExtension (brings the results of its PubMed search up to date).
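
The query-related operations above (adding a publication, changing relevance, updating a query) can be sketched as plain functions over a query object. This is a simplified Python illustration under stated assumptions, not the AIBench operations themselves: the `Query` class, the function names and the relevance levels in `RELEVANCE_LEVELS` are hypothetical.

```python
from typing import Dict, Iterable, List

# Assumed relevance levels; the plug-in's actual values are not given in the text.
RELEVANCE_LEVELS = ("relevant", "related", "irrelevant")


class Query:
    """Hypothetical, minimal query: a set of PMIDs with optional relevance."""

    def __init__(self, keywords: str) -> None:
        self.keywords = keywords
        self.pmids: List[str] = []
        self.relevance: Dict[str, str] = {}


def add_publication(query: Query, pmid: str) -> bool:
    """Add a publication to the query; return False if it was already there."""
    if pmid in query.pmids:
        return False
    query.pmids.append(pmid)
    return True


def select_relevance(query: Query, pmid: str, relevance: str) -> None:
    """Change the relevance of one document within the query."""
    if pmid not in query.pmids:
        raise KeyError(pmid)
    if relevance not in RELEVANCE_LEVELS:
        raise ValueError(f"unknown relevance: {relevance}")
    query.relevance[pmid] = relevance


def update_query(query: Query, latest_pmids: Iterable[str]) -> List[str]:
    """Merge a fresh search result into the query; return the newly added PMIDs."""
    return [pmid for pmid in latest_pmids if add_publication(query, pmid)]


# Made-up PMIDs, for illustration only.
q = Query("example keywords")
add_publication(q, "111")
select_relevance(q, "111", "relevant")
added = update_query(q, ["111", "222", "333"])
print(added, q.relevance)
```

In this sketch an update only appends publications that are new, so relevance already assigned to existing documents is preserved.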

Views:

PublicationManagerView: PublicationManager view that shows all queries present in the Publication Manager and allows some filters to be applied.

QueryRelevanceView: QueryInformationRetrievalExtension view that shows the documents in a query along with their relevance.

QueryView: QueryInformationRetrievalExtension view that shows the documents in a query and permits some search steps.