Corpus Create Annotation Schema By Chemical Tagger

One option for performing named entity recognition (NER) is the Chemical Tagger NER algorithm (for more details please visit ).

Select the Corpus object on the clipboard, right-click it, and choose Corpus -> NER -> Chemistry Tagger.

A GUI will be presented, where you can select tagger sources:

  • Ions (e.g. Fe3+, Cl-).
  • Compound formulas (e.g. SO2, H2O, H2SO4 ...).
  • Element names and symbols (e.g. Sodium and Na).
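To give a rough idea of what each source covers, the three categories can be approximated with simple patterns. This is only an illustrative sketch: the regular expressions, the tiny ELEMENTS sample and the tag function are assumptions made for this example, not the rules the Chemical Tagger algorithm actually applies.

```python
import re

# Illustrative patterns only; these are NOT the real Chemical Tagger rules.
# Ion: one element symbol, optional count, then a charge sign (Fe3+, Cl-).
ION = re.compile(r"\b[A-Z][a-z]?\d*[+-](?!\w)")
# Compound formula: two or more element symbols, each with an optional count.
FORMULA = re.compile(r"\b(?:[A-Z][a-z]?\d*){2,}\b")
# Element names and symbols: a tiny hypothetical sample dictionary.
ELEMENTS = {"Sodium": "Na", "Iron": "Fe", "Chlorine": "Cl"}
_terms = sorted(set(ELEMENTS) | set(ELEMENTS.values()), key=len, reverse=True)
ELEMENT = re.compile(r"\b(?:" + "|".join(map(re.escape, _terms)) + r")\b")

PATTERNS = {"ion": ION, "formula": FORMULA, "element": ELEMENT}

def tag(text, sources=("ion", "formula", "element")):
    """Return the matches found for each selected source."""
    return {name: PATTERNS[name].findall(text) for name in sources}

result = tag("Sodium (Na) reacts with H2SO4 to release Fe3+")
print(result)
```

Leaving a source out of the sources argument mimics unticking it in the dialog: tag(text, sources=("ion",)) reports ions only.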

The Normalization option adds white space around word delimiters, which increases entity recognition accuracy.
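The effect of this kind of normalization can be sketched in a few lines. The delimiter set and the normalize function below are assumptions chosen for illustration; the page does not list the delimiters the tool actually uses.

```python
import re

# Hypothetical delimiter set; the tool's actual list is not documented here.
DELIMITERS = r"[()\[\]{},;:/]"

def normalize(text):
    """Put spaces around each delimiter, then collapse repeated white space."""
    spaced = re.sub(f"({DELIMITERS})", r" \1 ", text)
    return re.sub(r"\s+", " ", spaced).strip()

print(normalize("treated with NaCl(aq),then H2SO4/H2O"))
# prints: treated with NaCl ( aq ) , then H2SO4 / H2O
```

After normalization, NaCl, H2SO4 and H2O each stand alone as separate tokens, which is what makes them easier for a tagger to recognize.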

After you press Ok, the operation is launched and a progress window is shown, indicating the progress of the operation and the estimated time left. The NER operation can take from a few minutes to several hours, depending on the number of documents and their size.

When the process ends, a new NER Process object will be added to the clipboard and can be visualized through the Corpus Process View.