This program is a launcher whose purpose is to make TermSuite easier to use for ISTEX. The objective is to let another person start using TermSuite on ISTEX corpora without non-essential options or features. It is a modified version of the InlineLauncher made by Damien Cram (LINA-CNRS), to which a custom CLI, models, and utilities were added.
Whichever way you run it, you need to put the "model" directory at the root, next to ISTEXlauncher.sh.
Using the ISTEXlauncher.sh shell script is the recommended method, as it saves you from retyping long command lines.
In the ISTEXlauncher.sh file several variables can be edited: Mandatory fields:
If you don't want to use the shell script, you can use the following basic command:
java -jar ISTEX-TermSuiteLauncher.jar -corpus [dir path] -resources [dir path] -output [path]
You can always add the "-help" flag, or launch ISTEX-TermSuiteLauncher.jar without any options, to show the help message.
java -jar ISTEX-TermSuiteLauncher.jar
or
java -jar ISTEX-TermSuiteLauncher.jar [option1] [option...] -help
By default, Mate is used for PoS tagging and lemmatization. If you want to use TreeTagger instead, you need two options: "-treetaggerhome", set to the home directory of your TreeTagger installation, and "-lemmatizerpostagger", set to "treetagger". Here is an example:
java -jar ISTEX-TermSuiteLauncher.jar -corpus [dir path] -resources [dir path] -model [dir path] -output [file path] -lemmatizerpostagger treetagger -treetaggerhome /home/gael/programs/TreeTagger
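If you prefer to drive the launcher from a script rather than typing the command by hand, the invocations above can be assembled programmatically. The following is only a minimal sketch: the function name `build_command` and its parameters are hypothetical, and it uses only the flags that appear in this README (`-corpus`, `-resources`, `-model`, `-output`, `-lemmatizerpostagger`, `-treetaggerhome`). It assumes the jar sits in the current working directory.

```python
# Name of the jar as given in this README; assumed to be in the working directory.
JAR = "ISTEX-TermSuiteLauncher.jar"


def build_command(corpus, resources, output, model=None, treetagger_home=None):
    """Return the launcher command as an argument list (hypothetical helper).

    Only flags shown in this README are used. When treetagger_home is
    given, TreeTagger is selected instead of the default (Mate).
    """
    cmd = ["java", "-jar", JAR,
           "-corpus", corpus,
           "-resources", resources]
    if model is not None:
        cmd += ["-model", model]
    cmd += ["-output", output]
    if treetagger_home is not None:
        cmd += ["-lemmatizerpostagger", "treetagger",
                "-treetaggerhome", treetagger_home]
    return cmd
```

The returned list can be passed as-is to `subprocess.run(cmd, check=True)`, which avoids quoting problems with paths that contain spaces.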
The output always consists of a JSON file and a TSV file. The TSV file contains an index based on Weirdness Ratio and is created next to the JSON file. Its columns are: line number | Term(T) or Variant(V) | TermPilot | Lemmas | Frequency | Weirdness Ratio | SpottingRule
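To illustrate how the TSV file might be consumed, here is a small sketch that loads it and ranks terms by Weirdness Ratio. It rests on several assumptions not confirmed by this README: that fields are separated by tabs, that each row has exactly the seven columns listed above in that order, that the second column holds the literal `T` or `V`, and that the Weirdness Ratio parses as a decimal number. The names `COLUMNS`, `read_terms`, and `top_terms` are hypothetical.

```python
import csv

# Column keys chosen for this sketch, in the order listed in the README.
COLUMNS = ["line", "type", "term_pilot", "lemmas",
           "frequency", "weirdness_ratio", "spotting_rule"]


def read_terms(lines):
    """Parse an iterable of TSV lines into a list of dicts.

    Rows without exactly seven fields, or whose Weirdness Ratio is not
    numeric (for instance a header row, if there is one), are skipped.
    """
    rows = []
    for record in csv.reader(lines, delimiter="\t"):
        if len(record) != len(COLUMNS):
            continue  # blank or malformed line
        row = dict(zip(COLUMNS, record))
        try:
            row["weirdness_ratio"] = float(row["weirdness_ratio"])
        except ValueError:
            continue  # non-numeric ratio: skip the row
        rows.append(row)  # other fields are kept as text
    return rows


def top_terms(rows, n=10):
    """Return the n terms (type "T", variants excluded) with the highest ratio."""
    terms = [r for r in rows if r["type"] == "T"]
    return sorted(terms, key=lambda r: r["weirdness_ratio"], reverse=True)[:n]
```

In practice, `read_terms` would be given an open file handle, for example `open(path, newline="", encoding="utf-8")`, where `path` points to the TSV file created next to the JSON output.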
gael dot guibon at gmail.com, gael dot guibon at inist.fr, istex at inist.fr
@2015 ISTEX INIST-CNRS