Download and usage

The Tint command line tool is packaged as a self-contained, cross-platform Java 8 application. In order to install it, download the following package and unpack it in the location you prefer:

tint-0.3-complete.tar.gz (latest stable version, 2021-04-19)
old versions: 0.1 | 0.2

If you want all the new features, you can use the latest development version (see below).

Compiling the latest version

New features are sometimes added to Tint in the develop branch, and it may take some time before they reach the stable version.

If you want to try the latest development version of the tool, run the following commands:

[bash]
mkdir tint-develop
cd tint-develop
git clone https://github.com/fbk/utils
cd utils
git checkout develop
mvn clean install
cd ..
git clone https://github.com/dhfbk/tint
cd tint
git checkout develop
mvn clean package -Pcomplete
[/bash]

You’ll find the file tint-runner-[version]-bin.tar.gz in the tint-runner/target folder.

Tint Java API

Tint can easily be included in an existing Java project using Maven.

In the pom.xml file, add the following dependency:

<dependency>
<groupId>eu.fbk.dh</groupId>
<artifactId>tint-runner</artifactId>
<version>0.2</version>
</dependency>

The following example shows how to instantiate it in a Java project:

[java]
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import edu.stanford.nlp.pipeline.Annotation;
import eu.fbk.dh.tint.runner.TintPipeline;
import eu.fbk.dh.tint.runner.TintRunner;

public class TintExample {
    public static void main(String[] args) throws Exception {
        // Initialize the Tint pipeline
        TintPipeline pipeline = new TintPipeline();
        // Load the default properties
        // see https://github.com/dhfbk/tint/blob/master/tint-runner/src/main/resources/default-config.properties
        pipeline.loadDefaultProperties();
        // Add a custom property
        // pipeline.setProperty("my_property", "my_value");
        // Load the models
        pipeline.load();
        // Use for example a text in a String
        String text = "I topi non avevano nipoti.";
        // Get the original Annotation (Stanford CoreNLP)
        Annotation stanfordAnnotation = pipeline.runRaw(text);
        // **or**
        // Get the JSON
        // (optionally getting the original Stanford CoreNLP Annotation as return value)
        InputStream stream = new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
        Annotation annotation = pipeline.run(stream, System.out, TintRunner.OutputFormat.JSON);
    }
}
[/java]

Command-line usage

Installing Tint is quite straightforward, as it only needs to be downloaded and uncompressed. On a Linux/Mac shell, just run these commands:

[bash]
wget http://www.airpedia.org/tint/1.0-SNAPSHOT/tint-runner-1.0-SNAPSHOT-bin.tar.gz
tar xzf tint-runner-1.0-SNAPSHOT-bin.tar.gz
cd tint
./tint.sh [options]
[/bash]

where [options] is documented as follows:

[code]
-c,--config-file [FILE]      Configuration file
   --debug                   enable verbose output
-f,--output-format [FORMAT]  Output format: textpro, json, xml, conll, naf,
                             readable (default conll)
-h,--help                    display this help message and terminate
-i,--input-file [FILE]       Input text file (default stdin)
-o,--output-file [FILE]      Output processed file (default stdout)
   --properties [PROPS]      Additional properties for Stanford CoreNLP
   --trace                   enable very verbose output
-v,--version                 display version information and terminate
[/code]

If the -i option is missing, the standard input is used instead. Similarly, if the -o option is not present, the standard output is used.

The -c option can be used to add a configuration file written for Stanford CoreNLP. The settings in this file are added to those in the default configuration file (default-config.properties in the resources folder). If you only want to add a single property, you can use the --properties option instead.

The priority is as follows: first the default properties are loaded; then the custom config file is loaded (-c option); finally, the additional properties (--properties option) are loaded. Whenever a property with a particular key is loaded, the previous one with the same key is overwritten.
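The override order described above can be illustrated with plain java.util.Properties objects. This is only a sketch of the layering idea, not Tint's actual loading code: the class name PropertyLayers, the merge helper and the property values are made up for illustration.

```java
import java.util.Properties;

// Hypothetical helper, not part of Tint: it only illustrates the override order.
class PropertyLayers {

    /** Merges property sets in order; a later layer overwrites earlier values with the same key. */
    static Properties merge(Properties... layers) {
        Properties merged = new Properties();
        for (Properties layer : layers) {
            merged.putAll(layer);
        }
        return merged;
    }

    public static void main(String[] args) {
        // Layer 1: stands in for default-config.properties (values are placeholders)
        Properties defaults = new Properties();
        defaults.setProperty("annotators", "value-from-defaults");
        defaults.setProperty("other_key", "only-in-defaults");

        // Layer 2: stands in for the file passed with -c
        Properties configFile = new Properties();
        configFile.setProperty("annotators", "value-from-config-file");

        // Layer 3: stands in for the --properties option
        Properties commandLine = new Properties();
        commandLine.setProperty("annotators", "value-from-command-line");

        Properties merged = merge(defaults, configFile, commandLine);
        System.out.println(merged.getProperty("annotators")); // value-from-command-line
        System.out.println(merged.getProperty("other_key"));  // only-in-defaults
    }
}
```

Keys that appear only in an earlier layer survive the merge, which is why a custom configuration file needs to list only the properties it changes.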

The models and the properties file for tokenization, sentence splitting, part-of-speech tagging, named-entity recognition, lemmatization and parsing are already included in the default configuration, so you can simply run ./tint.sh to have Tint read from the standard input and write to the standard output. The wrappers for entity linking, time expression extraction and keyword extraction are not activated by default because they need some additional configuration and software.

A super-quick example:

[code]
./tint.sh [enter]
[type text, including newlines]
[Ctrl+D]
[/code]

will output the result of the text analysis in a readable format.

With this usage, the models are loaded only after the Ctrl+D sequence, so you will have to wait a few seconds before the text is processed. You can also parse a plain text file using the command

[bash]
cat /path/to/plain.txt | ./tint.sh [enter]
[/bash]

If you don’t want to wait, you can load the models once, by importing Tint as a Maven module in an existing project or by running it as a server.
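When an occasional call is enough and the model-loading delay is acceptable, the pipe-based invocation shown above can also be driven from Java code. The following is a minimal sketch using only the standard library: the class StdinRunner and its run method are hypothetical names, and it assumes the program is started from the folder that contains tint.sh.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Tint.
class StdinRunner {

    /** Runs a command with the given text on its standard input and returns its standard output. */
    static String run(List<String> command, String text) {
        try {
            // Stage the text in a temporary file that the child process reads as stdin
            File input = File.createTempFile("tint-input", ".txt");
            try {
                Files.write(input.toPath(), text.getBytes(StandardCharsets.UTF_8));
                Process process = new ProcessBuilder(command)
                        .redirectInput(input)
                        .redirectError(ProcessBuilder.Redirect.INHERIT) // log messages go to our stderr
                        .start();
                ByteArrayOutputStream output = new ByteArrayOutputStream();
                try (InputStream stdout = process.getInputStream()) {
                    byte[] buffer = new byte[8192];
                    int read;
                    while ((read = stdout.read(buffer)) != -1) {
                        output.write(buffer, 0, read);
                    }
                }
                process.waitFor();
                return new String(output.toByteArray(), StandardCharsets.UTF_8);
            } finally {
                input.delete();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the process", e);
        }
    }

    public static void main(String[] args) {
        try {
            // -f json selects one of the output formats listed in the options above
            String result = run(Arrays.asList("./tint.sh", "-f", "json"), "I topi non avevano nipoti.");
            System.out.println(result);
        } catch (UncheckedIOException e) {
            System.err.println("Could not run tint.sh: " + e.getMessage());
        }
    }
}
```

Each call starts a new process and reloads the models, so this approach suits occasional use only.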

Running Tint as a web server

After uncompressing the Tint package (see above), you can use the tint-server.sh script to run it as a server. The syntax is:

[bash]
./tint-server.sh [options]
[/bash]

where [options] is documented as follows:

[code]
-c,--config [FILE]       Configuration file
   --debug               enable verbose output
-h,--help                display this help message and terminate
-p,--port [NUM]          Host port (default 8012)
   --properties [PROPS]  Additional properties
   --trace               enable very verbose output
-v,--version             display version information and terminate
[/code]

The -c and --properties options work as they do in the tint.sh script. The -p option sets the listening port of the Tint server.

Once the server is loaded (the line INFO: [HttpServer] Started. will appear on standard output), you can test it by opening http://localhost:8012/tint?text=[text]&format=[format] in a browser, where [text] is a text in Italian and [format] is the output format (see the ./tint.sh documentation above).

For instance, opening

[code]
http://localhost:8012/tint?text=Barack%20Obama%20era%20il%20presidente%20degli%20Stati%20Uniti%20d%27America.
[/code]

will result in a JSON document containing all the desired annotations.

If you set the -p option, replace 8012 in the above example with the given port number.
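The same request can be sent from a program instead of a browser. The sketch below uses only the Java standard library and assumes a Tint server is listening on localhost:8012. The class TintServerClient and its methods buildUrl and fetch are hypothetical names, not part of Tint.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical client, not part of Tint.
class TintServerClient {

    /** Builds the /tint query URL, percent-encoding the text (spaces become %20). */
    static String buildUrl(String host, int port, String text, String format) {
        try {
            String encoded = URLEncoder.encode(text, StandardCharsets.UTF_8.name())
                    .replace("+", "%20"); // URLEncoder encodes spaces as '+'
            return "http://" + host + ":" + port + "/tint?text=" + encoded + "&format=" + format;
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Sends a GET request and returns the response body as a string. */
    static String fetch(String url) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
        connection.setRequestMethod("GET");
        StringBuilder body = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                body.append(line).append('\n');
            }
        } finally {
            connection.disconnect();
        }
        return body.toString();
    }

    public static void main(String[] args) {
        String url = buildUrl("localhost", 8012,
                "Barack Obama era il presidente degli Stati Uniti d'America.", "json");
        System.out.println(url);
        try {
            System.out.println(fetch(url));
        } catch (IOException e) {
            // Happens when no server is running on the given port
            System.err.println("Tint server not reachable: " + e.getMessage());
        }
    }
}
```

Because the server keeps the models in memory, repeated calls to fetch avoid the loading delay of the command-line script.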