TASS 2016

Welcome to the 5th evaluation workshop for sentiment analysis focused on Spanish. TASS 2016 will be held as part of the 32nd SEPLN Conference in Salamanca, Spain, on September 14th, 2016. You are invited to attend the workshop, take part in the proposed tasks and visit this beautiful city!

TASS2016 Proceedings - CEUR Vol 1702

Welcome to TASS 2016!

TASS is an experimental evaluation workshop for sentiment analysis and online reputation analysis focused on the Spanish language, organized as a satellite event of the annual conference of the Spanish Society for Natural Language Processing (SEPLN). After four successful previous editions, TASS 2016 will take place on September 14th, 2016 at the University of Salamanca, Spain.

The aim of TASS is to provide a forum for discussion and communication where the latest research work and developments in the field of sentiment analysis in social media, specifically focused on the Spanish language, can be shown and discussed by the scientific and business communities. The main objective is to promote the application of state-of-the-art algorithms and techniques for sentiment analysis to short opinion texts extracted from social media messages (specifically Twitter).

Several challenge tasks are proposed, intended to provide a benchmark forum for comparing the latest approaches in these fields. In addition, with the creation and release of the fully tagged corpus, we aim to provide a benchmark dataset that enables researchers to compare their algorithms and systems.

Tasks

It is important to continue with traditional sentiment analysis at the global level, using a new corpus. Moreover, we want to foster research on fine-grained polarity analysis at the aspect level (aspect-based sentiment analysis), one of the new requirements of the natural language processing market in these areas.

Participants are expected to submit up to 3 results of different experiments for one or both of these tasks, in the appropriate format described below.

Along with the submission of experiments, participants will be invited to submit a paper to the workshop in order to describe their experiments and discuss the results with the audience in a regular workshop session. More information about format and requirements will be provided soon.

Information for submissions

Submissions must be made through the following page, using the provided username and password:

http://tass.sepln.org/2016/private/evaluate.php

There you must select the task and fill in the name of your group, the run ID and the run file, and the system will automatically check and evaluate your submission according to the defined metrics and keep a history of everything.

If you want to resubmit your experiment, just use the same group name and run ID.

Please note that the list of submissions is public and open to all participants.

You may submit experiments at any time, but only runs submitted up to and including August 18th count as valid official runs.

Call for Papers

All participants are invited to submit a paper describing the key aspects of their systems and discussing their results. The papers will be reviewed by a scientific committee, and only the accepted papers will be published at CEUR.

Depending on the final number of participants and the time slot allocated for the workshop, all or a selected group of papers will be presented and discussed in the workshop session.

The manuscripts must satisfy the following rules:

Contributions may be at most 6 DIN A4 pages long, including references and figures.
Articles can be written in English or Spanish. The title, abstract and keywords must be written in both languages.
The document must be prepared in Word or LaTeX, but the submission must be in PDF format. The required template is available on the SEPLN website.

Instead of describing the task and/or the corpus, focus on the description of your experiments and the analysis of your results, and include a citation to the Overview paper.

More information will be provided soon.

Task 1: Sentiment Analysis at global level

This task consists of performing automatic sentiment analysis to determine the global polarity of each message in the provided test set of the General corpus (see below). This task is a re-edition of the task from previous years. Participants will be provided with the training set of the General corpus so that they may train and validate their models.

There will be two different evaluations: one based on 6 different polarity labels (P+, P, NEU, N, N+, NONE) and another based on just 4 labels (P, N, NEU, NONE).

Participants are expected to submit up to 3 experiments for the 6-label evaluation, and may also submit up to 3 specific experiments for the 4-label scenario.
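As an illustration only, a 6-label run could be collapsed into a 4-label run as sketched below. The mapping of P+ to P and of N+ to N is an assumption (the text above does not state how the two label sets relate), and the function name is hypothetical.

```python
# Assumed mapping from the 6-label set to the 4-label set:
# strong polarities are merged into their plain counterparts.
SIX_TO_FOUR = {
    "P+": "P", "P": "P",
    "NEU": "NEU",
    "N": "N", "N+": "N",
    "NONE": "NONE",
}


def collapse_to_four_labels(run):
    """Map a {tweetid: 6-label polarity} dict to the 4-label set."""
    return {tweetid: SIX_TO_FOUR[label] for tweetid, label in run.items()}
```

A system tuned specifically for the 4-label scenario may behave differently from a collapsed 6-label run, which is presumably why separate submissions are allowed.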

Accuracy (correct tweet polarity according to the gold standard) will be used for ranking the systems. The confusion matrix will be generated and then used to evaluate the precision, recall and F1-measure for each individual category (polarity). Macroaveraged precision, recall and F1-measure will also be calculated for the whole run.
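The metrics described above can be sketched as follows. This is not the official evaluation script: the function name is hypothetical, and the sketch assumes that missing predictions count as errors, that a zero denominator yields a score of 0, and that macroaveraged F1 is the mean of the per-label F1 values.

```python
from collections import Counter

LABELS = ["P+", "P", "NEU", "N", "N+", "NONE"]


def evaluate(gold, pred, labels=LABELS):
    """Score a run; gold and pred map tweetid -> polarity label."""
    # Confusion counts keyed by (gold label, predicted label).
    # A tweet missing from pred gets None, which never matches a label.
    confusion = Counter((gold[t], pred.get(t)) for t in gold)
    correct = sum(n for (g, p), n in confusion.items() if g == p)
    accuracy = correct / len(gold)

    per_label = {}
    for label in labels:
        tp = confusion[(label, label)]
        predicted = sum(n for (g, p), n in confusion.items() if p == label)
        actual = sum(n for (g, p), n in confusion.items() if g == label)
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_label[label] = (precision, recall, f1)

    # Macroaverage: unweighted mean over labels of each metric.
    macro = tuple(
        sum(scores[i] for scores in per_label.values()) / len(labels)
        for i in range(3)
    )
    return accuracy, per_label, macro
```

For the 4-label evaluation, the same function can be called with `labels=["P", "NEU", "N", "NONE"]`.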

Results must be submitted in a plain text file with the following format:

tweetid \t polarity

where polarity can be:

P+, P, NEU, N, N+ and NONE for the 6-label case
P, NEU, N and NONE for the 4-label case.
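A minimal sketch of writing and checking a run file in this format is shown below. The function names are hypothetical, and the checks (exactly two tab-separated fields, a polarity from the allowed set) are assumptions about what a well-formed file looks like rather than the rules applied by the submission system.

```python
import io

SIX_LABELS = {"P+", "P", "NEU", "N", "N+", "NONE"}
FOUR_LABELS = {"P", "NEU", "N", "NONE"}


def write_run(predictions, stream, allowed=SIX_LABELS):
    """Write (tweetid, polarity) pairs as tab-separated lines."""
    for tweetid, polarity in predictions:
        if polarity not in allowed:
            raise ValueError(f"unknown polarity {polarity!r} for tweet {tweetid}")
        stream.write(f"{tweetid}\t{polarity}\n")


def read_run(stream, allowed=SIX_LABELS):
    """Read a run file back into a {tweetid: polarity} dict, validating it."""
    run = {}
    for line_number, line in enumerate(stream, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # tolerate blank lines
        fields = line.split("\t")
        if len(fields) != 2 or fields[1] not in allowed:
            raise ValueError(f"line {line_number}: malformed record {line!r}")
        run[fields[0]] = fields[1]
    return run


# Round trip through an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_run([("0000000000", "P+"), ("0000000001", "P")], buffer)
run = read_run(io.StringIO(buffer.getvalue()))
```

Passing `allowed=FOUR_LABELS` applies the same checks to a 4-label run.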

The test corpus will be used for the evaluation, to allow for comparison among systems.

Task 2: Aspect-based sentiment analysis

Participants will be provided with a corpus tagged with a series of aspects, and systems must identify the polarity at the aspect level. Two corpora will be provided: the Social-TV corpus and the STOMPOL corpus, both used last year. Each corpus has been split into a training set, for building and validating the systems, and a test set, for evaluation.

Participants are expected to submit up to 3 experiments for each corpus, each in a plain text file with the following format:

tweetid \t aspect \t polarity

Allowed polarity values are P, NEU and N.

For evaluation, a single label combining "aspect-polarity" will be considered. As in the first task, accuracy will be used for ranking the systems; precision, recall and F1-measure will be used to evaluate each individual category ("aspect-polarity" label); and macroaveraged precision, recall and F1-measure will also be calculated for the global result.
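The combined "aspect-polarity" label can be illustrated with the sketch below. It assumes one record per (tweetid, aspect) pair and joins the two fields with a hyphen; how the official evaluator aligns records when an aspect occurs more than once in a tweet is not specified here, and the function names are hypothetical.

```python
POLARITIES = {"P", "NEU", "N"}


def read_aspect_run(lines):
    """Parse 'tweetid<TAB>aspect<TAB>polarity' records into combined labels."""
    run = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        tweetid, aspect, polarity = line.split("\t")
        if polarity not in POLARITIES:
            raise ValueError(f"unknown polarity {polarity!r} in {line!r}")
        # Assumption: at most one record per (tweetid, aspect) pair.
        run[(tweetid, aspect)] = f"{aspect}-{polarity}"
    return run


def accuracy(gold, pred):
    """Fraction of gold records whose combined label is predicted exactly."""
    correct = sum(1 for key, label in gold.items() if pred.get(key) == label)
    return correct / len(gold)


gold = read_aspect_run(["1\tEducacion\tNEU\n", "2\tEconomia\tN\n"])
pred = read_aspect_run(["1\tEducacion\tNEU\n", "2\tEconomia\tP\n"])
score = accuracy(gold, pred)
```

Per-label precision, recall and F1 can then be computed over the combined labels in the same way as for the polarity labels of the first task.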

Corpus

General Corpus

Important notice: The delay in the start of TASS 2016 was due to the interest in working with a new general corpus. This new corpus has passed through all stages of its development, filtering and manual labeling, but the preliminary evaluation has not been satisfactory, so the TASS 2016 organization has chosen to work in 2016 with the same general corpus as in previous editions.

This general corpus contains over 68 000 Twitter messages, written in Spanish by about 150 well-known personalities and celebrities of the world of politics, economy, communication, mass media and culture, between November 2011 and March 2012. Although the context of extraction has a Spain-focused bias, the diverse nationality of the authors, including people from Spain, Mexico, Colombia, Puerto Rico, USA and many other countries, makes the corpus reach a global coverage in the Spanish-speaking world.

The general corpus has been divided into two sets: training (about 10%) and test (about 90%). The training set will be released so that participants may train and validate their models. The test corpus will be provided without any tagging and will be used to evaluate the results provided by the different systems. Test data from previous years may not be used to train the systems.

Each message in both the training and test set is tagged with its global polarity, indicating whether the text expresses a positive, negative or neutral sentiment, or no sentiment at all. A set of 6 labels has been defined: strong positive (P+), positive (P), neutral (NEU), negative (N), strong negative (N+) and one additional no sentiment tag (NONE).

In addition, there is also an indication of the level of agreement or disagreement of the expressed sentiment within the content, with two possible values: AGREEMENT and DISAGREEMENT. This is especially useful to make out whether a neutral sentiment comes from neutral keywords or else the text contains positive and negative sentiments at the same time.

Moreover, the polarity at entity level, i.e., the polarity values related to the entities that are mentioned in the text, is also included for those cases when applicable. These values are similarly tagged with 6 possible values and include the level of agreement as related to each entity.

In addition, a set of topics has been selected based on the thematic areas covered by the corpus, such as "política" ("politics"), "fútbol" ("soccer"), "literatura" ("literature") or "entretenimiento" ("entertainment"). Each message in both the training and test set has been assigned to one or several of these topics (most messages are associated with just one topic, due to the short length of the text).

All tagging has been done semiautomatically: a baseline machine learning model is first run and then all tags are manually checked by human experts. In the case of the polarity at entity level, due to the high volume of data to check, this tagging has just been done for the training set.

The following figure shows the information for two sample tweets. The first tweet is tagged only with the global polarity, as the text contains no mentions of any entity, but the second one is tagged with both the global polarity of the message and the polarity associated with each of the entities that appear in the text (UPyD and Foro Asturias).

        <tweet>
          <tweetid>0000000000</tweetid>
          <user>usuario0</user>
          <content><![CDATA['Conozco a alguien q es adicto al drama! Ja ja ja te suena d algo!]]></content>
          <date>2011-12-02T02:59:03</date>
          <lang>es</lang>
          <sentiments>
            <polarity><value>P+</value><type>AGREEMENT</type></polarity>
          </sentiments>
          <topics>
            <topic>entretenimiento</topic>
          </topics>
        </tweet>
        <tweet>
          <tweetid>0000000001</tweetid>
          <user>usuario1</user>
          <content><![CDATA['UPyD contará casi seguro con grupo gracias al Foro Asturias.]]></content>
          <date>2011-12-02T00:21:01</date>
          <lang>es</lang>
          <sentiments>
            <polarity><value>P</value><type>AGREEMENT</type></polarity>
            <polarity><entity>UPyD</entity><value>P</value><type>AGREEMENT</type></polarity>
            <polarity><entity>Foro_Asturias</entity><value>P</value><type>AGREEMENT</type></polarity>
          </sentiments>
          <topics>
            <topic>política</topic>
          </topics>
        </tweet>
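A record like the second sample can be read with Python's standard `xml.etree.ElementTree` module, as in the sketch below. The enclosing `<tweets>` root element and the `parse_tweet` helper are assumptions made for illustration; the actual root element of the corpus files is not shown in the figure.

```python
import xml.etree.ElementTree as ET

# The second sample tweet, wrapped in a hypothetical <tweets> root element.
SAMPLE = """<tweets>
  <tweet>
    <tweetid>0000000001</tweetid>
    <user>usuario1</user>
    <content><![CDATA['UPyD contará casi seguro con grupo gracias al Foro Asturias.]]></content>
    <date>2011-12-02T00:21:01</date>
    <lang>es</lang>
    <sentiments>
      <polarity><value>P</value><type>AGREEMENT</type></polarity>
      <polarity><entity>UPyD</entity><value>P</value><type>AGREEMENT</type></polarity>
      <polarity><entity>Foro_Asturias</entity><value>P</value><type>AGREEMENT</type></polarity>
    </sentiments>
    <topics>
      <topic>política</topic>
    </topics>
  </tweet>
</tweets>"""


def parse_tweet(node):
    """Extract global polarity, entity polarities and topics from a <tweet>."""
    global_polarity = None
    entity_polarities = {}
    for polarity in node.findall("sentiments/polarity"):
        value = polarity.findtext("value")
        agreement = polarity.findtext("type")
        entity = polarity.findtext("entity")
        if entity is None:
            # No <entity> child: this is the message-level polarity.
            global_polarity = (value, agreement)
        else:
            entity_polarities[entity] = (value, agreement)
    return {
        "tweetid": node.findtext("tweetid"),
        "content": node.findtext("content"),
        "polarity": global_polarity,
        "entities": entity_polarities,
        "topics": [topic.text for topic in node.findall("topics/topic")],
    }


tweets = [parse_tweet(node) for node in ET.fromstring(SAMPLE).iter("tweet")]
```

For untagged test records the `polarity` field stays `None` and `entities` stays empty, since no `<polarity>` elements are found.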

STOMPOL Corpus

STOMPOL (corpus of Spanish Tweets for Opinion Mining at aspect level about POLitics) is a corpus of Spanish tweets prepared for research on the challenging task of opinion mining at the aspect level. The tweets were gathered on the 23rd and 24th of April, and each is related to one of the following aspects that appear in political campaigns:

Economia (Economics): taxes, infrastructure, markets, labor policy...
Sanidad (Health System): hospitals, public/private health system, drugs, doctors...
Educacion (Education): state school, private school, scholarships...
Propio_partido (Political party): anything good (speeches, electoral programme...) or bad (corruption, criticism) related to the entity
Otros_aspectos (Other aspects): electoral system, environmental policy...

Each aspect is related to one or several entities (separated by a pipe, |), each of which corresponds to one of the main political parties in Spain:

Partido_Popular (PP)
Partido_Socialista_Obrero_Español (PSOE)
Izquierda_Unida (IU)
Podemos
Ciudadanos (Cs)
Unión_Progreso_y_Democracia (UPyD)

Each tweet in the corpus has been manually tagged with the sentiment polarity at the aspect level by two different annotators, and by a third one in case of disagreement. Sentiment polarity has been tagged from the point of view of the person who writes the tweet, using 3 levels: P, NEU and N. No distinction is made between no sentiment and a neutral sentiment (neither positive nor negative).

Each political aspect is linked to its corresponding political party and its polarity.

Some examples are shown in the following figure:

<tweet id="591267548311769088">@ahorapodemos @Pablo_Iglesias_ @SextaNocheTV Que alguien pregunte si habrá cambios en las <sentiment aspect="Educacion" entity="Podemos" polarity="NEU">becas</sentiment> MEC para universitarios, por favor.</tweet>

<tweet id="591192167944736769">#Arroyomolinos lo que le interesa al ciudadano son Políticos cercanos que se interesen y preocupen por sus problemas <sentiment aspect="Propio_partido" entity="Union_Progreso_y_Democracia" polarity="P">@UPyD</sentiment> VECINOS COMO TU</tweet>
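The inline annotations in these examples can be extracted with the standard `xml.etree.ElementTree` module, as sketched below. The helper name is hypothetical, and splitting the `entity` attribute on the pipe character is an assumption based on the description of multiple entities above.

```python
import xml.etree.ElementTree as ET

# First example tweet from the figure above.
SAMPLE = (
    '<tweet id="591267548311769088">@ahorapodemos @Pablo_Iglesias_ '
    '@SextaNocheTV Que alguien pregunte si habrá cambios en las '
    '<sentiment aspect="Educacion" entity="Podemos" polarity="NEU">becas'
    '</sentiment> MEC para universitarios, por favor.</tweet>'
)


def parse_stompol_tweet(xml_text):
    """Return (tweet id, plain text, list of aspect annotations)."""
    node = ET.fromstring(xml_text)
    annotations = []
    for sentiment in node.iter("sentiment"):
        annotations.append({
            "span": sentiment.text,
            "aspect": sentiment.get("aspect"),
            # Assumption: several entities are packed into one attribute.
            "entities": sentiment.get("entity", "").split("|"),
            "polarity": sentiment.get("polarity"),
        })
    # itertext() yields the text before, inside and after each annotation.
    text = "".join(node.itertext())
    return node.get("id"), text, annotations


tweet_id, text, annotations = parse_stompol_tweet(SAMPLE)
```

If the test set omits the `polarity` attribute, `sentiment.get("polarity")` simply returns `None`.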

The corpus is composed of 1284 tweets, and has been split into a training set (784 tweets), which is provided for building and validating the systems, and a test set (500 tweets), which will be used for evaluation.

Important Dates

~~July 12th, 2016~~	Release of training and test corpora (General and STOMPOL).
August ~~18th~~ 23rd, 2016	Experiment submission and evaluation.
August ~~27th~~ 31st, 2016	Submission of papers.
September 13th, 2016	Workshop.

Registration

Please send an email to tass AT sngularmeaning.team, camara AT ukp.informatik.tu-darmstadt.de or magc AT ujaen.es, including the completed TASS Corpus License agreement with your email and affiliation (institution, company or any other kind of organization). We will then send you a username and password to access the protected area of the website, where the corpora can be downloaded.

All corpora will be made freely available to the community after the workshop.

If you use the corpus in your research (papers, articles, presentations for conferences or educational purposes), please include a citation to one of the following publications:

Martínez-Cámara, E., García-Cumbreras, M.A., Villena-Román, J., & García-Morera, J. (2016). TASS 2015 - The Evolution of the Spanish Opinion Mining Systems. Procesamiento del Lenguaje Natural, 56. http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/issue/view/218.
Villena-Román, J., Martínez-Cámara, E., García-Morera, J. & Jiménez-Zafra, S. (2015). TASS 2014 - The Challenge of Aspect-based Sentiment Analysis. Procesamiento del Lenguaje Natural, 54. http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/5095.
Villena-Román, J., García-Morera, J., Lana-Serrano, S., & González-Cristóbal, J.C. (2014). TASS 2013 - A Second Step in Reputation Analysis in Spanish. Procesamiento del Lenguaje Natural, 52. http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/4901.
Villena-Román, J., Lana-Serrano, S., Martínez-Cámara, E., & González-Cristóbal, J.C. (2013). TASS - Workshop on Sentiment Analysis at SEPLN. Procesamiento del Lenguaje Natural, 50. http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/article/view/4657.
TASS (Taller de Análisis de Sentimientos en la SEPLN) website. http://www.sngularmeaning.team/TASS.

Downloads

TASS 2016

general-tweets-train-tagged.xml : General corpus training set [3.5MB]
general-tweets-test.xml : General corpus test set (for task 1) [16.3MB]
general-tweets-test1k.xml : General corpus 1k test set (for task 1) [274.1KB]
stompol-tweets-train-tagged.xml : STOMPOL corpus training set [213.3KB]
stompol-tweets-test.xml : STOMPOL corpus test set (for task 2) [126KB]

Past editions

general-users-tagged.xml : General corpus user information, manually tagged with political orientation (TASS 2013 task 3) [102.2KB]
general-topics.qrel : QREL file for General corpus topic classification (TASS 2012-2014 task 2) [1.8MB]
politics2013-tweets-test-tagged.xml : Politics 2013 corpus, manually tagged (TASS 2013 task 4) [1.4MB]
politics2013.qrel : QREL file for Politics 2013 corpus (TASS 2013 task 4) [67KB]

Organization

Organizing Committee

Julio Villena-Román - Singular Meaning, Spain
Miguel Ángel García-Cumbreras - University of Jaen, Spain (SINAI-UJAEN)
Eugenio Martínez-Cámara - University of Jaen, Spain (SINAI-UJAEN)
Manuel Carlos Díaz-Galiano - University of Jaen, Spain (SINAI-UJAEN)
Janine García-Morera - Singular Meaning, Spain
L. Alfonso Ureña-López - University of Jaen, Spain (SINAI-UJAEN)
María-Teresa Martín-Valdivia - University of Jaen, Spain (SINAI-UJAEN)

Programme Committee

Alexandra Balahur - EC-Joint Research Centre, Italy
José Carlos Cortizo - European University of Madrid, Spain
José María Gómez-Hidalgo - Optenet, Spain
Carlos A. Iglesias-Fernández - Technical University of Madrid, Spain
Zornitsa Kozareva - Information Sciences Institute, USA
Sara Lana-Serrano - Technical University of Madrid, Spain
Ruslan Mitkov - University of Wolverhampton, U.K.
Andrés Montoyo - University of Alicante, Spain
Rafael Muñoz - University of Alicante, Spain
Constantin Orasan - University of Wolverhampton, U.K.
Mike Thelwall - University of Wolverhampton, U.K.
José Antonio Troyano - University of Seville, Spain
José Manuel Perea - University of Extremadura, Spain
María Teresa Taboada Gómez - Simon Fraser University, Canada
Ferran Pla Santamaría - Polytechnic University of Valencia, Spain
Lluís F. Hurtado - Polytechnic University of Valencia, Spain