LX-Lemmatizer

Developed at the University of Lisbon, Dept. of Informatics, by the NLX-Natural Language and Speech Group.


LX-Lemmatizer

LX-Lemmatizer (beta version) is a freely available online service for fully-fledged lemmatization of Portuguese verbs. It was developed and is maintained by the NLX-Natural Language and Speech Group at the University of Lisbon, Department of Informatics.

You may also be interested in our LX-Suite online service for the shallow processing of Portuguese.

Features

LX-Lemmatizer takes a Portuguese verb form and delivers all the corresponding lemmata (infinitive forms) together with the inflectional feature values. Lemmata that are less likely, but still orthographically possible, are grouped together in a last section under the header "Other possible lemmata".
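The shape of such a result can be illustrated with a small sketch. It is not the service's implementation or API: the names `KNOWN_INFINITIVES`, `Analysis` and `lemmatize` are hypothetical, the lexicon is a three-verb toy, and only one regular pattern is covered (the first-person singular present indicative in -o, which is shared by the -ar, -er and -ir conjugations). It also assumes that a "less likely" lemma is simply one missing from the list of known verbs, which the description above does not state.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon of known infinitives (an assumption for this sketch).
KNOWN_INFINITIVES = {"cantar", "comer", "partir"}

# The only pattern covered here: regular forms ending in -o.
FEATURES = "present indicative, 1st person singular"


@dataclass(frozen=True)
class Analysis:
    lemma: str      # candidate infinitive
    features: str   # inflectional feature values of the input form


def lemmatize(form: str) -> dict[str, list[Analysis]]:
    """Return candidate lemmata for a verb form, split into two groups."""
    result = {"lemmata": [], "other_possible_lemmata": []}
    if not form.endswith("o"):
        return result  # pattern not covered by this sketch
    stem = form[:-1]
    # Rebuild one candidate infinitive per conjugation class.
    for ending in ("ar", "er", "ir"):
        analysis = Analysis(stem + ending, FEATURES)
        # Assumption: candidates outside the lexicon count as "less likely".
        if analysis.lemma in KNOWN_INFINITIVES:
            result["lemmata"].append(analysis)
        else:
            result["other_possible_lemmata"].append(analysis)
    return result
```

For the form "canto", this sketch puts "cantar" in the main group and the orthographically possible "canter" and "cantir" in the second group.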

At the time of its launch (November 2005), it was the first freely available online service for fully-fledged Portuguese verb lemmatization, including the full range of pronominal conjugation forms. It thus handles:

 

Additionally, LX-Lemmatizer exhaustively handles a set of inflection cases which tend not to be supported together in verbal lemmatizers:

 

LX-Lemmatizer handles both known verbs and unknown verbs. It thus lemmatizes:

 

It is also worth noting the following design principles that LX-Lemmatizer adopts with respect to the so-called defective verbs:

 

LX-Lemmatizer handles the very few cases where there may be different forms in different variants:

 

To improve usability, LX-Lemmatizer adopts the following scheme concerning the position of clitics:

Authorship

LX-Lemmatizer is being developed by António Branco and Filipe Nunes, with the help of Francisco Costa, of the NLX-Natural Language and Speech Group, at the University of Lisbon, Department of Informatics.

Acknowledgments

The work leading to the LX-Lemmatizer was partly supported by FCT-Fundação para a Ciência e Tecnologia under the grant POSI/PLP/47058/2002 for the project TagShare.

White Papers

Branco, António, Filipe Nunes and João Silva, 2006, Verb Analysis in an Inflective Language: Simpler is better, Internal report, University of Lisbon, Department of Informatics.

Branco, António, Francisco Costa and Filipe Nunes, 2006, Processing of Verb Inflectional Ambiguity: Towards a Problem Space Delimitation, Internal report, University of Lisbon, Department of Informatics.

Contact us

Contact us using the following email address: 'nlxgroup' concatenated with 'at' concatenated with 'di.fc.ul.pt'.

Why LX-Lemmatizer?

LX because LX is the "code" name Lisboners like to use to refer to their hometown.