LX-DepParser

Developed at the University of Lisbon, Dept. of Informatics, by the NLX-Natural Language and Speech Group.


Features


LX Dependency Parser

LX-DepParser (beta) is a free online service for the syntactic analysis of Portuguese. It automatically parses Portuguese sentences in terms of the grammatical functions of their words.

This service was developed and is maintained at the University of Lisbon by the NLX-Natural Language and Speech Group, Department of Informatics.

Parser

LX-DepParser is an instance of MSTParser trained on Portuguese data.

The parser was trained on 22,118 sentences (250,056 word tokens) taken from the CINTIL-Treebank. This treebank is being developed and maintained at the University of Lisbon by the NLX-Natural Language and Speech Group of the Department of Informatics.

In terms of evaluation, LX-DepParser's UAS (unlabeled attachment score) is 94.42 and its LAS (labeled attachment score) is 91.23. UAS is the share of tokens attached to the correct head; LAS additionally requires the correct relation label. Scores were obtained through 10-fold cross-validation.
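The two attachment scores can be illustrated with a minimal Python sketch. It assumes each token is reduced to a (head index, relation label) pair and that every token counts toward the score; whether the reported figures exclude punctuation is not stated here. The function name, the toy data, and the "ROOT" label are hypothetical and serve only as an illustration.

```python
from typing import List, Tuple

# One token = (index of its head, relation label). Head 0 stands for the root.
Token = Tuple[int, str]


def attachment_scores(gold: List[Token], predicted: List[Token]) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over two aligned token lists."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same number of tokens")
    if not gold:
        raise ValueError("cannot score an empty token list")
    # UAS: the head alone must match.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    # LAS: head and label must both match.
    full_hits = sum(g == p for g, p in zip(gold, predicted))
    total = len(gold)
    return 100.0 * head_hits / total, 100.0 * full_hits / total


# Invented toy data: token 4 gets a wrong head, token 5 a wrong label.
# "ROOT" is a placeholder label, not part of the tagset on this page.
gold = [(2, "SP"), (3, "SJ"), (0, "ROOT"), (5, "SP"), (3, "DO")]
pred = [(2, "SP"), (3, "SJ"), (0, "ROOT"), (3, "SP"), (3, "M")]

uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}, LAS = {las:.2f}")  # UAS = 80.00, LAS = 60.00
```

In a 10-fold cross-validation setting, such scores would be computed on each held-out fold and then aggregated; the aggregation used for the figures above is not described on this page.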

Tagset

Part-of-speech tags (high granularity)

Tag     Meaning
A       Adjective
AP      Adjective Phrase
ADV     Adverb
ADVP    Adverb Phrase
C       Complementizer
CP      Complementizer Phrase
CARD    Cardinal
CONJ    Conjunction
CONJP   Conjunction Phrase
D       Determiner
DEM     Demonstrative
N       Noun
NP      Noun Phrase
P       Preposition
PP      Prepositional Phrase
POSS    Possessive
QNT     Predeterminer
S       Sentence
V       Verb
VP      Verb Phrase

Part-of-speech tags (low granularity)

Tag            Category                                     Examples
ADJ            Adjectives                                   bom, brilhante, eficaz, …
ADV            Adverbs                                      hoje, já, sim, felizmente, …
CARD           Cardinals                                    zero, dez, cem, mil, …
CJ             Conjunctions                                 e, ou, tal como, …
CL             Clitics                                      o, lhe, se, …
CN             Common Nouns                                 computador, cidade, ideia, …
DA             Definite Articles                            o, os, …
DEM            Demonstratives                               este, esses, aquele, …
DFR            Denominators of Fractions                    meio, terço, décimo, %, …
DGTR           Roman Numerals                               VI, LX, MMIII, MCMXCIX, …
DGT            Arabic Numerals                              0, 1, 42, 12345, 67890, …
DM             Discourse Marker                             olá, …
EADR           Electronic Addresses                         http://www.di.fc.ul.pt, …
EOE            End of Enumeration                           etc
EXC            Exclamation                                  ah, ei, …
GER            Gerunds                                      sendo, afirmando, vivendo, …
GERAUX         Gerund "ter"/"haver" in compound tenses      tendo, havendo
IA             Indefinite Articles                          uns, umas, …
IND            Indefinites                                  tudo, alguém, ninguém, …
INF            Infinitive                                   ser, afirmar, viver, …
INFAUX         Infinitive "ter"/"haver" in compound tenses  ter, haver, …
INT            Interrogatives                               quem, como, quando, …
ITJ            Interjection                                 bolas, caramba, …
LTR            Letters                                      a, b, c, …
MGT            Magnitude Classes                            unidade, dezena, dúzia, resma, …
MTH            Months                                       Janeiro, Dezembro, …
NP             Noun Phrases                                 idem, …
ORD            Ordinals                                     primeiro, centésimo, penúltimo, …
PADR           Part of Address                              Rua, av., rot., …
PNM            Part of Name                                 Lisboa, António, João, …
PNT            Punctuation Marks                            ., ?, (, …
POSS           Possessives                                  meu, teu, seu, …
PPA            Past Participles not in compound tenses      sido, afirmados, vivida, …
PP             Prepositional Phrases                        algures, …
PPT            Past Participle in compound tenses           sido, afirmado, vivido, …
PREP           Prepositions                                 de, para, em redor de, …
PRS            Personals                                    eu, tu, ele, …
QNT            Quantifiers                                  todos, muitos, nenhum, …
REL            Relatives                                    que, cujo, tal que, …
STT            Social Titles                                Presidente, drª., prof., …
SYB            Symbols                                      @, #, &, …
TERMN          Optional Terminations                        (s), (as), …
UM             "um" or "uma"                                um, uma
UNIT           Abbreviated Measurement Unit                 kg., km., …
VAUX           Finite "ter" or "haver" in compound tenses   temos, haveriam, …
V              Verbs (other than PPA, PPT, INF or GER)      falou, falaria, …
WD             Week Days                                    segunda, terça-feira, sábado, …

Tags for multi-word expressions
LADV1…LADVn    Multi-Word Adverbs                           de facto, em suma, um pouco, …
LCJ1…LCJn      Multi-Word Conjunctions                      assim como, já que, …
LDEM1…LDEMn    Multi-Word Demonstratives                    o mesmo, …
LDFR1…LDFRn    Multi-Word Denominators of Fractions         por cento
LDM1…LDMn      Multi-Word Discourse Markers                 pois não, até logo, …
LITJ1…LITJn    Multi-Word Interjections                     meu Deus
LPRS1…LPRSn    Multi-Word Personals                         a gente, si mesmo, V. Exa., …
LPREP1…LPREPn  Multi-Word Prepositions                      através de, a partir de, …
LQD1…LQDn      Multi-Word Quantifiers                       uns quantos, …
LREL1…LRELn    Multi-Word Relatives                         tal como, …

Tags specific to the spoken corpus
EMP            Emphasis
EL             Extra-linguistic
PL             Para-linguistic
FRG            Fragment
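The numbered multi-word tags (LADV1…LADVn and the like) mark each token of a multi-word expression with its position. The Python sketch below shows one way such tokens could be regrouped into single units. It assumes tokens arrive as (form, tag) pairs and that a position tag is written as the base tag immediately followed by its index (for example LADV1, LADV2). The function name and the choice to label the merged unit with the bare base tag are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Matches a multi-word tag: a base starting with "L" plus a position index.
MWE_TAG = re.compile(r"^(L[A-Z]+)(\d+)$")


def group_multiword(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Merge consecutive tokens tagged X1, X2, ... into one (form, X) unit."""
    merged: List[Tuple[str, str]] = []
    open_base: Optional[str] = None  # base tag of the expression being built
    expected = 1                     # next position index expected
    for form, tag in tagged:
        match = MWE_TAG.match(tag)
        if match:
            base, index = match.group(1), int(match.group(2))
            if base == open_base and index == expected:
                # Continue the expression that is currently open.
                previous_form, _ = merged[-1]
                merged[-1] = (previous_form + " " + form, base)
                expected += 1
                continue
            if index == 1:
                # Start a new expression.
                merged.append((form, base))
                open_base, expected = base, 2
                continue
        # Ordinary token, or a position tag that does not fit: keep it as is.
        merged.append((form, tag))
        open_base, expected = None, 1
    return merged


# Hand-tagged toy input, not actual tagger output.
tokens = [("de", "LADV1"), ("facto", "LADV2"), (",", "PNT"), ("falou", "V")]
print(group_multiword(tokens))
# [('de facto', 'LADV'), (',', 'PNT'), ('falou', 'V')]
```

A position tag that appears out of sequence is left untouched rather than guessed at, so malformed input stays visible.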

Inflection tags

Tag     Description

Tags for nominal categories
m       Masculine
f       Feminine
s       Singular
p       Plural
dim     Diminutive
sup     Superlative
comp    Comparative

Tags for verbs
1       First Person
2       Second Person
3       Third Person
pi      Present Indicative (Presente do Indicativo)
ppi     Preterite Indicative (Pretérito Perfeito do Indicativo)
ii      Imperfect Indicative (Pretérito Imperfeito do Indicativo)
mpi     Pluperfect Indicative (Pretérito Mais que Perfeito do Indicativo)
fi      Future Indicative (Futuro do Indicativo)
c       Conditional (Condicional)
pc      Present Subjunctive (Presente do Conjuntivo)
ic      Imperfect Subjunctive (Pretérito Imperfeito do Conjuntivo)
fc      Future Subjunctive (Futuro do Conjuntivo)
imp     Imperative (Imperativo)

Tags for infinitive verbs
ifl     Inflected
nifl    Not Inflected
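As an illustration of how the verb inflection tags combine, the Python sketch below decodes a feature string into readable form. This page does not specify how the individual tags are joined, so the layout used here (tense, a hyphen, then person and number, as in "ppi-3s") is an assumption made for the example, and the function name is hypothetical. The tense names are kept in Portuguese, as listed in the table.

```python
# Lookup tables copied from the inflection tagset above.
TENSES = {
    "pi": "Presente do Indicativo",
    "ppi": "Pretérito Perfeito do Indicativo",
    "ii": "Pretérito Imperfeito do Indicativo",
    "mpi": "Pretérito Mais que Perfeito do Indicativo",
    "fi": "Futuro do Indicativo",
    "c": "Condicional",
    "pc": "Presente do Conjuntivo",
    "ic": "Pretérito Imperfeito do Conjuntivo",
    "fc": "Futuro do Conjuntivo",
    "imp": "Imperativo",
}
PERSONS = {"1": "First Person", "2": "Second Person", "3": "Third Person"}
NUMBERS = {"s": "Singular", "p": "Plural"}


def decode_verb_features(code: str) -> str:
    """Decode a string such as 'ppi-3s' (assumed layout: tense-personnumber)."""
    tense, separator, agreement = code.partition("-")
    if not separator or len(agreement) != 2:
        raise ValueError(f"expected 'tense-personnumber', got {code!r}")
    person, number = agreement[0], agreement[1]
    try:
        return f"{TENSES[tense]}, {PERSONS[person]}, {NUMBERS[number]}"
    except KeyError as exc:
        raise ValueError(f"unknown inflection tag in {code!r}") from exc


print(decode_verb_features("ppi-3s"))
# Pretérito Perfeito do Indicativo, Third Person, Singular
```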

Grammatical Function Tagset

Tag     Meaning
C       Complement
CARD    Cardinal in multi-word cardinals
COORD   Coordination
CONJ    Conjunction
DO      Direct Object
IO      Indirect Object
M       Modifier
N       Relationship between words and named entities
OBL     Oblique Complement (Oblique Object)
PUNCT   Punctuation
PRD     Predicate
SJ      Subject
SJac    Subject of an anticausative
SJcp    Subject of complex predicate
SP      Specifier
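The grammatical function tags label the relation between a dependent word and its head. The Python sketch below pairs such labeled arcs with the meanings listed in the tagset. The (dependent, tag, head) tuple layout and the function name are hypothetical, since the parser's output format is not described on this page, and the example arcs are written by hand rather than produced by LX-DepParser.

```python
from typing import Dict, List, Tuple

# Meanings copied from the grammatical function tagset above.
FUNCTION_TAGS: Dict[str, str] = {
    "C": "Complement",
    "CARD": "Cardinal in multi-word cardinals",
    "COORD": "Coordination",
    "CONJ": "Conjunction",
    "DO": "Direct Object",
    "IO": "Indirect Object",
    "M": "Modifier",
    "N": "Relationship between words and named entities",
    "OBL": "Oblique Complement",
    "PUNCT": "Punctuation",
    "PRD": "Predicate",
    "SJ": "Subject",
    "SJac": "Subject of an anticausative",
    "SJcp": "Subject of complex predicate",
    "SP": "Specifier",
}

# Hypothetical arc layout: (dependent word, function tag, head word).
Arc = Tuple[str, str, str]


def describe(arcs: List[Arc]) -> List[str]:
    """Render each arc as 'dependent --TAG (meaning)--> head'."""
    lines = []
    for dependent, tag, head in arcs:
        meaning = FUNCTION_TAGS.get(tag, "unknown tag")
        lines.append(f"{dependent} --{tag} ({meaning})--> {head}")
    return lines


# Hand-written arcs for "Ela leu livros" ("She read books").
for line in describe([("Ela", "SJ", "leu"), ("livros", "DO", "leu")]):
    print(line)
# Ela --SJ (Subject)--> leu
# livros --DO (Direct Object)--> leu
```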

Annotation guidelines

The analyses produced by LX-DepParser are similar to the dependency representations found in the dependency treebank on which LX-DepParser was trained. This dependency treebank was designed along the principles described in the following handbook:

Branco, António, Sérgio Castro, João Silva and Francisco Costa, 2011. CINTIL DepBank Handbook: Design options for the representation of grammatical dependencies. Department of Informatics, University of Lisbon, Technical Reports series, no. di-fcul-tr-11-03.

Authorship

LX-DepParser is being developed by Rúben Reis, under the direction of António Branco, at the NLX-Natural Language and Speech Group.

Contacts

You can contact us at the following email address: 'nlx' followed by '@' followed by 'di.fc.ul.pt'.

Acknowledgments

LX-DepParser was partially funded by FCT-Foundation for Science and Technology, under the contract FCT/PTDC/PLP/81157/2006 for the project SemanticShare.