Lingua::DE::Tagger
This module uses part-of-speech statistics from the Penn Treebank
to assign POS tags to German text. The tagger applies a bigram (two-word)
Hidden Markov Model to guess the appropriate POS tag for a word. That means
that the tagger will try to assign a POS tag based on the known POS tags
for a given word and the POS tag assigned to its predecessor.
The tagger tends to assume unknown words are nouns, but this behavior is
configurable.
The POS tagger can also be used to find maximal noun phrases in tagged text.
You can also use this module to extract all nouns and/or noun phrases.
TAG SET
----------------------------------------------------------------
The set of POS tags used here is a modified version of the
Penn Treebank tagset. Tags with non-letter characters have been
redefined to work better in our data structures. Also, the
``Determiner'' tag (DET) has been changed from `DT', in order to
avoid confusion with the HTML tag,
.
-----------------------------------------------------------------
CC Conjunction, coordinating and, or
CD Adjective, cardinal number 3, fifteen
DET Determiner this, each, some
EX Pronoun, existential there there
FW Foreign words
IN Preposition / Conjunction for, of, although, that
JJ Adjective happy, bad
JJR Adjective, comparative happier, worse
JJS Adjective, superlative happiest, worst
LS Symbol, list item A, A.
MD Verb, modal can, could, 'll
NN Noun aircraft, data
NNP Noun, proper London, Michael
NNPS Noun, proper, plural Australians, Methodists
NNS Noun, plural women, books
PDT Determiner, prequalifier quite, all, half
POS Possessive 's, '
PRP Determiner, possessive second mine, yours
PRPS Determiner, possessive their, your
RB Adverb often, not, very, here
RBR Adverb, comparative faster
RBS Adverb, superlative fastest
RP Adverb, particle up, off, out
SYM Symbol *
TO Preposition to
UH Interjection oh, yes, mmm
VB Verb, infinitive take, live
VBD Verb, past tense took, lived
VBG Verb, gerund taking, living
VBN Verb, past/passive participle taken, lived
VBP Verb, base present form take, live
VBZ Verb, present 3SG -s form takes, lives
WDT Determiner, question which, whatever
WP Pronoun, question who, whoever
WPS Determiner, possessive & question whose
WRB Adverb, question when, how, however
PP Punctuation, sentence ender ., !, ?
PPC Punctuation, comma ,
PPD Punctuation, dollar sign $
PPL Punctuation, quotation mark left ``
PPR Punctuation, quotation mark right ''
PPS Punctuation, colon, semicolon, elipsis :, ..., -
LRB Punctuation, left bracket (, {, [
RRB Punctuation, right bracket ), }, ]
INSTALLATION
To install this module, run the following commands:
perl Makefile.PL
make
make test
make install
SUPPORT AND DOCUMENTATION
After installing, you can find documentation for this module with the
perldoc command.
perldoc Lingua::DE::Tagger
You can also look for information at:
RT, CPAN's request tracker
http://rt.cpan.org/NoAuth/Bugs.html?Dist=Lingua-DE-Tagger
AnnoCPAN, Annotated CPAN documentation
http://annocpan.org/dist/Lingua-DE-Tagger
CPAN Ratings
http://cpanratings.perl.org/d/Lingua-DE-Tagger
Search CPAN
http://search.cpan.org/dist/Lingua-DE-Tagger
COPYRIGHT AND LICENCE
Copyright (C) 2008 Tobias Schulz
This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.