Training a Swedish POS-tagger for Stanford CoreNLP

Introduction

Training

## tagger training invoked at Tue Jul 08 16:08:39 PDT 2014 with arguments:
model = swedish-pos-tagger-model
arch = words(-1,1),unicodeshapes(-1,1),order(2),suffix(4)
wordFunction =
trainFile = format=TSV,wordColumn=1,tagColumn=4,talbanken-stanford-train.conll
closedClassTags =
closedClassTagThreshold = 40
curWordMinFeatureThresh = 2
debug = false
debugPrefix =
tagSeparator = _
encoding = iso-8859-1
iterations = 100
lang =
learnClosedClassTags = false
minFeatureThresh = 2
openClassTags =
rareWordMinFeatureThresh = 10
rareWordThresh = 5
search = qn
sgml = false
sigmaSquared = 0.0
regL1 = 0.75
tagInside =
tokenize = false
tokenizerFactory =
tokenizerOptions = asciiQuotes
verbose = true
verboseResults = true
veryCommonWordThresh = 250
xmlInput = null
outputFile =
outputFormat = slashTags
outputFormatOptions =
nthreads = 1
trainFile = format=TSV,wordColumn=1,tagColumn=4,talbanken-stanford-train.conll
model = swedish-pos-tagger-model
java -classpath stanford-postagger-3.5.1.jar edu.stanford.nlp.tagger.maxent.MaxentTagger -prop swedish-tagger.props
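
The trainFile setting above points the tagger at a tab-separated file and picks the word from column 1 and the tag from column 4. Before training, it can help to confirm that those columns hold what you expect. The following is a minimal sketch in Python, not part of the Stanford tooling. It assumes the column indices are 0-based (so column 1 is the word form and column 4 the POS tag in a CoNLL-style file) and that blank lines separate sentences. The sample rows and the tags PN and VB are made up for illustration and are not taken from Talbanken.

```python
import io
from collections import Counter


def read_tsv_pairs(lines, word_column=1, tag_column=4):
    """Yield (word, tag) pairs from CoNLL-style tab-separated lines.

    Column indices are assumed to be 0-based, mirroring
    wordColumn=1,tagColumn=4 in the properties file.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # blank lines separate sentences
        fields = line.split("\t")
        if len(fields) <= max(word_column, tag_column):
            raise ValueError(f"too few columns in line: {line!r}")
        yield fields[word_column], fields[tag_column]


# Hypothetical sample rows; a real run would instead open the training file,
# e.g. open("talbanken-stanford-train.conll", encoding="iso-8859-1").
sample = io.StringIO(
    "1\tJag\t_\tPN\tPN\t_\n"
    "2\tser\t_\tVB\tVB\t_\n"
    "\n"
    "1\tHon\t_\tPN\tPN\t_\n"
)

pairs = list(read_tsv_pairs(sample))
tag_counts = Counter(tag for _, tag in pairs)

print(pairs)
print(tag_counts.most_common())
```

If the printed pairs show lemmas or numeric IDs instead of words and tags, the column numbers in trainFile probably need adjusting.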

Testing

java -classpath stanford-postagger-3.5.1.jar edu.stanford.nlp.tagger.maxent.MaxentTagger -prop swedish-tagger.props -model swedish-pos-tagger-model -testFile format=TSV,wordColumn=1,tagColumn=4,talbanken-stanford-test.conll
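
A test run like the one above compares predicted tags against the gold tags in the test file. As a rough illustration of that comparison, here is a minimal sketch of token-level accuracy in Python. It is not the tagger's own evaluation code. The function name tagging_accuracy and the sample data are hypothetical, and the sketch assumes that the gold and predicted sequences are already aligned token by token.

```python
def tagging_accuracy(gold, predicted):
    """Return the fraction of tokens whose predicted tag matches the gold tag.

    Both arguments are lists of (word, tag) pairs, assumed to be aligned.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        return 0.0
    correct = sum(
        1 for (_, gold_tag), (_, pred_tag) in zip(gold, predicted)
        if gold_tag == pred_tag
    )
    return correct / len(gold)


# Made-up example data, not real Talbanken tokens or tagger output.
gold = [("Jag", "PN"), ("ser", "VB"), ("huset", "NN")]
predicted = [("Jag", "PN"), ("ser", "VB"), ("huset", "VB")]

accuracy = tagging_accuracy(gold, predicted)
print(f"{accuracy:.3f}")
```

Here two of the three made-up tokens agree, so the sketch reports roughly 0.667.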

Prototyper and tinkerer | Data Science@Meltwater
