2016-08-23

French coreference annotation using CoreNLP

Can anyone help me fix my configuration for running coreference annotation on French text with CoreNLP? I tried the basic suggestion of editing the properties file:

annotators = tokenize, ssplit, pos, parse, lemma, ner, parse, depparse, mention, coref 
tokenize.language = fr 
pos.model = edu/stanford/nlp/models/pos-tagger/french/french.tagger  
parse.model = edu/stanford/nlp/models/lexparser/frenchFactored.ser.gz 

Running the pipeline with these properties produced the following output log:

[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize 
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit 
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos 
Reading POS tagger model from edu/stanford/nlp/models/pos-tagger/french/french.tagger ... done [0.3 sec]. 
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse 
[main] INFO edu.stanford.nlp.parser.common.ParserGrammar - Loading parser from serialized file edu/stanford/nlp/models/lexparser/frenchFactored.ser.gz ... 
done [2.2 sec]. 
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma 
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ner 
Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [2.0 sec]. 
Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [0.7 sec]. 
Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [0.9 sec]. 
[main] INFO edu.stanford.nlp.time.JollyDayHolidays - Initializing JollyDayHoliday for SUTime from classpath edu/stanford/nlp/models/sutime/jollyday/Holidays_sutime.xml as sutime.binder.1. 
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/defs.sutime.txt 
ago 23, 2016 5:37:34 PM edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor appendRules 
INFORMACIÓN: Read 83 rules 
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/english.sutime.txt 
ago 23, 2016 5:37:34 PM edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor appendRules 
INFORMACIÓN: Read 267 rules 
Reading TokensRegex rules from edu/stanford/nlp/models/sutime/english.holidays.sutime.txt 
ago 23, 2016 5:37:34 PM edu.stanford.nlp.ling.tokensregex.CoreMapExpressionExtractor appendRules 
INFORMACIÓN: Read 25 rules 
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator parse 
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator depparse 
Loading depparse model file: edu/stanford/nlp/models/parser/nndep/english_UD.gz ... 
PreComputed 100000, Elapsed Time: 1.639 (s) 
Initializing dependency parser done [6.4 sec]. 
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator mention 
Using mention detector type: rule 
[main] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator coref 
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space 
    at java.util.Arrays.copyOfRange(Arrays.java:3664) 
    at java.lang.String.<init>(String.java:207) 
    at java.lang.StringBuilder.toString(StringBuilder.java:407) 
    at java.io.ObjectInputStream$BlockDataInputStream.readUTFBody(ObjectInputStream.java:3097) 
    at java.io.ObjectInputStream$BlockDataInputStream.readUTF(ObjectInputStream.java:2892) 
    at java.io.ObjectInputStream.readString(ObjectInputStream.java:1646) 
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344) 
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373) 
    at java.util.HashMap.readObject(HashMap.java:1402) 
    at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:498) 
    at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1058) 
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1909) 
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808) 
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353) 
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2018) 
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1942) 
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1808) 
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1353) 
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:373) 
    at edu.stanford.nlp.io.IOUtils.readObjectFromURLOrClasspathOrFileSystem(IOUtils.java:324) 
    at edu.stanford.nlp.scoref.SimpleLinearClassifier.<init>(SimpleLinearClassifier.java:30) 
    at edu.stanford.nlp.scoref.PairwiseModel.<init>(PairwiseModel.java:75) 
    at edu.stanford.nlp.scoref.PairwiseModel$Builder.build(PairwiseModel.java:57) 
    at edu.stanford.nlp.scoref.ClusteringCorefSystem.<init>(ClusteringCorefSystem.java:31) 
    at edu.stanford.nlp.scoref.StatisticalCorefSystem.fromProps(StatisticalCorefSystem.java:48) 
    at edu.stanford.nlp.pipeline.CorefAnnotator.<init>(CorefAnnotator.java:66) 
    at edu.stanford.nlp.pipeline.AnnotatorImplementations.coref(AnnotatorImplementations.java:220) 
    at edu.stanford.nlp.pipeline.AnnotatorFactories$13.create(AnnotatorFactories.java:515) 
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:85) 
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:375) 

This is the command I ran:

java -cp "*" -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP -props frenchProps.properties -file frenchFile.txt 

I suppose some extra configuration is missing.

Answers


AFAIK, CoreNLP does not provide coreference resolution for French (see also http://stanfordnlp.github.io/CoreNLP/coref.html).
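
For illustration only, here is a reduced properties file that keeps just the annotators for which the log in the question shows a French model being loaded (tokenize with tokenize.language = fr, pos, and parse). It is a sketch built from the question's own settings, not an official French configuration. The ner, depparse, mention and coref annotators are dropped because the log shows ner and depparse falling back to English models and coref is not available for French. Dropping lemma is my own assumption, since the log does not show it loading any French resource.

```
# Sketch: French pipeline restricted to annotators whose French models
# load successfully in the question's log. No coreference is produced.
annotators = tokenize, ssplit, pos, parse
tokenize.language = fr
pos.model = edu/stanford/nlp/models/pos-tagger/french/french.tagger
parse.model = edu/stanford/nlp/models/lexparser/frenchFactored.ser.gz
```

With this file the command from the question should yield French tokens, POS tags and constituency parses, but no coreference chains. Raising -Xmx above 2g would probably get past the OutOfMemoryError, which the stack trace shows occurring while the coref model is being deserialized, but the pipeline would then presumably apply its default English coref models to French text.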


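Thank you, @Igor. I had already read the link you kindly pointed to. I tried an approximation by translating the text into English. For annotators such as sentiment, this workaround is probably heavily biased, but it worked for me. In any case, other suggestions are welcome. – Nacho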
