BERT language model for Korean (KorBERT)

Detail 1

KorBERT, a BERT language model developed to reflect the characteristics of the Korean language, is available through ETRI's open API service (http://aiopen.etri.re.kr).

Standards for construction of corpora based on Korean language analysis

Detail 1

The standards for morphological analysis, named entity recognition, and parsing have been adopted as Telecommunications Technology Association (TTA) standards after validation by experts under the direction of the ETRI language intelligence research section. The standards for thematic role recognition and question analysis have been submitted and are currently under validation. The standards are available for download.

ETRI Corpora

Detail 1

Exobrain QA datasets (ETRI), language analysis corpora (ETRI), Korean TimeBank and SpaceBank, and morphology/semantics corpora provided by the University of Ulsan are available through the corpora service of ETRI's open API service (http://aiopen.etri.re.kr).

Current status of development and plan for improvement of Exobrain’s Korean language analysis and question answering technology

Detail 1

Journal of Society for Information Science and Technology, Vol. 35, No. 8, Aug. 2017

Ulsan University corpus

Detail 3 (2017~2019)

The Ulsan University corpus comprises a stemming and homonym annotation corpus (UTagger-HG), a dependency and semantic annotation corpus (UTagger-DP/SR), and a multilingual-level semantic dictionary (UPropBank).

Lexical map corpora of the Korean language

Detail 3 (2017~2019)

The Korlex 1.0 database is freely distributed, and a web search service is available for Korlex 1.5.

CG data extracted from TriviaQA evidence documents & queries

Detail 3 (2017~2019)

  • Extracted by applying the CG Extractor to TriviaQA text data (evidence documents and queries)
  • Contains approximately 48,000 documents and 16,800,000 triples in JSON format
Data list of Exobrain Detail 3

Detail 3 (2017~2019)

  • English concept embedding
  • English context embedding
  • Morphological-semantic annotation corpus
  • Dependency and thematic role annotation corpus
  • Named entity annotation corpus