Storage of Usage Information on Research Resource Metadata Database: Corpus Construction Using Academic Articles

Shunsuke Kozawa; Hitomi Tohyama; Kiyotaka Uchimoto; Shigeki Matsubara

doi:10.37936/ecti-cit.201152.54241

DOI: https://doi.org/10.37936/ecti-cit.201152.54241

Keywords:

Research Resource; Language Resource; Usage Information; Academic Articles

Shunsuke Kozawa

Graduate School of Information Science, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, 464-8603, Japan

Hitomi Tohyama

Graduate School of Information Science, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, 464-8603, Japan

Kiyotaka Uchimoto

National Institute of Information and Communications Technology, 3-5 Hikari-dai, Seika-cho, Soraku-gun, Kyoto, 619-0289, Japan

Shigeki Matsubara

Graduate School of Information Science, Nagoya University, Furo-cho, Chikusa-ku, Nagoya, 464-8603, Japan

Abstract

Language resources have become indispensable for linguistic research. However, existing language resources are seldom fully utilized because the variety of ways in which they can be used is not well known, which also means that their intrinsic value is not well recognized. Lists of usage information could improve language resource searches and lead to more efficient use. In this research, we therefore collect usage information for each language resource from academic articles to promote the efficient utilization of language resources. This paper describes the construction of a text corpus annotated with usage information (the UI corpus). Specifically, we automatically extract sentences containing language resource names from academic articles. The extracted sentences are then annotated with usage information by two annotators in a cascaded manner. We show that the UI corpus contributes to efficient language resource searches by combining it with a metadata database of language resources and comparing the number of language resources retrieved with and without the UI corpus.
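
The sentence extraction step mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify how sentences are split or how resource names are matched, so the naive punctuation-based splitter, the case-insensitive whole-phrase matching, the function names, and the sample article text are all assumptions made for illustration.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '?' or '!' plus whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.?!])\s+", text) if s.strip()]


def extract_resource_sentences(text, resource_names):
    """Return (resource_name, sentence) pairs for sentences that mention a name."""
    # One case-insensitive, whole-phrase pattern per resource name.
    patterns = {
        name: re.compile(r"\b" + re.escape(name) + r"\b", re.IGNORECASE)
        for name in resource_names
    }
    hits = []
    for sentence in split_sentences(text):
        for name, pattern in patterns.items():
            if pattern.search(sentence):
                hits.append((name, sentence))
    return hits


# Hypothetical article text and resource list, for illustration only.
article = (
    "We trained the parser on the Penn Treebank. "
    "Evaluation used held-out data. "
    "WordNet was used to expand query terms."
)
names = ["Penn Treebank", "WordNet"]

hits = extract_resource_sentences(article, names)
for name, sentence in hits:
    print(f"{name}: {sentence}")
```

In this sketch the extracted pairs would be the candidates handed to human annotators; the manual, cascaded annotation of usage information described in the abstract is not modeled here.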

How to Cite

[1]

S. Kozawa, H. Tohyama, K. Uchimoto, and S. Matsubara, “Storage of Usage Information on Research Resource Metadata Database: Corpus Construction Using Academic Articles”, ECTI-CIT Transactions, vol. 5, no. 2, pp. 98–106, Apr. 2016.

Issue

Vol. 5 No. 2 (2011): ECTI Transaction on CIT (Nov 2011)

Section

Artificial Intelligence and Machine Learning (AI)
