FORMAL-FUNCTIONAL MODELS OF THE UZBEK ELECTRON CORPUS
Abstract
This paper addresses the structure of the Uzbek corpus and its linguistic annotation. Linguistic annotation, metadata, and the corpus manager, which together form the formal-functional model of the corpus, are what make it usable for many purposes. The platform is available online and allows users to address questions of language and literature. The Uzbek corpus, which is based on structural and subcorpus models that are partially presented in this paper, is being developed as part of ongoing work on Uzbek language technology.
Keywords: Uzbek corpus, morphoanalyzer, metadata, parallel corpora, text analysis, corpus manager.
License
Declaration/Copyright transfer:
1. In consideration of the undertaking set out in paragraph 2, and upon acceptance by ANGLISTICUM for publication of the manuscript in the Journal, I/We hereby assign and transfer publication rights to ANGLISTICUM, whereas I/We retain the copyright for the manuscript. This assignment provides ANGLISTICUM the sole right and responsibility to publish the manuscript in its printed and online version, and/or in other media formats.
2. In consideration of this assignment, ANGLISTICUM hereby undertakes to prepare and publish the manuscript in the Journal, subject only to its right to refuse publication if there is a breach of the Author’s warranty in paragraph 4 or if there are other reasonable grounds.
3. Editors and the editorial board of ANGLISTICUM are empowered to make such editorial changes as may be necessary to make the Manuscript suitable for publication.
4. I/We hereby acknowledge that: (a) The manuscript submitted is an original work, I/We participated in the work substantively, and I/We are therefore prepared to take public responsibility for the work; (b) I/We have seen and approved the manuscript as submitted, and the manuscript has not been published, submitted, or considered for publication elsewhere; (c) The text, illustrations, and any other materials included in the manuscript do not infringe upon any existing copyright or other rights of anyone.
5. I/We hereby indemnify ANGLISTICUM and the respective Editors of the Journal as mentioned in paragraph 3, and hold them harmless from any loss, expense or damage occasioned by a claim or suit by a third party for copyright infringement, or any suit arising out of any breach of the foregoing warranties as a result of publication of the manuscript.