{"title":"Grammatically Coded Corpus of Spoken Lithuanian: Methodology and Development","authors":"L. Kamandulyt\u0117-Merfeldien\u0117","volume":124,"journal":"International Journal of Cognitive and Language Sciences","pagesStart":874,"pagesEnd":879,"ISSN":"1307-6892","URL":"https:\/\/publications.waset.org\/pdf\/10006833","abstract":"
The paper addresses the main methodological issues of the Corpus of Spoken Lithuanian, whose development began in 2006. At present, the corpus consists of 300,000 grammatically annotated word forms. The creation of the corpus comprises three main stages: data collection, transcription of the recorded data, and grammatical annotation. Data collection was based on the principles of balance and naturalness. The recorded speech was transcribed according to the CHAT requirements of CHILDES. The transcripts were double-checked and annotated grammatically using CHILDES. The development of the Corpus of Spoken Lithuanian has led to a steady increase in studies on spontaneous communication, and various papers have dealt with the distribution of parts of speech, the use of different grammatical forms, variation of inflectional paradigms, the distribution of fillers, syntactic functions of adjectives, and the mean length of utterances.","references":"[1]\tJ. Kuva\u010d Kraljevi\u0107, and G. Hr\u017eica, \u201cCroatian adult spoken language corpus (HrAL),\u201d FLUMINENSIA: Journal for Philological Research, vol. 28, no. 2, 2017, pp. 87\u2013102.\r\n[2]\tD. Biber, \u201cInvestigating language use through corpus-based analyses of association patterns,\u201d International Journal of Corpus Linguistics, vol. 1, no. 2, 1996, pp. 171\u2013198.\r\n[3]\tD. Biber, University Language: A Corpus-based Study of Spoken and Written Registers. Amsterdam: John Benjamins, 2006.\r\n[4]\tG. Gravier, G. Adda, N. Paulson, M. Carr\u00e9, A. Giraudel, and O. Galibert, \u201cThe ETAPE corpus for the evaluation of speech-based TV content processing in the French language,\u201d in LREC-Eighth international conference on Language Resources and Evaluation, Turkey, 2012.\r\n[5]\tR. Simpson, and D. Mendis, \u201cA Corpus-Based Study of Idioms in Academic Speech,\u201d TESOL Quarterly, vol. 37, iss. 3, 2003, pp. 419\u2013441.\r\n[6]\tR. 
Reppen, \u201cEnglish language teaching and corpus linguistics: Lessons from the American National Corpus,\u201d in Contemporary Corpus Linguistics, P. Baker, Ed. London: Continuum, 2012, pp. 204\u2013213.\r\n[7]\tR. Carter, and M. McCarthy, Exploring Spoken English. Cambridge: Cambridge University Press, 1997.\r\n[8]\tM. McCarthy, and M. Handford, \u201cInvisible to us: A preliminary corpus-based study of spoken business English,\u201d in Discourse in the Professions: Perspectives from Corpus Linguistics, U. Connor, T. Upton, Eds. Amsterdam: John Benjamins, 2004, pp. 167\u2013201.\r\n[9]\tCorpus of Spoken Lithuanian, http:\/\/donelaitis.vdu.lt\/sakytines-kalbos-tekstynas\/ Accessed on 20\/03\/2017.\r\n[10]\tChild Language Data Exchange System, https:\/\/childes.psy.cmu.edu\/ Accessed on 20\/03\/2017.\r\n[11]\tB. MacWhinney, \u201cThe TalkBank Project,\u201d in Creating and Digitizing Language Corpora: Synchronic Databases, vol. 1, J. C. Beal, K. P. Corrigan & H. L. Moisl, Eds. Houndmills: Palgrave-Macmillan, 2007, pp. 163\u2013180.\r\n[12]\tI. Daba\u0161inskien\u0117, and L. Kamandulyt\u0117, \u201cCorpora of Spoken Lithuanian,\u201d Estonian Papers in Applied Linguistics, no. 5, 2009, pp. 67\u201377.\r\n[13]\tL. Kamandulyt\u0117-Merfeldien\u0117, and I. Bal\u010di\u016bnien\u0117, \u201cSyntactically Coded Corpus of Spoken Lithuanian: Developmental Issues and Pilot Studies,\u201d Studies about Languages, no. 28, 2016, pp. 92\u2013101.\r\n[14]\tL. Kamandulyt\u0117-Merfeldien\u0117, \u201cPertar\u0173 da\u017enumas ir \u012fvairov\u0117 sakytin\u0117je kalboje (The Frequency and Variety of Fillers in Spoken Lithuanian Language),\u201d Bendrin\u0117 kalba, no. 87, 2014, pp. 1\u201310.\r\n[15]\tL. Kamandulyt\u0117-Merfeldien\u0117, and I. 
Bal\u010di\u016bnien\u0117, \u201cFunkciniai pasakym\u0173 tipai sakytin\u0117je kalboje (Types of Sentences and their Functions in Spoken Lithuanian),\u201d Thought elaboration: linguistics, literature, media expression: collection of scientific papers, 2016, pp. 11\u201329.\r\n[16]\tL. Kamandulyt\u0117-Merfeldien\u0117, and I. Bal\u010di\u016bnien\u0117, \u201cAtributini\u0173 ir predikatini\u0173 jungini\u0173 su b\u016bdvard\u017eiais da\u017enumas ir strukt\u016bra sakytin\u0117je kalboje (Frequency and structure of attributive and predicative utterances in spoken Lithuanian),\u201d Lituanistica, vol. 62, no. 2, 2016, pp. 127\u2013137.\r\n[17]\tL. Kamandulyt\u0117-Merfeldien\u0117, \u201cMorphological modifications in Lithuanian child directed speech,\u201d Estonian Papers in Applied Linguistics, no. 3, 2007, pp. 155\u2013166.\r\n[18]\tA. J. Liddicoat, An Introduction to Conversation Analysis, London: Continuum, 2007.\r\n[19]\tG. Brown, and G. Yule, Discourse analysis, Cambridge University Press, 2001.\r\n[20]\tD. Crystal, A Dictionary of Linguistics and Phonetics, Blackwell Reference, 2003.\r\n[21]\tI. Daba\u0161inskien\u0117, \u201c\u0160nekamosios lietuvi\u0173 kalbos morfologin\u0117s ypatyb\u0117s (The Morphological Features of Spoken Lithuanian),\u201d Acta Linguistica Lithuanica, no. 60, 2009, pp. 1\u201315.","publisher":"World Academy of Science, Engineering and Technology","index":"Open Science Index 124, 2017"}