Using a parallel corpus to study patterns of word order variation: determiners and quantifiers within the noun phrase in European languages
DOI:
https://doi.org/10.6092/issn.2785-0943/15653
Keywords:
word order, determiner, quantifier, entropy, Universal Dependency, European languages
Abstract
Despite the wealth of studies on word order, very few have examined the order of minor word categories such as determiners and quantifiers. This is likely due to the difficulty of formulating valid cross-linguistic definitions for these categories, which are also problematic from a computational perspective. A solution lies in formulating comparative concepts and implementing them computationally by combining different layers of annotation with manually compiled lists of lexemes. The proposed methodology is exemplified by a study of the position of these categories with respect to the nominal head, conducted on a parallel corpus of 17 European languages and using Shannon’s entropy to quantify word order variation. Whereas the entropy for the article-noun pattern is, as expected, extremely low, the methodology sheds light on the variation of the demonstrative-noun and quantifier-noun patterns in three languages of the sample.
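As a rough illustration of how Shannon's entropy can quantify word order variation, the sketch below computes the entropy of the observed positions of a dependent relative to its nominal head. It is not the author's implementation: the function name `order_entropy`, the labels `"pre"` and `"post"`, and the toy counts are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def order_entropy(orders):
    """Return the Shannon entropy (in bits) of a sequence of order labels.

    Each label records where a dependent occurs relative to its nominal
    head, e.g. "pre" (before the noun) or "post" (after the noun).
    The label names are an assumption made for this sketch.
    """
    counts = Counter(orders)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no observations")
    entropy = 0.0
    for count in counts.values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy


# Hypothetical toy data, not taken from the corpus used in the study.
article_noun = ["pre"] * 99 + ["post"] * 1          # almost fixed order
demonstrative_noun = ["pre"] * 60 + ["post"] * 40   # variable order

print(order_entropy(article_noun))
print(order_entropy(demonstrative_noun))
```

With two possible orders the value ranges from 0 bits (one order only) to 1 bit (both orders equally frequent), so a near-categorical pattern such as the toy article-noun data yields a value close to 0, and a more evenly split pattern yields a value closer to 1.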
License
Copyright (c) 2023 Luigi Talamo
This work is licensed under a Creative Commons Attribution 4.0 International License.