Linguistic Combinatorics: Infinity and Human Language

In the following essay, I consider the question of infinity in regard to human languages, and conclude that human language does not allow an infinity of sentential structure. From this conclusion I can go further and suggest (but not prove) that many aspects of our human existence can't be infinite. For instance, take fingerprints, which we all know are unique to every human individual. Can there be an infinity of human fingerprints? The answer is a qualified no. By that I mean there is a potential infinity of human fingerprints, yet no absolute infinity. There can't be, since the human race itself is a finite living species. Our evolutionary form is destined to end at some point in the future, so no matter how many new humans are born, at some point we'll become extinct, either through our own actions or through the extinguishing of our solar system. If human existence were infinite, we could have an infinite number of fingerprints, provided every human being has a unique fingerprint. I'm not sure even this is true. Empirical scientists like geneticists only know from the DNA coding schema that the process of sexual reproduction hasn't so far yielded repeated sets of fingerprints. I am excluding twins, of course. But that could change, and we could have identical fingerprint sets for two individuals. The same remark could be made for any human physical characteristic. This means that, as with human language, there isn't an infinite number of human physical attributes either.


Robleh Wais Wed Apr 22, 2009 10:52 pm

I've just started reading The Language Instinct by Steven Pinker. It's a book on linguistics directed at laymen. I had been enjoying how he shows that we are genetically specified to acquire language. I concur with this notion. Then I came to the chapter entitled How Language Works, and here I had a problem. That problem led me to begin thinking about what at first sight appears to be a simple question.

Are there an infinite number of ways we can combine the words of any language (in our case English) to form meaningful sentences? Simple, it seems, huh? Not so fast. Pinker repeatedly tells us yes, we can make an infinite number of sentences in this or any other language. He uses algebraic combinatorics to show this. But to me this doesn't seem so. The implication of my dissent is that there is a supremum, a least upper bound, on the meaningful statements we can make in any language.

Take English: with 26 letters we can form a large finite set of words. I've read estimates from lexical sources that English has approximately 500,000 to as many as 800,000 words. Arithmetically, the combinations of those words are seemingly infinite. But they are not. If we specify that the sentences formed must be meaningful, the set is limited. What do I mean by meaningful? We will discover that shortly. Pinker gives an example like this in the above-named chapter: The happy dog eats candy. He then goes on to show we could form another sentence by simply adding another happy to the given one. And based on this, any given sentence could be infinitely extended by adding more words. So, using this example, let's say we took the sentence The happy dog eats candy and added every single one of the 800,000 English words to the end of it, one at a time. This would amount to taking The happy dog eats candy + word 1, The happy dog eats candy + word 2, ... The happy dog eats candy + word N, where N runs up to 800,000. So we now have 800,000 new strings built from The happy dog eats candy. We could keep doing that with any given sentence and the number of sentences would grow immensely.
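To put a rough number on that growth, here is a minimal Python sketch. It assumes the upper estimate of 800,000 words quoted above and counts every appended string, meaningful or not; the function name is hypothetical and chosen only for this illustration.

```python
# Assumed vocabulary size: the upper estimate quoted in the essay.
VOCABULARY_SIZE = 800_000


def extended_sentence_count(vocabulary_size: int, appended_words: int) -> int:
    """Count the strings made by appending `appended_words` words,
    each drawn freely from the vocabulary, to one fixed base sentence."""
    return vocabulary_size ** appended_words


one_word = extended_sentence_count(VOCABULARY_SIZE, 1)
two_words = extended_sentence_count(VOCABULARY_SIZE, 2)

print(f"{one_word:,}")   # strings with one appended word
print(f"{two_words:,}")  # strings with two appended words
```

The count is enormous after only two appended words, yet it stays finite for any fixed number of appended words, and it says nothing about how many of those strings mean anything.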

Here is an example of how new words can be created simply by using the concatenation function in MS Excel. Each row joins a letter to a suffix. See the list below:

letter  suffix  result      letter  suffix  result
a       uzzle   auzzle      a       ittle   aittle
b       uzzle   buzzle      b       ittle   bittle
c       uzzle   cuzzle      c       ittle   cittle
d       uzzle   duzzle      d       ittle   dittle
e       uzzle   euzzle      e       ittle   eittle
f       uzzle   fuzzle      f       ittle   fittle
g       uzzle   guzzle      g       ittle   gittle
h       uzzle   huzzle      h       ittle   hittle
i       uzzle   iuzzle      i       ittle   iittle
j       uzzle   juzzle      j       ittle   jittle
k       uzzle   kuzzle      k       ittle   kittle
l       uzzle   luzzle      l       ittle   little
m       uzzle   muzzle      m       ittle   mittle
n       uzzle   nuzzle      n       ittle   nittle
o       uzzle   ouzzle      o       ittle   oittle
p       uzzle   puzzle      p       ittle   pittle
q       uzzle   quzzle      q       ittle   qittle
r       uzzle   ruzzle      r       ittle   rittle
s       uzzle   suzzle      s       ittle   sittle
t       uzzle   tuzzle      t       ittle   tittle
u       uzzle   uuzzle      u       ittle   uittle
v       uzzle   vuzzle      v       ittle   vittle
w       uzzle   wuzzle      w       ittle   wittle
x       uzzle   xuzzle      x       ittle   xittle
y       uzzle   yuzzle      y       ittle   yittle
z       uzzle   zuzzle      z       ittle   zittle
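The same list can be regenerated outside a spreadsheet. The following is a small Python sketch, not the Excel formula itself; the function name and the tuple of suffixes are labels introduced here for illustration.

```python
import string

# The two endings used in the list above.
SUFFIXES = ("uzzle", "ittle")


def concatenate_candidates(suffixes=SUFFIXES):
    """Join every lowercase letter to every suffix, letter first."""
    return [
        letter + suffix
        for letter in string.ascii_lowercase
        for suffix in suffixes
    ]


candidates = concatenate_candidates()

print(len(candidates))  # 26 letters x 2 suffixes
print(candidates[:4])   # first few candidate words
```

Only a handful of the results (puzzle, little, muzzle, and a few others) are existing English words; the rest are candidate words waiting for someone to give them a meaning.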

It is clear most of these words have no semantic meaning in English. But they could! I could begin a new slang expression simply by using something like zittle to mean maybe a weird person. Still, even with this word creation we are limited in how far this could go.

Or better yet, take the permutations of, say, a 15-word English sentence like: I know this would be a very large permutation indeed, if we do the calculation. The formula is P(n, r) = n!/(n - r)!, which with r = n reduces to n!. In this case n = 15, and 15! equals 1,307,674,368,000. Notice I've been very careful to choose a sentence that is meaningful to the topic discussed and has no repeated words, as this permutation formula requires. Now, that's just a 15-word sentence! How 'bout a 30-word sentence? We won't even do the calculation. This process could be repeated infinitely. Would all of these constructions make sense? Let me show how one permutation wouldn't. Take the same sentence ordered as such: would calculation a be this if very we permutation indeed large do the know I. Would you understand this had you not seen the proper order of the sentence? Decidedly not, I'm sure.

Pinker points out that the integers become an infinite set by simply adding 1 to any given number. He doesn't explicitly state this, but with a roundabout explanation this is what he is indicating. Yet constructions in spoken or written language must make sense to the listener or reader. That the combinations are endless doesn't make every combinatorial construction valid in a language. Why doesn't he see this? What he forgets to note is that the integers form what mathematicians call an infinite ring. Since he is not a mathematician, I can forgive him for this glaring mistake in his book. Anyway, we can take the digits 0 to 9 and keep augmenting them by using a rule of repetition; not so with human languages. For instance, it is easy to imagine a string of digits that goes on forever. In fact, we all know a non-integer whose decimal expansion does this, e.g. π. Can you imagine an endless word that would be meaningful in any language? Of course not! Take the word good, and suppose we repeated it like so and considered it ONE word:

good good good good good good good good good good good good good good good good good good good good good good ....Going on forever.
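As a check on the permutation arithmetic for the 15-word sentence discussed earlier, here is a minimal Python sketch. It assumes punctuation is stripped before counting words, and it uses a fixed random seed (an arbitrary choice made here) so that the scrambled ordering is the same on every run.

```python
import math
import random

SENTENCE = "I know this would be a very large permutation indeed, if we do the calculation."

# Strip the comma and the final period, then split into words.
words = SENTENCE.replace(",", "").rstrip(".").split()

# The n! formula only applies when no word is repeated.
assert len(words) == len(set(words))

orderings_15 = math.factorial(len(words))  # orderings of the 15-word sentence
orderings_30 = math.factorial(30)          # orderings of any 30 distinct words

# Produce one scrambled ordering of the same words.
rng = random.Random(0)
shuffled = words[:]
rng.shuffle(shuffled)

print(f"{orderings_15:,}")  # 1,307,674,368,000
print(f"{orderings_30:,}")
print(" ".join(shuffled))
```

Nearly every ordering the shuffle can produce reads like the scrambled example: the count of arrangements is huge, while the count of arrangements a reader would accept as a sentence is tiny by comparison.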

Is There Conceptual Infinity to Human Language?

Would this endlessly repeated good then mean the greatest good? Yes, the pun is intended here. No, it wouldn't mean any such thing. It would become meaningless, idiotic, the babble of the insane, and, for one thing, physically impossible to produce. An infinite word is only a conceptual possibility, not a reality. And with this comment, it would appear that infinite sentences are just conceptual possibilities that we can well understand, yet not real-world constructions we can experience and know perceptually. Underlying my point is the plain fact that the concept of a word necessarily means limitation. We conceive of a word as a finite unit of sound and meaning, built, in linguistic jargon, from phonemes and morphemes. So too is a sentence. This is true because we only perceive a limited set of things and our conceptions are built from our perceptual experiences. It doesn't mean we can't conceive of an infinity. It simply means we can't experience such a state. And thus our languages cannot allow us to experience such a state either.

There are those that construct clever sentences that appear to go on infinitely. These sentences are conceptually infinite, or so it seems. Most of these constructions use a subject that refers to itself and names an object. For instance: The man that saw a man, and that man saw a man, and that man saw a man. This sentence starts with a subject, i.e. man, and each succeeding reference to this noun subject refers to the previous man. It is a reflexive structure. The initial man subject is the base from which all others are generated. Sentences of this type have the appearance of infinite construction. Yet they are not infinite. Sentences of this type run smack into some form of the Burali-Forti Paradox. This paradox occurs in the theory of transfinite ordinal numbers. It applies to numbers and can just as easily apply to words in a sentence structured in ordinal sequence.

The Burali-Forti Paradox (BFP) can be applied to the simple infinite sentence cited above. First, we must have an idea of what this paradox means.

BFP creates a paradox from 3 existence conditions:

1. Every well-ordered set has a unique ordinal number.

Well-ordered means the set is ordered so that every non-empty part of it has a least member; for our purposes, think of it as arranged from least number to greatest. The ordinal number of such a set marks the position that comes after every number in it. Thus the 5th ordinal applies to the set (1,2,3,4).

2. Every segment of ordinals has an ordinal number, which is greater than any ordinal in that segment.

This means a run of ordinals arranged from least to greatest has an ordinal that is not a part of it and is greater than every ordinal in it. The example above applies again. The segment (1,2,3,4) of the natural numbers has the 5th ordinal, which is greater than any of the ordinals in the segment. It is, in effect, the number that names all the numbers in order before itself.

3. There is a set B of all ordinals in natural order (least to greatest) that is well-ordered (every non-empty part of it has a smallest member).

Here is where a contradiction arises. By condition 1, B has an ordinal number, say x. By condition 2, x is greater than every ordinal in B. But B contains all ordinals, so x is itself in B, and x would have to be greater than itself. This contradiction arises precisely because B is meant to hold every ordinal, an unbounded, infinite collection. For any segment of it, there is always a successor ordinal that is larger than every ordinal in the segment and not a part of the segment.
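For readers who prefer symbols, the three conditions and the contradiction can be laid out step by step. This is a sketch in standard notation; the symbols $\operatorname{ord}(\cdot)$, $\alpha$, and $x$ are labels chosen here for illustration, and "segment" is read as a run of ordinals that contains everything below each of its members.

```latex
\begin{align*}
&\text{(1)}\quad \text{every well-ordered set } S \text{ has a unique ordinal number } \operatorname{ord}(S) \\
&\text{(2)}\quad \text{for every segment } S \text{ of ordinals: } \alpha < \operatorname{ord}(S) \text{ for all } \alpha \in S \\
&\text{(3)}\quad B = \{\alpha : \alpha \text{ is an ordinal}\} \text{ is a well-ordered set} \\[6pt]
&\text{By (1) and (3), } x = \operatorname{ord}(B) \text{ exists.} \\
&\text{By (2), } \alpha < x \text{ for all } \alpha \in B. \\
&\text{Since } x \text{ is an ordinal, } x \in B, \text{ hence } x < x, \text{ which is impossible.}
\end{align*}
```

The finite example fits the same pattern without trouble: for the segment (1,2,3,4), the ordinal 5 is greater than every member and lies outside the segment. The trouble only starts when the segment is supposed to contain every ordinal there is.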

We can apply this to the sentential string example. To see this, we will number the objects of these sentences as such:

The man that saw a man (1), and that man saw a man (2), and that man saw a man (3), and that man saw an Nth man (where Nth represents every succeeding sighting of a man to infinity).

Generating this set infinitely, we would have a subset of the natural numbers, well-ordered like such: {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,...}. It should be noted here that 0, which some do not count among the naturals, is not in this subset, since there is no 0th man that saw anything. By the Burali-Forti Paradox, the ordinal of this whole set would have to lie outside the infinite set of naturals. This would mean that there exists a man such that that man DID NOT see the previous man in the order specified. Which, in turn, would mean that even conceptually this infinite sentence would break down in its meaningfulness. Or, to state it differently, it would not be true and thus not an infinite sentence. For those that follow Cantor's proof of transfinite numbers, note that though Cantor proved there are infinite sets larger than the infinity of the natural numbers, he was never able to determine the order of these sets. Thus BFP is not subject to this proof. Even if there is an Nth man beyond the infinite that saw a man, we cannot determine his ordinality.

I'm not even sure that all human languages are potentially infinite. We know new words are invented all the time. Even that is a questionable route to infinity, because for word creation to make an infinite set of meaningful sentences we'd have to do it in infinite time. Moreover, the word creation couldn't be infinite unless we started making repetitive words using augmentation rules like those of arithmetic. I believe we'd start creating meaningless words. Example: say we want a new word. Take the word fell and add another l to get felll, but that would not be a meaningful English word. Since the set of letters we have now is limited (26 in English), the corresponding combinations (words) are too, unless we allow unlimited repetition of letters, as we do with digits in integer arithmetic. And then the infinite creation of new words would lead to meaningless arrays of symbols. There is a philosophic implication here: within infinite sets are finite sets. If we allow infinite members to be in a set (by repetition of letters or other devices) we begin to lose meaning. It's worse if we have infinite symbols (I'm referring to the letter symbols). Say we wanted the 26-letter English alphabet increased infinitely. This could in fact be done. Add another curve to B, or an additional line to T, etc. We could increase the primitives that compose words infinitely. If we have an infinite variety of symbols, we could never make one symbol unique. If an infinitude of symbols could stand for one of our most basic pronouns, I, how could we ever uniquely specify ourselves? By analogy, if we had an infinity of words, how could we ever specify, say, tree? This is loosely reminiscent of the Axiom of Choice in set theory, which says, in non-technical language, that from any collection of non-empty sets we can pick out one member of each. Picking out presupposes telling members apart. For example, in a set (a,b,c) we must be able to indicate that a is not b, b is not c, c is not a, etc.
The paradoxical consequence of an infinite alphabet is that every member WOULD BE UNIQUE, and thus no member could be used uniquely.
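The earlier point, that a limited set of letters yields a limited set of words, can be made concrete once word length is capped. Below is a minimal Python sketch under that assumption. The essay names no maximum word length, so the caps used here are hypothetical, and the count includes every letter string, meaningful or not.

```python
ALPHABET_SIZE = 26  # letters in the English alphabet


def strings_up_to(max_length: int, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Count all letter strings of length 1 through max_length."""
    return sum(alphabet_size ** length for length in range(1, max_length + 1))


print(strings_up_to(1))            # 26
print(strings_up_to(2))            # 702
print(f"{strings_up_to(10):,}")    # every string of up to ten letters
```

For any fixed cap the total is finite. It only becomes unbounded if the cap is removed, which is the unlimited repetition of letters that produces strings like felll.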

Setting aside whether there are infinite words, languages as they stand now certainly don't allow an infinite set of sentences if those sentences are required to be meaningful in said language. Here is where I know the objection can be raised: just what do you mean by meaningful? In fact, Pinker again is himself helpful on this point. He gives an example from Noam Chomsky of a grammatically correct yet meaningless sentence: Colorless green ideas sleep furiously. This is an example of a sentence I'd say could be formed, that is not meaningful and thus not a valid sentence. Now what do I mean by meaningful? I am referring to sentences that we can experience in our perceptual world. This perception applies to imagined sentences too; an imagined state of being is still a perceptual state of being. It is meaningful in the sense that it is a state of being that we understand through our experiences in the physical world, no matter how counter-real it is. Think of a dream in which you have encounters that are counter-factual: for instance, you fly in the air, or you see beings that don't exist in the concrete physical world. I would still say this dream-state experience is rooted in your conscious experiences; it is determined by them and thus subject to them. You could create ideas, images, even fantastical tales of things you never physically experienced, yet you could never do this without reference to the perceptual experiences you have already had, in the language you speak and read. The very nature of your state of being traps you in this respect. In simple terms, to have infinite linguistic license, we must be infinite beings, with experiences of infinite states of being. This we do not have. An infinite being would have no need to use the first-person pronoun I to name itself; it could have... well... an infinite number of names for its self-reference. We mere humans can't. This is the point of meaningful. There is no infinite being, by the way.

The profound conclusion of what I'm suggesting is this: We can only say so much in a given language. There is only so much we can actually write and say. It's huge in its finitude, but it is not endless. To get an idea of the enormity of our language, look at the estimate by Robert Mannell, a linguist from Australia who also believes in the finitude of language: Infinite Number of Sentences. I love this site and the arithmetic estimate approach it takes. It makes us realize how great our human capacity for language is, while noting that it's not the God-like infinity we believe it to be. Any mathematician knows this for sure. Dr. Mannell gives you its arithmetical scope. Some would make the leap and declare: well, that means we can only have a finite number of thoughts too. This would be a fallacious deduction. Pinker again is instructive here. He shows that words and their supersets, sentences, are not at heart what thought is. Thought is more a brain activity that manipulates representations of reality. For instance, you can think left and right without words for those geometric directions. Or you can feel thirsty without having a word to express this desire for water. What it does imply is that not every thought can be articulated. I can see this as being true. I have had thoughts, or to use the colloquial term, feelings, that I cannot express in spoken or written language. Even more interesting is the question: do we need to name things to have a concept of multitude? As with direction in space, we perceive separate objects with our eyes. Can we understand the concept of numeration without a system of naming those multitudinous things? I believe we can. It would be difficult to recognize 10 things without 10 names for them, but it could be done. Now I'm moving off point, so let's get back to the real issue.


We could be limited in our thought process too. If the idea that our brains are discrete state machines (sometimes called finite state machines), which can process units of input to our brains, is true, then there is an upper limit on what we can think too.

Other Articles of Interest

English is a Germanic language and thus still compounds its words. Is it possible to take any English compound word, reverse its parts, and form a meaningful sentence with the new word? I look at 190 bi-compound words in the brief study below and find some unexpected results. I plan to augment it later with some projections as to how much compounding we can do with English. Again, it's not infinite. Not every English word can marry another and make sense. What we can consider a meaningful compound gets very subjective. For instance, take swallowdead, which I just made up: is that meaningful in a poetic sense?

Brief Analysis of English Compound Words
