Multi-Fractal Analysis…?

Posted on Aug 19, 2009 in Esperanto Language

An article published last year in the New Scientist Technology Blog suggests that Esperanto is fundamentally different from other languages. Astonishing! To be frank, I have no idea what multi-fractal analysis has to do with linguistic comparison, but the author of the post seems to have drawn a conclusion the rest of us missed. If you can follow it, I’m sure it’s an interesting read. For my part, I find the commentary from equally confused or unreceptive readers far more amusing!

In the meantime, I’m testing the Esperanto features (spell-check and character support) of the latest build of Sun Microsystems’ freeware office software suite, OpenOffice. The overview and examination will be posted in a forthcoming article. Until then, stay cool!

  1. Johano:

    It’s not surprising that Esperanto’s different from English. Any language other than English would give the same result! They never compared the book with Spanish, Russian, Swahili, Arabic or Mandarin.

    There was a discussion about this a while ago at lernu:

  2. Indiana:

    By the way, they did do further comparisons after the paper that article was about. In the later paper they used different statistical methods, and found that the Esperanto translation was “qualitatively” similar to the English translation.

    If I were to hazard a guess, Johano, I would guess that even though they didn’t compare the text to other languages in this paper, those kinds of experiments have been done before. I make this guess because the abstract says that the Esperanto results are “extreme”, which is kind of a weird judgement to make with a sample size of 2. It kinda implies that they know what kinds of values are “normal”, and the English text results are in that typical range while the Esperanto ones are not.

    I would also hazard a guess that any anomalous results one gets from Esperanto as compared to natural languages are due to the fact that Esperanto uses a much, much smaller subset of “common words” much, much more often. For instance, in English, one might expect common words to be “a, the, is, are” whereas in Esperanto you would have just “la, estas”… but those words appear more often (because the number of “estas” would equal the number of “is” and “are” combined). The situation would be more extreme for other languages (for example, French would have “la, le, les, suis, es, est, sommes, êtes, sont” all compacting down to “la, estas”). That pattern, a smaller “core” vocabulary used for a wider range of situations, might be what makes Esperanto look “artificial” by their statistical estimate, compared to English. When I get back to campus where I have access to these papers, I’ll see if I can’t give some more info.

    Incidentally, Lex, thank you for your work on the OpenOffice Esperanto plugin.

  3. Johano:

    Of course, a Chinese core vocabulary would also be much smaller.

  4. Nick Nicholas:

    An ex-linguist and lapsed Esperantist speaking here. The paper is breathtakingly superficial, enough to call out in public, and I’m tempted to write my own blog post saying so.

    Esperanto’s an agglutinative language with a different profile of function words. The behaviour of its word length and word frequency will *of course* look different. The sarcasm of other commenters about this is deserved.

    It’s also very far from coincidental that statistically the profile of Esperanto most resembles German: German was the prestige language of the East European milieu of Classic Esperanto (Zamenhof and Kabe), and there was a lot of stylistic and lexical calquing from German in the formative periods of the language.

    Put Turkish, Inuit, and Chinese into the mix, then you can say something sensible about whether Esperanto is all that different from natural languages. (And to make sure you’re really saying something meaningful there, add in Lojban and Klingon.)

    In any case, keep up the good work!
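
Indiana's guess in comment 2, that collapsing several function words onto one form concentrates the frequency counts, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the sample sentence is invented, the mapping is just the word list from that comment, and the "top share" measure is a simple stand-in, not the multi-fractal statistic used in the paper. The collapsed text is also not real Esperanto; only the function words are merged.

```python
from collections import Counter

# Hypothetical toy mapping (an assumption): the French function words listed
# in the comment all collapse onto the two Esperanto forms "la" and "estas".
FRENCH_TO_ESPERANTO = {
    "la": "la", "le": "la", "les": "la",
    "suis": "estas", "es": "estas", "est": "estas",
    "sommes": "estas", "êtes": "estas", "sont": "estas",
}


def collapse(tokens, mapping):
    """Replace each token by its mapped form; unmapped tokens stay unchanged."""
    return [mapping.get(token, token) for token in tokens]


def top_share(tokens, n=2):
    """Fraction of all tokens covered by the n most frequent word types."""
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(n))
    return covered / len(tokens)


# Invented sample text, used only to make the effect visible.
french = "le chat est noir les chiens sont grands la maison est petite".split()
collapsed = collapse(french, FRENCH_TO_ESPERANTO)

print(len(set(french)), top_share(french))        # distinct types, top-2 share
print(len(set(collapsed)), top_share(collapsed))  # fewer types, larger share
```

With the function words merged, the text has fewer distinct word types and the two most common ones account for a larger share of all tokens, which is the kind of shift a frequency-based statistic could pick up. Whether that explains the paper's result is Indiana's conjecture, not something this sketch establishes.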