Language Learners on Amikumu


I was happy when the app Amikumu released statistics about its members. Amikumu is a social platform where users can create a profile and contact nearby users who speak the same language. It is aimed at people learning uncommon languages. The app has become massively popular in the Esperanto community, with 7,672 of the ~10,000 users indicating skills in Esperanto (and I am of course one of them!). The released statistics therefore give us a unique insight into the Esperanto community. I will compare the languages using two measures:

  1. The number of learners.
  2. The average level of the learners.

In the app, users indicate their speaking ability in a language using one of four categories: Beginner, Intermediate, Advanced and Fluent. I define a ‘Learner’ as a user choosing one of the first three categories. I calculate the average learner level from the proportions of learners in those three categories (see the method section for more).

Every language is represented by a flag. There is a list of the languages below. I adjusted the position of some of the flags slightly to lessen the overlaps. The x axis is the composite variable described in the method section. A value of -0.61 means that at least 87% are beginners, while 0.27 means that less than 31% are beginners. Please look at the Copyrights Section where I attribute some flag artists and discuss copyright.
I had to be creative to give every language a distinct flag. The Ancient Greek flag is the Vergina Sun, which one normally associates with Macedonia. I chose American Sign Language to be represented by a hand. Norwegian Bokmål is obviously also my doing.

Amikumu has spread through the Esperanto community, so it is not surprising that Esperanto is a popular language in the app. The level of Esperanto is not high compared to the level of English. It reflects how important English is, even for Esperanto speakers. I am also surprised by the number of learners of Scandinavian languages, because there are not a lot of native speakers. It could be due to Scandinavians themselves learning other Scandinavian languages. The second-most popular constructed language is Toki Pona and not Lojban or Klingon, as I expected. However, besides Esperanto, the language capabilities of the learners of constructed languages are quite low. What do you see in the plots? Please share in the comments!

Copyrights

I used versions of the national flags which belong to the public domain. They can be easily downloaded on Flagpedia. Some of the other flags had other languages

The Klingon flag is protected by a commercial copyright, so I claim that my use falls under ‘Fair Use’ like Wikipedia does.

Legal uses of national flags are specified in national laws, and I am not completely sure I comply with all of them.

NOTE: If you want to use the graphics in this blog post, you have to release the result under a CC BY-SA license. You also have to explicitly attribute Smashicons, Flaticon, S.fuer, SilentResident, Martorell, Daniele Schirmo, Antonio Martins, Karel Podrazil and me. Hopefully, it won’t be necessary to reconsider the ‘Fair Use’ criteria of the Klingon flag, because I already did so. In theory you should also make sure that you don’t break national flag laws.

(Normally, you only have to attribute me if you use graphics from this blog. You can also normally choose whatever license you like because I have declared a CC BY license. It is hard to say exactly how I am allowed to use some datasets, so I can’t promise that I am perfectly within the lines.)

Method

For, say, English, I calculated the number of learners as

\displaystyle \textup{Learners}=\textup{Advanced}+\textup{Intermediate}+\textup{Beginner}=2566+1673+681=4920

using statistics from Amikumu. The average language level is calculated as

\displaystyle \textup{Level}=\frac{c_1\cdot\textup{Advanced}+c_2\cdot\textup{Intermediate}+c_3\cdot\textup{Beginner}}{\textup{Learners}}
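As a minimal sketch, the two measures can be computed in Python as below. The counts and the coefficients are copied from the figures quoted in this method section. The names `COUNTS`, `COEFFS`, `learners` and `level` are made up for illustration and do not come from Amikumu.

```python
# Counts per language as (Advanced, Intermediate, Beginner),
# copied from the matrix shown in this post.
COUNTS = {
    "English": (2566, 1673, 681),
    "Esperanto": (1275, 2148, 3137),
    "Halkomelem": (0, 0, 1),
}

# (c_1, c_2, c_3) as quoted in this post.
COEFFS = (0.71802147, -0.01576322, -0.69584244)


def learners(counts):
    """Number of learners: Advanced + Intermediate + Beginner."""
    return sum(counts)


def level(counts, coeffs=COEFFS):
    """Weighted average level of the learners of one language."""
    total = learners(counts)
    if total == 0:
        raise ValueError("a language needs at least one learner")
    weighted = sum(c * n for c, n in zip(coeffs, counts))
    return weighted / total


for language, counts in COUNTS.items():
    print(language, learners(counts), round(level(counts), 2))
```

With these numbers, English comes out at roughly 0.27 and Esperanto at roughly -0.20.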

It is not obvious how to choose the constants c_1,c_2,c_3. I ended up using the second principal component of the matrix

Language\Level Advanced Intermediate Beginner
English 2566 1673 681
Esperanto 1275 2148 3137
Halkomelem 0 0 1

The coefficients are therefore (c_1,c_2,c_3)=(0.71802147, -0.01576322,-0.69584244). The first principal component is close to a perfect average of the 3 categories. I chose this to subtly indicate that the two axes of the plot are almost independent.
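The beginner shares quoted for the x axis can be checked from these coefficients. The sketch below assumes that the level is the weighted sum of the three shares and that c_1 > c_2 > c_3. For a fixed beginner share b, the level is lowest when all other learners are intermediate and highest when they are all advanced, which bounds b from below and from above. The function name `beginner_share_bounds` is hypothetical.

```python
# (c_1, c_2, c_3) as quoted in this post.
COEFFS = (0.71802147, -0.01576322, -0.69584244)


def beginner_share_bounds(level, coeffs=COEFFS):
    """Smallest and largest beginner share compatible with a given level.

    Assumes shares (advanced, intermediate, beginner) are non-negative,
    sum to 1, and that c1 > c2 > c3.
    """
    c1, c2, c3 = coeffs
    if not (c3 <= level <= c1):
        raise ValueError("level is outside the attainable range")
    # level >= c2*(1 - b) + c3*b  (everyone else intermediate)
    lower = (c2 - level) / (c2 - c3)
    # level <= c1*(1 - b) + c3*b  (everyone else advanced)
    upper = (c1 - level) / (c1 - c3)
    return max(0.0, lower), min(1.0, upper)


low_end = beginner_share_bounds(-0.61)
high_end = beginner_share_bounds(0.27)
print("level -0.61: beginner share between", low_end)
print("level  0.27: beginner share between", high_end)
```

For a level of -0.61 the lower bound is about 0.87, and for a level of 0.27 the upper bound is a little under 0.32.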
