Firebase ML Kit 101 : Language Identification

Hitanshu Dhawan
Published in AndroIDIOTS
3 min read · Feb 15, 2019

Language Identification is the process of determining the language of a text.

Nowadays, mobile apps are used in every part of the world, by different users, speaking different languages. Language Identification can help you understand your user's language and personalize your app based on it.

With ML Kit’s Language Identification API, you can identify 100+ different languages in both native and romanized script.

Firebase ML Kit Series

In this series of articles, we will deep dive into different ML Kit APIs that it offers…

Let’s look into the ML Kit’s Language Identification API and how we can integrate it into our apps.

Let’s Code!

Step 1 : Add Firebase to your app

Of course! You can add Firebase to your app by following the steps mentioned here.

Step 2 : Include the dependencies

You need to include the ML Kit dependencies in your app-level build.gradle file.

dependencies {
    // ...
    implementation 'com.google.firebase:firebase-ml-natural-language:18.1.1'
    implementation 'com.google.firebase:firebase-ml-natural-language-language-id-model:18.0.2'
}

Step 3 : Get! — the Text

The Language Identification model takes the text to identify as a String. Whether you get this text from an EditText or a Speech-to-Text API is up to you.

Step 4 : Set! — the Model

Now it’s time to prepare our Language Identification model.

val languageIdentifier = FirebaseNaturalLanguage.getInstance()
    .languageIdentification

You can also change the confidence threshold of your language identification model by passing a FirebaseLanguageIdentificationOptions object when you request the identifier.

val options = FirebaseLanguageIdentificationOptions.Builder()
    .setConfidenceThreshold(0.2F)
    .build()
val languageIdentifier = FirebaseNaturalLanguage.getInstance()
    .getLanguageIdentification(options)
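
To get a feel for what the threshold changes, here is a minimal sketch in plain Kotlin. It only imitates the documented behaviour (a result below the threshold is reported as und) using made-up scores. Candidate and pickLanguage are hypothetical names for this illustration, not part of ML Kit, and this is not how the model works internally.

```kotlin
// Illustration only: mimics the effect of the confidence threshold with
// made-up candidate scores. This is NOT ML Kit's implementation.

// Hypothetical stand-in for one candidate language and its confidence score.
data class Candidate(val languageCode: String, val confidence: Float)

// Returns the code of the most confident candidate, or "und" when no
// candidate reaches the threshold (or when there are no candidates at all).
fun pickLanguage(candidates: List<Candidate>, threshold: Float): String {
    val best = candidates.maxByOrNull { it.confidence }
    return if (best != null && best.confidence >= threshold) best.languageCode else "und"
}
```

With the made-up candidates ("en", 0.35) and ("nl", 0.25), a threshold of 0.5 gives und, while the 0.2 threshold used above gives en. A lower threshold means fewer und results, at the cost of more uncertain guesses.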

Step 5 : Gooo!

Finally, we can pass our text to the model for Language Identification.

languageIdentifier.identifyLanguage(text)
    .addOnSuccessListener {
        // Task completed successfully
    }
    .addOnFailureListener {
        // Task failed with an exception
    }

Step 6 : Extract the information

Voilà! That’s it!
If the language identification was successful, the success listener will receive a BCP-47 language code for that language. If the model didn't detect any language, the success listener will receive und (undetermined).

The complete list of all supported languages can be found here.

You can extract this information like this.
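
One possible way to handle the result is sketched below. It assumes a hypothetical helper named describeLanguage (not part of ML Kit) that turns the BCP-47 code into a readable message using java.util.Locale. The commented wiring also assumes a hypothetical resultTextView in your layout.

```kotlin
import java.util.Locale

// Hypothetical helper (not part of ML Kit): turns the BCP-47 code delivered to
// the success listener into a message that can be shown to the user.
fun describeLanguage(languageCode: String): String {
    // "und" is what the success listener receives when no language is detected.
    if (languageCode == "und") return "Could not identify the language"
    // java.util.Locale parses BCP-47 tags and provides English display names.
    val name = Locale.forLanguageTag(languageCode).getDisplayLanguage(Locale.ENGLISH)
    // Fall back to the raw code if no display name is available.
    return if (name.isBlank()) "Language: $languageCode" else "Language: $name ($languageCode)"
}

// Assumed wiring with the listeners from Step 5 (resultTextView is hypothetical):
// languageIdentifier.identifyLanguage(text)
//     .addOnSuccessListener { languageCode ->
//         resultTextView.text = describeLanguage(languageCode)
//     }
//     .addOnFailureListener { e ->
//         resultTextView.text = "Identification failed: ${e.message}"
//     }
```

For example, describeLanguage("en") returns "Language: English (en)", and describeLanguage("und") returns the fallback message. Keeping the formatting in a small pure function makes it easy to unit test without an Android device.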

Have a Look!

This is what you can achieve with ML Kit’s Language Identification API.

The full source code, along with examples of other ML Kit APIs, can be found here!

Thanks for reading! Share this article if you found it useful.
Please do Clap 👏 to show some love :)

Let’s become friends on LinkedIn, GitHub, Facebook, Twitter.

Hitanshu Dhawan
Senior Software Engineer @ Urban Company | Google Certified Android Developer