Make your own Music Recognition App with ACRCloud

Javier Hernández
5 min read · Apr 4, 2020


Have you ever thought about creating your own music recognition app like Shazam or SoundHound? Now you can, thanks to an online music recognition service called ACRCloud.

In this case I am going to teach you how to do it on Android using Kotlin, but you could also integrate it into iOS or any other platform thanks to its Web API.

The best thing is that you can sign up for the free plan to do your first tests. It has a limit of 100 requests per day, enough to start your project.

Signing up and creating the project

The first step is to sign up on its website so we can access the console.

Once here, we can see the different services they offer.

ACRCloud console overview

In our case we are interested in “Projects > Audio & Video recognition”.

Here we must create a new project by pressing “Create Project”. We give it a name and choose “Recorded audio”, since we are going to recognize the music we record through the microphone of our device. As “bucket” we must choose “ACRCloud Music”. Optionally, we can check “3rd party integration” if we want links to YouTube and Spotify, among others.

Once created, it will give us an “Access Key” and an “Access Secret”. We will use both of them in our Android project.

SDK Download and Integration

The next step is to download the Android SDK from the following GitHub repository.

As you can see, we have several resources: an example project in Java, an example file in Kotlin, the source code of the native libraries, documentation and the libraries themselves.

The libs folder is essentially separated into a Java SDK (the .jar file) and native code in the form of libraries (.so files) for the different architectures.

Well then, let’s create a new blank project in Android Studio.

Then we can move the .jar file to the “app/libs” folder.

libs folder
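Copying the .jar by itself is not enough: we also have to declare it as a Gradle dependency. A minimal sketch, assuming the Gradle Kotlin DSL (with the Groovy DSL the equivalent is implementation fileTree(dir: 'libs', include: ['*.jar'])):

// app/build.gradle.kts — pick up any .jar we copied into app/libs
dependencies {
    implementation(fileTree("libs") { include("*.jar") })
}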

For native libraries we must first create a folder called jniLibs inside “app/src/main/”. Then we must copy the folders of the architectures we want inside, leaving the following structure:

jniLibs folder
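For reference, depending on which architecture folders you copy from the SDK’s libs directory, the layout would look roughly like this (one subfolder per ABI, each containing the ACRCloud .so file):

app/src/main/jniLibs/
├── arm64-v8a/
├── armeabi-v7a/
└── x86/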

Android coding

We already have the SDK ready. Now let’s start developing our application.

The first thing we should know is that we are going to need microphone permission to record the audio and Internet permission to communicate with the ACRCloud Server. So in the manifest file we add the following:

<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />

Since audio recording is a “dangerous” permission, we must request it at runtime. So we are going to create a button in activity_main.xml to enable this permission:

<Button
    android:id="@+id/permission"
    android:layout_width="wrap_content"
    android:layout_height="wrap_content"
    android:text="Enable mic" />

We can show or hide it depending on whether or not we have this permission. Also, we should request permission when pressing the button. A quick and easy way to implement it would be with these functions:

fun checkPermission() {
    // `permission` is the Button defined above (referenced via Kotlin synthetics)
    if (ContextCompat.checkSelfPermission(applicationContext, Manifest.permission.RECORD_AUDIO)
        != PackageManager.PERMISSION_GRANTED) {
        permission.visibility = View.VISIBLE
        permission.setOnClickListener {
            ActivityCompat.requestPermissions(this, arrayOf(Manifest.permission.RECORD_AUDIO), 100)
        }
    } else {
        permission.visibility = View.GONE
        hasPermission()
    }
}

fun hasPermission() { ... }

override fun onRequestPermissionsResult(
    requestCode: Int,
    permissions: Array<out String>,
    grantResults: IntArray
) {
    super.onRequestPermissionsResult(requestCode, permissions, grantResults)
    if (requestCode == 100) {
        checkPermission()
    }
}
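The snippets above assume this is all wired up when the activity starts. A minimal sketch of that wiring (initAcrcloud() is defined in the next section):

override fun onCreate(savedInstanceState: Bundle?) {
    super.onCreate(savedInstanceState)
    setContentView(R.layout.activity_main)
    initAcrcloud()     // configure the ACRCloud client (see below)
    checkPermission()  // show the button, or go straight to hasPermission()
}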

If we already have the microphone permission, we can start recognizing audio. So we go back to the layout and create a Button and a TextView to show the result.

<TextView
    android:id="@+id/result"
    android:layout_width="wrap_content"
    android:layout_height="wrap_content"
    android:textSize="21sp"
    android:textStyle="bold"
    android:layout_margin="8dp" />

<Button
    android:id="@+id/recognize"
    android:layout_width="wrap_content"
    android:layout_height="wrap_content"
    android:text="Recognize" />

Now we can initialize ACRCloud configuration by copying the following code to our main Activity:

private var mClient: ACRCloudClient? = null

fun initAcrcloud() {
    val config = ACRCloudConfig()

    config.acrcloudListener = this
    config.context = this

    // Please create a project in "http://console.acrcloud.cn/service/avr".
    config.host = "XXXXXX"
    config.accessKey = "XXXXXX"
    config.accessSecret = "XXXXXX"

    config.recorderConfig.rate = 8000
    config.recorderConfig.channels = 1

    mClient = ACRCloudClient()
    if (BuildConfig.DEBUG) {
        ACRCloudLogger.setLog(true)
    }
    mClient?.initWithConfig(config)
}

For now we will leave the host, accessKey and accessSecret values in plain text, but you should not ship them like this in production: keep them out of source control and inject them at build time.
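One common approach (a sketch, not the only option) is to keep the credentials in local.properties, which is not committed to version control, and expose them as BuildConfig fields from app/build.gradle.kts. The property names acr.host, acr.accessKey and acr.accessSecret are made up for this example:

import java.util.Properties

// Read the (untracked) local.properties file from the project root
val localProps = Properties().apply {
    val file = rootProject.file("local.properties")
    if (file.exists()) file.inputStream().use { load(it) }
}

android {
    defaultConfig {
        buildConfigField("String", "ACR_HOST", "\"${localProps.getProperty("acr.host", "")}\"")
        buildConfigField("String", "ACR_ACCESS_KEY", "\"${localProps.getProperty("acr.accessKey", "")}\"")
        buildConfigField("String", "ACR_ACCESS_SECRET", "\"${localProps.getProperty("acr.accessSecret", "")}\"")
    }
}

With that in place, initAcrcloud() would read config.host = BuildConfig.ACR_HOST, and so on, instead of hard-coded strings.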

Let’s start the recognition when the button is pressed. It will look something like this:

fun hasPermission() {
    recognize.setOnClickListener {
        startRecognition()
    }
}

fun startRecognition() {
    mClient?.let {
        if (it.startRecognize()) {
            result.text = "Recognizing..."
        } else {
            result.text = "Init error"
        }
    } ?: run {
        result.text = "Client not ready"
    }
}

To receive the result, our activity must implement the “IACRCloudListener” interface. The recognition results arrive in the “onResult()” callback:

class MainActivity : AppCompatActivity(), IACRCloudListener {

    // ...

    override fun onResult(acrResult: ACRCloudResult?) {
        acrResult?.let {
            Log.d("ACR", "acr cloud result received: ${it.result}")
            handleResult(it.result)
        }
    }

    override fun onVolumeChanged(vol: Double) {
        Log.d("ACR", "volume changed $vol")
    }

    fun handleResult(acrResult: String) { ... }
}

If we run the app, we will see in the log that the result is a string in JSON format, so we must parse it to extract the data we want. An example of the JSON output can be found on the ACRCloud documentation page.
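As a quick orientation, the parts of the response we will read below have roughly this shape (heavily trimmed; the real response contains many more fields, as the documentation shows):

{
  "status": { "code": 0 },
  "metadata": {
    "music": [
      {
        "title": "…",
        "artists": [ { "name": "…" } ]
      }
    ]
  }
}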

Suppose we are only interested in the title and the main artist of the first result, which is also the most relevant one. For this we would do the following:

fun handleResult(acrResult: String) {
    var res = ""
    try {
        val json = JSONObject(acrResult)
        val status: JSONObject = json.getJSONObject("status")
        val code = status.getInt("code")
        if (code == 0) {
            val metadata: JSONObject = json.getJSONObject("metadata")
            if (metadata.has("music")) {
                // Take the first (most relevant) match
                val musics = metadata.getJSONArray("music")
                val music = musics[0] as JSONObject
                val title = music.getString("title")
                // Take the first (main) artist of that match
                val artists = music.getJSONArray("artists")
                val mainArtist = artists[0] as JSONObject
                val artist = mainArtist.getString("name")

                res = "$title ($artist)"
            }
        } else {
            // TODO: Handle error
            res = acrResult
        }
    } catch (e: JSONException) {
        res = "Error parsing metadata"
        Log.e("ACR", "JSONException", e)
    }

    result.text = res
}
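If you checked “3rd party integration” when creating the project, each match may also carry an external_metadata object with links to YouTube, Spotify and others. The field names below are taken from the ACRCloud response documentation rather than from this article’s code, so treat them as an assumption and verify them against a real response:

// Hypothetical extension inside the metadata.has("music") branch above:
// read third-party ids defensively with the opt* methods of org.json
val external = music.optJSONObject("external_metadata")
val youtubeVideoId = external?.optJSONObject("youtube")?.optString("vid")
val spotifyTrackId = external?.optJSONObject("spotify")
    ?.optJSONObject("track")
    ?.optString("id")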

The last step is to release the ACRCloud client when exiting the application, in the onDestroy method:

override fun onDestroy() {
    super.onDestroy()
    mClient?.let {
        it.release()
        mClient = null
    }
}

And that’s it! Now, when we press the button, our app will start listening. Once the result is available, it will be shown in the text field.

Sample music recognition app on Android

All that remains is to improve our app: a unique design, a progress indicator while recognizing music, error handling, the option to cancel a recognition in progress… whatever you want to create an awesome app!
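For that last point, the SDK’s sample code uses a cancel() method on the client; assuming your SDK version exposes it, a sketch would be:

// Sketch: abort a recognition that is in progress.
// ACRCloudClient.cancel() is taken from the SDK sample; verify it
// exists in the SDK version you downloaded.
fun cancelRecognition() {
    mClient?.cancel()
    result.text = ""
}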
