<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by mohammed chandwala on Medium]]></title>
        <description><![CDATA[Stories by mohammed chandwala on Medium]]></description>
        <link>https://medium.com/@mohammedrazachandwala?source=rss-2b4013921dfc------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*f_TCL0Kcgm3nIpKojBha9A.png</url>
            <title>Stories by mohammed chandwala on Medium</title>
            <link>https://medium.com/@mohammedrazachandwala?source=rss-2b4013921dfc------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Sat, 23 May 2026 12:24:43 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@mohammedrazachandwala/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Building LocalMind: How I Ported OpenAI’s Whisper to Android using JNI and Kotlin]]></title>
            <link>https://medium.com/@mohammedrazachandwala/building-localmind-how-i-ported-openais-whisper-to-android-using-jni-and-kotlin-575dddd38fdc?source=rss-2b4013921dfc------2</link>
            <guid isPermaLink="false">https://medium.com/p/575dddd38fdc</guid>
            <category><![CDATA[android-ndk]]></category>
            <category><![CDATA[android-app-development]]></category>
            <category><![CDATA[kotlin]]></category>
            <category><![CDATA[software-engineering]]></category>
            <category><![CDATA[ai]]></category>
            <dc:creator><![CDATA[mohammed chandwala]]></dc:creator>
            <pubDate>Thu, 15 Jan 2026 08:37:51 GMT</pubDate>
            <atom:updated>2026-01-15T08:37:51.894Z</atom:updated>
            <content:encoded><![CDATA[<p>LocalMind isn’t just a wrapper for an API. It is an offline-first voice transcription app that runs the <strong>Whisper</strong> model directly on the Android device. This article isn’t about why the app is useful; it’s about the engineering challenges I faced building it, specifically bridging the gap between high-level Kotlin UI and low-level C++ number crunching.</p><p>You can find the full source code here: <a href="https://github.com/shaikmohammed8/LocalMind">https://github.com/shaikmohammed8/LocalMind</a></p><h3>The Architecture: Bridging Two Worlds</h3><p>The core challenge of this project was that the inference engine for Whisper is written in C/C++ (ggml), but modern Android development happens in Kotlin. To make them talk, I had to use the <strong>Java Native Interface (JNI)</strong>.</p><p>This is where things get dangerous. In Kotlin, memory is managed for you. In C++, if you forget to free what a pointer refers to, you leak memory. If you use a pointer after it has been freed, or touch native state from the wrong thread, the app crashes.</p><p>Here is a look at the JNI layer I wrote to handle the model initialization. I had to manually map Java input streams to C++ buffers to load the model from Android assets without ballooning RAM usage.</p><pre>// inside native-lib.cpp<br><br>extern &quot;C&quot; JNIEXPORT jlong JNICALL<br>Java_com_example_localmind_WhisperLib_00024Companion_initContextFromAsset(<br>        JNIEnv *env, jobject /* thiz */, jobject assetManager, jstring asset_path_str) {<br><br>    // 1. Convert Java String to C string<br>    const char *asset_path = env-&gt;GetStringUTFChars(asset_path_str, nullptr);<br>    // GetStringUTFChars returns nullptr if the JVM cannot allocate the copy<br>    if (asset_path == nullptr) return 0;<br>    <br>    // 2. Initialize the Whisper context using the asset manager<br>    whisper_context *context = whisper_init_from_asset(env, assetManager, asset_path);<br>    <br>    // 3. CRITICAL: Release the memory for the string immediately<br>    env-&gt;ReleaseStringUTFChars(asset_path_str, asset_path);<br>    <br>    // 4. 
Return the pointer address as a Long to Kotlin<br>    return reinterpret_cast&lt;jlong&gt;(context);<br>}</pre><p>By returning reinterpret_cast&lt;jlong&gt;(context), I’m essentially passing the raw memory address of the C++ object back to Kotlin. Kotlin holds this address (as a Long) and passes it back whenever it needs to transcribe something.</p><h3>The Hardware Problem: Not All Cores Are Equal</h3><p>One of the most interesting problems I ran into was performance tuning. On Linux desktops, you just throw threads at the problem. On mobile, the big.LITTLE architecture complicates everything.</p><p>If I spun up 8 threads on an 8-core phone, the OS might schedule the heavy AI math on the “efficiency” cores to save battery, making the transcription 5x slower.</p><p>I had to write a hardware abstraction layer that digs into /proc/cpuinfo and /sys/devices/system/cpu to identify the <strong>high-performance</strong> cores and explicitly tell Whisper how many threads to use.</p><pre>// LocalMindCpuConfig.kt<br><br>import java.io.BufferedReader<br>import java.io.FileReader<br><br>object LocalMindCpuConfig {<br>    val preferredThreadCount: Int<br>        // Always ensure we have at least 2 threads, <br>        // but prioritize high-performance cores<br>        get() = CpuInfo.getHighPerfCpuCount().coerceAtLeast(2)<br>}<br><br>// ... internal logic reading clock speeds ...<br>// Reads one core's maximum clock speed; sysfs reports this value in kHz<br>private fun getMaxCpuFrequency(cpuIndex: Int): Int {<br>    val path = &quot;/sys/devices/system/cpu/cpu${cpuIndex}/cpufreq/cpuinfo_max_freq&quot;<br>    val maxFreq = BufferedReader(FileReader(path)).use { it.readLine() }<br>    return maxFreq.toInt()<br>}</pre><h3>Managing Concurrency in the ViewModel</h3><p>On the Android side, I used Jetpack Compose and Coroutines. The tricky part was managing the state. Recording audio is an <strong>I/O</strong> operation, but transcribing it is a heavy <strong>CPU</strong> operation.</p><p>I separated these concerns using specific Dispatchers. If I ran Whisper on Dispatchers.IO, it would crowd out disk operations. 
If I ran it on Main, the UI would freeze.</p><p>Here is how I structured the processAudioData function in the ViewModel to handle the pipeline:</p><ol><li><strong>IO Thread:</strong> Save the raw WAV file to disk.</li><li><strong>Database:</strong> Insert a “Loading” placeholder message.</li><li><strong>Default Thread:</strong> Run the C++ Inference (Whisper).</li><li><strong>Database:</strong> Update the message with the transcribed text.</li></ol><pre>// LocalMindViewModel.kt<br><br>private suspend fun processAudioData(fullAudioData: FloatArray) {<br>    try {<br>        // Step 1: Disk IO (Save WAV)<br>        val wavFile = withContext(Dispatchers.IO) {<br>            // filename is defined outside this excerpt<br>            val file = File(getApplication&lt;Application&gt;().filesDir, filename)<br>            WavHelper.saveWavFile(file, fullAudioData)<br>            file<br>        }<br><br>        // Step 2: Insert into Room Database (Get the ID!)<br>        val message = Message(<br>            isAudio = true,<br>            audioPath = wavFile.absolutePath, <br>            // ...<br>        )<br>        val rowId = dao.insertMessage(message)<br><br>        // Step 3: The Heavy Lifting (C++ Inference)<br>        val text = withContext(Dispatchers.Default) {<br>            // This calls the JNI bridge<br>            whisperContext?.transcribeData(fullAudioData, printTimestamp = false) ?: &quot;&quot;<br>        }<br><br>        // Step 4: Update UI and Database<br>        if (text.isNotEmpty()) {<br>            val newMessage = message.copy(<br>                id = rowId.toInt(),<br>                content = text<br>            )<br>            dao.updateMessage(newMessage)<br>        }<br>    } catch (e: CancellationException) {<br>        // Let coroutine cancellation propagate instead of logging it as a failure<br>        throw e<br>    } catch (e: Exception) {<br>        Log.e(&quot;LocalMind&quot;, &quot;Transcription failed&quot;, e)<br>    }<br>}</pre><h3>Visualizing the Data</h3><p>Since I enjoy hacking around with UI as much as backend code, I didn’t want a standard progress bar. 
I wanted a raw audio visualization.</p><p>I built a custom Compose Canvas that draws the audio amplitude. The challenge here was blending. I needed the bars to be gray by default, but to fill with white as the audio played, pixel-perfect and without drawing &quot;half&quot; bars.</p><p>I solved this using clipRect logic in the draw scope:</p><pre>// AudioWaveform.kt<br><br>Canvas(modifier = modifier) {<br>    // 1. Draw the &quot;Inactive&quot; Gray layer<br>    drawWaveformBars(color = MindDarkGray.copy(alpha = 0.25f))<br><br>    // 2. Draw the &quot;Active&quot; White layer, clipped to progress<br>    clipRect(<br>        left = 0f,<br>        top = 0f,<br>        right = size.width * progress, // The magic cut-off line<br>        bottom = size.height<br>    ) {<br>        drawWaveformBars(color = MindWhite)<br>    }<br>}</pre><h3>Screenshots</h3><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*BkG_Gbyh8Hhu4jwXw-XiUg.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*XnwXO67ea-mU6o5CgUjcJQ.png" /></figure><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*XD7BVdrOnr4AwvqA6lYA2A.png" /></figure><iframe src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fwww.youtube.com%2Fembed%2FqOtyb1yDO08%3Ffeature%3Doembed&amp;display_name=YouTube&amp;url=https%3A%2F%2Fwww.youtube.com%2Fshorts%2FqOtyb1yDO08%3Fsi%3DJ5z1UPECgSYiaCNY&amp;image=https%3A%2F%2Fi.ytimg.com%2Fvi%2FqOtyb1yDO08%2Fhq2.jpg&amp;type=text%2Fhtml&amp;schema=youtube" width="640" height="480" frameborder="0" scrolling="no"><a href="https://medium.com/media/1ca4712baad1c994baff6359b3d249d6/href">https://medium.com/media/1ca4712baad1c994baff6359b3d249d6/href</a></iframe><h3>Final Thoughts</h3><p>Building LocalMind forced me to step out of the comfortable world of high-level APIs and deal with the messy reality of memory pointers, CPU topology, and thread management.</p><p>It reminded me why I love development: sometimes the best way to 
understand how something works is to stop importing libraries and start building the bridge yourself.</p><p>Check out the code, fork it, and let me know what you think. <a href="https://github.com/shaikmohammed8/LocalMind">https://github.com/shaikmohammed8/LocalMind</a></p>]]></content:encoded>
        </item>
    </channel>
</rss>