Google Summer of Code with Sugar Labs (Week 1)

Introduction

Jun 5, 2024

As part of my Google Summer of Code (GSoC) project with Sugar Labs, I’ve had an exciting and productive first coding period. My project, “Musical Creation and Transcription Assistance via Generative AI,” aims to help users turn their musical ideas into Music Blocks code. During this initial phase, I focused on designing and implementing the in-app interface for audio recording and uploading, a crucial component of the project.

Workflow of the Project

Designing the In-App Audio Interface

The first step was to create an intuitive interface for recording and uploading audio, one that is simple yet powerful enough to let users capture their musical ideas effortlessly. To handle recording, I used the MediaRecorder API, a high-level JavaScript API from the MediaStream Recording specification (documented by Mozilla on MDN). It records audio directly within the web application, which keeps the user experience smooth.

Implementing Audio Recording Functionality

Implementing the audio recording feature was both challenging and rewarding. I began by setting up the media recorder, which meant initializing the recorder, handling the recording process, and managing the recorded audio data. The MediaRecorder API made this straightforward by providing the essential methods to start and stop a recording and to retrieve the recorded audio. I also made sure users could easily control recording with an intuitive Toggle Mic button.
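To make the start, stop, and retrieve steps concrete, here is a minimal sketch of how a mic toggle could be built on MediaRecorder. It is an illustration under stated assumptions, not the project's actual code: `createMicToggle`, `onClip`, and the `RecorderClass` parameter are hypothetical names, and the `audio/webm` fallback type is an assumption.

```javascript
// Minimal sketch of a mic toggle built on MediaRecorder.
// createMicToggle, onClip and RecorderClass are hypothetical names.
function createMicToggle(stream, onClip, RecorderClass = globalThis.MediaRecorder) {
  let recorder = null;

  const isRecording = () => recorder !== null && recorder.state === "recording";

  // First call starts a recording; the next call stops it and delivers the clip.
  const toggle = () => {
    if (isRecording()) {
      recorder.stop();
      return false;
    }
    const active = new RecorderClass(stream);
    const chunks = [];
    // The recorder hands over audio data in one or more chunks.
    active.ondataavailable = (event) => {
      if (event.data && event.data.size > 0) chunks.push(event.data);
    };
    // Once stopped, merge the chunks into a single Blob for playback or upload.
    active.onstop = () => {
      // "audio/webm" is an assumed fallback when the recorder reports no type.
      onClip(new Blob(chunks, { type: active.mimeType || "audio/webm" }));
    };
    active.start();
    recorder = active;
    return true;
  };

  return { isRecording, toggle };
}

// Possible browser wiring (needs a secure context and microphone permission):
//   const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
//   const mic = createMicToggle(stream, (clip) => console.log(clip.size));
//   toggleMicButton.addEventListener("click", () => mic.toggle());
```

Passing the recorder class in as a parameter is only a convenience for trying the logic outside a browser; in the app, the default `MediaRecorder` would be used.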

Adding Audio Uploading Capability

In addition to recording audio, I integrated functionality for uploading audio files. This feature lets users upload pre-recorded audio clips from their devices, giving them more flexibility in how they contribute their musical ideas. Implementing it required creating a file input element and handling the file selection and upload steps. I focused on making the upload process as smooth as possible, with clear feedback to users when their files are successfully uploaded.
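The file input handling described above could be sketched roughly as follows. Again, this is only an illustration: `checkAudioFile`, `attachAudioUpload`, the status messages, and the 10 MB size limit are all assumptions, not details from the project.

```javascript
// Sketch of audio upload handling; every name here is hypothetical.
const MAX_BYTES = 10 * 1024 * 1024; // assumed 10 MB limit, not from the project

// Decide whether a selected file is usable and build the feedback message.
function checkAudioFile(file) {
  if (!file) return { ok: false, message: "No file selected." };
  if (!file.type.startsWith("audio/")) {
    return { ok: false, message: `${file.name} is not an audio file.` };
  }
  if (file.size > MAX_BYTES) {
    return { ok: false, message: `${file.name} is larger than 10 MB.` };
  }
  return { ok: true, message: `${file.name} loaded successfully.` };
}

// Wire the check to an <input type="file" accept="audio/*"> element.
// onStatus shows feedback to the user; onAudio receives accepted files.
function attachAudioUpload(input, onAudio, onStatus) {
  input.addEventListener("change", () => {
    const file = input.files[0];
    const result = checkAudioFile(file);
    onStatus(result.message);
    if (result.ok) onAudio(file);
  });
}

// Possible browser wiring:
//   attachAudioUpload(
//     document.querySelector("#audio-upload"),
//     (file) => { player.src = URL.createObjectURL(file); },
//     (message) => { statusLabel.textContent = message; }
//   );
```

Keeping the validation in its own function means the same feedback messages can be reused if clips later arrive from another source, such as drag and drop.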

In-app interface for uploading and recording audio

Moving Forward

With the audio recording and uploading interface successfully implemented, I’m excited about the next steps in this project. The upcoming phases will involve integrating a transcription service to convert audio into text and using a generative AI model to transform these transcriptions into Music Blocks code.

This first coding period has been a fantastic learning experience, allowing me to dive deep into web audio APIs and enhance my JavaScript skills. I’m eager to continue this journey and bring this innovative tool to life for the Sugar Labs community. Stay tuned for more updates as the project progresses!

Thank you for checking out this post!