Empowering Inclusion: Next-Gen Accessibility for Motor Impairment with Language and Action Models

Sanaa Karkera
8 min read · May 8, 2024

You are likely using a device and the internet right now to read this article. In fact, you have likely been using the internet and electronics for years. You have enjoyed the automation, integration, and convenience technology provides, and probably cannot picture yourself with a limited or restricted ability to use it. Millions of people, however, live with exactly that restriction. Almost one in nine Americans, approximately 39 million individuals, are motor impaired in some way. For many of them, traditional input methods are not an option for using apps and the web, and as technology evolves and scales, the barriers they face are more significant than ever.

Simple tasks that most people take for granted, such as sending an email, browsing the internet, or even turning a device on and off, can be real challenges for these individuals. Accessibility standards and software are continuously improved and updated, yet they keep falling short at the same step: while they may offer natural-sounding text-to-speech and intuitive, number-based navigation, they remain reliant on support from the programs users run. Without that support, accessibility software becomes ineffective. The intention to help is clearly present, but the lack of built-in accessibility support severely reduces the number of truly inclusive experiences, leaving millions of people isolated and excluded from effectively navigating the digital realm.

We started Actain not only to target this problem, but to eliminate it. At Actain, we envision a future where individuals of all abilities are able to seamlessly engage with technology, where limitations dissolve, and independence thrives. Our journey began with the realization that the barriers people currently face are not only obstacles to overcome but also opportunities to innovate. That realization is the foundation of our mission: to empower every individual to interact with technology effortlessly, regardless of their differences.

Addressing the Gaps in Accessibility Solutions

Although many have made efforts to create viable accessibility solutions, significant gaps persist, leaving individuals with impairments underserved and marginalized.

Figure: Most Common Website Homepage Failures in 2022

One major issue is the reliance on traditional accessibility tools that fall short of providing comprehensive support. Many existing solutions offer limited functionality, catering to basic needs while neglecting the diverse requirements of users with varying degrees of motor impairment. The result is a fragmented user experience that forces individuals to navigate several complex, expensive assistive technologies, each offering a partial solution and none fully solving the issue.

The dependence on website and application developers to integrate accessibility features directly into their platforms exacerbates the issue. While certain platforms prioritize accessibility, most overlook it altogether, forcing users to navigate an ineffective, inaccessible digital environment. The lack of universal support restricts users’ access to essential services and perpetuates a cycle of exclusion and dependency, leaving individuals grappling with inadequate solutions that fail to address their needs.

The Solution: Actain

With a relentless focus on innovation and inclusivity, Actain leverages cutting-edge Artificial Intelligence technology to empower users, offering seamless and intuitive ways to engage with digital platforms — surpassing the limitations imposed by traditional accessibility solutions.

The foundation of Actain’s software is a blend of speech, language, and vision AI, working together to emulate the intricacies of human-computer interaction. Through localized speech recognition models, Actain listens for user-defined activation keywords, receives the command that follows, and performs the requested action, allowing users to navigate and effectively use technological interfaces once again.

Powered by natural language processing and Large Language Models (LLMs), Actain can comprehend a diverse array of voice-issued instructions, translating natural language into proprietary commands with high levels of accuracy. This process is further enhanced by targeted reasoning engines, which parse the command tokens generated by the LLM and combine API calls with computer vision instructions to execute tasks precisely.

But Actain’s capabilities extend beyond conventional accessibility solutions. The software offers users a fully customizable experience tailored to their unique needs and preferences. Whether leveraging native OS APIs for seamless task execution or deploying custom routines trained by users themselves, Actain moves the industry away from a “one-size-fits-all” model and towards a completely user-centric platform.

Exploring Actain’s Cutting-Edge Features

Localized Speech Recognition

Localized speech recognition is the core technology behind Actain’s user interface. It enables users to interact with the platform using voice commands, eliminating the need for conventional input methods such as keyboards or touchscreens. By leveraging advanced neural network architectures, Actain’s speech recognition system can accurately transcribe user commands in real time. Unlike traditional speech systems that rely on external servers for processing, Actain’s solution operates locally on the user’s device, ensuring privacy and eliminating dependence on internet connectivity. This localized approach enhances accessibility and user experience, providing voice interaction that is reliable and responsive.
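
To make the activation-keyword step concrete, here is a minimal sketch in Python. It assumes the on-device speech model has already produced a text transcript; the function name `extract_command` and the wake word "actain" are hypothetical choices for illustration, not Actain’s actual interface.

```python
import re
from typing import List, Optional


def _words(text: str) -> List[str]:
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def extract_command(transcript: str, wake_word: str = "actain") -> Optional[str]:
    """Return the words spoken after the wake word, or None if there are none.

    The wake word and this function are illustrative assumptions; the real
    activation keyword is user-defined.
    """
    words = _words(transcript)
    wake = _words(wake_word)
    if not wake:
        return None
    n = len(wake)
    # Slide over the transcript looking for the (possibly multi-word) wake word.
    for i in range(len(words) - n + 1):
        if words[i:i + n] == wake:
            command = " ".join(words[i + n:])
            return command or None
    return None
```

Calling `extract_command("Hey Actain, open my email")` yields the text after the keyword, while a transcript without the keyword yields `None`, so ordinary conversation is ignored.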

Large Language Model (LLM) Comprehension:

Actain’s cognitive capabilities are driven by its LLM system, a sophisticated neural network architecture tailored for natural language understanding. The LLM is the foundation for interpreting voice-issued instructions, allowing a wide range of user commands to be understood with high accuracy. Through a continuous cycle of training and refinement, Actain’s LLM is optimized to understand the nuances of human language, including context, semantics, and syntax. This specialized comprehension enables Actain to generate proprietary command tokens that encapsulate the user’s intent, facilitating seamless interaction with multiple platforms. The LLM serves one main purpose: to deliver a personalized and adaptive user experience, empowering individuals to interact with technology effortlessly.

Symbolic Reasoning Engine:

The Symbolic Reasoning Engine is an algorithmic framework designed to process and interpret the proprietary command tokens generated by the platform’s LLM. It uses symbolic logic rather than a conventional rules-based system, applying inference techniques to analyze contextual information and derive actionable insights from user commands. Through a combination of deductive and inductive reasoning, the engine deciphers complex user intents and translates them into executable instructions with precision and efficiency. This high-level reasoning capability enables Actain’s software to adapt dynamically to diverse user scenarios, making it adept at handling a wide range of tasks and interactions.
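
Actain’s command tokens are proprietary, so their real format is not shown here. The sketch below invents a token format (`VERB key=value ...`) purely to illustrate the general idea of turning tokens into executable instructions that are routed either to an API call or to a computer vision action. It is a plain lookup table, far simpler than the symbolic reasoning described above, and every name in it (`Command`, `BACKENDS`, `parse_token`, `plan`) is a hypothetical one.

```python
import shlex
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical routing table: which backend would handle each verb.
BACKENDS: Dict[str, str] = {
    "OPEN_APP": "api",
    "SET_VOLUME": "api",
    "CLICK": "vision",
    "TYPE": "vision",
}


@dataclass
class Command:
    verb: str
    args: Dict[str, str] = field(default_factory=dict)
    backend: str = "vision"


def parse_token(token: str) -> Command:
    """Parse one invented token such as 'CLICK target="Send button"'."""
    parts = shlex.split(token)  # honors quotes around values with spaces
    if not parts:
        raise ValueError("empty command token")
    verb = parts[0].upper()
    if verb not in BACKENDS:
        raise ValueError(f"unknown verb: {verb}")
    args: Dict[str, str] = {}
    for part in parts[1:]:
        key, sep, value = part.partition("=")
        if not sep or not key:
            raise ValueError(f"malformed argument: {part!r}")
        args[key] = value
    return Command(verb, args, BACKENDS[verb])


def plan(tokens: List[str]) -> List[Command]:
    """Turn a token sequence into an ordered list of executable commands."""
    return [parse_token(t) for t in tokens]
```

For example, `plan(['OPEN_APP name=Mail', 'CLICK target="Send button"'])` produces one command routed to the API backend followed by one routed to the vision backend.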

Customizable Workflows:

Customizable workflows let users tailor their digital interactions by defining personalized sequences of actions that match their preferences and requirements. Thanks to its AI-driven architecture, Actain allows users to create, modify, and execute custom workflows effortlessly, regardless of their technical expertise. Whether they are automating routine tasks, creating new processes, or streamlining multi-step interactions, customizable workflows give users complete flexibility and control over their digital environment. This user-defined framework enhances accessibility by letting users decide how the system operates for them.
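
As a rough illustration of what a user-defined workflow could look like as a data structure, the following sketch stores named sequences of command strings and replays them through an executor. The `WorkflowRegistry` class and its methods are hypothetical, and the executor is passed in so the example does not assume anything about how Actain actually carries out each step.

```python
from typing import Callable, Dict, List


class WorkflowRegistry:
    """Stores named workflows as ordered lists of command strings (a sketch)."""

    def __init__(self, execute: Callable[[str], None]) -> None:
        self._execute = execute  # callback that performs a single step
        self._workflows: Dict[str, List[str]] = {}

    def define(self, name: str, steps: List[str]) -> None:
        """Create or overwrite a workflow; names are case-insensitive."""
        if not steps:
            raise ValueError("a workflow needs at least one step")
        self._workflows[name.lower()] = list(steps)

    def run(self, name: str) -> int:
        """Execute every step in order and return how many steps ran."""
        steps = self._workflows.get(name.lower())
        if steps is None:
            raise KeyError(f"no workflow named {name!r}")
        for step in steps:
            self._execute(step)
        return len(steps)
```

A user could define "Morning routine" once as `["open email", "open calendar"]` and later trigger both steps with a single spoken phrase.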

Algorithmic Specifics: Neural Network for UI Recognition

Actain’s neural network architecture is designed to recognize and interpret various UI elements with precision and efficiency. Leveraging state-of-the-art deep learning techniques, the neural network undergoes extensive training to comprehend the intricacies of user interfaces — such as buttons, menus, text fields, and more.

Figure: Actain Demo

Training Process:

1. Data Collection: The training dataset comprises a diverse range of User Interface elements extracted from various applications and platforms. The dataset is curated to encompass a wide spectrum of visual styles and design paradigms.

2. Labeling and Annotation: All the UI elements in the dataset are labeled and annotated with relevant metadata, including type, position, and functionality. This annotated data serves as the “ground truth” for training the network.

3. Model Architecture: The model uses a Convolutional Neural Network (CNN) architecture, which is well suited to image recognition tasks. The CNN consists of multiple layers of convolution and pooling operations, which enable the extraction of hierarchical features from images of UI elements.

4. Training Objective: The primary objective of the neural network is to accurately classify and localize the UI elements within digital interfaces. The network is trained through supervised learning, minimizing classification and localization errors via backpropagation and gradient descent.

5. Fine-Tuning and Validation: Following initial training, the neural network undergoes iterative fine-tuning stages to optimize its performance further. This process involves adjusting hyperparameters, data augmentation techniques, and regularization methods to improve generalization and robustness.
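
The training objective in step 4 combines a classification error with a localization error. The article does not say which loss functions are used, so the sketch below assumes two common choices from object detection: cross-entropy for the element type and smooth L1 for the bounding box. The function names and the `box_weight` parameter are illustrative assumptions.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], target: int) -> float:
    """Classification loss for one element: -log softmax(logits)[target]."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def smooth_l1(pred: Sequence[float], truth: Sequence[float], beta: float = 1.0) -> float:
    """Localization loss: quadratic for small errors, linear for large ones."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("boxes must be non-empty and the same length")
    total = 0.0
    for p, t in zip(pred, truth):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


def detection_loss(
    logits: Sequence[float],
    target: int,
    pred_box: Sequence[float],
    true_box: Sequence[float],
    box_weight: float = 1.0,
) -> float:
    """Weighted sum of both errors (assumed form; the weighting is a choice)."""
    return cross_entropy(logits, target) + box_weight * smooth_l1(pred_box, true_box)
```

During training, gradient descent would minimize a quantity like `detection_loss` averaged over the labeled dataset, with `box_weight` being one of the hyperparameters adjusted in step 5.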

Once trained, the neural network is integrated into Actain’s software framework, where it serves as the backbone for UI recognition and cursor control. During inference, the network analyzes screen captures or live video streams of digital interfaces, identifying and categorizing UI elements in real time.

AI Control and Specific Mouse Mechanism:

The AI cursor and mouse control mechanism operates in tandem with the neural network, translating recognized UI elements into actionable commands for users. The process is as follows:

Figure: Mouse and Cursor Control Technicality

1. UI Element Recognition: The neural network identifies and localizes UI elements within the digital interface, including buttons, links, input fields, and more.
2. Intent Inference: Based on the user’s input or predefined preferences, Actain infers the user’s intended action on the recognized UI elements (clicking, entering text, navigating menus, etc.).
3. Cursor Movement: Actain dynamically adjusts the cursor’s position on the screen, aligning it with the recognized element associated with the user’s action.
4. Action Execution: Upon user confirmation, Actain executes the intended action, simulating different user actions as needed.
5. Feedback and Adaptation: Actain gives users real-time feedback, confirming successful actions and alerting them to errors or obstacles. This feedback loop supports continuous adaptation and refinement of recognition, which improves the control mechanism and the user experience.
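
The five steps above can be sketched in a few lines of Python. This is an illustrative outline under stated assumptions, not Actain’s implementation: the `UIElement` structure, the 0.5 confidence threshold, and the `move_cursor`, `click`, and `confirm` callbacks are all hypothetical, and the recognized elements are passed in as plain data rather than produced by a real network.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height) in pixels


@dataclass
class UIElement:
    kind: str          # e.g. "button", "link", "input"
    label: str         # visible text of the element
    box: Box
    confidence: float  # recognition confidence in [0, 1]


def find_target(elements: List[UIElement], kind: str, label: str,
                min_confidence: float = 0.5) -> Optional[UIElement]:
    """Steps 1-2: pick the most confident element matching the intent."""
    matches = [e for e in elements
               if e.kind == kind
               and label.lower() in e.label.lower()
               and e.confidence >= min_confidence]
    return max(matches, key=lambda e: e.confidence, default=None)


def center(box: Box) -> Tuple[int, int]:
    """Return the pixel at the middle of a bounding box."""
    left, top, width, height = box
    return left + width // 2, top + height // 2


def act(elements: List[UIElement], kind: str, label: str,
        move_cursor: Callable[[int, int], None],
        click: Callable[[int, int], None],
        confirm: Callable[[UIElement], bool]) -> bool:
    """Steps 3-5: move the cursor, wait for confirmation, then click.

    Returns True on success so the caller can give feedback to the user.
    """
    target = find_target(elements, kind, label)
    if target is None:
        return False  # nothing recognized: report the obstacle
    x, y = center(target.box)
    move_cursor(x, y)
    if not confirm(target):
        return False  # user declined, so no action is executed
    click(x, y)
    return True
```

Keeping the confirmation callback between the cursor move and the click mirrors step 4: the user sees where the cursor landed before anything is executed.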

Transformative Impact:

Actain revolutionizes accessibility for individuals with motor impairments, offering a transformative solution that empowers users to navigate and interact with technology effortlessly. By leveraging Artificial Intelligence technologies such as voice activation, reasoning algorithms, and computer vision systems, Actain works towards enhancing accessibility, enabling users to execute tasks with ease and efficiency.

Through Actain’s innovative approach, users experience a tenfold increase in productivity, seamlessly engaging with digital devices in ways previously impossible. Actain’s commitment to universal accessibility ensures that individuals of all abilities can fully participate in the digital world, further fostering inclusion and general independence.

If you would like to learn more about us and our mission, please visit our website: https://actain.zahtec.com/

Sanaa Karkera, Toryn Thompson, Seoyeon Cho
