Open Assistant Framework: “Hello World”, featuring RevenueCat

Joshua Owoyemi, PhD
6 min read · Sep 15, 2024

Let me start with a personal gripe. I am a daily user of AI tools like ChatGPT, Copilot, Gemini, etc., but I have an issue with all of them. I call it the “curse of lost conversations”. Hundreds of conversations sit forgotten in the side panel of these tools, simply because it is so inconvenient to go back and locate a previous conversation similar to the new one I am about to start. And the list keeps growing. How about we just have a single conversation, and the AI self-organizes it into threads that can be easily accessed later? I want to propose a solution to this problem. But first, let me share a vision: the Open Assistant Framework (OAF).

The OAF is an attempt to create an AI assistant that is modular, hackable, and cross-platform. It has to be useful and accessible to an average person with a mobile device, whether an iPhone or an Android phone. The cross-platform nature means that the same level and quality of service will be available on both iOS and Android. In this first step, I will share the vision as simply as possible, describe the overall architecture, and create a minimum viable product available on the iOS App Store.

Why is this necessary?

There are many reasons, but I will highlight the most important ones:

  • AI assistance is the next frontier of literacy. AI-assisted humans will be able to do things that were previously impossible and will surpass unassisted humans in many domains. Essentially, AI assistance has to become a basic human right.
  • Current AI assistants are limited by the constraints of their providers’ closed platforms. For example, Google’s assistant is tied to Google’s ecosystem, and Apple’s Siri is limited to Apple devices. This limits how helpful AI assistants can be to the general public. There needs to be another category of AI assistants that are not bound by these constraints, can be accessed by anyone with a mobile device, and can interface and integrate with the user’s device regardless of the operating system.
  • A true assistant should be customizable to the user’s needs. This is not just about personality, voice, or language, but about the tools available to the assistant. The assistant should be able to access any tool a person can access, in a way that is seamless and integrated into the user’s daily life.

The overall architecture of the OAF

The OAF consists of four components: the core, the user interface, memory, and extensions.

The core of the OAF is a Large Language Model (LLM), or a group of LLMs, combined with the planning and personalization interfaces. The framework should be able to use a single LLM or multiple LLMs, selected based on the user’s needs and the assistant’s expected capabilities. Some frameworks have been proposed that automatically route each request to the most appropriate LLM, so the selection can be dynamic. While the choice of LLM is crucial for the assistant’s capabilities, it is not the focus of this first step. In a future where competing LLMs offer similar capabilities, the choice of LLM will have only a minimal impact, and the selection will simply be a matter of user preference.

The planning interface is responsible for deciding how the assistant should respond to the user’s request, executing a chosen line of action, and coordinating the assistant’s actions. An agentic approach is a promising direction for the planning interface in the initial implementation, though other approaches can be explored in the future. The personalization interface is responsible for adapting the assistant’s behavior to the user’s preferences and past interactions. A minimal sketch of how these pieces could fit together is shown below.
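To make this concrete, here is a minimal sketch of the core in Swift (the MVP targets iOS, so Swift is used for all sketches in this post). Everything here is a hypothetical illustration rather than released OAF code; in particular, the naive keyword-based router stands in for whatever dynamic routing a real framework would provide.

```swift
import Foundation

// A sketch of the OAF core. `LLM`, `KeywordRouter`, and `Planner`
// are hypothetical names for illustration, not released OAF code.

/// Any language model backend the core can call.
protocol LLM {
    var name: String { get }
    func complete(prompt: String) async throws -> String
}

/// Routes a request to the most appropriate model. A naive keyword
/// router stands in for the learned or dynamic routing a real
/// framework would use.
struct KeywordRouter {
    let general: LLM
    let coding: LLM

    func route(_ request: String) -> LLM {
        let codingHints = ["code", "bug", "function", "compile"]
        let lowered = request.lowercased()
        return codingHints.contains(where: { lowered.contains($0) }) ? coding : general
    }
}

/// The planning interface: pick a model, apply the user's stored
/// preferences (personalization), and execute the call.
struct Planner {
    let router: KeywordRouter
    var userPreferences: String  // e.g. "answer concisely"

    func respond(to request: String) async throws -> String {
        let model = router.route(request)
        let prompt = "\(userPreferences)\n\nUser: \(request)"
        return try await model.complete(prompt: prompt)
    }
}
```

The point of the separation is that the router, the planner, and the personalization data can each be swapped out independently, which is what the modularity claim below amounts to in practice.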

The user interface is the part the user directly interacts with. Given the need to make the framework accessible to everyone, a mobile application will be the primary interface; companion clients can be built for the web or as browser extensions. The user can interact with the assistant through chat messages, voice messages, a combination of both, or uploaded media (documents, images, videos).

The memory interface is responsible for storing and retrieving information from different sources, such as the user’s conversation history and uploaded media, in a way that is secure and private. There is a fundamental difference between my vision for the memory interface and currently existing solutions.

  • The assistant will be constrained to a single-thread conversation. Users do not need to create a new conversation when they want to discuss a new topic; instead, the assistant keeps track of the conversation history, detects the implicit threads within it, and applies the appropriate context to the ongoing thread based on previous messages or available information.
  • Following the above constraint, the assistant should be able to create “self-organizing contexts”. That is, the assistant automatically creates topics, or groups of topics, tagged with the relevant metadata for easy retrieval. These contexts build up a personal knowledge graph that the user can browse and query outside of interactions with the assistant (a minimal sketch of this data model follows the list).
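As an illustration, here is a minimal Swift sketch of this memory model, assuming hypothetical types. The classification step, which a real assistant would delegate to an LLM, is stubbed out as a caller-supplied function.

```swift
import Foundation

// A sketch of the single-thread memory model with self-organizing
// contexts. All types are hypothetical illustrations.

struct Message {
    let text: String
    let date: Date
}

/// A self-organizing context: an implicit thread detected inside
/// the one ongoing conversation, tagged for later retrieval.
struct ContextThread {
    let topic: String
    var tags: Set<String>
    var messages: [Message]
}

final class SingleThreadMemory {
    private(set) var threads: [String: ContextThread] = [:]

    /// Append a message to the single conversation, filing it under
    /// the thread the classifier assigns. In practice `classify`
    /// would be an LLM call; here it is a plain function.
    func record(_ message: Message,
                classify: (Message) -> (topic: String, tags: Set<String>)) {
        let (topic, tags) = classify(message)
        threads[topic, default: ContextThread(topic: topic, tags: [], messages: [])]
            .messages.append(message)
        threads[topic]?.tags.formUnion(tags)
    }

    /// Retrieve the prior context for the thread a new request belongs
    /// to, so the assistant can answer without the user hunting for an
    /// old conversation.
    func context(forTopic topic: String) -> [Message] {
        threads[topic]?.messages ?? []
    }
}
```

The key point is that there is only ever one conversation to append to; the threads and tags are an index the assistant maintains on top of it, and that index is what the personal knowledge graph would be built from.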

The extensions are third-party capabilities consisting of tools, integrations, and skins that elevate the assistant’s experience for a specific user. There can be basic tools like to-dos, internet searches, and notes. Integrations will allow the assistant to interface with applications that the user wants it to control or manage. Skins will allow the user to customize the assistant’s appearance and behavior to their liking.
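One plausible shape for the extension interface is a small protocol that every tool implements, sketched below with a hypothetical `Tool` protocol and a to-do tool as an example; the actual extension API is still to be designed (see “What next?” below).

```swift
// A sketch of an extension interface; `Tool` and `TodoTool` are
// hypothetical illustrations, not a finalized OAF API.

/// A third-party capability the assistant can invoke.
protocol Tool {
    var name: String { get }
    var description: String { get }  // surfaced to the planner/LLM
    func run(arguments: [String: String]) async throws -> String
}

/// Example: a basic to-do tool, one of the "basic tools" mentioned above.
final class TodoTool: Tool {
    let name = "todos"
    let description = "Add or list the user's to-do items."
    private var items: [String] = []

    func run(arguments: [String: String]) async throws -> String {
        if let item = arguments["add"] {
            items.append(item)
            return "Added: \(item)"
        }
        return items.isEmpty ? "No to-dos." : items.joined(separator: "\n")
    }
}
```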

Fundamentally, the OAF is designed to be modular and hackable, while maintaining privacy and data security. The modularity allows for easy integration of new features and tools, and the hackability allows for customization of the assistant to the user’s needs.

The Minimum Viable Product (MVP)

The first version of the OAF is a mobile application, TaskAlly, available on the iOS App Store. It implements the core’s LLM usage, a basic user interface (text-based input and output), conversation history, and a task prioritization tool.

With the MVP, we are taking the first step toward achieving the vision of the OAF and showing what is possible. Through participating in a hackathon, we have been supported by RevenueCat, which made it easy to build a monetization layer for the OAF. You can now join a waiting list for the app here.
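For the curious, the monetization layer is thin. The sketch below uses RevenueCat’s public iOS SDK; the API key placeholder and the “pro” entitlement identifier are assumptions for illustration, not TaskAlly’s actual configuration.

```swift
import RevenueCat

enum Monetization {
    /// Call once at app launch, e.g. in the App initializer.
    static func configure() {
        Purchases.logLevel = .debug
        Purchases.configure(withAPIKey: "appl_YOUR_PUBLIC_SDK_KEY")
    }

    /// Fetch the current paywall offering and purchase its first package.
    /// Returns true if the assumed "pro" entitlement became active.
    static func purchaseDefaultPackage() async throws -> Bool {
        let offerings = try await Purchases.shared.offerings()
        guard let package = offerings.current?.availablePackages.first else {
            return false
        }
        let result = try await Purchases.shared.purchase(package: package)
        return result.customerInfo.entitlements["pro"]?.isActive == true
    }
}
```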

What next?

Based on feedback from the MVP, we will continue to iterate on and improve the assistant while refining the OAF’s objectives. The next steps will include the following:

  • Open sourcing the core components of the assistant.
  • Designing an application programming interface (API) for extensions to allow users to build their own.
  • Building an Android version of the assistant to increase accessibility.

Please reach out to me if you would like to learn more about the OAF and how you can contribute to it.

Watch out for Part 2.

