Introduction
An introduction to OramaCore - a complex AI architecture made easy and open-source.
Building search engines, copilots, answer systems, or pretty much any AI project is harder than it should be.
Even in the simplest cases, you'll need a vector database, a connection to an LLM for generating embeddings, a solid chunking mechanism, and another LLM to generate answers. And that's without even considering your specific needs, where all these pieces need to work together in a way that's unique to your use case.
On top of that, you're likely forced to add multiple layers of network-based communication, deal with third-party slowdowns beyond your control, and address all the typical challenges we consider when building high-performance, high-quality applications.
OramaCore simplifies the chaos of setting up and maintaining a complex architecture. It gives you a single, easy-to-use, opinionated server that's designed to help you create tailored solutions for your own unique challenges.
Why OramaCore
OramaCore gives you everything you need in a single Docker image.
Pull it from DockerHub and you get access to:
Search engine
A powerful, low-latency search engine with built-in support for more than 30 languages.
Vector database
A complete vector database with automatic chunking and automatic embeddings generation.
Small, fine-tuned language models
An array of small, fine-tuned language models that can handle all sorts of operations on your data, from translating natural language queries into optimized OramaCore queries to running custom agents.
A JavaScript runtime
A fast, integrated, fully functional JavaScript runtime (powered by Deno) so you can write custom agents and business logic in plain JavaScript.
All from a single, self-contained image.
On being opinionated
When building OramaCore, we made a deliberate choice to create an opinionated system. We offer strong, general-purpose default configurations while still giving you the flexibility to customize them as needed.
There are plenty of great vector databases and full-text search engines out there. But most of them don't work seamlessly together out of the box, and they often require extensive fine-tuning before you arrive at a functional solution.
Our goal is to provide you with a platform that's ready to go the moment you pull a single Docker image.
Write Side, Read Side
OramaCore is a modular system. It can run as a monolith, where all the components run in a single process, or as a distributed system, where you can scale each component independently.
To allow this, we split the system into two distinct sides: the write side and the read side.
If you're running OramaCore on a single node, you won't notice the difference. But if you're running it as a distributed system, you can scale the write side independently from the read side.
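To make the split concrete, here is a minimal sketch of how operations could be routed to each side. It is an illustration only and not OramaCore's API: the operation names, the `Deployment` type, the `route` function, and the host addresses are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical operation names, used only for illustration.
WRITE_OPS = {"insert", "update", "delete"}
READ_OPS = {"search", "answer"}


@dataclass(frozen=True)
class Deployment:
    """Addresses of the two sides; on a single node they are the same."""
    write_side: str
    read_side: str


def route(op: str, deployment: Deployment) -> str:
    """Return the address of the side that should handle `op`."""
    if op in WRITE_OPS:
        return deployment.write_side
    if op in READ_OPS:
        return deployment.read_side
    raise ValueError(f"unknown operation: {op}")


# Single node: both sides sit behind one address (placeholder host).
monolith = Deployment(write_side="localhost:8080", read_side="localhost:8080")

# Distributed: each side has its own address and can be scaled on its own
# (placeholder hosts).
distributed = Deployment(
    write_side="writer.internal:8080",
    read_side="reader.internal:8080",
)
```

With the `monolith` deployment every operation resolves to the same address, which is why the split is invisible on a single node. With `distributed`, writes and reads land on different hosts.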
Write Side
The write side is responsible for ingesting data, generating embeddings, and storing them in the vector database. It's also responsible for generating the full-text search index.
It's the part of the system that requires the most GPU power and memory, as it needs to generate a lot of content, embeddings, and indexes.
In detail, the write side is responsible for:
- Ingesting data. It creates a buffer of documents and flushes them to the vector database and the full-text search index, rebuilding the immutable data structures used for search.
- Generating embeddings. It generates text embeddings for large datasets without interfering with the search performance.
- Expanding content (coming soon). It is capable of reading images, code blocks, and other types of content, and generating descriptions and metadata for them.
Every insertion, deletion, or update of a document will be handled by the write side.
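The buffer-then-flush behavior described above can be sketched as follows. This is a toy model under stated assumptions, not OramaCore's implementation: the `WriteSide` class, its `flush_threshold` parameter, and the whitespace tokenizer are hypothetical, and embedding generation is left out to keep the example short.

```python
from types import MappingProxyType


class WriteSide:
    """Toy write side: buffers operations, then rebuilds a read-only index."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold  # hypothetical tuning knob
        self._buffer = []   # pending (op, doc_id, text) tuples
        self._docs = {}     # current document store
        self.snapshot = MappingProxyType({})  # immutable index seen by readers

    def insert(self, doc_id, text):
        self._enqueue(("upsert", doc_id, text))

    update = insert  # in this sketch an update simply replaces the document

    def delete(self, doc_id):
        self._enqueue(("delete", doc_id, None))

    def _enqueue(self, op):
        self._buffer.append(op)
        if len(self._buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Apply the buffered operations to the document store.
        for op, doc_id, text in self._buffer:
            if op == "upsert":
                self._docs[doc_id] = text
            else:
                self._docs.pop(doc_id, None)
        self._buffer.clear()
        # Rebuild the inverted index from scratch and publish it read-only.
        index = {}
        for doc_id, text in self._docs.items():
            for token in text.lower().split():
                index.setdefault(token, set()).add(doc_id)
        self.snapshot = MappingProxyType(
            {token: frozenset(ids) for token, ids in index.items()}
        )


writer = WriteSide(flush_threshold=2)
writer.insert("1", "red shoes")
before = dict(writer.snapshot)   # still empty: nothing has been flushed yet
writer.insert("2", "red scarf")  # reaches the threshold and triggers a flush
```

Readers only ever see a published snapshot, so a flush that is still in progress never exposes a half-built index. That is the point of rebuilding immutable structures instead of mutating them in place.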
Read Side
The read side is responsible for handling queries, searching for documents, and returning the results to the user.
It's also the home of the Answer Engine, which is responsible for generating answers to questions and performing chains of actions based on the user's input.
In detail, the read side is responsible for:
- Handling queries. It receives the user's query, translates it into a query that the vector database can understand, and returns the results.
- Searching for documents. It searches for documents in the full-text search index and the vector database.
- Answer Engine. It generates answers to questions, performs chains of actions, and runs custom agents.
Every query, question, or action will be handled by the read side.
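As an illustration of searching both the full-text index and the vector database, here is a small sketch that runs the two searches and merges their rankings. How OramaCore actually combines the results is not described here. This example uses reciprocal rank fusion as a stand-in, and the function names, toy documents, and two-dimensional vectors are all made up for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def full_text_search(query, docs):
    """Rank document ids by how many query terms they contain."""
    terms = set(query.lower().split())
    scores = {
        doc_id: len(terms & set(text.lower().split()))
        for doc_id, text in docs.items()
    }
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [doc_id for doc_id, score in ranked if score > 0]


def vector_search(query_vec, vectors, k=3):
    """Rank document ids by cosine similarity to the query vector."""
    ranked = sorted(vectors, key=lambda d: (-cosine(query_vec, vectors[d]), d))
    return ranked[:k]


def fuse(*rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) over all rankings."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


# Toy data: in a real system the vectors would come from an embedding model.
docs = {"a": "red running shoes", "b": "blue running jacket", "c": "red wool scarf"}
vectors = {"a": [1.0, 0.0], "b": [0.7, 0.7], "c": [0.0, 1.0]}

text_hits = full_text_search("red shoes", docs)
vector_hits = vector_search([1.0, 0.1], vectors)
results = fuse(text_hits, vector_hits)
```

Document `a` ranks first in both lists, so it leads the fused result. Documents found by only one of the two searches still appear, just lower down.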