Install OramaCore
Downloading, building, and running OramaCore on your machine or in production.
Using Docker Compose
The absolute easiest way to get started with OramaCore is by using Docker Compose. While we discourage using Docker Compose in production, it can be a great way to:
- Test OramaCore locally
- Understand how the system works and how its components interact
If you're using OramaCore on a GPU, ensure that you have the NVIDIA Container Toolkit installed. If you're using a CPU, you can skip this step.
First things first, let's create a `docker-compose.yml` file:
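The exact contents will depend on the images you build or pull; the sketch below only shows the general shape of such a file, with assumed image names, ports, and volume paths, and with service names matching the ones referenced later in this guide:

```yaml
services:
  oramacore:
    image: oramacore:latest               # assumed image name — use the image you build or pull
    volumes:
      - ./config.yaml:/app/config.yaml    # assumed config path inside the container
    ports:
      - "8080:8080"                       # assumed HTTP port
    depends_on:
      - python-ai-server

  python-ai-server:
    image: oramacore-ai-server:latest     # assumed image name for the Python AI server
    depends_on:
      - vllm

  vllm:
    image: vllm/vllm-openai:latest
    # Pass the model and any runtime flags your setup requires here.
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia              # requires the NVIDIA Container Toolkit
              count: all
              capabilities: [gpu]
```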
One thing to remember when using Docker is network management. In the OramaCore configuration file (`config.yaml`), ensure that `ai_server.host` is set to `python-ai-server` and that `ai_server.llm.host` is set to `vllm`.
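For example, with the Compose file above, the relevant section of `config.yaml` would look roughly like this (only `ai_server.host` and `ai_server.llm.host` come from this guide; keep any other fields in your file as they are):

```yaml
ai_server:
  host: python-ai-server   # must match the Python AI server's Compose service name
  llm:
    host: vllm             # must match the vLLM Compose service name
```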
Also, we recommend exposing the services through Envoy. You can find an example configuration file here.
Building from source
You can also build OramaCore from source by cloning the repository from GitHub:
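Assuming the repository is hosted at `github.com/oramasearch/oramacore`:

```sh
git clone https://github.com/oramasearch/oramacore.git
cd oramacore
```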
The project consists of two parts: a Rust core and a Python server.
The Python server is responsible for generating embeddings, and it communicates with the Rust core using gRPC.
To build the entire system, ensure that you have Rust installed (installation guide) and Python (recommended version: 3.11).
Building Rust
Simply run the following command from the root directory:
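```sh
# Standard Cargo release build; the binary ends up in target/release/oramacore
cargo build --release
```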
This will generate a binary located in `/target/release/oramacore`.
Building Python
Navigate to the `src/ai_server` directory and install the required dependencies. You'll find two distinct requirements files:
- `requirements.txt`
- `requirements-cpu.txt`
The first file contains dependencies for GPU usage, which we highly recommend for production with an NVIDIA GPU.
If you are running OramaCore on a system without an NVIDIA GPU (e.g., a Mac), use `requirements-cpu.txt`.
Before installing, create a virtual environment:
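```sh
# Create and activate a virtual environment (adjust the interpreter name to your Python 3.11 install)
python3.11 -m venv .venv
source .venv/bin/activate
```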
Then, install the dependencies:
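```sh
# On a machine with an NVIDIA GPU:
pip install -r requirements.txt

# On a CPU-only machine (e.g., a Mac):
pip install -r requirements-cpu.txt
```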
When you run the server, OramaCore will automatically download the required models specified in the configuration file.
The download time will depend on your internet connection.
Large Language Models
OramaCore uses vLLM to provide access to local LLMs. You can follow the installation guide here: vLLM Installation.
Since OramaCore interacts with vLLM through an OpenAI-compatible API, you can choose to use Ollama, OpenAI, or any other LLM provider that supports the OpenAI API.
Just set the host, port, and API key in the `config.yaml` file:
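As a sketch, assuming a local vLLM instance on its default port — the exact key names under `ai_server.llm` beyond `host` are assumptions, so check the sample configuration shipped with the repository:

```yaml
ai_server:
  llm:
    host: vllm            # or localhost, api.openai.com, an Ollama host, etc.
    port: 8000            # vLLM's default OpenAI-compatible port
    api_key: ""           # set this when your provider requires one (e.g., OpenAI)
```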
Starting the OramaCore server
After installing the dependencies and compiling the binaries, you'll need to start two separate services.
In one terminal tab, run the Python server from inside `src/ai_server`:
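Assuming the server's entry point is `server.py` (check the directory for the actual script name):

```sh
cd src/ai_server
python server.py
```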
Once that process has started, run the Rust core binary:
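Assuming the binary picks up `config.yaml` from the working directory:

```sh
# From the repository root
./target/release/oramacore
```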
In future versions of OramaCore, we plan to unify everything into a single binary, so you won't need to run two separate processes manually.