Configuration

Configuring OramaCore

OramaCore uses a straightforward config.yaml file for configuration and customization.

We provide robust defaults for nearly every option in the configuration file, so you rarely need to write one yourself. For reference, here is an example of a typical config.yaml file:

http:
    host: 0.0.0.0
    port: 8080
    allow_cors: true
    with_prometheus: true
 
log:
    # Log levels.
    # Comment out the following lines to disable logging. 
    file_path: "./log.log"
    levels:
        oramacore: trace
        orama_js_pool: info
 
writer_side:
    output:
        type: in-memory
 
    # Replace the following value with your own API key
    master_api_key: my-master-api-key
    config:
        data_dir: ./.data/writer
        # The maximum number of embeddings that can be stored in the queue
        # before the writer starts to be blocked
        # Note: the elements are in memory, so be careful with this value
        embedding_queue_limit: 50000
        # The number of document insertions after which the write side commits the changes
        insert_batch_commit_size: 50000000
        # The default embedding model used to calculate the embeddings
        # if not specified in the collection creation
        default_embedding_model: BGESmall
        # The maximum number of requests to the JavaScript runtime that can be stored in the queue
        # Note: the elements are in memory, so be careful with this value
        javascript_queue_limit: 500000
        # Interval for committing the changes to disk
        commit_interval: 1m
 
reader_side:
    input:
        type: in-memory
    config:
        data_dir: ./.data/reader
        # The number of write operations after which the read side commits the changes
        insert_batch_commit_size: 50000000
        # Interval for committing the changes to disk
        commit_interval: 1m
 
ai_server:
    scheme: http
    host: 0.0.0.0
    port: 50051
    api_key: ""
    max_connections: 15
    total_threads: 12
 
    embeddings:
        default_model_group: small
        dynamically_load_models: false
        execution_providers:
            - CUDAExecutionProvider
            - CPUExecutionProvider
        total_threads: 8
 
        # Specify which model and provider to use for the automatic embeddings selector.
        # It's completely optional. If not specified, the internal LLM will be used.
        automatic_embeddings_selector:
            model: "gpt-4.1"
            provider: openai
 
    # The local LLM provider host, port and model.
    llm:
        port: 8000
        host: 0.0.0.0
        model: "Qwen/Qwen2.5-3B-Instruct"
 
    # Optional, remote LLM providers.
    # Currently, we support the following providers out of the box:
    # - openai: OpenAI API (ChatGPT)
    # - together: Together.ai (LLAMA, Mistral, Qwen, DeepSeek, and more)
    # - fireworks: Fireworks.ai (LLAMA, Mistral, Qwen, DeepSeek, and more)
    # You can specify the provider, API key and default model to use.
    remote_llms:
        - provider: openai
          api_key: "sk-******************************************"
          default_model: "gpt-4o"
        - provider: together
          api_key: "sk-******************************************"
          default_model: "nvidia/Llama-3.1-Nemotron-70B-Instruct-HF"
        - provider: fireworks
          api_key: "sk-******************************************"
          default_model: "accounts/fireworks/models/llama-v3p1-8b-instruct"

All the options above are optional, and you can customize them as needed.

Let's break them down one section at a time.

`http`

The http section configures the HTTP server that serves the OramaCore API. Here are the available options:

host: The host where the HTTP server will listen. By default, it listens on all interfaces (0.0.0.0).
port: The port where the HTTP server will listen. By default, it listens on port 8080.
allow_cors: Whether to allow Cross-Origin Resource Sharing (CORS) requests. By default, it's set to true. We recommend keeping it enabled.
with_prometheus: Whether to expose Prometheus metrics. By default, it's set to true.
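
As a minimal sketch, the fragment below overrides only the http options for a development setup that listens on the loopback interface. The address 127.0.0.1 and the port 9090 are illustrative values, not recommendations from OramaCore, and the fragment assumes that omitted sections keep their defaults.

```yaml
http:
    # Listen on the loopback interface only (illustrative value)
    host: 127.0.0.1
    # Illustrative port; the documented default is 8080
    port: 9090
    allow_cors: true
    # Do not expose Prometheus metrics
    with_prometheus: false
```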

`log`

The log section configures the logging settings for OramaCore. Here are the available options:

file_path: The path to the log file where OramaCore will write logs. By default, it's set to "./log.log".
levels: The logging levels for different components of OramaCore. You can customize the verbosity of logs for each component:
- oramacore: The log level for the core components. Available levels are: trace, debug, info, warn, error. By default, it's set to trace.
- orama_js_pool: The log level for the JavaScript runtime pool. Available levels are: trace, debug, info, warn, error. By default, it's set to info.
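
For instance, a quieter setup might lower the verbosity of both components. This is only a sketch: the file path is a hypothetical location, and the levels are picked from the list above purely for illustration.

```yaml
log:
    # Hypothetical log file location
    file_path: "./logs/oramacore.log"
    levels:
        # Less verbose than the default trace level
        oramacore: info
        # Less verbose than the default info level
        orama_js_pool: warn
```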

`writer_side`

The writer_side section configures the writer side of OramaCore. Here are the available options:

output: The output where the writer side will store the data. By default, it's set to in-memory.
master_api_key: The master API key used to authenticate the requests to the writer side. By default, it's set to an empty string. See more about the available API keys in the API Keys section.
config: The configuration options for the writer side. Here are the available options:
- data_dir: The directory where the writer side will persist the data on disk. By default, it's set to ./.data/writer.
- embedding_queue_limit: The maximum number of embeddings that can be stored in the queue before the writer blocks. This is an in-memory queue, so be careful with this value. By default, it's set to 50000.
- insert_batch_commit_size: The number of document insertions after which the write side will commit the changes. By default, it's set to 50000000.
- default_embedding_model: The default embedding model used to calculate the embeddings if not specified in the collection creation. By default, it's set to BGESmall. See more about the available models in the Embedding Models section.
- javascript_queue_limit: The maximum number of requests to the JavaScript runtime that can be stored in the queue. This is an in-memory queue, so be careful with this value. By default, it's set to 500000.
- commit_interval: The time interval after which changes will be committed to disk, regardless of the number of operations performed. By default, it's set to 1m (1 minute). You can use time units like s (seconds), m (minutes), and h (hours).
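
The fragment below sketches a writer side that commits more often and keeps smaller in-memory queues. All numeric values are illustrative assumptions rather than tested recommendations, and the API key is a placeholder to replace with your own.

```yaml
writer_side:
    output:
        type: in-memory
    # Placeholder: replace with your own API key
    master_api_key: my-master-api-key
    config:
        data_dir: ./.data/writer
        # Smaller in-memory queues (illustrative values)
        embedding_queue_limit: 10000
        javascript_queue_limit: 10000
        # Commit after 1000 insertions, or every 30 seconds regardless
        # of the number of operations (illustrative values)
        insert_batch_commit_size: 1000
        commit_interval: 30s
        default_embedding_model: BGESmall
```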

`reader_side`

The reader_side section configures the reader side of OramaCore. Here are the available options:

input: The input from which the reader side receives the data. By default, it's set to in-memory.
config: The configuration options for the reader side. Here are the available options:
- data_dir: The directory where the reader side will persist the data on disk. By default, it's set to ./.data/reader.
- insert_batch_commit_size: The number of write operations after which the read side will commit the changes. By default, it's set to 50000000.
- commit_interval: The time interval after which changes will be committed to disk, regardless of the number of operations performed. By default, it's set to 1m (1 minute). You can use time units like s (seconds), m (minutes), and h (hours).
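
A comparable sketch for the reader side is shown below. The data directory is a hypothetical path, and the batch size and interval are illustrative values chosen only to show the accepted formats.

```yaml
reader_side:
    input:
        type: in-memory
    config:
        # Hypothetical data directory
        data_dir: /var/lib/oramacore/reader
        # Illustrative value; the documented default is 50000000
        insert_batch_commit_size: 1000
        # Intervals accept the s, m, and h units
        commit_interval: 1h
```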

`ai_server`

The ai_server section configures the Python gRPC server that is responsible for calculating the embeddings and managing LLMs. Here are the available options:

scheme: The URL scheme used to reach the AI server. By default, it's set to http.
host: The host where the AI server will listen. By default, it listens on all interfaces (0.0.0.0).
port: The port where the AI server will listen. By default, it listens on port 50051.
api_key: The API key used to authenticate the requests to the AI server. By default, it's set to an empty string, which means no authentication is required. This default assumes the AI server is not exposed to the public internet, which is not recommended.
max_connections: The maximum number of connections that the AI server will accept. By default, it's set to 15.
total_threads: The total number of threads that the AI server will use. By default, it's set to 12.
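
As a sketch, the top-level ai_server options could be set as follows for a smaller machine. The loopback address, the connection and thread limits, and the API key are all illustrative placeholders, not values recommended by OramaCore.

```yaml
ai_server:
    scheme: http
    # Listen on the loopback interface only (illustrative value)
    host: 127.0.0.1
    port: 50051
    # Placeholder: replace with your own key, or leave empty for no authentication
    api_key: "my-ai-server-api-key"
    # Illustrative limits; the documented defaults are 15 and 12
    max_connections: 5
    total_threads: 4
```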

`embeddings`

The embeddings section, nested under ai_server, configures the embeddings calculation. Here are the available options:

default_model_group: The default model group used to calculate the embeddings if not specified in the collection creation. By default, it's set to small. See more about the available models in the Embedding Models section.
dynamically_load_models: Whether to dynamically load the models. By default, it's set to false.
execution_providers: The execution providers used to calculate the embeddings. By default, it's set to CUDAExecutionProvider and CPUExecutionProvider.
total_threads: The total number of threads used to calculate the embeddings. By default, it's set to 8.
automatic_embeddings_selector: Specifies which model and provider to use for automatically selecting appropriate embedding models. If not specified, the internal LLM will be used. Available options:
- model: The model to use for embedding selection, e.g., "gpt-4.1".
- provider: The provider of the model, e.g., openai.
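
The fragment below is a sketch of a CPU-only embeddings setup, nested under ai_server as in the full example. It assumes that listing CPUExecutionProvider alone is accepted; the thread count is an illustrative value.

```yaml
ai_server:
    embeddings:
        default_model_group: small
        dynamically_load_models: false
        # CPU only: the CUDA provider is left out (assumed to be valid)
        execution_providers:
            - CPUExecutionProvider
        # Illustrative value; the documented default is 8
        total_threads: 4
        # automatic_embeddings_selector is omitted, so the internal LLM is used
```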

`llm`

The llm section, nested under ai_server, configures the local LLM provider. Here are the available options:

port: The port on which the local LLM server will listen. By default, it's set to 8000 (vLLM).
host: The host of the local LLM provider. By default, it's set to 0.0.0.0.
model: The model used to run Answer and Reasoning Sessions. By default, it's set to Qwen/Qwen2.5-3B-Instruct.
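
A minimal sketch of the nesting, with the local LLM server moved to another port, could look like this. The port 8001 is an illustrative value, and the model is the documented default.

```yaml
ai_server:
    llm:
        host: 0.0.0.0
        # Illustrative port; the documented default is 8000
        port: 8001
        model: "Qwen/Qwen2.5-3B-Instruct"
```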

`remote_llms`

The remote_llms section, nested under ai_server, configures the remote LLM providers. It's optional and lets you list the remote providers you want to use. Here are the available options for each entry:

provider: The name of the remote LLM provider. Currently, we support the following providers out of the box:
- openai: OpenAI API (ChatGPT).
- together: Together.ai (LLAMA, Mistral, Qwen, DeepSeek, and more).
- fireworks: Fireworks.ai (LLAMA, Mistral, Qwen, DeepSeek, and more).
api_key: The API key used to authenticate the requests to the remote LLM provider. By default, it's set to an empty string. Requests will fail if the API key is not provided.
default_model: The default model to use. Please refer to the individual provider documentation for the available models.
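
As a sketch, a configuration with a single remote provider needs only one list entry. The API key below is a placeholder, and the model name is taken from the full example above.

```yaml
ai_server:
    remote_llms:
        - provider: openai
          # Placeholder: replace with your own OpenAI API key
          api_key: "<your-openai-api-key>"
          default_model: "gpt-4o"
```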
