Configuration
Learn how to configure OramaCore
Configuring OramaCore
OramaCore uses a straightforward config.yaml file for configuration and customization.
We provide robust defaults for nearly every option, so you rarely need to write a configuration file yourself. Still, here is an example of a typical config.yaml file:
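The sketch below is assembled from the default values documented in the sections that follow. The nesting of the embeddings and LLMs sections under ai_server and the list form of execution_providers are assumptions, so check them against your OramaCore version before relying on them.

```yaml
# Sketch only: every value is a default documented in the sections below.
http:
  host: 0.0.0.0
  port: 8080
  allow_cors: true
  with_prometheus: true

writer_side:
  output: in-memory
  master_api_key: ""            # documented default: empty string
  config:
    data_dir: ./.data/writer
    embedding_queue_limit: 50000
    insert_batch_commit_size: 5000
    default_embedding_model: MultilingualE5Small

reader_side:
  input: in-memory
  config:
    data_dir: ./.data/reader
    insert_batch_commit_size: 50000

ai_server:
  scheme: http
  host: 0.0.0.0
  port: 50051
  api_key: ""                   # documented default: empty string
  max_connections: 15
  total_threads: 12

  # Assumption: embeddings and LLMs are nested under ai_server.
  embeddings:
    default_model_group: multilingual
    dynamically_load_models: false
    execution_providers:        # assumption: written as a YAML list
      - CUDAExecutionProvider
      - CPUExecutionProvider
    total_threads: 8

  LLMs:
    default_model: "microsoft/Phi-3.5-mini-instruct"
```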
All the options above are optional, and you can customize them as needed.
Let's break them down one section at a time.
http
The http section configures the HTTP server that serves the OramaCore API. Here are the available options:
host: The host where the HTTP server will listen. By default, it listens on all interfaces (0.0.0.0).
port: The port where the HTTP server will listen. By default, it listens on port 8080.
allow_cors: Whether to allow Cross-Origin Resource Sharing (CORS) requests. By default, it's set to true. We recommend keeping it enabled.
with_prometheus: Whether to expose Prometheus metrics. By default, it's set to true.
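As an illustration, a setup that should only accept local connections on a different port could override just two keys and leave the rest at their defaults. The address and port below are illustrative values, not recommendations.

```yaml
http:
  host: 127.0.0.1   # illustrative: listen on the loopback interface only
  port: 9000        # illustrative: any free port
```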
writer_side
The writer_side section configures the writer side of OramaCore. Here are the available options:
output: The output where the writer side will store the data. By default, it's set to in-memory.
master_api_key: The master API key used to authenticate requests to the writer side. By default, it's set to an empty string. See more about the available API keys in the API Keys section.
config: The configuration options for the writer side. Here are the available options:
  data_dir: The directory where the writer side will persist the data on disk. By default, it's set to ./.data/writer.
  embedding_queue_limit: The maximum number of embeddings that can be held in the queue before the writer blocks. By default, it's set to 50000.
  insert_batch_commit_size: The number of document insertions after which the writer side will commit the changes. By default, it's set to 5000.
  default_embedding_model: The default embedding model used to calculate the embeddings when none is specified at collection creation. By default, it's set to MultilingualE5Small. See more about the available models in the Embedding Models section.
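A writer_side block that sets a master API key and moves the data directory might look like the sketch below. The key and the path are placeholders invented for illustration, and the smaller batch size is only an example of trading throughput for more frequent commits.

```yaml
writer_side:
  master_api_key: change-me               # placeholder, not a real key
  config:
    data_dir: /var/lib/oramacore/writer   # hypothetical path
    insert_batch_commit_size: 1000        # illustrative: commit more often than the default 5000
```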
reader_side
The reader_side section configures the reader side of OramaCore. Here are the available options:
input: The input where the reader side will store the data. By default, it's set to in-memory.
config: The configuration options for the reader side. Here are the available options:
  data_dir: The directory where the reader side will persist the data on disk. By default, it's set to ./.data/reader.
  insert_batch_commit_size: The number of write operations after which the reader side will commit the changes. By default, it's set to 50000.
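The reader side can be adjusted in the same way. In this sketch the path is hypothetical and the batch size is an illustrative value; the only point is that the reader keeps its own data_dir, separate from the writer's.

```yaml
reader_side:
  config:
    data_dir: /var/lib/oramacore/reader   # hypothetical path, distinct from the writer's
    insert_batch_commit_size: 10000       # illustrative: commit more often than the default 50000
```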
ai_server
The ai_server section configures the Python gRPC server that is responsible for calculating the embeddings and managing LLMs. Here are the available options:
scheme: The URL scheme the AI server uses. By default, it's set to http.
host: The host where the AI server will listen. By default, it listens on all interfaces (0.0.0.0).
port: The port where the AI server will listen. By default, it listens on port 50051.
api_key: The API key used to authenticate requests to the AI server. By default, it's set to an empty string, so no authentication is required; the AI server is not meant to be exposed to the public internet.
max_connections: The maximum number of connections that the AI server will accept. By default, it's set to 15.
total_threads: The total number of threads that the AI server will use. By default, it's set to 12.
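If the AI server has to be reachable from another machine, setting an API key is a sensible precaution. The sketch below assumes a non-empty api_key is enough to turn authentication on; the key itself is a placeholder.

```yaml
ai_server:
  scheme: http
  host: 0.0.0.0
  port: 50051
  api_key: change-me     # placeholder; assumption: a non-empty value enables authentication
  max_connections: 15
  total_threads: 12
```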
The embeddings section configures the embeddings calculation. Here are the available options:
default_model_group: The default model group used to calculate the embeddings when none is specified at collection creation. By default, it's set to multilingual. See more about the available models in the Embedding Models section.
dynamically_load_models: Whether to dynamically load the models. By default, it's set to false.
execution_providers: The execution providers used to calculate the embeddings. By default, it's set to CUDAExecutionProvider and CPUExecutionProvider.
total_threads: The total number of threads used to calculate the embeddings. By default, it's set to 8.
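On a machine without a CUDA-capable GPU, one option is to list only the CPU provider. This sketch assumes that embeddings is nested under ai_server, that execution_providers is written as a YAML list, and that a single-entry list is accepted; the thread count is an illustrative value.

```yaml
ai_server:
  embeddings:                  # assumption: nested under ai_server
    execution_providers:       # assumption: YAML list form
      - CPUExecutionProvider   # CPU only, no CUDA provider listed
    total_threads: 4           # illustrative value
```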
The LLMs section configures the Language Models. Here are the available options:
default_model: The default model used to perform the Language Model operations. By default, it's set to microsoft/Phi-3.5-mini-instruct. You can set it to any model available in the Hugging Face Model Hub.
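Switching to a different Hugging Face model should only require changing this one value. The model name below is just an example of a Hub identifier, and the nesting of LLMs under ai_server is an assumption; whether a given model runs well depends on your hardware.

```yaml
ai_server:
  LLMs:                                                 # assumption: nested under ai_server
    default_model: "microsoft/Phi-3-mini-4k-instruct"   # illustrative Hub model id
```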