qbrix

Quickstart

Get from zero to a working bandit experiment in under 5 minutes.

Prerequisites

Install the Python SDK

pip install qbrix

The SDK requires Python 3.10+. It wraps all qbrix HTTP endpoints into a typed client — pools, experiments, selection, and feedback.

Create an API Key

  1. Log in to cloud.qbrix.io
  2. Go to Settings → API Keys
  3. Click Create API Key and copy the key

Initialize the client with your API key:

import qbrix
 
client = qbrix.Qbrix(
    api_key="your-api-key",
    base_url="https://cloud.qbrix.io",
)
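
The snippet above hard-codes the key, which is easy to leak into version control. As a minimal sketch, the settings can instead be read from environment variables. The names QBRIX_API_KEY and QBRIX_BASE_URL and the helper load_qbrix_settings are assumptions made for this example, not conventions the SDK is known to follow.

```python
import os

DEFAULT_BASE_URL = "https://cloud.qbrix.io"


def load_qbrix_settings(env=None):
    """Build keyword arguments for qbrix.Qbrix(...) from environment variables.

    QBRIX_API_KEY and QBRIX_BASE_URL are hypothetical names chosen for this
    sketch; the SDK itself may not read them.
    """
    env = os.environ if env is None else env
    api_key = env.get("QBRIX_API_KEY", "").strip()
    if not api_key:
        # Fail early instead of sending unauthenticated requests later.
        raise RuntimeError("QBRIX_API_KEY is not set")
    return {
        "api_key": api_key,
        "base_url": env.get("QBRIX_BASE_URL", DEFAULT_BASE_URL),
    }
```

With the variables exported in your shell, the client can then be created as qbrix.Qbrix(**load_qbrix_settings()).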

Create a Pool

A pool is a collection of arms (variants). Let's create one with 3 homepage hero images:

pool = client.pool.create(
    name="homepage-heroes",
    arms=[
        {"name": "minimal-dark", "metadata": {"image": "hero-1.png"}},
        {"name": "gradient-bold", "metadata": {"image": "hero-2.png"}},
        {"name": "illustration", "metadata": {"image": "hero-3.png"}},
    ],
)
 
print(pool.id)

Create an Experiment

Link the pool to a bandit policy. We'll use Beta Thompson Sampling — a good default for binary rewards (click / no click):

experiment = client.experiment.create(
    name="hero-optimization",
    pool_id=pool.id,
    policy="BetaTSPolicy",
)
 
print(experiment.id)
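
To build intuition for why Beta Thompson Sampling suits click / no click rewards, here is a self-contained sketch of the textbook algorithm. It is an illustration only and does not reflect how qbrix implements BetaTSPolicy; the class BetaThompsonSampler and the function simulate are names invented for this example.

```python
import random


class BetaThompsonSampler:
    """Illustrative Beta-Bernoulli Thompson Sampling; not qbrix's code."""

    def __init__(self, n_arms, seed=None):
        self.rng = random.Random(seed)
        # Beta(1, 1) is a uniform prior over each arm's success rate.
        self.alpha = [1.0] * n_arms
        self.beta = [1.0] * n_arms

    def select(self):
        # Draw one plausible success rate per arm, then play the largest.
        draws = [self.rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)]
        return max(range(len(draws)), key=draws.__getitem__)

    def update(self, arm, reward):
        # A binary reward increments the success or the failure count.
        if reward >= 0.5:
            self.alpha[arm] += 1.0
        else:
            self.beta[arm] += 1.0


def simulate(click_rates, rounds, seed=0):
    """Run the sampler against simulated click rates; return pulls per arm."""
    sampler = BetaThompsonSampler(len(click_rates), seed=seed)
    reward_rng = random.Random(seed + 1)
    pulls = [0] * len(click_rates)
    for _ in range(rounds):
        arm = sampler.select()
        reward = 1.0 if reward_rng.random() < click_rates[arm] else 0.0
        sampler.update(arm, reward)
        pulls[arm] += 1
    return pulls
```

Calling simulate([0.3, 0.7, 0.3], rounds=2000) should show most pulls going to the second arm. Early rounds are spread across arms because the posteriors are still wide; that spread is the exploration the policy performs on your behalf.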

Select an Arm

Ask qbrix to select an arm for a request:

result = client.agent.select(
    experiment_id=experiment.id,
    context={"id": "user-001"},
)
 
print(f"Selected: {result.arm.name}")
print(f"Request ID: {result.request_id}")

The response includes the selected arm and a request_id to link feedback:

{
  "request_id": "req-abc123",
  "arm": {"id": "...", "name": "gradient-bold", "index": 1},
  "is_default": false
}

Send Feedback

After observing the outcome (e.g., user clicked), send a reward signal:

client.agent.feedback(
    request_id=result.request_id,
    reward=1.0,
)

Info: Feedback is processed asynchronously. Policy parameters update within seconds, depending on batch settings.
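
In a real application the selection and the outcome usually happen in different requests, so the request_id has to be kept somewhere until the outcome is known. The sketch below shows one way to do that in memory. PendingFeedback is a hypothetical helper, not part of the SDK, and reporting a timed-out request as a 0.0 reward is a design assumption you may not want.

```python
import time


class PendingFeedback:
    """Remember request IDs until their outcome is known (hypothetical helper)."""

    def __init__(self, send_feedback, ttl_seconds=1800.0, clock=time.monotonic):
        self._send = send_feedback  # for example, client.agent.feedback
        self._ttl = ttl_seconds
        self._clock = clock
        self._pending = {}  # request_id -> time of selection

    def track(self, request_id):
        """Record that an arm was shown for this request."""
        self._pending[request_id] = self._clock()

    def resolve(self, request_id, clicked):
        """Send 1.0 for a click, 0.0 otherwise; ignore unknown IDs."""
        if self._pending.pop(request_id, None) is None:
            return False
        self._send(request_id=request_id, reward=1.0 if clicked else 0.0)
        return True

    def expire(self):
        """Report requests older than the TTL as no-click; return how many."""
        now = self._clock()
        stale = [rid for rid, t in self._pending.items() if now - t > self._ttl]
        for rid in stale:
            self.resolve(rid, clicked=False)
        return len(stale)
```

Call track(result.request_id) right after a selection, resolve(request_id, clicked=True) from your click handler, and expire() periodically. A multi-process deployment would need shared storage instead of a dictionary.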

Verify the Loop

Run a few more select → feedback cycles. As feedback accumulates, the policy should begin to favor higher-reward arms:

import random
 
for i in range(20):
    # select
    result = client.agent.select(
        experiment_id=experiment.id,
        context={"id": f"user-{i}"},
    )
 
    # simulate reward: arm index 1 has 70% click rate, others 30%
    if result.arm.index == 1:
        reward = 1.0 if random.random() < 0.7 else 0.0
    else:
        reward = 1.0 if random.random() < 0.3 else 0.0
 
    # feedback
    client.agent.feedback(
        request_id=result.request_id,
        reward=reward,
    )
 
    print(f"round {i+1}: arm={result.arm.name} reward={reward}")
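
Per-round prints make it hard to judge whether the policy is shifting. As a rough local check, the rounds can be tallied per arm. The helper summarize is an assumption for this example and expects a list of (arm name, reward) pairs collected inside the loop.

```python
from collections import defaultdict


def summarize(rounds):
    """Aggregate (arm_name, reward) pairs into per-arm pulls and mean reward."""
    totals = defaultdict(lambda: [0, 0.0])  # arm -> [pulls, reward sum]
    for arm_name, reward in rounds:
        totals[arm_name][0] += 1
        totals[arm_name][1] += reward
    return {
        arm: {"pulls": pulls, "mean_reward": reward_sum / pulls}
        for arm, (pulls, reward_sum) in totals.items()
    }
```

Inside the loop, append (result.arm.name, reward) to a list and pass that list to summarize afterwards. With only 20 rounds and asynchronous updates the counts may still look fairly even, so a longer run gives a clearer picture.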

View Results

Open your experiment in the qbrix console to see selection distributions and reward metrics.

What's Next