Python SDK

The WorldJen Python SDK is the programmatic surface for evaluation. Three top-level modules cover the three use cases:

  • worldjen.score — upload one clip at a time and get raw dimension scores.
  • worldjen.rank — upload several clips for one prompt and get a leaderboard.
  • worldjen.bench — full benchmark runs against many prompts and one or more models.

For typed exceptions, see Errors.

Install

sh
pip install worldjen

With uv:

sh
uv pip install worldjen

Install the runner extra only on machines that operate the GPU runner:

sh
pip install "worldjen[runner]"

Configure

sh
export WORLDJEN_API_KEY="wjk_..."

Or from Python:

python
import worldjen

worldjen.config(api_key="wjk_...")
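
For unattended scripts, it can help to check the key before the first SDK call. The following is a minimal sketch, not part of the SDK: `load_api_key` is a hypothetical helper, and the `wjk_` prefix check is an assumption based only on the example value shown above.

```python
import os


def load_api_key(env=None):
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper; it is not part of the worldjen package.
    """
    env = os.environ if env is None else env
    key = env.get("WORLDJEN_API_KEY", "").strip()
    if not key:
        raise RuntimeError("WORLDJEN_API_KEY is not set")
    # Assumption: keys start with "wjk_", as in the example value above.
    if not key.startswith("wjk_"):
        raise RuntimeError("WORLDJEN_API_KEY does not look like a WorldJen key")
    return key
```

The returned value could then be passed to worldjen.config(api_key=...).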

Score

python
import worldjen

result = worldjen.score.upload(
    "clip.mp4",
    prompt="a cat sitting on a windowsill",
    dimensions=["aesthetic_quality", "dynamic_degree"],
    wait=True,
    timeout=120,
)
print(result["scores"])

Helpers:

| Function | Purpose |
| --- | --- |
| worldjen.score.get_or_create() | Return the user's Score session (created lazily) |
| worldjen.score.reset() | Clear all videos from the Score session (id preserved) |
| worldjen.score.upload(file, ...) | Upload a clip, optionally wait for scores. Returns {videoId, sessionId, scores} |

Rank

Rank sessions lock to the prompt of the first upload. Subsequent uploads either omit prompt (the lock is reused) or pass the same string. Passing a different prompt raises worldjen.RankPromptMismatchError. See Rank prompt lock.

python
import worldjen

worldjen.rank.upload("variant_a.mp4", prompt="a cat", wait=True)
worldjen.rank.upload("variant_b.mp4", wait=True)            # uses the lock

response = worldjen.rank.get_ranked(worldjen.rank.get_or_create()["_id"])
for entry in response["videos"]:
    print(entry["rank"], entry["videoId"], entry["overallScore"])
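
The prompt lock rule can be modeled locally to make its cases concrete. This is an illustrative sketch of the behavior described above, not the SDK's or the server's implementation: `RankSessionModel` and the local `RankPromptMismatchError` are stand-ins defined here so the block runs without the SDK, and the handling of a first upload without a prompt is an assumption.

```python
class RankPromptMismatchError(Exception):
    """Local stand-in for worldjen.RankPromptMismatchError."""


class RankSessionModel:
    """Toy model of a Rank session's prompt lock (hypothetical)."""

    def __init__(self):
        self.rank_prompt = None  # None means the session is unlocked
        self.videos = []

    def upload(self, file, prompt=None):
        if self.rank_prompt is None:
            # Assumption: the first upload has to supply the prompt.
            if prompt is None:
                raise ValueError("the first upload needs a prompt")
            self.rank_prompt = prompt  # the first upload sets the lock
        elif prompt is not None and prompt != self.rank_prompt:
            # A different prompt on a locked session is rejected.
            raise RankPromptMismatchError(
                f"session is locked to {self.rank_prompt!r}, got {prompt!r}"
            )
        self.videos.append(file)

    def reset(self):
        # Mirrors the documented reset: clear videos and unlock the prompt.
        self.videos.clear()
        self.rank_prompt = None


session = RankSessionModel()
session.upload("variant_a.mp4", prompt="a cat")  # sets the lock
session.upload("variant_b.mp4")                  # omits prompt, reuses the lock
session.upload("variant_c.mp4", prompt="a cat")  # same string is accepted
```

In this model, uploading with prompt="a dog" would raise RankPromptMismatchError until reset() is called.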

Helpers:

| Function | Purpose |
| --- | --- |
| worldjen.rank.get_or_create() | Return the user's Rank session |
| worldjen.rank.reset() | Clear all videos and unlock the prompt |
| worldjen.rank.upload(file, prompt=None, ...) | Upload a clip; prompt optional once locked |
| worldjen.rank.get_ranked(run_id) | Return {videos: [...], rankPrompt, rankPromptLocked}. Each video carries rank, videoId, overallScore, plus media metadata. |

Bench

Bench has two entry points: an async worker-backed create(...) and an in-process pipeline driver run_with_pipeline(...).

python
import worldjen

# Entry point 1: enqueue a run on a worker queue.
run_id = worldjen.bench.create(
    name="nightly-comparison",
    model_id="MODEL_ID",
    dimensions=["subject_consistency", "motion_smoothness"],
    runner_id="RUNNER_ID",
    reference_model_id="REFERENCE_MODEL_ID",
    reasoning_enabled=True,
)

# Entry point 2: drive a pipeline in-process.
# my_pipeline is your own callable or pipeline object; it is not defined here.
result = worldjen.bench.run_with_pipeline(
    my_pipeline,
    dimensions=[worldjen.Dimensions.SUBJECT_CONSISTENCY],
    run_name="ltx2-motion-check",
    model_id="MODEL_ID",
    wait_for_evals=True,
)
print(result.run_id, result.status, result.eval_results)

my_pipeline can be a callable or an object with a .generate() or .infer() method. In either form it must accept a prompt keyword argument and return frames or images.
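
To make that contract concrete, here are two stub pipelines of the shapes described above: a plain callable and an object exposing .generate(). This is a sketch under assumptions. The frame representation (nested lists of RGB tuples) is a placeholder, since this page does not specify the expected frame type, and `call_pipeline` is a hypothetical helper showing one way a driver could dispatch between the shapes, not the SDK's actual logic.

```python
def solid_frame(width, height, rgb):
    """Build one placeholder frame as rows of identical RGB tuples."""
    return [[rgb for _ in range(width)] for _ in range(height)]


def callable_pipeline(*, prompt, num_frames=4):
    """Callable form: takes a prompt keyword argument, returns a list of frames."""
    shade = len(prompt) % 256  # stand-in for real generation
    return [solid_frame(8, 8, (shade, shade, shade)) for _ in range(num_frames)]


class ObjectPipeline:
    """Object form: exposes .generate(prompt=...)."""

    def generate(self, *, prompt):
        return callable_pipeline(prompt=prompt, num_frames=2)


def call_pipeline(pipeline, prompt):
    """Hypothetical dispatcher: try .generate(), then .infer(), then a plain call.

    The lookup order is an assumption made for this sketch.
    """
    for method_name in ("generate", "infer"):
        method = getattr(pipeline, method_name, None)
        if callable(method):
            return method(prompt=prompt)
    if callable(pipeline):
        return pipeline(prompt=prompt)
    raise TypeError("pipeline must be callable or expose .generate()/.infer()")


frames_from_function = call_pipeline(callable_pipeline, "a cat")
frames_from_object = call_pipeline(ObjectPipeline(), "a cat")
```

Either stub could stand in for my_pipeline while wiring up a run, before a real model is plugged in.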

Helpers:

| Function | Purpose |
| --- | --- |
| worldjen.bench.create(name, model_id, dimensions, *, runner_id, ...) | Enqueue a Bench run on a worker queue |
| worldjen.bench.run_with_pipeline(pipeline, dimensions, ...) | Drive a Python pipeline in-process |
| worldjen.bench.list(status=None, page=1, limit=50) | List Bench runs |
| worldjen.bench.get(run_id) | Fetch Bench run metadata |
| worldjen.bench.cancel(run_id) | Cancel a Bench run |
| worldjen.bench.delete(run_id) | Delete a Bench run |
| worldjen.bench.videos(run_id) | List generated media |
| worldjen.bench.csv(run_id) | Download scored CSV bytes |
| worldjen.bench.logs(run_id) | Fetch runner logs |
| worldjen.bench.download_videos(run_id, output_dir) | Download generated media |
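
Because worldjen.bench.csv(run_id) returns bytes, a small parser makes the rows usable from Python. This is a minimal sketch, assuming the payload is UTF-8 text with a header row. `parse_scored_csv` is a hypothetical helper, and the column names and values in the sample are placeholders, not the real export schema.

```python
import csv
import io


def parse_scored_csv(data: bytes):
    """Decode CSV bytes and return one dict per row, keyed by the header."""
    text = data.decode("utf-8-sig")  # "utf-8-sig" also tolerates a leading BOM
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Placeholder payload; in real use this would come from worldjen.bench.csv(run_id).
sample = b"prompt,dimension,score\r\na cat,subject_consistency,0.9\r\n"
rows = parse_scored_csv(sample)
for row in rows:
    print(row["dimension"], float(row["score"]))
```

csv.DictReader returns every value as a string, so numeric columns need an explicit conversion as shown in the loop.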

Dimensions

python
import worldjen
from worldjen import Dimensions

# Select dimensions by constant.
video_dimensions = [
    Dimensions.SUBJECT_CONSISTENCY,
    Dimensions.MOTION_SMOOTHNESS,
    Dimensions.SEMANTIC_ADHERENCE,
]

# Or query the available dimensions, optionally filtered by model type.
all_dimensions = worldjen.list_dimensions()
image_dimensions = worldjen.list_dimensions(model_type="t2i")

See Dimensions for descriptions.

Errors

All SDK errors derive from worldjen.WorldJenError. Subclasses let callers branch on the kind of failure without reading message text:

| Exception | Trigger |
| --- | --- |
| worldjen.AuthError | 401/403: missing or invalid API key |
| worldjen.ValidationError | 400: caller-fixable input error; includes detail and optional hint |
| worldjen.RankPromptMismatchError | 400 with code="RANK_PROMPT_MISMATCH"; includes the server-provided detail |
| worldjen.NotFoundError | 404 |
| worldjen.RateLimitError | 429; carries retry_after seconds when the server reports it |
| worldjen.ServerError | 5xx; treat as transient |
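
Branching on these classes might look like the retry helper below. It is a sketch under assumptions: the exception classes are local stand-ins that mirror the names in the table so the block runs without the SDK (in real code they would come from worldjen), their constructors are invented for this example, and `call_with_retry` with its backoff policy is a hypothetical helper rather than an SDK feature.

```python
import time


class WorldJenError(Exception):
    """Local stand-in for worldjen.WorldJenError."""


class RateLimitError(WorldJenError):
    """Local stand-in; the constructor signature is an assumption."""

    def __init__(self, message="", retry_after=None):
        super().__init__(message)
        self.retry_after = retry_after


class ServerError(WorldJenError):
    """Local stand-in for worldjen.ServerError."""


def call_with_retry(fn, *, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying only the failures the table marks as retryable."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        backoff = base_delay * 2 ** (attempt - 1)  # 1x, 2x, 4x, ...
        try:
            return fn()
        except RateLimitError as exc:
            if attempt == attempts:
                raise
            # Prefer the server's hint when it is present.
            delay = exc.retry_after if exc.retry_after is not None else backoff
        except ServerError:
            if attempt == attempts:
                raise
            delay = backoff
        sleep(delay)
```

Any other WorldJenError subclass, such as a validation or auth failure, propagates immediately, because retrying would not change the outcome. A call could be wrapped as call_with_retry(lambda: worldjen.bench.get(run_id)).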

Models, projects, preferences, API keys

python
import worldjen

models = worldjen.models.list_user()
projects = worldjen.projects.list_projects()
prefs = worldjen.preferences.get()
keys = worldjen.api_keys.list_keys()

SDK vs CLI

Use the SDK for Python-native automation and in-process pipeline execution. Use the CLI for shell scripts, runner service operations, and one-off inspection. Both wrap the same REST API.