Python SDK
The WorldJen Python SDK is the programmatic surface for evaluation. Three top-level modules cover the three use cases:
- worldjen.score: upload one clip at a time and get raw dimension scores.
- worldjen.rank: upload several clips for one prompt and get a leaderboard.
- worldjen.bench: run full benchmarks against many prompts and one or more models.
For typed exceptions, see Errors.
Install
pip install worldjen
With uv:
uv pip install worldjen
Install the runner extra only on machines that operate the GPU runner:
pip install "worldjen[runner]"
Configure
export WORLDJEN_API_KEY="wjk_..."
Or from Python:
import worldjen
worldjen.config(api_key="wjk_...")
Score
import worldjen
result = worldjen.score.upload(
    "clip.mp4",
    prompt="a cat sitting on a windowsill",
    dimensions=["aesthetic_quality", "dynamic_degree"],
    wait=True,
    timeout=120,
)
print(result["scores"])
Helpers:
| Function | Purpose |
|---|---|
| worldjen.score.get_or_create() | Return the user's Score session (created lazily) |
| worldjen.score.reset() | Clear all videos from the Score session (id preserved) |
| worldjen.score.upload(file, ...) | Upload a clip, optionally wait for scores. Returns {videoId, sessionId, scores} |
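Scoring several clips in a row can be sketched as a small loop around the upload call. This is a minimal sketch, not part of the SDK: `score_clips` is a hypothetical helper, and the `upload` argument is meant to be `worldjen.score.upload`, passed in so the helper can be exercised without network access. It relies only on the documented return keys `videoId` and `scores`; the internal shape of `scores` is not specified here, so it is stored as-is.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple


def score_clips(
    upload: Callable[..., Dict[str, Any]],
    clips: Iterable[Tuple[str, str]],
    *,
    dimensions: List[str],
    timeout: int = 120,
) -> Dict[str, Any]:
    """Upload (path, prompt) pairs one at a time and collect scores by videoId.

    `upload` is assumed to behave like worldjen.score.upload: it accepts the
    file path plus prompt/dimensions/wait/timeout keywords and returns a dict
    containing "videoId" and "scores".
    """
    scores_by_video: Dict[str, Any] = {}
    for path, prompt in clips:
        result = upload(
            path,
            prompt=prompt,
            dimensions=dimensions,
            wait=True,  # block until scores are available
            timeout=timeout,
        )
        scores_by_video[result["videoId"]] = result["scores"]
    return scores_by_video
```

With the SDK configured, a call might look like `score_clips(worldjen.score.upload, [("clip.mp4", "a cat")], dimensions=["aesthetic_quality"])`.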
Rank
Rank sessions lock to the prompt of the first upload. Subsequent uploads either omit prompt (the lock is reused) or pass the same string. Passing a different prompt raises worldjen.RankPromptMismatchError. See Rank prompt lock.
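One way to stay within the prompt lock from a script is to send the prompt only with the first clip and omit it afterwards. The sketch below is illustrative rather than part of the SDK: `upload_variants` is a hypothetical helper, the `upload` argument is meant to be `worldjen.rank.upload`, and it assumes the Rank session starts unlocked (for example, right after `worldjen.rank.reset()`).

```python
from typing import Any, Callable, List, Sequence


def upload_variants(
    upload: Callable[..., Any],
    paths: Sequence[str],
    prompt: str,
) -> List[Any]:
    """Upload clips to one Rank session, passing the prompt only once.

    The first upload sets the lock; later uploads omit `prompt` so they
    reuse it and cannot trigger a prompt mismatch.
    """
    results: List[Any] = []
    for index, path in enumerate(paths):
        if index == 0:
            results.append(upload(path, prompt=prompt, wait=True))
        else:
            results.append(upload(path, wait=True))
    return results
```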
import worldjen
worldjen.rank.upload("variant_a.mp4", prompt="a cat", wait=True)
worldjen.rank.upload("variant_b.mp4", wait=True) # uses the lock
response = worldjen.rank.get_ranked(worldjen.rank.get_or_create()["_id"])
for entry in response["videos"]:
    print(entry["rank"], entry["videoId"], entry["overallScore"])
Helpers:
| Function | Purpose |
|---|---|
| worldjen.rank.get_or_create() | Return the user's Rank session |
| worldjen.rank.reset() | Clear all videos and unlock the prompt |
| worldjen.rank.upload(file, prompt=None, ...) | Upload a clip; prompt optional once locked |
| worldjen.rank.get_ranked(run_id) | Return {videos: [...], rankPrompt, rankPromptLocked}. Each video carries rank, videoId, overallScore, plus media metadata. |
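The get_ranked response can be turned into a printable leaderboard with a few lines of plain Python. This is a sketch under assumptions: `leaderboard_lines` is a hypothetical helper, it assumes that a lower `rank` value means a better clip, and it assumes `overallScore` is numeric. It uses only the documented `rank`, `videoId`, and `overallScore` fields.

```python
from typing import Any, Dict, List


def leaderboard_lines(response: Dict[str, Any]) -> List[str]:
    """Format a get_ranked-style response as one text line per video.

    Assumes lower `rank` is better and `overallScore` is a number.
    """
    ordered = sorted(response["videos"], key=lambda entry: entry["rank"])
    return [
        f'{entry["rank"]}. {entry["videoId"]} ({entry["overallScore"]:.2f})'
        for entry in ordered
    ]


# Hypothetical response data, shaped like the documented return value.
sample = {
    "videos": [
        {"rank": 2, "videoId": "vid_a", "overallScore": 0.8},
        {"rank": 1, "videoId": "vid_b", "overallScore": 0.91},
    ],
    "rankPrompt": "a cat",
    "rankPromptLocked": True,
}
print("\n".join(leaderboard_lines(sample)))
```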
Bench
Bench has two entry points: an async worker-backed create(...) and an in-process pipeline driver run_with_pipeline(...).
import worldjen
run_id = worldjen.bench.create(
    name="nightly-comparison",
    model_id="MODEL_ID",
    dimensions=["subject_consistency", "motion_smoothness"],
    runner_id="RUNNER_ID",
    reference_model_id="REFERENCE_MODEL_ID",
    reasoning_enabled=True,
)
result = worldjen.bench.run_with_pipeline(
    my_pipeline,
    dimensions=[worldjen.Dimensions.SUBJECT_CONSISTENCY],
    run_name="ltx2-motion-check",
    model_id="MODEL_ID",
    wait_for_evals=True,
)
print(result.run_id, result.status, result.eval_results)
my_pipeline can be a callable or an object with .generate()/.infer(). It must accept a prompt kwarg and return frames or images.
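The two pipeline shapes described above (a plain callable, or an object exposing .generate()) can be sketched with a toy generator. Everything here is illustrative: `solid_color_pipeline` and `SolidColorPipeline` are hypothetical names, and the frame representation (nested lists of RGB tuples) is an assumption made to keep the example dependency-free. The concrete frame or image types the SDK expects are not specified in this section.

```python
from typing import List, Tuple

# Assumed stand-in for a frame: rows of (R, G, B) pixels.
# The frame type the SDK actually expects is not specified here.
Frame = List[List[Tuple[int, int, int]]]


def solid_color_pipeline(*, prompt: str, num_frames: int = 4, size: int = 2) -> List[Frame]:
    """Callable form: accepts a `prompt` kwarg and returns a list of frames."""
    # Derive a deterministic gray level from the prompt so the output is repeatable.
    shade = sum(prompt.encode("utf-8")) % 256
    pixel = (shade, shade, shade)
    return [[[pixel] * size for _ in range(size)] for _ in range(num_frames)]


class SolidColorPipeline:
    """Object form: exposes .generate() with the same prompt-in, frames-out contract."""

    def generate(self, *, prompt: str) -> List[Frame]:
        return solid_color_pipeline(prompt=prompt)
```

Either `solid_color_pipeline` or `SolidColorPipeline()` would then take the place of `my_pipeline` in the `run_with_pipeline(...)` call, provided the frame format matches what the SDK accepts.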
Helpers:
| Function | Purpose |
|---|---|
| worldjen.bench.create(name, model_id, dimensions, *, runner_id, ...) | Enqueue a Bench run on a worker queue |
| worldjen.bench.run_with_pipeline(pipeline, dimensions, ...) | Drive a Python pipeline in-process |
| worldjen.bench.list(status=None, page=1, limit=50) | List Bench runs |
| worldjen.bench.get(run_id) | Fetch Bench run metadata |
| worldjen.bench.cancel(run_id) | Cancel a Bench run |
| worldjen.bench.delete(run_id) | Delete a Bench run |
| worldjen.bench.videos(run_id) | List generated media |
| worldjen.bench.csv(run_id) | Download scored CSV bytes |
| worldjen.bench.logs(run_id) | Fetch runner logs |
| worldjen.bench.download_videos(run_id, output_dir) | Download generated media |
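Because bench.csv returns raw bytes, a caller will usually want to decode them before analysis. The sketch below uses only the standard library. `parse_scored_csv` is a hypothetical helper, and it assumes the payload is UTF-8 with a header row; the column names in the sample are made up for illustration.

```python
import csv
import io
from typing import Dict, List


def parse_scored_csv(data: bytes) -> List[Dict[str, str]]:
    """Decode CSV bytes into one dict per row, keyed by the header row.

    Assumes UTF-8 text (an optional byte-order mark is tolerated) and
    a header line. Values are left as strings.
    """
    text = data.decode("utf-8-sig")
    # newline="" lets the csv module handle embedded line breaks itself.
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Hypothetical payload; real column names may differ.
sample_bytes = b"prompt,subject_consistency\na cat,0.9\n"
rows = parse_scored_csv(sample_bytes)
```

In practice the input would come from `worldjen.bench.csv(run_id)`.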
Dimensions
from worldjen import Dimensions
video_dimensions = [
    Dimensions.SUBJECT_CONSISTENCY,
    Dimensions.MOTION_SMOOTHNESS,
    Dimensions.SEMANTIC_ADHERENCE,
]
import worldjen
all_dimensions = worldjen.list_dimensions()
image_dimensions = worldjen.list_dimensions(model_type="t2i")
See Dimensions for descriptions.
Errors
All SDK errors derive from worldjen.WorldJenError. Subclasses let callers branch on the kind of failure without reading message text:
| Exception | Trigger |
|---|---|
| worldjen.AuthError | 401/403: missing or invalid API key |
| worldjen.ValidationError | 400: caller-fixable input error; includes detail and optional hint |
| worldjen.RankPromptMismatchError | 400 with code="RANK_PROMPT_MISMATCH"; includes the server-provided detail |
| worldjen.NotFoundError | 404 |
| worldjen.RateLimitError | 429; carries retry_after seconds when the server reports it |
| worldjen.ServerError | 5xx; treat as transient |
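Since rate-limit and server errors are the transient ones, a caller may want to retry only those and let everything else propagate. The following is a minimal sketch, not SDK functionality: `call_with_retry` is a hypothetical helper, and the exception classes are passed in so it does not depend on the SDK being installed. It honors a `retry_after` attribute when the exception carries one and otherwise falls back to exponential backoff.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    *,
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying only on the exception types listed in retry_on.

    Waits exc.retry_after seconds when that attribute is present and not None;
    otherwise waits base_delay * 2 ** (attempt - 1). Re-raises after the
    final attempt. Other exception types propagate immediately.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on as exc:
            if attempt == max_attempts:
                raise
            retry_after = getattr(exc, "retry_after", None)
            delay = retry_after if retry_after is not None else base_delay * 2 ** (attempt - 1)
            sleep(delay)
    raise AssertionError("unreachable")
```

With the SDK, the intended call would pass `retry_on=(worldjen.RateLimitError, worldjen.ServerError)` and wrap the request in a zero-argument function or lambda.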
Models, projects, preferences, API keys
import worldjen
models = worldjen.models.list_user()
projects = worldjen.projects.list_projects()
prefs = worldjen.preferences.get()
keys = worldjen.api_keys.list_keys()
SDK vs CLI
Use the SDK for Python-native automation and in-process pipeline execution. Use the CLI for shell scripts, runner service operations, and one-off inspection. Both wrap the same REST API.
