Scoring
WorldJen produces per-dimension scores for each video and an overall score for each dimension within a run.
What you see
- Per dimension — Each video gets a score between 0 and 100 for each selected dimension that applies to its prompt. Higher means the video better satisfies what that dimension measures (e.g. motion smoothness, semantic adherence).
- Overall score — For a dimension, the overall score is an aggregate of the video scores for that dimension (e.g. the average).
Scores appear in the run page summary, in the gallery for each video, and in exports (e.g. CSV).
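As a rough illustration of how an overall score relates to the per-video scores, the sketch below averages each dimension across the videos in a run. It assumes the aggregate is a plain mean, which the text gives only as an example. The video names, dimension keys, and the `overall_scores` helper are hypothetical and not part of WorldJen.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-video scores (0-100) for one run; names and values are illustrative only.
video_scores = {
    "video_a": {"motion_smoothness": 80.0, "semantic_adherence": 60.0},
    "video_b": {"motion_smoothness": 90.0, "semantic_adherence": 70.0},
    "video_c": {"motion_smoothness": 70.0},  # this dimension set differs per video
}

def overall_scores(per_video):
    """Average each dimension over the videos that have a score for it.

    Assumes the aggregate is an arithmetic mean; the real aggregate may differ.
    """
    by_dimension = defaultdict(list)
    for scores in per_video.values():
        for dimension, score in scores.items():
            by_dimension[dimension].append(score)
    return {dimension: mean(values) for dimension, values in by_dimension.items()}

overall = overall_scores(video_scores)
for dimension, score in sorted(overall.items()):
    print(f"{dimension}: {score:.1f}")
```

Videos without a score for a dimension are simply left out of that dimension's average here; whether the product treats them the same way is not stated.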
Using scores
Use overall dimension scores to see where a model is strong or weak (motion vs. physics vs. prompt following). Use per-video scores to compare runs or models on the same prompts. Exports let you analyze results in your own tools.
For the list of dimensions and what they measure, see Dimensions.
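One way to compare two runs on the same prompts from an export is sketched below, using only the Python standard library. The CSV layout is an assumption: the column names `run`, `prompt_id`, `dimension`, and `score` are made up for illustration, so check the header of your actual export and adjust. The inline data stands in for a real file.

```python
import csv
import io
from collections import defaultdict

# Stand-in for an exported CSV. The column names below are hypothetical;
# replace them with the ones in your actual export.
EXPORT = """run,prompt_id,dimension,score
model_a,p1,motion_smoothness,82
model_a,p2,motion_smoothness,74
model_b,p1,motion_smoothness,78
model_b,p2,motion_smoothness,88
"""

def score_deltas(csv_text, base_run, other_run):
    """Return {(prompt_id, dimension): other - base} for pairs scored in both runs."""
    scores = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["prompt_id"], row["dimension"])
        scores[key][row["run"]] = float(row["score"])
    return {
        key: runs[other_run] - runs[base_run]
        for key, runs in scores.items()
        if base_run in runs and other_run in runs
    }

deltas = score_deltas(EXPORT, "model_a", "model_b")
for (prompt_id, dimension), delta in sorted(deltas.items()):
    print(f"{prompt_id} {dimension}: {delta:+.1f}")
```

To read a real export, pass the file contents instead of `EXPORT`, for example `open(path, newline="").read()`.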
Score, Rank, and Bench
The scoring engine is the same across all three surfaces — what differs is what you upload and what you get back:
- Score session — single-clip uploads, raw per-dimension scores per video.
- Rank session — multiple clips for one prompt, plus a leaderboard sorted by aggregate score.
- Bench run — many prompts evaluated on a worker queue; CSV export, per-dimension stats, optional reference-model comparison.
Pick the surface that matches your workflow. For more on each one, see Concepts.
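To make the Rank session leaderboard described above concrete, here is a minimal sketch that orders clips for one prompt by an aggregate score. It assumes the aggregate is the mean of a clip's per-dimension scores, which the text does not specify. The clip names, scores, and the `leaderboard` helper are hypothetical.

```python
from statistics import mean

# Hypothetical per-dimension scores (0-100) for clips generated from the same prompt.
clips = {
    "clip_1": {"motion_smoothness": 70.0, "semantic_adherence": 90.0},
    "clip_2": {"motion_smoothness": 85.0, "semantic_adherence": 85.0},
    "clip_3": {"motion_smoothness": 60.0, "semantic_adherence": 70.0},
}

def leaderboard(clip_scores):
    """Return (clip, aggregate) pairs sorted from highest to lowest aggregate.

    Assumes the aggregate is the unweighted mean across dimensions.
    """
    rows = [(name, mean(list(scores.values()))) for name, scores in clip_scores.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)

ranking = leaderboard(clips)
for position, (name, aggregate) in enumerate(ranking, start=1):
    print(f"{position}. {name}: {aggregate:.1f}")
```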
