A creative-writing benchmark for LLMs · prose craft, style, willingness
79 models · generated 2026-05-23 20:32 UTC
↑ higher is better · ↓ lower is better
C0–C3 refusal rate · C2/C4 engagement rate · EngD harm density on engaged refusable runs