Actively onboarding contributors · Project Spark open

Evaluation data
for frontier AI.

Partner with us to author rigorous, machine-verifiable coding tasks that benchmark the world's most advanced AI agents — on real codebases, with real standards.

Trusted by engineers across 20+ countries

500+
Tasks authored
60+
Active contributors
< 24h
Review turnaround
11
Quality criteria

What we build

The benchmark layer
AI labs depend on.

Every task you author is a real coding challenge — grounded in actual code, verifiable by machine, and scored against an 11-criterion quality bar.

Real codebases

Tasks are grounded in actual production-grade code — not toy problems. The agent must understand real code to succeed.

Machine-verifiable

Every task includes a test harness and reference solution. Pass/fail is determined objectively — no human bias in the score.
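As a rough illustration of what "machine-verifiable" can mean, here is a deliberately tiny sketch. Everything in it is hypothetical: `reference_solution`, `CASES`, and `verify` are made-up names, and real tasks are grounded in production code rather than a toy function like this. It only shows the shape of the idea: a reference solution, a fixed set of checks, and a pass/fail result with no human judgment involved.

```python
from typing import Callable

def reference_solution(values: list[int]) -> list[int]:
    """Hypothetical reference solution: drop duplicates, keep first-seen order."""
    seen = set()
    out = []
    for v in values:
        if v not in seen:
            seen.add(v)
            out.append(v)
    return out

# Hypothetical harness cases: (input, expected output).
CASES = [
    ([], []),
    ([1, 1, 2], [1, 2]),
    ([3, 1, 3, 2, 1], [3, 1, 2]),
]

def verify(candidate: Callable[[list[int]], list[int]]) -> bool:
    """Return True only if the candidate is correct on every case."""
    for given, expected in CASES:
        try:
            # Pass a copy so a candidate cannot mutate the shared case data.
            if candidate(list(given)) != expected:
                return False
        except Exception:
            # A crash counts as a fail, not as an error in the harness.
            return False
    return True
```

Under these assumptions, `verify(reference_solution)` passes, while a candidate that sorts its output instead of preserving order fails the third case.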

11-point quality bar

Tasks are reviewed against 11 criteria — Verifiable, Solvable, Fair, Deterministic and more. All must pass. No shortcuts.

100%
Real production code
< 24h
Verdict turnaround
Weekly
Payout schedule

Programs

Evaluation programs

Spark accepting · Aura & Titan paused
Paused

Project Aura

SWE-bench-style coding tasks

Rigorous coding challenges grounded in real code, reviewed within a 24-hour SLA and calibrated to a target difficulty band.

Languages
Python · TS · Rust · Go
Payouts
Weekly · Bonuses available
Accepting now · Featured

Project Spark

SWE-bench-style coding tasks

The primary program open to new contributors worldwide. Same high quality bar, fast reviews, and weekly earnings.

Languages
Python · TS · Rust · Go
Payouts
Weekly · Bonuses available
Paused

Project Titan

Long-horizon system-level tasks

Complex, multi-step tasks requiring deep understanding across large codebases. Invite-only for senior contributors.

Eligibility
Senior only · Invite only
Payouts
Higher rates

Process

From sign-up to payout.

01

Create account

Sign up and tell us about your background — languages, years of experience, and the kind of code you work in.

02

Get program access

Our team reviews your profile and grants access to the program that best fits your expertise level.

03

Author & submit

Author coding challenges and submit them for review. Verdicts arrive within 24 hours with detailed rubric feedback.

04

Get paid

Accepted tasks create pending payments automatically. We pay out every Wednesday via your preferred method.

Earnings

Get paid weekly.
Bonuses on top.

Every accepted task earns you real money, paid out weekly. We also run bonuses so the more you contribute, the more you can earn.

Start earning
Weekly payouts · Every Wed
Bonus programs · Earn extra
Tier upgrades · Better rates
Top contributor perks · Unlock more

Quality bar

11 criteria.
All must pass.

Every task is graded on 11 rubric criteria. A single fail means rejection — this is the bar that makes the data genuinely useful to frontier labs.

Verifiable
Well-specified
Solvable
Genuinely difficult
Behavioral
Outcome-verified
Test alignment
Instruction quality
Fair
Anti-cheat
Deterministic

All 11 criteria must score Accept or Strong Accept. Tasks that miss even one are returned with detailed feedback.
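The all-must-pass rule can be sketched in a few lines. This is an illustration under stated assumptions, not the actual review tooling: the criterion names and the two passing verdicts come from this page, while `Verdict.REJECT`, `PASSING`, and `review` are hypothetical names introduced for the example.

```python
from enum import Enum

class Verdict(Enum):
    STRONG_ACCEPT = "Strong Accept"
    ACCEPT = "Accept"
    REJECT = "Reject"  # hypothetical: the page names only the two passing verdicts

# The 11 rubric criteria listed above.
CRITERIA = (
    "Verifiable", "Well-specified", "Solvable", "Genuinely difficult",
    "Behavioral", "Outcome-verified", "Test alignment", "Instruction quality",
    "Fair", "Anti-cheat", "Deterministic",
)

PASSING = {Verdict.ACCEPT, Verdict.STRONG_ACCEPT}

def review(scores: dict[str, Verdict]) -> tuple[bool, list[str]]:
    """Return (accepted, failed_criteria); a single failing criterion rejects the task."""
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"unscored criteria: {missing}")
    failed = [c for c in CRITERIA if scores[c] not in PASSING]
    return (not failed, failed)
```

For example, a task scored Accept on ten criteria and anything else on "Deterministic" comes back as `(False, ["Deterministic"])`, which mirrors the rule that one miss returns the task with feedback.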

Open to contributors worldwide

Ready to
contribute?

Create an account, tell us about your background, and our team will get back to you with program access within 48 hours.