# Question Pipeline
`QuestionGenerator` is the only component that talks to Anthropic. Every other piece of the codebase consumes pre-generated questions from the `stumper.questions` table.
```mermaid
flowchart LR
    Client["Client"]
    Gen["QuestionGenerator"]
    Cache[("stumper.questions")]
    Anthropic["Anthropic Claude"]
    Seen["Seen-IDs<br/>(per player)"]
    Client -->|fetch / draw / generate| Gen
    Gen -->|consult| Cache
    Gen -->|consult| Seen
    Gen -.miss / topup.-> Anthropic
    Anthropic -.parse + dedupe.-> Gen
    Gen -->|persist| Cache
    Gen -->|safe payload| Client
```
## Public surface
```php
$generator->fetch(
    category: string,    // 'science' | ... | 'random'
    difficulty: string,  // 'easy' | 'medium' | 'hard' | 'auto' | ...
    count: int,          // 1-20
    prompt: ?string,     // optional natural-language steering
    seenIds: int[],      // exclude these question ids
    model: ?string       // optional Anthropic model override
): Question[]

$generator->draw(
    playerId: int,
    category: ?string,
    difficulty: string = 'auto',
    prompt: ?string,
    retry: bool = false
): ?array // ['question' => Question, 'retry' => bool]

$generator->categorize(prompt: string): string[]

// Format helpers — never let raw answer indices reach the client mid-round
$generator->format($question)                   // full payload (includes answer)
$generator->formatSafe($question)               // answer hidden
$generator->formatReveal($question, $isCorrect) // post-answer payload
$generator->formatMany($questions)
$generator->formatManySafe($questions)
```
## Three calling patterns
### Practice — draw()
The practice screen wants ONE question the player hasn't seen, matching their filters. `draw` checks the cache first; on a hit, it returns immediately. On a miss, it generates a small batch from Anthropic, persists it, and returns the first question. The remaining questions stay in the cache for the next caller.
```php
$result = $generator->draw(
    playerId: 42,
    category: 'science',
    difficulty: 'auto'
);
// → ['question' => Question, 'retry' => false]
```
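A minimal TypeScript sketch of that cache-first flow (the real code is PHP; the `Deps` interface and the `cacheLookup` / `generateBatch` / `persist` helpers are hypothetical stand-ins for the generator's internals):

```typescript
type Question = { id: number; text: string };

// Hypothetical dependencies, injected so the flow is easy to follow and test.
interface Deps {
  cacheLookup(category: string, difficulty: string, seenIds: number[]): Question | null;
  generateBatch(category: string, difficulty: string): Question[]; // the Anthropic call
  persist(questions: Question[]): void;
}

function draw(
  deps: Deps,
  category: string,
  difficulty: string,
  seenIds: number[],
): Question | null {
  // Fast path: a cached question this player hasn't seen.
  const hit = deps.cacheLookup(category, difficulty, seenIds);
  if (hit !== null) return hit;

  // Miss: generate a small batch, persist all of it, return the first question.
  // The rest stays in stumper.questions for the next caller.
  const batch = deps.generateBatch(category, difficulty);
  deps.persist(batch);
  return batch[0] ?? null;
}
```

Injecting the dependencies keeps the cache-vs-Anthropic decision in one small function instead of scattering it across callers.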
### Endless — fetch() with count=1
Endless mode calls `fetch(count: 1)` after every answer, with the same dedupe-by-`seenIds` logic.
### Lobby — fetch() with count=N
When the host starts a lobby, `LobbyManager::start` calls `fetch($category, $difficulty, $numQuestions, $prompt, $seenIds, $model)` once and stores the question IDs on the lobby row. Every player gets the same questions in the same order.
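The determinism property can be sketched in TypeScript (the `Lobby` shape and helper names here are illustrative, not the actual LobbyManager API):

```typescript
interface Lobby {
  id: number;
  questionIds: number[]; // pinned once at start; never re-fetched
}

// Host start: fetch once, pin the resulting IDs on the lobby row.
function startLobby(lobby: Lobby, fetched: { id: number }[]): void {
  lobby.questionIds = fetched.map(q => q.id);
}

// Every player resolves the same ID for the same round, in the same order.
function questionIdForRound(lobby: Lobby, round: number): number | undefined {
  return lobby.questionIds[round];
}
```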
## Safe payloads
The server has two distinct serializations:
```ts
type QuestionFull = Question & { answer: number; explanation: string };
type QuestionSafe = Omit<QuestionFull, 'answer' | 'explanation'>;
```
The `formatSafe` variant is what goes over the wire during a round. The full version is only sent after the answer is revealed (all answered, timer expired, host clicked Next, or the player submits in practice mode).
This boundary lives entirely in the formatter — nothing else has to remember to scrub answers. If you add a new field to the Question entity, update both formatters.
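A TypeScript sketch of that single scrubbing point (the `choices` field and the function body are assumptions for illustration; the real formatters are PHP):

```typescript
interface QuestionFull {
  id: number;
  text: string;
  choices: string[];    // assumed: the doc mentions "answer indices"
  answer: number;       // index into choices
  explanation: string;
}

type QuestionSafe = Omit<QuestionFull, 'answer' | 'explanation'>;

// The ONE place answers are stripped: everything mid-round goes through here,
// so no caller has to remember to scrub.
function formatSafe(q: QuestionFull): QuestionSafe {
  const { answer, explanation, ...safe } = q; // answer/explanation intentionally dropped
  return safe;
}
```

Because `QuestionSafe` is derived with `Omit`, adding a field to the full type automatically flows into the safe type unless it is explicitly listed as sensitive.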
## Dedup
Two layers prevent the same player from seeing the same question twice:
- Per-call: `fetch` accepts `$seenIds` and filters those ids out of the cache lookup.
- Per-batch: when generating new questions from Anthropic, the generator hashes question text + correct answer to detect near-duplicates of cached rows. Duplicates are dropped instead of persisted.
The seen-IDs are sourced from `Question::findAnsweredIds($playerId)` — anything in `stumper.answers` for this player is "seen", correct or not.
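The per-batch layer can be sketched in TypeScript. The doc only specifies "question text + correct answer"; the `normalize` step and the plain string key (standing in for the hash) are illustrative:

```typescript
// Normalize aggressively so trivial rewordings and punctuation differences collide.
function normalize(s: string): string {
  return s.toLowerCase().replace(/[^a-z0-9]+/g, ' ').trim();
}

// Stand-in for the hash: a canonical key over text + correct answer.
function dedupeKey(text: string, correctAnswer: string): string {
  return `${normalize(text)}|${normalize(correctAnswer)}`;
}

// Drop freshly generated questions whose key matches an already-cached row.
function dropNearDuplicates<T extends { text: string; correct: string }>(
  generated: T[],
  cachedKeys: Set<string>,
): T[] {
  return generated.filter(q => !cachedKeys.has(dedupeKey(q.text, q.correct)));
}
```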
## Categorize
`categorize($prompt)` is a small one-shot Anthropic call that suggests up to N canonical category names for a free-form prompt. It powers the practice screen's "Suggest categories" feature when the user types a topic. The endpoint is rate-limited (`#[Throttle(limit: 20, interval: 60)]`).
## Why a per-game cache table
The simpler design would be to call Anthropic on every request. We don't because:
- Latency: Anthropic calls take 1-3 s; cache hits return in under 10 ms.
- Cost: the same question served 1,000 times is generated, and paid for, once.
- Determinism in lobbies: every player needs to see the exact same questions. Without persistence we'd have to either serialize the entire question set into the lobby state (gross) or hope two parallel calls return the same content (impossible).
The cache is intentionally append-only. Questions are never modified after insert, so cache invalidation is a non-problem.