Question Pipeline

QuestionGenerator is the only component that talks to Anthropic. Every other piece of the codebase consumes pre-generated questions from the stumper.questions table.

flowchart LR
   Client["Client"]
   Gen["QuestionGenerator"]
   Cache[("stumper.questions")]
   Anthropic["Anthropic Claude"]
   Seen["Seen-IDs<br/>(per player)"]

   Client -->|fetch / draw / generate| Gen
   Gen -->|consult| Cache
   Gen -->|consult| Seen
   Gen -.miss / topup.-> Anthropic
   Anthropic -.parse + dedupe.-> Gen
   Gen -->|persist| Cache
   Gen -->|safe payload| Client

Public surface

PHP
$generator->fetch (
   category:   string,        // 'science' | ... | 'random'
   difficulty: string,        // 'easy' | 'medium' | 'hard' | 'auto' | ...
   count:      int,           // 1-20
   prompt:     ?string,       // optional natural-language steering
   seenIds:    int[],         // exclude these question ids
   model:      ?string        // optional Anthropic model override
): Question[]

$generator->draw (
   playerId:   int,
   category:   ?string,
   difficulty: string = 'auto',
   prompt:     ?string,
   retry:      bool = false
): ?array  // ['question' => Question, 'retry' => bool]

$generator->categorize (prompt: string): string[]

// Format helpers — never let raw answer indices reach the client mid-round
$generator->format       ($question)        // full payload (includes answer)
$generator->formatSafe   ($question)        // answer hidden
$generator->formatReveal ($question, $isCorrect)  // post-answer payload
$generator->formatMany     ($questions)
$generator->formatManySafe ($questions)

Three calling patterns

Practice — draw()

The practice screen wants ONE question that the player hasn't seen, matching their filters. draw checks the cache first; if there's a hit, it returns immediately. If not, it generates a small batch from Anthropic, persists them, and returns the first one. The remaining questions stay in the cache for the next caller.

PHP
$result = $generator->draw (
   playerId:   42,
   category:   'science',
   difficulty: 'auto'
);
// → ['question' => Question, 'retry' => false]

Endless — fetch() with count=1

Endless calls fetch (count: 1) after every answer, applying the same dedup-by-seenIds logic as practice mode.
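In PHP terms, one Endless iteration looks roughly like this (the category/difficulty values and $playerId wiring are illustrative, not prescribed):

```php
// Everything this player has ever answered counts as "seen"
$seenIds = Question::findAnsweredIds($playerId);

[$question] = $generator->fetch (
    category:   'random',
    difficulty: 'auto',
    count:      1,
    seenIds:    $seenIds
);
```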

Lobby — fetch() with count=N

When the host starts a lobby, LobbyManager::start calls fetch ($category, $difficulty, $numQuestions, $prompt, $seenIds, $model) once and stores the question IDs on the lobby row. Every player gets the same questions in the same order.
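A sketch of that start path (the $lobby->questionIds assignment and the $q->id accessor are assumptions about the entity shape):

```php
// Inside LobbyManager::start — one fetch, then pin the IDs on the lobby row.
$questions = $generator->fetch (
    category:   $category,
    difficulty: $difficulty,
    count:      $numQuestions,
    prompt:     $prompt,
    seenIds:    $seenIds,
    model:      $model
);
$lobby->questionIds = array_map(fn ($q) => $q->id, $questions);
// Every player replays this same ordered ID list.
```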

Safe payloads

The server has two distinct serializations:

TypeScript
type QuestionSafe = Omit<Question, 'answer' | 'explanation'>;
type QuestionFull = QuestionSafe & { answer: number; explanation: string };

The formatSafe variant is what goes over the wire during a round. The full version is only sent after the answer is revealed (all answered, timer expired, host clicked Next, or the player submits in practice mode).

This boundary lives entirely in the formatter — nothing else has to remember to scrub answers. If you add a new field to the Question entity, update both formatters.
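As a concrete sketch of the two shapes (any fields beyond answer and explanation are illustrative assumptions, not the real entity):

```php
// Illustrative only — the real Question fields may differ.
$generator->formatSafe ($question);
// e.g. ['id' => 7, 'text' => 'What is H2O?', 'choices' => ['Water', 'Salt', 'Gold', 'Iron']]

$generator->format ($question);
// e.g. the same payload plus 'answer' => 0 and 'explanation' => '...'
```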

Dedup

Two layers prevent the same player from seeing the same question twice:

  1. Per-call: fetch accepts $seenIds and filters them out of the cache lookup.
  2. Per-batch: when generating new questions from Anthropic, the generator hashes question text + correct answer to detect near-duplicates of cached rows. Duplicates are dropped instead of persisted.

The seen-IDs are sourced from Question::findAnsweredIds($playerId) — anything in stumper.answers for this player is "seen", correct or not.
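The layer-2 hash might look like this minimal sketch — the normalization rules (whitespace collapse, lowercasing) are assumptions; the pipeline only guarantees that text + correct answer are hashed:

```php
<?php
// Hypothetical near-duplicate fingerprint: normalize, then hash
// question text together with the correct answer.
function questionFingerprint(string $text, string $correctAnswer): string
{
    $norm = fn (string $s): string =>
        mb_strtolower(trim(preg_replace('/\s+/', ' ', $s)));
    return hash('sha256', $norm($text) . '|' . $norm($correctAnswer));
}

// Rewordings that differ only in case/whitespace collide on purpose:
questionFingerprint("What is  H2O?", "Water")
    === questionFingerprint("what is h2o?", "water"); // true
```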

Categorize

categorize($prompt) is a small one-shot Anthropic call that suggests up to N canonical category names for a free-form prompt. It powers the practice screen's "Suggest categories" feature when the user types a topic. The endpoint is rate-limited (#[Throttle (limit: 20, interval: 60)]).
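For example (the prompt and the returned names are illustrative):

```php
$categories = $generator->categorize ('victorian steam engines');
// e.g. ['history', 'technology']
```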

Why a per-game cache table

The simpler design would be to call Anthropic on every request. We don't because:

  • Latency: Anthropic calls are 1-3s; cache hits are <10ms.
  • Cost: same question served 1000 times costs once.
  • Determinism in lobbies: every player needs to see the exact same question. Without persistence we'd have to either serialize the entire question into the lobby state (gross) or hope two parallel calls return the same content (impossible).

The cache is intentionally append-only. Questions are never modified after insert, so cache invalidation is a non-problem.