feat(metadata): Recalculate Metadata Scores — on-demand library-wide recompute (bookshelf-r6lx.2) #583
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "bd-bookshelf-r6lx.2"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
/settings/match-weights; clicking POSTs to admin-gatedPOST /settings/match-weights/recalculate(202 Accepted)RecalcMatchScoresWorkflowuses ContinueAsNew pagination (100 books/epoch) + single-activity-per-page design — no per-book fan-out, bounded history for 250k-book librariesGetWeightsForRecalcfalls back to defaults silently on settings load error so a transient DB hiccup never blocks the sweep; deterministic instance ID "recalc-match-scores" ensures idempotency (double-click → 409)recalculate()method POSTs, handles 409 gracefully, appends "View Workflows" link on successTest plan
internal/wfengine/recalc_scores_workflow_test.go— all 3 activities (success + error paths) + 4 workflow tester scenarios (empty library, single batch, weights fallback, batch error non-fatal) + Engine.StartRecalcMatchScoresWorkflow (success, not registered, already running)internal/books/recompute_scores_test.go— happy path, ErrNotFound skip, transient error, setScore error, context cancellationinternal/settings/match_weights_handler_test.go— RecalcMatchScoresHandler: 202 success, nil dep → 503, ErrConflict propagation, generic errorstatic/js/test/match_weights_settings_controller.test.js— RECALCULATE section: POST to recalculateUrl, server message shown, Workflows link appended, 409 handled without error styling, 503 shows errore2e/api/settings_match_weights_test.go— authenticated POST → 202 + instance_id + message; unauthenticated POST → 401e2e/browser/settings_match_weights_test.go— authenticated click → status span non-empty + screenshot posted to PRCloses bead bookshelf-r6lx.2 on merge.
8bbc79ce2488f4128c36Security Review — PR #583 (bookshelf-r6lx.2: Recalculate Metadata Scores)
Reviewed diff:
origin/main...origin/bd-bookshelf-r6lx.2(HEAD SHA88f4128c). CI green — tests not re-run per review-standard.Findings
[MINOR] e2e/api/settings_match_weights_test.go — no 403 test for authenticated non-admin user on POST /settings/match-weights/recalculate
The new e2e test suite covers the unauthenticated case (401) and the happy-path admin case (202), but does not exercise the authenticated non-admin path (403 Forbidden). The
adminRequiredguard is correctly applied inroutes.goand exists for other settings routes, but the convention ine2e/api/libraries_rbac_test.goandbooks_rbac_test.gois to test the 403 response explicitly usingSeedNonAdmin. The omission is a test-hygiene gap, not a correctness gap — the guard is inherited fromadminRequiredand cannot be bypassed — but it leaves an unexercised branch. Add a Context block that seeds a non-admin user, authenticates withAuthenticatedClientFor, and assertshttp.StatusForbidden.Threat-model walkthrough
AuthZ — admin gate on the recalculate endpoint: CONFIRMED SECURE.
internal/settings/routes.goregisters the route asadminRequired(eh.Wrap(RecalcMatchScoresHandler(mwd)))— identical guard to the existing PUT/DELETE match-weights routes. A non-admin authenticated session cannot reach the handler.DoS / resource — overlapping recompute workflows: CONFIRMED SECURE.
StartRecalcMatchScoresWorkflowuses a fixed deterministic instance ID"recalc-match-scores". A concurrent POST while the workflow is running hitsErrInstanceAlreadyExistsfrom the go-workflows backend, which is mapped toErrBulkAlreadyRunning→ 409 Conflict at the HTTP layer. The JS controller shows a human-readable "already in progress" message without error styling. The workflow itself is cursor-paginated (recalcScoresBatchSize = 100,ContinueAsNewper epoch) with no unbounded fan-out: all scoring runs inline per epoch, satisfying the single-digit fan-out rule.SQL injection — weights and book scan: CONFIRMED SECURE.
ListBooksForScoreBackfillis a parameterised sqlc query with no string interpolation.SetBookMetadataMatchScoreis similarly parameterised. No weight field is interpolated into SQL.Multi-user / per-user data isolation: CONFIRMED CORRECT.
metadata_match_scorelives onbook_metadata, which is per-book not per-user. The recompute is a global admin sweep by design. No per-user data is written; the scan does not touchuser_book_progressor any user-scoped table.CSRF on the POST: CONFIRMED PROTECTED. The Stimulus
recalculate()method sendscredentials: "same-origin"and"X-CSRF-Token": _csrfToken(), consistent with the existingsaveandresetToDefaultsfetch calls. The CSRF middleware (internal/middleware/csrf.go) uses the double-submit cookie pattern and is applied globally.Secrets in logs: CONFIRMED CLEAN. The handler logs
trace_idandinstance_idonly. Instance IDs are stable workflow identifiers, not secrets. No tokens, keys, or PII are logged.Grimmory schema compatibility — no new columns: CONFIRMED. No new migrations in the diff.
metadata_match_scoreonbook_metadataalready existed (same count in main and branch). Architecture boundary is respected:internal/books/recompute_scores.goimports onlymiddlewareandsettings— nogo-workflowsorwfenginesymbols.Error swallowing —
GetWeightsForRecalcnon-fatal fallback: This is an intentional, documented design choice (swallow settings-read error, fall back to defaults so a transient DB hiccup on settings doesn't abort a potentially hours-long library sweep). The trade-off is that a silent settings-layer failure causes the sweep to run with defaults without any operator-visible alert. This is not a security issue but may warrant aWarn-level log line rather than silently swallowing — the existing comment explains the intent, so leaving as-is is acceptable._ = errafterGetWeightsForRecalc: The workflow discards the error from the activity call becauseGetWeightsForRecalcalways returns nil (by design — errors are eaten internally). This is safe given the documented contract, but the_ = erron line 111 ofrecalc_scores_workflow.gois a minor readability concern as it looks like accidental error suppression to a new reader. The comment partially addresses it.REVIEW VERDICT: 0 blocker, 0 major, 1 minor
r6lx2_match_weights_recalculate
[MINOR] internal/wfengine/recalc_scores_workflow.go:68 — GetWeightsForRecalc swallows error without logging
The catch block returns
settings.DefaultMatchWeights(), nilsilently. Per logging-standard, every error catch must log with context before swallowing. The activity struct has no logger field and the workflow ctxgowf.Loggeris not available in activities — so add alogger *slog.Loggerfield toRecalcMatchScoresActivitiesand inject it viaNewRecalcMatchScoresActivitiesWithStubs/buildRecalcMatchScoresDeps, then log:logger.Warn("GetWeightsForRecalc: settings read error, falling back to defaults", "error", err). Without this, a persistent settings DB failure silently recomputes the entire library with wrong/default weights with no operator trace.[MINOR] internal/wfengine/recalc_scores_workflow_test.go:272 — Three Expects in one It block
The It("uses default weights when loading fails") block has three
Expectcalls:workflowErr,usedWeightslength, andusedWeights[0]value. Per project convention, oneExpectperIt. Split into threeItblocks (or fold the two value assertions intoExpect(usedWeights, workflowErr).To(Equal(...))for the value + nil-error combined, and add a separateItfor length). Same pattern appears inengine_new_test.go:561(unregistered-workflowItblock is fine — single Expect).[MINOR] internal/books/recompute_scores_test.go:62 — Loop with multiple Expects in single It
The It("persists a non-negative score for each book") iterates
calledScoreand callsExpect(s).To(BeNumerically(">=", 0))inside a range loop — effectively N Expects perIt. Per convention, oneExpectperIt. Replace withExpect(calledScore).To(HaveEach(BeNumerically(">=", 0)))(single Gomega assertion over the slice).REVIEW VERDICT: 0 blocker, 0 major, 3 minor
- Add RecalcMatchScoresWorkflow real-engine integration test (mirrors BulkLLMSweepWorkflow pattern) that would catch any future queue/registry mismatch; test passes green confirming registration is correct - Add logger to RecalcMatchScoresActivities and Warn log when settings read fails in GetWeightsForRecalc (logging-standard) - Wire logger through engine.go registration block - Fix CSS: add .settings-recalculate { margin-bottom: 2rem } so Field Weights heading is not abutted by the recalculate button (no inline style=) - Fix test: split 3-Expect "uses default weights" It into 3 separate Its (one Expect each, per project conventions) - Fix test: replace per-book Expect loop with HaveEach in recompute_scores_test.go - Add non-admin 403 e2e test for POST /settings/match-weights/recalculate - Add NewRecalcMatchScoresActivitiesWithLogger export helper + logger-path test to cover the if a.logger != nil branch (100% coverage gate) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>Root cause: RecalcMatchScoresWorkflow fires ContinueAsNew epochs at near-instant speed (inline CPU/DB score work with no I/O wait). The MySQL backend uses READ COMMITTED isolation in GetWorkflowTask; under rapid-fire continuations the next worker poll can dequeue a continuation task before its WorkflowExecutionStarted pending event is fully committed — producing "workflow RecalcMatchScoresWorkflow not found" and terminating the workflow with an error instead of completing the sweep. Fix: add a 200ms deterministic dwell (gowf.Sleep) before each ContinueAsNew. The timer is replay-safe and prevents microsecond-interval continuation chains. LibraryScanWorkflow is immune because its per-epoch cover fan-out takes real wall-clock seconds between epochs. Tests: - New multi-epoch integration test forces ≥3 real ContinueAsNew epochs with fast-returning stubs on a real MySQL backend. Asserts detail.Result == "null" (JSON nil = true success) to catch both "workflow not found" and "activity not found" race variants. - New tester unit test covers the gowf.Sleep error path via ScheduleCallback cancel before the 200ms dwell fires. - Fixed both integration test assertions from BeEmpty() to Equal("null"): a successful nil-error workflow serialises its result as "null", not "".f4d0c42ae34433b911164433b9111650667e7c39Security Review — PR #583 (bd-bookshelf-r6lx.2)
Scope:
internal/wfengine/diag_accessor.gotiebreaker fix,internal/wfengine/recalc_scores_workflow.go(new), multi-epoch integration test, handler + wiring + tests added since the prior r6lx review.Checklist findings
1. Parameterized queries / SQL injection
Both modified
diag_accessor.goqueries bindinstance_idvia?placeholder withdb.QueryRowContext. The, id DESCtiebreaker is a static column reference — no string interpolation or user-supplied data in the ORDER BY clause. No injection surface.ListBooksForScoreBackfillinbuildRecalcMatchScoresDepsusesdbsqlc.ListBooksForScoreBackfillParamswith typed fields bound by sqlc — clean.2. Admin gate + CSRF
POST /settings/match-weights/recalculateis registered asadminRequired(eh.Wrap(RecalcMatchScoresHandler(mwd)))ininternal/settings/routes.go. The e2e test explicitly verifies 401 for unauthenticated and 403 for a non-admin authenticated user. Consistent with every other settings-write endpoint on this mux.CSRF: the project uses a SameSite-cookie-based CSRF pattern applied at the middleware level for all non-GET state-changing endpoints; the route lives inside that boundary.
3. Logging — secrets/PII
RecalcMatchScoresHandlerlogs twoInfoevents: one on entry (trace_id only) and one on success (instance_id + trace_id). No user input, no book data, no secrets.GetWeightsForRecalcWarnlog:a.logger.Warn("recalc: settings read failed, using default weights", "error", err)—erris a DB/settings error (e.g. "connection reset", "settings unavailable"), not a secret or PII field. Safe.4. No new Grimmory DB columns
buildRecalcMatchScoresDepscallsq.ListBooksForScoreBackfill,q.GetBookMetadataForMatchScore, andq.SetBookMetadataMatchScore— all pre-existing sqlc-generated queries. No migrations, no new columns.5. Single-flight / duplicate trigger
StartRecalcMatchScoresWorkflowuses the deterministic instance ID"recalc-match-scores"and mapsgobackend.ErrInstanceAlreadyExists → ErrBulkAlreadyRunning → HTTP 409. The caller inbuild_extended_deps.gowraps that asmiddleware.ErrConflict. Double-clicks and concurrent admin requests are idempotent at the engine layer.6. userID from request body
RecalcMatchScoresHandlertakes no body at all (http.NewRequest(http.MethodPost, …, nil)in the e2e). No userID from request. The workflow is intentionally library-global (admin-only, no per-user scope). Correct for an admin sweep.7. Bounded query / no unbounded scan
RecalcMatchScoresWorkflowpaginates atrecalcScoresBatchSize = 100per epoch.ListBooksForScoreBackfillreceives a typedLimit int32parameter. No unbounded query.8. Batch error handling
A
RecomputeScoresBatchactivity failure logs aWarnand callsContinueAsNew(sweep continues).ListBooksPageForRecalcfailure propagates and aborts the epoch (retried up toMaxAttempts: 3).GetWeightsForRecalcfailure falls back to defaults — non-fatal, logged. All three paths are exercised by dedicated tests.9. Multi-epoch integration test
Uses
sync/atomicforrecomputeCallsandpageCallNcounters — no data race.Eventuallywith explicit timeout and poll interval (28s / 300ms). No wall-clock timing assertions in the assertion path (the timeout is a test-infrastructure bound, not an assertion). Clean.10. No workflow engine import in domain packages
internal/books/recompute_scores.goimports onlycontext,errors,fmt,log/slog,middleware, andsettings. Nogo-workflowsimport. Architecture boundary preserved.gowf.NewPermanentErrorappears only in the workflow-tester test (recalc_scores_workflow_test.go) as a test helper to force a permanent error in thetesterharness — this is the wfengine test package, not a domain package. Clean.Findings
No security issues identified.
REVIEW VERDICT: 0 blocker, 0 major, 0 minor
CODE REVIEW: APPROVED
Reviewing the unreviewed changes added since the earlier clean review (diag-tiebreaker fix, multi-epoch integration test, minor fixes).
Phase 0: DEMO Verification
No DEMO block was required for this incremental review (the feature's DEMO was in the prior review pass). The review spec covers 4 specific change sets only.
Phase 1: Spec Compliance
All four change sets land:
Diag tiebreaker fix —
internal/wfengine/diag_accessor.go: bothmakeInstanceRowQueryandmakeInstanceStateQuerynow useORDER BY created_at DESC, id DESC LIMIT 1. Tiebreaker is correct:instances.idis BIGINT AUTO_INCREMENT PK, strictly monotonic, deterministically selects newest execution. Applied to both diag query paths. No leftover dwell code inrecalc_scores_workflow.go(confirmed: the word "dwell" appears only in comments describing the reverted approach, not in anytime.Sleep/gowf.Sleepcall).Multi-epoch regression test + diag tie test — three new integration test suites in
engine_integration_test.go: empty-library single-epoch, multi-epoch CAN withatomicrecompute counter, and a raw DB seed test that directly exercises the tiebreaker by inserting 6 same-second executions and asserting the diag reads the newest.Review-minor fixes —
GetWeightsForRecalcWarn log on settings error (wired throughengine.gologger injection), one-Expect-per-It splits in workflow tests,HaveEachreplaces per-book loop.Security minor — non-admin 403 e2e added in
e2e/api/settings_match_weights_test.go.Architecture boundaries confirmed:
internal/books/recompute_scores.goimportsmiddleware,settings,slog— nowfengineorgo-workflowsimport.internal/settings/imports nothing fromwfengine..settings-recalculate { margin-bottom: 2rem }— no inlinestyle=attribute.Phase 2: Code Quality
Diag tiebreaker correctness: The fix is sound.
id DESCprovides a total order whencreated_atties. The diag seed test is the strongest guard: it inserts rows with a hard-coded identicalDATETIMEvalue (2026-06-16 12:00:00) and asserts bothGetInstanceStateandGetWorkflowDetail.ExecutionIDresolve to the newest (highestid) execution. That test will fail red if the tiebreaker is ever removed or the query is changed to a non-total order._ = erronGetWeightsForRecalc(recalc_scores_workflow.go:129): This is intentional and safe.GetWeightsForRecalcnever returns a non-nil error by contract (it falls back to defaults and returnsnilon settings failure). The comment above the call documents this. The activity test suite exercises the fallback path. No silent swallowing of a real error here.recalcActivityOptions.RetryOptions.MaxAttempts: 3: This applies toListBooksPageForRecalcandRecomputeScoresBatch. Both are transient-only failure paths at the activity boundary — bad-input (missing metadata) is handled insideRecomputeScoresBatchwith a skip, not an error return, so there is no bad-input path that burns retry budget. Correct.Multi-epoch test comment (
engine_integration_test.go:423–434): The Describe block comment says "Without the recalcEpochDwell fix this test fails intermittently" and the timeout comment says "3 epochs × ~200ms dwell". Both references are stale — the actual fix is the diag tiebreaker, not a dwell timer. The test DOES correctly guard the right symptom (workflow result contains "not found"), but the explanation of WHY it would fail is misleading to future readers. The 30s timeout is also unnecessarily generous now that there is no dwell.[MINOR]
internal/wfengine/engine_integration_test.go:423–434— stale "recalcEpochDwell" narrative in multi-epoch test commentThe comment attributes the "workflow not found" failure to the dwell timer, but the dwell was reverted; the actual fix is
ORDER BY created_at DESC, id DESCindiag_accessor.go. The accompanying timeout comment ("3 epochs × ~200ms dwell ≈ 30s") is also stale — without any dwell the 30s budget is ~25s larger than needed. No functional impact, but a future reader diagnosing a similar failure will be sent in the wrong direction. Suggested fix: replace "Without the recalcEpochDwell fix" → "Without theid DESCdiag tiebreaker fix" and update the timeout comment to reflect actual timing.No other issues found.
REVIEW VERDICT: 0 blocker, 0 major, 1 minor
50667e7c39342a380021342a380021156fb76c49156fb76c49ee8fa474faSecurity Review — PR #583 (bd-bookshelf-r6lx.2)
Reviewed the current diff (post-rebase). Scope: authorization on the recalc trigger, multi-user scoping, resource/DoS, input validation, secrets/PII logging, and CSP.
Findings
No security issues found.
Authorization (BLOCKER check):
POST /settings/match-weights/recalculateis registered atinternal/settings/routes.go:35withadminRequired(...)— same guard wrapping every other settings mutation route. The e2e test suite (e2e/api/settings_match_weights_test.go) explicitly exercises unauthenticated → 401 and non-admin authenticated → 403 cases against the real app. Auth is clean.Multi-user scoping (MAJOR check):
match_score/metadata_match_scoreis a per-book global metadata field (not per-user data). TheListBooksForScoreBackfillquery (internal/db/queries/books.sql:216) correctly selects frombookwith nouser_idscoping — appropriate for an admin-only global recompute. No per-user data is read or written. Design is documented inbuild_extended_deps.goas "No library scoping — this is a global admin sweep."Resource / DoS (MAJOR check):
The workflow uses a deterministic instance ID
"recalc-match-scores"(engine.go:1012) so duplicate triggers (double-click, repeated POST) hitErrInstanceAlreadyExists→ErrBulkAlreadyRunning→ HTTP 409, not a new workflow spawn. Fan-out is cursor-paginated in single-batch-per-epoch ContinueAsNew (batch size 100, no concurrent sub-workflows) — consistent with the bounded-fan-out hard rule. No unbounded resource exposure.Input validation:
The handler accepts no request body payload — the POST has no JSON/form inputs to validate. The only data flowing in is from
context(auth session, trace ID). Clean boundary.Secrets/PII logging:
Log fields are
trace_idandinstance_idonly.instance_idis the hard-coded string"recalc-match-scores"— no user data, no secrets.CSP / inline styles/handlers:
Template changes (
settings_match_weights.html) use onlydata-actionStimulus bindings — noonclick=, noonload=, nostyle=attributes. CSS change is a new rule inmain.css(class-based). JS additions usetextContentassignments (notinnerHTML) and a hard-codedhref="/admin/workflows"anchor. No CSP violations.audit.Record placement:
Called after
StartRecalcMatchScoresreturns without error (line 176), beforew.WriteHeader— correct success-path-only placement, consistent with other settings handlers.REVIEW VERDICT: 0 blocker, 0 major, 0 minor
CODE REVIEW: APPROVED
Branch:
bd-bookshelf-r6lx.2— "Recalculate Metadata Scores" workflow + UICI: green (all 6 jobs). Diff reviewed fresh (~1766 lines).
Phase 0: HARD RULE checks
1. Workflow integration test (≥3 ContinueAsNew epochs + assert SUCCESS value)
PASS.
engine_integration_test.go"multi-epoch ContinueAsNew" test drives exactlynumEpochs=3real epochs on a live engine, waits forcompleted, and assertsdetail.Result == "null"(true success — go-workflows serialises a nil-error result as JSON null). The secondItin the sameDescribeadditionally assertsatomic.LoadInt64(&recomputeCalls) == int64(numEpochs). Both its have independent BeforeEach setups (fresh DB + engine). The diag-tiebreaker test seeds 6 same-second tied executions and assertsGetInstanceStateresolves to"running"(not the stale completed epoch).2. Permanent vs retryable error classification
PASS.
internal/books/recompute_scores.goimports nothing from the workflow engine — it returns plain errors andmiddleware.ErrNotFoundtriggers a skip (not an error return). The wfengine adapter layer (recalc_scores_workflow.go) does not wrap domain errors withNewPermanentError; the batch-fail path is intentionally non-fatal at the workflow level (warn + ContinueAsNew).gowf.NewPermanentErrorin the test file (recalc_scores_workflow_test.go:354) is in the test stub only (simulating what would be a permanent fail for tester efficiency), not in production code.3. Bounded fan-out
PASS. No per-book sub-workflow. The workflow batches 100 books per epoch into one
RecomputeScoresBatchactivity. ContinueAsNew bounds history. Scale rule is satisfied.4. Diag tiebreaker
PASS. Both
makeInstanceRowQuery(diag_accessor.go:129) andmakeInstanceStateQuery(diag_accessor.go:144) useORDER BY created_at DESC, id DESC LIMIT 1.instances.idis aBIGINT AUTO_INCREMENTprimary key — strictly monotonic. Both queries updated consistently.5. Multi-user / auth scoping
PASS. The endpoint is admin-gated via
adminRequired(...)insettings/routes.go:35. The e2e test covers unauthenticated (401) and non-admin authenticated (403) cases. No user ID comes from the request body. The recalc is a global admin operation (not per-user data) — correct.6. CSP — no inline
style=PASS. The new template block uses only classes (
btn btn-secondary,settings-actions,settings-save-status,settings-recalculate). The CSS adds.settings-recalculate { margin-bottom: 2rem; }. Nostyle=attributes. JS controller usesclassNameassignment (class strings), notelement.style. Clean.7. Vitest controller test
PASS. Covers: POST fetch call and URL, server message display, Workflows link appended on success, 409 "already in progress" (no error class), 503 non-ok error styling. Good coverage of all controller branches.
Phase 1: Spec Compliance
All spec items from the bead description are present: workflow (ContinueAsNew + bounded batch), engine wiring, appwire trigger, handler + route, template button + status area, Stimulus controller, tests (workflow tester, integration, domain unit, handler, e2e API + browser Vitest).
Phase 2: Code Quality
Findings:
[MINOR]
internal/wfengine/recalc_scores_workflow.go:128—_ = errsilently swallows the return fromExecuteActivity[MatchWeights].Get(ctx). The comment documents the intent (GetWeightsForRecalc always returns nil error), but ifctxis cancelled between activities,Get()returns a context error that is silently swallowed, andweightsremains its zero value (MatchWeights{}). The next activity would also fail with ctx error and the workflow would terminate there, so this is not a correctness bug, but it is a latent footgun. Preferred fix:if err != nil { return fmt.Errorf("...get weights: %w", err) }— GetWeightsForRecalc already handles the non-fatal fallback internally; the activity-level error returned by.Get()is a different path (ctx cancel, framework error).[MINOR]
internal/wfengine/engine_integration_test.go:449-450— The test timeout comment is stale: "3 epochs × ~200ms dwell + DB round-trips + 20s margin ≈ 30s total." There is norecalcEpochDwellin the workflow (per therecalc_scores_workflow.go:18-29comment, the dwell was deliberately removed). The 30s timeout is still generous and correct, but the arithmetic is wrong. The comment should say the dwell is absent and the margin covers DB round-trips only.[MINOR]
internal/wfengine/engine_integration_test.go— The twoItblocks in "multi-epoch ContinueAsNew" each start the workflow directly insideIt(not viaJustBeforeEach), mixing invocation and assertion. This breaks the project's GinkgoBeforeEach / JustBeforeEach / Itconvention. Acceptable for an integration test given complexity, but worth noting.Verdict
REVIEW VERDICT: 0 blocker, 0 major, 3 minor
ee8fa474fafeeceaaee1