feat(enrich): confidence score in workflow result + configurable thresholds (bookshelf-rppy) #908
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "bd-bookshelf-rppy"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
LowConfidenceErrorstruct added tointernal/bookscarryingbook_id,confidence,threshold,book_type,winner_providerso the wfengine activity adapter can surface them in theEnrichResultpayload (rather than an opaque error string).GetConfidenceThresholdscurried function readsENRICH_EBOOK_CONFIDENCE_THRESHOLD/ENRICH_COMIC_CONFIDENCE_THRESHOLDfromapp_settings, defaulting toDefaultEbookConfidenceThreshold=0.5/DefaultComicConfidenceThreshold=0.5when unset or invalid (validated 0.0–1.0). Wired throughRefetchMetadata→applyConfidenceGate.EnrichWorkflowreturn type changed fromerrorto(EnrichResult, error); low-confidence skips become successful results withSkippedLowConfidence *ConfidenceSkipDetailspopulated instead of permanent errors.boundedFanOutWithResult[T, R any]generic added tofanout.gofor(R, error)sub-workflows;BoundedFanOutEnrichInputroutes through it.LowConfidenceError.Error(),GetConfidenceThresholdsall branches,resolveConfidenceThresholdssuccess + error path viaRefetchMetadata,boundedFanOutWithResultempty/clamp/alreadyExists paths,boundedFanOutonItemError callback viaBoundedFanOutLLMSweepInput.Test plan
make testpasses (all packages)make coveragepasses —check-coverage: OK — zero uncovered statement blocksgolangci-lint run ./internal/books/... ./internal/wfengine/...— 0 issuesLowConfidenceError.Error(),GetConfidenceThresholdsall branches,resolveConfidenceThresholdserror path,boundedFanOutWithResultempty/concurrency-clamp/alreadyExists,BoundedFanOutLLMSweepInputonItemErrorCloses bead bookshelf-rppy on merge.
Recompute Match Score — kebab open screenshot (recompute-match-score-kebab-open)
Workflow Detail page screenshot (wf-detail-older-execution)
Older completed ContinueAsNew epoch detail — execution ID and state visible, Cancel absent.
Security Review — bookshelf-rppy (PR #908)
Findings
[MAJOR] internal/books/enrich_confidence.go:107 — NaN bypasses the [0.0, 1.0] range clamp in readThreshold
strconv.ParseFloat("NaN", 64)returns(math.NaN(), nil)— no error, no panic. IEEE 754 comparisons with NaN are always false, so the guardv < 0.0 || v > 1.0evaluates tofalse || false = false, and NaN is returned as the threshold. Downstream inapplyConfidenceGate,confidence < math.NaN()is always false for any finite confidence value, silently disabling the confidence gate for that book type.Any admin (or DB operator) who stores the literal string
"NaN"forENRICH_EBOOK_CONFIDENCE_THRESHOLDorENRICH_COMIC_CONFIDENCE_THRESHOLDcauses the confidence gate to stop firing — metadata below the intended quality bar is persisted without rejection or log warning.The project already guards against this exact risk in analogous float-parsing paths:
internal/books/metadata_handler.go:399:if f < 0 || f > 10 || math.IsNaN(f) || math.IsInf(f, 0)internal/books/service.go:341:if err != nil || math.IsNaN(f) || math.IsInf(f, 0)Fix: add
|| math.IsNaN(v)(and optionally|| math.IsInf(v, 0), though+Inf > 1.0and-Inf < 0.0already catch infinity via the range check):Other checked areas (no findings)
strconv.ParseFloatcannot panic; the error path is handled. ±Infinity is correctly rejected by the existing range check (both fall outside [0.0, 1.0]). Only NaN escapes.ENRICH_EBOOK_CONFIDENCE_THRESHOLDorENRICH_COMIC_CONFIDENCE_THRESHOLDis introduced in this PR. These keys are read-only from the application's perspective; they can only be set via direct DB manipulation, which already implies elevated access.ConfidenceSkipDetails) carriesbook_id,confidence,threshold,book_type,winner_provider. No API keys, tokens, or cross-user data. The workflow detail surface that exposes this (GET /admin/workflows/{instanceID}) is correctly gated behindAdminRequired./admin/workflows/carryadminRequired). No new per-user-scoped surface is introduced.internal/books/enrich_confidence.goimports only stdlib (database/sql,strconv, etc.) and internal domain packages (internal/metadata,internal/middleware). No workflow engine import. Clean boundary preserved.applyConfidenceGateemitsbook_id,book_type,confidence,threshold,winner_provider,trace_id. No credentials or PII.REVIEW VERDICT: 0 blocker, 1 major, 0 minor
[MAJOR] internal/wfengine/simple_workflows_test.go:290 — Missing EnrichWorkflow-level test for low-confidence result propagation
The
EnrichActivities.Refetchunit tests (line 207-287) prove theLowConfidenceError → EnrichResultconversion logic, but there is noEnrichWorkflowworkflow-tester-level test that setsstubErrto a*books.LowConfidenceErrorand assertsworkflowResult.SkippedLowConfidenceis populated with the correct confidence/threshold. go-workflows serializes the activity result to JSON and deserializes it back into the workflow result; a type registration bug or JSON tag mismatch would make the workflow tester return a zero-valuedEnrichResult{}while all activity tests stay green. Per theworkflow-integration-test-must-assert-successrule: "go-workflows marks errored-finished as completed — assert Result IS the success value." Add a Context inside theEnrichWorkflowDescribe block: setstubErr = &books.LowConfidenceError{BookID:5, Confidence:0.42, Threshold:0.50, BookType:"ebook", WinnerProvider:"google"}, execute, assertworkflowResult.SkippedLowConfidence != niland that the Confidence/Threshold fields are propagated.[MINOR] internal/wfengine/export_test.go:714 — Redundant type aliases for already-exported types
EnrichResultExport = EnrichResultandConfidenceSkipDetailsExport = ConfidenceSkipDetailsre-alias types that are already exported (uppercase). Tests inwfengine_testpackage can referencewfengine.EnrichResultdirectly without the Export alias. The aliases add no capability. Consider removing them and having the tests use the canonical names directly; keeping them just trains readers to expect an Export alias when there is none of the unexported-symbol-bridging that export_test.go aliases normally provide.[MINOR] internal/books/metadata_service_test.go:358 — Misaligned nil arguments (3 spaces vs tabs)
Several
+ nil,lines added throughout the test file use 3-space indentation instead of the tab-based indentation that gofmt expects (matching the surrounding lines). Affects lines 358, 492, 500, 508, 533, 541, 549, 557. Rungofmt -w internal/books/metadata_service_test.goto fix.[MINOR] internal/wfengine/simple_workflows_test.go:264 — Standalone
It("returns no error")should be folded into value assertionThe low-confidence context has a separate
It("returns no error (skip is reported as result, not failure)") { Expect(err).NotTo(HaveOccurred()) }followed byIt("returns a result with SkippedLowConfidence populated") { Expect(result.SkippedLowConfidence).NotTo(BeNil()) }. Per conventions, the nil-error check should be folded into the first value assertion:Expect(result.SkippedLowConfidence, err).NotTo(BeNil())— asserts both err==nil AND the field is non-nil in a single Expect, eliminating the standalone no-error It.What this PR gets right:
internal/bookshas no go-workflows import;LowConfidenceErroris a plain struct;wfenginedoes the mapping inRefetch. ✓isPermanentEnrichErr, transient errors still return as errors and get retried. ✓boundedFanOutWithResultuses the same sliding-window bounded-fan-out semantics asboundedFanOut— concurrency cap is enforced. ✓readThresholdcorrectly validates 0.0–1.0, falls back to defaults on error/missing/invalid. ✓workflow_detail.html:93-97already renders.Detail.Resultas a<pre class="workflow-payload-block">generic JSON block — theEnrichResult.SkippedLowConfidenceJSON will be visible there without template changes. The bead spec "reuse canonical result rendering" is satisfied. ✓REVIEW VERDICT: 0 blocker, 1 major, 3 minor
- [MAJOR security] reject NaN/Inf in readThreshold (enrich_confidence.go): strconv.ParseFloat("NaN",64) returns (NaN,nil); NaN<0||NaN>1 is always false so a stored "NaN" silently disabled the confidence gate. Add math.IsNaN(v)||math.IsInf(v,0) to the rejection condition, matching the IsNaN/IsInf guards in metadata_handler.go and service.go. Add tests for NaN and Inf falling back to the 0.5 default. - [MAJOR code] add EnrichWorkflow tester-level test for LowConfidenceError: proves SkippedLowConfidence survives go-workflows JSON serialize/ deserialize round-trip (type-registration or JSON-tag bug would yield zero-valued EnrichResult{} while activity tests stayed green). - [MINOR] remove EnrichResultExport/ConfidenceSkipDetailsExport aliases from export_test.go — both EnrichResult and ConfidenceSkipDetails are already exported types; the aliases add no bridging value. Update simple_workflows_test.go and fanout_test.go to reference the types directly. - [MINOR] fold standalone It("returns no error") into value assertion Expect(result.SkippedLowConfidence, err).NotTo(BeNil()) per project-conventions one-Expect-per-It rule. - [MINOR] run gofmt -w on metadata_service_test.go to fix 3-space indent on nil args (mixed spaces/tabs). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>Recompute Match Score — kebab open screenshot (recompute-match-score-kebab-open)
Workflow Detail page screenshot (wf-detail-older-execution)
Older completed ContinueAsNew epoch detail — execution ID and state visible, Cancel absent.
Workflow Detail page screenshot (wf-detail-older-execution)
Older completed ContinueAsNew epoch detail — execution ID and state visible, Cancel absent.
Recompute Match Score — kebab open screenshot (recompute-match-score-kebab-open)
3bde3ba9b7019bd55a3cWorkflow Detail page screenshot (wf-detail-older-execution)
Older completed ContinueAsNew epoch detail — execution ID and state visible, Cancel absent.
Recompute Match Score — kebab open screenshot (recompute-match-score-kebab-open)
019bd55a3c3c95be3322