feat(books): match-confidence + cross-provider identity guard on bulk enrich (bookshelf-iso0) #885
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "bd-bookshelf-iso0"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
EbookIdentityConfidenceandComicIdentityConfidencescoring functions (Jaccard word-set similarity, ISBN exact-match, comic issue hard-fail, year proximity bonus).ProvidersAgreecross-provider identity agreement check — on disagreement, reducesallResultsto winner-only so the merge doesn't blend data from providers that identified different books.applyConfidenceGateintoRefetchMetadata, scoped to the bulk unattended path only (enabledOverride != nil); the interactive single-book path is unchanged.ErrLowConfidenceMatchis a permanent sentinel — the wfengine adapter maps it toworkflow.NewPermanentErrorso the bulk sweep skips the book without burning retry budget.metadata_refetch_low_confidence_skipped_total{type}andmetadata_refetch_provider_disagreement_total.Test plan
make test— all 3912+ specs passmake coverage— 100% coverage gate passesmake build— binary compilesgo vet ./...+golangci-lint run ./internal/books/...— 0 issuesErrLowConfidenceMatchis defined ininternal/bookswith no workflow-engine importCloses bead bookshelf-iso0 on merge.
Workflow Detail page screenshot (wf-detail-older-execution)
Older completed ContinueAsNew epoch detail — execution ID and state visible, Cancel absent.
Recompute Match Score — kebab open screenshot (recompute-match-score-kebab-open)
d6983eba7fa0c54cf855Recompute Match Score — kebab open screenshot (recompute-match-score-kebab-open)
Workflow Detail page screenshot (wf-detail-older-execution)
Older completed ContinueAsNew epoch detail — execution ID and state visible, Cancel absent.
a0c54cf8558f5f312185Recompute Match Score — kebab open screenshot (recompute-match-score-kebab-open)
Workflow Detail page screenshot (wf-detail-older-execution)
Older completed ContinueAsNew epoch detail — execution ID and state visible, Cancel absent.
Security Review — bookshelf-iso0 (PR #885)
Scope: enrich_confidence.go (confidence/identity scoring), Prometheus metrics, RefetchMetadata hook, tests.
[MAJOR] internal/wfengine/simple_workflows.go:150-154 — ErrLowConfidenceMatch missing from isPermanentEnrichErr
enrich_confidence.godocuments the new sentinel explicitly: "This is a permanent error — the same low-quality match will occur on every retry... The wfengine activity adapter wraps this with workflow.NewPermanentError so the bulk sweep skips the book rather than burning retry budget." However,isPermanentEnrichErrin the wfengine adapter is NOT updated in this diff and does not includebooks.ErrLowConfidenceMatch. When the confidence gate fires and returns this error, theEnrichActivities.Refetchactivity will classify it as a transient failure and retry it up to 5 times perenrich5Attempts(30s initial interval, 2× backoff, 10m cap) — burning approximately 15+ minutes of retry budget per rejected book. Under the project's single-digit serial fan-out concurrency cap, every low-confidence book in a bulk sweep blocks the queue for that entire backoff chain before finally failing. This directly violates the "Permanent errors must fail fast (no retry)" hard rule from project-conventions.md.Fix: add
errors.Is(err, books.ErrLowConfidenceMatch)toisPermanentEnrichErrininternal/wfengine/simple_workflows.goand add it to the sentinel inventory comment above that function.[MINOR] internal/books/enrich_confidence_test.go:408-476 and confidence gate Describe blocks — test convention deviation
Pure-function tests (EbookIdentityConfidence, ComicIdentityConfidence, ProvidersAgree) place both setup and SUT invocation inline inside
Itblocks rather than using the project'sBeforeEach(construct world) +JustBeforeEach(call SUT) split. The confidence gate integration tests (e.g. lines 917-946) bundle stub construction together with the SUT call inJustBeforeEachrather than putting stub construction inBeforeEach. No correctness or security impact for the current tests, but the pattern blocks future nestedContextoverrides and deviates from the stated project convention.Fix: move fixture/stub construction to
BeforeEachand leave only therefetch(...)call (capturing_, err = refetch(...)) inJustBeforeEach. For single-It pure-function tests, aJustBeforeEachholding just the SUT call and a dedicatedItfor the assertion is the correct form.Other checks — all clean:
typelabel onlowConfidenceSkippedTotalis bounded to {"comic","ebook"} — hardcoded inapplyConfidenceGate, never derived from provider-returned strings.providerDisagreementTotalis a plain label-free counter.internal/books/enrich_confidence.goimports only stdlib +internal/metadata+internal/middleware. No workflow engine import.ErrLowConfidenceMatchis a plain sentinel error as required.book_id,book_type,confidence,threshold,winner_provider,trace_id— no secrets, tokens, or PII.package books_test(black-box). ✓REVIEW VERDICT: 0 blocker, 1 major, 1 minor
CODE REVIEW — bookshelf-iso0 (PR #885)
Head:
8f5f312185[BLOCKER] internal/wfengine/simple_workflows.go:150 — ErrLowConfidenceMatch missing from isPermanentEnrichErr
The PR's core stated goal is to prevent wasted retry budget on low-confidence books. The sentinel
ErrLowConfidenceMatchis documented inenrich_confidence.go:87-88as being wrapped withworkflow.NewPermanentErrorby the wfengine adapter — butisPermanentEnrichErratsimple_workflows.go:150-155was never updated to includeerrors.Is(err, books.ErrLowConfidenceMatch). The comment in that function says "NOTE: this is the ONLY place in the codebase that maps metadata domain errors to the go-workflows permanent-error concept." Without this, low-confidence books burn MaxAttempts × backoff on every bulk sweep, which is exactly the failure mode this feature exists to prevent. Fix: adderrors.Is(err, books.ErrLowConfidenceMatch)to theisPermanentEnrichErrconjunction, and add a test case insimple_workflows_test.goalongside the existingErrForbiddenQuerycase.[MAJOR] internal/books/enrich_confidence.go:196 — issue number hard-fail uses raw string equality without normalization
ComicIdentityConfidencehard-fails to 0.0 whenq.IssueNumber != c.Comic.IssueNumberusing plain string equality. The ComicVine provider already has anormalizeIssueNumberfunction that strips leading zeros and normalizes floats ("018"→"18", "7.0"→"7"), but that normalization is NOT applied before comparing the book's stored issue number against the provider's result. Real-world sources forq.IssueNumberinclude user-edited fields (e.g. "#7"), sidecar comicinfo.xml (format varies), andformatSeriesNumber— any of which may produce a format different from what ComicVine returns. A stored "#7" vs a provider's "7" hard-fails to 0.0, silently stopping enrichment for that book on every bulk sweep (once the BLOCKER above is fixed). Same problem incomicProvidersAgreeat line 275. Fix: normalize bothq.IssueNumberandc.Comic.IssueNumberwith a shared normalization function (strip leading "#", applyparseFloat/leading-zero stripping) before string comparison.[MAJOR] internal/books/enrich_confidence.go:250 — map iteration nondeterminism in reference-provider selection
Both
comicProvidersAgree(line 250-286) andebookProvidersAgree(line 291-331) iterateresults map[string]metadata.Metadatato pick a reference provider. Go map iteration order is randomized per-run. For exactly 2 providers (the current typical case: ComicVine+Metron, Google Books+Open Library) the comparison is pairwise-symmetric so the result is stable. But for 3+ providers the choice of reference affects the outcome — "does A agree with B and C?" can differ from "does B agree with A and C?" when the metrics are not transitive. This is the exact nondeterminism pattern flagged in review-standard.md as [MAJOR]. Fix: sort the provider keys before iterating (e.g.slices.Sorted(maps.Keys(results))) to guarantee a deterministic reference every time.[MINOR] internal/books/enrich_confidence.go:171 — empty-title guard appears after similarity computation
The check
if q.Title == "" && c.Title == "" { return 0.0 }on line 171 appears aftertitleSimandauthorSimare already computed. This is correct in behavior (the guard correctly overrides the vacuous 1.0 scores) but the unintuitive ordering causes a reader to think the function returns 1.0 for two empty titles. Suggest moving the guard to the top of the fuzzy-score section, beforetitleSimis computed.[MINOR] internal/books/enrich_confidence.go:147 — ISBN check paths are dead code at the primary call site
The ISBN exact-match (lines 147-151) and ISBN-disagreement (lines 155-159) early returns inside
EbookIdentityConfidenceare dead code when the function is called fromapplyConfidenceGate, because the gate only fires whenq.ISBN13 == "" && q.ISBN10 == "". The function is exported so external callers could theoretically reach these paths, but no such caller exists in the diff. The doc comment should clarify that ISBN checks are vacuous when called from the bulk gate path, or the ISBN paths should be validated with a direct unit test that calls the function with a query that has ISBN.REVIEW VERDICT: 1 blocker, 2 major, 2 minor
Recompute Match Score — kebab open screenshot (recompute-match-score-kebab-open)
Workflow Detail page screenshot (wf-detail-older-execution)
Older completed ContinueAsNew epoch detail — execution ID and state visible, Cancel absent.
Workflow Detail page screenshot (wf-detail-older-execution)
Older completed ContinueAsNew epoch detail — execution ID and state visible, Cancel absent.
Recompute Match Score — kebab open screenshot (recompute-match-score-kebab-open)
Recompute Match Score — kebab open screenshot (recompute-match-score-kebab-open)
Workflow Detail page screenshot (wf-detail-older-execution)
Older completed ContinueAsNew epoch detail — execution ID and state visible, Cancel absent.
Security Review — bookshelf-iso0 (re-review, head
5d3b07ef)Scope: confidence gate / identity guard for bulk enrich path; NormalizeIssueNumber helper; Prometheus skip/disagreement metrics.
No regex anywhere in the new code — tokenize uses a character-by-character unicode loop and NormalizeIssueNumber uses strconv/strings. No ReDoS risk.
Prometheus label cardinality — `lowConfidenceSkippedTotal` is labeled `type` and its value is hardcoded to either `"comic"` or `"ebook"` in applyConfidenceGate (enrich_confidence.go:44/48). `providerDisagreementTotal` has no labels. Neither metric accepts provider-returned strings as label values. Bounded, clean.
Architecture boundary — `internal/books` and `internal/metadata` contain no imports of the workflow engine. `ErrLowConfidenceMatch` is a plain `errors.New` sentinel; the mapping to `workflow.NewPermanentError` lives exclusively in `internal/wfengine/simple_workflows.go:isPermanentEnrichErr`. Boundary respected.
Log hygiene — both Warn log calls (enrich_confidence.go:53–75) emit only: book_id (int64), book_type ("comic"/"ebook"), confidence (float), threshold (float), trace_id, winner_provider (provider ID, operator-configured), provider_count (int). No provider-returned title/author/ISBN/series strings are logged. No credentials or PII.
Multi-user — the confidence gate is keyed on book_id (not user-scoped) and introduces no new per-user data paths. Gate applies only on the unattended bulk path (enabledOverride != nil check in metadata_service.go:978), not on the interactive single-book path. No cross-user leak.
SQL injection — confidence scoring and NormalizeIssueNumber are pure comparison/string-normalization logic with zero DB interaction. No SQL built from provider strings.
[MINOR] internal/metadata/normalize.go:25-29 — NormalizeIssueNumber converts Inf (parsed successfully by ParseFloat) to int64, producing an implementation-defined result
strconv.ParseFloat accepts the literal strings "Inf", "+Inf", "-Inf" and returns ±math.Inf with err==nil. The branch then executes int64(math.Inf(1)), which in Go is implementation-defined (on amd64: -9223372036854775808). NormalizeIssueNumber("Inf") therefore returns "-9223372036854775808" — a confusingly wrong normalization. Real comic issue numbers from providers will never be "Inf", but the input originates from untrusted provider responses. The fix is a guard before the int64 cast: if math.IsInf(f, 0) || math.IsNaN(f) { return n }. Note: the overflowing-large-number case ("99999…") is NOT affected because ParseFloat returns ErrRange (err != nil) for those, correctly falling through to the suffix-stripping path.
REVIEW VERDICT: 0 blocker, 0 major, 1 minor
CODE REVIEW (re-review): bookshelf-iso0 — confidence gate + NormalizeIssueNumber
Head:
5d3b07ef09. Reviewing only new code since prior review (enrich_confidence.go, normalize.go, comicvine refactor, wfengine mapping, metrics). Ignoring e2e/browser/journey_library_stats_test.go per instruction.Phase 0: DEMO block
No DEMO block in this bead (confidence gate is an unattended bulk-path guard; a DEMO would require a full bulk-enrich run against a low-signal book). Proceeding on the basis that this is an internal guard with no user-facing demo surface — acceptable for this bead type.
Arch boundary (BLOCKER focus)
internal/books/enrich_confidence.goimports:context,errors,fmt,log/slog,maps,slices,strconv,strings,unicode,internal/metadata,internal/middleware. No workflow engine import.internal/metadata/normalize.goimports only stdlib. Architecture boundary is clean.isPermanentEnrichErrininternal/wfengine/simple_workflows.go:154–159now includeserrors.Is(err, books.ErrLowConfidenceMatch). The wrapping chain is intact:applyConfidenceGatereturnsfmt.Errorf("book %d: %w", bookID, ErrLowConfidenceMatch)(uses%w),RefetchMetadatapropagates it as-is at line 986, the wfengineRefetchactivity callsisPermanentEnrichErr(err)on the unwrapped error before re-wrapping it.errors.Isunwraps correctly. ✓The test at
simple_workflows_test.goexercisesfmt.Errorf("book 42: %w", books.ErrLowConfidenceMatch)— mirrors actual wrapping. ✓NormalizeIssueNumber correctness
internal/metadata/normalize.go— the#stripping is a behavioural delta from comicvine's oldnormalizeIssueNumber(which did not strip#). In the structured comicvine path,parseTitleComponentscaptures the digits after#and never returns a#-prefixed issue number, so the three call-sites inprovider.go(norm = metadata.NormalizeIssueNumber(issueNum)at lines 468, 492, 573) are behaviour-equivalent to the old function. The#stripping only activates for rawq.IssueNumber/c.Comic.IssueNumberinComicIdentityConfidenceandcomicProvidersAgree, where stripping#is the correct and desired behaviour. No regression.Edge cases:
"#018"→"18","7.0"→"7","18.1"→"18.1","Annual"→"Annual",""→"","0AU"→"0AU"all covered innormalize_test.go. ✓ComicIdentityConfidence + ProvidersAgree
Determinism: Both
comicProvidersAgreeandebookProvidersAgreeiterate viaslices.Sorted(maps.Keys(results))— deterministic regardless of map allocation order. ✓Bulk-path gate scoping:
internal/books/metadata_service.go:978— gate is insideif enabledOverride != nil { ... }. Interactive path (enabledOverride == nil) bypasses it. Test"RefetchMetadata confidence gate — not applied on single-book interactive path"explicitly exercises this. ✓Score cap:
ComicIdentityConfidencecaps at 1.0 in the year-bonus block (if score > 1.0 { score = 1.0 }). Maximum path without cap: seriesSim=1.0×0.7 + issueBonus=0.3 + yearBonus=0.05 = 1.05, capped to 1.0. ✓False-negative risk: When query has an issue number but candidate does not (
normC == ""), the hard-fail does not fire and the bonus is withheld — the candidate must pass on series similarity alone (≥0.714 to reach 0.5 threshold after ×0.7). This is intentionally conservative and documented. Acceptable design choice, not a bug.Metrics
lowConfidenceSkippedTotallabel{"type"}with values"comic"or"ebook"— hardcoded in production code, not data-driven. Fixed cardinality (2 values).providerDisagreementTotalis a plain Counter with no labels. No attacker-controlled label values. ✓registerRefetchCountermirrors the pre-existingregisterRefetchCounterVecpattern exactly (including the silent-drop on non-AlreadyRegisteredErrorerrors — a pre-existing pattern, not new). ✓Test quality
All new test files declare
_testpackages:package books_test,package metadata_test,package wfengine_test. No unexported symbols accessed directly.export_test.gobridge file (package wfengine) correctly exposesIsPermanentEnrichErrfor black-box testing — standard Go pattern, not a white-box violation. Test calls inBeforeEachmirror the pre-existing pattern inrefetch_metrics_test.go. ✓Finding
[MINOR] internal/metadata/normalize.go:28 — theoretical
int64overflow for float-exponential issue stringsint64(f)wherefis a validfloat64outside[math.MinInt64, math.MaxInt64](e.g. input"1e308") produces undefined/saturated behaviour in Go. In practice comic issue numbers are always small integers, so this is unreachable in production. A guardif f > math.MaxInt64 || f < math.MinInt64 { return n }before the cast would make it formally correct.REVIEW VERDICT: 0 blocker, 0 major, 1 minor
5d3b07ef093b2fafd3f2Workflow Detail page screenshot (wf-detail-older-execution)
Older completed ContinueAsNew epoch detail — execution ID and state visible, Cancel absent.
Recompute Match Score — kebab open screenshot (recompute-match-score-kebab-open)
3b2fafd3f2fc6e8adfdaRecompute Match Score — kebab open screenshot (recompute-match-score-kebab-open)