fix(sweep): provider-aware EnrichComicDetail — stop Metron id fetched as ComicVine id (bookshelf-vbtl) #866

Merged
zombor merged 1 commit from bd-bookshelf-vbtl into main 2026-06-30 14:20:45 +00:00
Owner

Root Cause

buildBulkLLMSweepDeps.EnrichComicDetail was provider-blind — it always called comicvine.NewFetchIssueDetail(m.ExternalID) regardless of which provider produced the matched candidate.

Metron list candidates (searchResultToListCandidate) set ExternalID = the Metron issue id (cv_id is only populated by the full detail endpoint, which the sweep never called for list results). So a Metron match caused ComicVine /issue/{metron_id}/ — Metron and ComicVine ids are both small ints and collide — fetching a completely unrelated ComicVine issue. Book 3815 (ASM #611) got saved as "The Secret of Skartaris / Warlord #5" with comicvine_id = 17164 (the Metron id).

The interactive path (CandidateDetailHandler) was correct — it switches on provider and calls fetchMetronDetail for Metron candidates. The bulk sweep lacked the same switch.

Fix

  • Add ProviderID string to metadata.Metadata — set by each provider's result mappers:
    • ComicVine: "comicvine" in issueListResultToMetadata, searchResultToMetadata, issueDetailToMetadata
    • Metron: "metron" in searchResultToListCandidate, detailToMetadata
  • Add wfengine.NewEnrichComicDetailFunc(fetchCV, fetchMetron) — routes the detail fetch by m.ProviderID:
    • "metron" → calls Metron detail endpoint with m.ExternalID (the Metron issue id); returns full metadata with cv_id → ExternalID so comicvine_id gets the correct value
    • "comicvine" → calls ComicVine detail endpoint (as before)
    • nil fetcher / zero ExternalID / unknown provider → returns m unchanged
  • Wire a Metron detail fetcher in buildBulkLLMSweepDeps and replace the old inline closure with wfengine.NewEnrichComicDetailFunc(fetchDetail, fetchMetronDetail)

ID Flow (corrected)

Metron candidate: ExternalID = 17164 (Metron issue id)fetchMetronDetail(17164) → detail: cv_id = 42ExternalID = 42 → saved as comicvine_id = 42 (correct)

ComicVine candidate: ExternalID = 42 (CV issue id)fetchCV(42)comicvine_id = 42 (unchanged)

Test Plan

  • NewEnrichComicDetailFunc tests: Metron candidate calls Metron fetcher with the Metron id (not ComicVine); result has cv_id as ExternalID; ComicVine candidate calls CV fetcher; nil fetcher / zero ExternalID / unknown provider return input unchanged
  • Provider tests: ProviderID assertions for both Metron list-candidate and detail, and ComicVine list result
  • make test + make test-policy-check + make e2e-policy-check all green

Closes bead bookshelf-vbtl on merge.

## Root Cause `buildBulkLLMSweepDeps.EnrichComicDetail` was provider-blind — it always called `comicvine.NewFetchIssueDetail(m.ExternalID)` regardless of which provider produced the matched candidate. Metron list candidates (`searchResultToListCandidate`) set `ExternalID = the Metron issue id` (cv_id is only populated by the full detail endpoint, which the sweep never called for list results). So a Metron match caused `ComicVine /issue/{metron_id}/` — Metron and ComicVine ids are both small ints and collide — fetching a completely unrelated ComicVine issue. Book 3815 (ASM #611) got saved as "The Secret of Skartaris / Warlord #5" with `comicvine_id = 17164` (the Metron id). The interactive path (`CandidateDetailHandler`) was correct — it switches on provider and calls `fetchMetronDetail` for Metron candidates. The bulk sweep lacked the same switch. ## Fix - Add `ProviderID string` to `metadata.Metadata` — set by each provider's result mappers: - ComicVine: `"comicvine"` in `issueListResultToMetadata`, `searchResultToMetadata`, `issueDetailToMetadata` - Metron: `"metron"` in `searchResultToListCandidate`, `detailToMetadata` - Add `wfengine.NewEnrichComicDetailFunc(fetchCV, fetchMetron)` — routes the detail fetch by `m.ProviderID`: - `"metron"` → calls Metron detail endpoint with `m.ExternalID` (the Metron issue id); returns full metadata with `cv_id → ExternalID` so `comicvine_id` gets the correct value - `"comicvine"` → calls ComicVine detail endpoint (as before) - nil fetcher / zero ExternalID / unknown provider → returns `m` unchanged - Wire a Metron detail fetcher in `buildBulkLLMSweepDeps` and replace the old inline closure with `wfengine.NewEnrichComicDetailFunc(fetchDetail, fetchMetronDetail)` ## ID Flow (corrected) Metron candidate: `ExternalID = 17164 (Metron issue id)` → `fetchMetronDetail(17164)` → detail: `cv_id = 42` → `ExternalID = 42` → saved as `comicvine_id = 42` (correct) ComicVine candidate: `ExternalID = 42 (CV issue id)` → `fetchCV(42)` → `comicvine_id = 42` (unchanged) ## Test Plan - `NewEnrichComicDetailFunc` tests: Metron candidate calls Metron fetcher with the Metron id (not ComicVine); result has cv_id as ExternalID; ComicVine candidate calls CV fetcher; nil fetcher / zero ExternalID / unknown provider return input unchanged - Provider tests: ProviderID assertions for both Metron list-candidate and detail, and ComicVine list result - `make test` + `make test-policy-check` + `make e2e-policy-check` all green Closes bead bookshelf-vbtl on merge.
fix(sweep): provider-aware EnrichComicDetail — stop Metron issue id being passed to ComicVine endpoint (bookshelf-vbtl)
All checks were successful
/ JS Unit Tests (pull_request) Successful in 36s
/ E2E API (pull_request) Successful in 2m3s
/ Lint (pull_request) Successful in 2m54s
/ E2E Browser (pull_request) Successful in 2m13s
/ Integration (pull_request) Successful in 3m6s
/ Test (pull_request) Successful in 3m54s
a276ab456f
Root cause: buildBulkLLMSweepDeps.EnrichComicDetail was provider-blind — it
always called comicvine.NewFetchIssueDetail with m.ExternalID regardless of which
provider produced the candidate. Metron list candidates set ExternalID = the Metron
issue id (cv_id is only available from the detail endpoint). So a Metron match
resulted in ComicVine /issue/{metron_id}/ which fetches an unrelated ComicVine
issue (Metron id 17164 → ComicVine "Warlord #5"), and that wrong comic was saved.

Fix:
- Add ProviderID string field to metadata.Metadata — set to "comicvine" in
  ComicVine's issueListResultToMetadata/searchResultToMetadata/issueDetailToMetadata,
  and "metron" in Metron's searchResultToListCandidate/detailToMetadata.
- Add wfengine.NewEnrichComicDetailFunc(fetchCV, fetchMetron) that routes the
  detail fetch by m.ProviderID: "metron" → Metron detail (ExternalID becomes cv_id
  after the call); "comicvine" → ComicVine detail (as before).
- Wire a Metron detail fetcher in buildBulkLLMSweepDeps and replace the old
  inline EnrichComicDetail closure with wfengine.NewEnrichComicDetailFunc.

Tests: NewEnrichComicDetailFunc coverage: Metron candidate calls Metron fetcher
(not ComicVine), result has cv_id as ExternalID; ComicVine candidate calls CV
fetcher; nil fetcher / zero ExternalID / unknown provider all return input unchanged.
Provider-level tests assert ProviderID is set correctly by both providers.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Workflow Detail page screenshot (wf-detail-older-execution)

Older completed ContinueAsNew epoch detail — execution ID and state visible, Cancel absent.

wf-detail-older-execution

**Workflow Detail page screenshot** (wf-detail-older-execution) Older completed ContinueAsNew epoch detail — execution ID and state visible, Cancel absent. ![wf-detail-older-execution](/attachments/9e2aec3a-08e8-4d33-b577-a28af7adc6d8)

Recompute Match Score — kebab open screenshot (recompute-match-score-kebab-open)

recompute-match-score-kebab-open

**Recompute Match Score — kebab open screenshot** (recompute-match-score-kebab-open) ![recompute-match-score-kebab-open](/attachments/3134424d-4e5e-468d-a3f7-4d4011567da1)
Author
Owner

Code Review — PR #866 (bookshelf-vbtl)

Phase 0: DEMO block

Not applicable — P1 data-corruption fix; no DEMO block required per review dispatch.

Phase 1 + 2: Spec compliance and code quality

Reviewed diff: git diff $(git merge-base origin/main origin/bd-bookshelf-vbtl)..origin/bd-bookshelf-vbtl


ROUTING (check 1): NewEnrichComicDetailFunc in internal/wfengine/bulk_llm_workflow.go:220 has an explicit switch m.ProviderID with case "metron" routing to fetchMetron and case "comicvine" routing to fetchCV. The ExternalID == 0 guard fires before the switch, so zero-ID candidates skip both fetchers. A Metron candidate with a non-zero ExternalID cannot reach the ComicVine endpoint. Verified.

CORRECT ID (check 2): metron/provider.go:574 sets m.ExternalID = *d.CVID inside detailToMetadata. The existing test at metron/provider_test.go:515–516 asserts result.ExternalID == 111222 where the stub has cv_id = 111222. The PR adds ProviderID == "metron" at line 519. searchResultToListCandidate (line 692) sets m.ExternalID = r.ID (Metron issue id) for list candidates, which is what gets passed into fetchMetron(ctx, m.ExternalID) — correct: Metron detail is fetched by Metron id, and its return value carries the cv_id as ExternalID. Verified.

PROVIDERID coverage (check 3): All five mappers set ProviderID:

  • comicvine/provider.go:1425issueListResultToMetadata
  • comicvine/provider.go:1527searchResultToMetadata
  • comicvine/provider.go:1560issueDetailToMetadata
  • metron/provider.go:571detailToMetadata
  • metron/provider.go:693searchResultToListCandidate

Note: searchResultToMetadata does not set m.ExternalID (pre-existing behavior, not a regression), so its ProviderID will never reach the routing switch — the ExternalID == 0 guard fires first. Safe.

WIRING (check 4): build_extended_deps.go:1481–1490 constructs fetchMetronDetail using Metron config and waitFor(settings.ProviderMetron) rate limiter. Both fetchDetail and fetchMetronDetail are passed to wfengine.NewEnrichComicDetailFunc(fetchDetail, fetchMetronDetail) at line 1545. Verified.

REGRESSION TEST (check 5): bulk_llm_workflow_test.go Describe("NewEnrichComicDetailFunc") at line 648:

  • Metron candidate {ProviderID: "metron", ExternalID: 17164} (exact bug ID): asserts metronCalledWith == 17164 and cvCalledWith == 0. ✓
  • ComicVine candidate: asserts cvCalledWith == 42 and metronCalledWith == 0. ✓
  • ExternalID == 0: asserts no fetcher called, input returned unchanged. ✓
  • nil fetchMetron / nil fetchCV: each returns input unchanged. ✓
  • Unknown ProviderID: returns input unchanged. ✓

GATE-GAMING (check 7): Test file declares package wfengine_test (black-box). NewEnrichComicDetailFunc is a legitimate production export consumed by build_extended_deps.go:1545 — not a test-only shim. No unexported symbols referenced. ✓


[MINOR] internal/wfengine/bulk_llm_workflow.go:239,244 — ProviderID routing uses bare string literals repeated across three packages
"comicvine" and "metron" appear in comicvine/provider.go, metron/provider.go, AND the switch here. A one-character typo in any of them silently falls through to the default branch (no enrichment, no error). Tests catch this today, but package-level constants (const ProviderIDComicVine = "comicvine") in the metadata package would make a mismatch a compile error. Low risk while the test suite is green; worth a follow-up bead.

[MINOR] internal/wfengine/bulk_llm_workflow_test.go:652 — Ordered on a unit test container causes cascade-skip on spec failure
Describe("NewEnrichComicDetailFunc", Ordered, ...) means if the Metron-routing spec fails, all subsequent specs (ComicVine, nil-fetcher, etc.) are skipped instead of running independently. For unit tests, a non-Ordered outer container is conventional — each Describe sub-block is already isolated by its own BeforeEach. Not a correctness issue; nit.


REVIEW VERDICT: 0 blocker, 0 major, 2 minor

## Code Review — PR #866 (bookshelf-vbtl) ### Phase 0: DEMO block Not applicable — P1 data-corruption fix; no DEMO block required per review dispatch. ### Phase 1 + 2: Spec compliance and code quality Reviewed diff: `git diff $(git merge-base origin/main origin/bd-bookshelf-vbtl)..origin/bd-bookshelf-vbtl` --- **ROUTING (check 1):** `NewEnrichComicDetailFunc` in `internal/wfengine/bulk_llm_workflow.go:220` has an explicit `switch m.ProviderID` with `case "metron"` routing to `fetchMetron` and `case "comicvine"` routing to `fetchCV`. The `ExternalID == 0` guard fires before the switch, so zero-ID candidates skip both fetchers. A Metron candidate with a non-zero ExternalID cannot reach the ComicVine endpoint. Verified. **CORRECT ID (check 2):** `metron/provider.go:574` sets `m.ExternalID = *d.CVID` inside `detailToMetadata`. The existing test at `metron/provider_test.go:515–516` asserts `result.ExternalID == 111222` where the stub has `cv_id = 111222`. The PR adds `ProviderID == "metron"` at line 519. `searchResultToListCandidate` (line 692) sets `m.ExternalID = r.ID` (Metron issue id) for list candidates, which is what gets passed into `fetchMetron(ctx, m.ExternalID)` — correct: Metron detail is fetched by Metron id, and its return value carries the cv_id as ExternalID. Verified. **PROVIDERID coverage (check 3):** All five mappers set ProviderID: - `comicvine/provider.go:1425` — `issueListResultToMetadata` ✓ - `comicvine/provider.go:1527` — `searchResultToMetadata` ✓ - `comicvine/provider.go:1560` — `issueDetailToMetadata` ✓ - `metron/provider.go:571` — `detailToMetadata` ✓ - `metron/provider.go:693` — `searchResultToListCandidate` ✓ Note: `searchResultToMetadata` does not set `m.ExternalID` (pre-existing behavior, not a regression), so its ProviderID will never reach the routing switch — the `ExternalID == 0` guard fires first. Safe. **WIRING (check 4):** `build_extended_deps.go:1481–1490` constructs `fetchMetronDetail` using Metron config and `waitFor(settings.ProviderMetron)` rate limiter. Both `fetchDetail` and `fetchMetronDetail` are passed to `wfengine.NewEnrichComicDetailFunc(fetchDetail, fetchMetronDetail)` at line 1545. Verified. **REGRESSION TEST (check 5):** `bulk_llm_workflow_test.go` `Describe("NewEnrichComicDetailFunc")` at line 648: - Metron candidate `{ProviderID: "metron", ExternalID: 17164}` (exact bug ID): asserts `metronCalledWith == 17164` and `cvCalledWith == 0`. ✓ - ComicVine candidate: asserts `cvCalledWith == 42` and `metronCalledWith == 0`. ✓ - `ExternalID == 0`: asserts no fetcher called, input returned unchanged. ✓ - nil fetchMetron / nil fetchCV: each returns input unchanged. ✓ - Unknown ProviderID: returns input unchanged. ✓ **GATE-GAMING (check 7):** Test file declares `package wfengine_test` (black-box). `NewEnrichComicDetailFunc` is a legitimate production export consumed by `build_extended_deps.go:1545` — not a test-only shim. No unexported symbols referenced. ✓ --- [MINOR] internal/wfengine/bulk_llm_workflow.go:239,244 — ProviderID routing uses bare string literals repeated across three packages `"comicvine"` and `"metron"` appear in `comicvine/provider.go`, `metron/provider.go`, AND the switch here. A one-character typo in any of them silently falls through to the default branch (no enrichment, no error). Tests catch this today, but package-level constants (`const ProviderIDComicVine = "comicvine"`) in the `metadata` package would make a mismatch a compile error. Low risk while the test suite is green; worth a follow-up bead. [MINOR] internal/wfengine/bulk_llm_workflow_test.go:652 — `Ordered` on a unit test container causes cascade-skip on spec failure `Describe("NewEnrichComicDetailFunc", Ordered, ...)` means if the Metron-routing spec fails, all subsequent specs (ComicVine, nil-fetcher, etc.) are skipped instead of running independently. For unit tests, a non-Ordered outer container is conventional — each `Describe` sub-block is already isolated by its own `BeforeEach`. Not a correctness issue; nit. --- REVIEW VERDICT: 0 blocker, 0 major, 2 minor
zombor merged commit 9c264528f1 into main 2026-06-30 14:20:45 +00:00
Sign in to join this conversation.
No reviewers
No labels
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
zombor/pergamum!866
No description provided.