fix(books): comic-aware metadata_match_score so comics aren't over-scored (bookshelf-v1to) #816
Summary
IsComic. Comic books use a hardcoded comic weight table (104 total weight) that grades comic-specific fields; ebooks continue to use the existing user-configurable weight table unchanged.
GetMetadataForMatchScoreWithComic replaces GetMetadataForMatchScore in app.go and the RecalcMatchScores workflow; it fetches comic completeness data in one SQL round-trip (12 correlated subqueries) and sets IsComic=true for CBX books.
Comic weight table (total = 104)
Book 4595 (Spectacular Spider-Girl #2) before: 80.77% | after: ~21% (22/104)
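To make the 22/104 figure concrete, here is a minimal sketch of weight-table scoring. The field names other than title, series, and description, and every individual weight, are hypothetical placeholders chosen only so the table sums to 104 and the generic fields sum to 22; they are not the weights shipped in this PR.

```go
package main

import "fmt"

// fieldWeight pairs a metadata field with its weight.
// HYPOTHETICAL values: only the totals (104 overall, 22 generic) come from the PR text.
type fieldWeight struct {
	name   string
	weight int
}

// matchScore returns the percentage of total weight earned by populated fields.
func matchScore(table []fieldWeight, present map[string]bool) float64 {
	total, earned := 0, 0
	for _, fw := range table {
		total += fw.weight
		if present[fw.name] {
			earned += fw.weight
		}
	}
	if total == 0 {
		return 0
	}
	return 100 * float64(earned) / float64(total)
}

// illustrativeComicTable sums to 104; the three generic fields contribute 22.
var illustrativeComicTable = []fieldWeight{
	{"title", 10}, {"series", 8}, {"description", 4}, // generic fields: 22
	{"issue_number", 20}, {"publisher", 12}, {"writers", 20},
	{"pencillers", 15}, {"release_date", 15}, // assumed comic-specific fields: 82
}

func main() {
	// A comic with only generic fields populated, as in the book 4595 case.
	present := map[string]bool{"title": true, "series": true, "description": true}
	fmt.Printf("%.2f%%\n", matchScore(illustrativeComicTable, present)) // prints 21.15%
}
```

Under these assumed weights a comic with no comic-specific data earns 22 of 104 points, which lands under the 25% bound used in the test plan.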
Deviation from Grimmory
Grimmory match_score is generic-book-only (no comic-aware branching). This is an intentional pergamum improvement; CBX scores will differ from Grimmory output.
Recompute note
The backfill migration key (backfill_metadata_match_score_v1) is intentionally unchanged; it was marked done on the live install. A manual trigger of POST /settings/match-weights/recalculate will re-score all books (including comics) with the new logic. The RecalcMatchScores workflow now uses GetMetadataForMatchScoreWithComic, so any future recalc run picks up comic scoring automatically.
Test plan
make test - 3 new CalculateMatchScore specs (full comic CBX > 80%, zero comic CBX < 25%, ebook > 70%), plus GetComicMatchScoreData and GetMetadataForMatchScoreWithComic store tests (all error paths covered)
make coverage - zero uncovered statement blocks (100% gate)
go build ./... - compiles clean
golangci-lint run ./internal/books/... ./internal/app/... - 0 issues
Closes bead bookshelf-v1to on merge.
[BLOCKER] internal/books/recompute_scores.go:20-28 and internal/books/match_score_comic_store.go:42-57 — N+1 query pattern at scale
The RecomputeScoresBatch function loops through bookIDs sequentially and calls getMatchInput() for each book individually.
For the startup backfill of 250k books in 100-book batches: 2,500 batches × 100 books × 2 queries/book = 500,000+ database round trips. This violates the hard rule "No N+1" in project-conventions.md.
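A batched fetch for this finding could be built roughly as follows. This is a sketch only: the helper names buildComicBatchQuery and chunk, the selected column issue_number, and the `?` placeholder style are assumptions, and the real store code may shape the query differently.

```go
package main

import (
	"fmt"
	"strings"
)

// buildComicBatchQuery returns a single SELECT covering every ID in the batch,
// plus the matching argument list. Table and column names are assumptions.
func buildComicBatchQuery(bookIDs []int64) (string, []any) {
	if len(bookIDs) == 0 {
		return "", nil
	}
	placeholders := make([]string, len(bookIDs))
	args := make([]any, len(bookIDs))
	for i, id := range bookIDs {
		placeholders[i] = "?" // assumes a driver that uses ? placeholders
		args[i] = id
	}
	query := "SELECT book_id, issue_number FROM comic_metadata WHERE book_id IN (" +
		strings.Join(placeholders, ", ") + ")"
	return query, args
}

// chunk splits ids into batches of at most size elements.
func chunk(ids []int64, size int) [][]int64 {
	var out [][]int64
	for size > 0 && len(ids) > 0 {
		n := size
		if len(ids) < n {
			n = len(ids)
		}
		out = append(out, ids[:n])
		ids = ids[n:]
	}
	return out
}

func main() {
	ids := []int64{11, 12, 13, 14, 15}
	for _, batch := range chunk(ids, 2) {
		q, args := buildComicBatchQuery(batch)
		fmt.Println(q, args) // one round trip per batch instead of one per book
	}
}
```

With 100-book batches this turns the comic lookup into one query per batch, so 250k books need about 2,500 comic-data round trips rather than 250,000.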
Fix: Batch the comic-data fetches. Instead of GetComicMatchScoreData(conn.QueryContext) returning a per-book curried function, create GetComicMatchScoreDataBatch(conn, bookIDs []int64) that fetches all books in one query with WHERE book_id IN (...), returning a map[int64]ComicMatchScoreRow. Then RecomputeScoresBatch can fetch comic data once per 100-book batch instead of 100 times per batch.
[MAJOR] internal/app/build_extended_deps.go:73-80 and bead comment - Stale stored scores until manual recompute
The bead says "Backfill migration key is unchanged; next RecalcMatchScores run re-scores all." This means the startup backfill will NOT re-run (it only runs once per migration-key). Comics currently scored at ~80% will keep that old score until the user manually triggers a RecalcMatchScores workflow.
Is this intentional? If yes, the bead and PR should document this — existing users will see inconsistent scores (stored: old 80%, detail-page-computed: new 21%) until they manually kick off a recompute. If unintentional, bump the backfill migration key so it re-runs and updates all scores at startup.
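The once-per-key behavior behind this finding can be illustrated with a small sketch. The backfillRunner type and the _v2 key are hypothetical stand-ins; the project's actual migration bookkeeping is not shown in this PR and may differ.

```go
package main

import "fmt"

// backfillRunner is a HYPOTHETICAL stand-in for a once-per-key startup gate.
type backfillRunner struct {
	done map[string]bool // keys already marked done on this install
}

// runOnce executes fn only if key has not been recorded, then records it.
// It reports whether fn ran.
func (r *backfillRunner) runOnce(key string, fn func()) bool {
	if r.done[key] {
		return false
	}
	fn()
	r.done[key] = true
	return true
}

func main() {
	// The live install already has the v1 key marked done.
	r := &backfillRunner{done: map[string]bool{"backfill_metadata_match_score_v1": true}}
	rescore := func() { fmt.Println("re-scoring all books") }

	// Unchanged key: the backfill is skipped, so stored comic scores stay stale.
	fmt.Println(r.runOnce("backfill_metadata_match_score_v1", rescore))
	// Bumped key (assumed name): the backfill runs once at startup.
	fmt.Println(r.runOnce("backfill_metadata_match_score_v2", rescore))
}
```

If the stale scores are acceptable, leaving the key alone and documenting the manual recalculate step is enough; bumping the key is only needed when scores must refresh without user action.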
[MINOR] internal/books/match_score.go:111-115 — Document weight-total rationale
The comment "total=104" lacks rationale. Add a note: "chosen so a comic with only generic fields (title + series + description, but no comic_metadata row) scores below 25%, matching the book-4595 test case."
REVIEW VERDICT: 1 blocker, 1 major, 1 minor
Recompute Match Score: kebab open screenshot (recompute-match-score-kebab-open)
Workflow Detail page screenshot (wf-detail-older-execution)
Older completed ContinueAsNew epoch detail — execution ID and state visible, Cancel absent.