feat(metadata): match score compute + display + filter (bookshelf-yfnr) #573
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "bd-bookshelf-yfnr2"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
metadata_match_score(0–100) using weighted field completeness across 24 fields (totalWeight=87, compile-time enforced).app_migrationtable (idempotent, cursor-paginated).book_metadatais first seeded.?metadata=incomplete(score < 100 or NULL) and?metadata=complete(score = 100) filters on/books.style=(CSP-safe).Test plan
make test— all unit tests passmake coverage— 100% coverage gate passesmake lint— 0 issuesgo test -tags e2e -run='^$' ./e2e/browser/— e2e package compilesCloses bead bookshelf-yfnr on merge.
Metadata Match Score — UI screenshots
Book detail page — the Metadata Match cell in the metadata grid:
Books list with
?metadata=incomplete— only the incomplete book is shown:CODE REVIEW: NOT APPROVED
Branch: bd-bookshelf-yfnr2 | SHA:
9310e33e| PR: #573Phase 0: DEMO Verification
No explicit DEMO block in the bead comments. CI is green per the dispatch context; no DEMO commands to re-run. Proceeding on CI-green basis per review instructions.
Phase 1: Spec Compliance
Spec: compute
metadata_match_score(24 fields, totalWeight=87), store in existingbook.metadata_match_score, startup backfill, recompute on save+scan,?metadata=incomplete|completefilter, "Metadata Match: NN%" on detail page. All present. Migration 0036 is index-only (plainCREATE INDEX/DROP INDEX— noIF [NOT] EXISTS). Schema compat confirmed: no new column added to any Grimmory table.Phase 2: Code Quality
[MAJOR] internal/books/match_score.go:236-238 — review-count lock flags are wrong: uses rating lock instead of dedicated review-count lock columns
HardcoverReviewCount,GoodreadsReviewCount, andAmazonReviewCountare scored withHardcoverRatingLocked,GoodreadsRatingLocked, andAmazonRatingLockedas their lock flags respectively. The Grimmory baseline schema (migration 0001) has dedicated review-count lock columns:hardcover_review_count_locked,goodreads_review_count_locked,amazon_review_count_locked(lines 230, 224, 222 of 0001_grimmory_baseline.up.sql). TheGetBookMetadataForMatchScorequery does not select these columns; theMatchScoreInputstruct has no fields for them; and the score calculation therefore misuses the rating lock as a proxy for the review-count lock. When a user locks only the review count (not the rating), the review count will not be treated as present unless the corresponding rating is also locked — a correctness bug for the locked-field case. Fix: addHardcoverReviewCountLocked,GoodreadsReviewCountLocked,AmazonReviewCountLockedfields toMatchScoreInput, add them to the SQL query, and use them inCalculateMatchScore.[MINOR] internal/books/match_score_service_test.go:51-53 — two
Expectcalls inside oneItblockProject convention (project-conventions.md "Test structure (ginkgo)"): exactly one
ExpectperIt. TheIt("calls setScore with a non-zero score when title is present", ...)block at line 51 contains twoExpectcalls (Expect(setScores).NotTo(BeEmpty())followed byExpect(setScores[0])...). Split into twoItblocks: one asserting the slice is non-empty, one asserting the value is positive.[MINOR] internal/books/match_score_store_test.go:151-153, 166-168, 181-183, 196-198, 277-296, 299-303, 306-309 — multiple
ExpectperItin several store test blocksMultiple
Itblocks assert two or more things. Examples:NotTo(BeNil())+ value assertion.Expectcalls in a singleIt.Expectcalls.Expectcalls.Split each multi-assertion
Itinto individualItblocks per property.[MINOR] internal/db/queries/books.sql —
ListBooksForScoreBackfillByLibraryquery defined but not wired or called anywhereThe query is generated into
sqlcbut nothing ininternal/books/callsd.Q.ListBooksForScoreBackfillByLibrary. This is a dead query in this PR — slicing-standard.md: "Don't add a sqlc query that nothing calls." Either wire it (the r6lx recalculate-all follow-up epic presumably needs it) or remove it from this PR and add it in that epic.[MINOR] internal/books/match_score_service.go:69 — backfill query comment says "NULL score" but query has no NULL filter
BackfillMatchScoresdoc-comment says "every book that currently has a NULL score" butListBooksForScoreBackfillselects all non-deleted books withb.id > cursor— nometadata_match_score IS NULLpredicate. On the first startup run the migration guard is not yet set, so all books are scored (correct for bootstrapping). On subsequent restarts the migration guard short-circuits. The query description in the function doc is misleading and the SQL comment on the query mentions "startup backfill" without explaining the NULL omission. Update either the doc-comment or addAND b.metadata_match_score IS NULLto the query (which would make idempotency independent of the migration guard).REVIEW VERDICT: 0 blocker, 1 major, 4 minor
Security Review — bookshelf-yfnr (metadata_match_score)
Threat model coverage: multi-user/library scoping, SQL injection, input validation, DoS/resource bounds, migration safety, PII in logs, template XSS sinks.
Findings
[MINOR] internal/books/match_score.go:238 — AmazonReviewCount locked via AmazonRatingLocked (no dedicated lock column)
earned += presentI32Ptr(m.AmazonReviewCount, m.AmazonRatingLocked) * wAmazonReviewsAmazonReviewCountis gated onAmazonRatingLocked— there is noamazon_review_count_lockedcolumn in the schema, and the same pattern is used for Hardcover and Goodreads review counts (using their rating lock flags). This is internally consistent and mirrors Grimmory's schema (review counts share the provider's single lock flag). No security impact; noted for clarity. No fix required unless a separate review-count lock column is ever added.Threat-model findings (all green)
Multi-user / library scoping (PRIMARY) — PASS
ListHandlerresolvesuserLibraryIDsviagetUserLibraryIDs(lines 253–259) and assignsfilter.UserLibraryIDs = userLibraryIDs(line 326) before the?metadata=block (lines 378–382). BothbuildListBooksFilteredQueryandbuildMultiFieldListQueryapply the fail-closed1=0sentinel whenUserLibraryIDsis a non-nil empty slice (the authenticated-but-no-libraries case). TheMetadataFilterpredicate is appended after the library-scoping block in both builders — a user cannot use?metadata=incomplete|completeto enumerate books outside their accessible libraries.The detail-page score (
getMatchScore) is fetched afterget(r.Context(), id, userIDFromRequest(r))callscheckAccess, which already gates on library membership. No IDOR vector.SQL injection — PASS
The
MetadataFiltervalue is validated to an exact enum ("incomplete"|"complete") in the handler before it reaches the store. The SQL predicates (metadata_match_score IS NULL OR b.metadata_match_score < 100,b.metadata_match_score = 100) are hard-coded string literals appended to the WHERE clause — no user input is interpolated.SetBookMetadataMatchScoreuses parameterizedUPDATE book SET metadata_match_score = ? WHERE id = ?.Input validation — PASS
Unknown/garbage
?metadata=values returnErrValidation→ HTTP 400 (tested inmatch_score_handler_test.go). Empty string → no filter applied. No injection vector and no silent 500.Resource / DoS — PASS
Migration 0036 adds only an index (
CREATE INDEX idx_book_metadata_match_score ON book (metadata_match_score)) — no table rebuild, no data migration at DDL time. The startup backfill is cursor-paginated (batch size 200,id > cursor), runs in a background goroutine, and is guarded by anapp_migrationrow so it is a no-op after the first successful run. No OFFSET pagination; no full-table-in-memory load.Migration safety — PASS
Migration 0036 is index-only on an existing Grimmory column (
book.metadata_match_score). No new column is added; the Grimmory-schema CI guard is not tripped. TheDISK_TYPEguard is irrelevant for DDL (not a filesystem op).No PII/secrets in logs — PASS
Logs emit
book_idandscore(a float64 0–100) only. No user data, no ratings text, no personal identifiers.Template XSS sinks — PASS
books_show.html:{{printf "%.0f" .MetadataMatchScore}}%—MetadataMatchScoreis afloat64rendered throughprintf;html/templateauto-escapes it regardless. No XSS sink.books_index_controls.html:MetadataFilteris reflected intohrefquery strings via{{if $filter.MetadataFilter}}&metadata={{$filter.MetadataFilter}}{{end}}.html/templateURL-context-aware escaping applies here; the value is also validated to"incomplete"|"complete"at ingress so the set of possible values is a closed safe enum.REVIEW VERDICT: 0 blocker, 0 major, 1 minor
25183bda14f4d04f570bReview findings addressed
[MAJOR] internal/books/match_score.go:236-238 — review-count presence gated on wrong lock column
Fixed. Added three dedicated fields to
MatchScoreInput:HardcoverReviewCountLocked boolGoodreadsReviewCountLocked boolAmazonReviewCountLocked boolUpdated
GetBookMetadataForMatchScoreSQL query (internal/db/queries/books.sql:191-193) to SELECTbm.hardcover_review_count_locked,bm.goodreads_review_count_locked,bm.amazon_review_count_locked. Re-ranmake sqlc— the three fields appear inGetBookMetadataForMatchScoreRow. UpdatedmatchScoreRowToInputinmatch_score_store.go:207-213to map them. FixedCalculateMatchScoreatmatch_score.go:236-238to use the dedicated*ReviewCountLockedflags instead of*RatingLocked.Re: comment 6730 (security review) claiming these columns do not exist — they DO exist in
internal/db/migrations/0001_grimmory_baseline.up.sql:222,224,230(amazon_review_count_locked,goodreads_review_count_locked,hardcover_review_count_locked). The security review was incorrect; this is a real correctness bug that was correctly classified as [MAJOR] by comment 6729.Audible review count (
wAudibleRating = 2) has no weight entry for review count (wAudibleReviewsdoes not exist), soaudible_review_count_lockedwas intentionally not added.New unit tests added (
match_score_test.go): proves thatHardcoverReviewCountLocked=true, HardcoverRatingLocked=falsecounts the review count as present (and the symmetric case — rating locked but review count unlocked with nil pointer = review count absent).[MINOR] internal/db/queries/books.sql — ListBooksForScoreBackfillByLibrary unused
Removed at
internal/db/queries/books.sql(the query starting at line 224 in the original). Re-ranmake sqlc— the generated type and function are gone frominternal/db/sqlc/books.sql.go.[MINOR] internal/books/match_score_service.go:63 — misleading BackfillMatchScores doc comment
Updated doc at
match_score_service.go:60-64to say "all non-deleted books" (the actual query has noNULLpredicate — idempotency comes from the migration guard).[MINOR] match_score_service_test.go:51-53 and match_score_store_test.go:277-309 — multiple Expect per It
Split in both files:
match_score_service_test.go: the"calls setScore with a non-zero score"It (2 Expects) split into two Its:"calls setScore exactly once"+"calls setScore with a non-zero score".match_score_store_test.go: the"maps all lock flags when set"It (19 Expects) split into 19 individual Its, one per property. The"maps all rating pointers"(4 Expects) and"maps all review count pointers"(3 Expects) also split into per-property Its. Added 3 new Its forHardcoverReviewCountLocked,GoodreadsReviewCountLocked,AmazonReviewCountLocked.Migration version conflict (resolved)
After rebasing onto current
main,0036_book_match_score_indexconflicted with0036_dedup_external_id_indexes(merged in #567). Renumbered to0038_book_match_score_index(0037 is0037_book_file_dedup_covering_index, also from #567).CI: green (SHA
fa2baa5c). Mergeable: True.Security Re-Review — bd-bookshelf-yfnr2 (head
fa2baa5c)Scope: diff
origin/main...origin/bd-bookshelf-yfnr2, adversarial pass on metadata_match_score +?metadata=filter. Prior review (comment 6730) had 0 blockers / 0 major / 1 minor. This re-review re-checks the fixed items plus the full diff.Threat model confirmation
Multi-user / library scoping — CONFIRMED SAFE
The
?metadata=incomplete|completepredicate is appended instore.goafter theUserLibraryIDs/1=0fail-closed block in bothbuildListBooksFilteredQuery(single-key path, line ≈927) andbuildMultiFieldListQuery(multi-key path, line ≈1203). The sentinel1=0short-circuits the entire WHERE clause when the authenticated user has no accessible libraries, soMetadataFiltercannot bypass library scoping.filter.UserLibraryIDsis populated fromgetUserLibraryIDsbefore any filter params are parsed (handler.go ≈253–262, 326). A user cannot enumerate books outside their libraries via this filter. ✅SQL injection — CONFIRMED SAFE
b.metadata_match_score IS NULL OR b.metadata_match_score < 100andb.metadata_match_score = 100) are hard-coded string literals; they contain no user-controlled data and are ANDed into thewheresslice like all other predicates.SetBookMetadataMatchScoreuses two?placeholders for score and ID (books.sql ≈2545). Fully parameterized.GetBookMetadataForMatchScoreuses one?placeholder forbookID.ListBooksForScoreBackfilluses?for cursor ID and limit.*_review_count_lockedcolumns (fix from prior review) appear in a staticSELECTcolumn list — no interpolation. ✅CREATE INDEXwith no dynamic DDL. ✅Input validation — CONFIRMED SAFE
Handler validates
?metadata=against an exact allowlist ("incomplete"/"complete") before assigning tofilter.MetadataFilter, returning a 400 viamiddleware.ErrValidationon any other value. Empty string → no filter. ✅Resource / DoS — CONFIRMED SAFE
The filter is bounded by the existing
LIMIT+ cursor pagination.hasSelectiveFilternow returnstruewhenMetadataFilter != "", so the query planner hint is applied andidx_book_metadata_match_score(migration 0038) is available. Backfill is cursor-paginated in batches of 200, guarded by theapp_migrationidempotency key, and runs in a background goroutine — does not block request handling. ✅Migration 0038 — CONFIRMED SAFE
Index-only on the existing Grimmory V29 column
book.metadata_match_score. No new column. Down migration drops the index. No destructive DDL. ✅Findings
No new issues found.
The prior minor finding ([MINOR] prior review — migration numbering gap 0036→0038) is resolved: migration is now numbered 0038 contiguously.
One observation worth documenting, not a new finding:
REVIEW VERDICT: 0 blocker, 0 major, 0 minor
CODE RE-REVIEW: APPROVED — 0 blockers, 0 majors, 1 minor
Prior MAJOR (review-count lock columns): RESOLVED
All three dedicated columns are correctly plumbed end-to-end:
internal/db/queries/books.sqlselectsbm.hardcover_review_count_locked,bm.goodreads_review_count_locked,bm.amazon_review_count_locked(distinct from the rating-locked columns).internal/books/match_score.go:574–578:MatchScoreInputhas dedicatedHardcoverReviewCountLocked,GoodreadsReviewCountLocked,AmazonReviewCountLockedfields.internal/books/match_score.go:736–738:CalculateMatchScoreuses them (presentI32Ptr(m.HardcoverReviewCount, m.HardcoverReviewCountLocked)) — independent of the rating lock.internal/books/match_score_test.go:2106–2150proves the three scenarios: review-count-locked-but-rating-unlocked earns only the review-count weight; rating-locked-but-review-count-unlocked-and-nil earns only the rating weight.Prior MINORs: all resolved
ListBooksForScoreBackfillByLibraryremoved — onlyListBooksForScoreBackfill(global) exists in the SQL and generated code.BackfillMatchScoresdoc-comment describes the correct startup-backfill behavior.Itblocks were split — all new test files (match_score_test.go,match_score_service_test.go,match_score_store_test.go,match_score_handler_test.go) use oneExpectperIt.Migration 0038 verified
0035_dedup_normalized_key_index,0036_dedup_external_id_indexes,0037_book_file_dedup_covering_index,0038_book_match_score_indexare sequential with no gap or overlap.0038is index-only: plainCREATE INDEX idx_book_metadata_match_score ON book (metadata_match_score)/DROP INDEX— noIF [NOT] EXISTS, no new columns."backfill_metadata_match_score_v1"is unaffected by the rename.Fresh-eyes pass: clean
ListHandlerresolvesgetUserLibraryIDsandgetContentRestrictions; the metadata filter predicate is additive insidebuildListBooksFilteredQuery/buildMultiFieldListQuery— orthogonal to, not replacing, the library scope.book.metadata_match_scoreis the existing V29 column;0038adds only an index.totalWeightcompile-time assertion present atinternal/books/match_score.go:748:var _ = [totalWeight - 87]struct{}{}.style=introduced in the diff (the pre-existingstyle=widthinbooks_show.htmlis not from this PR).BeforeEach/JustBeforeEach/Itseparation,var (...)" blocks,ctx` propagated throughout.hasSelectiveFiltercorrectly updated to includeMetadataFilterso the query planner hint applies to the new filter path (internal/books/store.go:1232).[MINOR] internal/books/handler_test.go:3461 — misaligned stub in
makeShowHandlerLLMThe new
getMatchScorestub injected at line 3461 uses deeper indentation than its siblings. Cosmetic only — compiles and runs correctly.REVIEW VERDICT: 0 blocker, 0 major, 1 minor
fa2baa5c16633bceee26