feat(dedup): v4 — keep-strategy dropdown + paginated UI + pre-select keeper (bookshelf-c15a.4) #578

Merged
zombor merged 1 commit from bd-bookshelf-c15a.4 into main 2026-06-16 01:10:20 +00:00
Owner

Summary

  • Backend: KeepStrategy type with 4 strategies (preferred_format, largest_file, most_complete_metadata, cleanest_filename); SortGroupByStrategy sorts each group and returns suggestedKeepBookID; handler validates and accepts keep_strategy from JSON body or query param
  • Frontend: Keep-strategy <select> in modal (re-scans on change); cursor-based Prev/Next pagination with page indicator; pre-select suggested keeper per group via JS .checked property after replaceChildren; radios use unique name="target_book_id_N" per group so they're independent radio groups
  • Tests: Unit tests for all 4 strategies + CleanFilenameCompare direct tests + handler/service coverage; go-rod browser e2e tests for strategy change, pre-selection, pagination (Next/Prev), screenshots posted to PR

Key fixes discovered

  • DOMParser moves <meta> elements to <head> (invisible to resultsTarget.querySelector); fixed by storing cursor in data-next-cursor on the .dedup-pagination div instead
  • All radios shared name="target_book_id" making them a single browser radio group; setting .checked on one group's radio unchecked others; fixed by unique name="target_book_id_N" per group

Test plan

  • make test — unit tests pass
  • make coverage — 100% coverage gate passes
  • make e2e — 242 browser tests pass (21 dedup tests: 9 new v4 + 12 existing)
  • Strategy dropdown visible + defaults to preferred_format
  • Each group has a pre-selected keeper radio after scan
  • Changing strategy re-scans with keeper pre-selected
  • Next/Prev pagination with page indicator updates
  • Screenshots posted to Forgejo PR

Closes bead bookshelf-c15a.4 on merge.

## Summary - **Backend:** KeepStrategy type with 4 strategies (preferred_format, largest_file, most_complete_metadata, cleanest_filename); `SortGroupByStrategy` sorts each group and returns `suggestedKeepBookID`; handler validates and accepts `keep_strategy` from JSON body or query param - **Frontend:** Keep-strategy `<select>` in modal (re-scans on change); cursor-based Prev/Next pagination with page indicator; pre-select suggested keeper per group via JS `.checked` property after `replaceChildren`; radios use unique `name="target_book_id_N"` per group so they're independent radio groups - **Tests:** Unit tests for all 4 strategies + `CleanFilenameCompare` direct tests + handler/service coverage; go-rod browser e2e tests for strategy change, pre-selection, pagination (Next/Prev), screenshots posted to PR ## Key fixes discovered - `DOMParser` moves `<meta>` elements to `<head>` (invisible to `resultsTarget.querySelector`); fixed by storing cursor in `data-next-cursor` on the `.dedup-pagination` div instead - All radios shared `name="target_book_id"` making them a single browser radio group; setting `.checked` on one group's radio unchecked others; fixed by unique `name="target_book_id_N"` per group ## Test plan - [x] `make test` — unit tests pass - [x] `make coverage` — 100% coverage gate passes - [x] `make e2e` — 242 browser tests pass (21 dedup tests: 9 new v4 + 12 existing) - [x] Strategy dropdown visible + defaults to preferred_format - [x] Each group has a pre-selected keeper radio after scan - [x] Changing strategy re-scans with keeper pre-selected - [x] Next/Prev pagination with page indicator updates - [x] Screenshots posted to Forgejo PR Closes bead bookshelf-c15a.4 on merge.
feat(dedup): v4 — keep-strategy dropdown + paginated UI + pre-select keeper (bookshelf-c15a.4)
All checks were successful
/ JS Unit Tests (pull_request) Successful in 24s
/ Lint (pull_request) Successful in 1m20s
/ Test (pull_request) Successful in 2m9s
/ E2E Browser (pull_request) Successful in 3m29s
/ Integration (pull_request) Successful in 5m16s
/ E2E API (pull_request) Successful in 7m46s
26790efd46
Backend:
- Add KeepStrategy type (preferred_format, largest_file, most_complete_metadata,
  cleanest_filename) with SortGroupByStrategy that sorts each group in-place and
  returns suggestedKeepBookID
- Export cleanFilenameCompare via export_test.go for whitebox coverage
- Handler accepts keep_strategy from JSON body or query param; validates and defaults
  to preferred_format
- ScanPageData.KeepStrategy propagated through; SuggestedKeepBookID on each group

Frontend:
- Keep-strategy <select> in dedup modal; changing it resets cursor stack and re-scans
- Cursor-based pagination: Next/Prev with client-side cursor stack
- Page indicator (Page N) updated after each scan via direct DOM querySelectorAll
  (bypasses Stimulus target async discovery after replaceChildren)
- Pre-select suggested keeper per group via JS .checked property after insertion;
  radios now use unique name="target_book_id_N" per group so groups are independent
  radio groups (setting checked in one group no longer unchecks another)
- Next-cursor stored in data-next-cursor on .dedup-pagination div; avoids
  DOMParser moving <meta> to <head> (invisible to resultsTarget.querySelector)

Tests:
- Unit tests for all 4 strategy comparators including direct CleanFilenameCompare
  tests covering both branches (return -1 and return 1)
- Handler tests: invalid keep_strategy → 400; valid strategies parsed from JSON + query
- Service test: MetadataMatchScore.Valid=true populates DupBook.MetadataMatchScore
- go-rod e2e: strategy dropdown visible + defaults, pre-selection per group, ★ suffix,
  strategy change re-scans, Next/Prev pagination with page indicator, screenshots
  posted to Forgejo PR

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author
Owner

Authenticated screenshot — Find Duplicates: keep-strategy dropdown + pagination + pre-selected keeper.

dedup

Authenticated screenshot — Find Duplicates: keep-strategy dropdown + pagination + pre-selected keeper. ![dedup](https://git.zombor.net/attachments/f32b1b91-e5d1-4209-b69b-0722fe653165)
Author
Owner

Security Review — bookshelf-c15a.4 (PR #578)

Reviewer: security-review bot
Head SHA: 26790efd
Scope: keepStrategy enum validation, SQL injection, pagination cursor safety, multi-user scoping, resource/DoS, XSS.


Threat-model findings

keepStrategy enum validation

parseScanRequest (handler.go) validates keep_strategy against the ValidKeepStrategy switch on both the JSON body path and the query-param path before it ever reaches the service layer. Unknown values return 400 ErrValidation. Tests cover both paths (returns 400 for invalid keep_strategy in JSON body, returns 400 for invalid keep_strategy in query param). ✓

SQL injection via keepStrategy / sort

SortGroupByStrategy and all *Less helpers are pure Go comparators on already-fetched DupBook structs. No keepStrategy value is ever interpolated into SQL. The most_complete_metadata strategy reads book.metadata_match_score — an existing Grimmory V29 column, selected with a parameterized ? in the query and scanned into sql.NullFloat64. ✓

Cursor decode safety

DecodeCursor (service.go) base64-URL-decodes then JSON-unmarshals. Both failure modes return fmt.Errorf("… %w: %w", middleware.ErrValidation, err), which the middleware translates to 400. No panic path. Cursor values (afterKey, afterID) are passed as bound parameters (?) in the keyset WHERE clauses — no interpolation. A crafted cursor can only skip past group keys / book IDs; all queries additionally filter by library_id IN (?) from the server-resolved userLibraryIDs, so a forged cursor cannot enumerate another user's books. ✓

Multi-user / library scoping

Handler calls getUserLibraryIDs(ctx, userID)filterToLibrary(userLibraryIDs, libraryID) → returns ErrNotFound on empty scope. All five finder functions (FindDuplicatesByISBN, FindDuplicatesByExternalID, FindDuplicatesByTitleAuthor, FindDuplicatesByDirectory, FindDuplicatesByFilename) guard on len(userLibraryIDs) == 0 and embed library_id IN (?) in both the inner group subquery and the outer WHERE. Adding metadata_match_score to the SELECT list does not change scope. ✓

Pagination boundedness

clampLimit (service.go:77) enforces [1, 200] before any query. The JS _cursorStack grows only as pages are navigated (no server-side session state); cursor values originate from server responses. No unbounded in-memory accumulation. ✓

XSS

  • data-next-cursor attribute: NextCursor is a *string containing base64url characters only ([A-Za-z0-9\-_=]); html/template auto-escapes it in attribute context. ✓
  • data-suggested-keeper="true": static literal, conditional on int64 comparison. ✓
  • FileSubPath in <code>: auto-escaped by html/template. ✓
  • MatchType fallthrough: {{else}}{{$g.MatchType}}{{end}} — auto-escaped. ✓
  • JS _updatePageIndicator writes "Page " + page where page is an integer from this._currentPagetextContent, not innerHTML. ✓
  • _preselectKeepers only sets .checked = true on existing radio elements — no DOM injection. ✓

No new Grimmory columns

metadata_match_score on book is an existing Grimmory V29 column (confirmed by migration 0038 comment). No new column added. ✓


REVIEW VERDICT: 0 blocker, 0 major, 0 minor

## Security Review — bookshelf-c15a.4 (PR #578) **Reviewer:** security-review bot **Head SHA:** 26790efd **Scope:** keepStrategy enum validation, SQL injection, pagination cursor safety, multi-user scoping, resource/DoS, XSS. --- ### Threat-model findings #### keepStrategy enum validation `parseScanRequest` (handler.go) validates `keep_strategy` against the `ValidKeepStrategy` switch on **both** the JSON body path and the query-param path before it ever reaches the service layer. Unknown values return `400 ErrValidation`. Tests cover both paths (`returns 400 for invalid keep_strategy in JSON body`, `returns 400 for invalid keep_strategy in query param`). ✓ #### SQL injection via keepStrategy / sort `SortGroupByStrategy` and all `*Less` helpers are pure Go comparators on already-fetched `DupBook` structs. No keepStrategy value is ever interpolated into SQL. The `most_complete_metadata` strategy reads `book.metadata_match_score` — an existing Grimmory V29 column, selected with a parameterized `?` in the query and scanned into `sql.NullFloat64`. ✓ #### Cursor decode safety `DecodeCursor` (service.go) base64-URL-decodes then JSON-unmarshals. Both failure modes return `fmt.Errorf("… %w: %w", middleware.ErrValidation, err)`, which the middleware translates to 400. No panic path. Cursor values (`afterKey`, `afterID`) are passed as bound parameters (`?`) in the keyset WHERE clauses — no interpolation. A crafted cursor can only skip past group keys / book IDs; all queries additionally filter by `library_id IN (?)` from the server-resolved `userLibraryIDs`, so a forged cursor cannot enumerate another user's books. ✓ #### Multi-user / library scoping Handler calls `getUserLibraryIDs(ctx, userID)` → `filterToLibrary(userLibraryIDs, libraryID)` → returns `ErrNotFound` on empty scope. All five finder functions (`FindDuplicatesByISBN`, `FindDuplicatesByExternalID`, `FindDuplicatesByTitleAuthor`, `FindDuplicatesByDirectory`, `FindDuplicatesByFilename`) guard on `len(userLibraryIDs) == 0` and embed `library_id IN (?)` in both the inner group subquery and the outer WHERE. Adding `metadata_match_score` to the SELECT list does not change scope. ✓ #### Pagination boundedness `clampLimit` (service.go:77) enforces `[1, 200]` before any query. The JS `_cursorStack` grows only as pages are navigated (no server-side session state); cursor values originate from server responses. No unbounded in-memory accumulation. ✓ #### XSS - `data-next-cursor` attribute: `NextCursor` is a `*string` containing base64url characters only (`[A-Za-z0-9\-_=]`); `html/template` auto-escapes it in attribute context. ✓ - `data-suggested-keeper="true"`: static literal, conditional on `int64` comparison. ✓ - `FileSubPath` in `<code>`: auto-escaped by `html/template`. ✓ - `MatchType` fallthrough: `{{else}}{{$g.MatchType}}{{end}}` — auto-escaped. ✓ - JS `_updatePageIndicator` writes `"Page " + page` where `page` is an integer from `this._currentPage` — `textContent`, not `innerHTML`. ✓ - `_preselectKeepers` only sets `.checked = true` on existing radio elements — no DOM injection. ✓ #### No new Grimmory columns `metadata_match_score` on `book` is an existing Grimmory V29 column (confirmed by migration 0038 comment). No new column added. ✓ --- REVIEW VERDICT: 0 blocker, 0 major, 0 minor
Author
Owner

Security Re-Review — Pass 3 (SHA 60ab9641, PR #567, bead bookshelf-c15a.2)

Prior findings — resolution status

[MAJOR] Cross-criterion cursor namespace collision — RESOLVED.
The new ScanCursor.Positions map[string]CriterionPos gives each active criterion its own (afterKey, afterID). The handler reads userLibraryIDs and builds scopedIDs before the cursor is decoded, so the cursor has no influence on the library scope. Unknown keys in Positions (attacker-injected) are silently ignored via map-lookup — they cannot widen scope or cause a panic. The afterKey/afterID values are passed only as SQL parameters in a keyset WHERE clause; the library-ID filter (WHERE b.library_id IN (?)) is applied independently in every finder. Cross-user read via cursor is not possible.

[MINOR] No http.MaxBytesReader — RESOLVED.
http.MaxBytesReader(w, r.Body, 4096) is applied in ScanHandler before parseScanRequest is called (handler.go:128-130). The error-mapper handles *http.MaxBytesError via errors.As (checked before errors.Is(ErrValidation) in the switch), so an oversized body returns 413 not 400. The handler unit test at handler_test.go:339-344 confirms 413 is returned.


Fresh threat-model

SQL injection — all five finder queries (FindDuplicatesByISBN, FindDuplicatesByExternalID, FindDuplicatesByTitleAuthor, FindDuplicatesByDirectory, FindDuplicatesByFilename) pass every user-influenced value (cursor afterKey/afterID, limit+1) as positional ? parameters. Library IDs are server-computed from the session and passed the same way. The LIKE patterns use %%/%% (Go fmt.Sprintf escaping to %/%) which are string literals, not user-controlled. No injection surface found.

ExternalID UNION ALL arg binding — manually verified: HC branch gets libIn, afterKey, afterKey, afterID, limit+1, libIn; GR branch gets the same pattern; total args = 4N + 8 matching the four %s expansions + eight cursor/limit ?s in the SQL. Both branches are independently scoped by userLibraryIDs.

Resource/DoS — request body capped at 4096 bytes via MaxBytesReader. The cursor query param is bounded by Go's default MaxHeaderBytes (1 MB) and is a server-generated opaque blob with no unbounded sub-structure. Each finder's inner subquery limits to limit+1 groups (clamped to max 200 by clampLimit); the ExternalID combined outer query has no second LIMIT but is bounded by 2x(limit+1) groups, which is documented and handled correctly by the service-layer seen-map. No unbounded scan path found.

Cursor decode safetyDecodeCursor returns a typed error (wrapping ErrValidation) on invalid base64 or malformed JSON; no panic path. The Positions map is deserialized into a Go struct with typed fields; oversized maps are bounded by the cursor string length, itself bounded by HTTP headers.

Multi-user fail-closed — every finder has if len(userLibraryIDs) == 0 { return nil, nil } as the first guard. filterToLibrary rejects requests for a library not in the user's accessible set (returns 404). userLibraryIDs is derived from the authenticated session, never from the request body or cursor.

keptKeys pagination correctness — the seen2 loop correctly breaks on len(seen2) > limit, excluding the peek group from keptKeys. The lastPos update only iterates labeledRows in keptKeys, so the cursor points to the last book of the last returned group, not the discarded peek group.


New findings

No new blockers or majors. One minor style note:

[MINOR] internal/dedup/handler.go:53 — double-wrapped error format %w: %w with ErrValidation first
fmt.Errorf("dedup scan: decode body: %w: %w", middleware.ErrValidation, err) is an unusual multi-error pattern. The statusFor switch checks errors.As(err, &maxBytesErr) before errors.Is(ErrValidation), so the 413 path wins correctly and the test confirms it. The double-colon separator is cosmetic and non-standard but causes no correctness or security issue. The same pattern exists in the merge handler, suggesting it is an established codebase convention. No fix required.


REVIEW VERDICT: 0 blocker, 0 major, 1 minor

## Security Re-Review — Pass 3 (SHA 60ab9641, PR #567, bead bookshelf-c15a.2) ### Prior findings — resolution status **[MAJOR] Cross-criterion cursor namespace collision — RESOLVED.** The new `ScanCursor.Positions map[string]CriterionPos` gives each active criterion its own `(afterKey, afterID)`. The handler reads `userLibraryIDs` and builds `scopedIDs` *before* the cursor is decoded, so the cursor has no influence on the library scope. Unknown keys in `Positions` (attacker-injected) are silently ignored via map-lookup — they cannot widen scope or cause a panic. The `afterKey`/`afterID` values are passed only as SQL *parameters* in a keyset WHERE clause; the library-ID filter (`WHERE b.library_id IN (?)`) is applied independently in every finder. Cross-user read via cursor is not possible. **[MINOR] No `http.MaxBytesReader` — RESOLVED.** `http.MaxBytesReader(w, r.Body, 4096)` is applied in `ScanHandler` before `parseScanRequest` is called (`handler.go:128-130`). The error-mapper handles `*http.MaxBytesError` via `errors.As` (checked before `errors.Is(ErrValidation)` in the switch), so an oversized body returns 413 not 400. The handler unit test at `handler_test.go:339-344` confirms 413 is returned. --- ### Fresh threat-model **SQL injection** — all five finder queries (`FindDuplicatesByISBN`, `FindDuplicatesByExternalID`, `FindDuplicatesByTitleAuthor`, `FindDuplicatesByDirectory`, `FindDuplicatesByFilename`) pass every user-influenced value (cursor `afterKey`/`afterID`, `limit+1`) as positional `?` parameters. Library IDs are server-computed from the session and passed the same way. The LIKE patterns use `%%/%%` (Go `fmt.Sprintf` escaping to `%/%`) which are string literals, not user-controlled. No injection surface found. **ExternalID UNION ALL arg binding** — manually verified: HC branch gets `libIn, afterKey, afterKey, afterID, limit+1, libIn`; GR branch gets the same pattern; total args = `4N + 8` matching the four `%s` expansions + eight cursor/limit `?`s in the SQL. Both branches are independently scoped by `userLibraryIDs`. **Resource/DoS** — request body capped at 4096 bytes via `MaxBytesReader`. The cursor query param is bounded by Go's default `MaxHeaderBytes` (1 MB) and is a server-generated opaque blob with no unbounded sub-structure. Each finder's inner subquery limits to `limit+1` groups (clamped to max 200 by `clampLimit`); the ExternalID combined outer query has no second LIMIT but is bounded by 2x(limit+1) groups, which is documented and handled correctly by the service-layer seen-map. No unbounded scan path found. **Cursor decode safety** — `DecodeCursor` returns a typed error (wrapping `ErrValidation`) on invalid base64 or malformed JSON; no panic path. The `Positions` map is deserialized into a Go struct with typed fields; oversized maps are bounded by the cursor string length, itself bounded by HTTP headers. **Multi-user fail-closed** — every finder has `if len(userLibraryIDs) == 0 { return nil, nil }` as the first guard. `filterToLibrary` rejects requests for a library not in the user's accessible set (returns 404). `userLibraryIDs` is derived from the authenticated session, never from the request body or cursor. **`keptKeys` pagination correctness** — the seen2 loop correctly breaks on `len(seen2) > limit`, excluding the peek group from `keptKeys`. The `lastPos` update only iterates `labeledRows` in `keptKeys`, so the cursor points to the last book of the last *returned* group, not the discarded peek group. --- ### New findings No new blockers or majors. One minor style note: [MINOR] internal/dedup/handler.go:53 — double-wrapped error format `%w: %w` with ErrValidation first `fmt.Errorf("dedup scan: decode body: %w: %w", middleware.ErrValidation, err)` is an unusual multi-error pattern. The `statusFor` switch checks `errors.As(err, &maxBytesErr)` before `errors.Is(ErrValidation)`, so the 413 path wins correctly and the test confirms it. The double-colon separator is cosmetic and non-standard but causes no correctness or security issue. The same pattern exists in the merge handler, suggesting it is an established codebase convention. No fix required. --- REVIEW VERDICT: 0 blocker, 0 major, 1 minor
Author
Owner

Code Review — bookshelf-c15a.4 (Find Duplicates v4)

Phase 0: DEMO verification

No interactive DEMO block (implementation-complete bead, CI green). Proceeding to spec/code review.


Phase 1 — Spec compliance

All required items are present:

  • keep_strategy param on POST /libraries/{id}/dedup/scan — JSON + query-param paths both handled with validation; invalid value returns 400.
  • Four strategies implemented (preferred_format/largest_file/most_complete_metadata/cleanest_filename), each ending in book ID as the final tiebreaker — deterministic total order confirmed.
  • cleanest_filename: '- 201002 Cable 021.cbz' vs '201002 Cable 021.cbz'leadingNonAlnum counts 2 vs 0, keeps the clean one. Spec case covered in keep_strategy_test.go.
  • SortGroupByStrategy is an exported pure function — reusable by v5/v6.
  • SuggestedKeepBookID returned; Books[0] is the suggested keeper after sort.
  • Pagination: NextCursor from server, client cursor STACK for Prev, page indicator, disable states correct on page 1 / last page.
  • Pre-select keeper radio via data-suggested-keeper="true" + _preselectKeepers() called after replaceChildren.
  • Library-scoped fail-closed unchanged.
  • No new Grimmory table columns (MetadataMatchScore reads from existing book.metadata_match_score, no migration).
  • No inline style= attributes.

Phase 2 — Code quality

[MINOR] internal/dedup/service_test.go:835 — multiple Expect calls inside one It block

The new "MetadataMatchScore.Valid = true populates DupBook.MetadataMatchScore" It block contains four Expect calls (HaveLen, Books[0].ID, MetadataMatchScore non-nil, float value). Project convention is exactly one Expect per It (with error folded into the value assertion). Suggested fix: split into three It blocks sharing a JustBeforeEach, so a failure pinpoints which property broke.

[MINOR] static/js/controllers/find_duplicates_controller.js:218 — keep_strategy sent in URL query param AND JSON body

_buildScanUrl appends keep_strategy=... to the query string when strategy != preferred_format. _doScan also sets criteria.keep_strategy in the JSON body (always). The handler's JSON path reads from the body and ignores the URL param — the URL param is dead weight for every browser request. Additionally, preferred_format is silently omitted from the URL but always present in the body, creating an asymmetry. Suggested fix: remove keep_strategy from _buildScanUrl entirely; the JSON body is the only path used by the browser.

[MINOR] templates/pages/dedup_results.html:63 — inert pagination bar visible before page indicator is populated

When groups are returned, the pagination bar always renders. On the initial scan, the page indicator span is empty until _updatePageIndicator() fires (JS, synchronous after replaceChildren). In practice the indicator fills in immediately, but briefly shows ← Prev [blank] Next → during the replace. Low priority; disable states are correct throughout.


REVIEW VERDICT: 0 blocker, 0 major, 3 minor

## Code Review — bookshelf-c15a.4 (Find Duplicates v4) ### Phase 0: DEMO verification No interactive DEMO block (implementation-complete bead, CI green). Proceeding to spec/code review. --- ### Phase 1 — Spec compliance All required items are present: - keep_strategy param on POST /libraries/{id}/dedup/scan — JSON + query-param paths both handled with validation; invalid value returns 400. - Four strategies implemented (preferred_format/largest_file/most_complete_metadata/cleanest_filename), each ending in `book ID` as the final tiebreaker — deterministic total order confirmed. - cleanest_filename: `'- 201002 Cable 021.cbz'` vs `'201002 Cable 021.cbz'` — `leadingNonAlnum` counts 2 vs 0, keeps the clean one. Spec case covered in `keep_strategy_test.go`. - `SortGroupByStrategy` is an exported pure function — reusable by v5/v6. - `SuggestedKeepBookID` returned; `Books[0]` is the suggested keeper after sort. - Pagination: `NextCursor` from server, client cursor STACK for Prev, page indicator, disable states correct on page 1 / last page. - Pre-select keeper radio via `data-suggested-keeper="true"` + `_preselectKeepers()` called after `replaceChildren`. - Library-scoped fail-closed unchanged. - No new Grimmory table columns (`MetadataMatchScore` reads from existing `book.metadata_match_score`, no migration). - No inline `style=` attributes. --- ### Phase 2 — Code quality [MINOR] internal/dedup/service_test.go:835 — multiple Expect calls inside one It block The new "MetadataMatchScore.Valid = true populates DupBook.MetadataMatchScore" It block contains four Expect calls (`HaveLen`, `Books[0].ID`, `MetadataMatchScore` non-nil, float value). Project convention is exactly one Expect per It (with error folded into the value assertion). Suggested fix: split into three It blocks sharing a JustBeforeEach, so a failure pinpoints which property broke. [MINOR] static/js/controllers/find_duplicates_controller.js:218 — keep_strategy sent in URL query param AND JSON body `_buildScanUrl` appends `keep_strategy=...` to the query string when strategy != `preferred_format`. `_doScan` also sets `criteria.keep_strategy` in the JSON body (always). The handler's JSON path reads from the body and ignores the URL param — the URL param is dead weight for every browser request. Additionally, `preferred_format` is silently omitted from the URL but always present in the body, creating an asymmetry. Suggested fix: remove `keep_strategy` from `_buildScanUrl` entirely; the JSON body is the only path used by the browser. [MINOR] templates/pages/dedup_results.html:63 — inert pagination bar visible before page indicator is populated When groups are returned, the pagination bar always renders. On the initial scan, the page indicator span is empty until `_updatePageIndicator()` fires (JS, synchronous after `replaceChildren`). In practice the indicator fills in immediately, but briefly shows `← Prev [blank] Next →` during the replace. Low priority; disable states are correct throughout. --- REVIEW VERDICT: 0 blocker, 0 major, 3 minor
Author
Owner

Code Review — bookshelf-c15a.4 (Find Duplicates v4)

Phase 0: DEMO verification

No interactive DEMO block (implementation-complete bead, CI green). Proceeding to spec/code review.


Phase 1 — Spec compliance

All required items are present:

  • keep_strategy param on POST /libraries/{id}/dedup/scan — JSON + query-param paths both handled with validation; invalid value returns 400.
  • Four strategies implemented (preferred_format/largest_file/most_complete_metadata/cleanest_filename), each ending in book ID as the final tiebreaker — deterministic total order confirmed.
  • cleanest_filename: '- 201002 Cable 021.cbz' vs '201002 Cable 021.cbz'leadingNonAlnum counts 2 vs 0, keeps the clean one. Spec case covered in keep_strategy_test.go.
  • SortGroupByStrategy is an exported pure function — reusable by v5/v6.
  • SuggestedKeepBookID returned; Books[0] is the suggested keeper after sort.
  • Pagination: NextCursor from server, client cursor STACK for Prev, page indicator, disable states correct on page 1 / last page.
  • Pre-select keeper radio via data-suggested-keeper="true" + _preselectKeepers() called after replaceChildren.
  • Library-scoped fail-closed unchanged.
  • No new Grimmory table columns (MetadataMatchScore reads from existing book.metadata_match_score, no migration).
  • No inline style= attributes.

Phase 2 — Code quality

[MINOR] internal/dedup/service_test.go:835 — multiple Expect calls inside one It block

The new "MetadataMatchScore.Valid = true populates DupBook.MetadataMatchScore" It block contains four Expect calls (HaveLen, Books[0].ID, MetadataMatchScore non-nil, float value). Project convention is exactly one Expect per It (with error folded into the value assertion). Suggested fix: split into three It blocks sharing a JustBeforeEach, so a failure pinpoints which property broke.

[MINOR] static/js/controllers/find_duplicates_controller.js:218 — keep_strategy sent in URL query param AND JSON body

_buildScanUrl appends keep_strategy=... to the query string when strategy != preferred_format. _doScan also sets criteria.keep_strategy in the JSON body (always). The handler's JSON path reads from the body and ignores the URL param — the URL param is dead weight for every browser request. Additionally, preferred_format is silently omitted from the URL but always present in the body, creating an asymmetry. Suggested fix: remove keep_strategy from _buildScanUrl entirely; the JSON body is the only path used by the browser.

[MINOR] templates/pages/dedup_results.html:63 — inert pagination bar visible before page indicator is populated

When groups are returned, the pagination bar always renders. On the initial scan, the page indicator span is empty until _updatePageIndicator() fires (JS, synchronous after replaceChildren). In practice the indicator fills in immediately, but briefly shows ← Prev [blank] Next → during the replace. Low priority; disable states are correct throughout.


REVIEW VERDICT: 0 blocker, 0 major, 3 minor

## Code Review — bookshelf-c15a.4 (Find Duplicates v4) ### Phase 0: DEMO verification No interactive DEMO block (implementation-complete bead, CI green). Proceeding to spec/code review. --- ### Phase 1 — Spec compliance All required items are present: - keep_strategy param on POST /libraries/{id}/dedup/scan — JSON + query-param paths both handled with validation; invalid value returns 400. - Four strategies implemented (preferred_format/largest_file/most_complete_metadata/cleanest_filename), each ending in `book ID` as the final tiebreaker — deterministic total order confirmed. - cleanest_filename: `'- 201002 Cable 021.cbz'` vs `'201002 Cable 021.cbz'` — `leadingNonAlnum` counts 2 vs 0, keeps the clean one. Spec case covered in `keep_strategy_test.go`. - `SortGroupByStrategy` is an exported pure function — reusable by v5/v6. - `SuggestedKeepBookID` returned; `Books[0]` is the suggested keeper after sort. - Pagination: `NextCursor` from server, client cursor STACK for Prev, page indicator, disable states correct on page 1 / last page. - Pre-select keeper radio via `data-suggested-keeper="true"` + `_preselectKeepers()` called after `replaceChildren`. - Library-scoped fail-closed unchanged. - No new Grimmory table columns (`MetadataMatchScore` reads from existing `book.metadata_match_score`, no migration). - No inline `style=` attributes. --- ### Phase 2 — Code quality [MINOR] internal/dedup/service_test.go:835 — multiple Expect calls inside one It block The new "MetadataMatchScore.Valid = true populates DupBook.MetadataMatchScore" It block contains four Expect calls (`HaveLen`, `Books[0].ID`, `MetadataMatchScore` non-nil, float value). Project convention is exactly one Expect per It (with error folded into the value assertion). Suggested fix: split into three It blocks sharing a JustBeforeEach, so a failure pinpoints which property broke. [MINOR] static/js/controllers/find_duplicates_controller.js:218 — keep_strategy sent in URL query param AND JSON body `_buildScanUrl` appends `keep_strategy=...` to the query string when strategy != `preferred_format`. `_doScan` also sets `criteria.keep_strategy` in the JSON body (always). The handler's JSON path reads from the body and ignores the URL param — the URL param is dead weight for every browser request. Additionally, `preferred_format` is silently omitted from the URL but always present in the body, creating an asymmetry. Suggested fix: remove `keep_strategy` from `_buildScanUrl` entirely; the JSON body is the only path used by the browser. [MINOR] templates/pages/dedup_results.html:63 — inert pagination bar visible before page indicator is populated When groups are returned, the pagination bar always renders. On the initial scan, the page indicator span is empty until `_updatePageIndicator()` fires (JS, synchronous after `replaceChildren`). In practice the indicator fills in immediately, but briefly shows `← Prev [blank] Next →` during the replace. Low priority; disable states are correct throughout. --- REVIEW VERDICT: 0 blocker, 0 major, 3 minor
zombor force-pushed bd-bookshelf-c15a.4 from 26790efd46
All checks were successful
/ JS Unit Tests (pull_request) Successful in 24s
/ Lint (pull_request) Successful in 1m20s
/ Test (pull_request) Successful in 2m9s
/ E2E Browser (pull_request) Successful in 3m29s
/ Integration (pull_request) Successful in 5m16s
/ E2E API (pull_request) Successful in 7m46s
to 7f8ba1d060
All checks were successful
/ JS Unit Tests (pull_request) Successful in 20s
/ Lint (pull_request) Successful in 55s
/ Test (pull_request) Successful in 2m44s
/ E2E Browser (pull_request) Successful in 3m32s
/ E2E API (pull_request) Successful in 4m24s
/ Integration (pull_request) Successful in 4m43s
2026-06-16 01:04:28 +00:00
Compare
zombor merged commit a0dad1660c into main 2026-06-16 01:10:20 +00:00
Sign in to join this conversation.
No reviewers
No labels
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
zombor/pergamum!578
No description provided.