fix(comicvine): separator-insensitive volume name match (bookshelf-poza) #879
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "bd-bookshelf-poza"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
normalizeSeriesNamehelper that lowercases and replaces separator punctuation (:,-,–,—,_,/,.) with a single space, collapses whitespace, and trimsscoreVolumesname-match comparison (exact +50 and contains +25) so that series names likeBatman - City of LightorBatman City of Lightmatch the canonical ComicVine volumeBatman: City of LightnameMatch=truethe structuredvolume→/issues/?filter=volume:V,issue_number:Npath fires, returning only the requested issue instead of the fuzzy free-text fallback that returns all issuesScope: Only the local scoring comparison is changed. ComicVine query strings are unchanged. No other providers or fallback paths are touched.
Test plan
ScoreVolumesunit tests: dash variant, space variant, genuinely-different series (no nameMatch), partial normalized-contains"Batman - City of Light #5"→ structured path → exactly 1 result (issue #5)"Batman City of Light #5"→ structured path → exactly 1 result (issue #5)internal/metadata/comicvineincludingnormalizeSeriesNameCloses bead bookshelf-poza on merge.
Workflow Detail page screenshot (wf-detail-older-execution)
Older completed ContinueAsNew epoch detail — execution ID and state visible, Cancel absent.
Recompute Match Score — kebab open screenshot (recompute-match-score-kebab-open)
Workflow Detail page screenshot (wf-detail-older-execution)
Older completed ContinueAsNew epoch detail — execution ID and state visible, Cancel absent.
Recompute Match Score — kebab open screenshot (recompute-match-score-kebab-open)
CODE REVIEW: APPROVED
Phase 0 — DEMO Verification
No DEMO block present in this bead (separator-insensitivity is a scoring-internal change with no interactive CLI demo). I ran the normalization logic manually (Go program against the same regex/collapse logic) to verify correctness.
Phase 1 — Spec Compliance
Diff:
internal/metadata/comicvine/provider.go(+32 lines),internal/metadata/comicvine/structured_test.go(+91 lines).normalizeSeriesNameadded and wired into both sides ofscoreVolumes' name-match. ✓seriesNormhoisted outside the loop (computed once;seriesis constant). ✓nameMatchgating logic are unchanged. ✓Phase 2 — Code Quality
Correctness:
normalizeSeriesNameprovider.go:289—separatorRE = regexp.MustCompile([-:–—_/.])-at position 0 inside[...]is a literal dash in RE2. Correct..inside[...]is a literal period (not any-char). Correct.–(U+2013) and em-dash—(U+2014) are multi-byte UTF-8; Go'sregexphandles these correctly as Unicode runes.Verified normalization outputs:
"Batman - City of Light"→"batman city of light"✓"Batman: City of Light"→"batman city of light"✓"Batman City of Light"→"batman city of light"✓"Batman & Robin"→"batman & robin"(ampersand preserved; does NOT collapse to"batman robin") ✓"X-Men"→"x men","X Men"→"x men"(same series, acceptable)"S.H.I.E.L.D."→"s h i e l d","SHIELD"→"shield"(NOT equal — no false positive across acronym styles) ✓Over-match analysis
The one new behavior worth naming: a partial query like
"Batman City"previously did NOT contains-match"Batman: City of Light"(the colon blockedstrings.Contains). After the patch it does, because both normalize to strings where"batman city"is a substring of"batman city of light". This is an acceptable trade-off — the contains branch awards +25 and setsnameMatch, but all other scoring factors (year, issue count, publisher, recency) differentiate further. This is strictly within the existing fuzzy-matching design intent and is not a correctness bug.Tests
Four new
ScoreVolumesunit tests (structured_test.go:264–295):"Batman - City of Light"vs"Batman: City of Light") →nameMatch=true. Non-tautological: wasfalsebefore the patch (the oldnameLowerpath hadstrings.Contains("batman: city of light", "batman - city of light")= false). ✓"Batman City of Light"vs"Batman: City of Light") →nameMatch=true. Non-tautological: wasfalsebefore the patch (the colon in the volume name blocked the old contains check). ✓"Superman"vs"Batman: City of Light") →nameMatch=false. ✓"Batman"vs"Batman: City of Light") →nameMatch=true. Valid regression test for the contains branch via the new code path; however this behavior was alreadytruebefore the patch (old code:strings.Contains("batman: city of light", "batman")= true). It does not specifically exercise the new separator normalization. Minor only.Two new httptest.Server journey tests (
structured_test.go:2022–2077):"Batman: City of Light"and an issue #5, then query with the dash and space variants. AssertHaveLen(1)(structured path fires, not fuzzy fallback) andSeriesNumber ≈ 5. Non-tautological — these journeys requirenameMatch=trueto route through the structured issue-#-filtered path; without the patch, the structured path would be skipped and a fuzzy result set (or empty set) would be returned instead. ✓[MINOR]
internal/metadata/comicvine/structured_test.go:288— partial/contains test does not exercise new normalizationThe "Batman" vs "Batman: City of Light"
Itpasses both before and after the patch ("batman"was always a substring of"batman: city of light"). It is a valid regression guard for the contains branch but does not specifically test that separator normalization enabled a new contains match. Consider adding a case where normalization is the enabling factor for the contains path (e.g."Batman City"vs"Batman: City of Light"). Does not block.REVIEW VERDICT: 0 blocker, 0 major, 1 minor