fix(comicvine): structured volume+issue_number lookup for Series #N (bookshelf-wzef) #414
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "bd-bookshelf-wzef"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Test plan
Closes bead bookshelf-wzef on merge.
🤖 Generated with Claude Code
Security Review — bookshelf-wzef (PR #414)
Diff reviewed:
origin/main...origin/bd-bookshelf-wzefFiles:
internal/metadata/comicvine/provider.go,internal/metadata/comicvine/structured_test.go,internal/metadata/comicvine/export_test.goURL / Query Injection
query=param (series name → volume search):seriesis passed toparams.Set("query", series)and the full params block is serialized viaparams.Encode()(standardurl.Values.Encode()). This percent-encodes the value; any special characters in the series name are safely escaped before the URL is assembled. No injection possible here.filter=param (volume ID + issue number): The filter string is constructed as:volumeIDis anint64formatted with%d(integer only).issueNumberis the output ofnormalizeIssueNumber(issueNum), whereissueNumis captured by regex group 2:This constrains the value to digits, optionally followed by
./-+ digits, or alpha characters. The characters that could corrupt the ComicVine server-side filter syntax —,(filter clause separator),:(key/value separator),&(query string separator) — are structurally impossible from this regex. The entirefiltervalue is then further URL-encoded byparams.Encode(). No injection vector.SSRF
baseURLis set to the hardcoded constantdefaultBaseURL = "https://comicvine.gamespot.com/api"at construction time. The config fieldMetadataProviderComicVineBaseURLis explicitly documented as test-only and is NOT registered as an ff flag / environment variable, preventing operator or user injection of an attacker-controlled host. No user-request input influences the host or scheme. No SSRF vector.API Key Leakage
The API key is passed as a URL query parameter (
?api_key=KEY). The existingredactAPIKey/redactURLErrorsmachinery walks the full*url.Errorchain before logging and replaces the key value withREDACTED. Log statements at the Info level (volume search completed,issues filter completed,issue detail fetched, etc.) log only structured fields likeseries,volume_id,issue_number,results— never the rawreqURL. No key leakage.DoS / Rate Budget
Every new request path (
searchVolumes,fetchIssuesByFilter) routes throughdoGetWithRetry, which calls bothhourly.Wait(ctx)(200/hr bucket) andlimiter(ctx)(per-second) on every attempt, including retries. The existing limiters gate 100% of new requests. The structured path adds at most1 + maxVolumesToCheck(3) + maxVolumesToCheck(3)= 7 requests per search in the worst case (1 volume search + 3 issue filter + 3 detail), bounded by constants.Volume Cache — No Size Cap (MINOR observation, not a blocking finding)
The
volumeCacheis amap[string]volumeCacheEntrywith TTL-based lazy expiry (10 min). Stale entries are only removed when aget()for the same key is called after expiry — there is no background sweep or maximum-entry eviction. In a normal personal library use-case the number of distinct series is bounded (hundreds to low thousands), so the memory impact is negligible. Each entry holds at most 25volumeSearchResultstructs (small structs, ~100 bytes each) → worst case ~2.5 KB per series → thousands of series ≈ a few MB. Not a practical DoS risk for the target deployment scale.Test Fixtures
No real API keys, passwords, tokens, or credentials in any test fixture. Tests use
noAPIKey(nil/empty) or"test-key"literals. Clean.REVIEW VERDICT: 0 blocker, 0 major, 0 minor
Code Review — PR #414 (bookshelf-wzef)
Branch:
bd-bookshelf-wzef— ComicVine structured volume→issue lookupFindings
[MAJOR] internal/metadata/comicvine/structured_test.go:232 —
callCountdeclared and written in handler but never asserted; year-early-stop is not actually testedThe "year disambiguation picks the right printing" spec declares
callCount := 0and increments it in the/issues/handler, but noExpect(callCount...)assertion exists. The twoItblocks only check that results are non-empty and thatSeriesNumber ≈ 18. This means the year-aligned early-stop (the key rate-budget guarantee: "stop after the first year-matched volume yields a result, do not check remaining volumes") is completely untested. A regression that removesif sv.yearMatch && len(results) > 0 { break }would pass all tests. AddIt("checks only 1 volume when year-aligned volume matches first", func() { Expect(callCount).To(Equal(1)) })— or switch toatomic.Int32+ assertion. The underlying early-stop logic in production code is correct, but the test gives false confidence on a rate-budget-critical path.[MINOR] internal/metadata/comicvine/structured_test.go:109–111 — three
Expectcalls perDescribeTableentry violates one-assertion-per-It conventionEach
Entrybody containsExpect(s),Expect(i), andExpect(y). Project conventions (project-conventions.md) require exactly oneExpectperItblock; DescribeTable entries compile toItblocks. Split into three separateDescribeTables (one per return value) or use three tables. Minor: no correctness impact.[MINOR] internal/metadata/comicvine/structured_test.go:787 — provider invocation inside
BeforeEach, notJustBeforeEachThe "volume and filter search both use API key" spec calls
provider(ctx, ...)at line 787 insideBeforeEach. Project conventions requireBeforeEachfor pure setup (fixtures, stubs) andJustBeforeEachfor the system-under-test call. Move theprovider(ctx, ...)call to aJustBeforeEachblock. Minor: no correctness impact.[MINOR] internal/metadata/comicvine/structured_test.go:790–791 — API-key assertion is underspecified
Expect(gotKeys).To(ContainElement("test-key"))passes if the key is sent on any request (e.g. just the volume search). BothsearchVolumesandfetchIssuesByFilterdo include the key by code inspection, but the test does not verify the filter request specifically sent it. Strengthen toExpect(gotKeys).To(HaveLen(3))(volume + filter + detail each send it) or route-tag the key collection and assert both paths.[MINOR] internal/metadata/comicvine/provider.go:75–78 — regex allows empty series name; bare
#18sends a useless volume searchtitleREmatches^(.*?)#...— group 1 (series) can be empty. A title of"#18"producesseries="",issueNum="18", which enters the structured path and callssearchVolumeswithquery="". ComicVine will return an arbitrary result set. Cache key""permanently pins those results for 10 minutes. This wastes one rate-limited request. Fix: addif strings.TrimSpace(series) == "" { return "", "", "" }inparseTitleComponents, or guardissueNum != "" && strings.TrimSpace(series) != ""at the call site.[MINOR] internal/metadata/comicvine/structured_test.go:105–119 —
ParseTitleComponentstable is missing edge casesNo entry tests: (a) bare
"#18"→("","18","")empty series; (b) title with two#signs like"Batman #18 #2"which the lazy.*?anchors to the last#, yieldingseries="Batman #18",issueNum="2"(wrong series name). Neither scenario has catastrophic impact in practice (bare#18is rare; multi-#titles are rare), but both are silent wrong-answer paths on a function exported for white-box testing.REVIEW VERDICT: 0 blocker, 1 major, 5 minor
a920453adc494d9bab3a