docs(spike): native Go EPUB normalizer for Send-to-Kindle (bookshelf-zurw) #661

Closed
zombor wants to merge 1 commit from bd-bookshelf-zurw into main
Owner

Summary

Spike findings for native Go EPUB normalization to fix Kindle E999 errors on Send-to-Kindle.

Root cause: E999 is primarily triggered by (1) ZIP structure violations — mimetype entry must be first and STORED/uncompressed, (2) malformed/missing OPF or container.xml, and (3) oversized images. The current send path streams the raw EPUB file with zero normalization.

Feasibility verdict: Ship natively. The minimum viable scope — re-zip with mimetype STORED-first, repair container.xml/OPF, strip embedded fonts, downscale oversized images, remove remote resource references — is buildable on existing go.mod deps (no new dependencies) and reuses existing internal/files code (parseOPF, TransformImages, downscaleToWorkingSize).

Recommended scope for implementation bead:

  1. New NormalizeForKindle(src io.ReaderAt, size int64) ([]byte, error) in internal/files or internal/epub
  2. Re-zip with mimetype STORED-first (fixes the #1 E999 cause)
  3. Repair OPF: ensure version attr, add dc:identifier if missing, prune dangling idrefs
  4. Strip encryption.xml and embedded fonts (font entries + OPF manifest cleanup)
  5. Downscale images >1600px long edge (reuse existing transform pipeline)
  6. Remove remote resource references from XHTML
  7. Wire into SendBook for .epub files; graceful fallback to original on ErrEPUBNotNormalizable (PERMANENT — no retry)

Not in MVP scope: Full XHTML re-serialization for well-formedness, NCX generation, font subsetting (no pure-Go font subsetter exists).

Closes bead bookshelf-zurw on merge.

docs(spike): native Go EPUB normalizer for Send-to-Kindle (bookshelf-zurw)
All checks were successful
/ Hugo build (pull_request) Successful in 36s
/ JS Unit Tests (pull_request) Successful in 51s
/ E2E API (pull_request) Successful in 1m57s
/ Lint (pull_request) Successful in 2m22s
/ E2E Browser (pull_request) Successful in 2m46s
/ Integration (pull_request) Successful in 2m57s
/ Test (pull_request) Successful in 3m36s
6d630ba473
Spike findings: E999 is primarily caused by ZIP structure violations
(mimetype not STORED-first), OPF/container.xml defects, and oversized
images. Native Go normalizer is feasible using existing deps and
internal/files code. Minimum viable scope covers the top 3 root causes
without new dependencies or cgo. Recommends one follow-up implementation
bead.
zombor closed this pull request 2026-06-20 12:52:13 +00:00

Reference
zombor/pergamum!661