feat(ci): nightly load test + loadtest-ci target (bookshelf-hj9.3) #601
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "bd-bookshelf-hj9.3"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Adds nightly load test workflow (.forgejo/workflows/loadtest.yml) + loadtest-ci Makefile target. Zero impact on normal PR pipeline — nightly only. Closes bead bookshelf-hj9.3 on merge.
Code Review: bookshelf-hj9.3 (PR #601)
Phase 0: DEMO Verification
No DEMO block is required for a workflow-only PR (the "demo" is the nightly run itself). CI green confirmed per bead comment. Proceeding to spec compliance.
Phase 1: Spec Compliance
ci.yml diff is empty — confirmed via
git diff main...origin/bd-bookshelf-hj9.3 -- .forgejo/workflows/ci.ymlproducing no output. The per-PR pipeline is fully unaffected.Trigger gating:
on: schedule(02:00 UTC) +workflow_dispatchonly. Nopushorpull_requesttriggers. Heavy job is correctly isolated from per-PR CI.Thresholds match the Scale targets from project-conventions.md (
GET /booksp99 < 100ms,GET /books/{id}p99 < 30ms). k6 exit code propagation causes the nightly job itself to fail on violations — visible, not silently swallowed.Phase 2: Code Quality
[MAJOR] .forgejo/workflows/loadtest.yml:183-218 —
set -eaborts k6 step on first threshold failure; second script never runsThe "Run k6 load tests" step opens with
set -e. The two k6 invocations are:Under
set -e, ifbooks_list.jsexits non-zero (threshold violation), the shell exits immediately after the k6 process — the;does not protect the assignment.BOOKS_LIST_EXITis never set, and the entire step aborts, sobook_detail.jsnever runs. The reporting step (withif: always()) still fires, but only one test was measured. On abooks_listregression the report showsGET /books= FAIL (via the:-1default) andGET /books/{id}= FAIL (also:-1), even though the detail endpoint may actually be healthy.Fix: capture each exit code without relying on
set -eabort semantics:Or run both with explicit
|| trueand capture$?after each, keepingset -efor other commands.[MAJOR] .forgejo/workflows/loadtest.yml:276-284 —
set -e+ uncheckedcurl POSTcan fail the job for non-performance reasonsThe "Parse results and post bead comment" step also runs under
set -e. Thecurl -sk -X POST ... > /dev/nullat line ~276 does not have|| true. If the Forgejo API returns a non-2xx status or a network error causes curl to exit non-zero,set -efires and the step exits before reaching the intendedexit $EXIT_CODE. The workflow job then fails with a misleading non-zero status unrelated to load-test results. A transient API blip makes it look like a performance regression.Fix: append
|| echo "Warning: failed to post comment"to the curl POST line so network failures are logged but do not abort the step.[MINOR] .forgejo/workflows/loadtest.yml:248-261 — issue search is fragile (state=open, limit=50)
The ISSUE_NUM lookup fetches
state=open&limit=50and matches on title containing "hj9" or "load test". If the hj9 epic bead is ever closed (status=done) before the nightly job is retired, the search returns no match and results only print to the workflow log — silently losing the comment history. Also, limit=50 could miss the bead if there are many open issues and the bead sorts beyond position 50.Fix (minor): add
state=allinstead ofstate=open, and increaselimitor paginate. Or hardcode the known issue number once resolved.[MINOR] .forgejo/workflows/loadtest.yml:107-115 — k6 downloaded from github.com with no checksum verification
The k6 binary is fetched from
github.com/grafana/k6/releases/download/v0.54.0/...with no SHA-256 checksum check. A supply-chain compromise or GitHub outage silently gives a bad binary or fails the job. For a nightly job this is low severity but worth noting.Fix: add a
sha256sum --checkstep after the download using the known checksum from the k6 release page.[MINOR] Makefile:179-192 —
loadtest-cifirst script abort prevents second from runningThe
loadtest-citarget uses@echo ... && k6 run ... loadtest/books_list.jsfor the first script. If the first k6 run exits non-zero,&&short-circuits and Make aborts the recipe. The secondbook_detail.jsrun is skipped. This only affects localmake loadtest-ciinvocations (the workflow calls k6 directly), but it is still inconsistent with the expected behavior.Fix: run each k6 invocation as its own recipe line without
&&, capturing exit codes explicitly, then exit non-zero at the end if either failed.Summary
set -eaborts k6 step; second script skipped on first failurecurl POSTwithout|| true; API blip fails job with wrong exit reasonloadtest-ciMake target: first-script abort skips secondREVIEW VERDICT: 0 blocker, 2 major, 3 minor
Security Review — PR #601 (bd-bookshelf-hj9.3)
Scope:
.forgejo/workflows/loadtest.yml+Makefileloadtest-citarget. CI/infra only, no production code.Findings
No security issues found.
Token handling —
FORGEJO_TOKENis${{ secrets.GITHUB_TOKEN }}, the built-in least-privilege Forgejo token. It is never echoed into logs (curl -sk … > /dev/null).REGISTRY_TOKENis a pull-only registry credential used only for container image auth. Neither secret is printed or stored anywhere reachable.Workflow injection —
workflow_dispatchinputs (seed_count,vus,duration) are assigned to job-level env vars (SEED_COUNT,LOADTEST_VUS,LOADTEST_DURATION) via the expression layer and then used as double-quoted shell variables ("${LOADTEST_DURATION}"). They are never interpolated directly into arun:script via${{ … }}, so there is no shell-injection vector. They also appear in the markdown$BODYstring, which is piped throughpython3 … json.dumpsbefore being sent to the API — any injected content becomes a literal string in the JSON body, not executable.REF_NAME(github.ref_name) ends up in$BODYas well; sinceworkflow_dispatchis restricted to users with write access on a private repo, a branch-name injection would require repo write access — already trusted. Nopull_request/pull_request_targettrigger exists.Trigger scope —
schedule+workflow_dispatchonly. Nopull_requesttrigger (which would allow untrusted forks to run with elevated permissions).workflow_dispatchrequires repo write access in Forgejo. No elevation of privilege path.App binding — the app starts on
:8080inside the runner container (not a cloud VM with a public IP); Forgejo Actions runners are isolated containers. The password (Loadtest_P4ssword!) is a hardcoded ephemeral CI credential scoped to this throwaway DB — not a production secret. No credentials are bound to external-facing surfaces.ISSUE_NUM URL construction — derived from
i["number"]in Python, which is an integer parsed from the Forgejo API JSON response (trusted source). No user-controlled string is interpolated into the curl URL path.k6 binary — downloaded from
github.com/grafana/k6/releasesat a pinned version (v0.54.0); nolatestalias. No SHA256 pin, but this is a minor hardening gap (not a blocking finding for an internal CI tool with no production-code impact).REVIEW VERDICT: 0 blocker, 0 major, 0 minor
Code Re-Review: bookshelf-hj9.3 (PR #601) — Fix Verification
Per-finding verification against fix commit
02df05f1[MAJOR #1]
.forgejo/workflows/loadtest.yml:183-218—set -eaborts k6 step on first threshold failureNOT RESOLVED — new bug introduced.
The fix uses
k6 run ... || true; BOOKS_LIST_EXIT=$?(loadtest.yml:205) and the identical pattern forBOOK_DETAIL_EXIT(line 218). However,|| truemakes the compound command always exit 0, so$?is always 0 regardless of k6's actual exit code. Verified via shell semantics:bash -c 'false || true; E=$?; echo $E'prints0.Effect: threshold violations are silently swallowed.
BOOKS_LIST_EXITandBOOK_DETAIL_EXITare always 0, the reporting step always shows PASS, andexit $EXIT_CODEat the end is always a no-op. The nightly job will never go red on a real performance regression.The same defective pattern appears in the Makefile
loadtest-citarget (lines|| true ; \ BOOKS_LIST_EXIT=$$?).$$?is always 0 there too.Correct fix patterns:
Option A —
set +e/set -ebracket:Option B — inline without
set -e:[MAJOR #2]
.forgejo/workflows/loadtest.yml:276-284—curl POSTwithout|| truefails job for non-perf reasonsRESOLVED. Line 336 now has
|| echo "Warning: failed to post loadtest comment".[MINOR #3]
.forgejo/workflows/loadtest.yml:248-261— issue searchstate=open+limit=50RESOLVED. Line 319 uses
state=all&limit=200.[MINOR #4]
.forgejo/workflows/loadtest.yml:107-115— k6 download with no checksumRESOLVED. Lines 121-123 hardcode
K6_SHA256and pipe tosha256sum --check.[MINOR #5]
Makefile:179-192—loadtest-ci&&short-circuit skips second scriptPARTIALLY RESOLVED. The
&&is gone — both scripts always run. However, the same|| true; BOOKS_LIST_EXIT=$$?bug from Major #1 applies here too: both exit codes are always 0, soexit $$EXITis always a no-op. Both scripts run (the specific minor finding is fixed), but exit code propagation is broken.ci.yml unchanged: confirmed —
git diff origin/main...origin/bd-bookshelf-hj9.3 -- .forgejo/workflows/ci.ymlis empty.No new issues other than the broken exit-code capture noted above.
Verdict
REVIEW VERDICT: 0 blocker, 1 major, 0 minor
The exit-code capture regression (
|| true; VAR=$?always yields VAR=0) means threshold violations are permanently silenced — the nightly job's core purpose (alerting on regressions) is non-functional. This is a [MAJOR] that must be fixed before merge.CODE REVIEW: APPROVED
Phase 0: DEMO Verification
No new DEMO block required for this targeted re-fix. Prior review (comment 7113) verified DEMO output; this re-review is scoped to the single exit-capture regression fix.
Phase 1: Exit-capture fix — .forgejo/workflows/loadtest.yml (lines 195-220)
set +eat line 195, BEFORE the first k6 invocation.BOOKS_LIST_EXIT=$?captured immediately at line 206, directly after the first k6 run. NO|| truein between.|| true.BOOK_DETAIL_EXIT=$?captured immediately at line 219.set -erestored at line 220, right after both captures.exit $EXIT_CODEpropagates failure correctly.The prior MAJOR (exit always 0 due to
|| true; EXIT=$?) is fully resolved.Phase 2: Makefile loadtest-ci target (lines 176-204)
set +eat the top.BOOKS_LIST_EXIT=$$?captured immediately after the first k6 run, no|| true.BOOK_DETAIL_EXIT=$$?captured immediately after the second k6 run, no|| true.set -erestored, then rollup intoEXITandexit $$EXIT. Correct.Phase 3: Previously-resolved findings still intact
|| echowarning: line 338, intact.state=allissue search: line 321, intact.No new issues found.
REVIEW VERDICT: 0 blocker, 0 major, 0 minor
- CheckBookAccess now returns ErrForbidden (403) for content-restricted books, distinct from ErrNotFound (404) for library-access failures. Per-spec: 403 content-blocked vs 404 not-in-library. - GetAdminUserContentRestrictions: fetch user_content_restriction rows for display in the admin user-edit form. - SetAdminUserContentRestrictions: atomically replace all restrictions in a transaction (delete + insert). - Admin user form template updated with a content-restrictions table: parallel arrays restriction_type[]/restriction_mode[]/restriction_value[] submitted on POST /admin/users/{id}. JS <template> for add-row. No inline style= (CSP-compliant). - contentRestrictionsFromForm parses the parallel-array form fields, skipping invalid type/mode/empty-value rows. - Wire in users.Wire() using d.Conn for the transaction adapter. - 100% coverage gate passes; all existing tests updated. Closes bead bookshelf-4op.10 on merge.5df5a28eb671bb62b823