docs: document image pipeline profiles

Abel Luck 2026-05-27 10:13:06 +02:00
parent 18a7f652d4
commit cbb427b89d
6 changed files with 40 additions and 5 deletions

@@ -59,6 +59,17 @@ Operational notes:
- Mirrored feeds are written under `out/feeds/<slug>/`.
In production, expose `out/feeds/` directly from the reverse proxy at `/feeds/`.
- `Feed URL` is used to generate absolute media URLs and `atom:link rel="self"` in exported feeds.
- Image output is profile-driven. `REPUBLISHER_IMAGE` defines full-size
  variants; the first profile listed supplies the canonical image URL used
  when feed image URLs are rewritten.
- Default image profiles keep source bytes under `images/source/`, write
full-size variants under `images/full/`, and write thumbnail profiles from
`REPUBLISHER_IMAGE_THUMBNAILS` under `images/thumbs/`.
- Explicit item image media is exported as Media RSS image groups with named
thumbnails. Inline HTML images are mirrored and rewritten in content, but are
not promoted to item-level Media RSS.
- Image profile names and transform settings are part of generated filenames.
Reordering `REPUBLISHER_IMAGE` changes canonical feed image URLs.
- Job logs and stats artifacts are written under `out/logs/`.
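
The image profile notes above can be illustrated with a minimal sketch. The profile fields (`name`, `format`, `max_width`, `quality`), the filename layout, and the helpers `variant_path` and `canonical_image_path` are all hypothetical and chosen only for illustration; the real settings format and naming scheme may differ.

```python
import hashlib
from pathlib import PurePosixPath

# Hypothetical profile shape; the actual REPUBLISHER_IMAGE format may differ.
REPUBLISHER_IMAGE = [
    {"name": "full", "format": "jpg", "max_width": 1600, "quality": 85},
    {"name": "web", "format": "webp", "max_width": 1600, "quality": 80},
]
REPUBLISHER_IMAGE_THUMBNAILS = [
    {"name": "small", "format": "jpg", "max_width": 320, "quality": 75},
]


def variant_path(kind, source_url, profile):
    """Return a relative path whose filename embeds the profile name and settings."""
    # Short digest of the source URL keeps filenames stable per source image.
    digest = hashlib.sha1(source_url.encode("utf-8")).hexdigest()[:12]
    filename = (
        f"{digest}-{profile['name']}"
        f"-w{profile['max_width']}-q{profile['quality']}.{profile['format']}"
    )
    return PurePosixPath("images") / kind / filename


def canonical_image_path(source_url, profiles):
    """The first full-size profile determines the canonical image path."""
    return variant_path("full", source_url, profiles[0])


source = "https://origin.example/photo.png"  # placeholder URL
canonical = canonical_image_path(source, REPUBLISHER_IMAGE)
reordered = canonical_image_path(source, list(reversed(REPUBLISHER_IMAGE)))
thumb = variant_path("thumbs", source, REPUBLISHER_IMAGE_THUMBNAILS[0])
```

In this sketch, `canonical` and `reordered` differ because the first profile changed, which is why reordering `REPUBLISHER_IMAGE` changes the URLs that already-published feeds point to. Changing a profile's name or transform settings has the same effect on that profile's filenames.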
The legacy one-shot config-driven crawler is still available:
@@ -79,10 +90,9 @@ REPUBLISHER_FEED_URL = "https://mirror.example"
- [x] Offlines RSS feed xml
- [x] Downloads media and enclosures
- [x] Rewrites media urls
- [x] Image normalization (JPG, RGB)
- [x] Profile-driven image normalization, compression, and thumbnails
- [x] Audio transcoding
- [x] Video transcoding
- [ ] Image compression - Do we want this? -> DEFERRED for now
- [x] Download and rewrite media embedded in content/CDATA fields
- [x] Config file to drive the program
- [x] Add sqlite database and simple admin UI to replace config