Fix feed validation output

2026-03-31 12:14:47 +02:00
commit db1d9b44b7
parent c834c3c254
13 changed files with 477 additions and 54 deletions
--- a/README.md
+++ b/README.md
@@ -48,15 +48,17 @@ Once the UI is running:

 1. Open `http://127.0.0.1:8080/`.
 2. Create a source. Feed sources take a feed URL. Pangea sources take a domain plus category configuration.
-3. Configure the job schedule and any spider arguments.
-4. Use `Run now` to trigger an immediate crawl, or leave the job enabled for scheduled runs.
-5. Watch running jobs and logs live from the Runs pages.
+3. Open `Settings` and set `Feed URL` to the public origin that serves mirrored feeds, for example `https://mirror.example`.
+4. Configure the job schedule and any spider arguments.
+5. Use `Run now` to trigger an immediate crawl, or leave the job enabled for scheduled runs.
+6. Watch running jobs and logs live from the Runs pages.

 Operational notes:

 - The default database path is `republisher.db`. Set `REPUBLISHER_DB_PATH` to use a different SQLite file.
 - Mirrored feeds are written under `out/feeds/<slug>/`.
  In production, expose `out/feeds/` directly from the reverse proxy at `/feeds/`.
+- `Feed URL` is used to generate absolute media URLs and `atom:link rel="self"` in exported feeds.
 - Job logs and stats artifacts are written under `out/logs/`.

 The legacy one-shot config-driven crawler is still available:
@@ -65,6 +67,13 @@ The legacy one-shot config-driven crawler is still available:
 uv run repub crawl -c repub.toml
 ```

+For config-driven crawls, set the public feed origin in `scrapy.settings.REPUBLISHER_FEED_URL`:
+
+```toml
+[scrapy.settings]
+REPUBLISHER_FEED_URL = "https://mirror.example"
+```
+
 ## Roadmap

 - [x] Offlines RSS feed xml