AnyNews Republisher
The AnyNews Republisher is a tool for mirroring news content to alternative distribution points, either to circumvent censorship or to reach communities facing high Internet costs, slow or limited access, or natural disasters.
The organization with the original news content is the "publisher".
The AnyNews Republisher is managed through a local web UI. Sources, schedules, and job executions are stored in SQLite. On each scheduled interval, the Republisher crawls the configured sources and mirrors their content (text and media) into an offline RSS feed.
The AnyNews app can then be configured to use this mirror (or more than one such mirror).
The Republisher currently accepts the following source input types:
- RSS and Atom feeds
- Pangea sources via pygea
Usage
Sync dependencies and start the admin UI:
uv sync --all-groups
uv run repub
With no arguments, uv run repub starts the web UI in local dev mode and serves published feed files at /feeds/... from out/feeds/....
By default the UI listens on 127.0.0.1:8080. You can override that with REPUBLISHER_HOST and REPUBLISHER_PORT, or with:
uv run repub serve --host 0.0.0.0 --port 8080
If you invoke the serve subcommand explicitly, use --dev-mode to expose published feeds directly from the Quart app:
uv run repub serve --dev-mode
In --dev-mode, requests under /feeds/... are served from out/feeds/....
In production, do not rely on Quart to serve published feeds. Configure the reverse proxy to serve out/feeds/... directly at /feeds/....
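The mapping from /feeds/... to out/feeds/... can be sketched as follows, whether the dev-mode app or a reverse proxy performs it. This is an illustrative sketch, not the Republisher's own code: the function name feed_file_for and the file name feed.rss in the usage note are assumptions, and the path-traversal guard is something any server exposing this directory would likely want.

```python
from pathlib import Path
from typing import Optional

FEEDS_ROOT = Path("out/feeds")  # output directory named in this README
URL_PREFIX = "/feeds/"          # public URL prefix named in this README


def feed_file_for(request_path: str, root: Path = FEEDS_ROOT) -> Optional[Path]:
    """Map a /feeds/... request path to a file under root (hypothetical helper).

    Returns None when the path is outside the /feeds/ prefix or would
    escape the feeds directory.
    """
    if not request_path.startswith(URL_PREFIX):
        return None
    relative = request_path[len(URL_PREFIX):]
    candidate = (root / relative).resolve()
    # Reject anything that resolves outside the feeds directory, e.g. via "..".
    if not candidate.is_relative_to(root.resolve()):
        return None
    return candidate
```

For example, feed_file_for("/feeds/demo/feed.rss") resolves to out/feeds/demo/feed.rss under the current working directory, while a path containing enough ".." segments to leave out/feeds/ yields None.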
Important: the admin UI has no built-in authentication. Keep it bound to localhost or put it behind a trusted network layer such as Tailscale.
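Because the UI has no authentication, it can help to check the bind address before exposing it. The sketch below is a minimal pre-flight check under stated assumptions: it reads REPUBLISHER_HOST and REPUBLISHER_PORT with the defaults documented above, and the helper names resolve_bind_address and is_loopback are hypothetical, not part of the Republisher. It does not model how the --host and --port flags interact with the environment variables.

```python
import ipaddress
import os


def resolve_bind_address(env=None):
    """Return (host, port) from the environment, using the documented defaults."""
    env = os.environ if env is None else env
    host = env.get("REPUBLISHER_HOST", "127.0.0.1")
    port = int(env.get("REPUBLISHER_PORT", "8080"))
    return host, port


def is_loopback(host):
    """Return True if host is 'localhost' or a loopback IP literal."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (for example a DNS name): not provably local.
        return False
```

A deployment script could call is_loopback(resolve_bind_address()[0]) and refuse to start, or print a warning, when the result is False and no trusted network layer is in place.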
Once the UI is running:
- Open http://127.0.0.1:8080/.
- Create a source. Feed sources take a feed URL. Pangea sources take a domain plus category configuration.
- Open Settings and set Feed URL to the public origin that serves mirrored feeds, for example https://mirror.example.
- Configure the job schedule and any spider arguments.
- Use Run now to trigger an immediate crawl, or leave the job enabled for scheduled runs.
- Watch running jobs and logs live from the Runs pages.
Operational notes:
- The default database path is republisher.db. Set REPUBLISHER_DB_PATH to use a different SQLite file.
- Mirrored feeds are written under out/feeds/<slug>/. In production, expose out/feeds/ directly from the reverse proxy at /feeds/.
- Feed URL is used to generate absolute media URLs and atom:link rel="self" in exported feeds.
- Job logs and stats artifacts are written under out/logs/.
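To illustrate how the Feed URL setting and the out/feeds/<slug>/ layout might combine into an absolute URL, here is a small sketch. The exact URL layout is an assumption inferred from the notes above (public origin, then /feeds/, then the slug and file path), and public_url is a hypothetical helper rather than the Republisher's own function.

```python
from urllib.parse import quote


def public_url(feed_url: str, slug: str, relative_path: str) -> str:
    """Build the absolute URL for a file stored at out/feeds/<slug>/<relative_path>.

    Assumes out/feeds/ is exposed at <feed_url>/feeds/.
    """
    origin = feed_url.rstrip("/")
    # Percent-encode each path segment separately and drop empty segments.
    segments = [quote(part) for part in (slug, *relative_path.split("/")) if part]
    return f"{origin}/feeds/{'/'.join(segments)}"
```

With Feed URL set to https://mirror.example, a file at out/feeds/demo/images/a b.jpg would map to https://mirror.example/feeds/demo/images/a%20b.jpg under this assumption.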
The legacy one-shot config-driven crawler is still available:
uv run repub crawl -c repub.toml
For config-driven crawls, set the public feed origin in scrapy.settings.REPUBLISHER_FEED_URL:
[scrapy.settings]
REPUBLISHER_FEED_URL = "https://mirror.example"
Roadmap
- Takes RSS feed XML offline
- Downloads media and enclosures
- Rewrites media URLs
- Image normalization (JPG, RGB)
- Audio transcoding
- Video transcoding
- Image compression - Do we want this? -> DEFERRED for now
- Download and rewrite media embedded in content/CDATA fields
- Config file to drive the program
- Add SQLite database and simple admin UI to replace config
- Integrate pygea as input source
- Operationalize with metrics and error reporting
License
republisher, a tool to mirror RSS/ATOM feeds completely offline
Copyright (C) 2024-2026 Abel Luck
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see https://www.gnu.org/licenses/.