Mirroring (republishing) news media content for censorship circumvention

republisher-redux

nix develop
uv sync --all-groups
cat > repub.toml <<'EOF'
out_dir = "out"

[[feeds]]
name = "gp-pod"
url = "https://guardianproject.info/podcast/podcast.xml"

[[feeds]]
name = "nasa"
url = "https://www.nasa.gov/rss/dyn/breaking_news.rss"
EOF
uv run repub --config repub.toml

out_dir may be relative or absolute. Relative paths are resolved against the directory containing the config file. Optional Scrapy runtime overrides can be set in the same file:

[scrapy.settings]
LOG_LEVEL = "DEBUG"
DOWNLOAD_TIMEOUT = 30

See demo/README.md for a self-contained example config.

TODO

  • Saves RSS feed XML for offline use
  • Downloads media and enclosures
  • Rewrites media URLs
  • Image normalization (JPG, RGB)
  • Audio transcoding
  • Video transcoding
  • Image compression - Do we want this?
  • Download and rewrite media embedded in content/CDATA fields
  • Config file to drive the program
  • Daemonize the program
  • Operationalize with metrics and error reporting
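
As an illustration of the "rewrites media URLs" item, the sketch below points each RSS `<enclosure>` at a mirrored copy of the file. It is an assumption about the general technique, not the project's actual pipeline: `rewrite_enclosures` and the `media_base` parameter are hypothetical names, and it uses only the Python standard library rather than Scrapy.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath
from urllib.parse import urlparse


def rewrite_enclosures(rss_xml, media_base):
    """Hypothetical helper: point every <enclosure> at a mirrored file.

    media_base is the URL or relative path under which the downloaded
    media files are assumed to be served.
    """
    root = ET.fromstring(rss_xml)
    for enclosure in root.iter("enclosure"):
        url = enclosure.get("url")
        if not url:
            continue
        # Keep only the file name from the original URL path.
        filename = PurePosixPath(urlparse(url).path).name
        enclosure.set("url", f"{media_base.rstrip('/')}/{filename}")
    return ET.tostring(root, encoding="unicode")
```

A real mirror would also need to handle file name collisions and media referenced outside `<enclosure>` elements, which this sketch ignores.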

License

republisher-redux, a tool to mirror RSS/Atom feeds completely offline

Copyright (C) 2024 Abel Luck

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see https://www.gnu.org/licenses/.