repub: support slugged feeds and imported TOML feed configs
parent 30b81934a8
commit 5a8162c876
9 changed files with 324 additions and 76 deletions
20
README.md
@@ -7,19 +7,22 @@ cat > repub.toml <<'EOF'
 out_dir = "out"
 
 [[feeds]]
-name = "gp-pod"
+name = "Guardian Project Podcast"
+slug = "gp-pod"
 url = "https://guardianproject.info/podcast/podcast.xml"
 
 [[feeds]]
-name = "nasa"
+name = "NASA Breaking News"
+slug = "nasa"
 url = "https://www.nasa.gov/rss/dyn/breaking_news.rss"
 EOF
 uv run repub --config repub.toml
 ```
 
 `out_dir` may be relative or absolute. Relative paths are resolved against the
-directory containing the config file. Optional Scrapy runtime overrides can be
-set in the same file:
+directory containing the config file. Each feed now needs a user-provided
+`slug`, which is used for output paths and filenames. Optional Scrapy runtime
+overrides can be set in the same file:
 
 ```toml
 [scrapy.settings]
@@ -27,6 +30,15 @@ LOG_LEVEL = "DEBUG"
 DOWNLOAD_TIMEOUT = 30
 ```
 
+Additional feed definitions can also be imported from one or more TOML files,
+including a `pygea`-generated `manifest.toml`:
+
+```toml
+feed_config_files = ["/absolute/path/to/pygea/feed/manifest.toml"]
+```
+
+Imported files only need `[[feeds]]` entries with `name`, `slug`, and `url`.
+
 See [`demo/README.md`](/home/abel/src/guardianproject/anynews/republisher-redux/demo/README.md) for a self-contained example config.
 
 ## TODO