# arr/blocklists

Curated blacklist and whitelist for the ARR media stack. The blacklist is
synced automatically from upstream Cleanuparr and stripped of anything
listed in the locally-maintained whitelist, so consumers like qBittorrent
and Cleanuparr can point at a single raw URL per list and stay in sync.

See the wiki for full technical reference:

- [Sync](https://git.hisp.no/arr/blocklists/wiki/Sync)
  -- three-way merge, whitelist exclusion, `.prev` snapshot, edge cases
- [Lists](https://git.hisp.no/arr/blocklists/wiki/Lists)
  -- the two-file model, pattern semantics, maintaining the whitelist
- [Consumers](https://git.hisp.no/arr/blocklists/wiki/Consumers)
  -- qBittorrent and Cleanuparr integration, raw URLs, recommended modes
- [CI and Workflow](https://git.hisp.no/arr/blocklists/wiki/CI-and-Workflow)
  -- scheduled Gitea Actions job, manual dispatch, commit behaviour

## How it works

The repository contains two data files:

| File | Role | Source |
|---|---|---|
| `blacklist` | Extensions blocked by downloaders and file cleaners | Synced from upstream, with the whitelist subtracted |
| `whitelist` | Extensions that must never be blocked or deleted | Locally maintained |

On every scheduled run the sync script:

1. Fetches the current upstream blacklist from Cleanuparr.
2. Detects any manual additions made directly to `blacklist` (three-way
   merge against `blacklist.prev`).
3. Subtracts every entry listed in `whitelist`.
4. Writes the result back to `blacklist` and updates `blacklist.prev`.

The whitelist is the single source of truth for "what I want kept." Adding
an extension to `whitelist` removes it from `blacklist` on the next sync
and prevents consumers from blocking or deleting it. See
[Sync](https://git.hisp.no/arr/blocklists/wiki/Sync) for the full algorithm.

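The four steps above can be sketched with plain set operations. This is a
minimal illustration, not the contents of `scripts/merge_blocklists.py`: the
names `parse_patterns` and `sync_blacklist` are hypothetical, the sample
patterns are made up, and the sketch assumes that a "manual addition" is any
entry present in `blacklist` but absent from `blacklist.prev`. The Sync wiki
page remains the reference for the real algorithm and its edge cases.

```python
def parse_patterns(text: str) -> set[str]:
    """Turn the text of a list file into a set of patterns, one per line."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def sync_blacklist(
    upstream: set[str],
    current: set[str],
    previous: set[str],
    whitelist: set[str],
) -> tuple[list[str], list[str]]:
    """Return (new blacklist, new blacklist.prev) as sorted lists.

    Assumption: entries in `current` that are missing from `previous`
    (the last upstream snapshot) were added by hand and should be kept.
    """
    manual_additions = current - previous               # step 2
    merged = (upstream | manual_additions) - whitelist  # step 3
    return sorted(merged), sorted(upstream)             # step 4


# Illustrative data only; these are not the real list contents.
upstream = parse_patterns("*.exe\n*.lnk\n*.srt\n")   # step 1, already fetched
previous = parse_patterns("*.exe\n*.srt\n")
current = parse_patterns("*.exe\n*.scr\n")           # "*.scr" was added by hand
whitelist = parse_patterns("*.srt\n")

new_blacklist, new_prev = sync_blacklist(upstream, current, previous, whitelist)
print(new_blacklist)  # ['*.exe', '*.lnk', '*.scr']
print(new_prev)       # ['*.exe', '*.lnk', '*.srt']
```

In this sketch `*.srt` stays in the snapshot but never reaches the published
blacklist, while the hand-added `*.scr` survives the upstream refresh.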
## Prerequisites

- A consumer that reads a remote text file of glob patterns (qBittorrent
  excluded file names, Cleanuparr blacklist/whitelist sync, etc.)
- Network access from that consumer to `git.hisp.no`

## File structure

| Path | Purpose |
|---|---|
| `blacklist` | Merged output: upstream blacklist minus the whitelist. Consumer-facing |
| `blacklist.prev` | Snapshot of the last upstream fetch. Baseline for the three-way merge. Do not edit |
| `whitelist` | Locally-maintained allow list. Edit directly to add or remove entries |
| `scripts/merge_blocklists.py` | Sync script executed by the scheduled workflow |
| `.gitea/workflows/sync.yml` | Scheduled Gitea Actions workflow |

## Usage

Point your consumer at the raw URL of the file it should use.

### qBittorrent

qBittorrent has no whitelist feature, so it consumes the blacklist directly.
Set the excluded file names list (Options -> Downloads -> Excluded file
names) to:

```
https://git.hisp.no/arr/blocklists/raw/branch/main/blacklist
```

Because the whitelist is already subtracted from this file, any extension
you add to `whitelist` stops being blocked by qBittorrent on the next sync.

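To make "consuming the blacklist" concrete, the sketch below shows what a
consumer might do with the file: download it, split it into patterns, and test
file names against them. It is an illustration only. The helper names
(`parse_patterns`, `fetch_patterns`, `is_excluded`) are hypothetical, the
sample patterns are made up, and it assumes shell-style glob matching as
implemented by Python's `fnmatch`; qBittorrent and Cleanuparr use their own
matchers, which may differ in details such as case sensitivity.

```python
from fnmatch import fnmatchcase
from urllib.request import urlopen

BLACKLIST_URL = "https://git.hisp.no/arr/blocklists/raw/branch/main/blacklist"


def parse_patterns(text: str) -> list[str]:
    """Split list text into patterns, one per line, skipping empty lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def fetch_patterns(url: str = BLACKLIST_URL, timeout: float = 10.0) -> list[str]:
    """Download a list from its raw URL (needs network access to the host)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_patterns(response.read().decode("utf-8"))


def is_excluded(filename: str, patterns: list[str]) -> bool:
    """Assumed semantics: a file is excluded if any glob pattern matches it."""
    return any(fnmatchcase(filename, pattern) for pattern in patterns)


# Offline demonstration with made-up patterns; a real consumer would call
# fetch_patterns() instead of parsing an inline string.
sample = parse_patterns("*.exe\n*sample.mkv\n")
print(is_excluded("setup.exe", sample))         # True
print(is_excluded("movie.mkv", sample))         # False
print(is_excluded("movie-sample.mkv", sample))  # True
```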
### Cleanuparr

Cleanuparr supports both blacklist and whitelist modes. Use whichever
matches your setup:

- **Blacklist mode** -- point at the same `blacklist` raw URL as qBittorrent.
- **Whitelist mode** -- point at the `whitelist` raw URL:

```
https://git.hisp.no/arr/blocklists/raw/branch/main/whitelist
```

See [Consumers](https://git.hisp.no/arr/blocklists/wiki/Consumers) for
recommended mode per feature.

## Maintaining the whitelist

Edit `whitelist` directly in Gitea or via a local clone. One glob pattern
per line, sorted, no blank lines. Patterns are matched against the blacklist
with exact-string set subtraction:

- `*.srt` in `whitelist` removes `*.srt` from `blacklist`.
- `*sample.srt` in `blacklist` is not affected by `*.srt` in `whitelist`.
  Sample-file patterns are preserved because exact-string subtraction only
  removes identical entries.

See [Lists](https://git.hisp.no/arr/blocklists/wiki/Lists) for the full
pattern rules and examples.

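The formatting rules and the exact-string behaviour described above can be
checked locally before committing. The following is a rough sketch: the
function names `lint_whitelist` and `subtract` are hypothetical, the `*.exe`
entry is a made-up example, and "sorted" is assumed to mean plain
lexicographic order of the raw lines, which may not match the ordering the
sync script actually produces.

```python
def lint_whitelist(text: str) -> list[str]:
    """Return a list of problems found in whitelist text (empty if clean)."""
    problems = []
    lines = text.splitlines()
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append(f"line {number}: blank line")
    # Assumption: "sorted" means lexicographic order of the raw lines.
    if lines != sorted(lines):
        problems.append("entries are not sorted")
    return problems


def subtract(blacklist: list[str], whitelist: list[str]) -> list[str]:
    """Exact-string set subtraction: only identical entries are removed."""
    return sorted(set(blacklist) - set(whitelist))


print(lint_whitelist("*.srt\n*.webm\n"))
# []
print(lint_whitelist("*.webm\n\n*.srt\n"))
# ['line 2: blank line', 'entries are not sorted']
print(subtract(["*.exe", "*.srt", "*sample.srt"], ["*.srt"]))
# ['*.exe', '*sample.srt']
```

The last call shows why `*sample.srt` survives: the whitelist entry `*.srt` is
compared as a string, not expanded as a glob.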
## Sync schedule

The Gitea Actions workflow runs every 7 days at 04:00 UTC and on manual
dispatch. Each run:

1. Executes `scripts/merge_blocklists.py`.
2. Commits `blacklist` and `blacklist.prev` if either changed.
3. Pushes the commit to `main`.

See [CI and Workflow](https://git.hisp.no/arr/blocklists/wiki/CI-and-Workflow)
for workflow details and manual dispatch instructions.

## Upstream source

The blacklist is sourced from:

```
https://raw.githubusercontent.com/Cleanuparr/Cleanuparr/main/blacklist
```