CodeX 3a664675b5 Document supported pattern syntax in README
Add a Pattern syntax section listing the five pattern forms accepted
by qBittorrent's excluded file names (suffix, prefix, contains, exact,
regex), which are the same forms that the entries in blacklist and
whitelist must use. Note that the merge script subtracts whitelist
from blacklist as exact strings regardless of pattern form, so
equivalent patterns written differently are not interchangeable.
2026-04-07 21:45:35 +02:00

# arr/blocklists
Curated blacklist and whitelist for the ARR media stack. The blacklist is
synced automatically from upstream Cleanuparr and stripped of anything
listed in the locally-maintained whitelist. Both files are consumed by
Cleanuparr -- qBittorrent does not read either URL directly; Cleanuparr's
Blocklist Sync feature pushes patterns to qBittorrent on its behalf.
See the wiki for full technical reference:
- [Sync](https://git.hisp.no/arr/blocklists/wiki/Sync)
-- three-way merge, whitelist exclusion, `.prev` snapshot, edge cases
- [Lists](https://git.hisp.no/arr/blocklists/wiki/Lists)
-- the two-file model, pattern semantics, maintaining the whitelist
- [Consumers](https://git.hisp.no/arr/blocklists/wiki/Consumers)
-- Cleanuparr Blocklist Sync and Malware Blocker, raw URLs, recommended modes
- [CI and Workflow](https://git.hisp.no/arr/blocklists/wiki/CI-and-Workflow)
-- scheduled Gitea Actions job, manual dispatch, commit behaviour
## How it works
The repository contains two data files:
| File | Role | Source |
|---|---|---|
| `blacklist` | Extensions blocked by downloaders and file cleaners | Synced from upstream, with the whitelist subtracted |
| `whitelist` | Extensions that must never be blocked or deleted | Locally maintained |
On every scheduled run the sync script:
1. Fetches the current upstream blacklist from Cleanuparr.
2. Detects any manual additions made directly to `blacklist` (three-way
merge against `blacklist.prev`).
3. Subtracts every entry listed in `whitelist`.
4. Writes the result back to `blacklist` and updates `blacklist.prev`.
The whitelist is the single source of truth for "what I want kept." Adding
an extension to `whitelist` removes it from `blacklist` on the next sync
and prevents consumers from blocking or deleting it. See
[Sync](https://git.hisp.no/arr/blocklists/wiki/Sync) for the full algorithm.
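The four steps above can be sketched with plain set operations. This is a minimal illustration, not the contents of `scripts/merge_blocklists.py`: the function name `merge_blocklists` and its parameters are hypothetical, and it assumes that a manual addition is any line present in `blacklist` but absent from the `blacklist.prev` snapshot.

```python
def merge_blocklists(upstream, previous_upstream, current_blacklist, whitelist):
    """Return (new_blacklist, new_prev) as sorted lists of pattern strings.

    Assumption: a line that is in the local blacklist but not in the last
    upstream snapshot was added by hand and should be carried forward.
    """
    upstream_set = set(upstream)
    # Step 2: detect manual additions relative to the previous snapshot.
    manual = set(current_blacklist) - set(previous_upstream)
    # Step 3: subtract the whitelist as exact strings.
    merged = (upstream_set | manual) - set(whitelist)
    # Step 4: the new snapshot is the fresh upstream fetch.
    return sorted(merged), sorted(upstream_set)


# Example run: upstream gained "*.lnk", "*.srt" is whitelisted, and
# "*tracker-junk.txt" was added by hand to the local blacklist.
new_blacklist, new_prev = merge_blocklists(
    upstream=["*.exe", "*.srt", "*.lnk"],
    previous_upstream=["*.exe", "*.srt"],
    current_blacklist=["*.exe", "*tracker-junk.txt"],
    whitelist=["*.srt"],
)
print(new_blacklist)  # ['*.exe', '*.lnk', '*tracker-junk.txt']
```

Under this assumption, a line deleted by hand from `blacklist` comes back on the next run as long as upstream still lists it, because the merged set always starts from the fresh upstream fetch.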
## Prerequisites
- A running [Cleanuparr](https://github.com/Cleanuparr/Cleanuparr) instance
with Blocklist Sync and/or Malware Blocker configured
- Network access from Cleanuparr to `git.hisp.no`
- For Blocklist Sync: a qBittorrent instance reachable from Cleanuparr's
qBittorrent connection settings
## File structure
| Path | Purpose |
|---|---|
| `blacklist` | Merged output: upstream blacklist minus the whitelist. Consumer-facing |
| `blacklist.prev` | Snapshot of the last upstream fetch. Baseline for the three-way merge. Do not edit |
| `whitelist` | Locally-maintained allow list. Edit directly to add or remove entries |
| `scripts/merge_blocklists.py` | Sync script executed by the scheduled workflow |
| `.gitea/workflows/sync.yml` | Scheduled Gitea Actions workflow |
## Usage
Both files are consumed by Cleanuparr through two independent features.
qBittorrent itself does not read any URL from this repository.
### Blocklist Sync (push to qBittorrent)
Cleanuparr's **Blocklist Sync** fetches a blocklist URL on its configured
schedule and pushes the full pattern set into qBittorrent's excluded file
names via the qBittorrent Web API. Only blocklists are supported -- there
is no whitelist mode for this feature. Point Blocklist Sync at:
```
https://git.hisp.no/arr/blocklists/raw/branch/main/blacklist
```
Because the locally-maintained whitelist is already subtracted from this
file, any extension you add to `whitelist` stops being pushed to
qBittorrent on the next sync.
### Malware Blocker (Sonarr/Radarr queue)
Cleanuparr's **Malware Blocker** inspects torrents in the Sonarr and
Radarr queues and removes ones that violate its rules. It supports both
blacklist and whitelist modes. This repository is configured for
**whitelist mode**:
```
https://git.hisp.no/arr/blocklists/raw/branch/main/whitelist
```
In whitelist mode, only torrents whose contained files all have
whitelisted extensions are kept. Everything else is removed from the queue.
See [Consumers](https://git.hisp.no/arr/blocklists/wiki/Consumers) for the
full configuration details and troubleshooting.
## Maintaining the whitelist
Edit `whitelist` directly in Gitea or via a local clone. One pattern per
line, sorted, no blank lines. The sync script removes whitelist entries from
the blacklist by exact-string set subtraction, not by pattern matching:
- `*.srt` in `whitelist` removes `*.srt` from `blacklist`.
- `*sample.srt` in `blacklist` is not affected by `*.srt` in `whitelist`.
Sample-file patterns are preserved because exact-string subtraction only
removes identical entries.
See [Lists](https://git.hisp.no/arr/blocklists/wiki/Lists) for the full
pattern rules and examples.
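Before committing a whitelist edit, the formatting rules above can be checked locally. The sketch below is a hypothetical helper, not part of this repository. It assumes plain code-point sort order, and it adds two checks that the rules above do not state (stray whitespace and duplicates), because either one could defeat exact-string subtraction or clutter the file.

```python
def check_list_file(text: str) -> list[str]:
    """Return a list of human-readable problems found in a list file."""
    problems = []
    lines = text.splitlines()
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append(f"line {number}: blank line")
        elif line != line.strip():
            # Assumption: surrounding whitespace is never intended, since
            # "*.srt " would not subtract "*.srt" from the blacklist.
            problems.append(f"line {number}: leading or trailing whitespace")
    # Assumption: "sorted" means Python's default string ordering.
    if lines != sorted(lines):
        problems.append("entries are not sorted")
    if len(set(lines)) != len(lines):
        problems.append("duplicate entries")
    return problems


print(check_list_file("*.nfo\n*.srt\n"))  # []
```

An empty result means the file follows the stated conventions under these assumptions.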
## Maintaining the blacklist
Most blacklist maintenance happens through the whitelist (above). To **add**
an entry that upstream Cleanuparr does not block -- a tracker-specific junk
file, a new malware extension, anything else upstream missed -- edit
`blacklist` directly in Gitea, add the line, and commit. The next sync run
detects the new line as a manual local addition and preserves it across
every subsequent sync.
To **remove** an entry, use `whitelist` instead. Deleting a line from
`blacklist` does not work as a removal mechanism: the next sync re-adds it
from upstream. Removals must go through the whitelist.
See [Lists](https://git.hisp.no/arr/blocklists/wiki/Lists) for the full
rules and worked examples.
## Pattern syntax
Each line in `blacklist` and `whitelist` is a pattern in one of the
following forms (the same forms that qBittorrent's excluded file names
accepts, since that is where blacklist patterns end up via Cleanuparr's
Blocklist Sync):
| Form | Example | Matches |
|---|---|---|
| `*example` | `*.srt` | File name ends with `example` |
| `example*` | `sample*` | File name starts with `example` |
| `*example*` | `*sample*` | File name contains `example` |
| `example` | `RARBG.txt` | File name is exactly `example` |
| `regex:<regex>` | `regex:.*\.srt$` | File name matches the regex |
The merge script subtracts the whitelist from the blacklist as **exact
strings**, not by pattern semantics. To remove a pattern from the
blacklist, the whitelist line must be byte-for-byte identical to the
blacklist line. `*.srt` in the whitelist strips `*.srt` from the
blacklist; it does **not** strip `regex:.*\.srt$` even though both
patterns describe the same files. See
[Lists](https://git.hisp.no/arr/blocklists/wiki/Lists) for the full rules.
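To make the difference between pattern semantics and string identity concrete, here is a small sketch that interprets the five forms in the table. It is an illustration only, not qBittorrent's or Cleanuparr's matching code: the function name `matches` is hypothetical, matching is assumed to be case-sensitive, and the regex form is assumed to use search semantics.

```python
import re


def matches(pattern: str, file_name: str) -> bool:
    """Interpret one list entry against a file name (illustrative only)."""
    if pattern.startswith("regex:"):
        # Assumption: the regex may match anywhere unless it is anchored.
        return re.search(pattern[len("regex:"):], file_name) is not None
    leading = pattern.startswith("*")
    trailing = pattern.endswith("*")
    if leading and trailing and len(pattern) >= 2:
        return pattern[1:-1] in file_name          # *example*
    if leading:
        return file_name.endswith(pattern[1:])     # *example
    if trailing:
        return file_name.startswith(pattern[:-1])  # example*
    return file_name == pattern                    # example


glob_form = "*.srt"
regex_form = r"regex:.*\.srt$"

# Both forms describe the same file...
print(matches(glob_form, "movie.srt"), matches(regex_form, "movie.srt"))  # True True

# ...but exact-string subtraction treats them as unrelated entries.
print({regex_form} - {glob_form} == {regex_form})  # True
```

This is why a whitelist entry has to copy the blacklist line character for character rather than describe the same set of files.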
## Sync schedule
The Gitea Actions workflow runs every 7 days at 04:00 UTC and on manual
dispatch. Each run:
1. Executes `scripts/merge_blocklists.py`.
2. Commits `blacklist` and `blacklist.prev` if either changed.
3. Pushes the commit to `main`.
See [CI and Workflow](https://git.hisp.no/arr/blocklists/wiki/CI-and-Workflow)
for workflow details and manual dispatch instructions.
## Upstream source
The blacklist is sourced from:
```
https://raw.githubusercontent.com/Cleanuparr/Cleanuparr/main/blacklist
```
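For reference, fetching and parsing that file needs only the standard library. The sketch below is not taken from `scripts/merge_blocklists.py`: the names `parse_list` and `fetch_upstream` are hypothetical, and it assumes the file is UTF-8 text with one pattern per line.

```python
from urllib.request import urlopen

UPSTREAM_URL = "https://raw.githubusercontent.com/Cleanuparr/Cleanuparr/main/blacklist"


def parse_list(text: str) -> set[str]:
    """Split fetched text into a set of non-empty lines."""
    return {line for line in text.splitlines() if line.strip()}


def fetch_upstream(url: str = UPSTREAM_URL, timeout: float = 30.0) -> set[str]:
    """Download the upstream list and return its entries as a set."""
    # Assumption: the response body is UTF-8 encoded plain text.
    with urlopen(url, timeout=timeout) as response:
        return parse_list(response.read().decode("utf-8"))
```

Calling `fetch_upstream()` requires network access to `raw.githubusercontent.com`; `parse_list` can be exercised offline on any string.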