CodeX 14fd0cf511 Simplify to whitelist/blacklist model
- Rewrite merge_blocklists.py to sync a single blacklist from upstream
  and subtract the locally-maintained whitelist
- Replace whitelist contents with subtitle + webm seed
- Remove blacklist_permissive, whitelist_with_subtitles, and all
  .prev files that are no longer needed
- Rewrite README to reflect the two-file model and link to wiki
2026-04-07 01:09:17 +02:00


# arr/blocklists
Curated blacklist and whitelist for the ARR media stack. The blacklist is
synced automatically from upstream Cleanuparr and stripped of anything
listed in the locally-maintained whitelist, so consumers like qBittorrent
and Cleanuparr can point at a single raw URL per list and stay in sync.
See the wiki for the full technical reference:
- [Sync](https://git.hisp.no/arr/blocklists/wiki/Sync) -- three-way merge, whitelist exclusion, `.prev` snapshot, edge cases
- [Lists](https://git.hisp.no/arr/blocklists/wiki/Lists) -- the two-file model, pattern semantics, maintaining the whitelist
- [Consumers](https://git.hisp.no/arr/blocklists/wiki/Consumers) -- qBittorrent and Cleanuparr integration, raw URLs, recommended modes
- [CI and Workflow](https://git.hisp.no/arr/blocklists/wiki/CI-and-Workflow) -- scheduled Gitea Actions job, manual dispatch, commit behaviour
## How it works
The repository contains two data files:
| File | Role | Source |
|---|---|---|
| `blacklist` | Extensions blocked by downloaders and file cleaners | Synced from upstream, with the whitelist subtracted |
| `whitelist` | Extensions that must never be blocked or deleted | Locally maintained |
On every scheduled run the sync script:
1. Fetches the current upstream blacklist from Cleanuparr.
2. Detects any manual additions made directly to `blacklist` (three-way
merge against `blacklist.prev`).
3. Subtracts every entry listed in `whitelist`.
4. Writes the result back to `blacklist` and updates `blacklist.prev`.
The whitelist is the single source of truth for "what I want kept." Adding
an extension to `whitelist` removes it from `blacklist` on the next sync
and prevents consumers from blocking or deleting it. See
[Sync](https://git.hisp.no/arr/blocklists/wiki/Sync) for the full algorithm.
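The four steps above can be sketched roughly as follows. This is an illustration only, not the contents of `scripts/merge_blocklists.py`: the function name `sync_blacklist` is hypothetical, and it assumes that "manual additions" means entries present in the current `blacklist` but absent from the `blacklist.prev` snapshot. The wiki describes the actual algorithm and its edge cases.

```python
def sync_blacklist(upstream, current, prev, whitelist):
    """Hypothetical sketch of one sync run.

    upstream  -- patterns just fetched from Cleanuparr
    current   -- patterns currently in `blacklist`
    prev      -- patterns in `blacklist.prev` (last upstream snapshot)
    whitelist -- patterns in `whitelist`

    Returns (new_blacklist, new_prev) as sorted lists.
    """
    upstream, current, prev, whitelist = map(set, (upstream, current, prev, whitelist))

    # Assumption: anything in the current file that the last upstream
    # snapshot did not contain was added by hand and should be kept.
    manual_additions = current - prev

    # Combine upstream with the manual additions, then drop every
    # entry that appears verbatim in the whitelist.
    merged = (upstream | manual_additions) - whitelist

    # The new snapshot is the raw upstream list, before subtraction.
    return sorted(merged), sorted(upstream)
```

In this sketch an entry dropped by upstream disappears from `blacklist` on the next run, while a hand-added entry survives until it is removed manually or added to `whitelist`.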
## Prerequisites
- A consumer that reads a remote text file of glob patterns (qBittorrent
excluded file names, Cleanuparr blacklist/whitelist sync, etc.)
- Network access from that consumer to `git.hisp.no`
## File structure
| Path | Purpose |
|---|---|
| `blacklist` | Merged output: upstream blacklist minus the whitelist. Consumer-facing |
| `blacklist.prev` | Snapshot of the last upstream fetch. Baseline for the three-way merge. Do not edit |
| `whitelist` | Locally-maintained allow list. Edit directly to add or remove entries |
| `scripts/merge_blocklists.py` | Sync script executed by the scheduled workflow |
| `.gitea/workflows/sync.yml` | Scheduled Gitea Actions workflow |
## Usage
Point your consumer at the raw URL of the file it should use.
### qBittorrent
qBittorrent has no whitelist feature, so it consumes the blacklist directly.
Set the excluded file names list (Options -> Downloads -> Excluded file
names) to:
```
https://git.hisp.no/arr/blocklists/raw/branch/main/blacklist
```
Because the whitelist is already subtracted from this file, any extension
you add to `whitelist` stops being blocked by qBittorrent on the next sync.
### Cleanuparr
Cleanuparr supports both blacklist and whitelist modes. Use whichever
matches your setup:
- **Blacklist mode** -- point at the same `blacklist` raw URL as qBittorrent.
- **Whitelist mode** -- point at the `whitelist` raw URL:
```
https://git.hisp.no/arr/blocklists/raw/branch/main/whitelist
```
See [Consumers](https://git.hisp.no/arr/blocklists/wiki/Consumers) for
the recommended mode per feature.
## Maintaining the whitelist
Edit `whitelist` directly in Gitea or via a local clone. One glob pattern
per line, sorted, no blank lines. Whitelist entries are removed from the
blacklist by exact-string set subtraction, not by glob matching:
- `*.srt` in `whitelist` removes `*.srt` from `blacklist`.
- `*sample.srt` in `blacklist` is not affected by `*.srt` in `whitelist`.
Sample-file patterns are preserved because exact-string subtraction only
removes identical entries.
See [Lists](https://git.hisp.no/arr/blocklists/wiki/Lists) for the full
pattern rules and examples.
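A minimal sketch of the exact-string behaviour described above, plus a check for the "sorted, no blank lines" format. Both function names are hypothetical, and the check assumes "sorted" means plain code-point order; the sync script may sort differently.

```python
def subtract_whitelist(blacklist, whitelist):
    """Remove entries that appear verbatim in the whitelist.

    Patterns are compared as plain strings, so `*.srt` in the
    whitelist does not touch `*sample.srt` in the blacklist.
    """
    allowed = set(whitelist)
    return [pattern for pattern in blacklist if pattern not in allowed]


def whitelist_problems(text):
    """Return a list of formatting problems found in whitelist text."""
    lines = text.splitlines()
    problems = []
    if any(not line.strip() for line in lines):
        problems.append("blank line")
    # Assumption: entries are expected in plain code-point order.
    if lines != sorted(lines):
        problems.append("not sorted")
    return problems
```

For example, `subtract_whitelist(["*.srt", "*sample.srt"], ["*.srt"])` leaves `["*sample.srt"]` in place.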
## Sync schedule
The Gitea Actions workflow runs every 7 days at 04:00 UTC and on manual
dispatch. Each run:
1. Executes `scripts/merge_blocklists.py`.
2. Commits `blacklist` and `blacklist.prev` if either changed.
3. Pushes the commit to `main`.
See [CI and Workflow](https://git.hisp.no/arr/blocklists/wiki/CI-and-Workflow)
for workflow details and manual dispatch instructions.
## Upstream source
The blacklist is sourced from:
```
https://raw.githubusercontent.com/Cleanuparr/Cleanuparr/main/blacklist
```
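For reference, fetching and parsing this file could look like the sketch below. It assumes the upstream file is UTF-8 text with one pattern per line; `parse_patterns` and `fetch_upstream` are hypothetical names, not functions taken from `scripts/merge_blocklists.py`.

```python
from urllib.request import urlopen

UPSTREAM_URL = "https://raw.githubusercontent.com/Cleanuparr/Cleanuparr/main/blacklist"


def parse_patterns(text):
    """Split raw list text into a set of patterns, skipping blank lines."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def fetch_upstream(url=UPSTREAM_URL, timeout=30):
    """Download the upstream blacklist and return its patterns as a set."""
    with urlopen(url, timeout=timeout) as response:
        # Assumption: the file is served as UTF-8 text.
        return parse_patterns(response.read().decode("utf-8"))
```

Keeping the parsing separate from the download makes it possible to test the parser without network access.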