Splitting PDF into color and B/W pages
Use this command when you need two versions of the final PDF: one file with color pages only and one file with black-and-white pages only.
task pdf:split-color -- Куприянов_И221_диплом.pdf
docker compose --profile latex run --build --rm latex python3 scripts/split_pdf_color.py Куприянов_И221_диплом.pdf
For an input file named document.pdf, the script creates:
| File | Contents |
|---|---|
| document_color.pdf | Only pages where C, M, or Y coverage is above the threshold |
| document_bw.pdf | All remaining pages |
How color is detected
scripts/split_pdf_color.py runs Ghostscript with the inkcov device:
gs -q -dSAFER -dBATCH -dNOPAUSE -o - -sDEVICE=inkcov document.pdf
A page is considered color if the largest of its C, M, and Y coverage values is above the threshold. The default threshold is 0.00001, so pages with zero coverage or only small numeric noise in those channels are treated as black-and-white.
To use a different threshold:
task pdf:split-color -- document.pdf --threshold 0.001
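The detection rule above can be illustrated with a minimal sketch. This is not the code from scripts/split_pdf_color.py; the function name `classify_pages` and the sample text are hypothetical. It assumes the usual inkcov output of one line per page with four coverage values followed by `CMYK OK`.

```python
def classify_pages(inkcov_output: str, threshold: float = 0.00001):
    """Return (color_pages, bw_pages) as lists of 1-based page numbers."""
    color, bw = [], []
    page = 0
    for line in inkcov_output.splitlines():
        fields = line.split()
        # A page line looks like: "0.00000 0.00000 0.00000 0.02230 CMYK OK".
        # Anything else (warnings, blank lines) is skipped.
        if len(fields) < 5 or fields[4] != "CMYK":
            continue
        try:
            c, m, y, _k = (float(v) for v in fields[:4])
        except ValueError:
            continue
        page += 1
        # Black (K) coverage is ignored; only C, M, and Y decide the result.
        if max(c, m, y) > threshold:
            color.append(page)
        else:
            bw.append(page)
    return color, bw


# Hypothetical inkcov output for a three-page document.
sample = """\
 0.00000  0.00000  0.00000  0.05120 CMYK OK
 0.10022  0.09563  0.10071  0.06259 CMYK OK
 0.00000  0.00000  0.00000  0.00000 CMYK OK
"""

color_pages, bw_pages = classify_pages(sample)
print("color:", color_pages, "bw:", bw_pages)
```

Raising the threshold moves pages with faint color (for example, a slightly tinted scan) into the black-and-white file.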
Why qpdf is used
Ghostscript is useful for color analysis, but exporting through pdfwrite redraws pages and may change their rotation or geometry. Exporting is therefore done by qpdf, which copies selected pages from the source PDF without redrawing them.
Together, *_color.pdf and *_bw.pdf should contain exactly as many pages as the source PDF.
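As a rough sketch of the export step, the snippet below builds a qpdf command that copies selected pages from the source file. The helper names `page_ranges` and `qpdf_command` are hypothetical, and the actual script may call qpdf differently. The sketch relies on qpdf's `--empty --pages <file> <range> -- <output>` form.

```python
def page_ranges(pages):
    """Compress sorted 1-based page numbers into a range string like '1-3,7'."""
    parts = []
    start = prev = None
    for p in pages:
        if start is None:
            start = prev = p
        elif p == prev + 1:
            prev = p
        else:
            parts.append(str(start) if start == prev else f"{start}-{prev}")
            start = prev = p
    if start is not None:
        parts.append(str(start) if start == prev else f"{start}-{prev}")
    return ",".join(parts)


def qpdf_command(source, pages, output):
    """Build the argument list for copying `pages` from `source` into `output`.

    The caller should skip the export when `pages` is empty,
    because an empty page range is not a valid qpdf argument.
    """
    return ["qpdf", "--empty", "--pages", source, page_ranges(pages), "--", output]


cmd = qpdf_command("document.pdf", [2, 5, 6], "document_color.pdf")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Because every page goes to exactly one of the two outputs, the lengths of the color and black-and-white page lists should add up to the page count of the source.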
Dependencies
The Docker command uses the project LaTeX image, where Ghostscript and qpdf are already installed.
For a local run without Docker, install:
| Tool | Check |
|---|---|
| Ghostscript | gs --version |
| qpdf | qpdf --version |
Then run the script directly (this example uses uv):
uv run python scripts/split_pdf_color.py document.pdf
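Before a local run, the two tools from the table can also be checked from Python. This is an optional sketch, not part of the project script; `missing_tools` is a hypothetical helper that only tests whether the executables are on PATH.

```python
import shutil


def missing_tools(tools=("gs", "qpdf")):
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


missing = missing_tools()
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("Ghostscript and qpdf are available.")
```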