img2pdf-0.6.1/CHANGES.rst
=======
CHANGES
=======
0.6.1 (2025-04-27)
------------------
- testsuite fixes
0.6.0 (2025-02-15)
------------------
- Add support for JBIG2 (generic coding)
- Add convert_to_docobject() broken out from convert()
- Add pil_get_dpi() broken out from get_imgmetadata()
0.5.1 (2023-11-26)
------------------
- no default ICC profile location for PDF/A-1b on Windows
- workaround for PNG input without dpi units but non-square dpi aspect ratio
0.5.0 (2023-10-28)
------------------
- support MIFF for 16 bit CMYK input
- accept pathlib.Path objects as input
- don't store RGB ICC profiles from bilevel or grayscale TIFF, PNG and JPEG
- thumbnails are no longer included by default and --include-thumbnails has to
be used if you want them
- support for pikepdf (>= 6.2.0)
0.4.4 (2022-04-07)
------------------
- --viewer-page-layout support for twopageright and twopageleft
- Add B and JB paper sizes
- support for pikepdf (>= 5.0.0) and Pillow (>= 9.1.0)
0.4.3 (2021-10-24)
------------------
- fix --viewer-initial-page (broken in last release)
0.4.2 (2021-10-11)
------------------
- add --rotation
- allow palette PNG images with ICC profile
- sort globbing result on windows
- convert 8-bit PNG alpha channels to /SMasks in PDF
- remove pdfrw from tests
0.4.1 (2021-05-09)
------------------
- support wildcards in paths on windows
- support MPO images
- fix page border computation
- use "img2pdf" logger instead of "root" logger
- add --from-file
0.4.0 (2020-08-07)
------------------
- replace --without-pdfrw by --engine=internal or --engine=pdfrw
- add pikepdf as additional rendering engine and add --engine=pikepdf
- support for creating PDF/A-1b compliant PDF using the --pdfa option
  (this also requires the presence of an ICC profile somewhere on the system)
- support for images with embedded ICC profile as input
- rewrite tests
   * use pytest via tox
   * use pikepdf instead of pdfrw
   * use imagemagick json output instead of identify -verbose
- format all code with black
0.3.6 (2020-04-30)
------------------
- fix tests for Fedora on arm64
0.3.5 (2020-04-28)
------------------
- remove all Python 2 support
- disable pdfrw by default
0.3.4 (2020-04-05)
------------------
- test.sh: replace imagemagick with custom python script to produce bit-by-bit
identical results on all architectures
- add --crop-border, --bleed-border, --trim-border and --art-border options
- first draft of a rudimentary tkinter gui (run with --gui)
0.3.3 (2019-01-07)
------------------
- restore basic support for Python 2
- also ship test.sh
- add legal and tabloid paper formats
- respect exif rotation tag
0.3.2 (2018-11-20)
------------------
- support big endian TIFF with lsb-to-msb FillOrder
- support multipage CCITT Group 4 TIFF
- also reject palette images with transparency
- support PNG images with 1, 2, 4 or 16 bits per sample
- support multipage TIFF with differently encoded images
- support CCITT Group4 TIFF without rows-per-strip
- add extensive test suite
0.3.1 (2018-08-04)
------------------
- Directly copy data from CCITT Group 4 encoded TIFF images into the PDF
container without re-encoding
0.3.0 (2018-06-18)
------------------
- Store non-jpeg images using PNG compression
- Support arbitrarily large pages via PDF /UserUnit field
- Disallow input with alpha channel as it cannot be preserved
- Add option --pillow-limit-break to support very large input
0.2.4 (2017-05-23)
------------------
- Restore support for Python 2.7
- Add support for PyPy
- Add support for testing using tox
0.2.3 (2017-01-20)
------------------
- version number bump for botched pypi upload...
0.2.2 (2017-01-20)
------------------
- automatic monochrome CCITT Group4 encoding via Pillow/libtiff
0.2.1 (2016-05-04)
------------------
- set img2pdf as /producer value
- support multi-frame images like multipage TIFF and animated GIF
- support for palette images like GIF
- support all colorspaces and imageformats known by PIL
- read horizontal and vertical dpi from JPEG2000 files
0.2.0 (2015-05-10)
------------------
- now Python3 only
- pep8 compliant code
- update my email to josch@mister-muffin.de
- move from github to gitlab.mister-muffin.de/josch/img2pdf
- use logging module
- add extensive test suite
- ability to read from standard input
- pdf writer:
   - make more compatible with the interface of pdfrw module
   - print floats that equal their integer conversion as integers
   - do not print trailing zeroes for floating point numbers
   - print more linebreaks
   - add binary string at beginning of PDF to indicate that the PDF
     contains binary data
   - handle datetime and unicode strings by using utf-16-be encoding
- new options (see --help for more details):
   - --without-pdfrw
   - --imgsize
   - --border
   - --fit
   - --auto-orient
   - --viewer-panes
   - --viewer-initial-page
   - --viewer-magnification
   - --viewer-page-layout
   - --viewer-fit-window
   - --viewer-center-window
   - --viewer-fullscreen
- remove short options for metadata command line arguments
- correctly encode and escape non-ascii metadata
- explicitly store date in UTC and allow parsing all date formats understood
  by dateutil and `date --date`
0.1.5 (2015-02-16)
------------------
- Enable support for CMYK images
- Rework test suite
- support file objects as input
0.1.4 (2015-01-21)
------------------
- add Python 3 support
- make output reproducible by sorting and --nodate option
0.1.3 (2014-11-10)
------------------
- Avoid leaking file descriptors
- Convert unrecognized colorspaces to RGB
0.1.1 (2014-09-07)
------------------
- allow running src/img2pdf.py standalone
- license change from GPL to LGPL
- Add pillow 2.4.0 support
- add options to specify pdf dimensions in points
0.1.0 (2014-03-14, unreleased)
------------------------------
- Initial PyPI release.
- Modified code to create proper package.
- Added tests.
- Added console script entry point.
img2pdf-0.6.1/LICENSE
GNU LESSER GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
Copyright (C) 2007 Free Software Foundation, Inc.
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
This version of the GNU Lesser General Public License incorporates
the terms and conditions of version 3 of the GNU General Public
License, supplemented by the additional permissions listed below.
0. Additional Definitions.
As used herein, "this License" refers to version 3 of the GNU Lesser
General Public License, and the "GNU GPL" refers to version 3 of the GNU
General Public License.
"The Library" refers to a covered work governed by this License,
other than an Application or a Combined Work as defined below.
An "Application" is any work that makes use of an interface provided
by the Library, but which is not otherwise based on the Library.
Defining a subclass of a class defined by the Library is deemed a mode
of using an interface provided by the Library.
A "Combined Work" is a work produced by combining or linking an
Application with the Library. The particular version of the Library
with which the Combined Work was made is also called the "Linked
Version".
The "Minimal Corresponding Source" for a Combined Work means the
Corresponding Source for the Combined Work, excluding any source code
for portions of the Combined Work that, considered in isolation, are
based on the Application, and not on the Linked Version.
The "Corresponding Application Code" for a Combined Work means the
object code and/or source code for the Application, including any data
and utility programs needed for reproducing the Combined Work from the
Application, but excluding the System Libraries of the Combined Work.
1. Exception to Section 3 of the GNU GPL.
You may convey a covered work under sections 3 and 4 of this License
without being bound by section 3 of the GNU GPL.
2. Conveying Modified Versions.
If you modify a copy of the Library, and, in your modifications, a
facility refers to a function or data to be supplied by an Application
that uses the facility (other than as an argument passed when the
facility is invoked), then you may convey a copy of the modified
version:
a) under this License, provided that you make a good faith effort to
ensure that, in the event an Application does not supply the
function or data, the facility still operates, and performs
whatever part of its purpose remains meaningful, or
b) under the GNU GPL, with none of the additional permissions of
this License applicable to that copy.
3. Object Code Incorporating Material from Library Header Files.
The object code form of an Application may incorporate material from
a header file that is part of the Library. You may convey such object
code under terms of your choice, provided that, if the incorporated
material is not limited to numerical parameters, data structure
layouts and accessors, or small macros, inline functions and templates
(ten or fewer lines in length), you do both of the following:
a) Give prominent notice with each copy of the object code that the
Library is used in it and that the Library and its use are
covered by this License.
b) Accompany the object code with a copy of the GNU GPL and this license
document.
4. Combined Works.
You may convey a Combined Work under terms of your choice that,
taken together, effectively do not restrict modification of the
portions of the Library contained in the Combined Work and reverse
engineering for debugging such modifications, if you also do each of
the following:
a) Give prominent notice with each copy of the Combined Work that
the Library is used in it and that the Library and its use are
covered by this License.
b) Accompany the Combined Work with a copy of the GNU GPL and this license
document.
c) For a Combined Work that displays copyright notices during
execution, include the copyright notice for the Library among
these notices, as well as a reference directing the user to the
copies of the GNU GPL and this license document.
d) Do one of the following:
0) Convey the Minimal Corresponding Source under the terms of this
License, and the Corresponding Application Code in a form
suitable for, and under terms that permit, the user to
recombine or relink the Application with a modified version of
the Linked Version to produce a modified Combined Work, in the
manner specified by section 6 of the GNU GPL for conveying
Corresponding Source.
1) Use a suitable shared library mechanism for linking with the
Library. A suitable mechanism is one that (a) uses at run time
a copy of the Library already present on the user's computer
system, and (b) will operate properly with a modified version
of the Library that is interface-compatible with the Linked
Version.
e) Provide Installation Information, but only if you would otherwise
be required to provide such information under section 6 of the
GNU GPL, and only to the extent that such information is
necessary to install and execute a modified version of the
Combined Work produced by recombining or relinking the
Application with a modified version of the Linked Version. (If
you use option 4d0, the Installation Information must accompany
the Minimal Corresponding Source and Corresponding Application
Code. If you use option 4d1, you must provide the Installation
Information in the manner specified by section 6 of the GNU GPL
for conveying Corresponding Source.)
5. Combined Libraries.
You may place library facilities that are a work based on the
Library side by side in a single library together with other library
facilities that are not Applications and are not covered by this
License, and convey such a combined library under terms of your
choice, if you do both of the following:
a) Accompany the combined library with a copy of the same work based
on the Library, uncombined with any other library facilities,
conveyed under the terms of this License.
b) Give prominent notice with the combined library that part of it
is a work based on the Library, and explaining where to find the
accompanying uncombined form of the same work.
6. Revised Versions of the GNU Lesser General Public License.
The Free Software Foundation may publish revised and/or new versions
of the GNU Lesser General Public License from time to time. Such new
versions will be similar in spirit to the present version, but may
differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the
Library as you received it specifies that a certain numbered version
of the GNU Lesser General Public License "or any later version"
applies to it, you have the option of following the terms and
conditions either of that published version or of any later version
published by the Free Software Foundation. If the Library as you
received it does not specify a version number of the GNU Lesser
General Public License, you may choose any version of the GNU Lesser
General Public License ever published by the Free Software Foundation.
If the Library as you received it specifies that a proxy can decide
whether future versions of the GNU Lesser General Public License shall
apply, that proxy's public statement of acceptance of any version is
permanent authorization for you to choose that version for the
Library.
img2pdf-0.6.1/MANIFEST.in
include README.md
include test_comp.sh
include test.sh
include magick.py
include CHANGES.rst
include LICENSE
recursive-include src *.jpg
recursive-include src *.pdf
recursive-include src *.png
recursive-include src *.tif
recursive-include src *.gif
recursive-include src *.py
img2pdf-0.6.1/PKG-INFO
Metadata-Version: 2.1
Name: img2pdf
Version: 0.6.1
Summary: Convert images to PDF via direct JPEG inclusion.
Home-page: https://gitlab.mister-muffin.de/josch/img2pdf
Download-URL: https://gitlab.mister-muffin.de/josch/img2pdf/repository/archive.tar.gz?ref=0.6.1
Author: Johannes Schauer Marin Rodrigues
Author-email: josch@mister-muffin.de
License: LGPL
Keywords: jpeg pdf converter
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Other Audience
Classifier: Environment :: Console
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: License :: OSI Approved :: GNU Lesser General Public License v3 (LGPLv3)
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Description-Content-Type: text/markdown
Provides-Extra: gui
License-File: LICENSE
img2pdf
=======
Lossless conversion of raster images to PDF. You should use img2pdf if your
priorities are (in this order):
1. **always lossless**: the image embedded in the PDF will always have the
exact same color information for every pixel as the input
2. **small**: if possible, the difference in filesize between the input image
and the output PDF will only be the overhead of the PDF container itself
3. **fast**: if possible, the input image is just pasted into the PDF document
as-is without any CPU hungry re-encoding of the pixel data
Conventional conversion software (like ImageMagick) would either:
1. not be lossless because of lossy re-encoding to JPEG
2. not be small because of wasteful flate encoding of raw pixel data
3. not be fast because the input data gets re-encoded
Another advantage of not having to re-encode the input (in most common
situations) is that img2pdf is able to handle much larger input than other
software, because the raw pixel data never has to be loaded into memory.
The following table shows how img2pdf handles different input depending on the
input file format and image color space.
| Format | Colorspace | Result |
| ------------------------------------- | ------------------------------------ | ------------- |
| JPEG | any | direct |
| JPEG2000 | any | direct |
| PNG (non-interlaced, no transparency) | any | direct |
| TIFF (CCITT Group 4) | 1-bit monochrome | direct |
| JBIG2 (single-page generic coding) | 1-bit monochrome | direct |
| any | any except CMYK and 1-bit monochrome | PNG Paeth |
| any | 1-bit monochrome | CCITT Group 4 |
| any | CMYK | flate |
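
The table above can be read as a simple lookup. The following is only a sketch of that reading, not img2pdf's actual code: the function name `expected_result`, its parameters and the format labels are hypothetical, and cases the table does not cover (such as alpha channels) are ignored.

```python
def expected_result(fmt, colorspace, png_interlaced=False, png_transparency=False):
    """Return the 'Result' column of the table for a given input (illustrative only)."""
    monochrome = colorspace == "1-bit monochrome"
    # Rows 1-2: JPEG and JPEG2000 are embedded directly for any colorspace.
    if fmt in ("JPEG", "JPEG2000"):
        return "direct"
    # Row 3: PNG is embedded directly unless interlaced or transparent.
    if fmt == "PNG" and not png_interlaced and not png_transparency:
        return "direct"
    # Rows 4-5: CCITT Group 4 TIFF and generic-coded JBIG2 are 1-bit monochrome.
    if fmt in ("TIFF (CCITT Group 4)", "JBIG2 (generic coding)") and monochrome:
        return "direct"
    # Remaining rows: everything else is re-encoded depending on the colorspace.
    if monochrome:
        return "CCITT Group 4"
    if colorspace == "CMYK":
        return "flate"
    return "PNG Paeth"


print(expected_result("JPEG", "CMYK"))
print(expected_result("GIF", "RGB"))
print(expected_result("BMP", "1-bit monochrome"))
```
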
For JPEG, JPEG2000, non-interlaced PNG, TIFF images with CCITT Group 4
encoded data, and JBIG2 with single-page generic coding (e.g. using `jbig2enc`),
img2pdf directly embeds the image data into the PDF without
re-encoding it. It thus treats the PDF format merely as a container format for
the image data. In these cases, img2pdf only increases the filesize by the size
of the PDF container (typically around 500 to 700 bytes). Since data is only
copied and not re-encoded, img2pdf is also typically faster than other
solutions for these input formats.
For all other input types, img2pdf first has to transform the pixel data to
make it compatible with PDF. In most cases, the PNG Paeth filter is applied to
the pixel data. For 1-bit monochrome input, CCITT Group 4 is used instead. For
CMYK input only, no filter is applied before the final flate compression.
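
To illustrate why the Paeth filter is lossless, here is a minimal sketch of the standard PNG Paeth filter for a single scanline, written from the PNG specification. It is not taken from img2pdf, whose implementation may differ (for example by delegating to a library); the function names are hypothetical, and the leading filter-type byte that PNG places before each scanline is omitted.

```python
def paeth_predictor(left, up, up_left):
    """Return the neighbour closest to the estimate left + up - up_left."""
    estimate = left + up - up_left
    d_left = abs(estimate - left)
    d_up = abs(estimate - up)
    d_up_left = abs(estimate - up_left)
    if d_left <= d_up and d_left <= d_up_left:
        return left
    if d_up <= d_up_left:
        return up
    return up_left


def paeth_filter_row(row, prev, bpp):
    """Replace each byte by its difference (mod 256) to the Paeth prediction."""
    out = bytearray(len(row))
    for i, value in enumerate(row):
        left = row[i - bpp] if i >= bpp else 0
        up_left = prev[i - bpp] if i >= bpp else 0
        out[i] = (value - paeth_predictor(left, prev[i], up_left)) % 256
    return bytes(out)


def paeth_unfilter_row(filtered, prev, bpp):
    """Invert paeth_filter_row, reconstructing the row byte by byte."""
    out = bytearray(len(filtered))
    for i, value in enumerate(filtered):
        left = out[i - bpp] if i >= bpp else 0
        up_left = prev[i - bpp] if i >= bpp else 0
        out[i] = (value + paeth_predictor(left, prev[i], up_left)) % 256
    return bytes(out)


# Two rows of a 3-pixel RGB image (bpp = 3 bytes per pixel).
previous_row = bytes([10, 20, 30, 12, 22, 32, 14, 24, 34])
current_row = bytes([11, 21, 31, 13, 23, 33, 15, 25, 35])
filtered = paeth_filter_row(current_row, previous_row, bpp=3)
restored = paeth_unfilter_row(filtered, previous_row, bpp=3)
print(restored == current_row)
```

The filter only rearranges the data so that a general-purpose compressor such as flate can compress it better. Since it can be inverted exactly, no pixel information is lost.
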
Usage
-----
The images must be provided as files because img2pdf needs to seek in the file
descriptor.
If no output file is specified with the `-o`/`--output` option, the PDF is
written to stdout. A typical invocation is:
$ img2pdf img1.png img2.jpg -o out.pdf
The detailed documentation can be accessed by running:
$ img2pdf --help
With no command line arguments supplied, img2pdf will read a single image from
standard input and write the resulting PDF to standard output. Here is an
example for how to scan directly to PDF using scanimage(1) from SANE:
$ scanimage --mode=Color --resolution=300 | pnmtojpeg -quality 90 | img2pdf > scan.pdf
Bugs
----
- If you find a JPEG, JPEG2000, PNG or CCITT Group 4 encoded TIFF file that,
when embedded into the PDF, cannot be read by the Adobe Acrobat Reader,
please contact me.
- An error is produced if the input image is broken. This commonly happens if
the input image has an invalid EXIF Orientation value of zero. Even though
only eight different values from 1 to 8 are permitted, Android phones and
Canon DSLR cameras produce JPEG images with the invalid value of zero.
Either fix your input images with `exiftool` or similar software before
passing the JPEG to `img2pdf`, or run `img2pdf` with `--rotation=ifvalid`
(if you run img2pdf from the command line), or pass
`rotation=img2pdf.Rotation.ifvalid` as an argument to `convert()` when using
img2pdf as a library.
- img2pdf uses PIL (or Pillow) to obtain image metadata and to convert the
input if necessary. To prevent decompression bomb denial of service attacks,
Pillow limits the maximum number of pixels an input image is allowed to
have. If you are sure that you know what you are doing, then you can disable
this safeguard by passing the `--pillow-limit-break` option to img2pdf. This
allows one to process even very large input images.
Installation
------------
On Debian- and Ubuntu-based systems, img2pdf can be installed from the
official repositories:
$ apt install img2pdf
If you want to install it using pip, you can run:
$ pip3 install img2pdf
If you prefer to install from source code use:
$ cd img2pdf/
$ pip3 install .
To test the console script without installing the package on your system,
use virtualenv:
$ cd img2pdf/
$ virtualenv ve
$ ve/bin/pip3 install .
You can then test the converter using:
$ ve/bin/img2pdf -o test.pdf src/tests/test.jpg
If you don't want to set up Python on Windows, then head to the
[releases](https://gitlab.mister-muffin.de/josch/img2pdf/releases) section and download the latest
`img2pdf.exe`.
GUI
---
There exists an experimental GUI with all settings currently disabled. You can
directly convert images to PDF but you cannot set any options via the GUI yet.
If you are interested in adding more features to the GUI, please submit a merge
request. The GUI is based on tkinter and works on Linux, Windows and macOS.

Library
-------
The package can also be used as a library:

    import os
    import pathlib
    import img2pdf

    # opening from filename
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg'))

    # opening from file handle (the image must be opened in binary mode)
    with open("name.pdf","wb") as f1, open("test.jpg", "rb") as f2:
        f1.write(img2pdf.convert(f2))

    # opening using pathlib
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(pathlib.Path('test.jpg')))

    # using in-memory image data (a bytes object)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(b"\x89PNG..."))

    # multiple inputs (variant 1)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert("test1.jpg", "test2.png"))

    # multiple inputs (variant 2)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(["test1.jpg", "test2.png"]))

    # convert all files ending in .jpg inside a directory
    dirname = "/path/to/images"
    imgs = []
    for fname in os.listdir(dirname):
        if not fname.endswith(".jpg"):
            continue
        path = os.path.join(dirname, fname)
        if os.path.isdir(path):
            continue
        imgs.append(path)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(imgs))

    # convert all files ending in .jpg in a directory and its subdirectories
    dirname = "/path/to/images"
    imgs = []
    for r, _, f in os.walk(dirname):
        for fname in f:
            if not fname.endswith(".jpg"):
                continue
            imgs.append(os.path.join(r, fname))
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(imgs))

    # convert all files matching a glob
    import glob
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(glob.glob("/path/to/*.jpg")))

    # convert all files matching a glob using pathlib.Path
    from pathlib import Path
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(*Path("/path").glob("**/*.jpg")))

    # ignore invalid rotation values in the input images
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', rotation=img2pdf.Rotation.ifvalid))

    # writing to file descriptor
    with open("name.pdf","wb") as f1, open("test.jpg", "rb") as f2:
        img2pdf.convert(f2, outputstream=f1)

    # specify paper size (A4)
    a4inpt = (img2pdf.mm_to_pt(210),img2pdf.mm_to_pt(297))
    layout_fun = img2pdf.get_layout_fun(a4inpt)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', layout_fun=layout_fun))

    # use a fixed dpi of 300 instead of reading it from the image
    dpix = dpiy = 300
    layout_fun = img2pdf.get_fixed_dpi_layout_fun((dpix, dpiy))
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', layout_fun=layout_fun))

    # create a PDF/A-1b compliant document by passing an ICC profile
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', pdfa="/usr/share/color/icc/sRGB.icc"))

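
The paper size example above passes the page dimensions in PDF points. As a rough illustration of the arithmetic behind such a conversion, the sketch below assumes only the standard definitions of 72 points per inch and 25.4 mm per inch. The helper name `mm_to_pt_sketch` is hypothetical and is not part of img2pdf; in real code, use `img2pdf.mm_to_pt` as shown above.

```python
def mm_to_pt_sketch(length_mm):
    """Convert millimetres to PDF points (1 pt = 1/72 inch, 1 inch = 25.4 mm)."""
    return length_mm * 72.0 / 25.4


# A4 paper is 210 mm x 297 mm.
a4_width_pt = mm_to_pt_sketch(210)
a4_height_pt = mm_to_pt_sketch(297)
print(round(a4_width_pt, 2), round(a4_height_pt, 2))
```
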
Comparison to ImageMagick
-------------------------
Create a large test image:
$ convert logo: -resize 8000x original.jpg
Convert it into PDF using ImageMagick and img2pdf:
$ time img2pdf original.jpg -o img2pdf.pdf
$ time convert original.jpg imagemagick.pdf
Notice how ImageMagick took an order of magnitude longer to do the conversion
than img2pdf. It also used twice the memory.
Now extract the image data from both PDF documents and compare it to the
original:
$ pdfimages -all img2pdf.pdf tmp
$ compare -metric AE original.jpg tmp-000.jpg null:
0
$ pdfimages -all imagemagick.pdf tmp
$ compare -metric AE original.jpg tmp-000.jpg null:
118716
To get lossless output with ImageMagick we can use Zip compression but that
unnecessarily increases the size of the output:
$ convert original.jpg -compress Zip imagemagick.pdf
$ pdfimages -all imagemagick.pdf tmp
$ compare -metric AE original.jpg tmp-000.png null:
0
$ stat --format="%s %n" original.jpg img2pdf.pdf imagemagick.pdf
1535837 original.jpg
1536683 img2pdf.pdf
9397809 imagemagick.pdf
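
The sizes reported above can be put into perspective with a little arithmetic. The snippet below only restates the three numbers from the `stat` output; the variable names are illustrative.

```python
# File sizes in bytes, copied from the stat output above.
sizes = {
    "original.jpg": 1535837,
    "img2pdf.pdf": 1536683,
    "imagemagick.pdf": 9397809,
}

# Bytes added by img2pdf on top of the original JPEG.
container_overhead = sizes["img2pdf.pdf"] - sizes["original.jpg"]

# How many times larger the Zip-compressed ImageMagick output is than the input.
imagemagick_ratio = sizes["imagemagick.pdf"] / sizes["original.jpg"]

print(container_overhead)
print(round(imagemagick_ratio, 2))
```
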
Comparison to pdfLaTeX
----------------------
pdfLaTeX performs a lossless conversion from included images to PDF by default.
If the input is a JPEG, then it simply embeds the JPEG into the PDF in the same
way as img2pdf does it. But for other image formats it uses flate compression
of the plain pixel data and thus needlessly increases the output file size:
$ convert logo: -resize 8000x original.png
$ cat << END > pdflatex.tex
\documentclass{article}
\usepackage{graphicx}
\begin{document}
\includegraphics{original.png}
\end{document}
END
$ pdflatex pdflatex.tex
$ stat --format="%s %n" original.png pdflatex.pdf
4500182 original.png
9318120 pdflatex.pdf
Comparison to podofoimg2pdf
---------------------------
Like pdfLaTeX, podofoimg2pdf is able to perform a lossless conversion from JPEG
to PDF by plainly embedding the JPEG data into the pdf container. But just like
pdfLaTeX it uses flate compression for all other file formats, thus sometimes
resulting in larger files than necessary.
$ convert logo: -resize 8000x original.png
$ podofoimg2pdf out.pdf original.png
$ stat --format="%s %n" original.png out.pdf
4500181 original.png
9335629 out.pdf
It also only supports JPEG, PNG and TIFF as input and lacks many of the
convenience features of img2pdf like page sizes, borders, rotation and
metadata.
Comparison to Tesseract OCR
---------------------------
Tesseract OCR comes closest to the functionality img2pdf provides. It is able
to convert JPEG and PNG input to PDF losslessly and without needlessly
increasing the filesize. So if your input consists only of JPEG and PNG images,
you should be able to safely use Tesseract instead of img2pdf. For other input,
Tesseract might not do a lossless conversion. For example, it converts CMYK
input to RGB and removes the alpha channel from images with transparency. For
multipage TIFF or animated GIF, it will only convert the first frame.
Comparison to econvert from ExactImage
--------------------------------------
Like pdfLaTeX and podofoimg2pdf, econvert is able to embed JPEG images into PDF
directly without re-encoding, but when given other file formats, it stores them
using only flate compression, which unnecessarily increases the filesize.
Furthermore, it throws an error with CMYK TIFF input. It also doesn't store CMYK
JPEG files as CMYK but converts them to RGB, so it's not lossless. When fed
16-bit files, it errors out with `Unhandled bps/spp combination`. It
also seems to choose JPEG encoding for some file types (like
palette images), making it lossy for that input as well.
img2pdf-0.6.1/README.md
img2pdf
=======
Lossless conversion of raster images to PDF. You should use img2pdf if your
priorities are (in this order):
1. **always lossless**: the image embedded in the PDF will always have the
exact same color information for every pixel as the input
2. **small**: if possible, the difference in filesize between the input image
and the output PDF will only be the overhead of the PDF container itself
3. **fast**: if possible, the input image is just pasted into the PDF document
as-is without any CPU hungry re-encoding of the pixel data
Conventional conversion software (like ImageMagick) would either:
1. not be lossless because of lossy re-encoding to JPEG
2. not be small because of wasteful flate encoding of raw pixel data
3. not be fast because the input data gets re-encoded
Another advantage of not having to re-encode the input (in most common
situations) is that img2pdf is able to handle much larger input than other
software, because the raw pixel data never has to be loaded into memory.
The following table shows how img2pdf handles different input depending on the
input file format and image color space.
| Format | Colorspace | Result |
| ------------------------------------- | ------------------------------------ | ------------- |
| JPEG | any | direct |
| JPEG2000 | any | direct |
| PNG (non-interlaced, no transparency) | any | direct |
| TIFF (CCITT Group 4) | 1-bit monochrome | direct |
| JBIG2 (single-page generic coding) | 1-bit monochrome | direct |
| any | any except CMYK and 1-bit monochrome | PNG Paeth |
| any | 1-bit monochrome | CCITT Group 4 |
| any | CMYK | flate |
For JPEG, JPEG2000, non-interlaced PNG, TIFF images with CCITT Group 4
encoded data, and JBIG2 with single-page generic coding (e.g. using `jbig2enc`),
img2pdf directly embeds the image data into the PDF without
re-encoding it. It thus treats the PDF format merely as a container format for
the image data. In these cases, img2pdf only increases the filesize by the size
of the PDF container (typically around 500 to 700 bytes). Since data is only
copied and not re-encoded, img2pdf is also typically faster than other
solutions for these input formats.
For all other input types, img2pdf first has to transform the pixel data to
make it compatible with PDF. In most cases, the PNG Paeth filter is applied to
the pixel data. For 1-bit monochrome input, CCITT Group 4 is used instead. For
CMYK input only, no filter is applied before the final flate compression.
Usage
-----
The images must be provided as files because img2pdf needs to seek in the file
descriptor.
If no output file is specified with the `-o`/`--output` option, the PDF is
written to stdout. A typical invocation is:
$ img2pdf img1.png img2.jpg -o out.pdf
The detailed documentation can be accessed by running:
$ img2pdf --help
With no command line arguments supplied, img2pdf will read a single image from
standard input and write the resulting PDF to standard output. Here is an
example for how to scan directly to PDF using scanimage(1) from SANE:
$ scanimage --mode=Color --resolution=300 | pnmtojpeg -quality 90 | img2pdf > scan.pdf
Bugs
----
- If you find a JPEG, JPEG2000, PNG or CCITT Group 4 encoded TIFF file that,
when embedded into the PDF, cannot be read by the Adobe Acrobat Reader,
please contact me.
- An error is produced if the input image is broken. This commonly happens if
the input image has an invalid EXIF Orientation value of zero. Even though
only eight different values from 1 to 8 are permitted, Android phones and
Canon DSLR cameras produce JPEG images with the invalid value of zero.
Either fix your input images with `exiftool` or similar software before
passing the JPEG to `img2pdf`, or run `img2pdf` with `--rotation=ifvalid`
(if you run img2pdf from the command line), or pass
`rotation=img2pdf.Rotation.ifvalid` as an argument to `convert()` when using
img2pdf as a library.
- img2pdf uses PIL (or Pillow) to obtain image metadata and to convert the
input if necessary. To prevent decompression bomb denial of service attacks,
Pillow limits the maximum number of pixels an input image is allowed to
have. If you are sure that you know what you are doing, then you can disable
this safeguard by passing the `--pillow-limit-break` option to img2pdf. This
allows one to process even very large input images.
Installation
------------
On Debian- and Ubuntu-based systems, img2pdf can be installed from the
official repositories:
$ apt install img2pdf
If you want to install it using pip, you can run:
$ pip3 install img2pdf
If you prefer to install from source code use:
$ cd img2pdf/
$ pip3 install .
To test the console script without installing the package on your system,
use virtualenv:
$ cd img2pdf/
$ virtualenv ve
$ ve/bin/pip3 install .
You can then test the converter using:
$ ve/bin/img2pdf -o test.pdf src/tests/test.jpg
If you don't want to set up Python on Windows, then head to the
[releases](https://gitlab.mister-muffin.de/josch/img2pdf/releases) section and download the latest
`img2pdf.exe`.
GUI
---
There exists an experimental GUI with all settings currently disabled. You can
directly convert images to PDF but you cannot set any options via the GUI yet.
If you are interested in adding more features to the GUI, please submit a merge
request. The GUI is based on tkinter and works on Linux, Windows and macOS.

Library
-------
The package can also be used as a library:

    import os
    import pathlib

    import img2pdf

    # opening from filename
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg'))

    # opening from file handle (binary mode, because image data is bytes)
    with open("name.pdf","wb") as f1, open("test.jpg", "rb") as f2:
        f1.write(img2pdf.convert(f2))

    # opening using pathlib
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(pathlib.Path('test.jpg')))

    # using in-memory image data (a bytes object, not a str)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(b"\x89PNG..."))

    # multiple inputs (variant 1)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert("test1.jpg", "test2.png"))

    # multiple inputs (variant 2)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(["test1.jpg", "test2.png"]))

    # convert all files ending in .jpg inside a directory
    dirname = "/path/to/images"
    imgs = []
    for fname in os.listdir(dirname):
        if not fname.endswith(".jpg"):
            continue
        path = os.path.join(dirname, fname)
        if os.path.isdir(path):
            continue
        imgs.append(path)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(imgs))

    # convert all files ending in .jpg in a directory and its subdirectories
    dirname = "/path/to/images"
    imgs = []
    for r, _, f in os.walk(dirname):
        for fname in f:
            if not fname.endswith(".jpg"):
                continue
            imgs.append(os.path.join(r, fname))
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(imgs))

    # convert all files matching a glob
    import glob
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(glob.glob("/path/to/*.jpg")))

    # convert all files matching a glob using pathlib.Path
    from pathlib import Path
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(*Path("/path").glob("**/*.jpg")))

    # ignore invalid rotation values in the input images
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', rotation=img2pdf.Rotation.ifvalid))

    # writing to file descriptor
    with open("name.pdf","wb") as f1, open("test.jpg", "rb") as f2:
        img2pdf.convert(f2, outputstream=f1)

    # specify paper size (A4)
    a4inpt = (img2pdf.mm_to_pt(210),img2pdf.mm_to_pt(297))
    layout_fun = img2pdf.get_layout_fun(a4inpt)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', layout_fun=layout_fun))

    # use a fixed dpi of 300 instead of reading it from the image
    dpix = dpiy = 300
    layout_fun = img2pdf.get_fixed_dpi_layout_fun((dpix, dpiy))
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', layout_fun=layout_fun))

    # create a PDF/A-1b compliant document by passing an ICC profile
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', pdfa="/usr/share/color/icc/sRGB.icc"))

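
The paper size example passes the A4 dimensions in PDF points. As a rough sketch of the arithmetic involved, assuming the usual definitions of 72 points per inch and 25.4 mm per inch, the conversion can be written without img2pdf at all. `mm_to_pt_sketch` is a hypothetical stand-in used only for illustration; it is not the library's own `mm_to_pt`:

```python
# Assumption: a PDF point is 1/72 inch and one inch is 25.4 mm.
# mm_to_pt_sketch is a hypothetical stand-in, not img2pdf's own function.
MM_PER_INCH = 25.4
PT_PER_INCH = 72.0


def mm_to_pt_sketch(length_mm):
    """Convert a length in millimetres to PDF points."""
    return length_mm * PT_PER_INCH / MM_PER_INCH


# A4 is 210 mm x 297 mm
a4_in_pt = (mm_to_pt_sketch(210), mm_to_pt_sketch(297))
print("A4 in points: %.2f x %.2f" % a4_in_pt)
```

Under these assumptions, A4 comes out at roughly 595.28 by 841.89 points.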
Comparison to ImageMagick
-------------------------
Create a large test image:
$ convert logo: -resize 8000x original.jpg
Convert it into PDF using ImageMagick and img2pdf:
$ time img2pdf original.jpg -o img2pdf.pdf
$ time convert original.jpg imagemagick.pdf
Notice how ImageMagick took an order of magnitude longer to do the conversion
than img2pdf. It also used twice the memory.
Now extract the image data from both PDF documents and compare it to the
original:
$ pdfimages -all img2pdf.pdf tmp
$ compare -metric AE original.jpg tmp-000.jpg null:
0
$ pdfimages -all imagemagick.pdf tmp
$ compare -metric AE original.jpg tmp-000.jpg null:
118716
To get lossless output with ImageMagick we can use Zip compression but that
unnecessarily increases the size of the output:
$ convert original.jpg -compress Zip imagemagick.pdf
$ pdfimages -all imagemagick.pdf tmp
$ compare -metric AE original.jpg tmp-000.png null:
0
$ stat --format="%s %n" original.jpg img2pdf.pdf imagemagick.pdf
1535837 original.jpg
1536683 img2pdf.pdf
9397809 imagemagick.pdf
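
To put the sizes listed above into perspective, the following short sketch computes the container overhead and the size ratio from exactly those three numbers. The variable names are chosen for illustration only:

```python
# File sizes in bytes, copied from the stat output above.
sizes = {
    "original.jpg": 1535837,
    "img2pdf.pdf": 1536683,
    "imagemagick.pdf": 9397809,
}

original = sizes["original.jpg"]

# Bytes added on top of the original image by each converter.
overhead = {
    name: size - original
    for name, size in sizes.items()
    if name != "original.jpg"
}

# How many times larger the Zip-compressed ImageMagick output is.
ratio = sizes["imagemagick.pdf"] / original

print(overhead["img2pdf.pdf"], overhead["imagemagick.pdf"], round(ratio, 2))
```

For this test image, img2pdf adds 846 bytes, while the lossless ImageMagick output is about six times the size of the input.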
Comparison to pdfLaTeX
----------------------
pdfLaTeX performs a lossless conversion from included images to PDF by default.
If the input is a JPEG, then it simply embeds the JPEG into the PDF in the same
way as img2pdf does it. But for other image formats it uses flate compression
of the plain pixel data and thus needlessly increases the output file size:
$ convert logo: -resize 8000x original.png
$ cat << END > pdflatex.tex
\documentclass{article}
\usepackage{graphicx}
\begin{document}
\includegraphics{original.png}
\end{document}
END
$ pdflatex pdflatex.tex
$ stat --format="%s %n" original.png pdflatex.pdf
4500182 original.png
9318120 pdflatex.pdf
Comparison to podofoimg2pdf
---------------------------
Like pdfLaTeX, podofoimg2pdf is able to perform a lossless conversion from JPEG
to PDF by plainly embedding the JPEG data into the PDF container. But just like
pdfLaTeX it uses flate compression for all other file formats, thus sometimes
resulting in larger files than necessary.
$ convert logo: -resize 8000x original.png
$ podofoimg2pdf out.pdf original.png
$ stat --format="%s %n" original.png out.pdf
4500181 original.png
9335629 out.pdf
It also only supports JPEG, PNG and TIF as input and lacks many of the
convenience features of img2pdf like page sizes, borders, rotation and
metadata.
Comparison to Tesseract OCR
---------------------------
Tesseract OCR comes closest to the functionality img2pdf provides. It is able
to convert JPEG and PNG input to PDF without needlessly increasing the filesize
and is at the same time lossless. So if your input is JPEG or PNG images, then
you should safely be able to use Tesseract instead of img2pdf. For other input,
Tesseract might not do a lossless conversion. For example it converts CMYK
input to RGB and removes the alpha channel from images with transparency. For
multipage TIFF or animated GIF, it will only convert the first frame.
Comparison to econvert from ExactImage
--------------------------------------
Like pdfLaTeX and podofoimg2pdf, econvert is able to embed JPEG images into PDF
directly without re-encoding, but when given other file formats, it stores them
using only flate compression, which unnecessarily increases the filesize.
Furthermore, it throws an error with CMYK TIF input. It also doesn't store CMYK
JPEG files as CMYK but converts them to RGB, so it's not lossless. When trying
to feed it 16-bit files, it errors out with `Unhandled bps/spp combination`. It
also seems to choose JPEG encoding for some file types (like palette images),
making it not lossless for that input either.
././@PaxHeader 0000000 0000000 0000000 00000000034 00000000000 010212 x ustar 00 28 mtime=1745772903.5349615
img2pdf-0.6.1/setup.cfg 0000644 0001750 0001750 00000000046 15003460550 013723 0 ustar 00josch josch [egg_info]
tag_build =
tag_date = 0
././@PaxHeader 0000000 0000000 0000000 00000000026 00000000000 010213 x ustar 00 22 mtime=1745772836.0
img2pdf-0.6.1/setup.py 0000644 0001750 0001750 00000003256 15003460444 013624 0 ustar 00josch josch import sys
from setuptools import setup
VERSION = "0.6.1"
INSTALL_REQUIRES = (
"Pillow",
"pikepdf",
)
setup(
name="img2pdf",
version=VERSION,
author="Johannes Schauer Marin Rodrigues",
author_email="josch@mister-muffin.de",
description="Convert images to PDF via direct JPEG inclusion.",
long_description=open("README.md").read(),
long_description_content_type="text/markdown",
license="LGPL",
keywords="jpeg pdf converter",
classifiers=[
"Development Status :: 5 - Production/Stable",
"Intended Audience :: Developers",
"Intended Audience :: Other Audience",
"Environment :: Console",
"Programming Language :: Python",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.5",
"Programming Language :: Python :: Implementation :: CPython",
"Programming Language :: Python :: Implementation :: PyPy",
"License :: OSI Approved :: GNU Lesser General Public License v3 " "(LGPLv3)",
"Natural Language :: English",
"Operating System :: OS Independent",
],
url="https://gitlab.mister-muffin.de/josch/img2pdf",
download_url="https://gitlab.mister-muffin.de/josch/img2pdf/repository/"
"archive.tar.gz?ref=" + VERSION,
package_dir={"": "src"},
py_modules=["img2pdf", "jp2"],
include_package_data=True,
zip_safe=True,
install_requires=INSTALL_REQUIRES,
extras_require={
"gui": ("tkinter"),
},
entry_points={
"setuptools.installation": ["eggsecutable = img2pdf:main"],
"console_scripts": ["img2pdf = img2pdf:main"],
"gui_scripts": ["img2pdf-gui = img2pdf:gui"],
},
)
././@PaxHeader 0000000 0000000 0000000 00000000033 00000000000 010211 x ustar 00 27 mtime=1745772903.510961
img2pdf-0.6.1/src/ 0000755 0001750 0001750 00000000000 15003460550 012671 5 ustar 00josch josch ././@PaxHeader 0000000 0000000 0000000 00000000033 00000000000 010211 x ustar 00 27 mtime=1745772903.514961
img2pdf-0.6.1/src/img2pdf.egg-info/ 0000755 0001750 0001750 00000000000 15003460550 015713 5 ustar 00josch josch ././@PaxHeader 0000000 0000000 0000000 00000000026 00000000000 010213 x ustar 00 22 mtime=1745772903.0
img2pdf-0.6.1/src/img2pdf.egg-info/PKG-INFO 0000644 0001750 0001750 00000032622 15003460547 017023 0 ustar 00josch josch Metadata-Version: 2.1
Name: img2pdf
Version: 0.6.1
Summary: Convert images to PDF via direct JPEG inclusion.
Home-page: https://gitlab.mister-muffin.de/josch/img2pdf
Download-URL: https://gitlab.mister-muffin.de/josch/img2pdf/repository/archive.tar.gz?ref=0.6.1
Author: Johannes Schauer Marin Rodrigues
Author-email: josch@mister-muffin.de
License: LGPL
Keywords: jpeg pdf converter
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Other Audience
Classifier: Environment :: Console
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: License :: OSI Approved :: GNU Lesser General Public License v3 (LGPLv3)
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Description-Content-Type: text/markdown
Provides-Extra: gui
License-File: LICENSE
img2pdf
=======
Lossless conversion of raster images to PDF. You should use img2pdf if your
priorities are (in this order):
1. **always lossless**: the image embedded in the PDF will always have the
exact same color information for every pixel as the input
2. **small**: if possible, the difference in filesize between the input image
and the output PDF will only be the overhead of the PDF container itself
3. **fast**: if possible, the input image is just pasted into the PDF document
as-is without any CPU hungry re-encoding of the pixel data
Conventional conversion software (like ImageMagick) would either:
1. not be lossless because of lossy re-encoding to JPEG
2. not be small because of wasteful flate encoding of raw pixel data
3. not be fast because the input data gets re-encoded
Another advantage of not having to re-encode the input (in most common
situations) is that img2pdf is able to handle much larger input than other
software, because the raw pixel data never has to be loaded into memory.
The following table shows how img2pdf handles different input depending on the
input file format and image color space.
| Format | Colorspace | Result |
| ------------------------------------- | ------------------------------------ | ------------- |
| JPEG | any | direct |
| JPEG2000 | any | direct |
| PNG (non-interlaced, no transparency) | any | direct |
| TIFF (CCITT Group 4) | 1-bit monochrome | direct |
| JBIG2 (single-page generic coding) | 1-bit monochrome | direct |
| any | any except CMYK and 1-bit monochrome | PNG Paeth |
| any | 1-bit monochrome | CCITT Group 4 |
| any | CMYK | flate |
For JPEG, JPEG2000, non-interlaced PNG, TIFF images with CCITT Group 4
encoded data, and JBIG2 with single-page generic coding (e.g. using `jbig2enc`),
img2pdf directly embeds the image data into the PDF without
re-encoding it. It thus treats the PDF format merely as a container format for
the image data. In these cases, img2pdf only increases the filesize by the size
of the PDF container (typically around 500 to 700 bytes). Since data is only
copied and not re-encoded, img2pdf is also typically faster than other
solutions for these input formats.
For all other input types, img2pdf first has to transform the pixel data to
make it compatible with PDF. In most cases, the PNG Paeth filter is applied to
the pixel data. For 1-bit monochrome input, CCITT Group 4 is used instead. Only
for CMYK input is no filter applied before the final flate compression.
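
The table and the two paragraphs above can be read as a small decision procedure. The following is a simplified sketch of that reading, not img2pdf's actual code: the function name `embedding_strategy`, its parameters, and the string labels are all hypothetical, and details such as the single-page generic coding requirement for JBIG2 are not modelled:

```python
# Hypothetical sketch of the table above; not img2pdf's implementation.
def embedding_strategy(fmt, colorspace, interlaced=False, transparency=False):
    """Return how an input would end up in the PDF according to the table."""
    # JPEG and JPEG2000 are embedded directly for any colorspace.
    if fmt in ("JPEG", "JPEG2000"):
        return "direct"
    # PNG is embedded directly only if non-interlaced and without transparency.
    if fmt == "PNG" and not interlaced and not transparency:
        return "direct"
    # CCITT Group 4 TIFF and JBIG2 are embedded directly for 1-bit monochrome.
    if fmt in ("CCITTGroup4", "JBIG2") and colorspace == "1-bit monochrome":
        return "direct"
    # Everything else is re-encoded depending on the colorspace.
    if colorspace == "CMYK":
        return "flate"
    if colorspace == "1-bit monochrome":
        return "CCITT Group 4"
    return "PNG Paeth"


print(embedding_strategy("PNG", "RGB", interlaced=True))
```

Only the last three branches involve re-encoding pixel data; the first three copy the input as-is.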
Usage
-----
The images must be provided as files because img2pdf needs to seek in the file
descriptor.
If no output file is specified with the `-o`/`--output` option, output will be
done to stdout. A typical invocation is:
$ img2pdf img1.png img2.jpg -o out.pdf
The detailed documentation can be accessed by running:
$ img2pdf --help
With no command line arguments supplied, img2pdf will read a single image from
standard input and write the resulting PDF to standard output. Here is an
example for how to scan directly to PDF using scanimage(1) from SANE:
$ scanimage --mode=Color --resolution=300 | pnmtojpeg -quality 90 | img2pdf > scan.pdf
Bugs
----
- If you find a JPEG, JPEG2000, PNG or CCITT Group 4 encoded TIFF file that,
when embedded into the PDF cannot be read by the Adobe Acrobat Reader,
please contact me.
- An error is produced if the input image is broken. This commonly happens if
the input image has an invalid EXIF Orientation value of zero. Even though
only eight different values from 1 to 8 are permitted, Android phones and
Canon DSLR cameras produce JPEG images with the invalid value of zero.
Either fix your input images with `exiftool` or similar software before
passing the JPEG to `img2pdf`, or run `img2pdf` with `--rotation=ifvalid`
(if you run img2pdf from the command line), or pass
`rotation=img2pdf.Rotation.ifvalid` as an argument to `convert()` when using
img2pdf as a library.
- img2pdf uses PIL (or Pillow) to obtain image meta data and to convert the
input if necessary. To prevent decompression bomb denial of service attacks,
Pillow limits the maximum number of pixels an input image is allowed to
have. If you are sure that you know what you are doing, then you can disable
this safeguard by passing the `--pillow-limit-break` option to img2pdf. This
allows one to process even very large input images.
Installation
------------
On Debian- and Ubuntu-based systems, img2pdf can be installed from the
official repositories:
$ apt install img2pdf
If you want to install it using pip, you can run:
$ pip3 install img2pdf
If you prefer to install from source code use:
$ cd img2pdf/
$ pip3 install .
To test the console script without installing the package on your system,
use virtualenv:
$ cd img2pdf/
$ virtualenv ve
$ ve/bin/pip3 install .
You can then test the converter using:
$ ve/bin/img2pdf -o test.pdf src/tests/test.jpg
If you don't want to set up Python on Windows, then head to the
[releases](https://gitlab.mister-muffin.de/josch/img2pdf/releases) section and download the latest
`img2pdf.exe`.
GUI
---
There exists an experimental GUI with all settings currently disabled. You can
directly convert images to PDF but you cannot set any options via the GUI yet.
If you are interested in adding more features to the GUI, please submit a merge
request. The GUI is based on tkinter and works on Linux, Windows and macOS.

Library
-------
The package can also be used as a library:

    import os
    import pathlib

    import img2pdf

    # opening from filename
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg'))

    # opening from file handle (binary mode, because image data is bytes)
    with open("name.pdf","wb") as f1, open("test.jpg", "rb") as f2:
        f1.write(img2pdf.convert(f2))

    # opening using pathlib
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(pathlib.Path('test.jpg')))

    # using in-memory image data (a bytes object, not a str)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(b"\x89PNG..."))

    # multiple inputs (variant 1)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert("test1.jpg", "test2.png"))

    # multiple inputs (variant 2)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(["test1.jpg", "test2.png"]))

    # convert all files ending in .jpg inside a directory
    dirname = "/path/to/images"
    imgs = []
    for fname in os.listdir(dirname):
        if not fname.endswith(".jpg"):
            continue
        path = os.path.join(dirname, fname)
        if os.path.isdir(path):
            continue
        imgs.append(path)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(imgs))

    # convert all files ending in .jpg in a directory and its subdirectories
    dirname = "/path/to/images"
    imgs = []
    for r, _, f in os.walk(dirname):
        for fname in f:
            if not fname.endswith(".jpg"):
                continue
            imgs.append(os.path.join(r, fname))
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(imgs))

    # convert all files matching a glob
    import glob
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(glob.glob("/path/to/*.jpg")))

    # convert all files matching a glob using pathlib.Path
    from pathlib import Path
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(*Path("/path").glob("**/*.jpg")))

    # ignore invalid rotation values in the input images
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', rotation=img2pdf.Rotation.ifvalid))

    # writing to file descriptor
    with open("name.pdf","wb") as f1, open("test.jpg", "rb") as f2:
        img2pdf.convert(f2, outputstream=f1)

    # specify paper size (A4)
    a4inpt = (img2pdf.mm_to_pt(210),img2pdf.mm_to_pt(297))
    layout_fun = img2pdf.get_layout_fun(a4inpt)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', layout_fun=layout_fun))

    # use a fixed dpi of 300 instead of reading it from the image
    dpix = dpiy = 300
    layout_fun = img2pdf.get_fixed_dpi_layout_fun((dpix, dpiy))
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', layout_fun=layout_fun))

    # create a PDF/A-1b compliant document by passing an ICC profile
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', pdfa="/usr/share/color/icc/sRGB.icc"))

Comparison to ImageMagick
-------------------------
Create a large test image:
$ convert logo: -resize 8000x original.jpg
Convert it into PDF using ImageMagick and img2pdf:
$ time img2pdf original.jpg -o img2pdf.pdf
$ time convert original.jpg imagemagick.pdf
Notice how ImageMagick took an order of magnitude longer to do the conversion
than img2pdf. It also used twice the memory.
Now extract the image data from both PDF documents and compare it to the
original:
$ pdfimages -all img2pdf.pdf tmp
$ compare -metric AE original.jpg tmp-000.jpg null:
0
$ pdfimages -all imagemagick.pdf tmp
$ compare -metric AE original.jpg tmp-000.jpg null:
118716
To get lossless output with ImageMagick we can use Zip compression but that
unnecessarily increases the size of the output:
$ convert original.jpg -compress Zip imagemagick.pdf
$ pdfimages -all imagemagick.pdf tmp
$ compare -metric AE original.jpg tmp-000.png null:
0
$ stat --format="%s %n" original.jpg img2pdf.pdf imagemagick.pdf
1535837 original.jpg
1536683 img2pdf.pdf
9397809 imagemagick.pdf
Comparison to pdfLaTeX
----------------------
pdfLaTeX performs a lossless conversion from included images to PDF by default.
If the input is a JPEG, then it simply embeds the JPEG into the PDF in the same
way as img2pdf does it. But for other image formats it uses flate compression
of the plain pixel data and thus needlessly increases the output file size:
$ convert logo: -resize 8000x original.png
$ cat << END > pdflatex.tex
\documentclass{article}
\usepackage{graphicx}
\begin{document}
\includegraphics{original.png}
\end{document}
END
$ pdflatex pdflatex.tex
$ stat --format="%s %n" original.png pdflatex.pdf
4500182 original.png
9318120 pdflatex.pdf
Comparison to podofoimg2pdf
---------------------------
Like pdfLaTeX, podofoimg2pdf is able to perform a lossless conversion from JPEG
to PDF by plainly embedding the JPEG data into the PDF container. But just like
pdfLaTeX it uses flate compression for all other file formats, thus sometimes
resulting in larger files than necessary.
$ convert logo: -resize 8000x original.png
$ podofoimg2pdf out.pdf original.png
$ stat --format="%s %n" original.png out.pdf
4500181 original.png
9335629 out.pdf
It also only supports JPEG, PNG and TIF as input and lacks many of the
convenience features of img2pdf like page sizes, borders, rotation and
metadata.
Comparison to Tesseract OCR
---------------------------
Tesseract OCR comes closest to the functionality img2pdf provides. It is able
to convert JPEG and PNG input to PDF without needlessly increasing the filesize
and is at the same time lossless. So if your input is JPEG or PNG images, then
you should safely be able to use Tesseract instead of img2pdf. For other input,
Tesseract might not do a lossless conversion. For example it converts CMYK
input to RGB and removes the alpha channel from images with transparency. For
multipage TIFF or animated GIF, it will only convert the first frame.
Comparison to econvert from ExactImage
--------------------------------------
Like pdfLaTeX and podofoimg2pdf, econvert is able to embed JPEG images into PDF
directly without re-encoding, but when given other file formats, it stores them
using only flate compression, which unnecessarily increases the filesize.
Furthermore, it throws an error with CMYK TIF input. It also doesn't store CMYK
JPEG files as CMYK but converts them to RGB, so it's not lossless. When trying
to feed it 16-bit files, it errors out with `Unhandled bps/spp combination`. It
also seems to choose JPEG encoding for some file types (like palette images),
making it not lossless for that input either.
././@PaxHeader 0000000 0000000 0000000 00000000026 00000000000 010213 x ustar 00 22 mtime=1745772903.0
img2pdf-0.6.1/src/img2pdf.egg-info/SOURCES.txt 0000644 0001750 0001750 00000001507 15003460547 017610 0 ustar 00josch josch CHANGES.rst
LICENSE
MANIFEST.in
README.md
setup.py
test_comp.sh
src/img2pdf.py
src/img2pdf_test.py
src/jp2.py
src/img2pdf.egg-info/PKG-INFO
src/img2pdf.egg-info/SOURCES.txt
src/img2pdf.egg-info/dependency_links.txt
src/img2pdf.egg-info/entry_points.txt
src/img2pdf.egg-info/requires.txt
src/img2pdf.egg-info/top_level.txt
src/img2pdf.egg-info/zip-safe
src/tests/input/CMYK.jpg
src/tests/input/CMYK.tif
src/tests/input/animation.gif
src/tests/input/gray.png
src/tests/input/mono.png
src/tests/input/mono.tif
src/tests/input/normal.jpg
src/tests/input/normal.png
src/tests/output/CMYK.jpg.pdf
src/tests/output/CMYK.tif.pdf
src/tests/output/animation.gif.pdf
src/tests/output/gray.png.pdf
src/tests/output/mono.jb2.pdf
src/tests/output/mono.png.pdf
src/tests/output/mono.tif.pdf
src/tests/output/normal.jpg.pdf
src/tests/output/normal.png.pdf ././@PaxHeader 0000000 0000000 0000000 00000000026 00000000000 010213 x ustar 00 22 mtime=1745772903.0
img2pdf-0.6.1/src/img2pdf.egg-info/dependency_links.txt 0000644 0001750 0001750 00000000001 15003460547 021767 0 ustar 00josch josch
././@PaxHeader 0000000 0000000 0000000 00000000026 00000000000 010213 x ustar 00 22 mtime=1745772903.0
img2pdf-0.6.1/src/img2pdf.egg-info/entry_points.txt 0000644 0001750 0001750 00000000211 15003460547 021211 0 ustar 00josch josch [console_scripts]
img2pdf = img2pdf:main
[gui_scripts]
img2pdf-gui = img2pdf:gui
[setuptools.installation]
eggsecutable = img2pdf:main
././@PaxHeader 0000000 0000000 0000000 00000000026 00000000000 010213 x ustar 00 22 mtime=1745772903.0
img2pdf-0.6.1/src/img2pdf.egg-info/requires.txt 0000644 0001750 0001750 00000000036 15003460547 020320 0 ustar 00josch josch Pillow
pikepdf
[gui]
tkinter
././@PaxHeader 0000000 0000000 0000000 00000000026 00000000000 010213 x ustar 00 22 mtime=1745772903.0
img2pdf-0.6.1/src/img2pdf.egg-info/top_level.txt 0000644 0001750 0001750 00000000014 15003460547 020446 0 ustar 00josch josch img2pdf
jp2
././@PaxHeader 0000000 0000000 0000000 00000000026 00000000000 010213 x ustar 00 22 mtime=1698475203.0
img2pdf-0.6.1/src/img2pdf.egg-info/zip-safe 0000644 0001750 0001750 00000000001 14517126303 017347 0 ustar 00josch josch
././@PaxHeader 0000000 0000000 0000000 00000000026 00000000000 010213 x ustar 00 22 mtime=1745772852.0
img2pdf-0.6.1/src/img2pdf.py 0000755 0001750 0001750 00000520402 15003460464 014605 0 ustar 00josch josch #!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Copyright (C) 2012-2021 Johannes Schauer Marin Rodrigues
#
# This program is free software: you can redistribute it and/or
# modify it under the terms of the GNU Lesser General Public
# License as published by the Free Software Foundation, either
# version 3 of the License, or (at your option) any later
# version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public
# License along with this program. If not, see
# <http://www.gnu.org/licenses/>.
import sys
import os
import zlib
import argparse
from PIL import Image, TiffImagePlugin, GifImagePlugin, ImageCms, ExifTags
if hasattr(GifImagePlugin, "LoadingStrategy"):
    # Pillow 9.0.0 started emitting all frames but the first as RGB instead of
    # P to make sure that more than 256 colors can be represented. But palette
    # images compress far better than RGB images in PDF so we instruct Pillow
    # to only emit RGB frames if the palette differs and return P otherwise.
    # This works since Pillow 9.1.0.
    GifImagePlugin.LOADING_STRATEGY = (
        GifImagePlugin.LoadingStrategy.RGB_AFTER_DIFFERENT_PALETTE_ONLY
    )
# TiffImagePlugin.DEBUG = True
from PIL.ExifTags import TAGS
from datetime import datetime, timezone
import jp2
from enum import Enum
from io import BytesIO
import logging
import struct
import platform
import hashlib
from itertools import chain
import re
import io
logger = logging.getLogger(__name__)
have_pdfrw = True
try:
    import pdfrw
except ImportError:
    have_pdfrw = False

have_pikepdf = True
try:
    import pikepdf
except ImportError:
    have_pikepdf = False
__version__ = "0.6.1"
default_dpi = 96.0
papersizes = {
"letter": "8.5inx11in",
"a0": "841mmx1189mm",
"a1": "594mmx841mm",
"a2": "420mmx594mm",
"a3": "297mmx420mm",
"a4": "210mmx297mm",
"a5": "148mmx210mm",
"a6": "105mmx148mm",
"b0": "1000mmx1414mm",
"b1": "707mmx1000mm",
"b2": "500mmx707mm",
"b3": "353mmx500mm",
"b4": "250mmx353mm",
"b5": "176mmx250mm",
"b6": "125mmx176mm",
"jb0": "1030mmx1456mm",
"jb1": "728mmx1030mm",
"jb2": "515mmx728mm",
"jb3": "364mmx515mm",
"jb4": "257mmx364mm",
"jb5": "182mmx257mm",
"jb6": "128mmx182mm",
"legal": "8.5inx14in",
"tabloid": "11inx17in",
}
papernames = {
"letter": "Letter",
"a0": "A0",
"a1": "A1",
"a2": "A2",
"a3": "A3",
"a4": "A4",
"a5": "A5",
"a6": "A6",
"b0": "B0",
"b1": "B1",
"b2": "B2",
"b3": "B3",
"b4": "B4",
"b5": "B5",
"b6": "B6",
"jb0": "JB0",
"jb1": "JB1",
"jb2": "JB2",
"jb3": "JB3",
"jb4": "JB4",
"jb5": "JB5",
"jb6": "JB6",
"legal": "Legal",
"tabloid": "Tabloid",
}
Engine = Enum("Engine", "internal pdfrw pikepdf")
Rotation = Enum("Rotation", "auto none ifvalid 0 90 180 270")
FitMode = Enum("FitMode", "into fill exact shrink enlarge")
PageOrientation = Enum("PageOrientation", "portrait landscape")
Colorspace = Enum("Colorspace", "RGB RGBA L LA 1 CMYK CMYK;I P PA other")
ImageFormat = Enum(
"ImageFormat", "JPEG JPEG2000 CCITTGroup4 PNG GIF TIFF MPO MIFF JBIG2 other"
)
PageMode = Enum("PageMode", "none outlines thumbs")
PageLayout = Enum(
"PageLayout",
"single onecolumn twocolumnright twocolumnleft twopageright twopageleft",
)
Magnification = Enum("Magnification", "fit fith fitbh")
ImgSize = Enum("ImgSize", "abs perc dpi")
Unit = Enum("Unit", "pt cm mm inch")
ImgUnit = Enum("ImgUnit", "pt cm mm inch perc dpi")
TIFFBitRevTable = [
0x00,
0x80,
0x40,
0xC0,
0x20,
0xA0,
0x60,
0xE0,
0x10,
0x90,
0x50,
0xD0,
0x30,
0xB0,
0x70,
0xF0,
0x08,
0x88,
0x48,
0xC8,
0x28,
0xA8,
0x68,
0xE8,
0x18,
0x98,
0x58,
0xD8,
0x38,
0xB8,
0x78,
0xF8,
0x04,
0x84,
0x44,
0xC4,
0x24,
0xA4,
0x64,
0xE4,
0x14,
0x94,
0x54,
0xD4,
0x34,
0xB4,
0x74,
0xF4,
0x0C,
0x8C,
0x4C,
0xCC,
0x2C,
0xAC,
0x6C,
0xEC,
0x1C,
0x9C,
0x5C,
0xDC,
0x3C,
0xBC,
0x7C,
0xFC,
0x02,
0x82,
0x42,
0xC2,
0x22,
0xA2,
0x62,
0xE2,
0x12,
0x92,
0x52,
0xD2,
0x32,
0xB2,
0x72,
0xF2,
0x0A,
0x8A,
0x4A,
0xCA,
0x2A,
0xAA,
0x6A,
0xEA,
0x1A,
0x9A,
0x5A,
0xDA,
0x3A,
0xBA,
0x7A,
0xFA,
0x06,
0x86,
0x46,
0xC6,
0x26,
0xA6,
0x66,
0xE6,
0x16,
0x96,
0x56,
0xD6,
0x36,
0xB6,
0x76,
0xF6,
0x0E,
0x8E,
0x4E,
0xCE,
0x2E,
0xAE,
0x6E,
0xEE,
0x1E,
0x9E,
0x5E,
0xDE,
0x3E,
0xBE,
0x7E,
0xFE,
0x01,
0x81,
0x41,
0xC1,
0x21,
0xA1,
0x61,
0xE1,
0x11,
0x91,
0x51,
0xD1,
0x31,
0xB1,
0x71,
0xF1,
0x09,
0x89,
0x49,
0xC9,
0x29,
0xA9,
0x69,
0xE9,
0x19,
0x99,
0x59,
0xD9,
0x39,
0xB9,
0x79,
0xF9,
0x05,
0x85,
0x45,
0xC5,
0x25,
0xA5,
0x65,
0xE5,
0x15,
0x95,
0x55,
0xD5,
0x35,
0xB5,
0x75,
0xF5,
0x0D,
0x8D,
0x4D,
0xCD,
0x2D,
0xAD,
0x6D,
0xED,
0x1D,
0x9D,
0x5D,
0xDD,
0x3D,
0xBD,
0x7D,
0xFD,
0x03,
0x83,
0x43,
0xC3,
0x23,
0xA3,
0x63,
0xE3,
0x13,
0x93,
0x53,
0xD3,
0x33,
0xB3,
0x73,
0xF3,
0x0B,
0x8B,
0x4B,
0xCB,
0x2B,
0xAB,
0x6B,
0xEB,
0x1B,
0x9B,
0x5B,
0xDB,
0x3B,
0xBB,
0x7B,
0xFB,
0x07,
0x87,
0x47,
0xC7,
0x27,
0xA7,
0x67,
0xE7,
0x17,
0x97,
0x57,
0xD7,
0x37,
0xB7,
0x77,
0xF7,
0x0F,
0x8F,
0x4F,
0xCF,
0x2F,
0xAF,
0x6F,
0xEF,
0x1F,
0x9F,
0x5F,
0xDF,
0x3F,
0xBF,
0x7F,
0xFF,
]
class NegativeDimensionError(Exception):
    pass


class UnsupportedColorspaceError(Exception):
    pass


class ImageOpenError(Exception):
    pass


class JpegColorspaceError(Exception):
    pass


class PdfTooLargeError(Exception):
    pass


class AlphaChannelError(Exception):
    pass


class ExifOrientationError(Exception):
    pass
# temporary change the attribute of an object using a context manager
class temp_attr:
    def __init__(self, obj, field, value):
        self.obj = obj
        self.field = field
        self.value = value

    def __enter__(self):
        self.exists = False
        if hasattr(self.obj, self.field):
            self.exists = True
            self.old_value = getattr(self.obj, self.field)
        logger.debug(f"setting {self.obj}.{self.field} = {self.value}")
        setattr(self.obj, self.field, self.value)

    def __exit__(self, exctype, excinst, exctb):
        if self.exists:
            setattr(self.obj, self.field, self.old_value)
        else:
            delattr(self.obj, self.field)
# without pdfrw this function is a no-op
def my_convert_load(string):
    return string
def parse(cont, indent=1):
    if type(cont) is dict:
        return (
            b"<<\n"
            + b"\n".join(
                [
                    4 * indent * b" " + k + b" " + parse(v, indent + 1)
                    for k, v in sorted(cont.items())
                ]
            )
            + b"\n"
            + 4 * (indent - 1) * b" "
            + b">>"
        )
    elif type(cont) is int:
        return str(cont).encode()
    elif type(cont) is float:
        if int(cont) == cont:
            return parse(int(cont))
        else:
            return ("%0.4f" % cont).rstrip("0").encode()
    elif isinstance(cont, MyPdfDict):
        # if cont got an identifier, then addobj() has been called with it
        # and a link to it will be added, otherwise add it inline
        if hasattr(cont, "identifier"):
            return ("%d 0 R" % cont.identifier).encode()
        else:
            return parse(cont.content, indent)
    elif type(cont) is str or isinstance(cont, bytes):
        if type(cont) is str and type(cont) is not bytes:
            raise TypeError(
                "parse must be passed a bytes object in py3. Got: %s" % cont
            )
        return cont
    elif isinstance(cont, list):
        return b"[ " + b" ".join([parse(c, indent) for c in cont]) + b" ]"
    else:
        raise TypeError("cannot handle type %s with content %s" % (type(cont), cont))
class MyPdfDict(object):
    def __init__(self, *args, **kw):
        self.content = dict()
        if args:
            if len(args) == 1:
                args = args[0]
            self.content.update(args)
        self.stream = None
        for key, value in kw.items():
            if key == "stream":
                self.stream = value
                self.content[MyPdfName.Length] = len(value)
            elif key == "indirect":
                pass
            else:
                self.content[getattr(MyPdfName, key)] = value

    def tostring(self):
        if self.stream is not None:
            return (
                ("%d 0 obj\n" % self.identifier).encode()
                + parse(self.content)
                + b"\nstream\n"
                + self.stream
                + b"\nendstream\nendobj\n"
            )
        else:
            return (
                ("%d 0 obj\n" % self.identifier).encode()
                + parse(self.content)
                + b"\nendobj\n"
            )

    def __setitem__(self, key, value):
        self.content[key] = value

    def __getitem__(self, key):
        return self.content[key]

    def __contains__(self, key):
        return key in self.content
class MyPdfName:
def __getattr__(self, name):
return b"/" + name.encode("ascii")
MyPdfName = MyPdfName()
class MyPdfObject(bytes):
def __new__(cls, string):
return bytes.__new__(cls, string.encode("ascii"))
class MyPdfArray(list):
pass
class MyPdfWriter:
def __init__(self):
self.objects = []
# create an incomplete pages object so that a /Parent entry can be
# added to each page
self.pages = MyPdfDict(Type=MyPdfName.Pages, Kids=[], Count=0)
self.catalog = MyPdfDict(Pages=self.pages, Type=MyPdfName.Catalog)
self.pagearray = []
def addobj(self, obj):
newid = len(self.objects) + 1
obj.identifier = newid
self.objects.append(obj)
def tostream(self, info, stream, version="1.3", ident=None):
xreftable = list()
# justification of the random binary garbage in the header from
# adobe:
#
# > Note: If a PDF file contains binary data, as most do (see Section
# > 3.1, “Lexical Conventions”), it is recommended that the header
# > line be immediately followed by a comment line containing at
# > least four binary characters—that is, characters whose codes are
# > 128 or greater. This ensures proper behavior of file transfer
# > applications that inspect data near the beginning of a file to
# > determine whether to treat the file’s contents as text or as
# > binary.
#
# the choice of binary characters is arbitrary but those four seem to
# be used elsewhere.
pdfheader = ("%%PDF-%s\n" % version).encode("ascii")
pdfheader += b"%\xe2\xe3\xcf\xd3\n"
stream.write(pdfheader)
# From section 3.4.3 of the PDF Reference (version 1.7):
#
# > Each entry is exactly 20 bytes long, including the end-of-line
# > marker.
# >
# > [...]
# >
# > The format of an in-use entry is
# > nnnnnnnnnn ggggg n eol
# > where
# > nnnnnnnnnn is a 10-digit byte offset
# > ggggg is a 5-digit generation number
# > n is a literal keyword identifying this as an in-use entry
# > eol is a 2-character end-of-line sequence
# >
# > [...]
# >
# > If the file’s end-of-line marker is a single character (either a
# > carriage return or a line feed), it is preceded by a single space;
#
# Since we chose to use a single character eol marker, we precede it by
# a space
pos = len(pdfheader)
xreftable.append(b"0000000000 65535 f \n")
for o in self.objects:
xreftable.append(("%010d 00000 n \n" % pos).encode())
content = o.tostring()
stream.write(content)
pos += len(content)
xrefoffset = pos
stream.write(b"xref\n")
stream.write(("0 %d\n" % len(xreftable)).encode())
for x in xreftable:
stream.write(x)
stream.write(b"trailer\n")
trailer = {b"/Size": len(xreftable), b"/Info": info, b"/Root": self.catalog}
if ident is not None:
md5 = hashlib.md5(ident).hexdigest().encode("ascii")
trailer[b"/ID"] = b"[<%s><%s>]" % (md5, md5)
stream.write(parse(trailer) + b"\n")
stream.write(b"startxref\n")
stream.write(("%d\n" % xrefoffset).encode())
stream.write(b"%%EOF\n")
return
def addpage(self, page):
page[b"/Parent"] = self.pages
self.pagearray.append(page)
self.pages.content[b"/Kids"].append(page)
self.pages.content[b"/Count"] += 1
self.addobj(page)
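# Illustrative sketch only, not part of img2pdf: how tostream() above lays out
# the cross-reference table. Every entry is exactly 20 bytes, the first entry
# is the free-list head, and each following entry records the byte offset at
# which the corresponding object starts. The name _example_xref_table and its
# parameters are hypothetical; object_sizes stands in for the lengths of the
# serialized objects.
def _example_xref_table(object_sizes, header_size):
    entries = [b"0000000000 65535 f \n"]
    pos = header_size
    for size in object_sizes:
        entries.append(("%010d 00000 n \n" % pos).encode("ascii"))
        pos += size
    # pos now points just past the last object, which is where "xref" starts
    return entries, pos


# Usage: with the 15-byte header written for version "1.3" and two objects of
# 100 and 50 bytes, the objects start at offsets 15 and 115 and the xref
# table itself starts at offset 165.
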
class MyPdfString:
@classmethod
def encode(cls, string, hextype=False):
if hextype:
return (
b"< " + b" ".join(("%06x" % c).encode("ascii") for c in string) + b" >"
)
else:
try:
string = string.encode("ascii")
except UnicodeEncodeError:
string = b"\xfe\xff" + string.encode("utf-16-be")
# We should probably escape more here because at least
# ghostscript interprets a carriage return byte (0x0D) as a
# new line byte (0x0A)
# PDF supports: \n, \r, \t, \b and \f
string = string.replace(b"\\", b"\\\\")
string = string.replace(b"(", b"\\(")
string = string.replace(b")", b"\\)")
return b"(" + string + b")"
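# Illustrative sketch only, not part of img2pdf: the literal-string branch of
# MyPdfString.encode() above as a standalone function. Text that is not pure
# ASCII is stored as UTF-16BE with a byte order mark. Backslashes have to be
# escaped first, otherwise the backslashes added for the parentheses would be
# escaped a second time. The name _example_pdf_literal_string is hypothetical.
def _example_pdf_literal_string(text):
    try:
        raw = text.encode("ascii")
    except UnicodeEncodeError:
        raw = b"\xfe\xff" + text.encode("utf-16-be")
    raw = raw.replace(b"\\", b"\\\\")
    raw = raw.replace(b"(", b"\\(").replace(b")", b"\\)")
    return b"(" + raw + b")"


# Usage: _example_pdf_literal_string("a(b)") gives b"(a\\(b\\))".
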
class pdfdoc(object):
def __init__(
self,
engine=Engine.internal,
version="1.3",
title=None,
author=None,
creator=None,
producer=None,
creationdate=None,
moddate=None,
subject=None,
keywords=None,
nodate=False,
panes=None,
initial_page=None,
magnification=None,
page_layout=None,
fit_window=False,
center_window=False,
fullscreen=False,
pdfa=None,
):
if engine is None:
if have_pikepdf:
engine = Engine.pikepdf
elif have_pdfrw:
engine = Engine.pdfrw
else:
engine = Engine.internal
if engine == Engine.pikepdf:
PdfWriter = pikepdf.new
PdfDict = pikepdf.Dictionary
PdfName = pikepdf.Name
elif engine == Engine.pdfrw:
from pdfrw import PdfWriter, PdfDict, PdfName, PdfString
elif engine == Engine.internal:
PdfWriter = MyPdfWriter
PdfDict = MyPdfDict
PdfName = MyPdfName
PdfString = MyPdfString
else:
raise ValueError("unknown engine: %s" % engine)
self.writer = PdfWriter()
if engine != Engine.pikepdf:
self.writer.docinfo = PdfDict(indirect=True)
def datetime_to_pdfdate(dt):
return dt.astimezone(tz=timezone.utc).strftime("%Y%m%d%H%M%SZ")
for k in ["Title", "Author", "Creator", "Producer", "Subject"]:
v = locals()[k.lower()]
if v is None or v == "":
continue
if engine != Engine.pikepdf:
v = PdfString.encode(v)
self.writer.docinfo[getattr(PdfName, k)] = v
now = datetime.now().astimezone()
for k in ["CreationDate", "ModDate"]:
v = locals()[k.lower()]
if v is None and nodate:
continue
if v is None:
v = now
v = ("D:" + datetime_to_pdfdate(v)).encode("ascii")
if engine == Engine.internal:
v = b"(" + v + b")"
self.writer.docinfo[getattr(PdfName, k)] = v
if keywords is not None:
if engine == Engine.pikepdf:
self.writer.docinfo[PdfName.Keywords] = ",".join(keywords)
else:
self.writer.docinfo[PdfName.Keywords] = PdfString.encode(
",".join(keywords)
)
def datetime_to_xmpdate(dt):
return dt.astimezone(tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
self.xmp = b"""
%s
%s
""" % (
b" pdf:Producer='%s'" % producer.encode("ascii")
if producer is not None
else b"",
b""
if creationdate is None and nodate
else b"%s"
% datetime_to_xmpdate(now if creationdate is None else creationdate).encode(
"ascii"
),
b""
if moddate is None and nodate
else b"%s"
% datetime_to_xmpdate(now if moddate is None else moddate).encode("ascii"),
)
if engine != Engine.pikepdf:
# this is done because pdfrw adds info, catalog and pages as the first
# three objects in this order
if engine == Engine.internal:
self.writer.addobj(self.writer.docinfo)
self.writer.addobj(self.writer.catalog)
self.writer.addobj(self.writer.pages)
self.panes = panes
self.initial_page = initial_page
self.magnification = magnification
self.page_layout = page_layout
self.fit_window = fit_window
self.center_window = center_window
self.fullscreen = fullscreen
self.engine = engine
self.output_version = version
self.pdfa = pdfa
def add_imagepage(
self,
color,
imgwidthpx,
imgheightpx,
imgformat,
imgdata,
smaskdata,
imgwidthpdf,
imgheightpdf,
imgxpdf,
imgypdf,
pagewidth,
pageheight,
userunit=None,
palette=None,
inverted=False,
depth=0,
rotate=0,
cropborder=None,
bleedborder=None,
trimborder=None,
artborder=None,
iccp=None,
):
assert (
color not in [Colorspace.RGBA, Colorspace.LA]
or (imgformat == ImageFormat.PNG and smaskdata is not None)
or imgformat == ImageFormat.JPEG2000
)
if self.engine == Engine.pikepdf:
PdfArray = pikepdf.Array
PdfDict = pikepdf.Dictionary
PdfName = pikepdf.Name
elif self.engine == Engine.pdfrw:
from pdfrw import PdfDict, PdfName, PdfObject, PdfString
from pdfrw.py23_diffs import convert_load
elif self.engine == Engine.internal:
PdfDict = MyPdfDict
PdfName = MyPdfName
PdfObject = MyPdfObject
PdfString = MyPdfString
convert_load = my_convert_load
else:
raise ValueError("unknown engine: %s" % self.engine)
TrueObject = True if self.engine == Engine.pikepdf else PdfObject("true")
FalseObject = False if self.engine == Engine.pikepdf else PdfObject("false")
if color == Colorspace["1"] or color == Colorspace.L or color == Colorspace.LA:
colorspace = PdfName.DeviceGray
elif color == Colorspace.RGB or color == Colorspace.RGBA:
if color == Colorspace.RGBA and imgformat == ImageFormat.JPEG2000:
# there is no DeviceRGBA and for JPXDecode it is okay to have
# no colorspace as the pdf reader is supposed to get this info
# from the jpeg2000 payload itself
colorspace = None
else:
colorspace = PdfName.DeviceRGB
elif color == Colorspace.CMYK or color == Colorspace["CMYK;I"]:
colorspace = PdfName.DeviceCMYK
elif color == Colorspace.P:
if self.engine == Engine.pdfrw:
# https://github.com/pmaupin/pdfrw/issues/128
# https://github.com/pmaupin/pdfrw/issues/147
raise Exception(
"pdfrw does not support hex strings for "
"palette image input, re-run with "
"--engine=internal or --engine=pikepdf"
)
assert len(palette) % 3 == 0
colorspace = [
PdfName.Indexed,
PdfName.DeviceRGB,
(len(palette) // 3) - 1,
bytes(palette)
if self.engine == Engine.pikepdf
else PdfString.encode(
[
int.from_bytes(palette[i : i + 3], "big")
for i in range(0, len(palette), 3)
],
hextype=True,
),
]
else:
raise UnsupportedColorspaceError("unsupported color space: %s" % color.name)
if iccp is not None:
if self.engine == Engine.pikepdf:
iccpdict = self.writer.make_stream(iccp)
else:
iccpdict = PdfDict(stream=convert_load(iccp))
iccpdict[PdfName.Alternate] = colorspace
if (
color == Colorspace["1"]
or color == Colorspace.L
or color == Colorspace.LA
):
iccpdict[PdfName.N] = 1
elif color == Colorspace.RGB or color == Colorspace.RGBA:
iccpdict[PdfName.N] = 3
elif color == Colorspace.CMYK or color == Colorspace["CMYK;I"]:
iccpdict[PdfName.N] = 4
elif color == Colorspace.P:
raise Exception("Cannot have Palette images with ICC profile")
colorspace = [PdfName.ICCBased, iccpdict]
# either embed the whole jpeg or deflate the bitmap representation
if imgformat is ImageFormat.JPEG:
ofilter = PdfName.DCTDecode
elif imgformat is ImageFormat.JPEG2000:
ofilter = PdfName.JPXDecode
# jpeg2000 needs pdf 1.5; never lower a version raised by an earlier page
if self.output_version < "1.5":
self.output_version = "1.5"
elif imgformat is ImageFormat.CCITTGroup4:
ofilter = [PdfName.CCITTFaxDecode]
elif imgformat is ImageFormat.JBIG2:
ofilter = PdfName.JBIG2Decode
# JBIG2Decode requires PDF 1.4
if self.output_version < "1.4":
self.output_version = "1.4"
else:
ofilter = PdfName.FlateDecode
if self.engine == Engine.pikepdf:
image = self.writer.make_stream(imgdata)
else:
image = PdfDict(stream=convert_load(imgdata))
image[PdfName.Type] = PdfName.XObject
image[PdfName.Subtype] = PdfName.Image
image[PdfName.Filter] = ofilter
image[PdfName.Width] = imgwidthpx
image[PdfName.Height] = imgheightpx
if colorspace is not None:
image[PdfName.ColorSpace] = colorspace
image[PdfName.BitsPerComponent] = depth
smask = None
if color == Colorspace["CMYK;I"]:
# Inverts all four channels
image[PdfName.Decode] = [1, 0, 1, 0, 1, 0, 1, 0]
if imgformat is ImageFormat.CCITTGroup4:
decodeparms = PdfDict()
# The default for the K parameter is 0 which indicates Group 3 1-D
# encoding. We set it to -1 because we want Group 4 encoding.
decodeparms[PdfName.K] = -1
if inverted:
decodeparms[PdfName.BlackIs1] = FalseObject
else:
decodeparms[PdfName.BlackIs1] = TrueObject
decodeparms[PdfName.Columns] = imgwidthpx
decodeparms[PdfName.Rows] = imgheightpx
image[PdfName.DecodeParms] = [decodeparms]
elif imgformat is ImageFormat.PNG:
if smaskdata is not None:
if self.engine == Engine.pikepdf:
smask = self.writer.make_stream(smaskdata)
else:
smask = PdfDict(stream=convert_load(smaskdata))
smask[PdfName.Type] = PdfName.XObject
smask[PdfName.Subtype] = PdfName.Image
smask[PdfName.Filter] = PdfName.FlateDecode
smask[PdfName.Width] = imgwidthpx
smask[PdfName.Height] = imgheightpx
smask[PdfName.ColorSpace] = PdfName.DeviceGray
smask[PdfName.BitsPerComponent] = depth
decodeparms = PdfDict()
decodeparms[PdfName.Predictor] = 15
decodeparms[PdfName.Colors] = 1
decodeparms[PdfName.Columns] = imgwidthpx
decodeparms[PdfName.BitsPerComponent] = depth
smask[PdfName.DecodeParms] = decodeparms
image[PdfName.SMask] = smask
# /SMask requires PDF 1.4
if self.output_version < "1.4":
self.output_version = "1.4"
decodeparms = PdfDict()
decodeparms[PdfName.Predictor] = 15
if color in [Colorspace.P, Colorspace["1"], Colorspace.L, Colorspace.LA]:
decodeparms[PdfName.Colors] = 1
else:
decodeparms[PdfName.Colors] = 3
decodeparms[PdfName.Columns] = imgwidthpx
decodeparms[PdfName.BitsPerComponent] = depth
image[PdfName.DecodeParms] = decodeparms
text = (
"q\n%0.4f 0 0 %0.4f %0.4f %0.4f cm\n/Im0 Do\nQ"
% (imgwidthpdf, imgheightpdf, imgxpdf, imgypdf)
).encode("ascii")
if self.engine == Engine.pikepdf:
content = self.writer.make_stream(text)
else:
content = PdfDict(stream=convert_load(text))
resources = PdfDict(XObject=PdfDict(Im0=image))
if self.engine == Engine.pikepdf:
page = self.writer.add_blank_page(page_size=(pagewidth, pageheight))
else:
page = PdfDict(indirect=True)
page[PdfName.Type] = PdfName.Page
page[PdfName.MediaBox] = [0, 0, pagewidth, pageheight]
# 14.11.2 Page Boundaries
# ...
# The crop, bleed, trim, and art boxes shall not ordinarily extend
# beyond the boundaries of the media box. If they do, they are
# effectively reduced to their intersection with the media box.
if cropborder is not None:
page[PdfName.CropBox] = [
cropborder[1],
cropborder[0],
pagewidth - cropborder[1],
pageheight - cropborder[0],
]
if bleedborder is None:
if PdfName.CropBox in page:
page[PdfName.BleedBox] = page[PdfName.CropBox]
else:
page[PdfName.BleedBox] = [
bleedborder[1],
bleedborder[0],
pagewidth - bleedborder[1],
pageheight - bleedborder[0],
]
if trimborder is None:
if PdfName.CropBox in page:
page[PdfName.TrimBox] = page[PdfName.CropBox]
else:
page[PdfName.TrimBox] = [
trimborder[1],
trimborder[0],
pagewidth - trimborder[1],
pageheight - trimborder[0],
]
if artborder is None:
if PdfName.CropBox in page:
page[PdfName.ArtBox] = page[PdfName.CropBox]
else:
page[PdfName.ArtBox] = [
artborder[1],
artborder[0],
pagewidth - artborder[1],
pageheight - artborder[0],
]
page[PdfName.Resources] = resources
page[PdfName.Contents] = content
if rotate != 0:
page[PdfName.Rotate] = rotate
if userunit is not None:
# /UserUnit requires PDF 1.6
if self.output_version < "1.6":
self.output_version = "1.6"
page[PdfName.UserUnit] = userunit
if self.engine != Engine.pikepdf:
self.writer.addpage(page)
if self.engine == Engine.internal:
self.writer.addobj(content)
self.writer.addobj(image)
if smask is not None:
self.writer.addobj(smask)
if iccp is not None:
self.writer.addobj(iccpdict)
def tostring(self):
stream = BytesIO()
self.tostream(stream)
return stream.getvalue()
def finalize(self):
if self.engine == Engine.pikepdf:
PdfArray = pikepdf.Array
PdfDict = pikepdf.Dictionary
PdfName = pikepdf.Name
elif self.engine == Engine.pdfrw:
from pdfrw import PdfDict, PdfName, PdfArray, PdfObject
from pdfrw.py23_diffs import convert_load
elif self.engine == Engine.internal:
PdfDict = MyPdfDict
PdfName = MyPdfName
PdfObject = MyPdfObject
PdfArray = MyPdfArray
convert_load = my_convert_load
else:
raise ValueError("unknown engine: %s" % self.engine)
NullObject = None if self.engine == Engine.pikepdf else PdfObject("null")
TrueObject = True if self.engine == Engine.pikepdf else PdfObject("true")
# We fill the catalog with more information like /ViewerPreferences,
# /PageMode, /PageLayout or /OpenAction because the latter refers to a
# page object which has to be present so that we can get its id.
#
# Furthermore, if using pdfrw, the trailer is cleared every time a page
# is added, so we can only start using it after all pages have been
# written.
if self.engine == Engine.pikepdf:
catalog = self.writer.Root
elif self.engine == Engine.pdfrw:
catalog = self.writer.trailer.Root
elif self.engine == Engine.internal:
catalog = self.writer.catalog
else:
raise ValueError("unknown engine: %s" % self.engine)
if (
self.fullscreen
or self.fit_window
or self.center_window
or self.panes is not None
):
catalog[PdfName.ViewerPreferences] = PdfDict()
if self.fullscreen:
# this setting might be overwritten later by the page mode
catalog[PdfName.ViewerPreferences][
PdfName.NonFullScreenPageMode
] = PdfName.UseNone
if self.panes == PageMode.thumbs:
catalog[PdfName.ViewerPreferences][
PdfName.NonFullScreenPageMode
] = PdfName.UseThumbs
# this setting might be overwritten later if fullscreen
catalog[PdfName.PageMode] = PdfName.UseThumbs
elif self.panes == PageMode.outlines:
catalog[PdfName.ViewerPreferences][
PdfName.NonFullScreenPageMode
] = PdfName.UseOutlines
# this setting might be overwritten later if fullscreen
catalog[PdfName.PageMode] = PdfName.UseOutlines
elif self.panes in [PageMode.none, None]:
pass
else:
raise ValueError("unknown page mode: %s" % self.panes)
if self.fit_window:
catalog[PdfName.ViewerPreferences][PdfName.FitWindow] = TrueObject
if self.center_window:
catalog[PdfName.ViewerPreferences][PdfName.CenterWindow] = TrueObject
if self.fullscreen:
catalog[PdfName.PageMode] = PdfName.FullScreen
# see table 8.2 in section 8.2.1 in
# http://partners.adobe.com/public/developer/en/pdf/PDFReference16.pdf
# Fit - Fits the page to the window.
# FitH - Fits the width of the page to the window.
# FitV - Fits the height of the page to the window.
# FitR - Fits the rectangle specified by the four coordinates to the
# window.
# FitB - Fits the page bounding box to the window. This basically
# reduces the amount of whitespace (margins) that is displayed
# and thus focuses more on the text content.
# FitBH - Fits the width of the page bounding box to the window.
# FitBV - Fits the height of the page bounding box to the window.
# by default the initial page is the first one
if self.engine == Engine.pikepdf:
initial_page = self.writer.pages[0]
else:
initial_page = self.writer.pagearray[0]
# we set the open action here to make sure we open on the requested
# initial page but this value might be overwritten by a custom open
# action later while still taking the requested initial page into
# account
if self.initial_page is not None:
if self.engine == Engine.pikepdf:
initial_page = self.writer.pages[self.initial_page - 1]
else:
initial_page = self.writer.pagearray[self.initial_page - 1]
catalog[PdfName.OpenAction] = PdfArray(
[initial_page, PdfName.XYZ, NullObject, NullObject, 0]
)
# The /OpenAction array must contain the page as an indirect object.
# This changed in pikepdf some time after 4.2.0 and on or before 5.0.0.
# Current versions require using .obj, otherwise we get:
# TypeError: Can't convert ObjectHelper (or subclass) to Object
# implicitly. Use .obj to get access the underlying object.
# See https://github.com/pikepdf/pikepdf/issues/313 for details.
if self.engine == Engine.pikepdf:
if isinstance(initial_page, pikepdf.Page):
initial_page = self.writer.make_indirect(initial_page.obj)
else:
initial_page = self.writer.make_indirect(initial_page)
if self.magnification == Magnification.fit:
catalog[PdfName.OpenAction] = PdfArray([initial_page, PdfName.Fit])
elif self.magnification == Magnification.fith:
pagewidth = initial_page[PdfName.MediaBox][2]
catalog[PdfName.OpenAction] = PdfArray(
[initial_page, PdfName.FitH, pagewidth]
)
elif self.magnification == Magnification.fitbh:
# quick hack to determine the image width on the page
imgwidth = float(initial_page[PdfName.Contents].stream.split()[4])
catalog[PdfName.OpenAction] = PdfArray(
[initial_page, PdfName.FitBH, imgwidth]
)
elif isinstance(self.magnification, float):
catalog[PdfName.OpenAction] = PdfArray(
[initial_page, PdfName.XYZ, NullObject, NullObject, self.magnification]
)
elif self.magnification is None:
pass
else:
raise ValueError("unknown magnification: %s" % self.magnification)
if self.page_layout == PageLayout.single:
catalog[PdfName.PageLayout] = PdfName.SinglePage
elif self.page_layout == PageLayout.onecolumn:
catalog[PdfName.PageLayout] = PdfName.OneColumn
elif self.page_layout == PageLayout.twocolumnright:
catalog[PdfName.PageLayout] = PdfName.TwoColumnRight
elif self.page_layout == PageLayout.twocolumnleft:
catalog[PdfName.PageLayout] = PdfName.TwoColumnLeft
elif self.page_layout == PageLayout.twopageright:
catalog[PdfName.PageLayout] = PdfName.TwoPageRight
if self.output_version < "1.5":
self.output_version = "1.5"
elif self.page_layout == PageLayout.twopageleft:
catalog[PdfName.PageLayout] = PdfName.TwoPageLeft
if self.output_version < "1.5":
self.output_version = "1.5"
elif self.page_layout is None:
pass
else:
raise ValueError("unknown page layout: %s" % self.page_layout)
if self.pdfa is not None:
if self.engine == Engine.pikepdf:
metadata = self.writer.make_stream(self.xmp)
else:
metadata = PdfDict(stream=convert_load(self.xmp))
metadata[PdfName.Subtype] = PdfName.XML
metadata[PdfName.Type] = PdfName.Metadata
with open(self.pdfa, "rb") as f:
icc = f.read()
intents = PdfDict()
if self.engine == Engine.pikepdf:
iccstream = self.writer.make_stream(icc)
iccstream.stream_dict.N = 3
else:
iccstream = PdfDict(stream=convert_load(zlib.compress(icc)))
iccstream[PdfName.N] = 3
iccstream[PdfName.Filter] = PdfName.FlateDecode
intents[PdfName.S] = PdfName.GTS_PDFA1
intents[PdfName.Type] = PdfName.OutputIntent
intents[PdfName.OutputConditionIdentifier] = (
b"sRGB" if self.engine == Engine.pikepdf else b"(sRGB)"
)
intents[PdfName.DestOutputProfile] = iccstream
catalog[PdfName.OutputIntents] = PdfArray([intents])
catalog[PdfName.Metadata] = metadata
if self.engine == Engine.internal:
self.writer.addobj(metadata)
self.writer.addobj(iccstream)
def tostream(self, outputstream):
# write out the PDF
# this assumes that finalize() has been invoked beforehand by the caller
if self.engine == Engine.pikepdf:
kwargs = {}
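# compare numerically: as strings, "10.0.0" >= "6.2.0" is False
if tuple(int(x) for x in pikepdf.__version__.split(".")[:2]) >= (6, 2):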
kwargs["deterministic_id"] = True
self.writer.save(
outputstream, min_version=self.output_version, linearize=True, **kwargs
)
elif self.engine == Engine.pdfrw:
from pdfrw import PdfName, PdfArray
self.writer.trailer.Info = self.writer.docinfo
# setting the version attribute of the pdfrw PdfWriter object will
# influence the behaviour of the write() function
self.writer.version = self.output_version
if self.pdfa:
md5 = hashlib.md5(b"").hexdigest().encode("ascii")
self.writer.trailer[PdfName.ID] = PdfArray([md5, md5])
self.writer.write(outputstream)
elif self.engine == Engine.internal:
self.writer.tostream(
self.writer.docinfo,
outputstream,
self.output_version,
None if self.pdfa is None else b"",
)
else:
raise ValueError("unknown engine: %s" % self.engine)
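# Illustrative sketch only, not part of img2pdf: version numbers such as the
# PDF output version are compared as strings in the class above. That works
# as long as every component has a single digit, but string comparison is
# lexicographic, so "10.0.0" sorts before "6.2.0". Converting the leading
# components to integers avoids that. The name _example_version_tuple is
# hypothetical, and it assumes the leading components are plain integers.
def _example_version_tuple(version, parts=2):
    return tuple(int(p) for p in version.split(".")[:parts])


# Usage: _example_version_tuple("10.0.0") >= (6, 2) is True, whereas
# "10.0.0" >= "6.2.0" is False.
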
def pil_get_dpi(imgdata, imgformat, default_dpi):
ndpi = imgdata.info.get("dpi")
if ndpi is None:
# the PNG plugin of PIL adds the undocumented "aspect" field instead of
# the "dpi" field if the PNG pHYs chunk unit is not set to meters
if imgformat == ImageFormat.PNG and imgdata.info.get("aspect") is not None:
aspect = imgdata.info["aspect"]
# make sure not to go below the default dpi
if aspect[0] > aspect[1]:
ndpi = (default_dpi * aspect[0] / aspect[1], default_dpi)
else:
ndpi = (default_dpi, default_dpi * aspect[1] / aspect[0])
else:
ndpi = (default_dpi, default_dpi)
# In python3, the returned dpi value for some tiff images will
# not be an integer but a float. To make the behaviour of
# img2pdf the same between python2 and python3, we convert that
# float into an integer by rounding.
# Search online for the 72.009 dpi problem for more info.
ndpi = (int(round(ndpi[0])), int(round(ndpi[1])))
# Since commit 07a96209597c5e8dfe785c757d7051ce67a980fb or release 4.1.0
# Pillow retrieves the DPI from EXIF if it cannot find the DPI in the JPEG
# header. In that case it can happen that the horizontal and vertical DPI
# are set to zero.
if ndpi == (0, 0):
ndpi = (default_dpi, default_dpi)
# PIL defaults to a dpi of 1 if a TIFF image does not specify the dpi.
# In that case, we want to use a different default.
if ndpi == (1, 1) and imgformat == ImageFormat.TIFF:
ndpi = (
imgdata.tag_v2.get(TiffImagePlugin.X_RESOLUTION, default_dpi),
imgdata.tag_v2.get(TiffImagePlugin.Y_RESOLUTION, default_dpi),
)
return ndpi
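# Illustrative sketch only, not part of img2pdf: the PNG "aspect" fallback of
# pil_get_dpi() above in isolation. The larger side of the pixel aspect ratio
# is scaled up so that neither axis ends up below the default dpi. The name
# _example_dpi_from_aspect and the default of 96.0 are assumptions made for
# this example.
def _example_dpi_from_aspect(aspect, default_dpi=96.0):
    x, y = aspect
    if x > y:
        return (default_dpi * x / y, default_dpi)
    return (default_dpi, default_dpi * y / x)


# Usage: an aspect of (2, 1) gives (192.0, 96.0) and an aspect of (1, 2)
# gives (96.0, 192.0).
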
def get_imgmetadata(
imgdata, imgformat, default_dpi, colorspace, rawdata=None, rotreq=None
):
if imgformat == ImageFormat.JPEG2000 and rawdata is not None and imgdata is None:
# this codepath gets called if the PIL installation is not able to
# handle JPEG2000 files
imgwidthpx, imgheightpx, ics, hdpi, vdpi, channels, bpp = jp2.parse(rawdata)
if hdpi is None:
hdpi = default_dpi
if vdpi is None:
vdpi = default_dpi
ndpi = (hdpi, vdpi)
elif imgformat == ImageFormat.JBIG2:
imgwidthpx, imgheightpx, xres, yres = struct.unpack(">IIII", rawdata[24:40])
INCH_PER_METER = 39.370079
if xres == 0:
hdpi = default_dpi
elif xres < 1000:
# If xres is very small, it's likely accidentally expressed in dpi instead
# of dpm. See e.g. https://github.com/agl/jbig2enc/issues/86
hdpi = xres
else:
hdpi = int(float(xres) / INCH_PER_METER)
if yres == 0:
vdpi = default_dpi
elif yres < 1000:
vdpi = yres
else:
vdpi = int(float(yres) / INCH_PER_METER)
ndpi = (hdpi, vdpi)
ics = "1"
else:
imgwidthpx, imgheightpx = imgdata.size
ndpi = pil_get_dpi(imgdata, imgformat, default_dpi)
ics = imgdata.mode
logger.debug("input dpi = %d x %d", *ndpi)
# PNG, GIF and JPEG2000 files with transparency are supported
if imgformat in [ImageFormat.PNG, ImageFormat.GIF, ImageFormat.JPEG2000] and (
ics in ["RGBA", "LA"]
or (imgdata is not None and "transparency" in imgdata.info)
):
# Must check the IHDR chunk for the bit depth, because PIL would lossily
# convert 16-bit RGBA/LA images to 8-bit.
if imgformat == ImageFormat.PNG and rawdata is not None:
depth = rawdata[24]
if depth > 8:
logger.warning("Image with transparency and a bit depth of %d." % depth)
logger.warning("This is unsupported due to PIL limitations.")
logger.warning(
"If you accept a lossy conversion, you can manually convert "
"your images to 8 bit using `convert -depth 8` from imagemagick"
)
raise AlphaChannelError(
"Refusing to work with multiple >8bit channels."
)
elif ics in ["LA", "PA", "RGBA"] or (
imgdata is not None and "transparency" in imgdata.info
):
raise AlphaChannelError("This function must not be called on images with alpha")
rotation = 0
if rotreq in (None, Rotation.auto, Rotation.ifvalid):
if hasattr(imgdata, "getexif") and imgdata.getexif() is not None:
exif_dict = imgdata.getexif()
o_key = ExifTags.Base.Orientation.value  # 274, i.e. 0x112
if exif_dict and o_key in exif_dict:
# Detailed information on EXIF rotation tags:
# http://impulseadventure.com/photo/exif-orientation.html
value = exif_dict[o_key]
if value == 1:
rotation = 0
elif value == 6:
rotation = 90
elif value == 3:
rotation = 180
elif value == 8:
rotation = 270
elif value in (2, 4, 5, 7):
if rotreq == Rotation.ifvalid:
logger.warning(
"Unsupported flipped rotation mode (%d): use "
"--rotation=ifvalid or "
"rotation=img2pdf.Rotation.ifvalid to ignore",
value,
)
else:
raise ExifOrientationError(
"Unsupported flipped rotation mode (%d): use "
"--rotation=ifvalid or "
"rotation=img2pdf.Rotation.ifvalid to ignore" % value
)
else:
if rotreq == Rotation.ifvalid:
logger.warning("Invalid rotation (%d)", value)
else:
raise ExifOrientationError(
"Invalid rotation (%d): use --rotation=ifvalid "
"or rotation=img2pdf.Rotation.ifvalid to ignore" % value
)
elif hasattr(imgdata, "_getexif") and imgdata._getexif() is not None:
for tag, value in imgdata._getexif().items():
if TAGS.get(tag, tag) == "Orientation":
# Detailed information on EXIF rotation tags:
# http://impulseadventure.com/photo/exif-orientation.html
if value == 1:
rotation = 0
elif value == 6:
rotation = 90
elif value == 3:
rotation = 180
elif value == 8:
rotation = 270
elif value in (2, 4, 5, 7):
if rotreq == Rotation.ifvalid:
logger.warning(
"Unsupported flipped rotation mode (%d): use "
"--rotation=ifvalid or "
"rotation=img2pdf.Rotation.ifvalid to ignore",
value,
)
else:
raise ExifOrientationError(
"Unsupported flipped rotation mode (%d): use "
"--rotation=ifvalid or "
"rotation=img2pdf.Rotation.ifvalid to ignore" % value
)
else:
if rotreq == Rotation.ifvalid:
logger.warning("Invalid rotation (%d)", value)
else:
raise ExifOrientationError(
"Invalid rotation (%d): use --rotation=ifvalid "
"or rotation=img2pdf.Rotation.ifvalid to ignore" % value
)
elif rotreq in (Rotation.none, Rotation["0"]):
rotation = 0
elif rotreq == Rotation["90"]:
rotation = 90
elif rotreq == Rotation["180"]:
rotation = 180
elif rotreq == Rotation["270"]:
rotation = 270
else:
raise Exception("invalid rotreq")
logger.debug("rotation = %d°", rotation)
if colorspace:
color = colorspace
logger.debug("input colorspace (forced) = %s", color)
else:
color = None
for c in Colorspace:
if c.name == ics:
color = c
if color is None:
# PIL does not provide the information about the original
# colorspace for 16bit grayscale PNG images. Thus, we retrieve
# that info manually by looking at byte 10 in the IHDR chunk. We
# know where to find that in the file because the IHDR chunk must
# be the first chunk
if (
rawdata is not None
and imgformat == ImageFormat.PNG
and rawdata[25] == 0
):
color = Colorspace.L
else:
raise ValueError("unknown colorspace")
if color == Colorspace.CMYK and imgformat == ImageFormat.JPEG:
# Adobe inverts CMYK JPEGs for some reason, and others
# have followed suit as well. Some software assumes the
# JPEG is inverted if the Adobe tag (APP14) is present, while
# other software assumes all CMYK JPEGs are inverted. I don't
# have enough experience with these to know which is
# better for images currently in the wild, so I'm going
# with the first approach for now.
if "adobe" in imgdata.info:
color = Colorspace["CMYK;I"]
logger.debug("input colorspace = %s", color.name)
iccp = None
if imgdata is not None and "icc_profile" in imgdata.info:
iccp = imgdata.info.get("icc_profile")
# GIMP saves bilevel TIFF images and palette PNG images with only black and
# white in the palette with an RGB ICC profile which is useless
# https://gitlab.gnome.org/GNOME/gimp/-/issues/3438
# and produces an error in Adobe Acrobat, so we ignore it with a warning.
# imagemagick also used to (wrongly) include an RGB ICC profile for bilevel
# images: https://github.com/ImageMagick/ImageMagick/issues/2070
if iccp is not None and (
(color == Colorspace["1"] and imgformat == ImageFormat.TIFF)
or (
imgformat == ImageFormat.PNG
and color == Colorspace.P
and rawdata is not None
and parse_png(rawdata)[1]
in [b"\x00\x00\x00\xff\xff\xff", b"\xff\xff\xff\x00\x00\x00"]
)
):
with io.BytesIO(iccp) as f:
prf = ImageCms.ImageCmsProfile(f)
if (
prf.profile.model == "sRGB"
and prf.profile.manufacturer == "GIMP"
and prf.profile.profile_description == "GIMP built-in sRGB"
):
if imgformat == ImageFormat.TIFF:
logger.warning(
"Ignoring RGB ICC profile in bilevel TIFF produced by GIMP."
)
elif imgformat == ImageFormat.PNG:
logger.warning(
"Ignoring RGB ICC profile in 2-color palette PNG produced by GIMP."
)
logger.warning("https://gitlab.gnome.org/GNOME/gimp/-/issues/3438")
iccp = None
# Old versions of SmartAlbums (found in 2.2.6) export JPGs with only 1
# component but with an RGB ICC profile, which is useless.
# This produces an error in Adobe Acrobat, so we ignore it with a warning.
# Update: Found another case where the JPG was created by Adobe Photoshop,
# so we don't check the software anymore.
if iccp is not None and (
(color == Colorspace["L"] and imgformat == ImageFormat.JPEG)
):
with io.BytesIO(iccp) as f:
prf = ImageCms.ImageCmsProfile(f)
if prf.profile.xcolor_space not in ("GRAY",):
logger.warning("Ignoring non-GRAY ICC profile in Grayscale JPG")
iccp = None
logger.debug("width x height = %dpx x %dpx", imgwidthpx, imgheightpx)
return (color, ndpi, imgwidthpx, imgheightpx, rotation, iccp)
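# Illustrative sketch only, not part of img2pdf: both EXIF branches of
# get_imgmetadata() above apply the same mapping from the Orientation tag to
# a page rotation in degrees. Expressed as a lookup table it could look like
# this. The names _EXAMPLE_EXIF_TO_ROTATION and _example_rotation_from_exif
# are hypothetical, and ignore_invalid only loosely stands in for
# Rotation.ifvalid (the real code also logs a warning).
_EXAMPLE_EXIF_TO_ROTATION = {1: 0, 6: 90, 3: 180, 8: 270}


def _example_rotation_from_exif(value, ignore_invalid=False):
    if value in _EXAMPLE_EXIF_TO_ROTATION:
        return _EXAMPLE_EXIF_TO_ROTATION[value]
    if ignore_invalid:
        # flipped (2, 4, 5, 7) or out-of-range values: keep the default of 0
        return 0
    raise ValueError("unsupported EXIF orientation: %r" % (value,))


# Usage: _example_rotation_from_exif(6) gives 90, while
# _example_rotation_from_exif(2) raises ValueError unless ignore_invalid=True
# is passed.
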
def ccitt_payload_location_from_pil(img):
# If Pillow is passed an invalid compression argument it will ignore it;
# make sure the image actually got compressed.
if img.info["compression"] != "group4":
raise ValueError(
"Image not compressed with CCITT Group 4 but with: %s"
% img.info["compression"]
)
# Read the TIFF tags to find the offset(s) of the compressed data strips.
strip_offsets = img.tag_v2[TiffImagePlugin.STRIPOFFSETS]
strip_bytes = img.tag_v2[TiffImagePlugin.STRIPBYTECOUNTS]
# PIL always seems to create a single strip even for very large TIFFs when
# it saves images, so assume we only have to read a single strip.
# A test ~10 GPixel image was still encoded as a single strip. Just to be
# safe, throw an error if there is more than one offset.
if len(strip_offsets) != 1 or len(strip_bytes) != 1:
raise NotImplementedError(
"Transcoding multiple strips not supported by the PDF format"
)
(offset,), (length,) = strip_offsets, strip_bytes
logger.debug("TIFF strip_offsets: %d" % offset)
logger.debug("TIFF strip_bytes: %d" % length)
return offset, length
def transcode_monochrome(imgdata):
"""Convert the open PIL.Image imgdata to compressed CCITT Group4 data"""
logger.debug("Converting monochrome to CCITT Group4")
# Convert the image to Group 4 in memory. If libtiff is not installed and
# Pillow is not compiled against it, .save() will raise an exception.
newimgio = BytesIO()
# We create a whole new PIL image because otherwise, with some input
# images, libtiff fails an assert and the whole process is killed by a
# SIGABRT:
# https://gitlab.mister-muffin.de/josch/img2pdf/issues/46
im = Image.frombytes(imgdata.mode, imgdata.size, imgdata.tobytes())
# Since version 8.3.0 Pillow limits strips to 64 KB. Since PDF only
# supports single strip CCITT Group4 payloads, we have to coerce it back
# into putting everything into a single strip. Thanks to Andrew Murray for
# the hack.
#
# Since version 8.4.0 Pillow allows us to modify the strip size explicitly
tmp_strip_size = (imgdata.size[0] + 7) // 8 * imgdata.size[1]
if hasattr(TiffImagePlugin, "STRIP_SIZE"):
# we are using Pillow 8.4.0 or later
with temp_attr(TiffImagePlugin, "STRIP_SIZE", tmp_strip_size):
im.save(newimgio, format="TIFF", compression="group4")
else:
# only needed for Pillow 8.3.x but works for versions before that as
# well
pillow__getitem__ = TiffImagePlugin.ImageFileDirectory_v2.__getitem__
def __getitem__(self, tag):
overrides = {
TiffImagePlugin.ROWSPERSTRIP: imgdata.size[1],
TiffImagePlugin.STRIPBYTECOUNTS: [tmp_strip_size],
TiffImagePlugin.STRIPOFFSETS: [0],
}
return overrides.get(tag, pillow__getitem__(self, tag))
with temp_attr(
TiffImagePlugin.ImageFileDirectory_v2, "__getitem__", __getitem__
):
im.save(newimgio, format="TIFF", compression="group4")
# Open new image in memory
newimgio.seek(0)
newimg = Image.open(newimgio)
offset, length = ccitt_payload_location_from_pil(newimg)
newimgio.seek(offset)
return newimgio.read(length)
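# Illustrative sketch, not part of img2pdf's API: the single-strip size used
# in transcode_monochrome() above appears to be the size of the uncompressed
# bilevel bitmap. Each row is padded to a whole number of bytes, so a row of
# `width` pixels needs (width + 7) // 8 bytes. The helper name below is
# hypothetical and only serves to show the arithmetic.
def _example_single_strip_size(width, height):
    bytes_per_row = (width + 7) // 8  # round up to full bytes per row
    return bytes_per_row * height


# Usage: a 10 px wide row needs 2 bytes, so _example_single_strip_size(10, 3)
# gives 6, while a width of 16 needs no padding and a 16x2 image gives 4.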
def parse_png(rawdata):
pngidat = b""
palette = b""
i = 16
while i < len(rawdata):
# once we can require Python >= 3.2 we can use int.from_bytes() instead
(n,) = struct.unpack(">I", rawdata[i - 8 : i - 4])
if i + n > len(rawdata):
raise Exception("invalid png: %d %d %d" % (i, n, len(rawdata)))
if rawdata[i - 4 : i] == b"IDAT":
pngidat += rawdata[i : i + n]
elif rawdata[i - 4 : i] == b"PLTE":
palette += rawdata[i : i + n]
i += n
i += 12
return pngidat, palette
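# A minimal, self-contained sketch of the chunk walk that parse_png() performs
# above, assuming a well-formed PNG: an 8-byte signature followed by chunks of
# the form <4-byte big-endian length><4-byte type><data><4-byte CRC>.
# The helper name is hypothetical and CRCs are not verified here.
import struct


def _example_iter_png_chunks(rawdata):
    pos = 8  # skip the PNG signature
    while pos + 8 <= len(rawdata):
        (length,) = struct.unpack(">I", rawdata[pos : pos + 4])
        ctype = rawdata[pos + 4 : pos + 8]
        yield ctype, rawdata[pos + 8 : pos + 8 + length]
        pos += 12 + length  # length field + type + data + CRC


# Usage: concatenating the data of all chunks whose type is b"IDAT" yields the
# same payload that parse_png() collects in pngidat.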
miff_re = re.compile(
r"""
[^\x00-\x20\x7f-\x9f] # the field name must not start with a control char or space
[^=]+ # the field name can even contain spaces
= # field name and value are separated by an equal sign
(?:
[^\x00-\x20\x7f-\x9f{}] # either chars that are not braces and not control chars
|{[^}]*} # or any kind of char surrounded by braces
)+""",
re.VERBOSE,
)
# https://imagemagick.org/script/miff.php
# turn off black formatting until python 3.10 is available on more platforms
# and we can use match/case
# fmt: off
def parse_miff(data):
results = []
header, rest = data.split(b":\x1a", 1)
header = header.decode("ISO-8859-1")
assert header.lower().startswith("id=imagemagick")
hdata = {}
for i, line in enumerate(re.findall(miff_re, header)):
if not line:
continue
k, v = line.split("=", 1)
if i == 0:
assert k.lower() == "id"
assert v.lower() == "imagemagick"
#match k.lower():
# case "class":
if k.lower() == "class":
#match v:
# case "DirectClass" | "PseudoClass":
if v in ["DirectClass", "PseudoClass"]:
hdata["class"] = v
# case _:
else:
print("cannot understand class", v)
# case "colorspace":
elif k.lower() == "colorspace":
# theoretically RGBA and CMYKA should be supported as well
# please teach me how to create such a MIFF file
#match v:
# case "sRGB" | "CMYK" | "Gray":
if v in ["sRGB", "CMYK", "Gray"]:
hdata["colorspace"] = v
# case _:
else:
print("cannot understand colorspace", v)
# case "depth":
elif k.lower() == "depth":
#match v:
# case "8" | "16" | "32":
if v in ["8", "16", "32"]:
hdata["depth"] = int(v)
# case _:
else:
print("cannot understand depth", v)
# case "colors":
elif k.lower() == "colors":
hdata["colors"] = int(v)
# case "matte":
elif k.lower() == "matte":
#match v:
# case "True":
if v == "True":
hdata["matte"] = True
# case "False":
elif v == "False":
hdata["matte"] = False
# case _:
else:
print("cannot understand matte", v)
# case "columns" | "rows":
elif k.lower() in ["columns", "rows"]:
hdata[k.lower()] = int(v)
# case "compression":
elif k.lower() == "compression":
print("compression not yet supported")
# case "profile":
elif k.lower() == "profile":
assert v in ["icc", "exif"]
hdata["profile"] = v
# case "resolution":
elif k.lower() == "resolution":
dpix, dpiy = v.split("x", 1)
hdata["resolution"] = (float(dpix), float(dpiy))
assert "depth" in hdata
assert "columns" in hdata
assert "rows" in hdata
#match hdata["class"]:
# case "DirectClass":
if hdata["class"] == "DirectClass":
if "colors" in hdata:
assert hdata["colors"] == 0
#match hdata["colorspace"]:
# case "sRGB":
if hdata["colorspace"] == "sRGB":
numchannels = 3
colorspace = Colorspace.RGB
# case "CMYK":
elif hdata["colorspace"] == "CMYK":
numchannels = 4
colorspace = Colorspace.CMYK
# case "Gray":
elif hdata["colorspace"] == "Gray":
numchannels = 1
colorspace = Colorspace.L
if hdata.get("matte"):
numchannels += 1
if hdata.get("profile"):
# there is no key encoding the length of icc or exif data
# according to the docs, the profile-icc key is supposed to do this
print("FAIL: exif")
else:
lenimgdata = (
hdata["depth"] // 8 * numchannels * hdata["columns"] * hdata["rows"]
)
assert len(rest) >= lenimgdata, (
len(rest),
hdata["depth"],
numchannels,
hdata["columns"],
hdata["rows"],
lenimgdata,
)
if colorspace == Colorspace.RGB and hdata["depth"] == 8:
newimg = Image.frombytes("RGB", (hdata["columns"], hdata["rows"]), rest[:lenimgdata])
imgdata, palette, depth = to_png_data(newimg)
assert palette == b""
assert depth == hdata["depth"]
imgfmt = ImageFormat.PNG
else:
imgdata = zlib.compress(rest[:lenimgdata])
imgfmt = ImageFormat.MIFF
results.append(
(
colorspace,
hdata.get("resolution") or (default_dpi, default_dpi),
imgfmt,
imgdata,
None, # smask
hdata["columns"],
hdata["rows"],
[], # palette
False, # inverted
hdata["depth"],
0, # rotation
None, # icc profile
)
)
if len(rest) > lenimgdata:
# another image is here
assert rest[lenimgdata:][:14].lower() == b"id=imagemagick"
results.extend(parse_miff(rest[lenimgdata:]))
# case "PseudoClass":
elif hdata["class"] == "PseudoClass":
assert "colors" in hdata
if hdata.get("matte"):
numchannels = 2
else:
numchannels = 1
lenpal = 3 * hdata["colors"] * hdata["depth"] // 8
lenimgdata = numchannels * hdata["rows"] * hdata["columns"]
assert len(rest) >= lenpal + lenimgdata, (len(rest), lenpal, lenimgdata)
results.append(
(
Colorspace.RGB,
hdata.get("resolution") or (default_dpi, default_dpi),
ImageFormat.MIFF,
zlib.compress(rest[lenpal : lenpal + lenimgdata]),
None, # FIXME: allow alpha channel smask
hdata["columns"],
hdata["rows"],
rest[:lenpal], # palette
False, # inverted
hdata["depth"],
0, # rotation
None, # icc profile
)
)
if len(rest) > lenpal + lenimgdata:
# another image is here
assert rest[lenpal + lenimgdata :][:14].lower() == b"id=imagemagick", (
len(rest),
lenpal,
lenimgdata,
)
results.extend(parse_miff(rest[lenpal + lenimgdata :]))
return results
# fmt: on
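# Illustrative sketch of the framing that parse_miff() relies on: a text
# header of key=value fields, terminated by ":\x1a", followed by raw pixel
# data whose length follows from depth, channel count, columns and rows.
# This simplified version assumes a DirectClass image without brace-quoted
# values or embedded profiles; the helper name is hypothetical.
def _example_miff_payload(data, numchannels):
    header, rest = data.split(b":\x1a", 1)
    fields = {}
    for token in header.decode("ISO-8859-1").split():
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key.lower()] = value
    length = (
        int(fields["depth"]) // 8
        * numchannels
        * int(fields["columns"])
        * int(fields["rows"])
    )
    return fields, rest[:length]


# Usage: for an 8-bit sRGB image with columns=2 and rows=1, the payload is
# 1 * 3 * 2 * 1 = 6 bytes long; anything after it would be the next image.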
def read_images(
rawdata, colorspace, first_frame_only=False, rot=None, include_thumbnails=False
):
im = BytesIO(rawdata)
im.seek(0)
imgdata = None
try:
imgdata = Image.open(im)
except IOError as e:
# test if it is a jpeg2000 image
if rawdata[:12] == b"\x00\x00\x00\x0C\x6A\x50\x20\x20\x0D\x0A\x87\x0A":
# image is jpeg2000
imgformat = ImageFormat.JPEG2000
elif rawdata[:8] == b"\x97\x4a\x42\x32\x0d\x0a\x1a\x0a":
# For now we only support single-page generic coding of JBIG2, for example as generated by
# https://github.com/agl/jbig2enc
#
# In fact, you can pipe an example image like `src/tests/input/mono.png` directly into img2pdf:
# jbig2 src/tests/input/mono.png | img2pdf -o src/tests/output/mono.png.pdf
#
# For this we assume that the first 13 bytes are the JBIG2 file header describing a document with one page,
# followed by a "page information" segment describing the dimensions of that page.
#
# The following annotated `hexdump -C 042.jb2` shows the first 40 bytes that we inspect directly.
# The first 24 bytes (until "||") have to match exactly, while the following 16 bytes are read by get_imgmetadata.
#
# 97 4a 42 32 0d 0a 1a 0a 01 00 00 00 01 00 00 00
# \_____________________/ | \_________/ \______
# magic-bytes org/unk pages seg-num
#
# 00 30 00 01 00 00 00 13 || 00 00 00 73 00 00 00 30
# _/ | | | \_________/ || \_________/ \_________/
# type refs page seg-size || width-px height-px
#
# 00 00 00 48 00 00 00 48
# \_________/ \_________/
# xres yres
#
# For more information on the data format, see:
# * https://github.com/agl/jbig2enc/blob/ea05019/fcd14492.pdf
# For more information about the generic coding, see:
# * https://github.com/agl/jbig2enc/blob/ea05019/src/jbig2enc.cc#L898
imgformat = ImageFormat.JBIG2
if (
rawdata[:24]
!= b"\x97\x4a\x42\x32\x0d\x0a\x1a\x0a\x01\x00\x00\x00\x01\x00\x00\x00\x00\x30\x00\x01\x00\x00\x00\x13"
):
raise ImageOpenError(
"Unsupported JBIG2 format; only single-page generic coding is supported (e.g. from `jbig2enc`)."
)
if (
rawdata[-22:]
!= b"\x00\x00\x00\x021\x00\x01\x00\x00\x00\x00\x00\x00\x00\x033\x00\x01\x00\x00\x00\x00"
):
raise ImageOpenError(
"Unsupported JBIG2 format; we expect end-of-page and end-of-file segments at the end (e.g. from `jbig2enc`)."
)
elif rawdata[:14].lower() == b"id=imagemagick":
# image is in MIFF format
# this is useful for 16 bit CMYK because PNG cannot do CMYK and thus
# we need PIL but PIL cannot do 16 bit
imgformat = ImageFormat.MIFF
else:
raise ImageOpenError(
"cannot read input image (not jpeg2000, jbig2 or miff). "
"PIL: error reading image: %s" % e
)
else:
logger.debug("PIL format = %s", imgdata.format)
imgformat = getattr(ImageFormat, imgdata.format, ImageFormat.other)
def cleanup():
if imgdata is not None:
# the python-pil version 2.3.0-1ubuntu3 in Ubuntu does not have the
# close() method
try:
imgdata.close()
except AttributeError:
pass
im.close()
logger.debug("imgformat = %s", imgformat.name)
# depending on the input format, determine whether to pass the raw
# image or the zlib compressed color information
# JPEG and JPEG2000 can be embedded into the PDF as-is
if imgformat == ImageFormat.JPEG or imgformat == ImageFormat.JPEG2000:
color, ndpi, imgwidthpx, imgheightpx, rotation, iccp = get_imgmetadata(
imgdata, imgformat, default_dpi, colorspace, rawdata, rot
)
if color == Colorspace["1"]:
raise JpegColorspaceError("jpeg can't be monochrome")
if color == Colorspace["P"]:
raise JpegColorspaceError("jpeg can't have a color palette")
if color == Colorspace["RGBA"] and imgformat != ImageFormat.JPEG2000:
raise JpegColorspaceError("jpeg can't have an alpha channel")
logger.debug("read_images() embeds a JPEG")
cleanup()
depth = 8
if imgformat == ImageFormat.JPEG2000:
*_, depth = jp2.parse(rawdata)
return [
(
color,
ndpi,
imgformat,
rawdata,
None,
imgwidthpx,
imgheightpx,
[],
False,
depth,
rotation,
iccp,
)
]
# The MPO format is multiple JPEG images concatenated together.
# We use the offset and size information to dissect the MPO into its
# individual JPEG images and then embed those into the PDF individually.
#
# The downside is that this truncates the first JPEG as the MPO metadata
# will still be in it but the referenced images are chopped off. We still
# do it that way instead of adding the full MPO as the first image to not
# store duplicate image data.
if imgformat == ImageFormat.MPO:
result = []
img_page_count = 0
assert len(imgdata._MpoImageFile__mpoffsets) == len(imgdata.mpinfo[0xB002])
num_frames = len(imgdata.mpinfo[0xB002])
# An MPO file can be a main image together with one or more thumbnails.
# If that is the case, then we only include all frames if the
# --include-thumbnails option is given. If it is not, such an MPO file
# will be embedded as is, so including its thumbnails but showing up
# as a single image page in the resulting PDF.
num_main_frames = 0
num_thumbnail_frames = 0
for i, mpent in enumerate(imgdata.mpinfo[0xB002]):
# check only the first frame for being the main image
if (
i == 0
and mpent["Attribute"]["DependentParentImageFlag"]
and not mpent["Attribute"]["DependentChildImageFlag"]
and mpent["Attribute"]["RepresentativeImageFlag"]
and mpent["Attribute"]["MPType"] == "Baseline MP Primary Image"
):
num_main_frames += 1
elif (
not mpent["Attribute"]["DependentParentImageFlag"]
and mpent["Attribute"]["DependentChildImageFlag"]
and not mpent["Attribute"]["RepresentativeImageFlag"]
and mpent["Attribute"]["MPType"]
in [
"Large Thumbnail (VGA Equivalent)",
"Large Thumbnail (Full HD Equivalent)",
]
):
num_thumbnail_frames += 1
logger.debug(f"number of frames: {num_frames}")
logger.debug(f"number of main frames: {num_main_frames}")
logger.debug(f"number of thumbnail frames: {num_thumbnail_frames}")
# this MPO file is a main image plus zero or more thumbnails
# embed as-is unless the --include-thumbnails option was given
if num_frames == 1 or (
not include_thumbnails
and num_main_frames == 1
and num_thumbnail_frames + 1 == num_frames
):
color, ndpi, imgwidthpx, imgheightpx, rotation, iccp = get_imgmetadata(
imgdata, imgformat, default_dpi, colorspace, rawdata, rot
)
if color == Colorspace["1"]:
raise JpegColorspaceError("jpeg can't be monochrome")
if color == Colorspace["P"]:
raise JpegColorspaceError("jpeg can't have a color palette")
if color == Colorspace["RGBA"]:
raise JpegColorspaceError("jpeg can't have an alpha channel")
logger.debug("read_images() embeds an MPO verbatim")
cleanup()
return [
(
color,
ndpi,
ImageFormat.JPEG,
rawdata,
None,
imgwidthpx,
imgheightpx,
[],
False,
8,
rotation,
iccp,
)
]
# If the control flow reaches here, the MPO has more than a single
# frame but was not detected to be a main image followed by multiple
# thumbnails. We thus treat this MPO as we do other multi-frame images
# and include all its frames as individual pages.
for offset, mpent in zip(
imgdata._MpoImageFile__mpoffsets, imgdata.mpinfo[0xB002]
):
if first_frame_only and img_page_count > 0:
break
with BytesIO(rawdata[offset : offset + mpent["Size"]]) as rawframe:
with Image.open(rawframe) as imframe:
# The first frame contains the data that makes the JPEG an MPO.
# Could we thus embed an MPO into another MPO? Let's not support
# such madness ;)
if img_page_count > 0 and imframe.format != "JPEG":
raise Exception("MPO payload must be a JPEG: %s" % imframe.format)
(
color,
ndpi,
imgwidthpx,
imgheightpx,
rotation,
iccp,
) = get_imgmetadata(
imframe, ImageFormat.JPEG, default_dpi, colorspace, rotreq=rot
)
if color == Colorspace["1"]:
raise JpegColorspaceError("jpeg can't be monochrome")
if color == Colorspace["P"]:
raise JpegColorspaceError("jpeg can't have a color palette")
if color == Colorspace["RGBA"]:
raise JpegColorspaceError("jpeg can't have an alpha channel")
logger.debug("read_images() embeds a JPEG from MPO")
result.append(
(
color,
ndpi,
ImageFormat.JPEG,
rawdata[offset : offset + mpent["Size"]],
None,
imgwidthpx,
imgheightpx,
[],
False,
8,
rotation,
iccp,
)
)
img_page_count += 1
cleanup()
return result
# We can directly embed the IDAT chunk of PNG images if the PNG is not
# interlaced
#
# PIL does not provide the information whether a PNG was stored interlaced
# or not. Thus, we retrieve that info manually by looking at byte 13 in the
# IHDR chunk. We know where to find that in the file because the IHDR chunk
# must be the first chunk.
if imgformat == ImageFormat.PNG and rawdata[28] == 0:
color, ndpi, imgwidthpx, imgheightpx, rotation, iccp = get_imgmetadata(
imgdata, imgformat, default_dpi, colorspace, rawdata, rot
)
if (
color != Colorspace.RGBA
and color != Colorspace.LA
and color != Colorspace.PA
and "transparency" not in imgdata.info
):
pngidat, palette = parse_png(rawdata)
# PIL does not provide the information about the original bits per
# sample. Thus, we retrieve that info manually by looking at byte 9 in
# the IHDR chunk. We know where to find that in the file because the
# IHDR chunk must be the first chunk
depth = rawdata[24]
if depth not in [1, 2, 4, 8, 16]:
raise ValueError("invalid bit depth: %d" % depth)
# we embed the PNG only if it is not at the same time palette based
# and has an icc profile because PDF doesn't support icc profiles
# on palette images
if palette == b"" or iccp is None:
logger.debug("read_images() embeds a PNG")
cleanup()
return [
(
color,
ndpi,
imgformat,
pngidat,
None,
imgwidthpx,
imgheightpx,
palette,
False,
depth,
rotation,
iccp,
)
]
if imgformat == ImageFormat.JBIG2:
color, ndpi, imgwidthpx, imgheightpx, rotation, iccp = get_imgmetadata(
imgdata, imgformat, default_dpi, colorspace, rawdata, rot
)
streamdata = rawdata[13:-22] # Strip file header and footer
return [
(
color,
ndpi,
imgformat,
streamdata,
None,
imgwidthpx,
imgheightpx,
[],
False,
1,
rotation,
iccp,
)
]
if imgformat == ImageFormat.MIFF:
return parse_miff(rawdata)
# If our input is not JPEG or PNG, then we might have a format that
# supports multiple frames (like TIFF or GIF), so we need a loop to
# iterate through all frames of the image.
#
# Each frame gets compressed using PNG compression *except* if:
#
# * The image is monochrome => encode using CCITT group 4
#
# * The image is CMYK => flate-compress the raw CMYK data
#
# * We are handling a CCITT encoded TIFF frame => embed data
result = []
img_page_count = 0
# loop through all frames of the image (example: multipage TIFF)
while True:
try:
imgdata.seek(img_page_count)
except EOFError:
break
if first_frame_only and img_page_count > 0:
break
# PIL is unable to preserve the data of 16-bit RGB TIFF files and will
# convert it to 8-bit without the possibility to retrieve the original
# data
# https://github.com/python-pillow/Pillow/issues/1888
#
# Some tiff images do not have BITSPERSAMPLE set. Use this to create
# such a tiff: tiffset -u 258 test.tif
if (
imgformat == ImageFormat.TIFF
and max(imgdata.tag_v2.get(TiffImagePlugin.BITSPERSAMPLE, [1])) > 8
):
raise ValueError("PIL is unable to preserve more than 8 bits per sample")
# We can directly copy the data out of a CCITT Group 4 encoded TIFF, if it
# only contains a single strip
if (
imgformat == ImageFormat.TIFF
and imgdata.info["compression"] == "group4"
and len(imgdata.tag_v2[TiffImagePlugin.STRIPOFFSETS]) == 1
and len(imgdata.tag_v2[TiffImagePlugin.STRIPBYTECOUNTS]) == 1
):
photo = imgdata.tag_v2[TiffImagePlugin.PHOTOMETRIC_INTERPRETATION]
inverted = False
if photo == 0:
inverted = True
elif photo != 1:
raise ValueError(
"unsupported photometric interpretation for "
"group4 tiff: %d" % photo
)
color, ndpi, imgwidthpx, imgheightpx, rotation, iccp = get_imgmetadata(
imgdata, imgformat, default_dpi, colorspace, rawdata, rot
)
offset, length = ccitt_payload_location_from_pil(imgdata)
im.seek(offset)
rawdata = im.read(length)
fillorder = imgdata.tag_v2.get(TiffImagePlugin.FILLORDER)
if fillorder is None:
# no FillOrder: nothing to do
pass
elif fillorder == 1:
# msb-to-lsb: nothing to do
pass
elif fillorder == 2:
logger.debug("fillorder is lsb-to-msb => reverse bits")
# lsb-to-msb: reverse bits of each byte
rawdata = bytearray(rawdata)
for i in range(len(rawdata)):
rawdata[i] = TIFFBitRevTable[rawdata[i]]
rawdata = bytes(rawdata)
else:
raise ValueError("unsupported FillOrder: %d" % fillorder)
logger.debug("read_images() embeds Group4 from TIFF")
result.append(
(
color,
ndpi,
ImageFormat.CCITTGroup4,
rawdata,
None,
imgwidthpx,
imgheightpx,
[],
inverted,
1,
rotation,
iccp,
)
)
img_page_count += 1
continue
logger.debug("Converting frame: %d" % img_page_count)
color, ndpi, imgwidthpx, imgheightpx, rotation, iccp = get_imgmetadata(
imgdata, imgformat, default_dpi, colorspace, rotreq=rot
)
newimg = None
if color == Colorspace["1"]:
try:
ccittdata = transcode_monochrome(imgdata)
logger.debug("read_images() encoded a B/W image as CCITT group 4")
result.append(
(
color,
ndpi,
ImageFormat.CCITTGroup4,
ccittdata,
None,
imgwidthpx,
imgheightpx,
[],
False,
1,
rotation,
iccp,
)
)
img_page_count += 1
continue
except Exception as e:
logger.debug(e)
logger.debug("Converting colorspace 1 to L")
newimg = imgdata.convert("L")
color = Colorspace.L
elif color in [
Colorspace.RGB,
Colorspace.RGBA,
Colorspace.L,
Colorspace.LA,
Colorspace.CMYK,
Colorspace["CMYK;I"],
Colorspace.P,
]:
logger.debug("Colorspace is OK: %s", color)
newimg = imgdata
else:
raise ValueError("unknown or unsupported colorspace: %s" % color.name)
# the PNG format does not support CMYK, so we fall back to normal
# compression
if color in [Colorspace.CMYK, Colorspace["CMYK;I"]]:
imggz = zlib.compress(newimg.tobytes())
logger.debug("read_images() encoded CMYK with flate compression")
result.append(
(
color,
ndpi,
imgformat,
imggz,
None,
imgwidthpx,
imgheightpx,
[],
False,
8,
rotation,
iccp,
)
)
else:
if color in [Colorspace.P, Colorspace.PA] and iccp is not None:
# PDF does not support palette images with icc profile
if color == Colorspace.P:
newcolor = Colorspace.RGB
newimg = newimg.convert(mode="RGB")
elif color == Colorspace.PA:
newcolor = Colorspace.RGBA
newimg = newimg.convert(mode="RGBA")
smaskidat = None
elif (
color == Colorspace.RGBA
or color == Colorspace.LA
or color == Colorspace.PA
or "transparency" in newimg.info
):
if color == Colorspace.RGBA:
newcolor = color
r, g, b, a = newimg.split()
newimg = Image.merge("RGB", (r, g, b))
elif color == Colorspace.LA:
newcolor = color
l, a = newimg.split()
newimg = l
elif color == Colorspace.PA or (
color == Colorspace.P and "transparency" in newimg.info
):
newcolor = color
a = newimg.convert(mode="RGBA").split()[-1]
else:
newcolor = Colorspace.RGBA
r, g, b, a = newimg.convert(mode="RGBA").split()
newimg = Image.merge("RGB", (r, g, b))
smaskidat, *_ = to_png_data(a)
logger.warning(
"Image contains an alpha channel. Computing a separate "
"soft mask (/SMask) image to store transparency in PDF."
)
else:
newcolor = color
smaskidat = None
pngidat, palette, depth = to_png_data(newimg)
logger.debug("read_images() encoded an image as PNG")
result.append(
(
newcolor,
ndpi,
ImageFormat.PNG,
pngidat,
smaskidat,
imgwidthpx,
imgheightpx,
palette,
False,
depth,
rotation,
iccp,
)
)
img_page_count += 1
cleanup()
return result
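# Illustrative sketch of the FillOrder == 2 handling in read_images() above:
# when a TIFF stores its bits lsb-to-msb, the bit order of every byte is
# reversed before the CCITT data is embedded. img2pdf looks each byte up in
# TIFFBitRevTable; the table below is built on the fly for demonstration and
# both names are hypothetical.
_EXAMPLE_BITREV = bytes(int(format(i, "08b")[::-1], 2) for i in range(256))


def _example_reverse_bits(data):
    # bytes.translate() maps every byte through the 256-entry table
    return data.translate(_EXAMPLE_BITREV)


# Usage: 0x01 (00000001) becomes 0x80 (10000000), and reversing twice
# restores the original data.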
def to_png_data(img):
# A cheap way to retrieve a PNG encoding of the payload is to just save
# it with PIL. In the future this could be replaced by a dedicated
# function applying the Paeth PNG filter to the raw pixel data.
pngbuffer = BytesIO()
img.save(pngbuffer, format="png")
pngidat, palette = parse_png(pngbuffer.getvalue())
# PIL does not provide the information about the original bits per
# sample. Thus, we retrieve that info manually by looking at byte 9 in
# the IHDR chunk. We know where to find that in the file because the
# IHDR chunk must be the first chunk
pngbuffer.seek(24)
depth = ord(pngbuffer.read(1))
if depth not in [1, 2, 4, 8, 16]:
raise ValueError("invalid bit depth: %d" % depth)
return pngidat, palette, depth
# converts a length in pixels to a length in PDF units (1/72 of an inch)
def px_to_pt(length, dpi):
return 72.0 * length / dpi
def cm_to_pt(length):
return (72.0 * length) / 2.54
def mm_to_pt(length):
return (72.0 * length) / 25.4
def in_to_pt(length):
return 72.0 * length
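# Worked example for the unit conversions above (a sketch; the helper name is
# hypothetical): one PDF unit is 1/72 inch and one inch is 25.4 mm, so an A4
# page of 210 mm x 297 mm is roughly 595.28 x 841.89 PDF units.
def _example_a4_in_pt():
    width = (72.0 * 210) / 25.4
    height = (72.0 * 297) / 25.4
    return round(width, 2), round(height, 2)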
def get_layout_fun(
pagesize=None, imgsize=None, border=None, fit=None, auto_orient=False
):
def fitfun(fit, imgwidth, imgheight, fitwidth, fitheight):
if fitwidth is None and fitheight is None:
raise ValueError("fitwidth and fitheight cannot both be None")
# if fit is fill or enlarge then it is okay if one of the dimensions
# is negative but one of them must still be positive
# if fit is not fill or enlarge then both dimensions must be positive
if (
fit in [FitMode.fill, FitMode.enlarge]
and fitwidth is not None
and fitwidth < 0
and fitheight is not None
and fitheight < 0
):
raise ValueError(
"cannot fit into a rectangle where both dimensions are negative"
)
elif fit not in [FitMode.fill, FitMode.enlarge] and (
(fitwidth is not None and fitwidth < 0)
or (fitheight is not None and fitheight < 0)
):
raise Exception(
"cannot fit into a rectangle where either dimension is negative"
)
def default():
if fitwidth is not None and fitheight is not None:
newimgwidth = fitwidth
newimgheight = (newimgwidth * imgheight) / imgwidth
if newimgheight > fitheight:
newimgheight = fitheight
newimgwidth = (newimgheight * imgwidth) / imgheight
elif fitwidth is None and fitheight is not None:
newimgheight = fitheight
newimgwidth = (newimgheight * imgwidth) / imgheight
elif fitheight is None and fitwidth is not None:
newimgwidth = fitwidth
newimgheight = (newimgwidth * imgheight) / imgwidth
else:
raise ValueError("fitwidth and fitheight cannot both be None")
return newimgwidth, newimgheight
if fit is None or fit == FitMode.into:
return default()
elif fit == FitMode.fill:
if fitwidth is not None and fitheight is not None:
newimgwidth = fitwidth
newimgheight = (newimgwidth * imgheight) / imgwidth
if newimgheight < fitheight:
newimgheight = fitheight
newimgwidth = (newimgheight * imgwidth) / imgheight
elif fitwidth is None and fitheight is not None:
newimgheight = fitheight
newimgwidth = (newimgheight * imgwidth) / imgheight
elif fitheight is None and fitwidth is not None:
newimgwidth = fitwidth
newimgheight = (newimgwidth * imgheight) / imgwidth
else:
raise ValueError("fitwidth and fitheight cannot both be None")
return newimgwidth, newimgheight
elif fit == FitMode.exact:
if fitwidth is not None and fitheight is not None:
return fitwidth, fitheight
elif fitwidth is None and fitheight is not None:
newimgheight = fitheight
newimgwidth = (newimgheight * imgwidth) / imgheight
elif fitheight is None and fitwidth is not None:
newimgwidth = fitwidth
newimgheight = (newimgwidth * imgheight) / imgwidth
else:
raise ValueError("fitwidth and fitheight cannot both be None")
return newimgwidth, newimgheight
elif fit == FitMode.shrink:
if fitwidth is not None and fitheight is not None:
if imgwidth <= fitwidth and imgheight <= fitheight:
return imgwidth, imgheight
elif fitwidth is None and fitheight is not None:
if imgheight <= fitheight:
return imgwidth, imgheight
elif fitheight is None and fitwidth is not None:
if imgwidth <= fitwidth:
return imgwidth, imgheight
else:
raise ValueError("fitwidth and fitheight cannot both be None")
return default()
elif fit == FitMode.enlarge:
if fitwidth is not None and fitheight is not None:
if imgwidth > fitwidth or imgheight > fitheight:
return imgwidth, imgheight
elif fitwidth is None and fitheight is not None:
if imgheight > fitheight:
return imgwidth, imgheight
elif fitheight is None and fitwidth is not None:
if imgwidth > fitwidth:
return imgwidth, imgheight
else:
raise ValueError("fitwidth and fitheight cannot both be None")
return default()
else:
raise NotImplementedError
# if no layout arguments are given, then the image size is equal to the
# page size and will be drawn with the default dpi
if pagesize is None and imgsize is None and border is None:
return default_layout_fun
if pagesize is None and imgsize is None and border is not None:
def layout_fun(imgwidthpx, imgheightpx, ndpi):
imgwidthpdf = px_to_pt(imgwidthpx, ndpi[0])
imgheightpdf = px_to_pt(imgheightpx, ndpi[1])
pagewidth = imgwidthpdf + 2 * border[1]
pageheight = imgheightpdf + 2 * border[0]
return pagewidth, pageheight, imgwidthpdf, imgheightpdf
return layout_fun
if border is None:
border = (0, 0)
# if the pagesize is given but the imagesize is not, then the imagesize
# will be calculated from the pagesize, taking into account the border
# and the fitting
if pagesize is not None and imgsize is None:
def layout_fun(imgwidthpx, imgheightpx, ndpi):
if (
pagesize[0] is not None
and pagesize[1] is not None
and auto_orient
and (
(imgwidthpx > imgheightpx and pagesize[0] < pagesize[1])
or (imgwidthpx < imgheightpx and pagesize[0] > pagesize[1])
)
):
pagewidth, pageheight = pagesize[1], pagesize[0]
newborder = border[1], border[0]
else:
pagewidth, pageheight = pagesize[0], pagesize[1]
newborder = border
if pagewidth is not None:
fitwidth = pagewidth - 2 * newborder[1]
else:
fitwidth = None
if pageheight is not None:
fitheight = pageheight - 2 * newborder[0]
else:
fitheight = None
if (
fit in [FitMode.fill, FitMode.enlarge]
and fitwidth is not None
and fitwidth < 0
and fitheight is not None
and fitheight < 0
):
raise NegativeDimensionError(
"at least one border dimension must be smaller than half "
"the respective page dimension"
)
elif fit not in [FitMode.fill, FitMode.enlarge] and (
(fitwidth is not None and fitwidth < 0)
or (fitheight is not None and fitheight < 0)
):
raise NegativeDimensionError(
"one border dimension is larger than half of the "
"respective page dimension"
)
imgwidthpdf, imgheightpdf = fitfun(
fit,
px_to_pt(imgwidthpx, ndpi[0]),
px_to_pt(imgheightpx, ndpi[1]),
fitwidth,
fitheight,
)
if pagewidth is None:
pagewidth = imgwidthpdf + border[1] * 2
if pageheight is None:
pageheight = imgheightpdf + border[0] * 2
return pagewidth, pageheight, imgwidthpdf, imgheightpdf
return layout_fun
def scale_imgsize(s, px, dpi):
if s is None:
return None
mode, value = s
if mode == ImgSize.abs:
return value
if mode == ImgSize.perc:
return (px_to_pt(px, dpi) * value) / 100
if mode == ImgSize.dpi:
return px_to_pt(px, value)
raise NotImplementedError
if pagesize is None and imgsize is not None:
def layout_fun(imgwidthpx, imgheightpx, ndpi):
imgwidthpdf, imgheightpdf = fitfun(
fit,
px_to_pt(imgwidthpx, ndpi[0]),
px_to_pt(imgheightpx, ndpi[1]),
scale_imgsize(imgsize[0], imgwidthpx, ndpi[0]),
scale_imgsize(imgsize[1], imgheightpx, ndpi[1]),
)
pagewidth = imgwidthpdf + 2 * border[1]
pageheight = imgheightpdf + 2 * border[0]
return pagewidth, pageheight, imgwidthpdf, imgheightpdf
return layout_fun
if pagesize is not None and imgsize is not None:
def layout_fun(imgwidthpx, imgheightpx, ndpi):
if (
pagesize[0] is not None
and pagesize[1] is not None
and auto_orient
and (
(imgwidthpx > imgheightpx and pagesize[0] < pagesize[1])
or (imgwidthpx < imgheightpx and pagesize[0] > pagesize[1])
)
):
pagewidth, pageheight = pagesize[1], pagesize[0]
else:
pagewidth, pageheight = pagesize[0], pagesize[1]
imgwidthpdf, imgheightpdf = fitfun(
fit,
px_to_pt(imgwidthpx, ndpi[0]),
px_to_pt(imgheightpx, ndpi[1]),
scale_imgsize(imgsize[0], imgwidthpx, ndpi[0]),
scale_imgsize(imgsize[1], imgheightpx, ndpi[1]),
)
return pagewidth, pageheight, imgwidthpdf, imgheightpdf
return layout_fun
raise NotImplementedError
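# Sketch of the default "into" fit that fitfun() computes inside
# get_layout_fun() when both target dimensions are given: scale to the target
# width first and, if the resulting height overflows, scale to the target
# height instead. The aspect ratio is preserved in both cases. The helper
# name is hypothetical and the None handling of the original is left out.
def _example_fit_into(imgwidth, imgheight, fitwidth, fitheight):
    newwidth = fitwidth
    newheight = (newwidth * imgheight) / imgwidth
    if newheight > fitheight:
        newheight = fitheight
        newwidth = (newheight * imgwidth) / imgheight
    return newwidth, newheight


# Usage: a 400x200 image fitted into 300x300 becomes 300x150, while a
# 200x400 image fitted into the same box becomes 150x300.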
def default_layout_fun(imgwidthpx, imgheightpx, ndpi):
imgwidthpdf = pagewidth = px_to_pt(imgwidthpx, ndpi[0])
imgheightpdf = pageheight = px_to_pt(imgheightpx, ndpi[1])
return pagewidth, pageheight, imgwidthpdf, imgheightpdf
def get_fixed_dpi_layout_fun(fixed_dpi):
"""Layout function that overrides whatever DPI is claimed in input images.
>>> layout_fun = get_fixed_dpi_layout_fun((300, 300))
>>> convert(image1, layout_fun=layout_fun, ... outputstream=...)
"""
def fixed_dpi_layout_fun(imgwidthpx, imgheightpx, ndpi):
return default_layout_fun(imgwidthpx, imgheightpx, fixed_dpi)
return fixed_dpi_layout_fun
def find_scale(pagewidth, pageheight):
"""Find the power of 10 (10, 100, 1000...) that will reduce the scale
below the PDF specification limit of 14400 PDF units (=200 inches).
In principle we could also choose a scale that is not a power of 10.
We use powers of 10 because numbers in the PDF format are represented
in base-10 and using powers of 10 will thus just shift the decimal point and
keep the numbers easily readable by humans as well."""
from math import log10, ceil
major = max(pagewidth, pageheight)
oversized = major / 14400.0
return 10 ** ceil(log10(oversized))
# Convert the image(s) to a `pdfdoc` object.
# The `.writer` attribute holds the underlying engine document handle, and
# `.output_version` the minimum version the caller should use when saving.
# The main convert() wraps this implementation function.
def convert_to_docobject(*images, **kwargs):
_default_kwargs = dict(
engine=None,
title=None,
author=None,
creator=None,
producer=None,
creationdate=None,
moddate=None,
subject=None,
keywords=None,
colorspace=None,
nodate=False,
layout_fun=default_layout_fun,
viewer_panes=None,
viewer_initial_page=None,
viewer_magnification=None,
viewer_page_layout=None,
viewer_fit_window=False,
viewer_center_window=False,
viewer_fullscreen=False,
first_frame_only=False,
allow_oversized=True,
cropborder=None,
bleedborder=None,
trimborder=None,
artborder=None,
pdfa=None,
rotation=None,
include_thumbnails=False,
)
for kwname, default in _default_kwargs.items():
if kwname not in kwargs:
kwargs[kwname] = default
pdf = pdfdoc(
kwargs["engine"],
"1.3",
kwargs["title"],
kwargs["author"],
kwargs["creator"],
kwargs["producer"],
kwargs["creationdate"],
kwargs["moddate"],
kwargs["subject"],
kwargs["keywords"],
kwargs["nodate"],
kwargs["viewer_panes"],
kwargs["viewer_initial_page"],
kwargs["viewer_magnification"],
kwargs["viewer_page_layout"],
kwargs["viewer_fit_window"],
kwargs["viewer_center_window"],
kwargs["viewer_fullscreen"],
kwargs["pdfa"],
)
# backwards compatibility with older img2pdf versions where the first
# argument to the function had to be given as a list
if len(images) == 1:
# if only one argument was given and it is a list, expand it
if isinstance(images[0], (list, tuple)):
images = images[0]
if not isinstance(images, (list, tuple)):
images = [images]
else:
if len(images) == 0:
raise ValueError("Unable to process empty list")
for img in images:
# img is allowed to be a path, a binary string representing image data
# or a file-like object (really anything that implements read())
# or a pathlib.Path object (really anything that implements read_bytes())
rawdata = None
for fun in "read", "read_bytes":
try:
rawdata = getattr(img, fun)()
except AttributeError:
pass
if rawdata is None:
if not isinstance(img, (str, bytes)):
raise TypeError(
"input has neither read() nor read_bytes() and is neither str nor bytes"
)
# the input has neither read() nor read_bytes(), so check whether it
# can be opened as a file name
try:
f = open(img, "rb")
except Exception:
# whatever the exception is (string could contain NUL
# characters or the path could just not exist) it's not a file
# name so we now try treating it as raw image content
rawdata = img
else:
# we are not using a "with" block here because we only want to
# catch exceptions thrown by open(). The read() may throw its
# own exceptions like MemoryError which should be handled
# differently.
rawdata = f.read()
f.close()
# md5 = hashlib.md5(rawdata).hexdigest()
# with open("./testdata/" + md5, "wb") as f:
# f.write(rawdata)
for (
color,
ndpi,
imgformat,
imgdata,
smaskdata,
imgwidthpx,
imgheightpx,
palette,
inverted,
depth,
rotation,
iccp,
) in read_images(
rawdata,
kwargs["colorspace"],
kwargs["first_frame_only"],
kwargs["rotation"],
kwargs["include_thumbnails"],
):
pagewidth, pageheight, imgwidthpdf, imgheightpdf = kwargs["layout_fun"](
imgwidthpx, imgheightpx, ndpi
)
userunit = None
if pagewidth < 3.00 or pageheight < 3.00:
logger.warning(
"pdf width or height is below 3.00 - too small for some viewers!"
)
elif pagewidth > 14400.0 or pageheight > 14400.0:
if kwargs["allow_oversized"]:
userunit = find_scale(pagewidth, pageheight)
pagewidth /= userunit
pageheight /= userunit
imgwidthpdf /= userunit
imgheightpdf /= userunit
else:
raise PdfTooLargeError(
"pdf width or height must not exceed 200 inches."
)
for border in ["crop", "bleed", "trim", "art"]:
if kwargs[border + "border"] is None:
continue
if pagewidth < 2 * kwargs[border + "border"][1]:
raise ValueError(
"horizontal %s border larger than page width" % border
)
if pageheight < 2 * kwargs[border + "border"][0]:
raise ValueError(
"vertical %s border larger than page height" % border
)
# the image is always centered on the page
imgxpdf = (pagewidth - imgwidthpdf) / 2.0
imgypdf = (pageheight - imgheightpdf) / 2.0
pdf.add_imagepage(
color,
imgwidthpx,
imgheightpx,
imgformat,
imgdata,
smaskdata,
imgwidthpdf,
imgheightpdf,
imgxpdf,
imgypdf,
pagewidth,
pageheight,
userunit,
palette,
inverted,
depth,
rotation,
kwargs["cropborder"],
kwargs["bleedborder"],
kwargs["trimborder"],
kwargs["artborder"],
iccp,
)
pdf.finalize()
return pdf
# Given one or more input images, either return the whole PDF as a bytes
# object (if outputstream is None) or write the PDF data to the given
# file-like object and return None.
#
# Input images can be given as file-like objects (they must implement read()),
# as pathlib.Path objects, as bytes holding the image content or as filenames
# of the images.
def convert(*images, outputstream=None, **kwargs):
pdf = convert_to_docobject(*images, **kwargs)
if outputstream:
pdf.tostream(outputstream)
return
return pdf.tostring()
def parse_num(num, name):
if num == "":
return None
unit = None
if num.endswith("pt"):
unit = Unit.pt
elif num.endswith("cm"):
unit = Unit.cm
elif num.endswith("mm"):
unit = Unit.mm
elif num.endswith("in"):
unit = Unit.inch
else:
try:
num = float(num)
except ValueError:
msg = (
"%s is not a floating point number and doesn't have a "
"valid unit: %s" % (name, num)
)
raise argparse.ArgumentTypeError(msg)
if unit is None:
unit = Unit.pt
else:
num = num[:-2]
try:
num = float(num)
except ValueError:
msg = "%s is not a floating point number: %s" % (name, num)
raise argparse.ArgumentTypeError(msg)
if num < 0:
msg = "%s must not be negative: %s" % (name, num)
raise argparse.ArgumentTypeError(msg)
if unit == Unit.cm:
num = cm_to_pt(num)
elif unit == Unit.mm:
num = mm_to_pt(num)
elif unit == Unit.inch:
num = in_to_pt(num)
return num
def parse_imgsize_num(num, name):
if num == "":
return None
unit = None
if num.endswith("pt"):
unit = ImgUnit.pt
elif num.endswith("cm"):
unit = ImgUnit.cm
elif num.endswith("mm"):
unit = ImgUnit.mm
elif num.endswith("in"):
unit = ImgUnit.inch
elif num.endswith("dpi"):
unit = ImgUnit.dpi
elif num.endswith("%"):
unit = ImgUnit.perc
else:
try:
num = float(num)
except ValueError:
msg = (
"%s is not a floating point number and doesn't have a "
"valid unit: %s" % (name, num)
)
raise argparse.ArgumentTypeError(msg)
if unit is None:
unit = ImgUnit.pt
else:
# strip off unit from string
if unit == ImgUnit.dpi:
num = num[:-3]
elif unit == ImgUnit.perc:
num = num[:-1]
else:
num = num[:-2]
try:
num = float(num)
except ValueError:
msg = "%s is not a floating point number: %s" % (name, num)
raise argparse.ArgumentTypeError(msg)
if unit == ImgUnit.cm:
num = (ImgSize.abs, cm_to_pt(num))
elif unit == ImgUnit.mm:
num = (ImgSize.abs, mm_to_pt(num))
elif unit == ImgUnit.inch:
num = (ImgSize.abs, in_to_pt(num))
elif unit == ImgUnit.pt:
num = (ImgSize.abs, num)
elif unit == ImgUnit.dpi:
num = (ImgSize.dpi, num)
elif unit == ImgUnit.perc:
num = (ImgSize.perc, num)
return num
def parse_pagesize_rectarg(string):
transposed = string.endswith("^T")
if transposed:
string = string[:-2]
if papersizes.get(string.lower()):
string = papersizes[string.lower()]
if "x" not in string:
# if there is no separating "x" in the string, then the string is
# interpreted as the width
w = parse_num(string, "width")
h = None
else:
w, h = string.split("x", 1)
w = parse_num(w, "width")
h = parse_num(h, "height")
if transposed:
w, h = h, w
if w is None and h is None:
raise argparse.ArgumentTypeError("at least one dimension must be specified")
return w, h
def parse_imgsize_rectarg(string):
transposed = string.endswith("^T")
if transposed:
string = string[:-2]
if papersizes.get(string.lower()):
string = papersizes[string.lower()]
if "x" not in string:
# if there is no separating "x" in the string, then the string is
# interpreted as the width
w = parse_imgsize_num(string, "width")
h = None
else:
w, h = string.split("x", 1)
w = parse_imgsize_num(w, "width")
h = parse_imgsize_num(h, "height")
if transposed:
w, h = h, w
if w is None and h is None:
raise argparse.ArgumentTypeError("at least one dimension must be specified")
return w, h
def parse_colorspacearg(string):
for c in Colorspace:
if c.name == string:
return c
allowed = ", ".join([c.name for c in Colorspace])
raise argparse.ArgumentTypeError(
"Unsupported colorspace: %s. Must be one of: %s." % (string, allowed)
)
def parse_enginearg(string):
for c in Engine:
if c.name == string:
return c
allowed = ", ".join([c.name for c in Engine])
raise argparse.ArgumentTypeError(
"Unsupported engine: %s. Must be one of: %s." % (string, allowed)
)
def parse_borderarg(string):
if ":" in string:
h, v = string.split(":", 1)
if h == "":
raise argparse.ArgumentTypeError("missing value before colon")
if v == "":
raise argparse.ArgumentTypeError("missing value after colon")
else:
if string == "":
raise argparse.ArgumentTypeError("border option cannot be empty")
h, v = string, string
h, v = parse_num(h, "left/right border"), parse_num(v, "top/bottom border")
if h is None and v is None:
raise argparse.ArgumentTypeError("missing value")
return h, v
def from_file(path):
result = []
if path == "-":
content = sys.stdin.buffer.read()
else:
with open(path, "rb") as f:
content = f.read()
for path in content.split(b"\0"):
if path == b"":
continue
try:
# test-read a byte from it so that we can abort early in case
# we cannot read data from the file
with open(path, "rb") as im:
im.read(1)
except IsADirectoryError:
raise argparse.ArgumentTypeError('"%s" is a directory' % path)
except PermissionError:
raise argparse.ArgumentTypeError('"%s" permission denied' % path)
except FileNotFoundError:
raise argparse.ArgumentTypeError('"%s" does not exist' % path)
result.append(path)
return result
def input_images(path_expr):
if path_expr == "-":
# we slurp in all data from stdin because we need to seek in it later
result = [sys.stdin.buffer.read()]
# result always holds exactly one element, so check the data itself
if len(result[0]) == 0:
raise argparse.ArgumentTypeError('"%s" is empty' % path_expr)
else:
result = []
paths = [path_expr]
if sys.platform == "win32" and ("*" in path_expr or "?" in path_expr):
# on Windows, the program itself is responsible for expanding wildcards
# such as *.jpg. glob() only returns files that exist, so we only use it
# for wildcards. Paths without wildcards that do not exist will trigger
# the "does not exist" error below.
from glob import glob
paths = sorted(glob(path_expr))
for path in paths:
try:
if os.path.getsize(path) == 0:
raise argparse.ArgumentTypeError('"%s" is empty' % path)
# test-read a byte from it so that we can abort early in case
# we cannot read data from the file
with open(path, "rb") as im:
im.read(1)
except IsADirectoryError:
raise argparse.ArgumentTypeError('"%s" is a directory' % path)
except PermissionError:
raise argparse.ArgumentTypeError('"%s" permission denied' % path)
except FileNotFoundError:
raise argparse.ArgumentTypeError('"%s" does not exist' % path)
result.append(path)
return result
def parse_rotationarg(string):
for m in Rotation:
if m.name == string.lower():
return m
raise argparse.ArgumentTypeError("unknown rotation value: %s" % string)
def parse_fitarg(string):
for m in FitMode:
if m.name == string.lower():
return m
raise argparse.ArgumentTypeError("unknown fit mode: %s" % string)
def parse_panes(string):
for m in PageMode:
if m.name == string.lower():
return m
allowed = ", ".join([m.name for m in PageMode])
raise argparse.ArgumentTypeError(
"Unsupported page mode: %s. Must be one of: %s." % (string, allowed)
)
def parse_magnification(string):
for m in Magnification:
if m.name == string.lower():
return m
try:
return float(string)
except ValueError:
pass
allowed = ", ".join([m.name for m in Magnification])
raise argparse.ArgumentTypeError(
"Unsupported magnification: %s. Must be "
"a floating point number or one of: %s." % (string, allowed)
)
def parse_layout(string):
for l in PageLayout:
if l.name == string.lower():
return l
allowed = ", ".join([l.name for l in PageLayout])
raise argparse.ArgumentTypeError(
"Unsupported page layout: %s. Must be one of: %s." % (string, allowed)
)
def valid_date(string):
# first try parsing in ISO8601 format
try:
return datetime.strptime(string, "%Y-%m-%d")
except ValueError:
pass
try:
return datetime.strptime(string, "%Y-%m-%dT%H:%M")
except ValueError:
pass
try:
return datetime.strptime(string, "%Y-%m-%dT%H:%M:%S")
except ValueError:
pass
# then try dateutil
try:
from dateutil import parser
except ImportError:
pass
else:
try:
return parser.parse(string)
except Exception:
pass
# as a last resort, try the local date utility
try:
import subprocess
except ImportError:
pass
else:
try:
utime = subprocess.check_output(["date", "--date", string, "+%s"])
# OSError covers platforms without a "date" executable
except (subprocess.CalledProcessError, OSError):
pass
else:
return datetime.fromtimestamp(int(utime))
raise argparse.ArgumentTypeError("cannot parse date: %s" % string)
def gui():
import tkinter
import tkinter.filedialog
have_fitz = True
try:
import fitz
except ImportError:
have_fitz = False
# from Python 3.7 Lib/idlelib/configdialog.py
# Copyright 2015-2017 Terry Jan Reedy
# Python License
class VerticalScrolledFrame(tkinter.Frame):
"""A pure Tkinter vertically scrollable frame.
* Use the 'interior' attribute to place widgets inside the scrollable frame
* Construct and pack/place/grid normally
* This frame only allows vertical scrolling
"""
def __init__(self, parent, *args, **kw):
tkinter.Frame.__init__(self, parent, *args, **kw)
# Create a canvas object and a vertical scrollbar for scrolling it.
vscrollbar = tkinter.Scrollbar(self, orient=tkinter.VERTICAL)
vscrollbar.pack(fill=tkinter.Y, side=tkinter.RIGHT, expand=tkinter.FALSE)
canvas = tkinter.Canvas(
self,
borderwidth=0,
highlightthickness=0,
yscrollcommand=vscrollbar.set,
width=240,
)
canvas.pack(side=tkinter.LEFT, fill=tkinter.BOTH, expand=tkinter.TRUE)
vscrollbar.config(command=canvas.yview)
# Reset the view.
canvas.xview_moveto(0)
canvas.yview_moveto(0)
# Create a frame inside the canvas which will be scrolled with it.
self.interior = interior = tkinter.Frame(canvas)
interior_id = canvas.create_window(0, 0, window=interior, anchor=tkinter.NW)
# Track changes to the canvas and frame width and sync them,
# also updating the scrollbar.
def _configure_interior(event):
# Update the scrollbars to match the size of the inner frame.
size = (interior.winfo_reqwidth(), interior.winfo_reqheight())
canvas.config(scrollregion="0 0 %s %s" % size)
interior.bind("<Configure>", _configure_interior)
def _configure_canvas(event):
if interior.winfo_reqwidth() != canvas.winfo_width():
# Update the inner frame's width to fill the canvas.
canvas.itemconfigure(interior_id, width=canvas.winfo_width())
canvas.bind("<Configure>", _configure_canvas)
return
# From Python 3.7 Lib/tkinter/__init__.py
# Copyright 2000 Fredrik Lundh
# Python License
#
# add support for 'state' and 'name' kwargs
# add support for updating list of options
class OptionMenu(tkinter.Menubutton):
"""OptionMenu which allows the user to select a value from a menu."""
def __init__(self, master, variable, value, *values, **kwargs):
"""Construct an optionmenu widget with the parent MASTER, with
the resource textvariable set to VARIABLE, the initially selected
value VALUE, the other menu values VALUES and an additional
keyword argument command."""
kw = {
"borderwidth": 2,
"textvariable": variable,
"indicatoron": 1,
"relief": tkinter.RAISED,
"anchor": "c",
"highlightthickness": 2,
}
if "state" in kwargs:
kw["state"] = kwargs["state"]
del kwargs["state"]
if "name" in kwargs:
kw["name"] = kwargs["name"]
del kwargs["name"]
tkinter.Widget.__init__(self, master, "menubutton", kw)
self.widgetName = "tk_optionMenu"
self.callback = kwargs.get("command")
self.variable = variable
if "command" in kwargs:
del kwargs["command"]
if kwargs:
raise tkinter.TclError("unknown option -" + list(kwargs.keys())[0])
self.set_values([value] + list(values))
def __getitem__(self, name):
if name == "menu":
return self.__menu
return tkinter.Widget.__getitem__(self, name)
def set_values(self, values):
menu = self.__menu = tkinter.Menu(self, name="menu", tearoff=0)
self.menuname = menu._w
for v in values:
menu.add_command(
label=v, command=tkinter._setit(self.variable, v, self.callback)
)
self["menu"] = menu
def destroy(self):
"""Destroy this widget and the associated menu."""
tkinter.Menubutton.destroy(self)
self.__menu = None
root = tkinter.Tk()
app = tkinter.Frame(master=root)
infiles = []
maxpagewidth = 0
maxpageheight = 0
doc = None
args = {
"engine": tkinter.StringVar(),
"auto_orient": tkinter.BooleanVar(),
"fit": tkinter.StringVar(),
"title": tkinter.StringVar(),
"author": tkinter.StringVar(),
"creator": tkinter.StringVar(),
"producer": tkinter.StringVar(),
"subject": tkinter.StringVar(),
"keywords": tkinter.StringVar(),
"nodate": tkinter.BooleanVar(),
"creationdate": tkinter.StringVar(),
"moddate": tkinter.StringVar(),
"viewer_panes": tkinter.StringVar(),
"viewer_initial_page": tkinter.IntVar(),
"viewer_magnification": tkinter.StringVar(),
"viewer_page_layout": tkinter.StringVar(),
"viewer_fit_window": tkinter.BooleanVar(),
"viewer_center_window": tkinter.BooleanVar(),
"viewer_fullscreen": tkinter.BooleanVar(),
"pagesize_dropdown": tkinter.StringVar(),
"pagesize_width": tkinter.DoubleVar(),
"pagesize_height": tkinter.DoubleVar(),
"imgsize_dropdown": tkinter.StringVar(),
"imgsize_width": tkinter.DoubleVar(),
"imgsize_height": tkinter.DoubleVar(),
"colorspace": tkinter.StringVar(),
"first_frame_only": tkinter.BooleanVar(),
}
args["engine"].set("auto")
args["title"].set("")
args["auto_orient"].set(False)
args["fit"].set("into")
args["colorspace"].set("auto")
args["viewer_panes"].set("auto")
args["viewer_initial_page"].set(1)
args["viewer_magnification"].set("auto")
args["viewer_page_layout"].set("auto")
args["first_frame_only"].set(False)
args["pagesize_dropdown"].set("auto")
args["imgsize_dropdown"].set("auto")
def on_open_button():
nonlocal infiles
nonlocal doc
nonlocal maxpagewidth
nonlocal maxpageheight
infiles = tkinter.filedialog.askopenfilenames(
parent=root,
title="open image",
filetypes=[
(
"images",
"*.bmp *.eps *.gif *.ico *.jpeg *.jpg *.jp2 *.pcx *.png *.ppm *.tiff",
),
("all files", "*"),
],
# initialdir="/home/josch/git/plakativ",
# initialfile="test.pdf",
)
if not infiles:
# the dialog was cancelled: nothing to convert or preview
draw()
return
if have_fitz:
with BytesIO() as f:
save_pdf(f)
f.seek(0)
doc = fitz.open(stream=f, filetype="pdf")
for page in doc:
if page.get_displaylist().rect.width > maxpagewidth:
maxpagewidth = page.get_displaylist().rect.width
if page.get_displaylist().rect.height > maxpageheight:
maxpageheight = page.get_displaylist().rect.height
draw()
def save_pdf(stream):
pagesizearg = None
if args["pagesize_dropdown"].get() == "auto":
# nothing to do
pass
elif args["pagesize_dropdown"].get() == "custom":
pagesizearg = args["pagesize_width"].get(), args["pagesize_height"].get()
elif args["pagesize_dropdown"].get() in papernames.values():
raise NotImplementedError()
else:
raise Exception("no such pagesize: %s" % args["pagesize_dropdown"].get())
imgsizearg = None
if args["imgsize_dropdown"].get() == "auto":
# nothing to do
pass
elif args["imgsize_dropdown"].get() == "custom":
imgsizearg = args["imgsize_width"].get(), args["imgsize_height"].get()
elif args["imgsize_dropdown"].get() in papernames.values():
raise NotImplementedError()
else:
raise Exception("no such imgsize: %s" % args["imgsize_dropdown"].get())
borderarg = None
layout_fun = get_layout_fun(
pagesizearg,
imgsizearg,
borderarg,
args["fit"].get(),
args["auto_orient"].get(),
)
viewer_panesarg = None
if args["viewer_panes"].get() == "auto":
# nothing to do
pass
elif args["viewer_panes"].get() in PageMode.__members__:
# look up the enum member by name, as parse_panes() does for the CLI
viewer_panesarg = PageMode[args["viewer_panes"].get()]
else:
raise Exception("no such viewer_panes: %s" % args["viewer_panes"].get())
viewer_magnificationarg = None
if args["viewer_magnification"].get() == "auto":
# nothing to do
pass
elif args["viewer_magnification"].get() in Magnification.__members__:
# look up the enum member by name, as parse_magnification() does
viewer_magnificationarg = Magnification[args["viewer_magnification"].get()]
else:
raise Exception(
"no such viewer_magnification: %s" % args["viewer_magnification"].get()
)
viewer_page_layoutarg = None
if args["viewer_page_layout"].get() == "auto":
# nothing to do
pass
elif args["viewer_page_layout"].get() in PageLayout.__members__:
# look up the enum member by name, as parse_layout() does for the CLI
viewer_page_layoutarg = PageLayout[args["viewer_page_layout"].get()]
else:
raise Exception(
"no such viewer_page_layout: %s" % args["viewer_page_layout"].get()
)
colorspacearg = None
if args["colorspace"].get() != "auto":
colorspacearg = next(
v for v in Colorspace if v.name == args["colorspace"].get()
)
enginearg = None
if args["engine"].get() != "auto":
enginearg = next(v for v in Engine if v.name == args["engine"].get())
convert(
*infiles,
engine=enginearg,
title=args["title"].get() if args["title"].get() else None,
author=args["author"].get() if args["author"].get() else None,
creator=args["creator"].get() if args["creator"].get() else None,
producer=args["producer"].get() if args["producer"].get() else None,
creationdate=args["creationdate"].get()
if args["creationdate"].get()
else None,
moddate=args["moddate"].get() if args["moddate"].get() else None,
subject=args["subject"].get() if args["subject"].get() else None,
keywords=args["keywords"].get() if args["keywords"].get() else None,
colorspace=colorspacearg,
nodate=args["nodate"].get(),
layout_fun=layout_fun,
viewer_panes=viewer_panesarg,
viewer_initial_page=args["viewer_initial_page"].get()
if args["viewer_initial_page"].get() > 1
else None,
viewer_magnification=viewer_magnificationarg,
viewer_page_layout=viewer_page_layoutarg,
viewer_fit_window=(args["viewer_fit_window"].get() or None),
viewer_center_window=(args["viewer_center_window"].get() or None),
viewer_fullscreen=(args["viewer_fullscreen"].get() or None),
outputstream=stream,
first_frame_only=args["first_frame_only"].get(),
cropborder=None,
bleedborder=None,
trimborder=None,
artborder=None,
)
def on_save_button():
filename = tkinter.filedialog.asksaveasfilename(
parent=root,
title="save PDF",
defaultextension=".pdf",
filetypes=[("pdf documents", "*.pdf"), ("all files", "*")],
# initialdir="/home/josch/git/plakativ",
# initialfile=base + "_poster" + ext,
)
# asksaveasfilename() returns an empty value if the dialog is cancelled
if not filename:
return
with open(filename, "wb") as f:
save_pdf(f)
root.title("img2pdf")
app.pack(fill=tkinter.BOTH, expand=tkinter.TRUE)
canvas = tkinter.Canvas(app, bg="black")
def draw():
canvas.delete(tkinter.ALL)
if not infiles:
canvas.create_text(
canvas.size[0] / 2,
canvas.size[1] / 2,
text='Click on the "Open Image(s)" button in the upper right.',
fill="white",
)
return
if not doc:
canvas.create_text(
canvas.size[0] / 2,
canvas.size[1] / 2,
text="PyMuPDF not available. Install the Python fitz module\n"
+ "for preview functionality.",
fill="white",
)
return
canvas_padding = 10
# factor to convert from pdf dimensions (given in pt) into canvas
# dimensions (given in pixels)
zoom = min(
(canvas.size[0] - canvas_padding) / maxpagewidth,
(canvas.size[1] - canvas_padding) / maxpageheight,
)
pagenum = 0
mat_0 = fitz.Matrix(zoom, zoom)
canvas.image = tkinter.PhotoImage(
data=doc[pagenum]
.get_displaylist()
.get_pixmap(matrix=mat_0, alpha=False)
.tobytes("ppm")
)
canvas.create_image(
(canvas.size[0] - maxpagewidth * zoom) / 2,
(canvas.size[1] - maxpageheight * zoom) / 2,
anchor=tkinter.NW,
image=canvas.image,
)
canvas.create_rectangle(
(canvas.size[0] - maxpagewidth * zoom) / 2,
(canvas.size[1] - maxpageheight * zoom) / 2,
(canvas.size[0] - maxpagewidth * zoom) / 2 + canvas.image.width(),
(canvas.size[1] - maxpageheight * zoom) / 2 + canvas.image.height(),
outline="red",
)
def on_resize(event):
canvas.size = (event.width, event.height)
draw()
canvas.pack(fill=tkinter.BOTH, side=tkinter.LEFT, expand=tkinter.TRUE)
canvas.bind("<Configure>", on_resize)
frame_right = tkinter.Frame(app)
frame_right.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.Y)
top_frame = tkinter.Frame(frame_right)
top_frame.pack(fill=tkinter.X)
tkinter.Button(top_frame, text="Open Image(s)", command=on_open_button).pack(
side=tkinter.LEFT, expand=tkinter.TRUE, fill=tkinter.X
)
tkinter.Button(top_frame, text="Help", state=tkinter.DISABLED).pack(
side=tkinter.RIGHT, expand=tkinter.TRUE, fill=tkinter.X
)
frame1 = VerticalScrolledFrame(frame_right)
frame1.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.Y)
output_options = tkinter.LabelFrame(frame1.interior, text="Output Options")
output_options.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.X)
tkinter.Label(output_options, text="colorspace").grid(
row=0, column=0, sticky=tkinter.W
)
OptionMenu(output_options, args["colorspace"], "auto", state=tkinter.DISABLED).grid(
row=0, column=1, sticky=tkinter.W
)
tkinter.Label(output_options, text="engine").grid(row=1, column=0, sticky=tkinter.W)
OptionMenu(output_options, args["engine"], "auto", state=tkinter.DISABLED).grid(
row=1, column=1, sticky=tkinter.W
)
tkinter.Checkbutton(
output_options,
text="Suppress timestamp",
variable=args["nodate"],
state=tkinter.DISABLED,
).grid(row=2, column=0, columnspan=2, sticky=tkinter.W)
tkinter.Checkbutton(
output_options,
text="only first frame",
variable=args["first_frame_only"],
state=tkinter.DISABLED,
).grid(row=3, column=0, columnspan=2, sticky=tkinter.W)
tkinter.Checkbutton(
output_options, text="force large input", state=tkinter.DISABLED
).grid(row=4, column=0, columnspan=2, sticky=tkinter.W)
image_size_frame = tkinter.LabelFrame(frame1.interior, text="Image size")
image_size_frame.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.X)
OptionMenu(
image_size_frame,
args["imgsize_dropdown"],
*(["auto", "custom"] + sorted(papernames.values())),
state=tkinter.DISABLED,
).grid(row=1, column=0, columnspan=3, sticky=tkinter.W)
tkinter.Label(
image_size_frame, text="Width:", state=tkinter.DISABLED, name="size_label_width"
).grid(row=2, column=0, sticky=tkinter.W)
tkinter.Spinbox(
image_size_frame,
format="%.2f",
increment=0.01,
from_=0,
to=100,
width=5,
state=tkinter.DISABLED,
name="spinbox_width",
).grid(row=2, column=1, sticky=tkinter.W)
tkinter.Label(
image_size_frame, text="mm", state=tkinter.DISABLED, name="size_label_width_mm"
).grid(row=2, column=2, sticky=tkinter.W)
tkinter.Label(
image_size_frame,
text="Height:",
state=tkinter.DISABLED,
name="size_label_height",
).grid(row=3, column=0, sticky=tkinter.W)
tkinter.Spinbox(
image_size_frame,
format="%.2f",
increment=0.01,
from_=0,
to=100,
width=5,
state=tkinter.DISABLED,
name="spinbox_height",
).grid(row=3, column=1, sticky=tkinter.W)
tkinter.Label(
image_size_frame, text="mm", state=tkinter.DISABLED, name="size_label_height_mm"
).grid(row=3, column=2, sticky=tkinter.W)
page_size_frame = tkinter.LabelFrame(frame1.interior, text="Page size")
page_size_frame.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.X)
OptionMenu(
page_size_frame,
args["pagesize_dropdown"],
*(["auto", "custom"] + sorted(papernames.values())),
state=tkinter.DISABLED,
).grid(row=1, column=0, columnspan=3, sticky=tkinter.W)
tkinter.Label(
page_size_frame, text="Width:", state=tkinter.DISABLED, name="size_label_width"
).grid(row=2, column=0, sticky=tkinter.W)
tkinter.Spinbox(
page_size_frame,
format="%.2f",
increment=0.01,
from_=0,
to=100,
width=5,
state=tkinter.DISABLED,
name="spinbox_width",
).grid(row=2, column=1, sticky=tkinter.W)
tkinter.Label(
page_size_frame, text="mm", state=tkinter.DISABLED, name="size_label_width_mm"
).grid(row=2, column=2, sticky=tkinter.W)
tkinter.Label(
page_size_frame,
text="Height:",
state=tkinter.DISABLED,
name="size_label_height",
).grid(row=3, column=0, sticky=tkinter.W)
tkinter.Spinbox(
page_size_frame,
format="%.2f",
increment=0.01,
from_=0,
to=100,
width=5,
state=tkinter.DISABLED,
name="spinbox_height",
).grid(row=3, column=1, sticky=tkinter.W)
tkinter.Label(
page_size_frame, text="mm", state=tkinter.DISABLED, name="size_label_height_mm"
).grid(row=3, column=2, sticky=tkinter.W)
layout_frame = tkinter.LabelFrame(frame1.interior, text="Layout")
layout_frame.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.X)
tkinter.Label(layout_frame, text="border", state=tkinter.DISABLED).grid(
row=0, column=0, sticky=tkinter.W
)
tkinter.Spinbox(layout_frame, state=tkinter.DISABLED).grid(
row=0, column=1, sticky=tkinter.W
)
tkinter.Label(layout_frame, text="fit", state=tkinter.DISABLED).grid(
row=1, column=0, sticky=tkinter.W
)
OptionMenu(
layout_frame, args["fit"], *[v.name for v in FitMode], state=tkinter.DISABLED
).grid(row=1, column=1, sticky=tkinter.W)
tkinter.Checkbutton(
layout_frame,
text="auto orient",
state=tkinter.DISABLED,
variable=args["auto_orient"],
).grid(row=2, column=0, columnspan=2, sticky=tkinter.W)
tkinter.Label(layout_frame, text="crop border", state=tkinter.DISABLED).grid(
row=3, column=0, sticky=tkinter.W
)
tkinter.Spinbox(layout_frame, state=tkinter.DISABLED).grid(
row=3, column=1, sticky=tkinter.W
)
tkinter.Label(layout_frame, text="bleed border", state=tkinter.DISABLED).grid(
row=4, column=0, sticky=tkinter.W
)
tkinter.Spinbox(layout_frame, state=tkinter.DISABLED).grid(
row=4, column=1, sticky=tkinter.W
)
tkinter.Label(layout_frame, text="trim border", state=tkinter.DISABLED).grid(
row=5, column=0, sticky=tkinter.W
)
tkinter.Spinbox(layout_frame, state=tkinter.DISABLED).grid(
row=5, column=1, sticky=tkinter.W
)
tkinter.Label(layout_frame, text="art border", state=tkinter.DISABLED).grid(
row=6, column=0, sticky=tkinter.W
)
tkinter.Spinbox(layout_frame, state=tkinter.DISABLED).grid(
row=6, column=1, sticky=tkinter.W
)
metadata_frame = tkinter.LabelFrame(frame1.interior, text="PDF metadata")
metadata_frame.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.X)
tkinter.Label(metadata_frame, text="title", state=tkinter.DISABLED).grid(
row=0, column=0, sticky=tkinter.W
)
tkinter.Entry(
metadata_frame, textvariable=args["title"], state=tkinter.DISABLED
).grid(row=0, column=1, sticky=tkinter.W)
tkinter.Label(metadata_frame, text="author", state=tkinter.DISABLED).grid(
row=1, column=0, sticky=tkinter.W
)
tkinter.Entry(
metadata_frame, textvariable=args["author"], state=tkinter.DISABLED
).grid(row=1, column=1, sticky=tkinter.W)
tkinter.Label(metadata_frame, text="creator", state=tkinter.DISABLED).grid(
row=2, column=0, sticky=tkinter.W
)
tkinter.Entry(
metadata_frame, textvariable=args["creator"], state=tkinter.DISABLED
).grid(row=2, column=1, sticky=tkinter.W)
tkinter.Label(metadata_frame, text="producer", state=tkinter.DISABLED).grid(
row=3, column=0, sticky=tkinter.W
)
tkinter.Entry(
metadata_frame, textvariable=args["producer"], state=tkinter.DISABLED
).grid(row=3, column=1, sticky=tkinter.W)
tkinter.Label(metadata_frame, text="creation date", state=tkinter.DISABLED).grid(
row=4, column=0, sticky=tkinter.W
)
tkinter.Entry(
metadata_frame, textvariable=args["creationdate"], state=tkinter.DISABLED
).grid(row=4, column=1, sticky=tkinter.W)
tkinter.Label(
metadata_frame, text="modification date", state=tkinter.DISABLED
).grid(row=5, column=0, sticky=tkinter.W)
tkinter.Entry(
metadata_frame, textvariable=args["moddate"], state=tkinter.DISABLED
).grid(row=5, column=1, sticky=tkinter.W)
tkinter.Label(metadata_frame, text="subject", state=tkinter.DISABLED).grid(
row=6, column=0, sticky=tkinter.W
)
tkinter.Entry(metadata_frame, state=tkinter.DISABLED).grid(
row=6, column=1, sticky=tkinter.W
)
tkinter.Label(metadata_frame, text="keywords", state=tkinter.DISABLED).grid(
row=7, column=0, sticky=tkinter.W
)
tkinter.Entry(metadata_frame, state=tkinter.DISABLED).grid(
row=7, column=1, sticky=tkinter.W
)
viewer_frame = tkinter.LabelFrame(frame1.interior, text="PDF viewer options")
viewer_frame.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.X)
tkinter.Label(viewer_frame, text="panes", state=tkinter.DISABLED).grid(
row=0, column=0, sticky=tkinter.W
)
OptionMenu(
viewer_frame,
args["viewer_panes"],
*(["auto"] + [v.name for v in PageMode]),
state=tkinter.DISABLED,
).grid(row=0, column=1, sticky=tkinter.W)
tkinter.Label(viewer_frame, text="initial page", state=tkinter.DISABLED).grid(
row=1, column=0, sticky=tkinter.W
)
tkinter.Spinbox(
viewer_frame,
increment=1,
from_=1,
to=10000,
width=6,
textvariable=args["viewer_initial_page"],
state=tkinter.DISABLED,
name="viewer_initial_page_spinbox",
).grid(row=1, column=1, sticky=tkinter.W)
tkinter.Label(viewer_frame, text="magnification", state=tkinter.DISABLED).grid(
row=2, column=0, sticky=tkinter.W
)
OptionMenu(
viewer_frame,
args["viewer_magnification"],
*(["auto", "custom"] + [v.name for v in Magnification]),
state=tkinter.DISABLED,
).grid(row=2, column=1, sticky=tkinter.W)
tkinter.Label(viewer_frame, text="page layout", state=tkinter.DISABLED).grid(
row=3, column=0, sticky=tkinter.W
)
OptionMenu(
viewer_frame,
args["viewer_page_layout"],
*(["auto"] + [v.name for v in PageLayout]),
state=tkinter.DISABLED,
).grid(row=3, column=1, sticky=tkinter.W)
tkinter.Checkbutton(
viewer_frame,
text="fit window to page size",
variable=args["viewer_fit_window"],
state=tkinter.DISABLED,
).grid(row=4, column=0, columnspan=2, sticky=tkinter.W)
tkinter.Checkbutton(
viewer_frame,
text="center window",
variable=args["viewer_center_window"],
state=tkinter.DISABLED,
).grid(row=5, column=0, columnspan=2, sticky=tkinter.W)
tkinter.Checkbutton(
viewer_frame,
text="open in fullscreen",
variable=args["viewer_fullscreen"],
state=tkinter.DISABLED,
).grid(row=6, column=0, columnspan=2, sticky=tkinter.W)
option_frame = tkinter.LabelFrame(frame1.interior, text="Program options")
option_frame.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.X)
tkinter.Label(option_frame, text="Unit:", state=tkinter.DISABLED).grid(
row=0, column=0, sticky=tkinter.W
)
unit = tkinter.StringVar()
unit.set("mm")
OptionMenu(option_frame, unit, ["mm"], state=tkinter.DISABLED).grid(
row=0, column=1, sticky=tkinter.W
)
tkinter.Label(option_frame, text="Language:", state=tkinter.DISABLED).grid(
row=1, column=0, sticky=tkinter.W
)
language = tkinter.StringVar()
language.set("English")
OptionMenu(option_frame, language, ["English"], state=tkinter.DISABLED).grid(
row=1, column=1, sticky=tkinter.W
)
bottom_frame = tkinter.Frame(frame_right)
bottom_frame.pack(fill=tkinter.X)
tkinter.Button(bottom_frame, text="Save PDF", command=on_save_button).pack(
side=tkinter.LEFT, expand=tkinter.TRUE, fill=tkinter.X
)
tkinter.Button(bottom_frame, text="Exit", command=root.destroy).pack(
side=tkinter.RIGHT, expand=tkinter.TRUE, fill=tkinter.X
)
app.mainloop()
def file_is_icc(fname):
with open(fname, "rb") as f:
data = f.read(40)
if len(data) < 40:
return False
return data[36:] == b"acsp"
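# Illustrative sketch (not part of img2pdf): file_is_icc() relies on the ICC
# profile header layout, in which the four-byte signature b"acsp" sits at byte
# offset 36. The helper name below is hypothetical; it applies the same check
# to an in-memory header instead of a file on disk.
def _demo_has_icc_signature(header):
    # fewer than 40 bytes cannot hold the signature field at offsets 36..39
    if len(header) < 40:
        return False
    return header[36:40] == b"acsp"


# Usage sketch with a zero-filled stand-in for a 128 byte profile header:
#   _demo_has_icc_signature(bytes(36) + b"acsp" + bytes(88))  evaluates to True
#   _demo_has_icc_signature(bytes(40))                        evaluates to False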
def validate_icc(fname):
if not file_is_icc(fname):
raise argparse.ArgumentTypeError('"%s" is not an ICC profile' % fname)
return fname
def get_default_icc_profile():
for profile in [
"/usr/share/color/icc/sRGB.icc",
"/usr/share/color/icc/OpenICC/sRGB.icc",
"/usr/share/color/icc/colord/sRGB.icc",
]:
if not os.path.exists(profile):
continue
if not file_is_icc(profile):
continue
return profile
return "/usr/share/color/icc/sRGB.icc"
def get_main_parser():
rendered_papersizes = ""
for k, v in sorted(papersizes.items()):
rendered_papersizes += " %-8s %s\n" % (papernames[k], v)
parser = argparse.ArgumentParser(
formatter_class=argparse.RawDescriptionHelpFormatter,
description="""\
Losslessly convert raster images to PDF without re-encoding PNG, JPEG, and
JPEG2000 images. This leads to a lossless conversion of PNG, JPEG and JPEG2000
images with the only added file size coming from the PDF container itself.
Other raster graphics formats are losslessly stored using the same encoding
that PNG uses.
For images with transparency, the alpha channel will be stored as a separate
soft mask. This is lossless, too.
The output is sent to standard output so that it can be redirected into a file
or to another program as part of a shell pipe. To directly write the output
into a file, use the -o or --output option.
Options:
""",
epilog="""\
Colorspace:
Currently, the colorspace must be forced for JPEG 2000 images that are not in
the RGB colorspace. Available colorspace options are based on Python Imaging
Library (PIL) short handles.
RGB RGB color
L Grayscale
1 Black and white (internally converted to grayscale)
CMYK CMYK color
CMYK;I CMYK color with inversion (for CMYK JPEG files from Adobe)
Paper sizes:
You can specify the short hand paper size names shown in the first column in
the table below as arguments to the --pagesize and --imgsize options. The
width and height they are mapping to is shown in the second column. Giving
the value in the second column has the same effect as giving the short hand
in the first column. Appending ^T (a caret/circumflex followed by the letter
T) turns the paper size from portrait into landscape. The postfix thus
symbolizes the transpose. Note that on Windows cmd.exe the caret symbol is
the escape character, so you need to put quotes around the option value.
The values are case insensitive.
%s
Fit options:
The img2pdf options for the --fit argument are shown in the first column in
the table below. The function of these options can be mapped to the geometry
operators of imagemagick. For users who are familiar with imagemagick, the
corresponding operator is shown in the second column. The third column shows
whether or not the aspect ratio is preserved for that option (same as in
imagemagick). Just like imagemagick, img2pdf tries hard to preserve the
aspect ratio, so if the --fit argument is not given, then the default is
"into" which corresponds to the absence of any operator in imagemagick.
The value of the --fit option is case insensitive.
into | | Y | The default. Width and height values specify maximum
| | | values.
---------+---+---+----------------------------------------------------------
fill | ^ | Y | Width and height values specify the minimum values.
---------+---+---+----------------------------------------------------------
exact | ! | N | Width and height emphatically given.
---------+---+---+----------------------------------------------------------
shrink | > | Y | Shrinks an image with dimensions larger than the given
| | | ones (and otherwise behaves like "into").
---------+---+---+----------------------------------------------------------
enlarge | < | Y | Enlarges an image with dimensions smaller than the given
| | | ones (and otherwise behaves like "into").
Argument parsing:
Argument long options can be abbreviated to a prefix if the abbreviation is
unambiguous. That is, the prefix must match a unique option.
Beware of your shell interpreting argument values as special characters (like
the semicolon in the CMYK;I colorspace option). If in doubt, put the argument
values in single quotes.
If you want an argument value to start with one or more minus characters, you
must use the long option name and join them with an equal sign like so:
$ img2pdf --author=--test--
If your input file name starts with one or more minus characters, either
separate the input files from the other arguments by two minus signs:
$ img2pdf -- --my-file-starts-with-two-minuses.jpg
Or be more explicit about its relative path by prepending a ./:
$ img2pdf ./--my-file-starts-with-two-minuses.jpg
The order of non-positional arguments (all arguments other than the input
images) does not matter.
Examples:
Lines starting with a dollar sign denote commands you can enter into your
terminal. The dollar sign signifies your command prompt. It is not part of
the command you type.
Convert two scans in JPEG format to a PDF document.
$ img2pdf --output out.pdf page1.jpg page2.jpg
Use a custom dpi value for the input images:
$ img2pdf --output out.pdf --imgsize 300dpi page1.jpg page2.jpg
Convert a directory of JPEG images into a PDF with printable A4 pages in
landscape mode. On each page, the photo takes the maximum amount of space
while preserving its aspect ratio and a print border of 2 cm on the top and
bottom and 2.5 cm on the left and right hand side.
$ img2pdf --output out.pdf --pagesize "A4^T" --border 2cm:2.5cm *.jpg
On each A4 page, fit images into a 10 cm times 15 cm rectangle but keep the
original image size if the image is smaller than that.
$ img2pdf --output out.pdf -S A4 --imgsize 10cmx15cm --fit shrink *.jpg
Prepare a directory of photos to be printed borderless on photo paper with a
3:2 aspect ratio and rotate each page so that its orientation is the same as
the input image.
$ img2pdf --output out.pdf --pagesize 15cmx10cm --auto-orient *.jpg
Encode a grayscale JPEG2000 image. The colorspace has to be forced as img2pdf
cannot read it from the JPEG2000 file automatically.
$ img2pdf --output out.pdf --colorspace L input.jp2
Written by Johannes Schauer Marin Rodrigues
Report bugs at https://gitlab.mister-muffin.de/josch/img2pdf/issues
"""
% rendered_papersizes,
)
parser.add_argument(
"images",
metavar="infile",
type=input_images,
nargs="*",
help="Specifies the input file(s) in any format that can be read by "
"the Python Imaging Library (PIL). If no input images are given, then "
'a single image is read from standard input. The special filename "-" '
"can be used once to read an image from standard input. To read a "
'file in the current directory with the filename "-" (or with a '
'filename starting with "-"), pass it to img2pdf by explicitly '
'stating its relative path like "./-". Cannot be used together with '
"--from-file.",
)
parser.add_argument(
"-v",
"--verbose",
action="store_true",
help="Makes the program operate in verbose mode, printing messages on "
"standard error.",
)
parser.add_argument(
"-V",
"--version",
action="version",
version="%(prog)s " + __version__,
help="Prints version information and exits.",
)
parser.add_argument(
"--gui", dest="gui", action="store_true", help="run experimental tkinter gui"
)
parser.add_argument(
"--from-file",
metavar="FILE",
type=from_file,
default=[],
help="Read the list of images from FILE instead of passing them as "
"positional arguments. If this option is used, then the list of "
"positional arguments must be empty. The paths to the input images "
'in FILE are separated by NUL bytes. If FILE is "-" then the paths '
"are expected on standard input. This option is useful if you want "
"to pass more images than the maximum command length of your shell "
"permits. This option can be used with commands like `find -print0`.",
)
outargs = parser.add_argument_group(
title="General output arguments",
description="Arguments controlling the output format.",
)
# In Python3 we have to output to sys.stdout.buffer because we write bytes
# and not strings. In certain situations, like when the main
# function is wrapped by contextlib.redirect_stdout(), sys.stdout does not
# have the buffer attribute. Thus we write to sys.stdout by default and
# to sys.stdout.buffer if it exists.
outargs.add_argument(
"-o",
"--output",
metavar="out",
type=argparse.FileType("wb"),
default=sys.stdout.buffer if hasattr(sys.stdout, "buffer") else sys.stdout,
help="Makes the program output to a file instead of standard output.",
)
outargs.add_argument(
"-C",
"--colorspace",
metavar="colorspace",
type=parse_colorspacearg,
help="""
Forces the PIL colorspace. See the epilogue for a list of possible values.
Usually the PDF colorspace would be derived from the color space of the input
image. This option overwrites the automatically detected colorspace from the
input image and thus forces a certain colorspace in the output PDF /ColorSpace
property. This is useful for JPEG 2000 images with a different colorspace than
RGB.""",
)
outargs.add_argument(
"-D",
"--nodate",
action="store_true",
help="Suppresses timestamps in the output and thus makes the output "
"deterministic between individual runs. You can also manually "
"set a date using the --moddate and --creationdate options.",
)
outargs.add_argument(
"--engine",
metavar="engine",
type=parse_enginearg,
help="Choose PDF engine. Can be either internal, pikepdf or pdfrw. "
"The internal engine does not have additional requirements and writes "
"out a human readable PDF. The pikepdf engine requires the pikepdf "
"Python module and qpdf library, is most featureful, can "
'linearize PDFs ("fast web view") and can compress more parts of it. '
"The pdfrw engine requires the pdfrw Python "
"module but does not support unicode metadata (See "
"https://github.com/pmaupin/pdfrw/issues/39) or palette data (See "
"https://github.com/pmaupin/pdfrw/issues/128).",
)
outargs.add_argument(
"--first-frame-only",
action="store_true",
help="By default, img2pdf will convert multi-frame images like "
"multi-page TIFF or animated GIF images to one page per frame. "
"This option will only let the first frame of every multi-frame "
"input image be converted into a page in the resulting PDF.",
)
outargs.add_argument(
"--include-thumbnails",
action="store_true",
help="Some multi-frame formats like MPO carry a main image and "
"one or more scaled-down copies of the main image (thumbnails). "
"In such a case, img2pdf will only include the main image and "
"not create additional pages for each of the thumbnails. If this "
"option is set, img2pdf will instead create one page per frame and "
"thus store each thumbnail on its own page.",
)
outargs.add_argument(
"--pillow-limit-break",
action="store_true",
help="img2pdf uses the Python Imaging Library Pillow to read input "
"images. Pillow limits the maximum input image size to %d pixels "
"to prevent decompression bomb denial of service attacks. If "
"your input image contains more pixels than that, use this "
"option to disable this safety measure during this run of img2pdf."
% Image.MAX_IMAGE_PIXELS,
)
if sys.platform == "win32":
# on Windows, there are no default paths to search for an ICC profile
# so make the argument required instead of optional
outargs.add_argument(
"--pdfa",
type=validate_icc,
help="Output a PDF/A-1b compliant document. The argument to this "
"option is the path to the ICC profile that will be embedded into "
"the resulting PDF.",
)
else:
outargs.add_argument(
"--pdfa",
nargs="?",
const=get_default_icc_profile(),
default=None,
type=validate_icc,
help="Output a PDF/A-1b compliant document. By default, this will "
"embed either /usr/share/color/icc/sRGB.icc, "
"/usr/share/color/icc/OpenICC/sRGB.icc or "
"/usr/share/color/icc/colord/sRGB.icc as the color profile, whichever "
"is found to exist first.",
)
sizeargs = parser.add_argument_group(
title="Image and page size and layout arguments",
description="""\
Every input image will be placed on its own page. The image size is controlled
by the dpi value of the input image or, if unset or missing, the default dpi of
%.2f. By default, each page will have the same size as the image it shows.
Thus, there will be no visible border between the image and the page border by
default. If image size and page size are made different from each other by the
options in this section, the image will always be centered in both dimensions.
The image size and page size can be explicitly set using the --imgsize and
--pagesize options, respectively. If either dimension of the image size is
specified but the same dimension of the page size is not, then the latter will
be derived from the former using an optional minimal distance between the image
and the page border (given by the --border option) and/or a certain fitting
strategy (given by the --fit option). The converse happens if a dimension of
the page size is set but the same dimension of the image size is not.
Any length value in the options below is represented by the meta variable L,
which is a floating point value with an optional unit appended (without a space
between them). The default unit is pt (1/72 inch, the PDF unit) and other
allowed units are cm (centimeter), mm (millimeter), and in (inch).
Any size argument of the format LxL in the options below specifies the width
and height of a rectangle where the first L represents the width and the second
L represents the height with an optional unit following each value as described
above. Either width or height may be omitted. If the height is omitted, the
separating x can be omitted as well. Omitting the width requires prefixing the
height with the separating x. The missing dimension will be chosen so as not to
change the image aspect ratio. Instead of giving the width and height
explicitly, you may also specify some (case-insensitive) common page sizes such
as letter and A4. See the epilogue at the bottom for a complete list of the
valid sizes.
The --fit option scales to fit the image into a rectangle that is either
derived from the --imgsize option or otherwise from the --pagesize option.
If the --border option is given in addition to the --imgsize option while the
--pagesize option is not given, then the page size will be calculated from the
image size, respecting the border setting. If the --border option is given in
addition to the --pagesize option while the --imgsize option is not given, then
the image size will be calculated from the page size, respecting the border
setting. If the --border option is given while both the --pagesize and
--imgsize options are passed, then the --border option will be ignored.
The --pagesize option or the --imgsize option with the --border option will
determine the MediaBox size of the resulting PDF document.
"""
% default_dpi,
)
sizeargs.add_argument(
"-S",
"--pagesize",
metavar="LxL",
type=parse_pagesize_rectarg,
help="""
Sets the size of the PDF pages. The short-option is the upper case S because
it is a mnemonic for being bigger than the image size.""",
)
sizeargs.add_argument(
"-s",
"--imgsize",
metavar="LxL",
type=parse_imgsize_rectarg,
help="""
Sets the size of the images on the PDF pages. In addition, the unit dpi is
allowed which will set the image size as a value of dots per inch. Instead of
a unit, width and height values may also have a percentage sign appended,
indicating a resize of the image by that percentage. The short-option is the
lower case s because it is a mnemonic for being smaller than the page size.
""",
)
sizeargs.add_argument(
"-b",
"--border",
metavar="L[:L]",
type=parse_borderarg,
help="""
Specifies the minimal distance between the image border and the PDF page
border. This value is overwritten by explicit values set by --pagesize or
--imgsize. The value will be used when calculating page dimensions from the
image dimensions or the other way round. One or two length values can be given
as an argument, separated by a colon. One value specifies the minimal border on
all four sides. Two values specify the minimal border on the top/bottom and
left/right, respectively. It is not possible to specify asymmetric borders
because images will always be centered on the page.
""",
)
sizeargs.add_argument(
"-f",
"--fit",
metavar="FIT",
type=parse_fitarg,
default=FitMode.into,
help="""
If --imgsize is given, fits the image using these dimensions. Otherwise, fit
the image into the dimensions given by --pagesize. FIT is one of into, fill,
exact, shrink and enlarge. The default value is "into". See the epilogue at the
bottom for a description of the FIT options.
""",
)
sizeargs.add_argument(
"-a",
"--auto-orient",
action="store_true",
help="""
If both dimensions of the page are given via --pagesize, conditionally swaps
these dimensions such that the page orientation is the same as the orientation
of the input image. If the orientation of a page gets flipped, then so do the
values set via the --border option.
""",
)
sizeargs.add_argument(
"-r",
"--rotation",
"--orientation",
metavar="ROT",
type=parse_rotationarg,
default=Rotation.auto,
help="""
Specifies how input images should be rotated. ROT can be one of auto, none,
ifvalid, 0, 90, 180 and 270. The default value is auto and indicates that input
images are rotated according to their EXIF Orientation tag. The values none and
0 ignore the EXIF Orientation values of the input images. The value ifvalid
acts like auto but ignores invalid EXIF rotation values and only issues a
warning instead of throwing an error. This is useful because many devices like
Android phones, Canon cameras or scanners emit an invalid Orientation tag value
of zero. The values 90, 180 and 270 perform a clockwise rotation of the image.
""",
)
sizeargs.add_argument(
"--crop-border",
metavar="L[:L]",
type=parse_borderarg,
help="""
Specifies the border between the CropBox and the MediaBox. One, or two length
values can be given as an argument, separated by a colon. One value specifies
the border on all four sides. Two values specify the border on the top/bottom
and left/right, respectively. It is not possible to specify asymmetric borders.
""",
)
sizeargs.add_argument(
"--bleed-border",
metavar="L[:L]",
type=parse_borderarg,
help="""
Specifies the border between the BleedBox and the MediaBox. One, or two length
values can be given as an argument, separated by a colon. One value specifies
the border on all four sides. Two values specify the border on the top/bottom
and left/right, respectively. It is not possible to specify asymmetric borders.
""",
)
sizeargs.add_argument(
"--trim-border",
metavar="L[:L]",
type=parse_borderarg,
help="""
Specifies the border between the TrimBox and the MediaBox. One, or two length
values can be given as an argument, separated by a colon. One value specifies
the border on all four sides. Two values specify the border on the top/bottom
and left/right, respectively. It is not possible to specify asymmetric borders.
""",
)
sizeargs.add_argument(
"--art-border",
metavar="L[:L]",
type=parse_borderarg,
help="""
Specifies the border between the ArtBox and the MediaBox. One, or two length
values can be given as an argument, separated by a colon. One value specifies
the border on all four sides. Two values specify the border on the top/bottom
and left/right, respectively. It is not possible to specify asymmetric borders.
""",
)
metaargs = parser.add_argument_group(
title="Arguments setting metadata",
description="Options handling embedded timestamps, title and author "
"information.",
)
metaargs.add_argument(
"--title", metavar="title", type=str, help="Sets the title metadata value"
)
metaargs.add_argument(
"--author", metavar="author", type=str, help="Sets the author metadata value"
)
metaargs.add_argument(
"--creator", metavar="creator", type=str, help="Sets the creator metadata value"
)
metaargs.add_argument(
"--producer",
metavar="producer",
type=str,
default="img2pdf " + __version__,
help="Sets the producer metadata value "
"(default is: img2pdf " + __version__ + ")",
)
metaargs.add_argument(
"--creationdate",
metavar="creationdate",
type=valid_date,
help="Sets the UTC creation date metadata value in YYYY-MM-DD or "
"YYYY-MM-DDTHH:MM or YYYY-MM-DDTHH:MM:SS format or any format "
"understood by the Python dateutil module or any format understood "
"by `date --date`",
)
metaargs.add_argument(
"--moddate",
metavar="moddate",
type=valid_date,
help="Sets the UTC modification date metadata value in YYYY-MM-DD "
"or YYYY-MM-DDTHH:MM or YYYY-MM-DDTHH:MM:SS format or any format "
"understood by the Python dateutil module or any format understood "
"by `date --date`",
)
metaargs.add_argument(
"--subject", metavar="subject", type=str, help="Sets the subject metadata value"
)
metaargs.add_argument(
"--keywords",
metavar="kw",
type=str,
nargs="+",
help="Sets the keywords metadata value (can be given multiple times)",
)
viewerargs = parser.add_argument_group(
title="PDF viewer arguments",
description="PDF files can specify how they are meant to be "
"presented to the user by a PDF viewer",
)
viewerargs.add_argument(
"--viewer-panes",
metavar="PANES",
type=parse_panes,
help="Instruct the PDF viewer which side panes to show. Valid values "
'are "outlines" and "thumbs". It is not possible to specify both '
"at the same time.",
)
viewerargs.add_argument(
"--viewer-initial-page",
metavar="NUM",
type=int,
help="Instead of showing the first page, instruct the PDF viewer to "
"show the given page. Page numbers start with 1.",
)
viewerargs.add_argument(
"--viewer-magnification",
metavar="MAG",
type=parse_magnification,
help="Instruct the PDF viewer to open the PDF with a certain zoom "
"level. Valid values are either a floating point number giving "
'the exact zoom level, "fit" (zoom to fit whole page), "fith" '
'(zoom to fit page width) and "fitbh" (zoom to fit visible page '
"width).",
)
viewerargs.add_argument(
"--viewer-page-layout",
metavar="LAYOUT",
type=parse_layout,
help="Instruct the PDF viewer how to arrange the pages on the screen. "
'Valid values are "single" (display single pages), "onecolumn" '
'(one continuous column), "twocolumnright" (two continuous '
'columns with odd numbered pages on the right), "twocolumnleft" '
"(two continuous columns with odd numbered pages on the left), "
'"twopageright" (two pages with odd numbered page on the right) '
'and "twopageleft" (two pages with odd numbered page on the left)',
)
viewerargs.add_argument(
"--viewer-fit-window",
action="store_true",
help="Instruct the PDF viewer to resize the window to fit the page size",
)
viewerargs.add_argument(
"--viewer-center-window",
action="store_true",
help="Instruct the PDF viewer to center the PDF viewer window",
)
viewerargs.add_argument(
"--viewer-fullscreen",
action="store_true",
help="Instruct the PDF viewer to open the PDF in fullscreen mode",
)
return parser
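# Illustrative sketch (not img2pdf's own parser): the help text above describes
# length values "L" as a floating point number with an optional unit, where pt
# (1/72 inch) is the default and cm, mm and in are also allowed. The helper
# name and the conversion table below are assumptions made for illustration.
_DEMO_PT_PER_UNIT = {
    "pt": 1.0,
    "in": 72.0,  # one inch is 72 PDF points
    "cm": 72.0 / 2.54,  # 2.54 cm per inch
    "mm": 72.0 / 25.4,  # 25.4 mm per inch
}


def _demo_length_to_pt(value):
    # try the known unit suffixes; a bare number is taken as points
    for unit, factor in _DEMO_PT_PER_UNIT.items():
        if value.endswith(unit):
            return float(value[: -len(unit)]) * factor
    return float(value)


# Usage sketch: _demo_length_to_pt("1in") and _demo_length_to_pt("72") both
# yield 72.0, so "--border 1in" and "--border 72" would describe the same
# distance under these assumptions.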
def main(argv=sys.argv):
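# keep a reference to the parser because the error paths below use
# parser.prog and parser.print_usage()
parser = get_main_parser()
args = parser.parse_args(argv[1:])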
if args.verbose:
logging.basicConfig(level=logging.DEBUG)
if args.pillow_limit_break:
Image.MAX_IMAGE_PIXELS = None
if args.gui:
gui()
sys.exit(0)
layout_fun = get_layout_fun(
args.pagesize, args.imgsize, args.border, args.fit, args.auto_orient
)
if len(args.images) > 0 and len(args.from_file) > 0:
logger.error(
"%s: error: cannot use --from-file with positional arguments" % parser.prog
)
sys.exit(2)
elif len(args.images) == 0 and len(args.from_file) == 0:
# if no positional arguments were supplied, read a single image from
# standard input
print(
"Reading image from standard input...\n"
"Re-run with -h or --help for usage information.",
file=sys.stderr,
)
try:
images = [sys.stdin.buffer.read()]
except KeyboardInterrupt:
sys.exit(0)
elif len(args.images) > 0 and len(args.from_file) == 0:
# On windows, each positional argument can expand into multiple paths
# because we do globbing ourselves. Here we flatten the list of lists
# again.
images = list(chain.from_iterable(args.images))
elif len(args.images) == 0 and len(args.from_file) > 0:
images = args.from_file
# with the number of pages being equal to the number of images, the
# value passed to --viewer-initial-page must be between 1 and that number
if args.viewer_initial_page is not None:
if args.viewer_initial_page < 1:
parser.print_usage(file=sys.stderr)
logger.error(
"%s: error: argument --viewer-initial-page: must be "
"greater than zero" % parser.prog
)
sys.exit(2)
if args.viewer_initial_page > len(images):
parser.print_usage(file=sys.stderr)
logger.error(
"%s: error: argument --viewer-initial-page: must be "
"less than or equal to the total number of pages" % parser.prog
)
sys.exit(2)
try:
convert(
*images,
engine=args.engine,
title=args.title,
author=args.author,
creator=args.creator,
producer=args.producer,
creationdate=args.creationdate,
moddate=args.moddate,
subject=args.subject,
keywords=args.keywords,
colorspace=args.colorspace,
nodate=args.nodate,
layout_fun=layout_fun,
viewer_panes=args.viewer_panes,
viewer_initial_page=args.viewer_initial_page,
viewer_magnification=args.viewer_magnification,
viewer_page_layout=args.viewer_page_layout,
viewer_fit_window=args.viewer_fit_window,
viewer_center_window=args.viewer_center_window,
viewer_fullscreen=args.viewer_fullscreen,
outputstream=args.output,
first_frame_only=args.first_frame_only,
cropborder=args.crop_border,
bleedborder=args.bleed_border,
trimborder=args.trim_border,
artborder=args.art_border,
pdfa=args.pdfa,
rotation=args.rotation,
include_thumbnails=args.include_thumbnails,
)
except Exception as e:
logger.error("error: " + str(e))
if logger.isEnabledFor(logging.DEBUG):
import traceback
traceback.print_exc(file=sys.stderr)
sys.exit(1)
if __name__ == "__main__":
main()
# img2pdf-0.6.1/src/img2pdf_test.py
#!/usr/bin/env python3
import sys
import numpy
import scipy.signal
import zlib
import struct
import subprocess
import pytest
import re
import pikepdf
import hashlib
import img2pdf
import os
from io import BytesIO
from PIL import Image
import decimal
from packaging.version import parse as parse_version
import warnings
import json
import pathlib
import itertools
import xml.etree.ElementTree as ET
import platform
img2pdfprog = os.getenv("img2pdfprog", default="src/img2pdf.py")
ICC_PROFILE = None
ICC_PROFILE_PATHS = (
# Debian
"/usr/share/color/icc/ghostscript/srgb.icc",
# Fedora
"/usr/share/ghostscript/iccprofiles/srgb.icc",
# Archlinux and Gentoo
"/usr/share/ghostscript/*/iccprofiles/srgb.icc",
)
for glob in ICC_PROFILE_PATHS:
for path in pathlib.Path("/").glob(glob.lstrip("/")):
if path.is_file():
ICC_PROFILE = path
break
HAVE_FAKETIME = True
try:
ver = subprocess.check_output(["faketime", "--version"])
if b"faketime: Version " not in ver:
HAVE_FAKETIME = False
except FileNotFoundError:
HAVE_FAKETIME = False
HAVE_MUTOOL = True
try:
ver = subprocess.check_output(["mutool", "-v"], stderr=subprocess.STDOUT)
m = re.fullmatch(r"mutool version ([0-9.]+)\n", ver.decode("utf8"))
if m is None:
HAVE_MUTOOL = False
else:
if parse_version(m.group(1)) < parse_version("1.10.0"):
HAVE_MUTOOL = False
except FileNotFoundError:
HAVE_MUTOOL = False
if not HAVE_MUTOOL:
warnings.warn("mutool >= 1.10.0 not available, skipping checks...")
HAVE_PDFIMAGES_CMYK = True
try:
ver = subprocess.check_output(["pdfimages", "-v"], stderr=subprocess.STDOUT)
m = re.fullmatch(r"pdfimages version ([0-9.]+)", ver.split(b"\n")[0].decode("utf8"))
if m is None:
HAVE_PDFIMAGES_CMYK = False
else:
if parse_version(m.group(1)) < parse_version("0.42.0"):
HAVE_PDFIMAGES_CMYK = False
except FileNotFoundError:
HAVE_PDFIMAGES_CMYK = False
if not HAVE_PDFIMAGES_CMYK:
warnings.warn("pdfimages >= 0.42.0 not available, skipping CMYK checks...")
for prog in ["convert", "compare", "identify"]:
try:
subprocess.check_call([prog] + ["-version"], stderr=subprocess.STDOUT)
globals()[prog.upper()] = [prog]
except (subprocess.CalledProcessError, FileNotFoundError):
globals()[prog.upper()] = ["magick", prog]
HAVE_IMAGEMAGICK_MODERN = True
HAVE_EXACT_CMYK8 = True
try:
ver = subprocess.check_output(CONVERT + ["-version"], stderr=subprocess.STDOUT)
m = re.fullmatch(
r"Version: ImageMagick ([0-9.]+-[0-9]+) .*", ver.split(b"\n")[0].decode("utf8")
)
if m is None:
HAVE_IMAGEMAGICK_MODERN = False
HAVE_EXACT_CMYK8 = False
else:
if parse_version(m.group(1)) < parse_version("6.9.10-12"):
HAVE_IMAGEMAGICK_MODERN = False
if parse_version(m.group(1)) < parse_version("7.1.0-48"):
HAVE_EXACT_CMYK8 = False
except FileNotFoundError:
HAVE_IMAGEMAGICK_MODERN = False
HAVE_EXACT_CMYK8 = False
except subprocess.CalledProcessError:
HAVE_IMAGEMAGICK_MODERN = False
HAVE_EXACT_CMYK8 = False
if not HAVE_IMAGEMAGICK_MODERN:
warnings.warn("imagemagick >= 6.9.10-12 not available, skipping certain checks...")
HAVE_JP2 = True
try:
ver = subprocess.check_output(
IDENTIFY + ["-list", "format"], stderr=subprocess.STDOUT
)
found = False
for line in ver.split(b"\n"):
if re.match(rb"\s+JP2\* JP2\s+rw-\s+JPEG-2000 File Format Syntax", line):
found = True
break
if not found:
HAVE_JP2 = False
except FileNotFoundError:
HAVE_JP2 = False
except subprocess.CalledProcessError:
HAVE_JP2 = False
if not HAVE_JP2:
warnings.warn("imagemagick has no jpeg 2000 support, skipping certain checks...")
# the result of compare -metric PSNR is either just a floating point value or a
# floating point value followed by the same value multiplied by 0.01,
# surrounded by parentheses since ImageMagick 7.1.0-48:
# https://github.com/ImageMagick/ImageMagick/commit/751829cd4c911d7a42953a47c1f73068d9e7da2f
psnr_re = re.compile(rb"((?:inf|(?:0|[1-9][0-9]*)(?:\.[0-9]+)?))(?: \([0-9.]+\))?")
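# Illustrative sketch (not part of the test suite): the two output shapes that
# psnr_re is meant to accept, using a local copy of the pattern so that the
# snippet stands on its own. The helper name and the sample byte strings in the
# usage notes are made up.
def _demo_parse_psnr(output):
    import re

    pattern = re.compile(
        rb"((?:inf|(?:0|[1-9][0-9]*)(?:\.[0-9]+)?))(?: \([0-9.]+\))?"
    )
    m = pattern.fullmatch(output)
    # group 1 holds the PSNR value; the parenthesized suffix is discarded
    return None if m is None else m.group(1)


# Usage sketch:
#   _demo_parse_psnr(b"46.2031")             gives b"46.2031"
#   _demo_parse_psnr(b"46.2031 (0.462031)")  gives b"46.2031"
#   _demo_parse_psnr(b"inf")                 gives b"inf"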
###############################################################################
# HELPER FUNCTIONS #
###############################################################################
# Interpret a datetime string in a given timezone and format it according to a
# given format string in UTC.
# We avoid using the Python datetime module for this job because doing so would
# just replicate the code we want to test for correctness.
def tz2utcstrftime(string, fmt, timezone):
return (
subprocess.check_output(
[
"date",
"--utc",
f'--date=TZ="{timezone}" {string}',
f"+{fmt}",
]
)
.decode("utf8")
.removesuffix("\n")
)
def find_closest_palette_color(color, palette):
if color.ndim == 0:
idx = (numpy.abs(palette - color)).argmin()
else:
# naive distance function by computing the euclidean distance in RGB space
idx = ((palette - color) ** 2).sum(axis=-1).argmin()
return palette[idx]
def floyd_steinberg(img, palette):
result = numpy.array(img, copy=True)
for y in range(result.shape[0]):
for x in range(result.shape[1]):
oldpixel = result[y, x]
newpixel = find_closest_palette_color(oldpixel, palette)
quant_error = oldpixel - newpixel
result[y, x] = newpixel
if x + 1 < result.shape[1]:
result[y, x + 1] += quant_error * 7 / 16
if y + 1 < result.shape[0]:
result[y + 1, x - 1] += quant_error * 3 / 16
result[y + 1, x] += quant_error * 5 / 16
if x + 1 < result.shape[1] and y + 1 < result.shape[0]:
result[y + 1, x + 1] += quant_error * 1 / 16
return result
def convolve_rgba(img, kernel):
return numpy.stack(
(
scipy.signal.convolve2d(img[:, :, 0], kernel, "same"),
scipy.signal.convolve2d(img[:, :, 1], kernel, "same"),
scipy.signal.convolve2d(img[:, :, 2], kernel, "same"),
scipy.signal.convolve2d(img[:, :, 3], kernel, "same"),
),
axis=-1,
)
def rgb2gray(img):
result = numpy.zeros((60, 60), dtype=numpy.dtype("int64"))
count = 0
for y in range(img.shape[0]):
for x in range(img.shape[1]):
clin = sum(img[y, x] * [0.2126, 0.7152, 0.0722]) / 0xFFFF
if clin <= 0.0031308:
csrgb = 12.92 * clin
else:
csrgb = 1.055 * clin ** (1 / 2.4) - 0.055
result[y, x] = csrgb * 0xFFFF
count += 1
# if count == 24:
# raise Exception(result[y, x])
return result
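# The per-pixel arithmetic of rgb2gray() can be tried on single values. A
# minimal sketch under the assumption that the inputs are 16-bit channel
# values; demo_linear_to_srgb and demo_rgb16_to_gray16 are hypothetical names
# that do not exist in this test suite.

```python
def demo_linear_to_srgb(clin):
    """Apply the sRGB transfer function to a value in [0, 1]."""
    if clin <= 0.0031308:
        return 12.92 * clin
    return 1.055 * clin ** (1 / 2.4) - 0.055


def demo_rgb16_to_gray16(r, g, b):
    """Weight 16-bit channels with the Rec. 709 coefficients, then encode."""
    clin = (0.2126 * r + 0.7152 * g + 0.0722 * b) / 0xFFFF
    # int() truncates towards zero, like the assignment into an int64 array
    return int(demo_linear_to_srgb(clin) * 0xFFFF)


black = demo_rgb16_to_gray16(0, 0, 0)
white = demo_rgb16_to_gray16(0xFFFF, 0xFFFF, 0xFFFF)
red = demo_rgb16_to_gray16(0xFFFF, 0, 0)
```

# Because of floating point rounding and truncation, full white may end up one
# step below 0xFFFF.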
def palettize(img, pal):
result = numpy.zeros((img.shape[0], img.shape[1]), dtype=numpy.dtype("int64"))
for y in range(img.shape[0]):
for x in range(img.shape[1]):
for i, col in enumerate(pal):
if numpy.array_equal(img[y, x], col):
result[y, x] = i
break
else:
raise Exception()
return result
# we cannot use zlib.compress() because different compressors may compress the
# same data differently, for example by using different optimizations on
# different architectures:
# https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/R7GD4L5Z6HELCDAL2RDESWR2F3ZXHWVX/
#
# to make the compressed representation of the uncompressed data bit-by-bit
# identical on all platforms we make use of the compression method 0, that is,
# no compression at all :)
def compress(data):
# two-byte zlib header (rfc1950)
# common header for lowest compression level
# bits 0-3: Compression info, base-2 logarithm of the LZ77 window size,
# minus eight -- 7 indicates a 32K window size
# bits 4-7: Compression method -- 8 is deflate
# bits 8-9: Compression level -- 0 is fastest
# bit 10: preset dictionary -- 0 is none
# bits 11-15: check bits so that the 16-bit unsigned integer stored in MSB
# order is a multiple of 31
result = b"\x78\x01"
# content is stored in deflate format (rfc1951)
# maximum chunk size is the largest 16 bit unsigned integer
chunksize = 0xFFFF
for i in range(0, len(data), chunksize):
# bits 0-4 are unused
# bits 5-6 indicate compression method -- 0 is no compression
# bit 7 indicates the last chunk
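if i + chunksize < len(data):
result += b"\x00"
else:
# last chunk
result += b"\x01"
chunk = data[i : i + chunksize]
# the chunk length as little endian 16 bit unsigned integer
result += struct.pack("<H", len(chunk))
# the one's complement of the chunk length as little endian 16 bit
# unsigned integer
result += struct.pack("<H", len(chunk) ^ 0xFFFF)
result += chunk
# after the last chunk: adler32 checksum of the uncompressed data as big
# endian 32 bit unsigned integer
result += struct.pack(">I", zlib.adler32(data))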
return result
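# A stream that consists only of stored blocks is still a valid zlib stream,
# which can be verified with zlib.decompress(). A minimal, self-contained
# sketch of the layout described in the comments above; demo_store is a
# hypothetical stand-in written for this example, not the helper used by the
# fixtures.

```python
import struct
import zlib


def demo_store(data, chunksize=0xFFFF):
    """Wrap data in a zlib stream made only of stored (uncompressed) blocks."""
    out = b"\x78\x01"  # zlib header: deflate, 32K window, fastest level
    # empty input still needs one (empty) final block
    offsets = range(0, len(data), chunksize) or [0]
    for i in offsets:
        chunk = data[i : i + chunksize]
        final = i + chunksize >= len(data)
        out += b"\x01" if final else b"\x00"  # last-block flag, type "stored"
        out += struct.pack("<H", len(chunk))  # chunk length, little endian
        out += struct.pack("<H", len(chunk) ^ 0xFFFF)  # its one's complement
        out += chunk
    out += struct.pack(">I", zlib.adler32(data))  # big endian checksum trailer
    return out


payload = bytes(range(256)) * 600  # 153600 bytes, needs three stored blocks
stream = demo_store(payload)
roundtrip = zlib.decompress(stream)
```

# Each stored block adds five bytes of framing, so the stream is always a
# little larger than the input.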
def write_png(data, path, bitdepth, colortype, palette=None, iccp=None):
with open(str(path), "wb") as f:
f.write(b"\x89PNG\r\n\x1A\n")
# PNG image type         Colour type   Allowed bit depths
# Greyscale              0             1, 2, 4, 8, 16
# Truecolour             2             8, 16
# Indexed-colour         3             1, 2, 4, 8
# Greyscale with alpha   4             8, 16
# Truecolour with alpha  6             8, 16
block = b"IHDR" + struct.pack(
">IIBBBBB",
data.shape[1], # width
data.shape[0], # height
bitdepth, # bitdepth
colortype, # colortype
0, # compression
0, # filtertype
0, # interlaced
)
f.write(
struct.pack(">I", len(block) - 4)
+ block
+ struct.pack(">I", zlib.crc32(block))
)
if iccp is not None:
with open(iccp, "rb") as infh:
iccdata = infh.read()
block = b"iCCP"
block += b"icc\0" # arbitrary profile name
block += b"\0" # compression method (deflate)
block += zlib.compress(iccdata)
f.write(
struct.pack(">I", len(block) - 4)
+ block
+ struct.pack(">I", zlib.crc32(block))
)
if palette is not None:
block = b"PLTE"
for col in palette:
block += struct.pack(">BBB", col[0], col[1], col[2])
f.write(
struct.pack(">I", len(block) - 4)
+ block
+ struct.pack(">I", zlib.crc32(block))
)
raw = b""
for y in range(data.shape[0]):
raw += b"\0"
if bitdepth == 16:
raw += data[y].astype(">u2").tobytes()
elif bitdepth == 8:
raw += data[y].astype(">u1").tobytes()
elif bitdepth in [4, 2, 1]:
valsperbyte = 8 // bitdepth
for x in range(0, data.shape[1], valsperbyte):
val = 0
for j in range(valsperbyte):
if x + j >= data.shape[1]:
break
val |= (data[y, x + j].astype(">u2") & (2**bitdepth - 1)) << (
(valsperbyte - j - 1) * bitdepth
)
raw += struct.pack(">B", val)
else:
raise Exception()
compressed = compress(raw)
block = b"IDAT" + compressed
f.write(
struct.pack(">I", len(compressed))
+ block
+ struct.pack(">I", zlib.crc32(block))
)
block = b"IEND"
f.write(struct.pack(">I", 0) + block + struct.pack(">I", zlib.crc32(block)))
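# Every chunk written by write_png() uses the same framing: a big endian
# length of the payload, the four-byte chunk type, the payload, and a CRC-32
# over type and payload. A minimal sketch that builds and re-reads a tiny PNG;
# demo_png_chunk and demo_read_chunks are hypothetical helpers, and the IDAT
# payload uses zlib.compress() here purely for brevity.

```python
import struct
import zlib


def demo_png_chunk(chunktype, payload):
    """Frame one PNG chunk: length, type, payload, CRC over type and payload."""
    body = chunktype + payload
    return (
        struct.pack(">I", len(payload)) + body + struct.pack(">I", zlib.crc32(body))
    )


def demo_read_chunks(png):
    """Yield (type, payload) pairs and verify each CRC along the way."""
    assert png[:8] == b"\x89PNG\r\n\x1a\n"
    pos = 8
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos : pos + 4])
        chunktype = png[pos + 4 : pos + 8]
        payload = png[pos + 8 : pos + 8 + length]
        (crc,) = struct.unpack(">I", png[pos + 8 + length : pos + 12 + length])
        assert crc == zlib.crc32(chunktype + payload)
        yield chunktype, payload
        pos += 12 + length


# a 1x1 8-bit greyscale image: one filter byte (0) followed by one sample
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
demo_png = (
    b"\x89PNG\r\n\x1a\n"
    + demo_png_chunk(b"IHDR", ihdr)
    + demo_png_chunk(b"IDAT", zlib.compress(b"\x00\x7f"))
    + demo_png_chunk(b"IEND", b"")
)
chunks = list(demo_read_chunks(demo_png))
```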
def compare(im1, im2, exact, icc, cmyk):
if exact:
if cmyk and not HAVE_EXACT_CMYK8:
raise Exception("cmyk cannot be exact before ImageMagick 7.1.0-48")
elif icc:
raise Exception("icc cannot be exact")
else:
subprocess.check_call(
COMPARE
+ [
"-metric",
"AE",
"-alpha",
"off",
im1,
im2,
"null:",
]
)
else:
iccargs = []
if icc:
if ICC_PROFILE is None:
pytest.skip("Could not locate an ICC profile")
iccargs = ["-profile", ICC_PROFILE]
psnr = subprocess.run(
COMPARE
+ iccargs
+ [
"-metric",
"PSNR",
im1,
im2,
"null:",
],
check=False,
stderr=subprocess.PIPE,
).stderr
assert psnr != b"0"
assert psnr != b"0 (0)"
assert psnr_re.fullmatch(psnr) is not None, psnr
psnr = psnr_re.fullmatch(psnr).group(1)
psnr = float(psnr)
assert psnr != 0 # or otherwise we would use the exact variant
assert psnr > 50
def compare_ghostscript(tmpdir, img, pdf, gsdevice="png16m", exact=True, icc=False):
if gsdevice in ["png16m", "pnggray"]:
ext = "png"
elif gsdevice in ["tiff24nc", "tiff32nc", "tiff48nc"]:
ext = "tiff"
else:
raise Exception("unknown gsdevice: " + gsdevice)
subprocess.check_call(
[
"gs",
"-dQUIET",
"-dNOPAUSE",
"-dBATCH",
"-sDEVICE=" + gsdevice,
"-r96",
"-sOutputFile=" + str(tmpdir / "gs-") + "%00d." + ext,
str(pdf),
]
)
compare(str(img), str(tmpdir / "gs-1.") + ext, exact, icc, False)
(tmpdir / ("gs-1." + ext)).unlink()
def compare_poppler(tmpdir, img, pdf, exact=True, icc=False):
subprocess.check_call(
["pdftocairo", "-r", "96", "-png", str(pdf), str(tmpdir / "poppler")]
)
compare(str(img), str(tmpdir / "poppler-1.png"), exact, icc, False)
(tmpdir / "poppler-1.png").unlink()
def compare_mupdf(tmpdir, img, pdf, exact=True, cmyk=False):
if not HAVE_MUTOOL:
return
if cmyk:
out = tmpdir / "mupdf.pam"
subprocess.check_call(
["mutool", "draw", "-r", "96", "-c", "cmyk", "-o", str(out), str(pdf)]
)
else:
out = tmpdir / "mupdf.png"
subprocess.check_call(
["mutool", "draw", "-r", "96", "-png", "-o", str(out), str(pdf)]
)
compare(str(img), str(out), exact, False, cmyk)
out.unlink()
def compare_pdfimages_jpg(tmpdir, img, pdf):
subprocess.check_call(["pdfimages", "-j", str(pdf), str(tmpdir / "images")])
assert img.read_bytes() == (tmpdir / "images-000.jpg").read_bytes()
(tmpdir / "images-000.jpg").unlink()
def compare_pdfimages_cmyk(tmpdir, img, pdf):
if not HAVE_PDFIMAGES_CMYK:
return
subprocess.check_call(["pdfimages", "-j", str(pdf), str(tmpdir / "images")])
assert img.read_bytes() == (tmpdir / "images-000.jpg").read_bytes()
(tmpdir / "images-000.jpg").unlink()
def compare_pdfimages_jp2(tmpdir, img, pdf):
subprocess.check_call(["pdfimages", "-jp2", str(pdf), str(tmpdir / "images")])
assert img.read_bytes() == (tmpdir / "images-000.jp2").read_bytes()
(tmpdir / "images-000.jp2").unlink()
def compare_pdfimages_tiff(tmpdir, img, pdf):
subprocess.check_call(["pdfimages", "-tiff", str(pdf), str(tmpdir / "images")])
subprocess.check_call(
COMPARE
+ [
"-metric",
"AE",
str(img),
str(tmpdir / "images-000.tif"),
"null:",
]
)
(tmpdir / "images-000.tif").unlink()
def compare_pdfimages_png(tmpdir, img, pdf, exact=True, icc=False):
subprocess.check_call(["pdfimages", "-png", str(pdf), str(tmpdir / "images")])
# images-001.png is the grayscale SMask image (the original alpha channel)
if os.path.isfile(tmpdir / "images-001.png"):
subprocess.check_call(
CONVERT
+ [
str(tmpdir / "images-000.png"),
str(tmpdir / "images-001.png"),
"-compose",
"copy-opacity",
"-composite",
str(tmpdir / "composite.png"),
]
)
(tmpdir / "images-000.png").unlink()
(tmpdir / "images-001.png").unlink()
os.rename(tmpdir / "composite.png", tmpdir / "images-000.png")
if exact:
if icc:
raise Exception("not exact with icc")
subprocess.check_call(
COMPARE
+ [
"-metric",
"AE",
str(img),
str(tmpdir / "images-000.png"),
"null:",
]
)
else:
if icc:
if ICC_PROFILE is None:
pytest.skip("Could not locate an ICC profile")
psnr = subprocess.run(
COMPARE
+ [
"-metric",
"PSNR",
"(",
"-profile",
ICC_PROFILE,
"-depth",
"8",
str(img),
")",
str(tmpdir / "images-000.png"),
"null:",
],
check=False,
stderr=subprocess.PIPE,
).stderr
else:
psnr = subprocess.run(
COMPARE
+ [
"-metric",
"PSNR",
str(img),
str(tmpdir / "images-000.png"),
"null:",
],
check=False,
stderr=subprocess.PIPE,
).stderr
assert psnr != b"0"
assert psnr != b"0 (0)"
psnr = psnr_re.fullmatch(psnr).group(1)
psnr = float(psnr)
assert psnr != 0 # or otherwise we would use the exact variant
assert psnr > 50
(tmpdir / "images-000.png").unlink()
def tiff_header_for_ccitt(width, height, img_size, ccitt_group=4):
# Quick and dirty TIFF header builder from
# https://stackoverflow.com/questions/2641770
tiff_header_struct = "<" + "2s" + "h" + "l" + "h" + "hhll" * 8 + "h"
return struct.pack(
# fmt: off
tiff_header_struct,
b'II', # Byte order indication: little endian
42, # Version number (always 42)
8, # Offset to first IFD
8, # Number of tags in IFD
256, 4, 1, width, # ImageWidth, LONG, 1, width
257, 4, 1, height, # ImageLength, LONG, 1, length
258, 3, 1, 1, # BitsPerSample, SHORT, 1, 1
259, 3, 1, ccitt_group, # Compression, SHORT, 1, 4 = CCITT Group 4
262, 3, 1, 1, # PhotometricInterpretation, SHORT, 1, 1 = BlackIsZero
273, 4, 1, struct.calcsize(
tiff_header_struct), # StripOffsets, LONG, 1, len of header
278, 4, 1, height, # RowsPerStrip, LONG, 1, length
279, 4, 1, img_size, # StripByteCounts, LONG, 1, size of image
0
# last IFD
# fmt: on
)
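# The header layout can be inspected by packing it and reading the IFD entries
# back. A minimal sketch that repeats the struct format so that it runs on its
# own; demo_header and demo_read_ifd are hypothetical names and the image
# dimensions are arbitrary.

```python
import struct

demo_fmt = "<" + "2s" + "h" + "l" + "h" + "hhll" * 8 + "h"
header_len = struct.calcsize(demo_fmt)


def demo_header(width, height, img_size, ccitt_group=4):
    """Pack the same eight IFD entries as the header builder above."""
    return struct.pack(
        demo_fmt, b"II", 42, 8, 8,
        256, 4, 1, width,
        257, 4, 1, height,
        258, 3, 1, 1,
        259, 3, 1, ccitt_group,
        262, 3, 1, 1,
        273, 4, 1, header_len,
        278, 4, 1, height,
        279, 4, 1, img_size,
        0,
    )


def demo_read_ifd(header):
    """Return byte order, magic number and a {tag: value} dict."""
    byteorder, magic, ifd_offset = struct.unpack_from("<2shl", header, 0)
    (count,) = struct.unpack_from("<h", header, ifd_offset)
    tags = {}
    for n in range(count):
        # each entry is tag, field type, value count, value: 12 bytes
        tag, _type, _count, value = struct.unpack_from(
            "<hhll", header, ifd_offset + 2 + 12 * n
        )
        tags[tag] = value
    return byteorder, magic, tags


byteorder, magic, tags = demo_read_ifd(demo_header(60, 40, 1234))
```

# The StripOffsets entry (tag 273) equals the header length, so the image data
# is expected to follow the header directly.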
pixel_R = [
[1, 1, 1, 0],
[1, 0, 0, 1],
[1, 0, 0, 1],
[1, 1, 1, 0],
[1, 0, 0, 1],
[1, 0, 0, 1],
[1, 0, 0, 1],
]
pixel_G = [
[0, 1, 1, 0],
[1, 0, 0, 1],
[1, 0, 0, 0],
[1, 0, 1, 1],
[1, 0, 0, 1],
[1, 0, 0, 1],
[0, 1, 1, 0],
]
pixel_B = [
[1, 1, 1, 0],
[1, 0, 0, 1],
[1, 0, 0, 1],
[1, 1, 1, 0],
[1, 0, 0, 1],
[1, 0, 0, 1],
[1, 1, 1, 0],
]
def alpha_value():
# gaussian kernel with sigma=3
kernel = numpy.array(
[
[0.011362, 0.014962, 0.017649, 0.018648, 0.017649, 0.014962, 0.011362],
[0.014962, 0.019703, 0.02324, 0.024556, 0.02324, 0.019703, 0.014962],
[0.017649, 0.02324, 0.027413, 0.028964, 0.027413, 0.02324, 0.017649],
[0.018648, 0.024556, 0.028964, 0.030603, 0.028964, 0.024556, 0.018648],
[0.017649, 0.02324, 0.027413, 0.028964, 0.027413, 0.02324, 0.017649],
[0.014962, 0.019703, 0.02324, 0.024556, 0.02324, 0.019703, 0.014962],
[0.011362, 0.014962, 0.017649, 0.018648, 0.017649, 0.014962, 0.011362],
],
float,
)
# constructs a 2D array of a circle with a width of 36
circle = list()
offsets_36 = [14, 11, 9, 7, 6, 5, 4, 3, 3, 2, 2, 1, 1, 1, 0, 0, 0, 0]
for offs in offsets_36 + offsets_36[::-1]:
circle.append([0] * offs + [1] * (len(offsets_36) - offs) * 2 + [0] * offs)
alpha = numpy.zeros((60, 60, 4), dtype=numpy.dtype("int64"))
# draw three circles
for xpos, ypos, color in [
(12, 3, [0xFFFF, 0, 0, 0xFFFF]),
(21, 21, [0, 0xFFFF, 0, 0xFFFF]),
(3, 21, [0, 0, 0xFFFF, 0xFFFF]),
]:
for x, row in enumerate(circle):
for y, pos in enumerate(row):
if pos:
alpha[y + ypos, x + xpos] += color
alpha = numpy.clip(alpha, 0, 0xFFFF)
alpha = convolve_rgba(alpha, kernel)
# draw letters
for y, row in enumerate(pixel_R):
for x, pos in enumerate(row):
if pos:
alpha[13 + y, 28 + x] = [0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF]
for y, row in enumerate(pixel_G):
for x, pos in enumerate(row):
if pos:
alpha[39 + y, 40 + x] = [0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF]
for y, row in enumerate(pixel_B):
for x, pos in enumerate(row):
if pos:
alpha[39 + y, 15 + x] = [0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF]
return alpha
def icc_profile():
PCS = (0.96420288, 1.0, 0.82490540) # D50 illuminant constants
# approximate X,Y,Z values for white, red, green and blue
white = (0.95, 1.0, 1.09)
red = (0.44, 0.22, 0.014)
green = (0.39, 0.72, 0.1)
blue = (0.14, 0.06, 0.71)
getxyz = lambda v: (round(65536 * v[0]), round(65536 * v[1]), round(65536 * v[2]))
header = (
# header
+4 * b"\0" # cmmsignatures
+ 4 * b"\0" # version
+ b"mntr" # device class
+ b"RGB " # color space
+ b"XYZ " # PCS
+ 12 * b"\0" # datetime
+ b"\x61\x63\x73\x70" # static signature
+ 4 * b"\0" # platform
+ 4 * b"\0" # flags
+ 4 * b"\0" # device manufacturer
+ 4 * b"\0" # device model
+ 8 * b"\0" # device attributes
+ 4 * b"\0" # rendering intents
+ struct.pack(">III", *getxyz(PCS))
+ 4 * b"\0" # creator
+ 16 * b"\0" # identifier
+ 28 * b"\0" # reserved
)
def pad4(s):
if len(s) % 4 == 0:
return s
else:
return s + b"\x00" * (4 - len(s) % 4)
tagdata = [
b"desc\x00\x00\x00\x00" + struct.pack(">I", 5) + b"fake" + 79 * b"\x00",
b"XYZ \x00\x00\x00\x00" + struct.pack(">III", *getxyz(white)),
# by mixing up red, green and blue, we create a test profile
b"XYZ \x00\x00\x00\x00" + struct.pack(">III", *getxyz(blue)), # red
b"XYZ \x00\x00\x00\x00" + struct.pack(">III", *getxyz(red)), # green
b"XYZ \x00\x00\x00\x00" + struct.pack(">III", *getxyz(green)), # blue
# by only supplying two values, we create the most trivial "curve",
# where the remaining values will be linearly interpolated between them
b"curv\x00\x00\x00\x00" + struct.pack(">IHH", 2, 0, 65535),
b"text\x00\x00\x00\x00" + b"no copyright, use freely" + 1 * b"\x00",
]
table = [
(b"desc", 0),
(b"wtpt", 1),
(b"rXYZ", 2),
(b"gXYZ", 3),
(b"bXYZ", 4),
# we use the same curve for all three channels, so the same offset is referenced
(b"rTRC", 5),
(b"gTRC", 5),
(b"bTRC", 5),
(b"cprt", 6),
]
offset = (
lambda n: 4 # total size
+ len(header) # header length
+ 4 # number table entries
+ len(table) * 12 # table length
+ sum([len(pad4(s)) for s in tagdata[:n]])
)
table = struct.pack(">I", len(table)) + b"".join(
[t + struct.pack(">II", offset(o), len(tagdata[o])) for t, o in table]
)
data = b"".join([pad4(s) for s in tagdata])
data = (
struct.pack(">I", 4 + len(header) + len(table) + len(data))
+ header
+ table
+ data
)
return data
###############################################################################
# INPUT FIXTURES #
###############################################################################
@pytest.fixture(scope="session")
def alpha():
return alpha_value()
@pytest.fixture(scope="session")
def tmp_alpha_png(tmp_path_factory, alpha):
tmp_alpha_png = tmp_path_factory.mktemp("alpha_png") / "alpha.png"
write_png(alpha, str(tmp_alpha_png), 16, 6)
assert (
hashlib.md5(tmp_alpha_png.read_bytes()).hexdigest()
== "600bb4cffb039a022cec6ed55537deba"
)
return tmp_alpha_png
@pytest.fixture(scope="session")
def tmp_gray1_png(tmp_path_factory, alpha):
normal16 = alpha[:, :, 0:3]
gray16 = rgb2gray(normal16)
tmp_gray1_png = tmp_path_factory.mktemp("gray1_png") / "gray1.png"
write_png(
floyd_steinberg(gray16, numpy.arange(2) / 0x1 * 0xFFFF) / 0xFFFF * 0x1,
str(tmp_gray1_png),
1,
0,
)
assert (
hashlib.md5(tmp_gray1_png.read_bytes()).hexdigest()
== "dd2c528152d34324747355b73495a115"
)
return tmp_gray1_png
@pytest.fixture(scope="session")
def tmp_gray2_png(tmp_path_factory, alpha):
normal16 = alpha[:, :, 0:3]
gray16 = rgb2gray(normal16)
tmp_gray2_png = tmp_path_factory.mktemp("gray2_png") / "gray2.png"
write_png(
floyd_steinberg(gray16, numpy.arange(4) / 0x3 * 0xFFFF) / 0xFFFF * 0x3,
str(tmp_gray2_png),
2,
0,
)
assert (
hashlib.md5(tmp_gray2_png.read_bytes()).hexdigest()
== "68e614f4e6a85053d47098dad0ca3976"
)
return tmp_gray2_png
@pytest.fixture(scope="session")
def tmp_gray4_png(tmp_path_factory, alpha):
normal16 = alpha[:, :, 0:3]
gray16 = rgb2gray(normal16)
tmp_gray4_png = tmp_path_factory.mktemp("gray4_png") / "gray4.png"
write_png(
floyd_steinberg(gray16, numpy.arange(16) / 0xF * 0xFFFF) / 0xFFFF * 0xF,
str(tmp_gray4_png),
4,
0,
)
assert (
hashlib.md5(tmp_gray4_png.read_bytes()).hexdigest()
== "ff04a6fea88133eb77bbb748692ae0fd"
)
return tmp_gray4_png
@pytest.fixture(scope="session")
def tmp_gray8_png(tmp_path_factory, alpha):
normal16 = alpha[:, :, 0:3]
gray16 = rgb2gray(normal16)
tmp_gray8_png = tmp_path_factory.mktemp("gray8_png") / "gray8.png"
write_png(gray16 / 0xFFFF * 0xFF, tmp_gray8_png, 8, 0)
assert (
hashlib.md5(tmp_gray8_png.read_bytes()).hexdigest()
== "90b4ed9123f295dda7fde499744dede7"
)
return tmp_gray8_png
@pytest.fixture(scope="session")
def tmp_gray16_png(tmp_path_factory, alpha):
normal16 = alpha[:, :, 0:3]
gray16 = rgb2gray(normal16)
tmp_gray16_png = tmp_path_factory.mktemp("gray16_png") / "gray16.png"
write_png(gray16, str(tmp_gray16_png), 16, 0)
assert (
hashlib.md5(tmp_gray16_png.read_bytes()).hexdigest()
== "f76153d2e72fada11d934c32c8168a57"
)
return tmp_gray16_png
@pytest.fixture(scope="session")
def tmp_inverse_png(tmp_path_factory, alpha):
normal16 = alpha[:, :, 0:3]
tmp_inverse_png = tmp_path_factory.mktemp("inverse_png") / "inverse.png"
write_png(0xFF - normal16 / 0xFFFF * 0xFF, str(tmp_inverse_png), 8, 2)
assert (
hashlib.md5(tmp_inverse_png.read_bytes()).hexdigest()
== "0a7d57dc09c4d8fd1ad3511b116c7dfa"
)
return tmp_inverse_png
@pytest.fixture(scope="session")
def tmp_icc_profile(tmp_path_factory):
tmp_icc_profile = tmp_path_factory.mktemp("icc_profile") / "fake.icc"
tmp_icc_profile.write_bytes(icc_profile())
return tmp_icc_profile
@pytest.fixture(scope="session")
def tmp_icc_png(tmp_path_factory, alpha, tmp_icc_profile):
normal16 = alpha[:, :, 0:3]
tmp_icc_png = tmp_path_factory.mktemp("icc_png") / "icc.png"
write_png(
normal16 / 0xFFFF * 0xFF,
str(tmp_icc_png),
8,
2,
iccp=str(tmp_icc_profile),
)
assert (
hashlib.md5(tmp_icc_png.read_bytes()).hexdigest()
== "bf25f673c1617f5f9353b2a043747655"
)
return tmp_icc_png
@pytest.fixture(scope="session")
def tmp_normal16_png(tmp_path_factory, alpha):
normal16 = alpha[:, :, 0:3]
tmp_normal16_png = tmp_path_factory.mktemp("normal16_png") / "normal16.png"
write_png(normal16, str(tmp_normal16_png), 16, 2)
assert (
hashlib.md5(tmp_normal16_png.read_bytes()).hexdigest()
== "820dd30a2566775fc64c110e8ac65c7e"
)
return tmp_normal16_png
@pytest.fixture(scope="session")
def tmp_normal_png(tmp_path_factory, alpha):
normal16 = alpha[:, :, 0:3]
tmp_normal_png = tmp_path_factory.mktemp("normal_png") / "normal.png"
write_png(normal16 / 0xFFFF * 0xFF, str(tmp_normal_png), 8, 2)
assert (
hashlib.md5(tmp_normal_png.read_bytes()).hexdigest()
== "bc30c705f455991cd04be1c298063002"
)
return tmp_normal_png
@pytest.fixture(scope="session")
def tmp_palette1_png(tmp_path_factory, alpha):
normal16 = alpha[:, :, 0:3]
tmp_palette1_png = tmp_path_factory.mktemp("palette1_png") / "palette1.png"
# don't choose black and white or otherwise imagemagick will classify the
# image as bilevel with 8/1-bit depth instead of palette with 8-bit color
# don't choose gray colors or otherwise imagemagick will classify the
# image as grayscale
pal1 = numpy.array(
[[0x01, 0x02, 0x03], [0xFE, 0xFD, 0xFC]], dtype=numpy.dtype("int64")
)
write_png(
palettize(
floyd_steinberg(normal16, pal1 * 0xFFFF / 0xFF) / 0xFFFF * 0xFF, pal1
),
str(tmp_palette1_png),
1,
3,
pal1,
)
assert (
hashlib.md5(tmp_palette1_png.read_bytes()).hexdigest()
== "3d065f731540e928fb730b3233e4e8a7"
)
return tmp_palette1_png
@pytest.fixture(scope="session")
def tmp_palette2_png(tmp_path_factory, alpha):
normal16 = alpha[:, :, 0:3]
tmp_palette2_png = tmp_path_factory.mktemp("palette2_png") / "palette2.png"
# choose values slightly off red, lime and blue because otherwise
# imagemagick will classify the image as Depth: 8/1-bit
pal2 = numpy.array(
[[0, 0, 0], [0xFE, 0, 0], [0, 0xFE, 0], [0, 0, 0xFE]],
dtype=numpy.dtype("int64"),
)
write_png(
palettize(
floyd_steinberg(normal16, pal2 * 0xFFFF / 0xFF) / 0xFFFF * 0xFF, pal2
),
str(tmp_palette2_png),
2,
3,
pal2,
)
assert (
hashlib.md5(tmp_palette2_png.read_bytes()).hexdigest()
== "0b0d4412c28da26163a622d218ee02ca"
)
return tmp_palette2_png
@pytest.fixture(scope="session")
def tmp_palette4_png(tmp_path_factory, alpha):
normal16 = alpha[:, :, 0:3]
tmp_palette4_png = tmp_path_factory.mktemp("palette4_png") / "palette4.png"
# windows 16 color palette
pal4 = numpy.array(
[
[0x00, 0x00, 0x00],
[0x80, 0x00, 0x00],
[0x00, 0x80, 0x00],
[0x80, 0x80, 0x00],
[0x00, 0x00, 0x80],
[0x80, 0x00, 0x80],
[0x00, 0x80, 0x80],
[0xC0, 0xC0, 0xC0],
[0x80, 0x80, 0x80],
[0xFF, 0x00, 0x00],
[0x00, 0xFF, 0x00],
[0xFF, 0x00, 0x00],
[0x00, 0xFF, 0x00],
[0xFF, 0x00, 0xFF],
[0x00, 0xFF, 0x00],
[0xFF, 0xFF, 0xFF],
],
dtype=numpy.dtype("int64"),
)
write_png(
palettize(
floyd_steinberg(normal16, pal4 * 0xFFFF / 0xFF) / 0xFFFF * 0xFF, pal4
),
str(tmp_palette4_png),
4,
3,
pal4,
)
assert (
hashlib.md5(tmp_palette4_png.read_bytes()).hexdigest()
== "163f6d7964b80eefa0dc6a48cb7315dd"
)
return tmp_palette4_png
@pytest.fixture(scope="session")
def tmp_palette8_png(tmp_path_factory, alpha):
normal16 = alpha[:, :, 0:3]
tmp_palette8_png = tmp_path_factory.mktemp("palette8_png") / "palette8.png"
# create a 256 color palette by first writing 16 shades of gray
# and then writing an array of RGB colors with 6, 8 and 5 levels
# for red, green and blue, respectively
pal8 = numpy.zeros((256, 3), dtype=numpy.dtype("int64"))
i = 0
for gray in range(15, 255, 15):
pal8[i] = [gray, gray, gray]
i += 1
for red in 0, 0x33, 0x66, 0x99, 0xCC, 0xFF:
for green in 0, 0x24, 0x49, 0x6D, 0x92, 0xB6, 0xDB, 0xFF:
for blue in 0, 0x40, 0x80, 0xBF, 0xFF:
pal8[i] = [red, green, blue]
i += 1
assert i == 256
write_png(
palettize(
floyd_steinberg(normal16, pal8 * 0xFFFF / 0xFF) / 0xFFFF * 0xFF, pal8
),
str(tmp_palette8_png),
8,
3,
pal8,
)
assert (
hashlib.md5(tmp_palette8_png.read_bytes()).hexdigest()
== "8847bb734eba0e2d85e3f97fc2849dd4"
)
return tmp_palette8_png
@pytest.fixture(scope="session")
def jpg_img(tmp_path_factory, tmp_normal_png):
in_img = tmp_path_factory.mktemp("jpg") / "in.jpg"
subprocess.check_call(CONVERT + [str(tmp_normal_png), str(in_img)])
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "JPEG", str(identify)
assert identify[0]["image"].get("mimeType") == "image/jpeg", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert "resolution" not in identify[0]["image"]
assert identify[0]["image"].get("units") == "Undefined", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) == "Undefined", str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("compression") == "JPEG", str(identify)
assert identify[0]["image"].get("orientation") == "Undefined", str(identify)
assert (
identify[0]["image"].get("properties", {}).get("jpeg:colorspace") == "2"
), str(identify)
return in_img
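# The fixtures repeat the same step for coping with both JSON shapes: a bare
# dictionary from older ImageMagick versions and a one-element array from
# newer ones. A minimal sketch of that step as a function;
# demo_normalize_identify is a hypothetical helper that does not exist in this
# test suite, and the JSON strings are heavily abbreviated, made-up samples.

```python
import json


def demo_normalize_identify(raw):
    """Parse `convert <img> json:` output into a list of per-image dicts."""
    parsed = json.loads(raw)
    # older versions emit the bare dictionary instead of an array
    if isinstance(parsed, dict) and "image" in parsed:
        parsed = [parsed]
    return parsed


old_output = '{"image": {"format": "JPEG", "depth": 8}}'
new_output = '[{"version": "1.0", "image": {"format": "JPEG", "depth": 8}}]'
old_images = demo_normalize_identify(old_output)
new_images = demo_normalize_identify(new_output)
```

# After normalization, both shapes can be checked with identify[0]["image"].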
@pytest.fixture(scope="session")
def jpg_rot_img(tmp_path_factory, tmp_normal_png):
in_img = tmp_path_factory.mktemp("jpg_rot") / "in.jpg"
subprocess.check_call(CONVERT + [str(tmp_normal_png), str(in_img)])
subprocess.check_call(
["exiftool", "-overwrite_original", "-all=", str(in_img), "-n"]
)
subprocess.check_call(
[
"exiftool",
"-overwrite_original",
"-Orientation=6",
"-XResolution=96",
"-YResolution=96",
"-ResolutionUnit=2",
"-n",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "JPEG", str(identify)
assert identify[0]["image"].get("mimeType") == "image/jpeg", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("resolution") == {"x": 96, "y": 96}
assert identify[0]["image"].get("units") == "PixelsPerInch", str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("compression") == "JPEG", str(identify)
assert identify[0]["image"].get("orientation") == "RightTop", str(identify)
return in_img
@pytest.fixture(scope="session")
def jpg_cmyk_img(tmp_path_factory, tmp_normal_png):
in_img = tmp_path_factory.mktemp("jpg_cmyk") / "in.jpg"
subprocess.check_call(
CONVERT + [str(tmp_normal_png), "-colorspace", "cmyk", str(in_img)]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "JPEG", str(identify)
assert identify[0]["image"].get("mimeType") == "image/jpeg", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "CMYK", str(identify)
assert identify[0]["image"].get("type") == "ColorSeparation", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("compression") == "JPEG", str(identify)
return in_img
@pytest.fixture(scope="session")
def jpg_2000_img(tmp_path_factory, tmp_normal_png):
in_img = tmp_path_factory.mktemp("jpg_2000") / "in.jp2"
subprocess.check_call(CONVERT + [str(tmp_normal_png), str(in_img)])
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "JP2", str(identify)
assert identify[0]["image"].get("mimeType") == "image/jp2", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("compression") == "JPEG2000", str(identify)
return in_img
@pytest.fixture(scope="session")
def jpg_2000_rgba8_img(tmp_path_factory, tmp_alpha_png):
in_img = tmp_path_factory.mktemp("jpg_2000_rgba8") / "in.jp2"
subprocess.check_call(CONVERT + [str(tmp_alpha_png), "-depth", "8", str(in_img)])
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "JP2", str(identify)
assert identify[0]["image"].get("mimeType") == "image/jp2", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("compression") == "JPEG2000", str(identify)
return in_img
@pytest.fixture(scope="session")
def jpg_2000_rgba16_img(tmp_path_factory, tmp_alpha_png):
in_img = tmp_path_factory.mktemp("jpg_2000_rgba16") / "in.jp2"
subprocess.check_call(CONVERT + [str(tmp_alpha_png), str(in_img)])
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "JP2", str(identify)
assert identify[0]["image"].get("mimeType") == "image/jp2", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify)
assert identify[0]["image"].get("depth") == 16, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("compression") == "JPEG2000", str(identify)
return in_img


@pytest.fixture(scope="session")
def png_rgb8_img(tmp_normal_png):
    in_img = tmp_normal_png
    identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "PNG", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/png", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
    assert identify[0]["image"].get("type") == "TrueColor", str(identify)
    assert identify[0]["image"].get("depth") == 8, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "8"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "8"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig")
        == "2"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color_type")
        == "2 (Truecolor)"
    ), str(identify)
    assert (
        identify[0]["image"]["properties"]["png:IHDR.interlace_method"]
        == "0 (Not interlaced)"
    ), str(identify)
    return in_img
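

# Illustrative sketch (not part of the original test suite): the PNG fixtures
# repeat the same IHDR property lookups, so a small helper could collect them
# for a single comparison. The name png_ihdr_properties is hypothetical; the
# property keys are the ones already asserted on in these fixtures.
def png_ihdr_properties(image):
    """Collect the IHDR properties that the PNG fixtures assert on."""
    props = image.get("properties", {})
    keys = (
        "png:IHDR.bit-depth-orig",
        "png:IHDR.bit_depth",
        "png:IHDR.color-type-orig",
        "png:IHDR.color_type",
        "png:IHDR.interlace_method",
    )
    # missing keys map to None, so a comparison against expected values fails
    return {key: props.get(key) for key in keys}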


@pytest.fixture(scope="session")
def png_rgb16_img(tmp_normal16_png):
    in_img = tmp_normal16_png
    identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "PNG", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/png", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
    assert identify[0]["image"].get("type") == "TrueColor", str(identify)
    assert identify[0]["image"].get("depth") == 16, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig")
        == "16"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "16"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig")
        == "2"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color_type")
        == "2 (Truecolor)"
    ), str(identify)
    assert (
        identify[0]["image"]["properties"]["png:IHDR.interlace_method"]
        == "0 (Not interlaced)"
    ), str(identify)
    return in_img


@pytest.fixture(scope="session")
def png_rgba8_img(tmp_path_factory, tmp_alpha_png):
    in_img = tmp_path_factory.mktemp("png_rgba8") / "in.png"
    subprocess.check_call(
        CONVERT + [str(tmp_alpha_png), "-depth", "8", "-strip", str(in_img)]
    )
    identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "PNG", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/png", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
    assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify)
    assert identify[0]["image"].get("depth") == 8, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "8"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "8"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig")
        == "6"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color_type")
        == "6 (RGBA)"
    ), str(identify)
    assert (
        identify[0]["image"]["properties"]["png:IHDR.interlace_method"]
        == "0 (Not interlaced)"
    ), str(identify)
    return in_img


@pytest.fixture(scope="session")
def png_rgba16_img(tmp_alpha_png):
    in_img = tmp_alpha_png
    identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "PNG", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/png", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
    assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify)
    assert identify[0]["image"].get("depth") == 16, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig")
        == "16"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "16"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig")
        == "6"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color_type")
        == "6 (RGBA)"
    ), str(identify)
    assert (
        identify[0]["image"]["properties"]["png:IHDR.interlace_method"]
        == "0 (Not interlaced)"
    ), str(identify)
    return in_img


@pytest.fixture(scope="session")
def png_gray8a_img(tmp_path_factory, tmp_alpha_png):
    in_img = tmp_path_factory.mktemp("png_gray8a") / "in.png"
    subprocess.check_call(
        CONVERT
        + [
            str(tmp_alpha_png),
            "-colorspace",
            "Gray",
            "-dither",
            "FloydSteinberg",
            "-colors",
            "256",
            "-depth",
            "8",
            "-strip",
            str(in_img),
        ]
    )
    identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "PNG", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/png", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
    assert identify[0]["image"].get("type") == "GrayscaleAlpha", str(identify)
    assert identify[0]["image"].get("depth") == 8, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "8"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "8"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig")
        == "4"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color_type")
        == "4 (GrayAlpha)"
    ), str(identify)
    assert (
        identify[0]["image"]["properties"]["png:IHDR.interlace_method"]
        == "0 (Not interlaced)"
    ), str(identify)
    return in_img


@pytest.fixture(scope="session")
def png_gray16a_img(tmp_path_factory, tmp_alpha_png):
    in_img = tmp_path_factory.mktemp("png_gray16a") / "in.png"
    subprocess.check_call(
        CONVERT
        + [
            str(tmp_alpha_png),
            "-colorspace",
            "Gray",
            "-depth",
            "16",
            "-strip",
            str(in_img),
        ]
    )
    identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "PNG", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/png", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
    assert identify[0]["image"].get("type") == "GrayscaleAlpha", str(identify)
    assert identify[0]["image"].get("depth") == 16, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig")
        == "16"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "16"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig")
        == "4"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color_type")
        == "4 (GrayAlpha)"
    ), str(identify)
    assert (
        identify[0]["image"]["properties"]["png:IHDR.interlace_method"]
        == "0 (Not interlaced)"
    ), str(identify)
    return in_img


@pytest.fixture(scope="session")
def png_interlaced_img(tmp_path_factory, tmp_normal_png):
    in_img = tmp_path_factory.mktemp("png_interlaced") / "in.png"
    subprocess.check_call(
        CONVERT
        + [
            str(tmp_normal_png),
            "-interlace",
            "PNG",
            "-strip",
            str(in_img),
        ]
    )
    identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "PNG", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/png", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
    assert identify[0]["image"].get("type") == "TrueColor", str(identify)
    assert identify[0]["image"].get("depth") == 8, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "8"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "8"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig")
        == "2"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color_type")
        == "2 (Truecolor)"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.interlace_method")
        == "1 (Adam7 method)"
    ), str(identify)
    return in_img


@pytest.fixture(scope="session")
def png_gray1_img(tmp_path_factory, tmp_gray1_png):
    identify = json.loads(
        subprocess.check_output(CONVERT + [str(tmp_gray1_png), "json:"])
    )
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "PNG", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/png", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
    assert identify[0]["image"].get("type") in ["Bilevel", "Grayscale"], str(identify)
    assert identify[0]["image"].get("depth") == 1, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "1"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "1"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig")
        == "0"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color_type")
        == "0 (Grayscale)"
    ), str(identify)
    assert (
        identify[0]["image"]["properties"]["png:IHDR.interlace_method"]
        == "0 (Not interlaced)"
    ), str(identify)
    return tmp_gray1_png


@pytest.fixture(scope="session")
def png_gray2_img(tmp_path_factory, tmp_gray2_png):
    identify = json.loads(
        subprocess.check_output(CONVERT + [str(tmp_gray2_png), "json:"])
    )
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "PNG", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/png", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
    assert identify[0]["image"].get("type") == "Grayscale", str(identify)
    assert identify[0]["image"].get("depth") == 2, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "2"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "2"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig")
        == "0"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color_type")
        == "0 (Grayscale)"
    ), str(identify)
    assert (
        identify[0]["image"]["properties"]["png:IHDR.interlace_method"]
        == "0 (Not interlaced)"
    ), str(identify)
    return tmp_gray2_png


@pytest.fixture(scope="session")
def png_gray4_img(tmp_path_factory, tmp_gray4_png):
    identify = json.loads(
        subprocess.check_output(CONVERT + [str(tmp_gray4_png), "json:"])
    )
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "PNG", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/png", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
    assert identify[0]["image"].get("type") == "Grayscale", str(identify)
    assert identify[0]["image"].get("depth") == 4, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "4"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "4"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig")
        == "0"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color_type")
        == "0 (Grayscale)"
    ), str(identify)
    assert (
        identify[0]["image"]["properties"]["png:IHDR.interlace_method"]
        == "0 (Not interlaced)"
    ), str(identify)
    return tmp_gray4_png


@pytest.fixture(scope="session")
def png_gray8_img(tmp_path_factory, tmp_gray8_png):
    identify = json.loads(
        subprocess.check_output(CONVERT + [str(tmp_gray8_png), "json:"])
    )
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "PNG", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/png", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
    assert identify[0]["image"].get("type") == "Grayscale", str(identify)
    assert identify[0]["image"].get("depth") == 8, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "8"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "8"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig")
        == "0"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color_type")
        == "0 (Grayscale)"
    ), str(identify)
    assert (
        identify[0]["image"]["properties"]["png:IHDR.interlace_method"]
        == "0 (Not interlaced)"
    ), str(identify)
    return tmp_gray8_png


@pytest.fixture(scope="session")
def png_gray16_img(tmp_path_factory, tmp_gray16_png):
    identify = json.loads(
        subprocess.check_output(CONVERT + [str(tmp_gray16_png), "json:"])
    )
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "PNG", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/png", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
    assert identify[0]["image"].get("type") == "Grayscale", str(identify)
    assert identify[0]["image"].get("depth") == 16, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig")
        == "16"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "16"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig")
        == "0"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color_type")
        == "0 (Grayscale)"
    ), str(identify)
    assert (
        identify[0]["image"]["properties"]["png:IHDR.interlace_method"]
        == "0 (Not interlaced)"
    ), str(identify)
    return tmp_gray16_png


@pytest.fixture(scope="session")
def png_palette1_img(tmp_path_factory, tmp_palette1_png):
    identify = json.loads(
        subprocess.check_output(CONVERT + [str(tmp_palette1_png), "json:"])
    )
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "PNG", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/png", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
    assert identify[0]["image"].get("type") == "Palette", str(identify)
    assert identify[0]["image"].get("depth") == 8, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "1"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "1"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig")
        == "3"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color_type")
        == "3 (Indexed)"
    ), str(identify)
    assert (
        identify[0]["image"]["properties"]["png:IHDR.interlace_method"]
        == "0 (Not interlaced)"
    ), str(identify)
    return tmp_palette1_png


@pytest.fixture(scope="session")
def png_palette2_img(tmp_path_factory, tmp_palette2_png):
    identify = json.loads(
        subprocess.check_output(CONVERT + [str(tmp_palette2_png), "json:"])
    )
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "PNG", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/png", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
    assert identify[0]["image"].get("type") == "Palette", str(identify)
    assert identify[0]["image"].get("depth") == 8, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "2"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "2"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig")
        == "3"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color_type")
        == "3 (Indexed)"
    ), str(identify)
    assert (
        identify[0]["image"]["properties"]["png:IHDR.interlace_method"]
        == "0 (Not interlaced)"
    ), str(identify)
    return tmp_palette2_png


@pytest.fixture(scope="session")
def png_palette4_img(tmp_path_factory, tmp_palette4_png):
    identify = json.loads(
        subprocess.check_output(CONVERT + [str(tmp_palette4_png), "json:"])
    )
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "PNG", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/png", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
    assert identify[0]["image"].get("type") == "Palette", str(identify)
    assert identify[0]["image"].get("depth") == 8, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "4"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "4"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig")
        == "3"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color_type")
        == "3 (Indexed)"
    ), str(identify)
    assert (
        identify[0]["image"]["properties"]["png:IHDR.interlace_method"]
        == "0 (Not interlaced)"
    ), str(identify)
    return tmp_palette4_png


@pytest.fixture(scope="session")
def png_palette8_img(tmp_path_factory, tmp_palette8_png):
    identify = json.loads(
        subprocess.check_output(CONVERT + [str(tmp_palette8_png), "json:"])
    )
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "PNG", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/png", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
    assert identify[0]["image"].get("type") == "Palette", str(identify)
    assert identify[0]["image"].get("depth") == 8, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "8"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "8"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig")
        == "3"
    ), str(identify)
    assert (
        identify[0]["image"].get("properties", {}).get("png:IHDR.color_type")
        == "3 (Indexed)"
    ), str(identify)
    assert (
        identify[0]["image"]["properties"]["png:IHDR.interlace_method"]
        == "0 (Not interlaced)"
    ), str(identify)
    return tmp_palette8_png


@pytest.fixture(scope="session")
def gif_transparent_img(tmp_path_factory, tmp_alpha_png):
    in_img = tmp_path_factory.mktemp("gif_transparent_img") / "in.gif"
    subprocess.check_call(CONVERT + [str(tmp_alpha_png), str(in_img)])
    identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "GIF", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/gif", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
    assert identify[0]["image"].get("type") == "PaletteAlpha", str(identify)
    assert identify[0]["image"].get("depth") == 8, str(identify)
    assert identify[0]["image"].get("colormapEntries") == 256, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("compression") == "LZW", str(identify)
    return in_img


@pytest.fixture(scope="session")
def gif_palette1_img(tmp_path_factory, tmp_palette1_png):
    in_img = tmp_path_factory.mktemp("gif_palette1_img") / "in.gif"
    subprocess.check_call(CONVERT + [str(tmp_palette1_png), str(in_img)])
    identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "GIF", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/gif", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
    assert identify[0]["image"].get("type") == "Palette", str(identify)
    assert identify[0]["image"].get("depth") == 8, str(identify)
    assert identify[0]["image"].get("colormapEntries") == 2, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("compression") == "LZW", str(identify)
    return in_img


@pytest.fixture(scope="session")
def gif_palette2_img(tmp_path_factory, tmp_palette2_png):
    in_img = tmp_path_factory.mktemp("gif_palette2_img") / "in.gif"
    subprocess.check_call(CONVERT + [str(tmp_palette2_png), str(in_img)])
    identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "GIF", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/gif", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
    assert identify[0]["image"].get("type") == "Palette", str(identify)
    assert identify[0]["image"].get("depth") == 8, str(identify)
    assert identify[0]["image"].get("colormapEntries") == 4, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("compression") == "LZW", str(identify)
    return in_img


@pytest.fixture(scope="session")
def gif_palette4_img(tmp_path_factory, tmp_palette4_png):
    in_img = tmp_path_factory.mktemp("gif_palette4_img") / "in.gif"
    subprocess.check_call(CONVERT + [str(tmp_palette4_png), str(in_img)])
    identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "GIF", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/gif", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
    assert identify[0]["image"].get("type") == "Palette", str(identify)
    assert identify[0]["image"].get("depth") == 8, str(identify)
    assert identify[0]["image"].get("colormapEntries") == 16, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("compression") == "LZW", str(identify)
    return in_img


@pytest.fixture(scope="session")
def gif_palette8_img(tmp_path_factory, tmp_palette8_png):
    in_img = tmp_path_factory.mktemp("gif_palette8_img") / "in.gif"
    subprocess.check_call(CONVERT + [str(tmp_palette8_png), str(in_img)])
    identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "GIF", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/gif", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
    assert identify[0]["image"].get("type") == "Palette", str(identify)
    assert identify[0]["image"].get("depth") == 8, str(identify)
    assert identify[0]["image"].get("colormapEntries") == 256, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("compression") == "LZW", str(identify)
    return in_img


@pytest.fixture(scope="session")
def gif_animation_img(tmp_path_factory, tmp_normal_png, tmp_inverse_png):
    in_img = tmp_path_factory.mktemp("gif_animation_img") / "in.gif"
    pal_img = tmp_path_factory.mktemp("gif_animation_img") / "pal.gif"
    tmp_img = tmp_path_factory.mktemp("gif_animation_img") / "tmp.gif"
    subprocess.check_call(
        CONVERT
        + [
            str(tmp_normal_png),
            str(tmp_inverse_png),
            str(tmp_img),
        ]
    )
    # create palette image with all unique colors
    subprocess.check_call(
        CONVERT
        + [
            str(tmp_img),
            "-unique-colors",
            str(pal_img),
        ]
    )
    # make sure all frames have the same palette by using -remap
    subprocess.check_call(
        CONVERT + [str(tmp_img), "-strip", "-remap", str(pal_img), str(in_img)]
    )
    pal_img.unlink()
    tmp_img.unlink()
    identify = json.loads(
        subprocess.check_output(CONVERT + [str(in_img) + "[0]", "json:"])
    )
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "GIF", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/gif", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
    assert identify[0]["image"].get("type") == "Palette", str(identify)
    assert identify[0]["image"].get("depth") == 8, str(identify)
    assert identify[0]["image"].get("colormapEntries") == 256, str(identify)
    assert identify[0]["image"].get("pageGeometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("compression") == "LZW", str(identify)
    colormap_frame0 = identify[0]["image"].get("colormap")
    identify = json.loads(
        subprocess.check_output(CONVERT + [str(in_img) + "[1]", "json:"])
    )
    assert len(identify) == 1
    # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
    # put into an array, here we cater for the older version containing just
    # the bare dictionary
    if "image" in identify:
        identify = [identify]
    assert "image" in identify[0]
    assert identify[0]["image"].get("format") == "GIF", str(identify)
    assert identify[0]["image"].get("mimeType") == "image/gif", str(identify)
    assert identify[0]["image"].get("geometry") == {
        "width": 60,
        "height": 60,
        "x": 0,
        "y": 0,
    }, str(identify)
    assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "Palette", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("colormapEntries") == 256, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("compression") == "LZW", str(identify)
assert identify[0]["image"].get("scene") == 1, str(identify)
colormap_frame1 = identify[0]["image"].get("colormap")
assert colormap_frame0 == colormap_frame1
return in_img
@pytest.fixture(scope="session")
def tiff_float_img(tmp_path_factory, tmp_normal_png):
in_img = tmp_path_factory.mktemp("tiff_float_img") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_normal_png),
"-depth",
"32",
"-define",
"quantum:format=floating-point",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("baseDepth") == 32, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert (
identify[0]["image"].get("properties", {}).get("quantum:format")
== "floating-point"
), str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_cmyk8_img(tmp_path_factory, tmp_normal_png):
in_img = tmp_path_factory.mktemp("tiff_cmyk8") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_normal_png),
"-colorspace",
"cmyk",
"-compress",
"Zip",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "CMYK", str(identify)
assert identify[0]["image"].get("type") == "ColorSeparation", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "separated"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_cmyk16_img(tmp_path_factory, tmp_normal_png):
in_img = tmp_path_factory.mktemp("tiff_cmyk16") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_normal_png),
"-depth",
"16",
"-colorspace",
"cmyk",
"-compress",
"Zip",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "CMYK", str(identify)
assert identify[0]["image"].get("type") == "ColorSeparation", str(identify)
assert identify[0]["image"].get("depth") == 16, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "separated"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_rgb8_img(tmp_path_factory, tmp_normal_png):
in_img = tmp_path_factory.mktemp("tiff_rgb8") / "in.tiff"
subprocess.check_call(
CONVERT + [str(tmp_normal_png), "-compress", "Zip", str(in_img)]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_rgb12_img(tmp_path_factory, tmp_normal16_png):
in_img = tmp_path_factory.mktemp("tiff_rgb12") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_normal16_png),
"-depth",
"12",
"-compress",
"Zip",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify)
assert identify[0]["image"].get("baseDepth") == 12, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_rgb14_img(tmp_path_factory, tmp_normal16_png):
in_img = tmp_path_factory.mktemp("tiff_rgb14") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_normal16_png),
"-depth",
"14",
"-compress",
"Zip",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify)
assert identify[0]["image"].get("baseDepth") == 14, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_rgb16_img(tmp_path_factory, tmp_normal16_png):
in_img = tmp_path_factory.mktemp("tiff_rgb16") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_normal16_png),
"-depth",
"16",
"-compress",
"Zip",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify)
assert identify[0]["image"].get("depth") == 16, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_rgba8_img(tmp_path_factory, tmp_alpha_png):
in_img = tmp_path_factory.mktemp("tiff_rgba8") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_alpha_png),
"-depth",
"8",
"-strip",
"-compress",
"Zip",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unassociated"
), str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_rgba16_img(tmp_path_factory, tmp_alpha_png):
in_img = tmp_path_factory.mktemp("tiff_rgba16") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_alpha_png),
"-depth",
"16",
"-strip",
"-compress",
"Zip",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify)
assert identify[0]["image"].get("depth") == 16, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unassociated"
), str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_gray1_img(tmp_path_factory, tmp_gray1_png):
in_img = tmp_path_factory.mktemp("tiff_gray1") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_gray1_png),
"-depth",
"1",
"-compress",
"Zip",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
assert identify[0]["image"].get("type") == "Bilevel", str(identify)
assert identify[0]["image"].get("depth") == 1, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "min-is-black"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_gray2_img(tmp_path_factory, tmp_gray2_png):
in_img = tmp_path_factory.mktemp("tiff_gray2") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_gray2_png),
"-depth",
"2",
"-compress",
"Zip",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
assert identify[0]["image"].get("type") == "Grayscale", str(identify)
assert identify[0]["image"].get("depth") == 2, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "min-is-black"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_gray4_img(tmp_path_factory, tmp_gray4_png):
in_img = tmp_path_factory.mktemp("tiff_gray4") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_gray4_png),
"-depth",
"4",
"-compress",
"Zip",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
assert identify[0]["image"].get("type") == "Grayscale", str(identify)
assert identify[0]["image"].get("depth") == 4, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "min-is-black"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_gray8_img(tmp_path_factory, tmp_gray8_png):
in_img = tmp_path_factory.mktemp("tiff_gray8") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_gray8_png),
"-depth",
"8",
"-compress",
"Zip",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
assert identify[0]["image"].get("type") == "Grayscale", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "min-is-black"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_gray16_img(tmp_path_factory, tmp_gray16_png):
in_img = tmp_path_factory.mktemp("tiff_gray16") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_gray16_png),
"-depth",
"16",
"-compress",
"Zip",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
assert identify[0]["image"].get("type") == "Grayscale", str(identify)
assert identify[0]["image"].get("depth") == 16, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "min-is-black"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_multipage_img(tmp_path_factory, tmp_normal_png, tmp_inverse_png):
in_img = tmp_path_factory.mktemp("tiff_multipage_img") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_normal_png),
str(tmp_inverse_png),
"-strip",
"-compress",
"Zip",
str(in_img),
]
)
identify = json.loads(
subprocess.check_output(CONVERT + [str(in_img) + "[0]", "json:"])
)
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify)
identify = json.loads(
subprocess.check_output(CONVERT + [str(in_img) + "[1]", "json:"])
)
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB"
), str(identify)
assert identify[0]["image"].get("scene") == 1, str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_palette1_img(tmp_path_factory, tmp_palette1_png):
in_img = tmp_path_factory.mktemp("tiff_palette1_img") / "in.tiff"
subprocess.check_call(
CONVERT + [str(tmp_palette1_png), "-compress", "Zip", str(in_img)]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "Palette", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("baseDepth") == 1, str(identify)
assert identify[0]["image"].get("colormapEntries") == 2, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "palette"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_palette2_img(tmp_path_factory, tmp_palette2_png):
in_img = tmp_path_factory.mktemp("tiff_palette2_img") / "in.tiff"
subprocess.check_call(
CONVERT + [str(tmp_palette2_png), "-compress", "Zip", str(in_img)]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "Palette", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("baseDepth") == 2, str(identify)
assert identify[0]["image"].get("colormapEntries") == 4, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "palette"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_palette4_img(tmp_path_factory, tmp_palette4_png):
in_img = tmp_path_factory.mktemp("tiff_palette4_img") / "in.tiff"
subprocess.check_call(
CONVERT + [str(tmp_palette4_png), "-compress", "Zip", str(in_img)]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "Palette", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("baseDepth") == 4, str(identify)
assert identify[0]["image"].get("colormapEntries") == 16, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "palette"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_palette8_img(tmp_path_factory, tmp_palette8_png):
in_img = tmp_path_factory.mktemp("tiff_palette8_img") / "in.tiff"
subprocess.check_call(
CONVERT + [str(tmp_palette8_png), "-compress", "Zip", str(in_img)]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "Palette", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("colormapEntries") == 256, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric") == "palette"
), str(identify)
return in_img
@pytest.fixture(scope="session")
def tiff_ccitt_lsb_m2l_white_img(tmp_path_factory, tmp_gray1_png):
in_img = tmp_path_factory.mktemp("tiff_ccitt_lsb_m2l_white_img") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_gray1_png),
"-compress",
"group4",
"-define",
"tiff:endian=lsb",
"-define",
"tiff:fill-order=msb",
"-define",
"quantum:polarity=min-is-white",
"-compress",
"Group4",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
assert identify[0]["image"].get("type") == "Bilevel", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in [
"Undefined",
"LSB",
], str(identify)
assert identify[0]["image"].get("depth") == 1, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("compression") == "Group4", str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "min-is-white"
), str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:rows-per-strip") == "60"
), str(identify)
tiffinfo = subprocess.check_output(["tiffinfo", str(in_img)])
expected = [
r"^ Image Width: 60 Image Length: 60",
r"^ Bits/Sample: 1",
r"^ Compression Scheme: CCITT Group 4",
r"^ Photometric Interpretation: min-is-white",
r"^ FillOrder: msb-to-lsb",
r"^ Samples/Pixel: 1",
r"^ Rows/Strip: 60",
]
for e in expected:
    assert re.search(e, tiffinfo.decode("utf8"), re.MULTILINE), tiffinfo.decode(
        "utf8"
    )
return in_img
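# A minimal sketch, not part of the original test suite: the CCITT fixtures
# all search the tiffinfo output for a list of anchored regular expressions.
# The helper name missing_tiffinfo_lines is hypothetical. It returns the
# patterns that did not match, so one assertion can report every missing line
# at once.
import re


def missing_tiffinfo_lines(expected, tiffinfo_output):
    """Return the patterns from `expected` that match no line of the output."""
    text = tiffinfo_output.decode("utf8")
    # re.MULTILINE lets "^" match at the start of every line, as in the fixtures
    return [e for e in expected if not re.search(e, text, re.MULTILINE)]


# Possible usage inside a fixture:
#   missing = missing_tiffinfo_lines(expected, tiffinfo)
#   assert not missing, tiffinfo.decode("utf8")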
@pytest.fixture(scope="session")
def tiff_ccitt_msb_m2l_white_img(tmp_path_factory, tmp_gray1_png):
in_img = tmp_path_factory.mktemp("tiff_ccitt_msb_m2l_white_img") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_gray1_png),
"-compress",
"group4",
"-define",
"tiff:endian=msb",
"-define",
"tiff:fill-order=msb",
"-define",
"quantum:polarity=min-is-white",
"-compress",
"Group4",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
assert identify[0]["image"].get("type") == "Bilevel", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in [
    "Undefined",
    "MSB",
], str(identify)  # FIXME: should be MSB
assert identify[0]["image"].get("depth") == 1, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("compression") == "Group4", str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "msb", str(
identify
)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "min-is-white"
), str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:rows-per-strip") == "60"
), str(identify)
tiffinfo = subprocess.check_output(["tiffinfo", str(in_img)])
expected = [
r"^ Image Width: 60 Image Length: 60",
r"^ Bits/Sample: 1",
r"^ Compression Scheme: CCITT Group 4",
r"^ Photometric Interpretation: min-is-white",
r"^ FillOrder: msb-to-lsb",
r"^ Samples/Pixel: 1",
r"^ Rows/Strip: 60",
]
for e in expected:
    assert re.search(e, tiffinfo.decode("utf8"), re.MULTILINE), tiffinfo.decode(
        "utf8"
    )
return in_img
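# A minimal sketch, not part of the original test suite: the fixtures pick the
# name of the endianness key from the "version" field of the identify output.
# The helper name endian_key is hypothetical. Like the fixtures, it compares
# the version strings lexicographically and treats a missing version as "0".
def endian_key(identify_entry):
    """Return the key under which this identify entry reports endianness."""
    if identify_entry.get("version", "0") < "1.0":
        # spelling used by JSON output older than version 1.0
        return "endianess"
    return "endianness"


# Possible usage inside a fixture:
#   assert identify[0]["image"].get(endian_key(identify[0])) in ["Undefined", "LSB"]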
@pytest.fixture(scope="session")
def tiff_ccitt_msb_l2m_white_img(tmp_path_factory, tmp_gray1_png):
in_img = tmp_path_factory.mktemp("tiff_ccitt_msb_l2m_white_img") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_gray1_png),
"-compress",
"group4",
"-define",
"tiff:endian=msb",
"-define",
"tiff:fill-order=lsb",
"-define",
"quantum:polarity=min-is-white",
"-compress",
"Group4",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
assert identify[0]["image"].get("type") == "Bilevel", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in [
    "Undefined",
    "MSB",
], str(identify)  # FIXME: should be MSB
assert identify[0]["image"].get("depth") == 1, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("compression") == "Group4", str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "msb", str(
identify
)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "min-is-white"
), str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:rows-per-strip") == "60"
), str(identify)
tiffinfo = subprocess.check_output(["tiffinfo", str(in_img)])
expected = [
r"^ Image Width: 60 Image Length: 60",
r"^ Bits/Sample: 1",
r"^ Compression Scheme: CCITT Group 4",
r"^ Photometric Interpretation: min-is-white",
r"^ FillOrder: lsb-to-msb",
r"^ Samples/Pixel: 1",
r"^ Rows/Strip: 60",
]
for e in expected:
    assert re.search(e, tiffinfo.decode("utf8"), re.MULTILINE), tiffinfo.decode(
        "utf8"
    )
return in_img
@pytest.fixture(scope="session")
def tiff_ccitt_lsb_m2l_black_img(tmp_path_factory, tmp_gray1_png):
in_img = tmp_path_factory.mktemp("tiff_ccitt_lsb_m2l_black_img") / "in.tiff"
# "-define quantum:polarity=min-is-black" requires ImageMagick with:
# https://github.com/ImageMagick/ImageMagick/commit/00730551f0a34328685c59d0dde87dd9e366103a
# or at least 7.0.8-11 from Aug 29, 2018
# or at least 6.9.10-12 from Sep 7, 2018 (for the ImageMagick6 branch)
# also see: https://www.imagemagick.org/discourse-server/viewtopic.php?f=1&t=34605
subprocess.check_call(
CONVERT
+ [
str(tmp_gray1_png),
"-compress",
"group4",
"-define",
"tiff:endian=lsb",
"-define",
"tiff:fill-order=msb",
"-define",
"quantum:polarity=min-is-black",
"-compress",
"Group4",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
assert identify[0]["image"].get("type") == "Bilevel", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in [
"Undefined",
"LSB",
], str(identify)
assert identify[0]["image"].get("depth") == 1, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("compression") == "Group4", str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "min-is-black"
), str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:rows-per-strip") == "60"
), str(identify)
tiffinfo = subprocess.check_output(["tiffinfo", str(in_img)])
expected = [
r"^ Image Width: 60 Image Length: 60",
r"^ Bits/Sample: 1",
r"^ Compression Scheme: CCITT Group 4",
r"^ Photometric Interpretation: min-is-black",
r"^ FillOrder: msb-to-lsb",
r"^ Samples/Pixel: 1",
r"^ Rows/Strip: 60",
]
for e in expected:
    assert re.search(e, tiffinfo.decode("utf8"), re.MULTILINE), tiffinfo.decode(
        "utf8"
    )
return in_img
@pytest.fixture(scope="session")
def tiff_ccitt_nometa1_img(tmp_path_factory, tmp_gray1_png):
in_img = tmp_path_factory.mktemp("tiff_ccitt_nometa1_img") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_gray1_png),
"-compress",
"group4",
"-define",
"tiff:endian=lsb",
"-define",
"tiff:fill-order=msb",
"-define",
"quantum:polarity=min-is-white",
"-compress",
"Group4",
str(in_img),
]
)
subprocess.check_call(
["tiffset", "-u", "258", str(in_img)]
) # remove BitsPerSample (258)
subprocess.check_call(
["tiffset", "-u", "266", str(in_img)]
) # remove FillOrder (266)
subprocess.check_call(
["tiffset", "-u", "277", str(in_img)]
) # remove SamplesPerPixel (277)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
assert identify[0]["image"].get("type") == "Bilevel", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in [
"Undefined",
"LSB",
], str(identify)
assert identify[0]["image"].get("depth") == 1, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("compression") == "Group4", str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "min-is-white"
), str(identify)
assert (
identify[0]["image"].get("properties", {}).get("tiff:rows-per-strip") == "60"
), str(identify)
tiffinfo = subprocess.check_output(["tiffinfo", str(in_img)])
expected = [
r"^ Image Width: 60 Image Length: 60",
r"^ Compression Scheme: CCITT Group 4",
r"^ Photometric Interpretation: min-is-white",
r"^ Rows/Strip: 60",
]
for e in expected:
    assert re.search(e, tiffinfo.decode("utf8"), re.MULTILINE), tiffinfo.decode(
        "utf8"
    )
unexpected = [" Bits/Sample: ", " FillOrder: ", " Samples/Pixel: "]
for e in unexpected:
assert e not in tiffinfo.decode("utf8")
return in_img
@pytest.fixture(scope="session")
def tiff_ccitt_nometa2_img(tmp_path_factory, tmp_gray1_png):
in_img = tmp_path_factory.mktemp("tiff_ccitt_nometa2_img") / "in.tiff"
subprocess.check_call(
CONVERT
+ [
str(tmp_gray1_png),
"-compress",
"group4",
"-define",
"tiff:endian=lsb",
"-define",
"tiff:fill-order=msb",
"-define",
"quantum:polarity=min-is-white",
"-compress",
"Group4",
str(in_img),
]
)
subprocess.check_call(
["tiffset", "-u", "278", str(in_img)]
) # remove RowsPerStrip (278)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "TIFF", str(identify)
assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("units") == "PixelsPerInch", str(identify)
assert identify[0]["image"].get("type") == "Bilevel", str(identify)
endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness"
assert identify[0]["image"].get(endian) in [
"Undefined",
"LSB",
], str(identify)
assert identify[0]["image"].get("colorspace") == "Gray", str(identify)
assert identify[0]["image"].get("depth") == 1, str(identify)
assert identify[0]["image"].get("compression") == "Group4", str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [
"unspecified",
None,
], str(identify)
assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str(
identify
)
assert (
identify[0]["image"].get("properties", {}).get("tiff:photometric")
== "min-is-white"
), str(identify)
assert "tiff:rows-per-strip" not in identify[0]["image"]["properties"]
tiffinfo = subprocess.check_output(["tiffinfo", str(in_img)])
expected = [
r"^ Image Width: 60 Image Length: 60",
r"^ Bits/Sample: 1",
r"^ Compression Scheme: CCITT Group 4",
r"^ Photometric Interpretation: min-is-white",
r"^ FillOrder: msb-to-lsb",
r"^ Samples/Pixel: 1",
]
for e in expected:
    assert re.search(e, tiffinfo.decode("utf8"), re.MULTILINE), tiffinfo.decode(
        "utf8"
    )
unexpected = [" Rows/Strip: "]
for e in unexpected:
assert e not in tiffinfo.decode("utf8")
return in_img
@pytest.fixture(scope="session")
def miff_cmyk8_img(tmp_path_factory, tmp_normal_png):
in_img = tmp_path_factory.mktemp("miff_cmyk8") / "in.miff"
subprocess.check_call(
CONVERT
+ [
str(tmp_normal_png),
"-colorspace",
"cmyk",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "MIFF", str(identify)
assert identify[0]["image"].get("class") == "DirectClass", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "CMYK", str(identify)
assert identify[0]["image"].get("type") == "ColorSeparation", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
return in_img
@pytest.fixture(scope="session")
def miff_cmyk16_img(tmp_path_factory, tmp_normal_png):
in_img = tmp_path_factory.mktemp("miff_cmyk16") / "in.miff"
subprocess.check_call(
CONVERT
+ [
str(tmp_normal_png),
"-depth",
"16",
"-colorspace",
"cmyk",
str(in_img),
]
)
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "MIFF", str(identify)
assert identify[0]["image"].get("class") == "DirectClass", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "CMYK", str(identify)
assert identify[0]["image"].get("type") == "ColorSeparation", str(identify)
assert identify[0]["image"].get("depth") == 16, str(identify)
assert identify[0]["image"].get("baseDepth") == 16, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
return in_img
@pytest.fixture(scope="session")
def miff_rgb8_img(tmp_path_factory, tmp_normal_png):
in_img = tmp_path_factory.mktemp("miff_rgb8") / "in.miff"
subprocess.check_call(CONVERT + [str(tmp_normal_png), str(in_img)])
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "MIFF", str(identify)
assert identify[0]["image"].get("class") == "DirectClass", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
return in_img
@pytest.fixture(scope="session")
def png_icc_img(tmp_icc_png):
in_img = tmp_icc_png
identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"]))
assert len(identify) == 1
# somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was
# put into an array, here we cater for the older version containing just
# the bare dictionary
if "image" in identify:
identify = [identify]
assert "image" in identify[0]
assert identify[0]["image"].get("format") == "PNG", str(identify)
assert identify[0]["image"].get("mimeType") == "image/png", str(identify)
assert identify[0]["image"].get("geometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert identify[0]["image"].get("colorspace") == "sRGB", str(identify)
assert identify[0]["image"].get("type") == "TrueColor", str(identify)
assert identify[0]["image"].get("depth") == 8, str(identify)
assert identify[0]["image"].get("pageGeometry") == {
"width": 60,
"height": 60,
"x": 0,
"y": 0,
}, str(identify)
assert (
identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "8"
), str(identify)
assert (
identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "8"
), str(identify)
assert (
identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig")
== "2"
), str(identify)
assert (
identify[0]["image"].get("properties", {}).get("png:IHDR.color_type")
== "2 (Truecolor)"
), str(identify)
assert (
identify[0]["image"]["properties"]["png:IHDR.interlace_method"]
== "0 (Not interlaced)"
), str(identify)
return in_img
###############################################################################
# OUTPUT FIXTURES #
###############################################################################
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def jpg_pdf(tmp_path_factory, jpg_img, request):
out_pdf = tmp_path_factory.mktemp("jpg_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(jpg_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.Filter == "/DCTDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
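# A minimal sketch, not part of the original test suite: every output fixture
# invokes img2pdf with the same options and varies only the engine, the output
# path and the input image. The helper name img2pdf_args is hypothetical. The
# options are the ones the fixtures in this file already pass.
def img2pdf_args(prog, engine, out_pdf, *inputs):
    """Build the argument list for one img2pdf invocation."""
    return [
        prog,
        "--producer=",  # empty producer, as passed by the fixtures
        "--nodate",  # as passed by the fixtures
        "--engine=" + engine,
        "--output=" + str(out_pdf),
    ] + [str(i) for i in inputs]


# Possible usage inside a fixture:
#   subprocess.check_call(img2pdf_args(img2pdfprog, request.param, out_pdf, jpg_img))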
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def jpg_rot_pdf(tmp_path_factory, jpg_rot_img, request):
out_pdf = tmp_path_factory.mktemp("jpg_rot_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(jpg_rot_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.Filter == "/DCTDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
assert p.pages[0].Rotate == 90
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def jpg_cmyk_pdf(tmp_path_factory, jpg_cmyk_img, request):
out_pdf = tmp_path_factory.mktemp("jpg_cmyk_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(jpg_cmyk_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceCMYK"
assert p.pages[0].Resources.XObject.Im0.Decode == pikepdf.Array(
[1, 0, 1, 0, 1, 0, 1, 0]
)
assert p.pages[0].Resources.XObject.Im0.Filter == "/DCTDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def jpg_2000_pdf(tmp_path_factory, jpg_2000_img, request):
out_pdf = tmp_path_factory.mktemp("jpg_2000_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(jpg_2000_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.Filter == "/JPXDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
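# A minimal sketch, not part of the original test suite: the output fixtures
# all expect the same page content stream, which scales the 60x60 pixel image
# to 45x45 PDF points. The helper names px_to_pt and expected_content_stream
# are hypothetical. The default of 96 dpi is an assumption inferred from
# 60 pixels mapping to 45 points (60 * 72 / 96 == 45).
def px_to_pt(pixels, dpi=96.0):
    """Convert a pixel length to PDF points (1 pt = 1/72 inch)."""
    return pixels * 72.0 / dpi


def expected_content_stream(width_px, height_px, dpi=96.0):
    """Return the content stream for one image placed at the page origin."""
    return b"q\n%0.4f 0 0 %0.4f 0.0000 0.0000 cm\n/Im0 Do\nQ" % (
        px_to_pt(width_px, dpi),
        px_to_pt(height_px, dpi),
    )


# Possible usage inside a fixture:
#   assert p.pages[0].Contents.read_bytes() == expected_content_stream(60, 60)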
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def jpg_2000_rgba8_pdf(tmp_path_factory, jpg_2000_rgba8_img, request):
out_pdf = tmp_path_factory.mktemp("jpg_2000_rgba8_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(jpg_2000_rgba8_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert not hasattr(p.pages[0].Resources.XObject.Im0, "ColorSpace")
assert p.pages[0].Resources.XObject.Im0.Filter == "/JPXDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def jpg_2000_rgba16_pdf(tmp_path_factory, jpg_2000_rgba16_img, request):
out_pdf = tmp_path_factory.mktemp("jpg_2000_rgba16_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(jpg_2000_rgba16_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 16
assert not hasattr(p.pages[0].Resources.XObject.Im0, "ColorSpace")
assert p.pages[0].Resources.XObject.Im0.Filter == "/JPXDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def png_rgb8_pdf(tmp_path_factory, png_rgb8_img, request):
out_pdf = tmp_path_factory.mktemp("png_rgb8_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(png_rgb8_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 3
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def png_rgba8_pdf(tmp_path_factory, png_rgba8_img, request):
out_pdf = tmp_path_factory.mktemp("png_rgba8_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(png_rgba8_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 3
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
assert p.pages[0].Resources.XObject.Im0.SMask is not None
assert p.pages[0].Resources.XObject.Im0.SMask.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.SMask.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.SMask.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.SMask.Height == 60
assert p.pages[0].Resources.XObject.Im0.SMask.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def gif_transparent_pdf(tmp_path_factory, gif_transparent_img, request):
out_pdf = tmp_path_factory.mktemp("gif_transparent_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(gif_transparent_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed"
assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
assert p.pages[0].Resources.XObject.Im0.SMask is not None
assert p.pages[0].Resources.XObject.Im0.SMask.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.SMask.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.SMask.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.SMask.Height == 60
assert p.pages[0].Resources.XObject.Im0.SMask.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def png_rgb16_pdf(tmp_path_factory, png_rgb16_img, request):
out_pdf = tmp_path_factory.mktemp("png_rgb16_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(png_rgb16_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 16
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 16
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 3
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def png_interlaced_pdf(tmp_path_factory, png_interlaced_img, request):
out_pdf = tmp_path_factory.mktemp("png_interlaced_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(png_interlaced_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 3
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def png_gray1_pdf(tmp_path_factory, tmp_gray1_png, request):
out_pdf = tmp_path_factory.mktemp("png_gray1_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tmp_gray1_png),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def png_gray2_pdf(tmp_path_factory, tmp_gray2_png, request):
out_pdf = tmp_path_factory.mktemp("png_gray2_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tmp_gray2_png),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 2
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 2
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def png_gray4_pdf(tmp_path_factory, tmp_gray4_png, request):
out_pdf = tmp_path_factory.mktemp("png_gray4_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tmp_gray4_png),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 4
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 4
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def png_gray8_pdf(tmp_path_factory, tmp_gray8_png, request):
out_pdf = tmp_path_factory.mktemp("png_gray8_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tmp_gray8_png),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def png_gray8a_pdf(tmp_path_factory, png_gray8a_img, request):
out_pdf = tmp_path_factory.mktemp("png_gray8a_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(png_gray8a_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
# the 8-bit alpha channel is expected as a separate soft mask (/SMask) image
assert p.pages[0].Resources.XObject.Im0.SMask is not None
assert p.pages[0].Resources.XObject.Im0.SMask.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.SMask.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.SMask.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.SMask.Height == 60
assert p.pages[0].Resources.XObject.Im0.SMask.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def png_gray16_pdf(tmp_path_factory, tmp_gray16_png, request):
out_pdf = tmp_path_factory.mktemp("png_gray16_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tmp_gray16_png),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 16
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 16
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def png_palette1_pdf(tmp_path_factory, tmp_palette1_png, request):
out_pdf = tmp_path_factory.mktemp("png_palette1_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tmp_palette1_png),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1
assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed"
assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def png_palette2_pdf(tmp_path_factory, tmp_palette2_png, request):
out_pdf = tmp_path_factory.mktemp("png_palette2_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tmp_palette2_png),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 2
assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed"
assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 2
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def png_palette4_pdf(tmp_path_factory, tmp_palette4_png, request):
out_pdf = tmp_path_factory.mktemp("png_palette4_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tmp_palette4_png),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 4
assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed"
assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 4
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def png_palette8_pdf(tmp_path_factory, tmp_palette8_png, request):
out_pdf = tmp_path_factory.mktemp("png_palette8_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tmp_palette8_png),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed"
assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def png_icc_pdf(tmp_path_factory, tmp_icc_png, tmp_icc_profile, request):
out_pdf = tmp_path_factory.mktemp("png_icc_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tmp_icc_png),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/ICCBased"
assert p.pages[0].Resources.XObject.Im0.ColorSpace[1].N == 3
assert p.pages[0].Resources.XObject.Im0.ColorSpace[1].Alternate == "/DeviceRGB"
assert (
p.pages[0].Resources.XObject.Im0.ColorSpace[1].read_bytes()
== tmp_icc_profile.read_bytes()
)
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 3
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def gif_palette1_pdf(tmp_path_factory, gif_palette1_img, request):
out_pdf = tmp_path_factory.mktemp("gif_palette1_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(gif_palette1_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1
assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed"
assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def gif_palette2_pdf(tmp_path_factory, gif_palette2_img, request):
out_pdf = tmp_path_factory.mktemp("gif_palette2_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(gif_palette2_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 2
assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed"
assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 2
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def gif_palette4_pdf(tmp_path_factory, gif_palette4_img, request):
out_pdf = tmp_path_factory.mktemp("gif_palette4_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(gif_palette4_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 4
assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed"
assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 4
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def gif_palette8_pdf(tmp_path_factory, gif_palette8_img, request):
out_pdf = tmp_path_factory.mktemp("gif_palette8_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(gif_palette8_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed"
assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def gif_animation_pdf(tmp_path_factory, gif_animation_img, request):
tmpdir = tmp_path_factory.mktemp("gif_animation_pdf")
out_pdf = tmpdir / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(gif_animation_img),
]
)
pdfinfo = subprocess.check_output(["pdfinfo", str(out_pdf)])
assert re.search(
"^Pages: +2$", pdfinfo.decode("utf8"), re.MULTILINE
), pdfinfo.decode("utf8")
# split the PDF into single-page files so that each page can be checked on its own
subprocess.check_call(["pdfseparate", str(out_pdf), str(tmpdir / "page-%d.pdf")])
for page in [1, 2]:
gif_animation_pdf_nr = tmpdir / ("page-%d.pdf" % page)
with pikepdf.open(gif_animation_pdf_nr) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed"
assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
gif_animation_pdf_nr.unlink()
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def tiff_cmyk8_pdf(tmp_path_factory, tiff_cmyk8_img, request):
out_pdf = tmp_path_factory.mktemp("tiff_cmyk8_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tiff_cmyk8_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceCMYK"
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def tiff_rgb8_pdf(tmp_path_factory, tiff_rgb8_img, request):
out_pdf = tmp_path_factory.mktemp("tiff_rgb8_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tiff_rgb8_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 3
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def tiff_gray1_pdf(tmp_path_factory, tiff_gray1_img, request):
out_pdf = tmp_path_factory.mktemp("tiff_gray1_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tiff_gray1_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].BlackIs1 == True
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Columns == 60
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].K == -1
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Rows == 60
assert p.pages[0].Resources.XObject.Im0.Filter[0] == "/CCITTFaxDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def tiff_gray2_pdf(tmp_path_factory, tiff_gray2_img, request):
out_pdf = tmp_path_factory.mktemp("tiff_gray2_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tiff_gray2_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
# unlike the 2-bit PNG case, the 2-bit TIFF input is expected with 8 bits per component
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def tiff_gray4_pdf(tmp_path_factory, tiff_gray4_img, request):
out_pdf = tmp_path_factory.mktemp("tiff_gray4_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tiff_gray4_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
# unlike the 4-bit PNG case, the 4-bit TIFF input is expected with 8 bits per component
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def tiff_gray8_pdf(tmp_path_factory, tiff_gray8_img, request):
out_pdf = tmp_path_factory.mktemp("tiff_gray8_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tiff_gray8_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def tiff_multipage_pdf(tmp_path_factory, tiff_multipage_img, request):
tmpdir = tmp_path_factory.mktemp("tiff_multipage_pdf")
out_pdf = tmpdir / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tiff_multipage_img),
]
)
pdfinfo = subprocess.check_output(["pdfinfo", str(out_pdf)])
assert re.search(
"^Pages: +2$", pdfinfo.decode("utf8"), re.MULTILINE
), pdfinfo.decode("utf8")
# split the PDF into single-page files so that each page can be checked on its own
subprocess.check_call(["pdfseparate", str(out_pdf), str(tmpdir / "page-%d.pdf")])
for page in [1, 2]:
tiff_multipage_pdf_nr = tmpdir / ("page-%d.pdf" % page)
with pikepdf.open(tiff_multipage_pdf_nr) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 3
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
tiff_multipage_pdf_nr.unlink()
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def tiff_palette1_pdf(tmp_path_factory, tiff_palette1_img, request):
out_pdf = tmp_path_factory.mktemp("tiff_palette1_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tiff_palette1_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1
assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed"
assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def tiff_palette2_pdf(tmp_path_factory, tiff_palette2_img, request):
out_pdf = tmp_path_factory.mktemp("tiff_palette2_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tiff_palette2_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 2
assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed"
assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 2
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def tiff_palette4_pdf(tmp_path_factory, tiff_palette4_img, request):
out_pdf = tmp_path_factory.mktemp("tiff_palette4_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tiff_palette4_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 4
assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed"
assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 4
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def tiff_palette8_pdf(tmp_path_factory, tiff_palette8_img, request):
out_pdf = tmp_path_factory.mktemp("tiff_palette8_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tiff_palette8_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed"
assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def tiff_ccitt_lsb_m2l_white_pdf(
tmp_path_factory, tiff_ccitt_lsb_m2l_white_img, request
):
out_pdf = tmp_path_factory.mktemp("tiff_ccitt_lsb_m2l_white_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tiff_ccitt_lsb_m2l_white_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].BlackIs1 == False
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Columns == 60
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].K == -1
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Rows == 60
assert p.pages[0].Resources.XObject.Im0.Filter[0] == "/CCITTFaxDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def tiff_ccitt_msb_m2l_white_pdf(
tmp_path_factory, tiff_ccitt_msb_m2l_white_img, request
):
out_pdf = tmp_path_factory.mktemp("tiff_ccitt_msb_m2l_white_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tiff_ccitt_msb_m2l_white_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].BlackIs1 == False
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Columns == 60
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].K == -1
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Rows == 60
assert p.pages[0].Resources.XObject.Im0.Filter[0] == "/CCITTFaxDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def tiff_ccitt_msb_l2m_white_pdf(
tmp_path_factory, tiff_ccitt_msb_l2m_white_img, request
):
out_pdf = tmp_path_factory.mktemp("tiff_ccitt_msb_l2m_white_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tiff_ccitt_msb_l2m_white_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].BlackIs1 == False
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Columns == 60
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].K == -1
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Rows == 60
assert p.pages[0].Resources.XObject.Im0.Filter[0] == "/CCITTFaxDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def tiff_ccitt_lsb_m2l_black_pdf(
tmp_path_factory, tiff_ccitt_lsb_m2l_black_img, request
):
out_pdf = tmp_path_factory.mktemp("tiff_ccitt_lsb_m2l_black_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tiff_ccitt_lsb_m2l_black_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].BlackIs1 == True
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Columns == 60
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].K == -1
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Rows == 60
assert p.pages[0].Resources.XObject.Im0.Filter[0] == "/CCITTFaxDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def tiff_ccitt_nometa1_pdf(tmp_path_factory, tiff_ccitt_nometa1_img, request):
out_pdf = tmp_path_factory.mktemp("tiff_ccitt_nometa1_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tiff_ccitt_nometa1_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].BlackIs1 == False
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Columns == 60
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].K == -1
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Rows == 60
assert p.pages[0].Resources.XObject.Im0.Filter[0] == "/CCITTFaxDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def tiff_ccitt_nometa2_pdf(tmp_path_factory, tiff_ccitt_nometa2_img, request):
out_pdf = tmp_path_factory.mktemp("tiff_ccitt_nometa2_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(tiff_ccitt_nometa2_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray"
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].BlackIs1 == False
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Columns == 60
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].K == -1
assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Rows == 60
assert p.pages[0].Resources.XObject.Im0.Filter[0] == "/CCITTFaxDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def miff_cmyk8_pdf(tmp_path_factory, miff_cmyk8_img, request):
out_pdf = tmp_path_factory.mktemp("miff_cmyk8_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(miff_cmyk8_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceCMYK"
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def miff_cmyk16_pdf(tmp_path_factory, miff_cmyk16_img, request):
out_pdf = tmp_path_factory.mktemp("miff_cmyk16_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(miff_cmyk16_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 16
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceCMYK"
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
@pytest.fixture(scope="session", params=["internal", "pikepdf"])
def miff_rgb8_pdf(tmp_path_factory, miff_rgb8_img, request):
out_pdf = tmp_path_factory.mktemp("miff_rgb8_pdf") / "out.pdf"
subprocess.check_call(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + request.param,
"--output=" + str(out_pdf),
str(miff_rgb8_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert (
p.pages[0].Contents.read_bytes()
== b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ"
)
assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB"
assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 3
assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15
assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode"
assert p.pages[0].Resources.XObject.Im0.Height == 60
assert p.pages[0].Resources.XObject.Im0.Width == 60
return out_pdf
###############################################################################
# TEST CASES #
###############################################################################
@pytest.mark.skipif(
sys.platform in ["darwin", "win32"],
reason="test utilities not available on Windows and MacOS",
)
def test_jpg(tmp_path_factory, jpg_img, jpg_pdf):
tmpdir = tmp_path_factory.mktemp("jpg")
pnm = tmpdir / "jpg.pnm"
# Decode the original JPG with jpegtopnm before comparing it with
# imagemagick, because imagemagick decodes the JPG slightly differently
# than ghostscript, poppler and mupdf do.
# djpeg cannot be used instead of jpegtopnm because it also produces
# slightly different results when called like this:
# djpeg -dct int -pnm "$tempdir/normal.jpg" > "$tempdir/normal.pnm"
# An alternative way to compare the JPG would be to require a different DCT
# method when decoding by setting -define jpeg:dct-method=ifast in the
# compare command.
pnm.write_bytes(subprocess.check_output(["jpegtopnm", "-dct", "int", str(jpg_img)]))
compare_ghostscript(tmpdir, pnm, jpg_pdf)
compare_poppler(tmpdir, pnm, jpg_pdf)
compare_mupdf(tmpdir, pnm, jpg_pdf)
pnm.unlink()
compare_pdfimages_jpg(tmpdir, jpg_img, jpg_pdf)
@pytest.mark.skipif(
sys.platform in ["darwin", "win32"],
reason="test utilities not available on Windows and MacOS",
)
def test_jpg_rot(tmp_path_factory, jpg_rot_img, jpg_rot_pdf):
tmpdir = tmp_path_factory.mktemp("jpg_rot")
# Decode the original JPG with jpegtopnm before comparing it with
# imagemagick, because imagemagick decodes the JPG slightly differently
# than ghostscript, poppler and mupdf do.
# djpeg cannot be used instead of jpegtopnm because it also produces
# slightly different results when called like this:
# djpeg -dct int -pnm "$tempdir/normal.jpg" > "$tempdir/normal.pnm"
# An alternative way to compare the JPG would be to require a different DCT
# method when decoding by setting -define jpeg:dct-method=ifast in the
# compare command.
jpg_rot_pnm = tmpdir / "jpg_rot.pnm"
jpg_rot_pnm.write_bytes(
subprocess.check_output(["jpegtopnm", "-dct", "int", str(jpg_rot_img)])
)
jpg_rot_png = tmpdir / "jpg_rot.png"
subprocess.check_call(
CONVERT + ["-rotate", "90", str(jpg_rot_pnm), str(jpg_rot_png)]
)
jpg_rot_pnm.unlink()
compare_ghostscript(tmpdir, jpg_rot_png, jpg_rot_pdf)
compare_poppler(tmpdir, jpg_rot_png, jpg_rot_pdf)
compare_mupdf(tmpdir, jpg_rot_png, jpg_rot_pdf)
jpg_rot_png.unlink()
compare_pdfimages_jpg(tmpdir, jpg_rot_img, jpg_rot_pdf)
@pytest.mark.skipif(
sys.platform in ["darwin", "win32"],
reason="test utilities not available on Windows and MacOS",
)
def test_jpg_cmyk(tmp_path_factory, jpg_cmyk_img, jpg_cmyk_pdf):
tmpdir = tmp_path_factory.mktemp("jpg_cmyk")
compare_ghostscript(
tmpdir, jpg_cmyk_img, jpg_cmyk_pdf, gsdevice="tiff32nc", exact=HAVE_EXACT_CMYK8
)
# not testing with poppler as it cannot write CMYK images
compare_mupdf(tmpdir, jpg_cmyk_img, jpg_cmyk_pdf, exact=HAVE_EXACT_CMYK8, cmyk=True)
compare_pdfimages_cmyk(tmpdir, jpg_cmyk_img, jpg_cmyk_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
@pytest.mark.skipif(
not HAVE_JP2, reason="requires imagemagick with support for jpeg2000"
)
def test_jpg_2000(tmp_path_factory, jpg_2000_img, jpg_2000_pdf):
tmpdir = tmp_path_factory.mktemp("jpg_2000")
compare_ghostscript(tmpdir, jpg_2000_img, jpg_2000_pdf)
compare_poppler(tmpdir, jpg_2000_img, jpg_2000_pdf)
compare_mupdf(tmpdir, jpg_2000_img, jpg_2000_pdf)
compare_pdfimages_jp2(tmpdir, jpg_2000_img, jpg_2000_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
@pytest.mark.skipif(
not HAVE_JP2, reason="requires imagemagick with support for jpeg2000"
)
def test_jpg_2000_rgba8(tmp_path_factory, jpg_2000_rgba8_img, jpg_2000_rgba8_pdf):
tmpdir = tmp_path_factory.mktemp("jpg_2000_rgba8")
compare_ghostscript(tmpdir, jpg_2000_rgba8_img, jpg_2000_rgba8_pdf)
# compare_poppler(tmpdir, jpg_2000_rgba8_img, jpg_2000_rgba8_pdf)
# compare_mupdf(tmpdir, jpg_2000_rgba8_img, jpg_2000_rgba8_pdf)
compare_pdfimages_jp2(tmpdir, jpg_2000_rgba8_img, jpg_2000_rgba8_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
@pytest.mark.skipif(
not HAVE_JP2, reason="requires imagemagick with support for jpeg2000"
)
def test_jpg_2000_rgba16(tmp_path_factory, jpg_2000_rgba16_img, jpg_2000_rgba16_pdf):
tmpdir = tmp_path_factory.mktemp("jpg_2000_rgba16")
compare_ghostscript(
tmpdir, jpg_2000_rgba16_img, jpg_2000_rgba16_pdf, gsdevice="tiff48nc"
)
# poppler outputs 8-bit RGB so the comparison will not be exact
# compare_poppler(tmpdir, jpg_2000_rgba16_img, jpg_2000_rgba16_pdf, exact=False)
# compare_mupdf(tmpdir, jpg_2000_rgba16_img, jpg_2000_rgba16_pdf)
compare_pdfimages_jp2(tmpdir, jpg_2000_rgba16_img, jpg_2000_rgba16_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_png_rgb8(tmp_path_factory, png_rgb8_img, png_rgb8_pdf):
tmpdir = tmp_path_factory.mktemp("png_rgb8")
compare_ghostscript(tmpdir, png_rgb8_img, png_rgb8_pdf)
compare_poppler(tmpdir, png_rgb8_img, png_rgb8_pdf)
compare_mupdf(tmpdir, png_rgb8_img, png_rgb8_pdf)
compare_pdfimages_png(tmpdir, png_rgb8_img, png_rgb8_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_png_rgb16(tmp_path_factory, png_rgb16_img, png_rgb16_pdf):
tmpdir = tmp_path_factory.mktemp("png_rgb16")
compare_ghostscript(tmpdir, png_rgb16_img, png_rgb16_pdf, gsdevice="tiff48nc")
# poppler outputs 8-bit RGB so the comparison will not be exact
compare_poppler(tmpdir, png_rgb16_img, png_rgb16_pdf, exact=False)
# pdfimages is unable to write 16 bit output
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_png_rgba8(tmp_path_factory, png_rgba8_img, png_rgba8_pdf):
tmpdir = tmp_path_factory.mktemp("png_rgba8")
compare_pdfimages_png(tmpdir, png_rgba8_img, png_rgba8_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
@pytest.mark.parametrize("engine", ["internal", "pikepdf"])
def test_png_rgba16(tmp_path_factory, png_rgba16_img, engine):
out_pdf = tmp_path_factory.mktemp("png_rgba16") / "out.pdf"
assert (
0
!= subprocess.run(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + engine,
"--output=" + str(out_pdf),
str(png_rgba16_img),
]
).returncode
)
out_pdf.unlink()
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_png_gray8a(tmp_path_factory, png_gray8a_img, png_gray8a_pdf):
tmpdir = tmp_path_factory.mktemp("png_gray8a")
compare_pdfimages_png(tmpdir, png_gray8a_img, png_gray8a_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
@pytest.mark.parametrize("engine", ["internal", "pikepdf"])
def test_png_gray16a(tmp_path_factory, png_gray16a_img, engine):
out_pdf = tmp_path_factory.mktemp("png_gray16a") / "out.pdf"
assert (
0
!= subprocess.run(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + engine,
"--output=" + str(out_pdf),
str(png_gray16a_img),
]
).returncode
)
out_pdf.unlink()
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_png_interlaced(tmp_path_factory, png_interlaced_img, png_interlaced_pdf):
tmpdir = tmp_path_factory.mktemp("png_interlaced")
compare_ghostscript(tmpdir, png_interlaced_img, png_interlaced_pdf)
compare_poppler(tmpdir, png_interlaced_img, png_interlaced_pdf)
compare_mupdf(tmpdir, png_interlaced_img, png_interlaced_pdf)
compare_pdfimages_png(tmpdir, png_interlaced_img, png_interlaced_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_png_gray1(tmp_path_factory, png_gray1_img, png_gray1_pdf):
tmpdir = tmp_path_factory.mktemp("png_gray1")
compare_ghostscript(tmpdir, png_gray1_img, png_gray1_pdf, gsdevice="pnggray")
compare_poppler(tmpdir, png_gray1_img, png_gray1_pdf)
compare_mupdf(tmpdir, png_gray1_img, png_gray1_pdf)
compare_pdfimages_png(tmpdir, png_gray1_img, png_gray1_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_png_gray2(tmp_path_factory, png_gray2_img, png_gray2_pdf):
tmpdir = tmp_path_factory.mktemp("png_gray2")
compare_ghostscript(tmpdir, png_gray2_img, png_gray2_pdf, gsdevice="pnggray")
compare_poppler(tmpdir, png_gray2_img, png_gray2_pdf)
compare_mupdf(tmpdir, png_gray2_img, png_gray2_pdf)
compare_pdfimages_png(tmpdir, png_gray2_img, png_gray2_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_png_gray4(tmp_path_factory, png_gray4_img, png_gray4_pdf):
tmpdir = tmp_path_factory.mktemp("png_gray4")
compare_ghostscript(tmpdir, png_gray4_img, png_gray4_pdf, gsdevice="pnggray")
compare_poppler(tmpdir, png_gray4_img, png_gray4_pdf)
compare_mupdf(tmpdir, png_gray4_img, png_gray4_pdf)
compare_pdfimages_png(tmpdir, png_gray4_img, png_gray4_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_png_gray8(tmp_path_factory, png_gray8_img, png_gray8_pdf):
tmpdir = tmp_path_factory.mktemp("png_gray8")
compare_ghostscript(tmpdir, png_gray8_img, png_gray8_pdf, gsdevice="pnggray")
compare_poppler(tmpdir, png_gray8_img, png_gray8_pdf)
compare_mupdf(tmpdir, png_gray8_img, png_gray8_pdf)
compare_pdfimages_png(tmpdir, png_gray8_img, png_gray8_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_png_gray16(tmp_path_factory, png_gray16_img, png_gray16_pdf):
tmpdir = tmp_path_factory.mktemp("png_gray16")
# ghostscript outputs 8-bit grayscale, so the comparison will not be exact
compare_ghostscript(
tmpdir, png_gray16_img, png_gray16_pdf, gsdevice="pnggray", exact=False
)
# poppler outputs 8-bit grayscale so the comparison will not be exact
compare_poppler(tmpdir, png_gray16_img, png_gray16_pdf, exact=False)
# pdfimages outputs 8-bit grayscale so the comparison will not be exact
compare_pdfimages_png(tmpdir, png_gray16_img, png_gray16_pdf, exact=False)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_png_palette1(tmp_path_factory, png_palette1_img, png_palette1_pdf):
tmpdir = tmp_path_factory.mktemp("png_palette1")
compare_ghostscript(tmpdir, png_palette1_img, png_palette1_pdf)
compare_poppler(tmpdir, png_palette1_img, png_palette1_pdf)
compare_mupdf(tmpdir, png_palette1_img, png_palette1_pdf)
# pdfimages cannot export palette based images
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_png_palette2(tmp_path_factory, png_palette2_img, png_palette2_pdf):
tmpdir = tmp_path_factory.mktemp("png_palette2")
compare_ghostscript(tmpdir, png_palette2_img, png_palette2_pdf)
compare_poppler(tmpdir, png_palette2_img, png_palette2_pdf)
compare_mupdf(tmpdir, png_palette2_img, png_palette2_pdf)
# pdfimages cannot export palette based images
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_png_palette4(tmp_path_factory, png_palette4_img, png_palette4_pdf):
tmpdir = tmp_path_factory.mktemp("png_palette4")
compare_ghostscript(tmpdir, png_palette4_img, png_palette4_pdf)
compare_poppler(tmpdir, png_palette4_img, png_palette4_pdf)
compare_mupdf(tmpdir, png_palette4_img, png_palette4_pdf)
# pdfimages cannot export palette based images
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_png_palette8(tmp_path_factory, png_palette8_img, png_palette8_pdf):
tmpdir = tmp_path_factory.mktemp("png_palette8")
compare_ghostscript(tmpdir, png_palette8_img, png_palette8_pdf)
compare_poppler(tmpdir, png_palette8_img, png_palette8_pdf)
compare_mupdf(tmpdir, png_palette8_img, png_palette8_pdf)
# pdfimages cannot export palette based images
@pytest.mark.skipif(
sys.platform in ["darwin", "win32"],
reason="test utilities not available on Windows and MacOS",
)
def test_png_icc(tmp_path_factory, png_icc_img, png_icc_pdf):
tmpdir = tmp_path_factory.mktemp("png_icc")
compare_ghostscript(tmpdir, png_icc_img, png_icc_pdf, exact=False, icc=True)
compare_poppler(tmpdir, png_icc_img, png_icc_pdf, exact=False, icc=True)
# mupdf ignores the ICC profile in Debian (needs patched thirdparty liblcms2-art)
compare_pdfimages_png(tmpdir, png_icc_img, png_icc_pdf, exact=False, icc=True)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_gif_transparent(tmp_path_factory, gif_transparent_img, gif_transparent_pdf):
tmpdir = tmp_path_factory.mktemp("gif_transparent")
compare_pdfimages_png(tmpdir, gif_transparent_img, gif_transparent_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_gif_palette1(tmp_path_factory, gif_palette1_img, gif_palette1_pdf):
tmpdir = tmp_path_factory.mktemp("gif_palette1")
compare_ghostscript(tmpdir, gif_palette1_img, gif_palette1_pdf)
compare_poppler(tmpdir, gif_palette1_img, gif_palette1_pdf)
compare_mupdf(tmpdir, gif_palette1_img, gif_palette1_pdf)
# pdfimages cannot export palette based images
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_gif_palette2(tmp_path_factory, gif_palette2_img, gif_palette2_pdf):
tmpdir = tmp_path_factory.mktemp("gif_palette2")
compare_ghostscript(tmpdir, gif_palette2_img, gif_palette2_pdf)
compare_poppler(tmpdir, gif_palette2_img, gif_palette2_pdf)
compare_mupdf(tmpdir, gif_palette2_img, gif_palette2_pdf)
# pdfimages cannot export palette based images
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_gif_palette4(tmp_path_factory, gif_palette4_img, gif_palette4_pdf):
tmpdir = tmp_path_factory.mktemp("gif_palette4")
compare_ghostscript(tmpdir, gif_palette4_img, gif_palette4_pdf)
compare_poppler(tmpdir, gif_palette4_img, gif_palette4_pdf)
compare_mupdf(tmpdir, gif_palette4_img, gif_palette4_pdf)
# pdfimages cannot export palette based images
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_gif_palette8(tmp_path_factory, gif_palette8_img, gif_palette8_pdf):
tmpdir = tmp_path_factory.mktemp("gif_palette8")
compare_ghostscript(tmpdir, gif_palette8_img, gif_palette8_pdf)
compare_poppler(tmpdir, gif_palette8_img, gif_palette8_pdf)
compare_mupdf(tmpdir, gif_palette8_img, gif_palette8_pdf)
# pdfimages cannot export palette based images
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_gif_animation(tmp_path_factory, gif_animation_img, gif_animation_pdf):
tmpdir = tmp_path_factory.mktemp("gif_animation")
subprocess.check_call(
["pdfseparate", str(gif_animation_pdf), str(tmpdir / "page-%d.pdf")]
)
for page in [1, 2]:
gif_animation_pdf_nr = tmpdir / ("page-%d.pdf" % page)
compare_ghostscript(
tmpdir, str(gif_animation_img) + "[%d]" % (page - 1), gif_animation_pdf_nr
)
compare_poppler(
tmpdir, str(gif_animation_img) + "[%d]" % (page - 1), gif_animation_pdf_nr
)
compare_mupdf(
tmpdir, str(gif_animation_img) + "[%d]" % (page - 1), gif_animation_pdf_nr
)
# pdfimages cannot export palette based images
gif_animation_pdf_nr.unlink()
@pytest.mark.skipif(
sys.platform in ["darwin", "win32"],
reason="test utilities not available on Windows and MacOS",
)
@pytest.mark.skipif(
platform.machine() == "s390x",
reason="https://github.com/ImageMagick/ImageMagick/issues/8054",
)
@pytest.mark.parametrize("engine", ["internal", "pikepdf"])
def test_tiff_float(tmp_path_factory, tiff_float_img, engine):
out_pdf = tmp_path_factory.mktemp("tiff_float") / "out.pdf"
assert (
0
!= subprocess.run(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + engine,
"--output=" + str(out_pdf),
str(tiff_float_img),
]
).returncode
)
out_pdf.unlink()
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_tiff_cmyk8(tmp_path_factory, tiff_cmyk8_img, tiff_cmyk8_pdf):
tmpdir = tmp_path_factory.mktemp("tiff_cmyk8")
compare_ghostscript(
tmpdir,
tiff_cmyk8_img,
tiff_cmyk8_pdf,
gsdevice="tiff32nc",
exact=HAVE_EXACT_CMYK8,
)
# not testing with poppler as it cannot write CMYK images
compare_mupdf(
tmpdir, tiff_cmyk8_img, tiff_cmyk8_pdf, exact=HAVE_EXACT_CMYK8, cmyk=True
)
compare_pdfimages_tiff(tmpdir, tiff_cmyk8_img, tiff_cmyk8_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
@pytest.mark.parametrize("engine", ["internal", "pikepdf"])
def test_tiff_cmyk16(tmp_path_factory, tiff_cmyk16_img, engine):
out_pdf = tmp_path_factory.mktemp("tiff_cmyk16") / "out.pdf"
# PIL is unable to read 16 bit CMYK images
assert (
0
!= subprocess.run(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + engine,
"--output=" + str(out_pdf),
str(tiff_cmyk16_img),
]
).returncode
)
out_pdf.unlink()
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_tiff_rgb8(tmp_path_factory, tiff_rgb8_img, tiff_rgb8_pdf):
tmpdir = tmp_path_factory.mktemp("tiff_rgb8")
compare_ghostscript(tmpdir, tiff_rgb8_img, tiff_rgb8_pdf, gsdevice="tiff24nc")
compare_poppler(tmpdir, tiff_rgb8_img, tiff_rgb8_pdf)
compare_mupdf(tmpdir, tiff_rgb8_img, tiff_rgb8_pdf)
compare_pdfimages_tiff(tmpdir, tiff_rgb8_img, tiff_rgb8_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
@pytest.mark.parametrize("engine", ["internal", "pikepdf"])
def test_tiff_rgb12(tmp_path_factory, tiff_rgb12_img, engine):
out_pdf = tmp_path_factory.mktemp("tiff_rgb12") / "out.pdf"
# PIL is unable to preserve more than 8 bits per sample
assert (
0
!= subprocess.run(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + engine,
"--output=" + str(out_pdf),
str(tiff_rgb12_img),
]
).returncode
)
out_pdf.unlink()
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
@pytest.mark.parametrize("engine", ["internal", "pikepdf"])
def test_tiff_rgb14(tmp_path_factory, tiff_rgb14_img, engine):
out_pdf = tmp_path_factory.mktemp("tiff_rgb14") / "out.pdf"
# PIL is unable to preserve more than 8 bits per sample
assert (
0
!= subprocess.run(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + engine,
"--output=" + str(out_pdf),
str(tiff_rgb14_img),
]
).returncode
)
out_pdf.unlink()
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
@pytest.mark.parametrize("engine", ["internal", "pikepdf"])
def test_tiff_rgb16(tmp_path_factory, tiff_rgb16_img, engine):
out_pdf = tmp_path_factory.mktemp("tiff_rgb16") / "out.pdf"
# PIL is unable to preserve more than 8 bits per sample
assert (
0
!= subprocess.run(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + engine,
"--output=" + str(out_pdf),
str(tiff_rgb16_img),
]
).returncode
)
out_pdf.unlink()
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
@pytest.mark.parametrize("engine", ["internal", "pikepdf"])
def test_tiff_rgba8(tmp_path_factory, tiff_rgba8_img, engine):
out_pdf = tmp_path_factory.mktemp("tiff_rgba8") / "out.pdf"
assert (
0
!= subprocess.run(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + engine,
"--output=" + str(out_pdf),
str(tiff_rgba8_img),
]
).returncode
)
out_pdf.unlink()
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
@pytest.mark.parametrize("engine", ["internal", "pikepdf"])
def test_tiff_rgba16(tmp_path_factory, tiff_rgba16_img, engine):
out_pdf = tmp_path_factory.mktemp("tiff_rgba16") / "out.pdf"
assert (
0
!= subprocess.run(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + engine,
"--output=" + str(out_pdf),
str(tiff_rgba16_img),
]
).returncode
)
out_pdf.unlink()
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_tiff_gray1(tmp_path_factory, tiff_gray1_img, tiff_gray1_pdf):
tmpdir = tmp_path_factory.mktemp("tiff_gray1")
compare_ghostscript(tmpdir, tiff_gray1_img, tiff_gray1_pdf, gsdevice="pnggray")
compare_poppler(tmpdir, tiff_gray1_img, tiff_gray1_pdf)
compare_mupdf(tmpdir, tiff_gray1_img, tiff_gray1_pdf)
compare_pdfimages_tiff(tmpdir, tiff_gray1_img, tiff_gray1_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_tiff_gray2(tmp_path_factory, tiff_gray2_img, tiff_gray2_pdf):
tmpdir = tmp_path_factory.mktemp("tiff_gray2")
compare_ghostscript(tmpdir, tiff_gray2_img, tiff_gray2_pdf, gsdevice="pnggray")
compare_poppler(tmpdir, tiff_gray2_img, tiff_gray2_pdf)
compare_mupdf(tmpdir, tiff_gray2_img, tiff_gray2_pdf)
compare_pdfimages_tiff(tmpdir, tiff_gray2_img, tiff_gray2_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_tiff_gray4(tmp_path_factory, tiff_gray4_img, tiff_gray4_pdf):
tmpdir = tmp_path_factory.mktemp("tiff_gray4")
compare_ghostscript(tmpdir, tiff_gray4_img, tiff_gray4_pdf, gsdevice="pnggray")
compare_poppler(tmpdir, tiff_gray4_img, tiff_gray4_pdf)
compare_mupdf(tmpdir, tiff_gray4_img, tiff_gray4_pdf)
compare_pdfimages_tiff(tmpdir, tiff_gray4_img, tiff_gray4_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_tiff_gray8(tmp_path_factory, tiff_gray8_img, tiff_gray8_pdf):
tmpdir = tmp_path_factory.mktemp("tiff_gray8")
compare_ghostscript(tmpdir, tiff_gray8_img, tiff_gray8_pdf, gsdevice="pnggray")
compare_poppler(tmpdir, tiff_gray8_img, tiff_gray8_pdf)
compare_mupdf(tmpdir, tiff_gray8_img, tiff_gray8_pdf)
compare_pdfimages_tiff(tmpdir, tiff_gray8_img, tiff_gray8_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
@pytest.mark.parametrize("engine", ["internal", "pikepdf"])
def test_tiff_gray16(tmp_path_factory, tiff_gray16_img, engine):
out_pdf = tmp_path_factory.mktemp("tiff_gray16") / "out.pdf"
assert (
0
!= subprocess.run(
[
img2pdfprog,
"--producer=",
"--nodate",
"--engine=" + engine,
"--output=" + str(out_pdf),
str(tiff_gray16_img),
]
).returncode
)
out_pdf.unlink()
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_tiff_multipage(tmp_path_factory, tiff_multipage_img, tiff_multipage_pdf):
tmpdir = tmp_path_factory.mktemp("tiff_multipage")
subprocess.check_call(
["pdfseparate", str(tiff_multipage_pdf), str(tmpdir / "page-%d.pdf")]
)
for page in [1, 2]:
tiff_multipage_pdf_nr = tmpdir / ("page-%d.pdf" % page)
compare_ghostscript(
tmpdir, str(tiff_multipage_img) + "[%d]" % (page - 1), tiff_multipage_pdf_nr
)
compare_poppler(
tmpdir, str(tiff_multipage_img) + "[%d]" % (page - 1), tiff_multipage_pdf_nr
)
compare_mupdf(
tmpdir, str(tiff_multipage_img) + "[%d]" % (page - 1), tiff_multipage_pdf_nr
)
compare_pdfimages_tiff(
tmpdir, str(tiff_multipage_img) + "[%d]" % (page - 1), tiff_multipage_pdf_nr
)
tiff_multipage_pdf_nr.unlink()
@pytest.mark.skipif(
not HAVE_IMAGEMAGICK_MODERN,
reason="requires imagemagick with support for keeping the palette depth",
)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_tiff_palette1(tmp_path_factory, tiff_palette1_img, tiff_palette1_pdf):
tmpdir = tmp_path_factory.mktemp("tiff_palette1")
compare_ghostscript(tmpdir, tiff_palette1_img, tiff_palette1_pdf)
compare_poppler(tmpdir, tiff_palette1_img, tiff_palette1_pdf)
compare_mupdf(tmpdir, tiff_palette1_img, tiff_palette1_pdf)
# pdfimages cannot export palette based images
@pytest.mark.skipif(
not HAVE_IMAGEMAGICK_MODERN,
reason="requires imagemagick with support for keeping the palette depth",
)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_tiff_palette2(tmp_path_factory, tiff_palette2_img, tiff_palette2_pdf):
tmpdir = tmp_path_factory.mktemp("tiff_palette2")
compare_ghostscript(tmpdir, tiff_palette2_img, tiff_palette2_pdf)
compare_poppler(tmpdir, tiff_palette2_img, tiff_palette2_pdf)
compare_mupdf(tmpdir, tiff_palette2_img, tiff_palette2_pdf)
# pdfimages cannot export palette based images
@pytest.mark.skipif(
not HAVE_IMAGEMAGICK_MODERN,
reason="requires imagemagick with support for keeping the palette depth",
)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_tiff_palette4(tmp_path_factory, tiff_palette4_img, tiff_palette4_pdf):
tmpdir = tmp_path_factory.mktemp("tiff_palette4")
compare_ghostscript(tmpdir, tiff_palette4_img, tiff_palette4_pdf)
compare_poppler(tmpdir, tiff_palette4_img, tiff_palette4_pdf)
compare_mupdf(tmpdir, tiff_palette4_img, tiff_palette4_pdf)
# pdfimages cannot export palette based images
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_tiff_palette8(tmp_path_factory, tiff_palette8_img, tiff_palette8_pdf):
tmpdir = tmp_path_factory.mktemp("tiff_palette8")
compare_ghostscript(tmpdir, tiff_palette8_img, tiff_palette8_pdf)
compare_poppler(tmpdir, tiff_palette8_img, tiff_palette8_pdf)
compare_mupdf(tmpdir, tiff_palette8_img, tiff_palette8_pdf)
# pdfimages cannot export palette based images
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows",
)
def test_tiff_ccitt_lsb_m2l_white(
tmp_path_factory, tiff_ccitt_lsb_m2l_white_img, tiff_ccitt_lsb_m2l_white_pdf
):
tmpdir = tmp_path_factory.mktemp("tiff_ccitt_lsb_m2l_white")
compare_ghostscript(
tmpdir,
tiff_ccitt_lsb_m2l_white_img,
tiff_ccitt_lsb_m2l_white_pdf,
gsdevice="pnggray",
)
compare_poppler(tmpdir, tiff_ccitt_lsb_m2l_white_img, tiff_ccitt_lsb_m2l_white_pdf)
compare_mupdf(tmpdir, tiff_ccitt_lsb_m2l_white_img, tiff_ccitt_lsb_m2l_white_pdf)
compare_pdfimages_tiff(
tmpdir, tiff_ccitt_lsb_m2l_white_img, tiff_ccitt_lsb_m2l_white_pdf
)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows and MacOS",
)
def test_tiff_ccitt_msb_m2l_white(
tmp_path_factory, tiff_ccitt_msb_m2l_white_img, tiff_ccitt_msb_m2l_white_pdf
):
tmpdir = tmp_path_factory.mktemp("tiff_ccitt_msb_m2l_white")
compare_ghostscript(
tmpdir,
tiff_ccitt_msb_m2l_white_img,
tiff_ccitt_msb_m2l_white_pdf,
gsdevice="pnggray",
)
compare_poppler(tmpdir, tiff_ccitt_msb_m2l_white_img, tiff_ccitt_msb_m2l_white_pdf)
compare_mupdf(tmpdir, tiff_ccitt_msb_m2l_white_img, tiff_ccitt_msb_m2l_white_pdf)
compare_pdfimages_tiff(
tmpdir, tiff_ccitt_msb_m2l_white_img, tiff_ccitt_msb_m2l_white_pdf
)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows and MacOS",
)
def test_tiff_ccitt_msb_l2m_white(
tmp_path_factory, tiff_ccitt_msb_l2m_white_img, tiff_ccitt_msb_l2m_white_pdf
):
tmpdir = tmp_path_factory.mktemp("tiff_ccitt_msb_l2m_white")
compare_ghostscript(
tmpdir,
tiff_ccitt_msb_l2m_white_img,
tiff_ccitt_msb_l2m_white_pdf,
gsdevice="pnggray",
)
compare_poppler(tmpdir, tiff_ccitt_msb_l2m_white_img, tiff_ccitt_msb_l2m_white_pdf)
compare_mupdf(tmpdir, tiff_ccitt_msb_l2m_white_img, tiff_ccitt_msb_l2m_white_pdf)
compare_pdfimages_tiff(
tmpdir, tiff_ccitt_msb_l2m_white_img, tiff_ccitt_msb_l2m_white_pdf
)
@pytest.mark.skipif(
not HAVE_IMAGEMAGICK_MODERN,
reason="requires imagemagick with support for min-is-black",
)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows and MacOS",
)
def test_tiff_ccitt_lsb_m2l_black(
tmp_path_factory, tiff_ccitt_lsb_m2l_black_img, tiff_ccitt_lsb_m2l_black_pdf
):
tmpdir = tmp_path_factory.mktemp("tiff_ccitt_lsb_m2l_black")
compare_ghostscript(
tmpdir,
tiff_ccitt_lsb_m2l_black_img,
tiff_ccitt_lsb_m2l_black_pdf,
gsdevice="pnggray",
)
compare_poppler(tmpdir, tiff_ccitt_lsb_m2l_black_img, tiff_ccitt_lsb_m2l_black_pdf)
compare_mupdf(tmpdir, tiff_ccitt_lsb_m2l_black_img, tiff_ccitt_lsb_m2l_black_pdf)
compare_pdfimages_tiff(
tmpdir, tiff_ccitt_lsb_m2l_black_img, tiff_ccitt_lsb_m2l_black_pdf
)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows and MacOS",
)
def test_tiff_ccitt_nometa1(
tmp_path_factory, tiff_ccitt_nometa1_img, tiff_ccitt_nometa1_pdf
):
tmpdir = tmp_path_factory.mktemp("tiff_ccitt_nometa1")
compare_ghostscript(
tmpdir, tiff_ccitt_nometa1_img, tiff_ccitt_nometa1_pdf, gsdevice="pnggray"
)
compare_poppler(tmpdir, tiff_ccitt_nometa1_img, tiff_ccitt_nometa1_pdf)
compare_mupdf(tmpdir, tiff_ccitt_nometa1_img, tiff_ccitt_nometa1_pdf)
compare_pdfimages_tiff(tmpdir, tiff_ccitt_nometa1_img, tiff_ccitt_nometa1_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows and MacOS",
)
def test_tiff_ccitt_nometa2(
tmp_path_factory, tiff_ccitt_nometa2_img, tiff_ccitt_nometa2_pdf
):
tmpdir = tmp_path_factory.mktemp("tiff_ccitt_nometa2")
compare_ghostscript(
tmpdir, tiff_ccitt_nometa2_img, tiff_ccitt_nometa2_pdf, gsdevice="pnggray"
)
compare_poppler(tmpdir, tiff_ccitt_nometa2_img, tiff_ccitt_nometa2_pdf)
compare_mupdf(tmpdir, tiff_ccitt_nometa2_img, tiff_ccitt_nometa2_pdf)
compare_pdfimages_tiff(tmpdir, tiff_ccitt_nometa2_img, tiff_ccitt_nometa2_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows and MacOS",
)
def test_miff_cmyk8(tmp_path_factory, miff_cmyk8_img, tiff_cmyk8_img, miff_cmyk8_pdf):
tmpdir = tmp_path_factory.mktemp("miff_cmyk8")
compare_ghostscript(tmpdir, tiff_cmyk8_img, miff_cmyk8_pdf, gsdevice="tiff32nc")
# not testing with poppler as it cannot write CMYK images
compare_mupdf(tmpdir, tiff_cmyk8_img, miff_cmyk8_pdf, cmyk=True)
compare_pdfimages_tiff(tmpdir, tiff_cmyk8_img, miff_cmyk8_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows and MacOS",
)
@pytest.mark.skipif(
platform.machine() == "s390x",
reason="https://github.com/ImageMagick/ImageMagick/issues/8055",
)
def test_miff_cmyk16(
tmp_path_factory, miff_cmyk16_img, tiff_cmyk16_img, miff_cmyk16_pdf
):
tmpdir = tmp_path_factory.mktemp("miff_cmyk16")
compare_ghostscript(
tmpdir, tiff_cmyk16_img, miff_cmyk16_pdf, gsdevice="tiff32nc", exact=False
)
# not testing with poppler as it cannot write CMYK images
compare_mupdf(tmpdir, tiff_cmyk16_img, miff_cmyk16_pdf, exact=False, cmyk=True)
# compare_pdfimages_tiff(tmpdir, tiff_cmyk16_img, miff_cmyk16_pdf)
@pytest.mark.skipif(
sys.platform in ["win32"],
reason="test utilities not available on Windows and MacOS",
)
def test_miff_rgb8(tmp_path_factory, miff_rgb8_img, tiff_rgb8_img, miff_rgb8_pdf):
tmpdir = tmp_path_factory.mktemp("miff_rgb8")
compare_ghostscript(tmpdir, tiff_rgb8_img, miff_rgb8_pdf, gsdevice="tiff24nc")
compare_poppler(tmpdir, tiff_rgb8_img, miff_rgb8_pdf)
compare_mupdf(tmpdir, tiff_rgb8_img, miff_rgb8_pdf)
compare_pdfimages_tiff(tmpdir, tiff_rgb8_img, miff_rgb8_pdf)
# we define some variables so that the table below can be narrower
psl = (972, 504) # --pagesize landscape
psp = (504, 972) # --pagesize portrait
isl = (756, 324) # --imgsize landscape
isp = (324, 756) # --imgsize portrait
border = (162, 270) # --border
poster = (97200, 50400)
# shortcuts for fit modes
f_into = img2pdf.FitMode.into
f_fill = img2pdf.FitMode.fill
f_exact = img2pdf.FitMode.exact
f_shrink = img2pdf.FitMode.shrink
f_enlarge = img2pdf.FitMode.enlarge
@pytest.mark.parametrize(
"layout_test_cases",
[
# fmt: off
# psl=972x504, psp=504x972, isl=756x324, isp=324x756, border=162:270
# --pagesize --border -a pagepdf imgpdf
# --imgsize --fit
(None, None, None, f_into, 0, (648, 216), (648, 216), # 000
(864, 432), (864, 432)),
(None, None, None, f_into, 1, (648, 216), (648, 216), # 001
(864, 432), (864, 432)),
(None, None, None, f_fill, 0, (648, 216), (648, 216), # 002
(864, 432), (864, 432)),
(None, None, None, f_fill, 1, (648, 216), (648, 216), # 003
(864, 432), (864, 432)),
(None, None, None, f_exact, 0, (648, 216), (648, 216), # 004
(864, 432), (864, 432)),
(None, None, None, f_exact, 1, (648, 216), (648, 216), # 005
(864, 432), (864, 432)),
(None, None, None, f_shrink, 0, (648, 216), (648, 216), # 006
(864, 432), (864, 432)),
(None, None, None, f_shrink, 1, (648, 216), (648, 216), # 007
(864, 432), (864, 432)),
(None, None, None, f_enlarge, 0, (648, 216), (648, 216), # 008
(864, 432), (864, 432)),
(None, None, None, f_enlarge, 1, (648, 216), (648, 216), # 009
(864, 432), (864, 432)),
(None, None, border, f_into, 0, (1188, 540), (648, 216), # 010
(1404, 756), (864, 432)),
(None, None, border, f_into, 1, (1188, 540), (648, 216), # 011
(1404, 756), (864, 432)),
(None, None, border, f_fill, 0, (1188, 540), (648, 216), # 012
(1404, 756), (864, 432)),
(None, None, border, f_fill, 1, (1188, 540), (648, 216), # 013
(1404, 756), (864, 432)),
(None, None, border, f_exact, 0, (1188, 540), (648, 216), # 014
(1404, 756), (864, 432)),
(None, None, border, f_exact, 1, (1188, 540), (648, 216), # 015
(1404, 756), (864, 432)),
(None, None, border, f_shrink, 0, (1188, 540), (648, 216), # 016
(1404, 756), (864, 432)),
(None, None, border, f_shrink, 1, (1188, 540), (648, 216), # 017
(1404, 756), (864, 432)),
(None, None, border, f_enlarge, 0, (1188, 540), (648, 216), # 018
(1404, 756), (864, 432)),
(None, None, border, f_enlarge, 1, (1188, 540), (648, 216), # 019
(1404, 756), (864, 432)),
(None, isp, None, f_into, 0, (324, 108), (324, 108), # 020
(324, 162), (324, 162)),
(None, isp, None, f_into, 1, (324, 108), (324, 108), # 021
(324, 162), (324, 162)),
(None, isp, None, f_fill, 0, (2268, 756), (2268, 756), # 022
(1512, 756), (1512, 756)),
(None, isp, None, f_fill, 1, (2268, 756), (2268, 756), # 023
(1512, 756), (1512, 756)),
(None, isp, None, f_exact, 0, (324, 756), (324, 756), # 024
(324, 756), (324, 756)),
(None, isp, None, f_exact, 1, (324, 756), (324, 756), # 025
(324, 756), (324, 756)),
(None, isp, None, f_shrink, 0, (324, 108), (324, 108), # 026
(324, 162), (324, 162)),
(None, isp, None, f_shrink, 1, (324, 108), (324, 108), # 027
(324, 162), (324, 162)),
(None, isp, None, f_enlarge, 0, (648, 216), (648, 216), # 028
(864, 432), (864, 432)),
(None, isp, None, f_enlarge, 1, (648, 216), (648, 216), # 029
(864, 432), (864, 432)),
(None, isp, border, f_into, 0, (864, 432), (324, 108), # 030
(864, 486), (324, 162)),
(None, isp, border, f_into, 1, (864, 432), (324, 108), # 031
(864, 486), (324, 162)),
(None, isp, border, f_fill, 0, (2808, 1080), (2268, 756), # 032
(2052, 1080), (1512, 756)),
(None, isp, border, f_fill, 1, (2808, 1080), (2268, 756), # 033
(2052, 1080), (1512, 756)),
(None, isp, border, f_exact, 0, (864, 1080), (324, 756), # 034
(864, 1080), (324, 756)),
(None, isp, border, f_exact, 1, (864, 1080), (324, 756), # 035
(864, 1080), (324, 756)),
(None, isp, border, f_shrink, 0, (864, 432), (324, 108), # 036
(864, 486), (324, 162)),
(None, isp, border, f_shrink, 1, (864, 432), (324, 108), # 037
(864, 486), (324, 162)),
(None, isp, border, f_enlarge, 0, (1188, 540), (648, 216), # 038
(1404, 756), (864, 432)),
(None, isp, border, f_enlarge, 1, (1188, 540), (648, 216), # 039
(1404, 756), (864, 432)),
(None, isl, None, f_into, 0, (756, 252), (756, 252), # 040
(648, 324), (648, 324)),
(None, isl, None, f_into, 1, (756, 252), (756, 252), # 041
(648, 324), (648, 324)),
(None, isl, None, f_fill, 0, (972, 324), (972, 324), # 042
(756, 378), (756, 378)),
(None, isl, None, f_fill, 1, (972, 324), (972, 324), # 043
(756, 378), (756, 378)),
(None, isl, None, f_exact, 0, (756, 324), (756, 324), # 044
(756, 324), (756, 324)),
(None, isl, None, f_exact, 1, (756, 324), (756, 324), # 045
(756, 324), (756, 324)),
(None, isl, None, f_shrink, 0, (648, 216), (648, 216), # 046
(648, 324), (648, 324)),
(None, isl, None, f_shrink, 1, (648, 216), (648, 216), # 047
(648, 324), (648, 324)),
(None, isl, None, f_enlarge, 0, (756, 252), (756, 252), # 048
(864, 432), (864, 432)),
(None, isl, None, f_enlarge, 1, (756, 252), (756, 252), # 049
(864, 432), (864, 432)),
# psl=972x504, psp=504x972, isl=756x324, isp=324x756, border=162:270
# --pagesize --border -a pagepdf imgpdf
# --imgsize --fit imgpx
(None, isl, border, f_into, 0, (1296, 576), (756, 252), # 050
(1188, 648), (648, 324)),
(None, isl, border, f_into, 1, (1296, 576), (756, 252), # 051
(1188, 648), (648, 324)),
(None, isl, border, f_fill, 0, (1512, 648), (972, 324), # 052
(1296, 702), (756, 378)),
(None, isl, border, f_fill, 1, (1512, 648), (972, 324), # 053
(1296, 702), (756, 378)),
(None, isl, border, f_exact, 0, (1296, 648), (756, 324), # 054
(1296, 648), (756, 324)),
(None, isl, border, f_exact, 1, (1296, 648), (756, 324), # 055
(1296, 648), (756, 324)),
(None, isl, border, f_shrink, 0, (1188, 540), (648, 216), # 056
(1188, 648), (648, 324)),
(None, isl, border, f_shrink, 1, (1188, 540), (648, 216), # 057
(1188, 648), (648, 324)),
(None, isl, border, f_enlarge, 0, (1296, 576), (756, 252), # 058
(1404, 756), (864, 432)),
(None, isl, border, f_enlarge, 1, (1296, 576), (756, 252), # 059
(1404, 756), (864, 432)),
(psp, None, None, f_into, 0, (504, 972), (504, 168), # 060
(504, 972), (504, 252)),
(psp, None, None, f_into, 1, (972, 504), (972, 324), # 061
(972, 504), (972, 486)),
(psp, None, None, f_fill, 0, (504, 972), (2916, 972), # 062
(504, 972), (1944, 972)),
(psp, None, None, f_fill, 1, (972, 504), (1512, 504), # 063
(972, 504), (1008, 504)),
(psp, None, None, f_exact, 0, (504, 972), (504, 972), # 064
(504, 972), (504, 972)),
(psp, None, None, f_exact, 1, (972, 504), (972, 504), # 065
(972, 504), (972, 504)),
(psp, None, None, f_shrink, 0, (504, 972), (504, 168), # 066
(504, 972), (504, 252)),
(psp, None, None, f_shrink, 1, (972, 504), (648, 216), # 067
(972, 504), (864, 432)),
(psp, None, None, f_enlarge, 0, (504, 972), (648, 216), # 068
(504, 972), (864, 432)),
(psp, None, None, f_enlarge, 1, (972, 504), (972, 324), # 069
(972, 504), (972, 486)),
(psp, None, border, f_into, 0, None, None, None, None), # 070
(psp, None, border, f_into, 1, None, None, None, None), # 071
(psp, None, border, f_fill, 0, (504, 972), (1944, 648), # 072
(504, 972), (1296, 648)),
(psp, None, border, f_fill, 1, (972, 504), (648, 216), # 073
(972, 504), (648, 324)),
(psp, None, border, f_exact, 0, None, None, None, None), # 074
(psp, None, border, f_exact, 1, None, None, None, None), # 075
(psp, None, border, f_shrink, 0, None, None, None, None), # 076
(psp, None, border, f_shrink, 1, None, None, None, None), # 077
(psp, None, border, f_enlarge, 0, (504, 972), (648, 216), # 078
(504, 972), (864, 432)),
(psp, None, border, f_enlarge, 1, (972, 504), (648, 216), # 079
(972, 504), (864, 432)),
(psp, isp, None, f_into, 0, (504, 972), (324, 108), # 080
(504, 972), (324, 162)),
(psp, isp, None, f_into, 1, (972, 504), (324, 108), # 081
(972, 504), (324, 162)),
(psp, isp, None, f_fill, 0, (504, 972), (2268, 756), # 082
(504, 972), (1512, 756)),
(psp, isp, None, f_fill, 1, (972, 504), (2268, 756), # 083
(972, 504), (1512, 756)),
(psp, isp, None, f_exact, 0, (504, 972), (324, 756), # 084
(504, 972), (324, 756)),
(psp, isp, None, f_exact, 1, (972, 504), (324, 756), # 085
(972, 504), (324, 756)),
(psp, isp, None, f_shrink, 0, (504, 972), (324, 108), # 086
(504, 972), (324, 162)),
(psp, isp, None, f_shrink, 1, (972, 504), (324, 108), # 087
(972, 504), (324, 162)),
(psp, isp, None, f_enlarge, 0, (504, 972), (648, 216), # 088
(504, 972), (864, 432)),
(psp, isp, None, f_enlarge, 1, (972, 504), (648, 216), # 089
(972, 504), (864, 432)),
(psp, isp, border, f_into, 0, (504, 972), (324, 108), # 090
(504, 972), (324, 162)),
(psp, isp, border, f_into, 1, (972, 504), (324, 108), # 091
(972, 504), (324, 162)),
(psp, isp, border, f_fill, 0, (504, 972), (2268, 756), # 092
(504, 972), (1512, 756)),
(psp, isp, border, f_fill, 1, (972, 504), (2268, 756), # 093
(972, 504), (1512, 756)),
(psp, isp, border, f_exact, 0, (504, 972), (324, 756), # 094
(504, 972), (324, 756)),
(psp, isp, border, f_exact, 1, (972, 504), (324, 756), # 095
(972, 504), (324, 756)),
(psp, isp, border, f_shrink, 0, (504, 972), (324, 108), # 096
(504, 972), (324, 162)),
(psp, isp, border, f_shrink, 1, (972, 504), (324, 108), # 097
(972, 504), (324, 162)),
(psp, isp, border, f_enlarge, 0, (504, 972), (648, 216), # 098
(504, 972), (864, 432)),
(psp, isp, border, f_enlarge, 1, (972, 504), (648, 216), # 099
(972, 504), (864, 432)),
# psl=972x504, psp=504x972, isl=756x324, isp=324x756, border=162:270
# --pagesize --border -a pagepdf imgpdf
# --imgsize --fit imgpx
(psp, isl, None, f_into, 0, (504, 972), (756, 252), # 100
(504, 972), (648, 324)),
(psp, isl, None, f_into, 1, (972, 504), (756, 252), # 101
(972, 504), (648, 324)),
(psp, isl, None, f_fill, 0, (504, 972), (972, 324), # 102
(504, 972), (756, 378)),
(psp, isl, None, f_fill, 1, (972, 504), (972, 324), # 103
(972, 504), (756, 378)),
(psp, isl, None, f_exact, 0, (504, 972), (756, 324), # 104
(504, 972), (756, 324)),
(psp, isl, None, f_exact, 1, (972, 504), (756, 324), # 105
(972, 504), (756, 324)),
(psp, isl, None, f_shrink, 0, (504, 972), (648, 216), # 106
(504, 972), (648, 324)),
(psp, isl, None, f_shrink, 1, (972, 504), (648, 216), # 107
(972, 504), (648, 324)),
(psp, isl, None, f_enlarge, 0, (504, 972), (756, 252), # 108
(504, 972), (864, 432)),
(psp, isl, None, f_enlarge, 1, (972, 504), (756, 252), # 109
(972, 504), (864, 432)),
(psp, isl, border, f_into, 0, (504, 972), (756, 252), # 110
(504, 972), (648, 324)),
(psp, isl, border, f_into, 1, (972, 504), (756, 252), # 111
(972, 504), (648, 324)),
(psp, isl, border, f_fill, 0, (504, 972), (972, 324), # 112
(504, 972), (756, 378)),
(psp, isl, border, f_fill, 1, (972, 504), (972, 324), # 113
(972, 504), (756, 378)),
(psp, isl, border, f_exact, 0, (504, 972), (756, 324), # 114
(504, 972), (756, 324)),
(psp, isl, border, f_exact, 1, (972, 504), (756, 324), # 115
(972, 504), (756, 324)),
(psp, isl, border, f_shrink, 0, (504, 972), (648, 216), # 116
(504, 972), (648, 324)),
(psp, isl, border, f_shrink, 1, (972, 504), (648, 216), # 117
(972, 504), (648, 324)),
(psp, isl, border, f_enlarge, 0, (504, 972), (756, 252), # 118
(504, 972), (864, 432)),
(psp, isl, border, f_enlarge, 1, (972, 504), (756, 252), # 119
(972, 504), (864, 432)),
(psl, None, None, f_into, 0, (972, 504), (972, 324), # 120
(972, 504), (972, 486)),
(psl, None, None, f_into, 1, (972, 504), (972, 324), # 121
(972, 504), (972, 486)),
(psl, None, None, f_fill, 0, (972, 504), (1512, 504), # 122
(972, 504), (1008, 504)),
(psl, None, None, f_fill, 1, (972, 504), (1512, 504), # 123
(972, 504), (1008, 504)),
(psl, None, None, f_exact, 0, (972, 504), (972, 504), # 124
(972, 504), (972, 504)),
(psl, None, None, f_exact, 1, (972, 504), (972, 504), # 125
(972, 504), (972, 504)),
(psl, None, None, f_shrink, 0, (972, 504), (648, 216), # 126
(972, 504), (864, 432)),
(psl, None, None, f_shrink, 1, (972, 504), (648, 216), # 127
(972, 504), (864, 432)),
(psl, None, None, f_enlarge, 0, (972, 504), (972, 324), # 128
(972, 504), (972, 486)),
(psl, None, None, f_enlarge, 1, (972, 504), (972, 324), # 129
(972, 504), (972, 486)),
(psl, None, border, f_into, 0, (972, 504), (432, 144), # 130
(972, 504), (360, 180)),
(psl, None, border, f_into, 1, (972, 504), (432, 144), # 131
(972, 504), (360, 180)),
(psl, None, border, f_fill, 0, (972, 504), (540, 180), # 132
(972, 504), (432, 216)),
(psl, None, border, f_fill, 1, (972, 504), (540, 180), # 133
(972, 504), (432, 216)),
(psl, None, border, f_exact, 0, (972, 504), (432, 180), # 134
(972, 504), (432, 180)),
(psl, None, border, f_exact, 1, (972, 504), (432, 180), # 135
(972, 504), (432, 180)),
(psl, None, border, f_shrink, 0, (972, 504), (432, 144), # 136
(972, 504), (360, 180)),
(psl, None, border, f_shrink, 1, (972, 504), (432, 144), # 137
(972, 504), (360, 180)),
(psl, None, border, f_enlarge, 0, (972, 504), (648, 216), # 138
(972, 504), (864, 432)),
(psl, None, border, f_enlarge, 1, (972, 504), (648, 216), # 139
(972, 504), (864, 432)),
(psl, isp, None, f_into, 0, (972, 504), (324, 108), # 140
(972, 504), (324, 162)),
(psl, isp, None, f_into, 1, (972, 504), (324, 108), # 141
(972, 504), (324, 162)),
(psl, isp, None, f_fill, 0, (972, 504), (2268, 756), # 142
(972, 504), (1512, 756)),
(psl, isp, None, f_fill, 1, (972, 504), (2268, 756), # 143
(972, 504), (1512, 756)),
(psl, isp, None, f_exact, 0, (972, 504), (324, 756), # 144
(972, 504), (324, 756)),
(psl, isp, None, f_exact, 1, (972, 504), (324, 756), # 145
(972, 504), (324, 756)),
(psl, isp, None, f_shrink, 0, (972, 504), (324, 108), # 146
(972, 504), (324, 162)),
(psl, isp, None, f_shrink, 1, (972, 504), (324, 108), # 147
(972, 504), (324, 162)),
(psl, isp, None, f_enlarge, 0, (972, 504), (648, 216), # 148
(972, 504), (864, 432)),
(psl, isp, None, f_enlarge, 1, (972, 504), (648, 216), # 149
(972, 504), (864, 432)),
# psl=972x504, psp=504x972, isl=756x324, isp=324x756, border=162:270
# --pagesize --border -a pagepdf imgpdf
# --imgsize --fit imgpx
(psl, isp, border, f_into, 0, (972, 504), (324, 108), # 150
(972, 504), (324, 162)),
(psl, isp, border, f_into, 1, (972, 504), (324, 108), # 151
(972, 504), (324, 162)),
(psl, isp, border, f_fill, 0, (972, 504), (2268, 756), # 152
(972, 504), (1512, 756)),
(psl, isp, border, f_fill, 1, (972, 504), (2268, 756), # 153
(972, 504), (1512, 756)),
(psl, isp, border, f_exact, 0, (972, 504), (324, 756), # 154
(972, 504), (324, 756)),
(psl, isp, border, f_exact, 1, (972, 504), (324, 756), # 155
(972, 504), (324, 756)),
(psl, isp, border, f_shrink, 0, (972, 504), (324, 108), # 156
(972, 504), (324, 162)),
(psl, isp, border, f_shrink, 1, (972, 504), (324, 108), # 157
(972, 504), (324, 162)),
(psl, isp, border, f_enlarge, 0, (972, 504), (648, 216), # 158
(972, 504), (864, 432)),
(psl, isp, border, f_enlarge, 1, (972, 504), (648, 216), # 159
(972, 504), (864, 432)),
(psl, isl, None, f_into, 0, (972, 504), (756, 252), # 160
(972, 504), (648, 324)),
(psl, isl, None, f_into, 1, (972, 504), (756, 252), # 161
(972, 504), (648, 324)),
(psl, isl, None, f_fill, 0, (972, 504), (972, 324), # 162
(972, 504), (756, 378)),
(psl, isl, None, f_fill, 1, (972, 504), (972, 324), # 163
(972, 504), (756, 378)),
(psl, isl, None, f_exact, 0, (972, 504), (756, 324), # 164
(972, 504), (756, 324)),
(psl, isl, None, f_exact, 1, (972, 504), (756, 324), # 165
(972, 504), (756, 324)),
(psl, isl, None, f_shrink, 0, (972, 504), (648, 216), # 166
(972, 504), (648, 324)),
(psl, isl, None, f_shrink, 1, (972, 504), (648, 216), # 167
(972, 504), (648, 324)),
(psl, isl, None, f_enlarge, 0, (972, 504), (756, 252), # 168
(972, 504), (864, 432)),
(psl, isl, None, f_enlarge, 1, (972, 504), (756, 252), # 169
(972, 504), (864, 432)),
(psl, isl, border, f_into, 0, (972, 504), (756, 252), # 170
(972, 504), (648, 324)),
(psl, isl, border, f_into, 1, (972, 504), (756, 252), # 171
(972, 504), (648, 324)),
(psl, isl, border, f_fill, 0, (972, 504), (972, 324), # 172
(972, 504), (756, 378)),
(psl, isl, border, f_fill, 1, (972, 504), (972, 324), # 173
(972, 504), (756, 378)),
(psl, isl, border, f_exact, 0, (972, 504), (756, 324), # 174
(972, 504), (756, 324)),
(psl, isl, border, f_exact, 1, (972, 504), (756, 324), # 175
(972, 504), (756, 324)),
(psl, isl, border, f_shrink, 0, (972, 504), (648, 216), # 176
(972, 504), (648, 324)),
(psl, isl, border, f_shrink, 1, (972, 504), (648, 216), # 177
(972, 504), (648, 324)),
(psl, isl, border, f_enlarge, 0, (972, 504), (756, 252), # 178
(972, 504), (864, 432)),
(psl, isl, border, f_enlarge, 1, (972, 504), (756, 252), # 179
(972, 504), (864, 432)),
(poster, None, None, f_fill, 0, (97200, 50400), (151200, 50400),
(97200, 50400), (100800, 50400)),
]
# fmt: on
)
def test_layout(layout_test_cases):
# there is no need to have test cases with the same images with inverted
# orientation (landscape/portrait) because --pagesize and --imgsize are
# already inverted
im1 = (864, 288) # imgpx #1 => 648x216
im2 = (1152, 576) # imgpx #2 => 864x432
psopt, isopt, border, fit, ao, pspdf1, ispdf1, pspdf2, ispdf2 = layout_test_cases
if isopt is not None:
isopt = ((img2pdf.ImgSize.abs, isopt[0]), (img2pdf.ImgSize.abs, isopt[1]))
layout_fun = img2pdf.get_layout_fun(psopt, isopt, border, fit, ao)
try:
pwpdf, phpdf, iwpdf, ihpdf = layout_fun(
im1[0], im1[1], (img2pdf.default_dpi, img2pdf.default_dpi)
)
assert (pwpdf, phpdf) == pspdf1
assert (iwpdf, ihpdf) == ispdf1
except img2pdf.NegativeDimensionError:
assert pspdf1 is None
assert ispdf1 is None
try:
pwpdf, phpdf, iwpdf, ihpdf = layout_fun(
im2[0], im2[1], (img2pdf.default_dpi, img2pdf.default_dpi)
)
assert (pwpdf, phpdf) == pspdf2
assert (iwpdf, ihpdf) == ispdf2
except img2pdf.NegativeDimensionError:
assert pspdf2 is None
assert ispdf2 is None
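# Illustrative sketch (not part of img2pdf): how the expected sizes in the
# layout table above can be derived by hand. It assumes the 96 dpi default
# implied by the "imgpx #1 => 648x216" comment in test_layout. The helper
# names _px_to_pt, _fit_into and _fit_fill are hypothetical and only mirror
# the arithmetic for the "into" and "fill" modes without page size or border;
# they are not img2pdf's implementation.
from fractions import Fraction


def _px_to_pt(size_px, dpi=96):
    # one inch is 72 pt, so pt = px * 72 / dpi (exact rational arithmetic)
    return tuple(Fraction(v * 72, dpi) for v in size_px)


def _fit_into(img_pt, box_pt):
    # scale uniformly so that the whole image fits inside the box
    scale = min(Fraction(box_pt[0]) / img_pt[0], Fraction(box_pt[1]) / img_pt[1])
    return (img_pt[0] * scale, img_pt[1] * scale)


def _fit_fill(img_pt, box_pt):
    # scale uniformly so that the image covers the whole box
    scale = max(Fraction(box_pt[0]) / img_pt[0], Fraction(box_pt[1]) / img_pt[1])
    return (img_pt[0] * scale, img_pt[1] * scale)


# With isp = (324, 756), _fit_into(_px_to_pt((864, 288)), isp) compares equal
# to (324, 108) as in row 020, and _fit_fill gives (2268, 756) as in row 022.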
@pytest.fixture(
scope="session",
params=os.listdir(os.path.join(os.path.dirname(__file__), "tests", "input")),
)
def general_input(request):
assert os.path.isfile(
os.path.join(os.path.dirname(__file__), "tests", "input", request.param)
)
return request.param
@pytest.mark.skipif(not HAVE_FAKETIME, reason="requires faketime")
@pytest.mark.parametrize(
"engine,testdata,timezone,pdfa",
itertools.product(
["internal", "pikepdf"],
["2021-02-05 17:49:00"],
["Europe/Berlin", "GMT+12"],
[True, False],
),
)
def test_faketime(tmp_path_factory, jpg_img, engine, testdata, timezone, pdfa):
expected = tz2utcstrftime(testdata, "D:%Y%m%d%H%M%SZ", timezone)
out_pdf = tmp_path_factory.mktemp("faketime") / "out.pdf"
subprocess.check_call(
["env", f"TZ={timezone}", "faketime", "-f", testdata, img2pdfprog]
+ (["--pdfa"] if pdfa else [])
+ [
"--producer=",
"--engine=" + engine,
"--output=" + str(out_pdf),
str(jpg_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert p.docinfo.CreationDate == expected
assert p.docinfo.ModDate == expected
if pdfa:
assert p.Root.Metadata.Subtype == "/XML"
assert p.Root.Metadata.Type == "/Metadata"
expected = tz2utcstrftime(testdata, "%Y-%m-%dT%H:%M:%SZ", timezone)
root = ET.fromstring(p.Root.Metadata.read_bytes())
for k in ["ModifyDate", "CreateDate"]:
assert (
root.find(
f".//xmp:{k}", {"xmp": "http://ns.adobe.com/xap/1.0/"}
).text
== expected
)
out_pdf.unlink()
@pytest.mark.parametrize(
"engine,testdata,timezone,pdfa",
itertools.product(
["internal", "pikepdf"],
[
"2021-02-05 17:49:00",
"2021-02-05T17:49:00",
"Fri, 05 Feb 2021 17:49:00 +0100",
"last year 12:00",
],
["Europe/Berlin", "GMT+12"],
[True, False],
),
)
def test_date(tmp_path_factory, jpg_img, engine, testdata, timezone, pdfa):
# we use the date utility to convert the timestamp from the local
# timezone into UTC with the format used by PDF
expected = tz2utcstrftime(testdata, "D:%Y%m%d%H%M%SZ", timezone)
out_pdf = tmp_path_factory.mktemp("date") / "out.pdf"
subprocess.check_call(
["env", f"TZ={timezone}", img2pdfprog]
+ (["--pdfa"] if pdfa else [])
+ [
f"--moddate={testdata}",
f"--creationdate={testdata}",
"--producer=",
"--engine=" + engine,
"--output=" + str(out_pdf),
str(jpg_img),
]
)
with pikepdf.open(str(out_pdf)) as p:
assert p.docinfo.CreationDate == expected
assert p.docinfo.ModDate == expected
if pdfa:
assert p.Root.Metadata.Subtype == "/XML"
assert p.Root.Metadata.Type == "/Metadata"
expected = tz2utcstrftime(testdata, "%Y-%m-%dT%H:%M:%SZ", timezone)
root = ET.fromstring(p.Root.Metadata.read_bytes())
for k in ["ModifyDate", "CreateDate"]:
assert (
root.find(
f".//xmp:{k}", {"xmp": "http://ns.adobe.com/xap/1.0/"}
).text
== expected
)
out_pdf.unlink()
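# Illustrative sketch (not part of the test suite): the conversion that the
# expected values in test_faketime and test_date rely on, done with the
# standard library only and for a timestamp that carries its own UTC offset.
# The helper name _to_pdf_and_xmp_dates is hypothetical; the tests themselves
# use tz2utcstrftime() instead.
from datetime import timezone
from email.utils import parsedate_to_datetime


def _to_pdf_and_xmp_dates(rfc2822_timestamp):
    # parse into a timezone-aware datetime and normalise it to UTC
    utc = parsedate_to_datetime(rfc2822_timestamp).astimezone(timezone.utc)
    # same format strings as in the tests above: PDF docinfo first, XMP second
    return utc.strftime("D:%Y%m%d%H%M%SZ"), utc.strftime("%Y-%m-%dT%H:%M:%SZ")


# "Fri, 05 Feb 2021 17:49:00 +0100" is one hour ahead of UTC, so both strings
# describe 16:49:00 on 2021-02-05.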
@pytest.mark.parametrize("engine", ["internal", "pikepdf"])
def test_general(general_input, engine):
inputf = os.path.join(os.path.dirname(__file__), "tests", "input", general_input)
outputf = os.path.join(
os.path.dirname(__file__), "tests", "output", general_input + ".pdf"
)
assert os.path.isfile(outputf)
f = inputf
out = outputf
engine = getattr(img2pdf.Engine, engine)
with open(f, "rb") as inf:
orig_imgdata = inf.read()
output = img2pdf.convert(orig_imgdata, nodate=True, engine=engine)
x = pikepdf.open(BytesIO(output))
assert x.Root.Pages.Count in (1, 2)
# the previous branches compared an integer length against the strings "1"
# and "2" and therefore never ran; check the page count directly instead
assert len(x.Root.Pages.Kids) == x.Root.Pages.Count
assert sorted(x.Root.keys()) == ["/Pages", "/Type"]
assert x.Root.Type == "/Catalog"
assert sorted(x.Root.Pages.keys()) == ["/Count", "/Kids", "/Type"]
assert x.Root.Pages.Type == "/Pages"
if f.endswith(".jb2"):
# PIL doesn't support .jb2, so we load the original .png, which
# was converted to the .jb2 using `jbig2enc`.
orig_img = Image.open(f.replace(".jb2", ".png"))
else:
orig_img = Image.open(f)
for pagenum in range(len(x.Root.Pages.Kids)):
# retrieve the original image frame that this page was
# generated from
orig_img.seek(pagenum)
cur_page = x.Root.Pages.Kids[pagenum]
ndpi = orig_img.info.get("dpi", (96.0, 96.0))
if ndpi[0] <= 0.001 or ndpi[1] <= 0.001:
ndpi = (96.0, 96.0)
# In python3, the returned dpi value for some tiff images will
# not be an integer but a float. To make the behaviour of
# img2pdf the same between python2 and python3, we convert that
# float into an integer by rounding.
# Search online for the 72.009 dpi problem for more info.
ndpi = (int(round(ndpi[0])), int(round(ndpi[1])))
imgwidthpx, imgheightpx = orig_img.size
pagewidth = 72.0 * imgwidthpx / ndpi[0]
pageheight = 72.0 * imgheightpx / ndpi[1]
def format_float(f):
if int(f) == f:
return int(f)
else:
return decimal.Decimal("%.4f" % f)
assert sorted(cur_page.keys()) == [
"/Contents",
"/MediaBox",
"/Parent",
"/Resources",
"/Type",
]
assert cur_page.MediaBox == pikepdf.Array(
[0, 0, format_float(pagewidth), format_float(pageheight)]
)
assert cur_page.Parent == x.Root.Pages
assert cur_page.Type == "/Page"
assert cur_page.Resources.keys() == {"/XObject"}
assert cur_page.Resources.XObject.keys() == {"/Im0"}
if engine != img2pdf.Engine.pikepdf:
assert cur_page.Contents.Length == len(cur_page.Contents.read_bytes())
assert (
cur_page.Contents.read_bytes()
== b"q\n%.4f 0 0 %.4f 0.0000 0.0000 cm\n/Im0 Do\nQ"
% (
pagewidth,
pageheight,
)
)
imgprops = cur_page.Resources.XObject.Im0
# test if the filter is valid:
assert imgprops.Filter in [
"/DCTDecode",
"/JPXDecode",
"/FlateDecode",
pikepdf.Array([pikepdf.Name.CCITTFaxDecode]),
"/JBIG2Decode",
]
# test if the image has correct size
assert imgprops.Width == orig_img.size[0]
assert imgprops.Height == orig_img.size[1]
# if the input file is a jpeg then it should've been copied
# verbatim into the PDF
if imgprops.Filter in ["/DCTDecode", "/JPXDecode"]:
assert cur_page.Resources.XObject.Im0.read_raw_bytes() == orig_imgdata
elif imgprops.Filter == "/JBIG2Decode":
assert (
cur_page.Resources.XObject.Im0.read_raw_bytes() == orig_imgdata[13:-22]
) # Strip file header and footer.
elif imgprops.Filter == pikepdf.Array([pikepdf.Name.CCITTFaxDecode]):
tiff_header = tiff_header_for_ccitt(
int(imgprops.Width), int(imgprops.Height), int(imgprops.Length), 4
)
imgio = BytesIO()
imgio.write(tiff_header)
imgio.write(cur_page.Resources.XObject.Im0.read_raw_bytes())
imgio.seek(0)
im = Image.open(imgio)
assert im.tobytes() == orig_img.tobytes()
try:
im.close()
except AttributeError:
pass
elif imgprops.Filter == "/FlateDecode":
# otherwise, the data is flate encoded and has to be equal
# to the pixel data of the input image
imgdata = zlib.decompress(cur_page.Resources.XObject.Im0.read_raw_bytes())
if hasattr(imgprops, "DecodeParms"):
if orig_img.format == "PNG":
pngidat, palette = img2pdf.parse_png(orig_imgdata)
elif (
orig_img.format == "TIFF"
and orig_img.info["compression"] == "group4"
):
offset, length = img2pdf.ccitt_payload_location_from_pil(orig_img)
pngidat = orig_imgdata[offset : offset + length]
else:
pngbuffer = BytesIO()
orig_img.save(pngbuffer, format="png")
pngidat, palette = img2pdf.parse_png(pngbuffer.getvalue())
assert zlib.decompress(pngidat) == imgdata
else:
colorspace = imgprops.ColorSpace
if colorspace == "/DeviceGray":
colorspace = "L"
elif colorspace == "/DeviceRGB":
colorspace = "RGB"
elif colorspace == "/DeviceCMYK":
colorspace = "CMYK"
else:
raise Exception("invalid colorspace")
im = Image.frombytes(
colorspace, (int(imgprops.Width), int(imgprops.Height)), imgdata
)
if orig_img.mode == "1":
assert im.tobytes() == orig_img.convert("L").tobytes()
elif orig_img.mode not in ("RGB", "L", "CMYK", "CMYK;I"):
assert im.tobytes() == orig_img.convert("RGB").tobytes()
# the python-pil version 2.3.0-1ubuntu3 in Ubuntu does
# not have the close() method
try:
im.close()
except AttributeError:
pass
else:
raise Exception("unknown filter")
def rec(obj):
if isinstance(obj, pikepdf.Dictionary):
return {k: rec(v) for k, v in obj.items() if k != "/Parent"}
elif isinstance(obj, pikepdf.Array):
return [rec(v) for v in obj]
elif isinstance(obj, pikepdf.Stream):
ret = rec(obj.stream_dict)
stream = obj.read_raw_bytes()
assert len(stream) == ret["/Length"]
del ret["/Length"]
if ret.get("/Filter") == "/FlateDecode":
stream = obj.read_bytes()
del ret["/Filter"]
ret["stream"] = stream
return ret
elif isinstance(obj, pikepdf.Name) or isinstance(obj, pikepdf.String):
return str(obj)
elif isinstance(obj, decimal.Decimal) or isinstance(obj, str):
return obj
elif isinstance(obj, int):
return decimal.Decimal(obj)
raise Exception("unhandled: %s" % (type(obj)))
y = pikepdf.open(out)
pydictx = rec(x.Root)
pydicty = rec(y.Root)
assert pydictx == pydicty
# the python-pil version 2.3.0-1ubuntu3 in Ubuntu does not have the
# close() method
try:
orig_img.close()
except AttributeError:
pass
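# Illustrative sketch (not part of the test suite): the page size arithmetic
# that test_general checks against /MediaBox. The helper name _page_size_pt
# is hypothetical. As in the test, the dpi is rounded to an integer first so
# that values such as 72.009 do not produce fractional page sizes.
def _page_size_pt(size_px, dpi):
    dpi = (int(round(dpi[0])), int(round(dpi[1])))
    # one inch is 72 pt, so pt = 72 * px / dpi
    return (72.0 * size_px[0] / dpi[0], 72.0 * size_px[1] / dpi[1])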
def test_return_engine_doc(tmp_path_factory):
inputf = os.path.join(os.path.dirname(__file__), "tests", "input", "normal.jpg")
outputf = tmp_path_factory.mktemp("return_engine_doc") / "normal.jpg.pdf"
pdf_wrapper = img2pdf.convert_to_docobject(inputf, engine=img2pdf.Engine.pikepdf)
pdf = pdf_wrapper.writer
assert isinstance(pdf, pikepdf.Pdf)
pdf.save(outputf, min_version=pdf_wrapper.output_version, linearize=True)
assert os.path.isfile(outputf)


def main():
    normal16 = alpha_value()[:, :, 0:3]
    pathlib.Path("test.icc").write_bytes(icc_profile())
    write_png(
        normal16 / 0xFFFF * 0xFF,
        "icc.png",
        8,
        2,
        iccp="test.icc",
    )
    write_png(
        normal16 / 0xFFFF * 0xFF,
        "normal.png",
        8,
        2,
    )


if __name__ == "__main__":
    main()
././@PaxHeader 0000000 0000000 0000000 00000000026 00000000000 010213 x ustar 00 22 mtime=1723007641.0
img2pdf-0.6.1/src/jp2.py 0000644 0001750 0001750 00000012255 14654601231 013747 0 ustar 00josch josch
#!/usr/bin/env python
#
# Copyright (C) 2013 Johannes Schauer Marin Rodrigues
#
# this module is heavily based upon jpylyzer which is
# KB / National Library of the Netherlands, Open Planets Foundation
# and released under the same license conditions
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

import struct


def getBox(data, byteStart, noBytes):
    boxLengthValue = struct.unpack(">I", data[byteStart : byteStart + 4])[0]
    boxType = data[byteStart + 4 : byteStart + 8]
    contentsStartOffset = 8
    if boxLengthValue == 1:
        # a length of 1 means the real length follows as a 64 bit integer
        boxLengthValue = struct.unpack(">Q", data[byteStart + 8 : byteStart + 16])[0]
        contentsStartOffset = 16
    if boxLengthValue == 0:
        # a length of 0 means the box extends to the end of the data
        boxLengthValue = noBytes - byteStart
    byteEnd = byteStart + boxLengthValue
    boxContents = data[byteStart + contentsStartOffset : byteEnd]
    return (boxLengthValue, boxType, byteEnd, boxContents)


def parse_ihdr(data):
    height, width, channels, bpp = struct.unpack(">IIHB", data[:11])
    return width, height, channels, bpp + 1


def parse_colr(data):
    meth = struct.unpack(">B", data[0:1])[0]
    if meth != 1:
        raise Exception("only enumerated color method supported")
    enumCS = struct.unpack(">I", data[3:])[0]
    if enumCS == 16:
        return "RGB"
    elif enumCS == 17:
        return "L"
    else:
        raise Exception(
            "only sRGB and greyscale color space is supported, got %d" % enumCS
        )
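

def parse_resc(data):
    # The capture resolution box stores the vertical numerator and
    # denominator first, then the horizontal ones, followed by the vertical
    # and horizontal exponents, which are signed 8 bit integers.
    vnum, vden, hnum, hden, vexp, hexp = struct.unpack(">HHHHbb", data)
    # The stored resolution is in grid points per metre and one inch is
    # 0.0254 metres.
    hdpi = (hnum / hden) * (10**hexp) * 0.0254
    vdpi = (vnum / vden) * (10**vexp) * 0.0254
    return hdpi, vdpi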


def parse_res(data):
    hdpi, vdpi = None, None
    noBytes = len(data)
    byteStart = 0
    boxLengthValue = 1  # dummy value for while loop condition
    while byteStart < noBytes and boxLengthValue != 0:
        boxLengthValue, boxType, byteEnd, boxContents = getBox(data, byteStart, noBytes)
        if boxType == b"resc":
            hdpi, vdpi = parse_resc(boxContents)
            break
        # advance to the next box, otherwise a leading box that is not
        # "resc" would make this loop never terminate
        byteStart = byteEnd
    return hdpi, vdpi


def parse_jp2h(data):
    width, height, colorspace, hdpi, vdpi = None, None, None, None, None
    # initialized so that a header without an ihdr box does not trigger an
    # UnboundLocalError in the return statement
    channels, bpp = None, None
    noBytes = len(data)
    byteStart = 0
    boxLengthValue = 1  # dummy value for while loop condition
    while byteStart < noBytes and boxLengthValue != 0:
        boxLengthValue, boxType, byteEnd, boxContents = getBox(data, byteStart, noBytes)
        if boxType == b"ihdr":
            width, height, channels, bpp = parse_ihdr(boxContents)
        elif boxType == b"colr":
            colorspace = parse_colr(boxContents)
        elif boxType == b"res ":
            hdpi, vdpi = parse_res(boxContents)
        byteStart = byteEnd
    return (width, height, colorspace, hdpi, vdpi, channels, bpp)


def parsejp2(data):
    noBytes = len(data)
    byteStart = 0
    boxLengthValue = 1  # dummy value for while loop condition
    width, height, colorspace, hdpi, vdpi = None, None, None, None, None
    while byteStart < noBytes and boxLengthValue != 0:
        boxLengthValue, boxType, byteEnd, boxContents = getBox(data, byteStart, noBytes)
        if boxType == b"jp2h":
            width, height, colorspace, hdpi, vdpi, channels, bpp = parse_jp2h(
                boxContents
            )
            break
        byteStart = byteEnd
    if not width:
        raise Exception("no width in jp2 header")
    if not height:
        raise Exception("no height in jp2 header")
    if not colorspace:
        raise Exception("no colorspace in jp2 header")
    # retrieving the dpi is optional so we do not error out if not present
    return (width, height, colorspace, hdpi, vdpi, channels, bpp)


def parsej2k(data):
    lsiz, rsiz, xsiz, ysiz, xosiz, yosiz, _, _, _, _, csiz = struct.unpack(
        ">HHIIIIIIIIH", data[4:42]
    )
    ssiz = [None] * csiz
    xrsiz = [None] * csiz
    yrsiz = [None] * csiz
    for i in range(csiz):
        ssiz[i], xrsiz[i], yrsiz[i] = struct.unpack(
            "BBB", data[42 + 3 * i : 42 + 3 * (i + 1)]
        )
    assert ssiz == [7, 7, 7]
    return xsiz - xosiz, ysiz - yosiz, None, None, None, csiz, 8


def parse(data):
    # a raw codestream starts with the SOC and SIZ markers
    if data[:4] == b"\xff\x4f\xff\x51":
        return parsej2k(data)
    else:
        return parsejp2(data)


if __name__ == "__main__":
    import sys

    # read the input via a context manager so the file handle gets closed
    with open(sys.argv[1], "rb") as f:
        width, height, colorspace, hdpi, vdpi, channels, bpp = parse(f.read())
    print("width = %d" % width)
    print("height = %d" % height)
    print("colorspace = %s" % colorspace)
    print("hdpi = %s" % hdpi)
    print("vdpi = %s" % vdpi)
    print("channels = %s" % channels)
    print("bpp = %s" % bpp)
././@PaxHeader 0000000 0000000 0000000 00000000033 00000000000 010211 x ustar 00 27 mtime=1745772903.502961
img2pdf-0.6.1/src/tests/ 0000755 0001750 0001750 00000000000 15003460550 014033 5 ustar 00josch josch ././@PaxHeader 0000000 0000000 0000000 00000000034 00000000000 010212 x ustar 00 28 mtime=1745772903.5229614
img2pdf-0.6.1/src/tests/input/ 0000755 0001750 0001750 00000000000 15003460550 015172 5 ustar 00josch josch ././@PaxHeader 0000000 0000000 0000000 00000000026 00000000000 010213 x ustar 00 22 mtime=1705430659.0
img2pdf-0.6.1/src/tests/input/CMYK.jpg 0000644 0001750 0001750 00000011264 14551547203 016453 0 ustar 00josch josch