././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1745772903.5349615 img2pdf-0.6.1/0000755000175000017500000000000015003460550012102 5ustar00joschjosch././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1745772831.0 img2pdf-0.6.1/CHANGES.rst0000644000175000017500000001370515003460437013716 0ustar00joschjosch======= CHANGES ======= 0.6.1 (2025-04-27) ------------------ - testsuite fixes 0.6.0 (2025-02-15) ------------------ - Add support for JBIG2 (generic coding) - Add convert_to_docobject() broken out from convert() - Add pil_get_dpi() broken out from get_imgmetadata() 0.5.1 (2023-11-26) ------------------ - no default ICC profile location for PDF/A-1b on Windows - workaround for PNG input without dpi units but non-square dpi aspect ratio 0.5.0 (2023-10-28) ------------------ - support MIFF for 16 bit CMYK input - accept pathlib.Path objects as input - don't store RGB ICC profiles from bilevel or grayscale TIFF, PNG and JPEG - thumbnails are no longer included by default and --include-thumbnails has to be used if you want them - support for pikepdf (>= 6.2.0) 0.4.4 (2022-04-07) ------------------ - --viewer-page-layout support for twopageright and twopageleft - Add B and JB paper sizes - support for pikepdf (>= 5.0.0) and Pillow (>= 9.1.0) 0.4.3 (2021-10-24) ------------------ - fix --viewer-initial-page (broken in last release) 0.4.2 (2021-10-11) ------------------ - add --rotation - allow palette PNG images with ICC profile - sort globbing result on windows - convert 8-bit PNG alpha channels to /SMasks in PDF - remove pdfrw from tests 0.4.1 (2021-05-09) ------------------ - support wildcards in paths on windows - support MPO images - fix page border computation - use "img2pdf" logger instead of "root" logger - add --from-file 0.4.0 (2020-08-07) ------------------ - replace --without-pdfrw by --engine=internal or --engine=pdfrw - add pikepdf as additional rendering engine and add --engine=pikepdf - 
support for creating PDF/A-1b compliant PDF using the --pdfa option (this also requires the presence of an ICC profile somewhere on the system) - support for images with embedded ICC profile as input - rewrite tests * use pytest via tox * use pikepdf instead of pdfrw * use imagemagick json output instead of identify -verbose - format all code with black 0.3.6 (2020-04-30) ------------------ - fix tests for Fedora on arm64 0.3.5 (2020-04-28) ------------------ - remove all Python 2 support - disable pdfrw by default 0.3.4 (2020-04-05) ------------------ - test.sh: replace imagemagick with custom python script to produce bit-by-bit identical results on all architectures - add --crop-border, --bleed-border, --trim-border and --art-border options - first draft of a rudimentary tkinter gui (run with --gui) 0.3.3 (2019-01-07) ------------------ - restore basic support for Python 2 - also ship test.sh - add legal and tabloid paper formats - respect exif rotation tag 0.3.2 (2018-11-20) ------------------ - support big endian TIFF with lsb-to-msb FillOrder - support multipage CCITT Group 4 TIFF - also reject palette images with transparency - support PNG images with 1, 2, 4 or 16 bits per sample - support multipage TIFF with differently encoded images - support CCITT Group4 TIFF without rows-per-strip - add extensive test suite 0.3.1 (2018-08-04) ------------------ - Directly copy data from CCITT Group 4 encoded TIFF images into the PDF container without re-encoding 0.3.0 (2018-06-18) ------------------ - Store non-jpeg images using PNG compression - Support arbitrarily large pages via PDF /UserUnit field - Disallow input with alpha channel as it cannot be preserved - Add option --pillow-limit-break to support very large input 0.2.4 (2017-05-23) ------------------ - Restore support for Python 2.7 - Add support for PyPy - Add support for testing using tox 0.2.3 (2017-01-20) ------------------ - version number bump for botched pypi upload... 
0.2.2 (2017-01-20) ------------------ - automatic monochrome CCITT Group4 encoding via Pillow/libtiff 0.2.1 (2016-05-04) ------------------ - set img2pdf as /producer value - support multi-frame images like multipage TIFF and animated GIF - support for palette images like GIF - support all colorspaces and imageformats known by PIL - read horizontal and vertical dpi from JPEG2000 files 0.2.0 (2015-05-10) ------------------ - now Python3 only - pep8 compliant code - update my email to josch@mister-muffin.de - move from github to gitlab.mister-muffin.de/josch/img2pdf - use logging module - add extensive test suite - ability to read from standard input - pdf writer: - make more compatible with the interface of pdfrw module - print floats which equal to their integer conversion as integer - do not print trailing zeroes for floating point numbers - print more linebreaks - add binary string at beginning of PDF to indicate that the PDF contains binary data - handle datetime and unicode strings by using utf-16-be encoding - new options (see --help for more details): - --without-pdfrw - --imgsize - --border - --fit - --auto-orient - --viewer-panes - --viewer-initial-page - --viewer-magnification - --viewer-page-layout - --viewer-fit-window - --viewer-center-window - --viewer-fullscreen - remove short options for metadata command line arguments - correctly encode and escape non-ascii metadata - explicitly store date in UTC and allow parsing all date formats understood by dateutil and `date --date` 0.1.5 (2015-02-16) ------------------ - Enable support for CMYK images - Rework test suite - support file objects as input 0.1.4 (2015-01-21) ------------------ - add Python 3 support - make output reproducible by sorting and --nodate option 0.1.3 (2014-11-10) ------------------ - Avoid leaking file descriptors - Convert unrecognized colorspaces to RGB 0.1.1 (2014-09-07) ------------------ - allow running src/img2pdf.py standalone - license change from GPL to LGPL - Add pillow 2.4.0 
support - add options to specify pdf dimensions in points 0.1.0 (2014-03-14, unreleased) ------------------ - Initial PyPI release. - Modified code to create proper package. - Added tests. - Added console script entry point. ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1705430659.0 img2pdf-0.6.1/LICENSE0000644000175000017500000001674414551547203013133 0ustar00joschjosch GNU LESSER GENERAL PUBLIC LICENSE Version 3, 29 June 2007 Copyright (C) 2007 Free Software Foundation, Inc. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. This version of the GNU Lesser General Public License incorporates the terms and conditions of version 3 of the GNU General Public License, supplemented by the additional permissions listed below. 0. Additional Definitions. As used herein, "this License" refers to version 3 of the GNU Lesser General Public License, and the "GNU GPL" refers to version 3 of the GNU General Public License. "The Library" refers to a covered work governed by this License, other than an Application or a Combined Work as defined below. An "Application" is any work that makes use of an interface provided by the Library, but which is not otherwise based on the Library. Defining a subclass of a class defined by the Library is deemed a mode of using an interface provided by the Library. A "Combined Work" is a work produced by combining or linking an Application with the Library. The particular version of the Library with which the Combined Work was made is also called the "Linked Version". The "Minimal Corresponding Source" for a Combined Work means the Corresponding Source for the Combined Work, excluding any source code for portions of the Combined Work that, considered in isolation, are based on the Application, and not on the Linked Version. 
The "Corresponding Application Code" for a Combined Work means the object code and/or source code for the Application, including any data and utility programs needed for reproducing the Combined Work from the Application, but excluding the System Libraries of the Combined Work. 1. Exception to Section 3 of the GNU GPL. You may convey a covered work under sections 3 and 4 of this License without being bound by section 3 of the GNU GPL. 2. Conveying Modified Versions. If you modify a copy of the Library, and, in your modifications, a facility refers to a function or data to be supplied by an Application that uses the facility (other than as an argument passed when the facility is invoked), then you may convey a copy of the modified version: a) under this License, provided that you make a good faith effort to ensure that, in the event an Application does not supply the function or data, the facility still operates, and performs whatever part of its purpose remains meaningful, or b) under the GNU GPL, with none of the additional permissions of this License applicable to that copy. 3. Object Code Incorporating Material from Library Header Files. The object code form of an Application may incorporate material from a header file that is part of the Library. You may convey such object code under terms of your choice, provided that, if the incorporated material is not limited to numerical parameters, data structure layouts and accessors, or small macros, inline functions and templates (ten or fewer lines in length), you do both of the following: a) Give prominent notice with each copy of the object code that the Library is used in it and that the Library and its use are covered by this License. b) Accompany the object code with a copy of the GNU GPL and this license document. 4. Combined Works. 
You may convey a Combined Work under terms of your choice that, taken together, effectively do not restrict modification of the portions of the Library contained in the Combined Work and reverse engineering for debugging such modifications, if you also do each of the following: a) Give prominent notice with each copy of the Combined Work that the Library is used in it and that the Library and its use are covered by this License. b) Accompany the Combined Work with a copy of the GNU GPL and this license document. c) For a Combined Work that displays copyright notices during execution, include the copyright notice for the Library among these notices, as well as a reference directing the user to the copies of the GNU GPL and this license document. d) Do one of the following: 0) Convey the Minimal Corresponding Source under the terms of this License, and the Corresponding Application Code in a form suitable for, and under terms that permit, the user to recombine or relink the Application with a modified version of the Linked Version to produce a modified Combined Work, in the manner specified by section 6 of the GNU GPL for conveying Corresponding Source. 1) Use a suitable shared library mechanism for linking with the Library. A suitable mechanism is one that (a) uses at run time a copy of the Library already present on the user's computer system, and (b) will operate properly with a modified version of the Library that is interface-compatible with the Linked Version. e) Provide Installation Information, but only if you would otherwise be required to provide such information under section 6 of the GNU GPL, and only to the extent that such information is necessary to install and execute a modified version of the Combined Work produced by recombining or relinking the Application with a modified version of the Linked Version. (If you use option 4d0, the Installation Information must accompany the Minimal Corresponding Source and Corresponding Application Code. 
If you use option 4d1, you must provide the Installation Information in the manner specified by section 6 of the GNU GPL for conveying Corresponding Source.) 5. Combined Libraries. You may place library facilities that are a work based on the Library side by side in a single library together with other library facilities that are not Applications and are not covered by this License, and convey such a combined library under terms of your choice, if you do both of the following: a) Accompany the combined library with a copy of the same work based on the Library, uncombined with any other library facilities, conveyed under the terms of this License. b) Give prominent notice with the combined library that part of it is a work based on the Library, and explaining where to find the accompanying uncombined form of the same work. 6. Revised Versions of the GNU Lesser General Public License. The Free Software Foundation may publish revised and/or new versions of the GNU Lesser General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Library as you received it specifies that a certain numbered version of the GNU Lesser General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that published version or of any later version published by the Free Software Foundation. If the Library as you received it does not specify a version number of the GNU Lesser General Public License, you may choose any version of the GNU Lesser General Public License ever published by the Free Software Foundation. 
If the Library as you received it specifies that a proxy can decide whether future versions of the GNU Lesser General Public License shall apply, that proxy's public statement of acceptance of any version is permanent authorization for you to choose that version for the Library. ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1705430659.0 img2pdf-0.6.1/MANIFEST.in0000644000175000017500000000042414551547203013650 0ustar00joschjoschinclude README.md include test_comp.sh include test.sh include magick.py include CHANGES.rst include LICENSE recursive-include src *.jpg recursive-include src *.pdf recursive-include src *.png recursive-include src *.tif recursive-include src *.gif recursive-include src *.py ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1745772903.5349615 img2pdf-0.6.1/PKG-INFO0000644000175000017500000003262215003460550013204 0ustar00joschjoschMetadata-Version: 2.1 Name: img2pdf Version: 0.6.1 Summary: Convert images to PDF via direct JPEG inclusion. 
Home-page: https://gitlab.mister-muffin.de/josch/img2pdf Download-URL: https://gitlab.mister-muffin.de/josch/img2pdf/repository/archive.tar.gz?ref=0.6.1 Author: Johannes Schauer Marin Rodrigues Author-email: josch@mister-muffin.de License: LGPL Keywords: jpeg pdf converter Classifier: Development Status :: 5 - Production/Stable Classifier: Intended Audience :: Developers Classifier: Intended Audience :: Other Audience Classifier: Environment :: Console Classifier: Programming Language :: Python Classifier: Programming Language :: Python :: 3 Classifier: Programming Language :: Python :: 3.5 Classifier: Programming Language :: Python :: Implementation :: CPython Classifier: Programming Language :: Python :: Implementation :: PyPy Classifier: License :: OSI Approved :: GNU Lesser General Public License v3 (LGPLv3) Classifier: Natural Language :: English Classifier: Operating System :: OS Independent Description-Content-Type: text/markdown Provides-Extra: gui License-File: LICENSE [![Travis Status](https://travis-ci.com/josch/img2pdf.svg?branch=main)](https://app.travis-ci.com/josch/img2pdf) [![Appveyor Status](https://ci.appveyor.com/api/projects/status/2kws3wkqvi526llj/branch/main?svg=true)](https://ci.appveyor.com/project/josch/img2pdf/branch/main) img2pdf ======= Lossless conversion of raster images to PDF. You should use img2pdf if your priorities are (in this order): 1. **always lossless**: the image embedded in the PDF will always have the exact same color information for every pixel as the input 2. **small**: if possible, the difference in filesize between the input image and the output PDF will only be the overhead of the PDF container itself 3. **fast**: if possible, the input image is just pasted into the PDF document as-is without any CPU hungry re-encoding of the pixel data Conventional conversion software (like ImageMagick) would either: 1. not be lossless because lossy re-encoding to JPEG 2. 
not be small because using wasteful flate encoding of raw pixel data
 3. not be fast because input data gets re-encoded

Another advantage of not having to re-encode the input (in most common situations) is that img2pdf is able to handle much larger input than other software, because the raw pixel data never has to be loaded into memory.

The following table shows how img2pdf handles different input depending on the input file format and image color space.

| Format                                | Colorspace                           | Result        |
| ------------------------------------- | ------------------------------------ | ------------- |
| JPEG                                  | any                                  | direct        |
| JPEG2000                              | any                                  | direct        |
| PNG (non-interlaced, no transparency) | any                                  | direct        |
| TIFF (CCITT Group 4)                  | 1-bit monochrome                     | direct        |
| JBIG2 (single-page generic coding)    | 1-bit monochrome                     | direct        |
| any                                   | any except CMYK and 1-bit monochrome | PNG Paeth     |
| any                                   | 1-bit monochrome                     | CCITT Group 4 |
| any                                   | CMYK                                 | flate         |

For JPEG, JPEG2000, non-interlaced PNG, TIFF images with CCITT Group 4 encoded data, and JBIG2 with single-page generic coding (e.g. using `jbig2enc`), img2pdf directly embeds the image data into the PDF without re-encoding it. It thus treats the PDF format merely as a container format for the image data. In these cases, img2pdf only increases the filesize by the size of the PDF container (typically around 500 to 700 bytes). Since data is only copied and not re-encoded, img2pdf is also typically faster than other solutions for these input formats.

For all other input types, img2pdf first has to transform the pixel data to make it compatible with PDF. In most cases, the PNG Paeth filter is applied to the pixel data. For 1-bit monochrome input, CCITT Group 4 is used instead. Only for CMYK input is no filter applied before the final flate compression.

Usage
-----

The images must be provided as files because img2pdf needs to seek in the file descriptor.
If no output file is specified with the `-o`/`--output` option, the output is written to stdout. A typical invocation is:

    $ img2pdf img1.png img2.jpg -o out.pdf

The detailed documentation can be accessed by running:

    $ img2pdf --help

With no command line arguments supplied, img2pdf will read a single image from standard input and write the resulting PDF to standard output. Here is an example of how to scan directly to PDF using scanimage(1) from SANE:

    $ scanimage --mode=Color --resolution=300 | pnmtojpeg -quality 90 | img2pdf > scan.pdf

Bugs
----

- If you find a JPEG, JPEG2000, PNG or CCITT Group 4 encoded TIFF file that, when embedded into the PDF, cannot be read by Adobe Acrobat Reader, please contact me.

- An error is produced if the input image is broken. This commonly happens if the input image has an invalid EXIF Orientation value of zero. Even though only eight different values from 1 to 8 are permitted, Android phones and Canon DSLR cameras produce JPEG images with the invalid value of zero. Either fix your input images with `exiftool` or similar software before passing the JPEG to `img2pdf`, or run `img2pdf` with `--rotation=ifvalid` (if you run img2pdf from the command line), or pass `rotation=img2pdf.Rotation.ifvalid` as an argument to `convert()` when using img2pdf as a library.

- img2pdf uses PIL (or Pillow) to obtain image metadata and to convert the input if necessary. To prevent decompression bomb denial of service attacks, Pillow limits the maximum number of pixels an input image is allowed to have. If you are sure that you know what you are doing, then you can disable this safeguard by passing the `--pillow-limit-break` option to img2pdf. This allows one to process even very large input images.

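When img2pdf is driven from a script, the two workarounds described in the Bugs list can be switched on per call. The following is only a minimal sketch, assuming an `img2pdf` executable is available on `PATH`. The helper names `build_img2pdf_argv` and `run_img2pdf` and their parameters are hypothetical and not part of img2pdf; only the `-o`, `--rotation=ifvalid` and `--pillow-limit-break` options are taken from this document.

```python
import subprocess


def build_img2pdf_argv(images, output, ignore_invalid_rotation=False,
                       lift_pixel_limit=False):
    """Build an img2pdf command line (hypothetical helper, not part of img2pdf)."""
    argv = ["img2pdf"]
    if ignore_invalid_rotation:
        # Ignore invalid EXIF Orientation values instead of failing.
        argv.append("--rotation=ifvalid")
    if lift_pixel_limit:
        # Disable Pillow's decompression bomb safeguard for very large input.
        argv.append("--pillow-limit-break")
    argv.extend(str(image) for image in images)
    argv.extend(["-o", str(output)])
    return argv


def run_img2pdf(images, output, **options):
    """Run img2pdf; raises subprocess.CalledProcessError if it exits non-zero."""
    subprocess.run(build_img2pdf_argv(images, output, **options), check=True)
```

A call such as `run_img2pdf(["phone.jpg"], "out.pdf", ignore_invalid_rotation=True)` would then tolerate a JPEG whose Orientation tag is zero. Building the argument list separately from running it keeps the option handling easy to inspect.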
Installation
------------

On Debian- and Ubuntu-based systems, img2pdf can be installed from the official repositories:

    $ apt install img2pdf

If you want to install it using pip, you can run:

    $ pip3 install img2pdf

If you prefer to install from source code, use:

    $ cd img2pdf/
    $ pip3 install .

To test the console script without installing the package on your system, use virtualenv:

    $ cd img2pdf/
    $ virtualenv ve
    $ ve/bin/pip3 install .

You can then test the converter using:

    $ ve/bin/img2pdf -o test.pdf src/tests/test.jpg

If you don't want to set up Python on Windows, then head to the [releases](https://gitlab.mister-muffin.de/josch/img2pdf/releases) section and download the latest `img2pdf.exe`.

GUI
---

There exists an experimental GUI with all settings currently disabled. You can directly convert images to PDF but you cannot set any options via the GUI yet. If you are interested in adding more features to the PDF, please submit a merge request. The GUI is based on tkinter and works on Linux, Windows and macOS.

![](screenshot.png)

Library
-------

The package can also be used as a library:

    import os
    import pathlib

    import img2pdf

    # opening from filename
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg'))

    # opening from file handle (the image must be opened in binary mode)
    with open("name.pdf","wb") as f1, open("test.jpg","rb") as f2:
        f1.write(img2pdf.convert(f2))

    # opening using pathlib
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(pathlib.Path('test.jpg')))

    # using in-memory image data (bytes, abbreviated here)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(b"\x89PNG..."))

    # multiple inputs (variant 1)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert("test1.jpg", "test2.png"))

    # multiple inputs (variant 2)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(["test1.jpg", "test2.png"]))

    # convert all files ending in .jpg inside a directory
    dirname = "/path/to/images"
    imgs = []
    for fname in os.listdir(dirname):
        if not fname.endswith(".jpg"):
            continue
        path = os.path.join(dirname, fname)
        if os.path.isdir(path):
            continue
        imgs.append(path)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(imgs))

    # convert all files ending in .jpg in a directory and its subdirectories
    dirname = "/path/to/images"
    imgs = []
    for r, _, f in os.walk(dirname):
        for fname in f:
            if not fname.endswith(".jpg"):
                continue
            imgs.append(os.path.join(r, fname))
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(imgs))

    # convert all files matching a glob
    import glob
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(glob.glob("/path/to/*.jpg")))

    # convert all files matching a glob using pathlib.Path
    from pathlib import Path
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(*Path("/path").glob("**/*.jpg")))

    # ignore invalid rotation values in the input images
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', rotation=img2pdf.Rotation.ifvalid))

    # writing to file descriptor
    with open("name.pdf","wb") as f1, open("test.jpg","rb") as f2:
        img2pdf.convert(f2, outputstream=f1)

    # specify paper size (A4)
    a4inpt = (img2pdf.mm_to_pt(210),img2pdf.mm_to_pt(297))
    layout_fun = img2pdf.get_layout_fun(a4inpt)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', layout_fun=layout_fun))

    # use a fixed dpi of 300 instead of reading it from the image
    dpix = dpiy = 300
    layout_fun = img2pdf.get_fixed_dpi_layout_fun((dpix, dpiy))
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', layout_fun=layout_fun))

    # create a PDF/A-1b compliant document by passing an ICC profile
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', pdfa="/usr/share/color/icc/sRGB.icc"))

Comparison to ImageMagick
-------------------------

Create a large test image:

    $ convert logo: -resize 8000x original.jpg

Convert it into PDF using ImageMagick and img2pdf:

    $ time img2pdf original.jpg -o img2pdf.pdf
    $ time convert original.jpg imagemagick.pdf

Notice how ImageMagick took an order of magnitude longer to do the conversion than img2pdf. It also used twice the memory.

Now extract the image data from both PDF documents and compare it to the original:

    $ pdfimages -all img2pdf.pdf tmp
    $ compare -metric AE original.jpg tmp-000.jpg null:
    0
    $ pdfimages -all imagemagick.pdf tmp
    $ compare -metric AE original.jpg tmp-000.jpg null:
    118716

To get lossless output with ImageMagick we can use Zip compression, but that unnecessarily increases the size of the output:

    $ convert original.jpg -compress Zip imagemagick.pdf
    $ pdfimages -all imagemagick.pdf tmp
    $ compare -metric AE original.jpg tmp-000.png null:
    0
    $ stat --format="%s %n" original.jpg img2pdf.pdf imagemagick.pdf
    1535837 original.jpg
    1536683 img2pdf.pdf
    9397809 imagemagick.pdf

Comparison to pdfLaTeX
----------------------

pdfLaTeX performs a lossless conversion from included images to PDF by default. If the input is a JPEG, then it simply embeds the JPEG into the PDF in the same way as img2pdf does it.
But for other image formats it uses flate compression of the plain pixel data and thus needlessly increases the output file size:

    $ convert logo: -resize 8000x original.png
    $ cat << END > pdflatex.tex
    \documentclass{article}
    \usepackage{graphicx}
    \begin{document}
    \includegraphics{original.png}
    \end{document}
    END
    $ pdflatex pdflatex.tex
    $ stat --format="%s %n" original.png pdflatex.pdf
    4500182 original.png
    9318120 pdflatex.pdf

Comparison to podofoimg2pdf
---------------------------

Like pdfLaTeX, podofoimg2pdf is able to perform a lossless conversion from JPEG to PDF by plainly embedding the JPEG data into the PDF container. But just like pdfLaTeX it uses flate compression for all other file formats, thus sometimes resulting in larger files than necessary.

    $ convert logo: -resize 8000x original.png
    $ podofoimg2pdf out.pdf original.png
    $ stat --format="%s %n" original.png out.pdf
    4500181 original.png
    9335629 out.pdf

It also only supports JPEG, PNG and TIF as input and lacks many of the convenience features of img2pdf like page sizes, borders, rotation and metadata.

Comparison to Tesseract OCR
---------------------------

Tesseract OCR comes closest to the functionality img2pdf provides. It is able to convert JPEG and PNG input to PDF without needlessly increasing the filesize and is at the same time lossless. So if your input consists of JPEG and PNG images, then you should safely be able to use Tesseract instead of img2pdf. For other input, Tesseract might not do a lossless conversion. For example, it converts CMYK input to RGB and removes the alpha channel from images with transparency. For multipage TIFF or animated GIF, it will only convert the first frame.

Comparison to econvert from ExactImage
--------------------------------------

Like pdfLaTeX and podofoimg2pdf, econvert is able to embed JPEG images into PDF directly without re-encoding, but when given other file formats, it stores them using only flate compression, which unnecessarily increases the filesize.
Furthermore, it throws an error with CMYK TIF input. It also doesn't store CMYK jpeg files as CMYK but converts them to RGB, so it's not lossless. When trying to feed it 16bit files, it errors out with Unhandled bps/spp combination. It also seems to choose JPEG encoding when using it on some file types (like palette images) making it again not lossless for that input as well. ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1742957678.0 img2pdf-0.6.1/README.md0000644000175000017500000003050314770666156013406 0ustar00joschjosch[![Travis Status](https://travis-ci.com/josch/img2pdf.svg?branch=main)](https://app.travis-ci.com/josch/img2pdf) [![Appveyor Status](https://ci.appveyor.com/api/projects/status/2kws3wkqvi526llj/branch/main?svg=true)](https://ci.appveyor.com/project/josch/img2pdf/branch/main) img2pdf ======= Lossless conversion of raster images to PDF. You should use img2pdf if your priorities are (in this order): 1. **always lossless**: the image embedded in the PDF will always have the exact same color information for every pixel as the input 2. **small**: if possible, the difference in filesize between the input image and the output PDF will only be the overhead of the PDF container itself 3. **fast**: if possible, the input image is just pasted into the PDF document as-is without any CPU hungry re-encoding of the pixel data Conventional conversion software (like ImageMagick) would either: 1. not be lossless because lossy re-encoding to JPEG 2. not be small because using wasteful flate encoding of raw pixel data 3. not be fast because input data gets re-encoded Another advantage of not having to re-encode the input (in most common situations) is, that img2pdf is able to handle much larger input than other software, because the raw pixel data never has to be loaded into memory. The following table shows how img2pdf handles different input depending on the input file format and image color space. 

| Format                                | Colorspace                           | Result        |
| ------------------------------------- | ------------------------------------ | ------------- |
| JPEG                                  | any                                  | direct        |
| JPEG2000                              | any                                  | direct        |
| PNG (non-interlaced, no transparency) | any                                  | direct        |
| TIFF (CCITT Group 4)                  | 1-bit monochrome                     | direct        |
| JBIG2 (single-page generic coding)    | 1-bit monochrome                     | direct        |
| any                                   | any except CMYK and 1-bit monochrome | PNG Paeth     |
| any                                   | 1-bit monochrome                     | CCITT Group 4 |
| any                                   | CMYK                                 | flate         |

For JPEG, JPEG2000, non-interlaced PNG, TIFF images with CCITT Group 4 encoded data, and JBIG2 with single-page generic coding (e.g. using `jbig2enc`), img2pdf directly embeds the image data into the PDF without re-encoding it. It thus treats the PDF format merely as a container format for the image data. In these cases, img2pdf only increases the filesize by the size of the PDF container (typically around 500 to 700 bytes). Since data is only copied and not re-encoded, img2pdf is also typically faster than other solutions for these input formats.

For all other input types, img2pdf first has to transform the pixel data to make it compatible with PDF. In most cases, the PNG Paeth filter is applied to the pixel data. For 1-bit monochrome input, CCITT Group 4 is used instead. Only for CMYK input is no filter applied before the final flate compression.

Usage
-----

The images must be provided as files because img2pdf needs to seek in the file descriptor. If no output file is specified with the `-o`/`--output` option, the output is written to stdout. A typical invocation is:

    $ img2pdf img1.png img2.jpg -o out.pdf

The detailed documentation can be accessed by running:

    $ img2pdf --help

With no command line arguments supplied, img2pdf will read a single image from standard input and write the resulting PDF to standard output.
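
The stdin-to-stdout mode can also be used from a script by piping the image bytes into the process. What follows is a rough sketch, assuming an `img2pdf` executable on `PATH`; the function name `image_to_pdf_via_stdin` and its `argv` parameter are hypothetical and exist only for this illustration.

```python
import subprocess


def image_to_pdf_via_stdin(image_bytes, argv=("img2pdf",)):
    """Pipe one image into a converter command and return its standard output.

    With the default argv this assumes img2pdf is installed and reads the
    image from standard input because no file arguments are given.
    """
    result = subprocess.run(
        list(argv),
        input=image_bytes,       # sent to the child's standard input
        stdout=subprocess.PIPE,  # the resulting PDF is collected here
        check=True,              # raise CalledProcessError on failure
    )
    return result.stdout
```

The returned bytes can be written to a file opened in binary mode. The `argv` parameter is there so that a different command, or extra options, can be substituted without changing the function.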
Here is an example of how to scan directly to PDF using scanimage(1) from SANE:

    $ scanimage --mode=Color --resolution=300 | pnmtojpeg -quality 90 | img2pdf > scan.pdf

Bugs
----

- If you find a JPEG, JPEG2000, PNG or CCITT Group 4 encoded TIFF file that, when embedded into the PDF, cannot be read by Adobe Acrobat Reader, please contact me.

- An error is produced if the input image is broken. This commonly happens if the input image has an invalid EXIF Orientation value of zero. Even though only eight different values from 1 to 8 are permitted, Android phones and Canon DSLR cameras produce JPEG images with the invalid value of zero. Either fix your input images with `exiftool` or similar software before passing the JPEG to `img2pdf`, or run `img2pdf` with `--rotation=ifvalid` (if you run img2pdf from the command line), or pass `rotation=img2pdf.Rotation.ifvalid` as an argument to `convert()` when using img2pdf as a library.

- img2pdf uses PIL (or Pillow) to obtain image metadata and to convert the input if necessary. To prevent decompression bomb denial of service attacks, Pillow limits the maximum number of pixels an input image is allowed to have. If you are sure that you know what you are doing, then you can disable this safeguard by passing the `--pillow-limit-break` option to img2pdf. This allows one to process even very large input images.

Installation
------------

On Debian- and Ubuntu-based systems, img2pdf can be installed from the official repositories:

    $ apt install img2pdf

If you want to install it using pip, you can run:

    $ pip3 install img2pdf

If you prefer to install from source code, use:

    $ cd img2pdf/
    $ pip3 install .

To test the console script without installing the package on your system, use virtualenv:

    $ cd img2pdf/
    $ virtualenv ve
    $ ve/bin/pip3 install .

You can then test the converter using:

    $ ve/bin/img2pdf -o test.pdf src/tests/test.jpg

If you don't want to set up Python on Windows, then head to the
[releases](https://gitlab.mister-muffin.de/josch/img2pdf/releases) section and
download the latest `img2pdf.exe`.

GUI
---

There exists an experimental GUI with all settings currently disabled. You can
directly convert images to PDF but you cannot set any options via the GUI yet.
If you are interested in adding more features to the GUI, please submit a merge
request. The GUI is based on tkinter and works on Linux, Windows and macOS.

![](screenshot.png)

Library
-------

The package can also be used as a library:

    import os
    import pathlib

    import img2pdf

    # opening from filename
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg'))

    # opening from file handle (the input must be opened in binary mode)
    with open("name.pdf","wb") as f1, open("test.jpg", "rb") as f2:
        f1.write(img2pdf.convert(f2))

    # opening using pathlib
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(pathlib.Path('test.jpg')))

    # using in-memory image data (a bytes object)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(b"\x89PNG..."))

    # multiple inputs (variant 1)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert("test1.jpg", "test2.png"))

    # multiple inputs (variant 2)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(["test1.jpg", "test2.png"]))

    # convert all files ending in .jpg inside a directory
    dirname = "/path/to/images"
    imgs = []
    for fname in os.listdir(dirname):
        if not fname.endswith(".jpg"):
            continue
        path = os.path.join(dirname, fname)
        if os.path.isdir(path):
            continue
        imgs.append(path)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(imgs))

    # convert all files ending in .jpg in a directory and its subdirectories
    dirname = "/path/to/images"
    imgs = []
    for r, _, f in os.walk(dirname):
        for fname in f:
            if not fname.endswith(".jpg"):
                continue
            imgs.append(os.path.join(r, fname))
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(imgs))

    # convert all files matching a glob
    import glob
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(glob.glob("/path/to/*.jpg")))

    # convert all files matching a glob using pathlib.Path
    from pathlib import Path
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(*Path("/path").glob("**/*.jpg")))

    # ignore invalid rotation values in the input images
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', rotation=img2pdf.Rotation.ifvalid))

    # writing to file descriptor
    with open("name.pdf","wb") as f1, open("test.jpg", "rb") as f2:
        img2pdf.convert(f2, outputstream=f1)

    # specify paper size (A4)
    a4inpt = (img2pdf.mm_to_pt(210),img2pdf.mm_to_pt(297))
    layout_fun = img2pdf.get_layout_fun(a4inpt)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', layout_fun=layout_fun))

    # use a fixed dpi of 300 instead of reading it from the image
    dpix = dpiy = 300
    layout_fun = img2pdf.get_fixed_dpi_layout_fun((dpix, dpiy))
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', layout_fun=layout_fun))

    # create a PDF/A-1b compliant document by passing an ICC profile
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', pdfa="/usr/share/color/icc/sRGB.icc"))

Comparison to ImageMagick
-------------------------

Create a large test image:

    $ convert logo: -resize 8000x original.jpg

Convert it into PDF using ImageMagick and img2pdf:

    $ time img2pdf original.jpg -o img2pdf.pdf
    $ time convert original.jpg imagemagick.pdf

Notice how ImageMagick took an order of magnitude longer to do the conversion
than img2pdf. It also used twice the memory.
Now extract the image data from both PDF documents and compare it to the
original:

    $ pdfimages -all img2pdf.pdf tmp
    $ compare -metric AE original.jpg tmp-000.jpg null:
    0
    $ pdfimages -all imagemagick.pdf tmp
    $ compare -metric AE original.jpg tmp-000.jpg null:
    118716

To get lossless output with ImageMagick we can use Zip compression but that
unnecessarily increases the size of the output:

    $ convert original.jpg -compress Zip imagemagick.pdf
    $ pdfimages -all imagemagick.pdf tmp
    $ compare -metric AE original.jpg tmp-000.png null:
    0
    $ stat --format="%s %n" original.jpg img2pdf.pdf imagemagick.pdf
    1535837 original.jpg
    1536683 img2pdf.pdf
    9397809 imagemagick.pdf

Comparison to pdfLaTeX
----------------------

pdfLaTeX performs a lossless conversion from included images to PDF by default.
If the input is a JPEG, then it simply embeds the JPEG into the PDF in the same
way as img2pdf does it. But for other image formats it uses flate compression
of the plain pixel data and thus needlessly increases the output file size:

    $ convert logo: -resize 8000x original.png
    $ cat << END > pdflatex.tex
    \documentclass{article}
    \usepackage{graphicx}
    \begin{document}
    \includegraphics{original.png}
    \end{document}
    END
    $ pdflatex pdflatex.tex
    $ stat --format="%s %n" original.png pdflatex.pdf
    4500182 original.png
    9318120 pdflatex.pdf

Comparison to podofoimg2pdf
---------------------------

Like pdfLaTeX, podofoimg2pdf is able to perform a lossless conversion from JPEG
to PDF by plainly embedding the JPEG data into the PDF container. But just like
pdfLaTeX it uses flate compression for all other file formats, thus sometimes
resulting in larger files than necessary.

    $ convert logo: -resize 8000x original.png
    $ podofoimg2pdf out.pdf original.png
    $ stat --format="%s %n" original.png out.pdf
    4500181 original.png
    9335629 out.pdf

It also only supports JPEG, PNG and TIFF as input and lacks many of the
convenience features of img2pdf like page sizes, borders, rotation and
metadata.
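
The file sizes reported in the comparisons above are easier to judge side by side. The following is a small arithmetic sketch that uses only the byte counts printed by `stat` in this README; the variable names and the `growth` helper are made up for illustration.

```python
# File sizes in bytes, copied from the stat output in the comparisons above.
jpeg_input = 1535837
img2pdf_output = 1536683
imagemagick_zip_output = 9397809
png_input_pdflatex = 4500182
pdflatex_output = 9318120
png_input_podofo = 4500181
podofo_output = 9335629

# img2pdf only wraps the unchanged JPEG data in a PDF container,
# so the difference is the size of that container.
img2pdf_overhead = img2pdf_output - jpeg_input
print("img2pdf overhead in bytes:", img2pdf_overhead)


def growth(output_size, input_size):
    """Factor by which the output is larger than the input."""
    return output_size / input_size


print("ImageMagick (Zip): %.2fx" % growth(imagemagick_zip_output, jpeg_input))
print("pdfLaTeX:          %.2fx" % growth(pdflatex_output, png_input_pdflatex))
print("podofoimg2pdf:     %.2fx" % growth(podofo_output, png_input_podofo))
```

The tools that store flate-compressed pixel data produce output that is a multiple of the input size, while the img2pdf output differs from its input by less than a kilobyte.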
Comparison to Tesseract OCR
---------------------------

Tesseract OCR comes closest to the functionality img2pdf provides. It is able
to convert JPEG and PNG input to PDF without needlessly increasing the filesize
and is at the same time lossless. So if your input consists of JPEG and PNG
images, then you should safely be able to use Tesseract instead of img2pdf. For
other input, Tesseract might not do a lossless conversion. For example it
converts CMYK input to RGB and removes the alpha channel from images with
transparency. For multipage TIFF or animated GIF, it will only convert the
first frame.

Comparison to econvert from ExactImage
--------------------------------------

Like pdfLaTeX and podofoimg2pdf, econvert is able to embed JPEG images into PDF
directly without re-encoding, but when given other file formats, it stores them
just using flate compression, which unnecessarily increases the filesize.
Furthermore, it throws an error with CMYK TIFF input. It also doesn't store CMYK
JPEG files as CMYK but converts them to RGB, so it's not lossless. When trying
to feed it 16-bit files, it errors out with `Unhandled bps/spp combination`. It
also seems to choose JPEG encoding when using it on some file types (like
palette images), making it not lossless for that input either.
././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1745772903.5349615 img2pdf-0.6.1/setup.cfg0000644000175000017500000000004615003460550013723 0ustar00joschjosch[egg_info] tag_build = tag_date = 0 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1745772836.0 img2pdf-0.6.1/setup.py0000644000175000017500000000325615003460444013624 0ustar00joschjoschimport sys from setuptools import setup VERSION = "0.6.1" INSTALL_REQUIRES = ( "Pillow", "pikepdf", ) setup( name="img2pdf", version=VERSION, author="Johannes Schauer Marin Rodrigues", author_email="josch@mister-muffin.de", description="Convert images to PDF via direct JPEG inclusion.", long_description=open("README.md").read(), long_description_content_type="text/markdown", license="LGPL", keywords="jpeg pdf converter", classifiers=[ "Development Status :: 5 - Production/Stable", "Intended Audience :: Developers", "Intended Audience :: Other Audience", "Environment :: Console", "Programming Language :: Python", "Programming Language :: Python :: 3", "Programming Language :: Python :: 3.5", "Programming Language :: Python :: Implementation :: CPython", "Programming Language :: Python :: Implementation :: PyPy", "License :: OSI Approved :: GNU Lesser General Public License v3 " "(LGPLv3)", "Natural Language :: English", "Operating System :: OS Independent", ], url="https://gitlab.mister-muffin.de/josch/img2pdf", download_url="https://gitlab.mister-muffin.de/josch/img2pdf/repository/" "archive.tar.gz?ref=" + VERSION, package_dir={"": "src"}, py_modules=["img2pdf", "jp2"], include_package_data=True, zip_safe=True, install_requires=INSTALL_REQUIRES, extras_require={ "gui": ("tkinter"), }, entry_points={ "setuptools.installation": ["eggsecutable = img2pdf:main"], "console_scripts": ["img2pdf = img2pdf:main"], "gui_scripts": ["img2pdf-gui = img2pdf:gui"], }, ) ././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1745772903.510961 
img2pdf-0.6.1/src/0000755000175000017500000000000015003460550012671 5ustar00joschjosch././@PaxHeader0000000000000000000000000000003300000000000010211 xustar0027 mtime=1745772903.514961 img2pdf-0.6.1/src/img2pdf.egg-info/0000755000175000017500000000000015003460550015713 5ustar00joschjosch././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1745772903.0 img2pdf-0.6.1/src/img2pdf.egg-info/PKG-INFO0000644000175000017500000003262215003460547017023 0ustar00joschjoschMetadata-Version: 2.1 Name: img2pdf Version: 0.6.1 Summary: Convert images to PDF via direct JPEG inclusion. Home-page: https://gitlab.mister-muffin.de/josch/img2pdf Download-URL: https://gitlab.mister-muffin.de/josch/img2pdf/repository/archive.tar.gz?ref=0.6.1 Author: Johannes Schauer Marin Rodrigues Author-email: josch@mister-muffin.de License: LGPL Keywords: jpeg pdf converter Classifier: Development Status :: 5 - Production/Stable Classifier: Intended Audience :: Developers Classifier: Intended Audience :: Other Audience Classifier: Environment :: Console Classifier: Programming Language :: Python Classifier: Programming Language :: Python :: 3 Classifier: Programming Language :: Python :: 3.5 Classifier: Programming Language :: Python :: Implementation :: CPython Classifier: Programming Language :: Python :: Implementation :: PyPy Classifier: License :: OSI Approved :: GNU Lesser General Public License v3 (LGPLv3) Classifier: Natural Language :: English Classifier: Operating System :: OS Independent Description-Content-Type: text/markdown Provides-Extra: gui License-File: LICENSE [![Travis Status](https://travis-ci.com/josch/img2pdf.svg?branch=main)](https://app.travis-ci.com/josch/img2pdf) [![Appveyor Status](https://ci.appveyor.com/api/projects/status/2kws3wkqvi526llj/branch/main?svg=true)](https://ci.appveyor.com/project/josch/img2pdf/branch/main) img2pdf ======= Lossless conversion of raster images to PDF. You should use img2pdf if your priorities are (in this order): 1. 
**always lossless**: the image embedded in the PDF will always have the
    exact same color information for every pixel as the input
 2. **small**: if possible, the difference in filesize between the input image
    and the output PDF will only be the overhead of the PDF container itself
 3. **fast**: if possible, the input image is just pasted into the PDF document
    as-is without any CPU hungry re-encoding of the pixel data

Conventional conversion software (like ImageMagick) would either:

 1. not be lossless because of lossy re-encoding to JPEG
 2. not be small because of wasteful flate encoding of raw pixel data
 3. not be fast because input data gets re-encoded

Another advantage of not having to re-encode the input (in most common
situations) is that img2pdf is able to handle much larger input than other
software, because the raw pixel data never has to be loaded into memory.

The following table shows how img2pdf handles different input depending on the
input file format and image color space.

| Format                                | Colorspace                           | Result        |
| ------------------------------------- | ------------------------------------ | ------------- |
| JPEG                                  | any                                  | direct        |
| JPEG2000                              | any                                  | direct        |
| PNG (non-interlaced, no transparency) | any                                  | direct        |
| TIFF (CCITT Group 4)                  | 1-bit monochrome                     | direct        |
| JBIG2 (single-page generic coding)    | 1-bit monochrome                     | direct        |
| any                                   | any except CMYK and 1-bit monochrome | PNG Paeth     |
| any                                   | 1-bit monochrome                     | CCITT Group 4 |
| any                                   | CMYK                                 | flate         |

For JPEG, JPEG2000, non-interlaced PNG, TIFF images with CCITT Group 4 encoded
data, and JBIG2 with single-page generic coding (e.g. using `jbig2enc`),
img2pdf directly embeds the image data into the PDF without re-encoding it. It
thus treats the PDF format merely as a container format for the image data. In
these cases, img2pdf only increases the filesize by the size of the PDF
container (typically around 500 to 700 bytes).
Since data is only copied and not re-encoded, img2pdf is also typically faster
than other solutions for these input formats.

For all other input types, img2pdf first has to transform the pixel data to
make it compatible with PDF. In most cases, the PNG Paeth filter is applied to
the pixel data. For 1-bit monochrome input, CCITT Group 4 is used instead. Only
for CMYK input no filter is applied before finally applying flate compression.

Usage
-----

The images must be provided as files because img2pdf needs to seek in the file
descriptor.

If no output file is specified with the `-o`/`--output` option, output will be
done to stdout. A typical invocation is:

    $ img2pdf img1.png img2.jpg -o out.pdf

The detailed documentation can be accessed by running:

    $ img2pdf --help

With no command line arguments supplied, img2pdf will read a single image from
standard input and write the resulting PDF to standard output. Here is an
example for how to scan directly to PDF using scanimage(1) from SANE:

    $ scanimage --mode=Color --resolution=300 | pnmtojpeg -quality 90 | img2pdf > scan.pdf

Bugs
----

 - If you find a JPEG, JPEG2000, PNG or CCITT Group 4 encoded TIFF file that,
   when embedded into the PDF, cannot be read by the Adobe Acrobat Reader,
   please contact me.

 - An error is produced if the input image is broken. This commonly happens if
   the input image has an invalid EXIF Orientation value of zero. Even though
   only eight different values from 1 to 8 are permitted, Android phones and
   Canon DSLR cameras produce JPEG images with the invalid value of zero.
   Either fix your input images with `exiftool` or similar software before
   passing the JPEG to `img2pdf`, or run `img2pdf` with `--rotation=ifvalid`
   (if you run img2pdf from the command line), or pass
   `rotation=img2pdf.Rotation.ifvalid` as an argument to `convert()` when using
   img2pdf as a library.

 - img2pdf uses PIL (or Pillow) to obtain image meta data and to convert the
   input if necessary. To prevent decompression bomb denial of service
   attacks, Pillow limits the maximum number of pixels an input image is
   allowed to have. If you are sure that you know what you are doing, then you
   can disable this safeguard by passing the `--pillow-limit-break` option to
   img2pdf. This allows one to process even very large input images.

Installation
------------

On Debian- and Ubuntu-based systems, img2pdf can be installed from the
official repositories:

    $ apt install img2pdf

If you want to install it using pip, you can run:

    $ pip3 install img2pdf

If you prefer to install from source code, use:

    $ cd img2pdf/
    $ pip3 install .

To test the console script without installing the package on your system,
use virtualenv:

    $ cd img2pdf/
    $ virtualenv ve
    $ ve/bin/pip3 install .

You can then test the converter using:

    $ ve/bin/img2pdf -o test.pdf src/tests/test.jpg

If you don't want to set up Python on Windows, then head to the
[releases](https://gitlab.mister-muffin.de/josch/img2pdf/releases) section and
download the latest `img2pdf.exe`.

GUI
---

There exists an experimental GUI with all settings currently disabled. You can
directly convert images to PDF but you cannot set any options via the GUI yet.
If you are interested in adding more features to the GUI, please submit a merge
request. The GUI is based on tkinter and works on Linux, Windows and macOS.
![](screenshot.png)

Library
-------

The package can also be used as a library:

    import os
    import pathlib

    import img2pdf

    # opening from filename
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg'))

    # opening from file handle (the input must be opened in binary mode)
    with open("name.pdf","wb") as f1, open("test.jpg", "rb") as f2:
        f1.write(img2pdf.convert(f2))

    # opening using pathlib
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(pathlib.Path('test.jpg')))

    # using in-memory image data (a bytes object)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(b"\x89PNG..."))

    # multiple inputs (variant 1)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert("test1.jpg", "test2.png"))

    # multiple inputs (variant 2)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(["test1.jpg", "test2.png"]))

    # convert all files ending in .jpg inside a directory
    dirname = "/path/to/images"
    imgs = []
    for fname in os.listdir(dirname):
        if not fname.endswith(".jpg"):
            continue
        path = os.path.join(dirname, fname)
        if os.path.isdir(path):
            continue
        imgs.append(path)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(imgs))

    # convert all files ending in .jpg in a directory and its subdirectories
    dirname = "/path/to/images"
    imgs = []
    for r, _, f in os.walk(dirname):
        for fname in f:
            if not fname.endswith(".jpg"):
                continue
            imgs.append(os.path.join(r, fname))
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(imgs))

    # convert all files matching a glob
    import glob
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(glob.glob("/path/to/*.jpg")))

    # convert all files matching a glob using pathlib.Path
    from pathlib import Path
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert(*Path("/path").glob("**/*.jpg")))

    # ignore invalid rotation values in the input images
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', rotation=img2pdf.Rotation.ifvalid))

    # writing to file descriptor
    with open("name.pdf","wb") as f1, open("test.jpg", "rb") as f2:
        img2pdf.convert(f2, outputstream=f1)

    # specify paper size (A4)
    a4inpt = (img2pdf.mm_to_pt(210),img2pdf.mm_to_pt(297))
    layout_fun = img2pdf.get_layout_fun(a4inpt)
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', layout_fun=layout_fun))

    # use a fixed dpi of 300 instead of reading it from the image
    dpix = dpiy = 300
    layout_fun = img2pdf.get_fixed_dpi_layout_fun((dpix, dpiy))
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', layout_fun=layout_fun))

    # create a PDF/A-1b compliant document by passing an ICC profile
    with open("name.pdf","wb") as f:
        f.write(img2pdf.convert('test.jpg', pdfa="/usr/share/color/icc/sRGB.icc"))

Comparison to ImageMagick
-------------------------

Create a large test image:

    $ convert logo: -resize 8000x original.jpg

Convert it into PDF using ImageMagick and img2pdf:

    $ time img2pdf original.jpg -o img2pdf.pdf
    $ time convert original.jpg imagemagick.pdf

Notice how ImageMagick took an order of magnitude longer to do the conversion
than img2pdf. It also used twice the memory.

Now extract the image data from both PDF documents and compare it to the
original:

    $ pdfimages -all img2pdf.pdf tmp
    $ compare -metric AE original.jpg tmp-000.jpg null:
    0
    $ pdfimages -all imagemagick.pdf tmp
    $ compare -metric AE original.jpg tmp-000.jpg null:
    118716

To get lossless output with ImageMagick we can use Zip compression but that
unnecessarily increases the size of the output:

    $ convert original.jpg -compress Zip imagemagick.pdf
    $ pdfimages -all imagemagick.pdf tmp
    $ compare -metric AE original.jpg tmp-000.png null:
    0
    $ stat --format="%s %n" original.jpg img2pdf.pdf imagemagick.pdf
    1535837 original.jpg
    1536683 img2pdf.pdf
    9397809 imagemagick.pdf

Comparison to pdfLaTeX
----------------------

pdfLaTeX performs a lossless conversion from included images to PDF by default.
If the input is a JPEG, then it simply embeds the JPEG into the PDF in the same
way as img2pdf does it.
But for other image formats it uses flate compression of the plain pixel data
and thus needlessly increases the output file size:

    $ convert logo: -resize 8000x original.png
    $ cat << END > pdflatex.tex
    \documentclass{article}
    \usepackage{graphicx}
    \begin{document}
    \includegraphics{original.png}
    \end{document}
    END
    $ pdflatex pdflatex.tex
    $ stat --format="%s %n" original.png pdflatex.pdf
    4500182 original.png
    9318120 pdflatex.pdf

Comparison to podofoimg2pdf
---------------------------

Like pdfLaTeX, podofoimg2pdf is able to perform a lossless conversion from JPEG
to PDF by plainly embedding the JPEG data into the PDF container. But just like
pdfLaTeX it uses flate compression for all other file formats, thus sometimes
resulting in larger files than necessary.

    $ convert logo: -resize 8000x original.png
    $ podofoimg2pdf out.pdf original.png
    $ stat --format="%s %n" original.png out.pdf
    4500181 original.png
    9335629 out.pdf

It also only supports JPEG, PNG and TIFF as input and lacks many of the
convenience features of img2pdf like page sizes, borders, rotation and
metadata.

Comparison to Tesseract OCR
---------------------------

Tesseract OCR comes closest to the functionality img2pdf provides. It is able
to convert JPEG and PNG input to PDF without needlessly increasing the filesize
and is at the same time lossless. So if your input consists of JPEG and PNG
images, then you should safely be able to use Tesseract instead of img2pdf. For
other input, Tesseract might not do a lossless conversion. For example it
converts CMYK input to RGB and removes the alpha channel from images with
transparency. For multipage TIFF or animated GIF, it will only convert the
first frame.

Comparison to econvert from ExactImage
--------------------------------------

Like pdfLaTeX and podofoimg2pdf, econvert is able to embed JPEG images into PDF
directly without re-encoding, but when given other file formats, it stores them
just using flate compression, which unnecessarily increases the filesize.
Furthermore, it throws an error with CMYK TIF input. It also doesn't store CMYK jpeg files as CMYK but converts them to RGB, so it's not lossless. When trying to feed it 16bit files, it errors out with Unhandled bps/spp combination. It also seems to choose JPEG encoding when using it on some file types (like palette images) making it again not lossless for that input as well. ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1745772903.0 img2pdf-0.6.1/src/img2pdf.egg-info/SOURCES.txt0000644000175000017500000000150715003460547017610 0ustar00joschjoschCHANGES.rst LICENSE MANIFEST.in README.md setup.py test_comp.sh src/img2pdf.py src/img2pdf_test.py src/jp2.py src/img2pdf.egg-info/PKG-INFO src/img2pdf.egg-info/SOURCES.txt src/img2pdf.egg-info/dependency_links.txt src/img2pdf.egg-info/entry_points.txt src/img2pdf.egg-info/requires.txt src/img2pdf.egg-info/top_level.txt src/img2pdf.egg-info/zip-safe src/tests/input/CMYK.jpg src/tests/input/CMYK.tif src/tests/input/animation.gif src/tests/input/gray.png src/tests/input/mono.png src/tests/input/mono.tif src/tests/input/normal.jpg src/tests/input/normal.png src/tests/output/CMYK.jpg.pdf src/tests/output/CMYK.tif.pdf src/tests/output/animation.gif.pdf src/tests/output/gray.png.pdf src/tests/output/mono.jb2.pdf src/tests/output/mono.png.pdf src/tests/output/mono.tif.pdf src/tests/output/normal.jpg.pdf src/tests/output/normal.png.pdf././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1745772903.0 img2pdf-0.6.1/src/img2pdf.egg-info/dependency_links.txt0000644000175000017500000000000115003460547021767 0ustar00joschjosch ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1745772903.0 img2pdf-0.6.1/src/img2pdf.egg-info/entry_points.txt0000644000175000017500000000021115003460547021211 0ustar00joschjosch[console_scripts] img2pdf = img2pdf:main [gui_scripts] img2pdf-gui = img2pdf:gui [setuptools.installation] eggsecutable = img2pdf:main 
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1745772903.0 img2pdf-0.6.1/src/img2pdf.egg-info/requires.txt0000644000175000017500000000003615003460547020320 0ustar00joschjoschPillow pikepdf [gui] tkinter ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1745772903.0 img2pdf-0.6.1/src/img2pdf.egg-info/top_level.txt0000644000175000017500000000001415003460547020446 0ustar00joschjoschimg2pdf jp2 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1698475203.0 img2pdf-0.6.1/src/img2pdf.egg-info/zip-safe0000644000175000017500000000000114517126303017347 0ustar00joschjosch ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1745772852.0 img2pdf-0.6.1/src/img2pdf.py0000755000175000017500000052040215003460464014605 0ustar00joschjosch#!/usr/bin/env python3 # -*- coding: utf-8 -*- # Copyright (C) 2012-2021 Johannes Schauer Marin Rodrigues # # This program is free software: you can redistribute it and/or # modify it under the terms of the GNU Lesser General Public # License as published by the Free Software Foundation, either # version 3 of the License, or (at your option) any later # version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public # License along with this program. If not, see # . import sys import os import zlib import argparse from PIL import Image, TiffImagePlugin, GifImagePlugin, ImageCms, ExifTags if hasattr(GifImagePlugin, "LoadingStrategy"): # Pillow 9.0.0 started emitting all frames but the first as RGB instead of # P to make sure that more than 256 colors can be represented. 
But palette # images compress far better than RGB images in PDF so we instruct Pillow # to only emit RGB frames if the palette differs and return P otherwise. # This works since Pillow 9.1.0. GifImagePlugin.LOADING_STRATEGY = ( GifImagePlugin.LoadingStrategy.RGB_AFTER_DIFFERENT_PALETTE_ONLY ) # TiffImagePlugin.DEBUG = True from PIL.ExifTags import TAGS from datetime import datetime, timezone import jp2 from enum import Enum from io import BytesIO import logging import struct import platform import hashlib from itertools import chain import re import io logger = logging.getLogger(__name__) have_pdfrw = True try: import pdfrw except ImportError: have_pdfrw = False have_pikepdf = True try: import pikepdf except ImportError: have_pikepdf = False __version__ = "0.6.1" default_dpi = 96.0 papersizes = { "letter": "8.5inx11in", "a0": "841mmx1189mm", "a1": "594mmx841mm", "a2": "420mmx594mm", "a3": "297mmx420mm", "a4": "210mmx297mm", "a5": "148mmx210mm", "a6": "105mmx148mm", "b0": "1000mmx1414mm", "b1": "707mmx1000mm", "b2": "500mmx707mm", "b3": "353mmx500mm", "b4": "250mmx353mm", "b5": "176mmx250mm", "b6": "125mmx176mm", "jb0": "1030mmx1456mm", "jb1": "728mmx1030mm", "jb2": "515mmx728mm", "jb3": "364mmx515mm", "jb4": "257mmx364mm", "jb5": "182mmx257mm", "jb6": "128mmx182mm", "legal": "8.5inx14in", "tabloid": "11inx17in", } papernames = { "letter": "Letter", "a0": "A0", "a1": "A1", "a2": "A2", "a3": "A3", "a4": "A4", "a5": "A5", "a6": "A6", "b0": "B0", "b1": "B1", "b2": "B2", "b3": "B3", "b4": "B4", "b5": "B5", "b6": "B6", "jb0": "JB0", "jb1": "JB1", "jb2": "JB2", "jb3": "JB3", "jb4": "JB4", "jb5": "JB5", "jb6": "JB6", "legal": "Legal", "tabloid": "Tabloid", } Engine = Enum("Engine", "internal pdfrw pikepdf") Rotation = Enum("Rotation", "auto none ifvalid 0 90 180 270") FitMode = Enum("FitMode", "into fill exact shrink enlarge") PageOrientation = Enum("PageOrientation", "portrait landscape") Colorspace = Enum("Colorspace", "RGB RGBA L LA 1 CMYK CMYK;I P PA other") 
ImageFormat = Enum( "ImageFormat", "JPEG JPEG2000 CCITTGroup4 PNG GIF TIFF MPO MIFF JBIG2 other" ) PageMode = Enum("PageMode", "none outlines thumbs") PageLayout = Enum( "PageLayout", "single onecolumn twocolumnright twocolumnleft twopageright twopageleft", ) Magnification = Enum("Magnification", "fit fith fitbh") ImgSize = Enum("ImgSize", "abs perc dpi") Unit = Enum("Unit", "pt cm mm inch") ImgUnit = Enum("ImgUnit", "pt cm mm inch perc dpi") TIFFBitRevTable = [ 0x00, 0x80, 0x40, 0xC0, 0x20, 0xA0, 0x60, 0xE0, 0x10, 0x90, 0x50, 0xD0, 0x30, 0xB0, 0x70, 0xF0, 0x08, 0x88, 0x48, 0xC8, 0x28, 0xA8, 0x68, 0xE8, 0x18, 0x98, 0x58, 0xD8, 0x38, 0xB8, 0x78, 0xF8, 0x04, 0x84, 0x44, 0xC4, 0x24, 0xA4, 0x64, 0xE4, 0x14, 0x94, 0x54, 0xD4, 0x34, 0xB4, 0x74, 0xF4, 0x0C, 0x8C, 0x4C, 0xCC, 0x2C, 0xAC, 0x6C, 0xEC, 0x1C, 0x9C, 0x5C, 0xDC, 0x3C, 0xBC, 0x7C, 0xFC, 0x02, 0x82, 0x42, 0xC2, 0x22, 0xA2, 0x62, 0xE2, 0x12, 0x92, 0x52, 0xD2, 0x32, 0xB2, 0x72, 0xF2, 0x0A, 0x8A, 0x4A, 0xCA, 0x2A, 0xAA, 0x6A, 0xEA, 0x1A, 0x9A, 0x5A, 0xDA, 0x3A, 0xBA, 0x7A, 0xFA, 0x06, 0x86, 0x46, 0xC6, 0x26, 0xA6, 0x66, 0xE6, 0x16, 0x96, 0x56, 0xD6, 0x36, 0xB6, 0x76, 0xF6, 0x0E, 0x8E, 0x4E, 0xCE, 0x2E, 0xAE, 0x6E, 0xEE, 0x1E, 0x9E, 0x5E, 0xDE, 0x3E, 0xBE, 0x7E, 0xFE, 0x01, 0x81, 0x41, 0xC1, 0x21, 0xA1, 0x61, 0xE1, 0x11, 0x91, 0x51, 0xD1, 0x31, 0xB1, 0x71, 0xF1, 0x09, 0x89, 0x49, 0xC9, 0x29, 0xA9, 0x69, 0xE9, 0x19, 0x99, 0x59, 0xD9, 0x39, 0xB9, 0x79, 0xF9, 0x05, 0x85, 0x45, 0xC5, 0x25, 0xA5, 0x65, 0xE5, 0x15, 0x95, 0x55, 0xD5, 0x35, 0xB5, 0x75, 0xF5, 0x0D, 0x8D, 0x4D, 0xCD, 0x2D, 0xAD, 0x6D, 0xED, 0x1D, 0x9D, 0x5D, 0xDD, 0x3D, 0xBD, 0x7D, 0xFD, 0x03, 0x83, 0x43, 0xC3, 0x23, 0xA3, 0x63, 0xE3, 0x13, 0x93, 0x53, 0xD3, 0x33, 0xB3, 0x73, 0xF3, 0x0B, 0x8B, 0x4B, 0xCB, 0x2B, 0xAB, 0x6B, 0xEB, 0x1B, 0x9B, 0x5B, 0xDB, 0x3B, 0xBB, 0x7B, 0xFB, 0x07, 0x87, 0x47, 0xC7, 0x27, 0xA7, 0x67, 0xE7, 0x17, 0x97, 0x57, 0xD7, 0x37, 0xB7, 0x77, 0xF7, 0x0F, 0x8F, 0x4F, 0xCF, 0x2F, 0xAF, 0x6F, 0xEF, 0x1F, 0x9F, 0x5F, 0xDF, 0x3F, 0xBF, 0x7F, 
    0xFF,
]


class NegativeDimensionError(Exception):
    pass


class UnsupportedColorspaceError(Exception):
    pass


class ImageOpenError(Exception):
    pass


class JpegColorspaceError(Exception):
    pass


class PdfTooLargeError(Exception):
    pass


class AlphaChannelError(Exception):
    pass


class ExifOrientationError(Exception):
    pass


# temporarily change the attribute of an object using a context manager
class temp_attr:
    def __init__(self, obj, field, value):
        self.obj = obj
        self.field = field
        self.value = value

    def __enter__(self):
        self.exists = False
        if hasattr(self.obj, self.field):
            self.exists = True
            self.old_value = getattr(self.obj, self.field)
        logger.debug(f"setting {self.obj}.{self.field} = {self.value}")
        setattr(self.obj, self.field, self.value)

    def __exit__(self, exctype, excinst, exctb):
        if self.exists:
            setattr(self.obj, self.field, self.old_value)
        else:
            delattr(self.obj, self.field)


# without pdfrw this function is a no-op
def my_convert_load(string):
    return string


def parse(cont, indent=1):
    if type(cont) is dict:
        return (
            b"<<\n"
            + b"\n".join(
                [
                    4 * indent * b" " + k + b" " + parse(v, indent + 1)
                    for k, v in sorted(cont.items())
                ]
            )
            + b"\n"
            + 4 * (indent - 1) * b" "
            + b">>"
        )
    elif type(cont) is int:
        return str(cont).encode()
    elif type(cont) is float:
        if int(cont) == cont:
            return parse(int(cont))
        else:
            return ("%0.4f" % cont).rstrip("0").encode()
    elif isinstance(cont, MyPdfDict):
        # if cont got an identifier, then addobj() has been called with it
        # and a link to it will be added, otherwise add it inline
        if hasattr(cont, "identifier"):
            return ("%d 0 R" % cont.identifier).encode()
        else:
            return parse(cont.content, indent)
    elif type(cont) is str or isinstance(cont, bytes):
        if type(cont) is str and type(cont) is not bytes:
            raise TypeError(
                "parse must be passed a bytes object in py3. Got: %s" % cont
            )
        return cont
    elif isinstance(cont, list):
        return b"[ " + b" ".join([parse(c, indent) for c in cont]) + b" ]"
    else:
        raise TypeError("cannot handle type %s with content %s" % (type(cont), cont))


class MyPdfDict(object):
    def __init__(self, *args, **kw):
        self.content = dict()
        if args:
            if len(args) == 1:
                args = args[0]
            self.content.update(args)
        self.stream = None
        for key, value in kw.items():
            if key == "stream":
                self.stream = value
                self.content[MyPdfName.Length] = len(value)
            elif key == "indirect":
                pass
            else:
                self.content[getattr(MyPdfName, key)] = value

    def tostring(self):
        if self.stream is not None:
            return (
                ("%d 0 obj\n" % self.identifier).encode()
                + parse(self.content)
                + b"\nstream\n"
                + self.stream
                + b"\nendstream\nendobj\n"
            )
        else:
            return (
                ("%d 0 obj\n" % self.identifier).encode()
                + parse(self.content)
                + b"\nendobj\n"
            )

    def __setitem__(self, key, value):
        self.content[key] = value

    def __getitem__(self, key):
        return self.content[key]

    def __contains__(self, key):
        return key in self.content


class MyPdfName:
    def __getattr__(self, name):
        return b"/" + name.encode("ascii")


MyPdfName = MyPdfName()


class MyPdfObject(bytes):
    def __new__(cls, string):
        return bytes.__new__(cls, string.encode("ascii"))


class MyPdfArray(list):
    pass


class MyPdfWriter:
    def __init__(self):
        self.objects = []
        # create an incomplete pages object so that a /Parent entry can be
        # added to each page
        self.pages = MyPdfDict(Type=MyPdfName.Pages, Kids=[], Count=0)
        self.catalog = MyPdfDict(Pages=self.pages, Type=MyPdfName.Catalog)
        self.pagearray = []

    def addobj(self, obj):
        newid = len(self.objects) + 1
        obj.identifier = newid
        self.objects.append(obj)

    def tostream(self, info, stream, version="1.3", ident=None):
        xreftable = list()

        # justification of the random binary garbage in the header from
        # adobe:
        #
        # > Note: If a PDF file contains binary data, as most do (see Section
        # > 3.1, “Lexical Conventions”), it is recommended that the header
        # > line be immediately followed by a comment line
containing at # > least four binary characters—that is, characters whose codes are # > 128 or greater. This ensures proper behavior of file transfer # > applications that inspect data near the beginning of a file to # > determine whether to treat the file’s contents as text or as # > binary. # # the choice of binary characters is arbitrary but those four seem to # be used elsewhere. pdfheader = ("%%PDF-%s\n" % version).encode("ascii") pdfheader += b"%\xe2\xe3\xcf\xd3\n" stream.write(pdfheader) # From section 3.4.3 of the PDF Reference (version 1.7): # # > Each entry is exactly 20 bytes long, including the end-of-line # > marker. # > # > [...] # > # > The format of an in-use entry is # > nnnnnnnnnn ggggg n eol # > where # > nnnnnnnnnn is a 10-digit byte offset # > ggggg is a 5-digit generation number # > n is a literal keyword identifying this as an in-use entry # > eol is a 2-character end-of-line sequence # > # > [...] # > # > If the file’s end-of-line marker is a single character (either a # > carriage return or a line feed), it is preceded by a single space; # # Since we chose to use a single character eol marker, we precede it by # a space pos = len(pdfheader) xreftable.append(b"0000000000 65535 f \n") for o in self.objects: xreftable.append(("%010d 00000 n \n" % pos).encode()) content = o.tostring() stream.write(content) pos += len(content) xrefoffset = pos stream.write(b"xref\n") stream.write(("0 %d\n" % len(xreftable)).encode()) for x in xreftable: stream.write(x) stream.write(b"trailer\n") trailer = {b"/Size": len(xreftable), b"/Info": info, b"/Root": self.catalog} if ident is not None: md5 = hashlib.md5(ident).hexdigest().encode("ascii") trailer[b"/ID"] = b"[<%s><%s>]" % (md5, md5) stream.write(parse(trailer) + b"\n") stream.write(b"startxref\n") stream.write(("%d\n" % xrefoffset).encode()) stream.write(b"%%EOF\n") return def addpage(self, page): page[b"/Parent"] = self.pages self.pagearray.append(page) self.pages.content[b"/Kids"].append(page) 
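# Illustrative sketch (not part of img2pdf): the 20-byte cross-reference
# entry layout quoted from the PDF Reference in tostream() can be checked in
# isolation. The helper name xref_entry_sketch and its parameters are
# hypothetical and exist only for this example.
def xref_entry_sketch(offset, generation=0, in_use=True):
    # 10-digit byte offset, space, 5-digit generation number, space,
    # keyword ("n" for in-use, "f" for free), then a space followed by a
    # single line feed, which together form the 2-character end-of-line
    keyword = b"n" if in_use else b"f"
    return b"%010d %05d %s \n" % (offset, generation, keyword)
# Every entry built this way is exactly 20 bytes long, and the head of the
# free list written by tostream() corresponds to
# xref_entry_sketch(0, 65535, in_use=False).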
self.pages.content[b"/Count"] += 1 self.addobj(page) class MyPdfString: @classmethod def encode(cls, string, hextype=False): if hextype: return ( b"< " + b" ".join(("%06x" % c).encode("ascii") for c in string) + b" >" ) else: try: string = string.encode("ascii") except UnicodeEncodeError: string = b"\xfe\xff" + string.encode("utf-16-be") # We should probably encode more here because at least # ghostscript interprets a carriage return byte (0x0D) as a # new line byte (0x0A) # PDF supports: \n, \r, \t, \b and \f string = string.replace(b"\\", b"\\\\") string = string.replace(b"(", b"\\(") string = string.replace(b")", b"\\)") return b"(" + string + b")" class pdfdoc(object): def __init__( self, engine=Engine.internal, version="1.3", title=None, author=None, creator=None, producer=None, creationdate=None, moddate=None, subject=None, keywords=None, nodate=False, panes=None, initial_page=None, magnification=None, page_layout=None, fit_window=False, center_window=False, fullscreen=False, pdfa=None, ): if engine is None: if have_pikepdf: engine = Engine.pikepdf elif have_pdfrw: engine = Engine.pdfrw else: engine = Engine.internal if engine == Engine.pikepdf: PdfWriter = pikepdf.new PdfDict = pikepdf.Dictionary PdfName = pikepdf.Name elif engine == Engine.pdfrw: from pdfrw import PdfWriter, PdfDict, PdfName, PdfString elif engine == Engine.internal: PdfWriter = MyPdfWriter PdfDict = MyPdfDict PdfName = MyPdfName PdfString = MyPdfString else: raise ValueError("unknown engine: %s" % engine) self.writer = PdfWriter() if engine != Engine.pikepdf: self.writer.docinfo = PdfDict(indirect=True) def datetime_to_pdfdate(dt): return dt.astimezone(tz=timezone.utc).strftime("%Y%m%d%H%M%SZ") for k in ["Title", "Author", "Creator", "Producer", "Subject"]: v = locals()[k.lower()] if v is None or v == "": continue if engine != Engine.pikepdf: v = PdfString.encode(v) self.writer.docinfo[getattr(PdfName, k)] = v now = datetime.now().astimezone() for k in ["CreationDate", "ModDate"]: v = 
locals()[k.lower()] if v is None and nodate: continue if v is None: v = now v = ("D:" + datetime_to_pdfdate(v)).encode("ascii") if engine == Engine.internal: v = b"(" + v + b")" self.writer.docinfo[getattr(PdfName, k)] = v if keywords is not None: if engine == Engine.pikepdf: self.writer.docinfo[PdfName.Keywords] = ",".join(keywords) else: self.writer.docinfo[PdfName.Keywords] = PdfString.encode( ",".join(keywords) ) def datetime_to_xmpdate(dt): return dt.astimezone(tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ") self.xmp = b""" %s %s """ % ( b" pdf:Producer='%s'" % producer.encode("ascii") if producer is not None else b"", b"" if creationdate is None and nodate else b"%s" % datetime_to_xmpdate(now if creationdate is None else creationdate).encode( "ascii" ), b"" if moddate is None and nodate else b"%s" % datetime_to_xmpdate(now if moddate is None else moddate).encode("ascii"), ) if engine != Engine.pikepdf: # this is done because pdfrw adds info, catalog and pages as the first # three objects in this order if engine == Engine.internal: self.writer.addobj(self.writer.docinfo) self.writer.addobj(self.writer.catalog) self.writer.addobj(self.writer.pages) self.panes = panes self.initial_page = initial_page self.magnification = magnification self.page_layout = page_layout self.fit_window = fit_window self.center_window = center_window self.fullscreen = fullscreen self.engine = engine self.output_version = version self.pdfa = pdfa def add_imagepage( self, color, imgwidthpx, imgheightpx, imgformat, imgdata, smaskdata, imgwidthpdf, imgheightpdf, imgxpdf, imgypdf, pagewidth, pageheight, userunit=None, palette=None, inverted=False, depth=0, rotate=0, cropborder=None, bleedborder=None, trimborder=None, artborder=None, iccp=None, ): assert ( color not in [Colorspace.RGBA, Colorspace.LA] or (imgformat == ImageFormat.PNG and smaskdata is not None) or imgformat == ImageFormat.JPEG2000 ) if self.engine == Engine.pikepdf: PdfArray = pikepdf.Array PdfDict = pikepdf.Dictionary 
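# Illustrative sketch (not part of img2pdf): the two timestamp helpers used
# in pdfdoc.__init__ both normalise to UTC and differ only in the strftime
# pattern. The names pdfdate_sketch, xmpdate_sketch and example are
# hypothetical stand-ins introduced for this example.
from datetime import datetime, timedelta, timezone


def pdfdate_sketch(dt):
    # document information dictionary style, prefixed with "D:"
    return "D:" + dt.astimezone(tz=timezone.utc).strftime("%Y%m%d%H%M%SZ")


def xmpdate_sketch(dt):
    # XMP metadata style
    return dt.astimezone(tz=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# a timezone-aware input two hours ahead of UTC is shifted back to UTC
# before formatting, so both strings carry the "Z" suffix
example = datetime(2025, 4, 27, 18, 55, 3, tzinfo=timezone(timedelta(hours=2)))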
PdfName = pikepdf.Name elif self.engine == Engine.pdfrw: from pdfrw import PdfDict, PdfName, PdfObject, PdfString from pdfrw.py23_diffs import convert_load elif self.engine == Engine.internal: PdfDict = MyPdfDict PdfName = MyPdfName PdfObject = MyPdfObject PdfString = MyPdfString convert_load = my_convert_load else: raise ValueError("unknown engine: %s" % self.engine) TrueObject = True if self.engine == Engine.pikepdf else PdfObject("true") FalseObject = False if self.engine == Engine.pikepdf else PdfObject("false") if color == Colorspace["1"] or color == Colorspace.L or color == Colorspace.LA: colorspace = PdfName.DeviceGray elif color == Colorspace.RGB or color == Colorspace.RGBA: if color == Colorspace.RGBA and imgformat == ImageFormat.JPEG2000: # there is no DeviceRGBA and for JPXDecode it is okay to have # no colorspace as the pdf reader is supposed to get this info # from the jpeg2000 payload itself colorspace = None else: colorspace = PdfName.DeviceRGB elif color == Colorspace.CMYK or color == Colorspace["CMYK;I"]: colorspace = PdfName.DeviceCMYK elif color == Colorspace.P: if self.engine == Engine.pdfrw: # https://github.com/pmaupin/pdfrw/issues/128 # https://github.com/pmaupin/pdfrw/issues/147 raise Exception( "pdfrw does not support hex strings for " "palette image input, re-run with " "--engine=internal or --engine=pikepdf" ) assert len(palette) % 3 == 0 colorspace = [ PdfName.Indexed, PdfName.DeviceRGB, (len(palette) // 3) - 1, bytes(palette) if self.engine == Engine.pikepdf else PdfString.encode( [ int.from_bytes(palette[i : i + 3], "big") for i in range(0, len(palette), 3) ], hextype=True, ), ] else: raise UnsupportedColorspaceError("unsupported color space: %s" % color.name) if iccp is not None: if self.engine == Engine.pikepdf: iccpdict = self.writer.make_stream(iccp) else: iccpdict = PdfDict(stream=convert_load(iccp)) iccpdict[PdfName.Alternate] = colorspace if ( color == Colorspace["1"] or color == Colorspace.L or color == Colorspace.LA ): 
iccpdict[PdfName.N] = 1 elif color == Colorspace.RGB or color == Colorspace.RGBA: iccpdict[PdfName.N] = 3 elif color == Colorspace.CMYK or color == Colorspace["CMYK;I"]: iccpdict[PdfName.N] = 4 elif color == Colorspace.P: raise Exception("Cannot have Palette images with ICC profile") colorspace = [PdfName.ICCBased, iccpdict] # either embed the whole jpeg or deflate the bitmap representation if imgformat is ImageFormat.JPEG: ofilter = PdfName.DCTDecode elif imgformat is ImageFormat.JPEG2000: ofilter = PdfName.JPXDecode self.output_version = "1.5" # jpeg2000 needs pdf 1.5 elif imgformat is ImageFormat.CCITTGroup4: ofilter = [PdfName.CCITTFaxDecode] elif imgformat is ImageFormat.JBIG2: ofilter = PdfName.JBIG2Decode # JBIG2Decode requires PDF 1.4 if self.output_version < "1.4": self.output_version = "1.4" else: ofilter = PdfName.FlateDecode if self.engine == Engine.pikepdf: image = self.writer.make_stream(imgdata) else: image = PdfDict(stream=convert_load(imgdata)) image[PdfName.Type] = PdfName.XObject image[PdfName.Subtype] = PdfName.Image image[PdfName.Filter] = ofilter image[PdfName.Width] = imgwidthpx image[PdfName.Height] = imgheightpx if colorspace is not None: image[PdfName.ColorSpace] = colorspace image[PdfName.BitsPerComponent] = depth smask = None if color == Colorspace["CMYK;I"]: # Inverts all four channels image[PdfName.Decode] = [1, 0, 1, 0, 1, 0, 1, 0] if imgformat is ImageFormat.CCITTGroup4: decodeparms = PdfDict() # The default for the K parameter is 0 which indicates Group 3 1-D # encoding. We set it to -1 because we want Group 4 encoding. 
decodeparms[PdfName.K] = -1 if inverted: decodeparms[PdfName.BlackIs1] = FalseObject else: decodeparms[PdfName.BlackIs1] = TrueObject decodeparms[PdfName.Columns] = imgwidthpx decodeparms[PdfName.Rows] = imgheightpx image[PdfName.DecodeParms] = [decodeparms] elif imgformat is ImageFormat.PNG: if smaskdata is not None: if self.engine == Engine.pikepdf: smask = self.writer.make_stream(smaskdata) else: smask = PdfDict(stream=convert_load(smaskdata)) smask[PdfName.Type] = PdfName.XObject smask[PdfName.Subtype] = PdfName.Image smask[PdfName.Filter] = PdfName.FlateDecode smask[PdfName.Width] = imgwidthpx smask[PdfName.Height] = imgheightpx smask[PdfName.ColorSpace] = PdfName.DeviceGray smask[PdfName.BitsPerComponent] = depth decodeparms = PdfDict() decodeparms[PdfName.Predictor] = 15 decodeparms[PdfName.Colors] = 1 decodeparms[PdfName.Columns] = imgwidthpx decodeparms[PdfName.BitsPerComponent] = depth smask[PdfName.DecodeParms] = decodeparms image[PdfName.SMask] = smask # /SMask requires PDF 1.4 if self.output_version < "1.4": self.output_version = "1.4" decodeparms = PdfDict() decodeparms[PdfName.Predictor] = 15 if color in [Colorspace.P, Colorspace["1"], Colorspace.L, Colorspace.LA]: decodeparms[PdfName.Colors] = 1 else: decodeparms[PdfName.Colors] = 3 decodeparms[PdfName.Columns] = imgwidthpx decodeparms[PdfName.BitsPerComponent] = depth image[PdfName.DecodeParms] = decodeparms text = ( "q\n%0.4f 0 0 %0.4f %0.4f %0.4f cm\n/Im0 Do\nQ" % (imgwidthpdf, imgheightpdf, imgxpdf, imgypdf) ).encode("ascii") if self.engine == Engine.pikepdf: content = self.writer.make_stream(text) else: content = PdfDict(stream=convert_load(text)) resources = PdfDict(XObject=PdfDict(Im0=image)) if self.engine == Engine.pikepdf: page = self.writer.add_blank_page(page_size=(pagewidth, pageheight)) else: page = PdfDict(indirect=True) page[PdfName.Type] = PdfName.Page page[PdfName.MediaBox] = [0, 0, pagewidth, pageheight] # 14.11.2 Page Boundaries # ... 
# The crop, bleed, trim, and art boxes shall not ordinarily extend # beyond the boundaries of the media box. If they do, they are # effectively reduced to their intersection with the media box. if cropborder is not None: page[PdfName.CropBox] = [ cropborder[1], cropborder[0], pagewidth - cropborder[1], pageheight - cropborder[0], ] if bleedborder is None: if PdfName.CropBox in page: page[PdfName.BleedBox] = page[PdfName.CropBox] else: page[PdfName.BleedBox] = [ bleedborder[1], bleedborder[0], pagewidth - bleedborder[1], pageheight - bleedborder[0], ] if trimborder is None: if PdfName.CropBox in page: page[PdfName.TrimBox] = page[PdfName.CropBox] else: page[PdfName.TrimBox] = [ trimborder[1], trimborder[0], pagewidth - trimborder[1], pageheight - trimborder[0], ] if artborder is None: if PdfName.CropBox in page: page[PdfName.ArtBox] = page[PdfName.CropBox] else: page[PdfName.ArtBox] = [ artborder[1], artborder[0], pagewidth - artborder[1], pageheight - artborder[0], ] page[PdfName.Resources] = resources page[PdfName.Contents] = content if rotate != 0: page[PdfName.Rotate] = rotate if userunit is not None: # /UserUnit requires PDF 1.6 if self.output_version < "1.6": self.output_version = "1.6" page[PdfName.UserUnit] = userunit if self.engine != Engine.pikepdf: self.writer.addpage(page) if self.engine == Engine.internal: self.writer.addobj(content) self.writer.addobj(image) if smask is not None: self.writer.addobj(smask) if iccp is not None: self.writer.addobj(iccpdict) def tostring(self): stream = BytesIO() self.tostream(stream) return stream.getvalue() def finalize(self): if self.engine == Engine.pikepdf: PdfArray = pikepdf.Array PdfDict = pikepdf.Dictionary PdfName = pikepdf.Name elif self.engine == Engine.pdfrw: from pdfrw import PdfDict, PdfName, PdfArray, PdfObject from pdfrw.py23_diffs import convert_load elif self.engine == Engine.internal: PdfDict = MyPdfDict PdfName = MyPdfName PdfObject = MyPdfObject PdfArray = MyPdfArray convert_load = my_convert_load 
else: raise ValueError("unknown engine: %s" % self.engine) NullObject = None if self.engine == Engine.pikepdf else PdfObject("null") TrueObject = True if self.engine == Engine.pikepdf else PdfObject("true") # We fill the catalog with more information like /ViewerPreferences, # /PageMode, /PageLayout or /OpenAction because the latter refers to a # page object which has to be present so that we can get its id. # # Furthermore, if using pdfrw, the trailer is cleared every time a page # is added, so we can only start using it after all pages have been # written. if self.engine == Engine.pikepdf: catalog = self.writer.Root elif self.engine == Engine.pdfrw: catalog = self.writer.trailer.Root elif self.engine == Engine.internal: catalog = self.writer.catalog else: raise ValueError("unknown engine: %s" % self.engine) if ( self.fullscreen or self.fit_window or self.center_window or self.panes is not None ): catalog[PdfName.ViewerPreferences] = PdfDict() if self.fullscreen: # this setting might be overwritten later by the page mode catalog[PdfName.ViewerPreferences][ PdfName.NonFullScreenPageMode ] = PdfName.UseNone if self.panes == PageMode.thumbs: catalog[PdfName.ViewerPreferences][ PdfName.NonFullScreenPageMode ] = PdfName.UseThumbs # this setting might be overwritten later if fullscreen catalog[PdfName.PageMode] = PdfName.UseThumbs elif self.panes == PageMode.outlines: catalog[PdfName.ViewerPreferences][ PdfName.NonFullScreenPageMode ] = PdfName.UseOutlines # this setting might be overwritten later if fullscreen catalog[PdfName.PageMode] = PdfName.UseOutlines elif self.panes in [PageMode.none, None]: pass else: raise ValueError("unknown page mode: %s" % self.panes) if self.fit_window: catalog[PdfName.ViewerPreferences][PdfName.FitWindow] = TrueObject if self.center_window: catalog[PdfName.ViewerPreferences][PdfName.CenterWindow] = TrueObject if self.fullscreen: catalog[PdfName.PageMode] = PdfName.FullScreen # see table 8.2 in section 8.2.1 in # 
http://partners.adobe.com/public/developer/en/pdf/PDFReference16.pdf # Fit - Fits the page to the window. # FitH - Fits the width of the page to the window. # FitV - Fits the height of the page to the window. # FitR - Fits the rectangle specified by the four coordinates to the # window. # FitB - Fits the page bounding box to the window. This basically # reduces the amount of whitespace (margins) that is displayed # and thus focussing more on the text content. # FitBH - Fits the width of the page bounding box to the window. # FitBV - Fits the height of the page bounding box to the window. # by default the initial page is the first one if self.engine == Engine.pikepdf: initial_page = self.writer.pages[0] else: initial_page = self.writer.pagearray[0] # we set the open action here to make sure we open on the requested # initial page but this value might be overwritten by a custom open # action later while still taking the requested initial page into # account if self.initial_page is not None: if self.engine == Engine.pikepdf: initial_page = self.writer.pages[self.initial_page - 1] else: initial_page = self.writer.pagearray[self.initial_page - 1] catalog[PdfName.OpenAction] = PdfArray( [initial_page, PdfName.XYZ, NullObject, NullObject, 0] ) # The /OpenAction array must contain the page as an indirect object. # This changed some time after 4.2.0 and on or before 5.0.0 and current # versions require to use .obj or otherwise we get: # TypeError: Can't convert ObjectHelper (or subclass) to Object # implicitly. Use .obj to get access the underlying object. # See https://github.com/pikepdf/pikepdf/issues/313 for details. 
if self.engine == Engine.pikepdf: if isinstance(initial_page, pikepdf.Page): initial_page = self.writer.make_indirect(initial_page.obj) else: initial_page = self.writer.make_indirect(initial_page) if self.magnification == Magnification.fit: catalog[PdfName.OpenAction] = PdfArray([initial_page, PdfName.Fit]) elif self.magnification == Magnification.fith: pagewidth = initial_page[PdfName.MediaBox][2] catalog[PdfName.OpenAction] = PdfArray( [initial_page, PdfName.FitH, pagewidth] ) elif self.magnification == Magnification.fitbh: # quick hack to determine the image width on the page imgwidth = float(initial_page[PdfName.Contents].stream.split()[4]) catalog[PdfName.OpenAction] = PdfArray( [initial_page, PdfName.FitBH, imgwidth] ) elif isinstance(self.magnification, float): catalog[PdfName.OpenAction] = PdfArray( [initial_page, PdfName.XYZ, NullObject, NullObject, self.magnification] ) elif self.magnification is None: pass else: raise ValueError("unknown magnification: %s" % self.magnification) if self.page_layout == PageLayout.single: catalog[PdfName.PageLayout] = PdfName.SinglePage elif self.page_layout == PageLayout.onecolumn: catalog[PdfName.PageLayout] = PdfName.OneColumn elif self.page_layout == PageLayout.twocolumnright: catalog[PdfName.PageLayout] = PdfName.TwoColumnRight elif self.page_layout == PageLayout.twocolumnleft: catalog[PdfName.PageLayout] = PdfName.TwoColumnLeft elif self.page_layout == PageLayout.twopageright: catalog[PdfName.PageLayout] = PdfName.TwoPageRight if self.output_version < "1.5": self.output_version = "1.5" elif self.page_layout == PageLayout.twopageleft: catalog[PdfName.PageLayout] = PdfName.TwoPageLeft if self.output_version < "1.5": self.output_version = "1.5" elif self.page_layout is None: pass else: raise ValueError("unknown page layout: %s" % self.page_layout) if self.pdfa is not None: if self.engine == Engine.pikepdf: metadata = self.writer.make_stream(self.xmp) else: metadata = PdfDict(stream=convert_load(self.xmp)) 
metadata[PdfName.Subtype] = PdfName.XML metadata[PdfName.Type] = PdfName.Metadata with open(self.pdfa, "rb") as f: icc = f.read() intents = PdfDict() if self.engine == Engine.pikepdf: iccstream = self.writer.make_stream(icc) iccstream.stream_dict.N = 3 else: iccstream = PdfDict(stream=convert_load(zlib.compress(icc))) iccstream[PdfName.N] = 3 iccstream[PdfName.Filter] = PdfName.FlateDecode intents[PdfName.S] = PdfName.GTS_PDFA1 intents[PdfName.Type] = PdfName.OutputIntent intents[PdfName.OutputConditionIdentifier] = ( b"sRGB" if self.engine == Engine.pikepdf else b"(sRGB)" ) intents[PdfName.DestOutputProfile] = iccstream catalog[PdfName.OutputIntents] = PdfArray([intents]) catalog[PdfName.Metadata] = metadata if self.engine == Engine.internal: self.writer.addobj(metadata) self.writer.addobj(iccstream) def tostream(self, outputstream): # write out the PDF # this assumes that finalize() has been invoked beforehand by the caller if self.engine == Engine.pikepdf: kwargs = {} if pikepdf.__version__ >= "6.2.0": kwargs["deterministic_id"] = True self.writer.save( outputstream, min_version=self.output_version, linearize=True, **kwargs ) elif self.engine == Engine.pdfrw: from pdfrw import PdfName, PdfArray self.writer.trailer.Info = self.writer.docinfo # setting the version attribute of the pdfrw PdfWriter object will # influence the behaviour of the write() function self.writer.version = self.output_version if self.pdfa: md5 = hashlib.md5(b"").hexdigest().encode("ascii") self.writer.trailer[PdfName.ID] = PdfArray([md5, md5]) self.writer.write(outputstream) elif self.engine == Engine.internal: self.writer.tostream( self.writer.docinfo, outputstream, self.output_version, None if self.pdfa is None else b"", ) else: raise ValueError("unknown engine: %s" % self.engine) def pil_get_dpi(imgdata, imgformat, default_dpi): ndpi = imgdata.info.get("dpi") if ndpi is None: # the PNG plugin of PIL adds the undocumented "aspect" field instead of # the "dpi" field if the PNG pHYs chunk 
unit is not set to meters if imgformat == ImageFormat.PNG and imgdata.info.get("aspect") is not None: aspect = imgdata.info["aspect"] # make sure not to go below the default dpi if aspect[0] > aspect[1]: ndpi = (default_dpi * aspect[0] / aspect[1], default_dpi) else: ndpi = (default_dpi, default_dpi * aspect[1] / aspect[0]) else: ndpi = (default_dpi, default_dpi) # In python3, the returned dpi value for some tiff images will # not be an integer but a float. To make the behaviour of # img2pdf the same between python2 and python3, we convert that # float into an integer by rounding. # Search online for the 72.009 dpi problem for more info. ndpi = (int(round(ndpi[0])), int(round(ndpi[1]))) # Since commit 07a96209597c5e8dfe785c757d7051ce67a980fb or release 4.1.0 # Pillow retrieves the DPI from EXIF if it cannot find the DPI in the JPEG # header. In that case it can happen that the horizontal and vertical DPI # are set to zero. if ndpi == (0, 0): ndpi = (default_dpi, default_dpi) # PIL defaults to a dpi of 1 if a TIFF image does not specify the dpi. # In that case, we want to use a different default. 
if ndpi == (1, 1) and imgformat == ImageFormat.TIFF: ndpi = ( imgdata.tag_v2.get(TiffImagePlugin.X_RESOLUTION, default_dpi), imgdata.tag_v2.get(TiffImagePlugin.Y_RESOLUTION, default_dpi), ) return ndpi def get_imgmetadata( imgdata, imgformat, default_dpi, colorspace, rawdata=None, rotreq=None ): if imgformat == ImageFormat.JPEG2000 and rawdata is not None and imgdata is None: # this codepath gets called if the PIL installation is not able to # handle JPEG2000 files imgwidthpx, imgheightpx, ics, hdpi, vdpi, channels, bpp = jp2.parse(rawdata) if hdpi is None: hdpi = default_dpi if vdpi is None: vdpi = default_dpi ndpi = (hdpi, vdpi) elif imgformat == ImageFormat.JBIG2: imgwidthpx, imgheightpx, xres, yres = struct.unpack(">IIII", rawdata[24:40]) INCH_PER_METER = 39.370079 if xres == 0: hdpi = default_dpi elif xres < 1000: # If xres is very small, it's likely accidentally expressed in dpi instead # of dpm. See e.g. https://github.com/agl/jbig2enc/issues/86 hdpi = xres else: hdpi = int(float(xres) / INCH_PER_METER) if yres == 0: vdpi = default_dpi elif yres < 1000: vdpi = yres else: vdpi = int(float(yres) / INCH_PER_METER) ndpi = (hdpi, vdpi) ics = "1" else: imgwidthpx, imgheightpx = imgdata.size ndpi = pil_get_dpi(imgdata, imgformat, default_dpi) ics = imgdata.mode logger.debug("input dpi = %d x %d", *ndpi) # GIF and PNG files with transparency are supported if imgformat in [ImageFormat.PNG, ImageFormat.GIF, ImageFormat.JPEG2000] and ( ics in ["RGBA", "LA"] or (imgdata is not None and "transparency" in imgdata.info) ): # Must check the IHDR chunk for the bit depth, because PIL would lossily # convert 16-bit RGBA/LA images to 8-bit. if imgformat == ImageFormat.PNG and rawdata is not None: depth = rawdata[24] if depth > 8: logger.warning("Image with transparency and a bit depth of %d." 
% depth) logger.warning("This is unsupported due to PIL limitations.") logger.warning( "If you accept a lossy conversion, you can manually convert " "your images to 8 bit using `convert -depth 8` from imagemagick" ) raise AlphaChannelError( "Refusing to work with multiple >8bit channels." ) elif ics in ["LA", "PA", "RGBA"] or ( imgdata is not None and "transparency" in imgdata.info ): raise AlphaChannelError("This function must not be called on images with alpha") rotation = 0 if rotreq in (None, Rotation.auto, Rotation.ifvalid): if hasattr(imgdata, "getexif") and imgdata.getexif() is not None: exif_dict = imgdata.getexif() o_key = ExifTags.Base.Orientation.value # 274 rsp. 0x112 if exif_dict and o_key in exif_dict: # Detailed information on EXIF rotation tags: # http://impulseadventure.com/photo/exif-orientation.html value = exif_dict[o_key] if value == 1: rotation = 0 elif value == 6: rotation = 90 elif value == 3: rotation = 180 elif value == 8: rotation = 270 elif value in (2, 4, 5, 7): if rotreq == Rotation.ifvalid: logger.warning( "Unsupported flipped rotation mode (%d): use " "--rotation=ifvalid or " "rotation=img2pdf.Rotation.ifvalid to ignore", value, ) else: raise ExifOrientationError( "Unsupported flipped rotation mode (%d): use " "--rotation=ifvalid or " "rotation=img2pdf.Rotation.ifvalid to ignore" % value ) else: if rotreq == Rotation.ifvalid: logger.warning("Invalid rotation (%d)", value) else: raise ExifOrientationError( "Invalid rotation (%d): use --rotation=ifvalid " "or rotation=img2pdf.Rotation.ifvalid to ignore" % value ) elif hasattr(imgdata, "_getexif") and imgdata._getexif() is not None: for tag, value in imgdata._getexif().items(): if TAGS.get(tag, tag) == "Orientation": # Detailed information on EXIF rotation tags: # http://impulseadventure.com/photo/exif-orientation.html if value == 1: rotation = 0 elif value == 6: rotation = 90 elif value == 3: rotation = 180 elif value == 8: rotation = 270 elif value in (2, 4, 5, 7): if rotreq == 
Rotation.ifvalid: logger.warning( "Unsupported flipped rotation mode (%d): use " "--rotation=ifvalid or " "rotation=img2pdf.Rotation.ifvalid to ignore", value, ) else: raise ExifOrientationError( "Unsupported flipped rotation mode (%d): use " "--rotation=ifvalid or " "rotation=img2pdf.Rotation.ifvalid to ignore" % value ) else: if rotreq == Rotation.ifvalid: logger.warning("Invalid rotation (%d)", value) else: raise ExifOrientationError( "Invalid rotation (%d): use --rotation=ifvalid " "or rotation=img2pdf.Rotation.ifvalid to ignore" % value ) elif rotreq in (Rotation.none, Rotation["0"]): rotation = 0 elif rotreq == Rotation["90"]: rotation = 90 elif rotreq == Rotation["180"]: rotation = 180 elif rotreq == Rotation["270"]: rotation = 270 else: raise Exception("invalid rotreq") logger.debug("rotation = %d°", rotation) if colorspace: color = colorspace logger.debug("input colorspace (forced) = %s", color) else: color = None for c in Colorspace: if c.name == ics: color = c if color is None: # PIL does not provide the information about the original # colorspace for 16bit grayscale PNG images. Thus, we retrieve # that info manually by looking at byte 10 in the IHDR chunk. We # know where to find that in the file because the IHDR chunk must # be the first chunk if ( rawdata is not None and imgformat == ImageFormat.PNG and rawdata[25] == 0 ): color = Colorspace.L else: raise ValueError("unknown colorspace") if color == Colorspace.CMYK and imgformat == ImageFormat.JPEG: # Adobe inverts CMYK JPEGs for some reason, and others # have followed suit as well. Some software assumes the # JPEG is inverted if the Adobe tag (APP14), while other # software assumes all CMYK JPEGs are inverted. I don't # have enough experience with these to know which is # better for images currently in the wild, so I'm going # with the first approach for now. 
if "adobe" in imgdata.info: color = Colorspace["CMYK;I"] logger.debug("input colorspace = %s", color.name) iccp = None if imgdata is not None and "icc_profile" in imgdata.info: iccp = imgdata.info.get("icc_profile") # GIMP saves bilevel TIFF images and palette PNG images with only black and # white in the palette with an RGB ICC profile which is useless # https://gitlab.gnome.org/GNOME/gimp/-/issues/3438 # and produces an error in Adobe Acrobat, so we ignore it with a warning. # imagemagick also used to (wrongly) include an RGB ICC profile for bilevel # images: https://github.com/ImageMagick/ImageMagick/issues/2070 if iccp is not None and ( (color == Colorspace["1"] and imgformat == ImageFormat.TIFF) or ( imgformat == ImageFormat.PNG and color == Colorspace.P and rawdata is not None and parse_png(rawdata)[1] in [b"\x00\x00\x00\xff\xff\xff", b"\xff\xff\xff\x00\x00\x00"] ) ): with io.BytesIO(iccp) as f: prf = ImageCms.ImageCmsProfile(f) if ( prf.profile.model == "sRGB" and prf.profile.manufacturer == "GIMP" and prf.profile.profile_description == "GIMP built-in sRGB" ): if imgformat == ImageFormat.TIFF: logger.warning( "Ignoring RGB ICC profile in bilevel TIFF produced by GIMP." ) elif imgformat == ImageFormat.PNG: logger.warning( "Ignoring RGB ICC profile in 2-color palette PNG produced by GIMP." ) logger.warning("https://gitlab.gnome.org/GNOME/gimp/-/issues/3438") iccp = None # An old version of SmartAlbums (found in 2.2.6) exports JPG with only 1 component # together with an RGB ICC profile, which is useless. # This produces an error in Adobe Acrobat, so we ignore it with a warning. # Update: Found another case where the JPG is created by Adobe Photoshop, so we # don't check the software anymore. 
if iccp is not None and ( (color == Colorspace["L"] and imgformat == ImageFormat.JPEG) ): with io.BytesIO(iccp) as f: prf = ImageCms.ImageCmsProfile(f) if prf.profile.xcolor_space not in ("GRAY"): logger.warning("Ignoring non-GRAY ICC profile in Grayscale JPG") iccp = None logger.debug("width x height = %dpx x %dpx", imgwidthpx, imgheightpx) return (color, ndpi, imgwidthpx, imgheightpx, rotation, iccp) def ccitt_payload_location_from_pil(img): # If Pillow is passed an invalid compression argument it will ignore it; # make sure the image actually got compressed. if img.info["compression"] != "group4": raise ValueError( "Image not compressed with CCITT Group 4 but with: %s" % img.info["compression"] ) # Read the TIFF tags to find the offset(s) of the compressed data strips. strip_offsets = img.tag_v2[TiffImagePlugin.STRIPOFFSETS] strip_bytes = img.tag_v2[TiffImagePlugin.STRIPBYTECOUNTS] # PIL always seems to create a single strip even for very large TIFFs when # it saves images, so assume we only have to read a single strip. # A test ~10 GPixel image was still encoded as a single strip. Just to be # safe, throw an error if there is more than one offset. if len(strip_offsets) != 1 or len(strip_bytes) != 1: raise NotImplementedError( "Transcoding multiple strips not supported by the PDF format" ) (offset,), (length,) = strip_offsets, strip_bytes logger.debug("TIFF strip_offsets: %d" % offset) logger.debug("TIFF strip_bytes: %d" % length) return offset, length def transcode_monochrome(imgdata): """Convert the open PIL.Image imgdata to compressed CCITT Group4 data""" logger.debug("Converting monochrome to CCITT Group4") # Convert the image to Group 4 in memory. If libtiff is not installed and # Pillow is not compiled against it, .save() will raise an exception. 
newimgio = BytesIO() # we create a whole new PIL image or otherwise it might happen with some # input images, that libtiff fails an assert and the whole process is # killed by a SIGABRT: # https://gitlab.mister-muffin.de/josch/img2pdf/issues/46 im = Image.frombytes(imgdata.mode, imgdata.size, imgdata.tobytes()) # Since version 8.3.0 Pillow limits strips to 64 KB. Since PDF only # supports single strip CCITT Group4 payloads, we have to coerce it back # into putting everything into a single strip. Thanks to Andrew Murray for # the hack. # # Since version 8.4.0 Pillow allows us to modify the strip size explicitly tmp_strip_size = (imgdata.size[0] + 7) // 8 * imgdata.size[1] if hasattr(TiffImagePlugin, "STRIP_SIZE"): # we are using Pillow 8.4.0 or later with temp_attr(TiffImagePlugin, "STRIP_SIZE", tmp_strip_size): im.save(newimgio, format="TIFF", compression="group4") else: # only needed for Pillow 8.3.x but works for versions before that as # well pillow__getitem__ = TiffImagePlugin.ImageFileDirectory_v2.__getitem__ def __getitem__(self, tag): overrides = { TiffImagePlugin.ROWSPERSTRIP: imgdata.size[1], TiffImagePlugin.STRIPBYTECOUNTS: [tmp_strip_size], TiffImagePlugin.STRIPOFFSETS: [0], } return overrides.get(tag, pillow__getitem__(self, tag)) with temp_attr( TiffImagePlugin.ImageFileDirectory_v2, "__getitem__", __getitem__ ): im.save(newimgio, format="TIFF", compression="group4") # Open new image in memory newimgio.seek(0) newimg = Image.open(newimgio) offset, length = ccitt_payload_location_from_pil(newimg) newimgio.seek(offset) return newimgio.read(length) def parse_png(rawdata): pngidat = b"" palette = b"" i = 16 while i < len(rawdata): # once we can require Python >= 3.2 we can use int.from_bytes() instead (n,) = struct.unpack(">I", rawdata[i - 8 : i - 4]) if i + n > len(rawdata): raise Exception("invalid png: %d %d %d" % (i, n, len(rawdata))) if rawdata[i - 4 : i] == b"IDAT": pngidat += rawdata[i : i + n] elif rawdata[i - 4 : i] == b"PLTE": palette += 
rawdata[i : i + n] i += n i += 12 return pngidat, palette miff_re = re.compile( r""" [^\x00-\x20\x7f-\x9f] # the field name must not start with a control char or space [^=]+ # the field name can even contain spaces = # field name and value are separated by an equal sign (?: [^\x00-\x20\x7f-\x9f{}] # either chars that are not braces and not control chars |{[^}]*} # or any kind of char surrounded by braces )+""", re.VERBOSE, ) # https://imagemagick.org/script/miff.php # turn off black formatting until python 3.10 is available on more platforms # and we can use match/case # fmt: off def parse_miff(data): results = [] header, rest = data.split(b":\x1a", 1) header = header.decode("ISO-8859-1") assert header.lower().startswith("id=imagemagick") hdata = {} for i, line in enumerate(re.findall(miff_re, header)): if not line: continue k, v = line.split("=", 1) if i == 0: assert k.lower() == "id" assert v.lower() == "imagemagick" #match k.lower(): # case "class": if k.lower() == "class": #match v: # case "DirectClass" | "PseudoClass": if v in ["DirectClass", "PseudoClass"]: hdata["class"] = v # case _: else: print("cannot understand class", v) # case "colorspace": elif k.lower() == "colorspace": # theoretically RGBA and CMYKA should be supported as well # please teach me how to create such a MIFF file #match v: # case "sRGB" | "CMYK" | "Gray": if v in ["sRGB", "CMYK", "Gray"]: hdata["colorspace"] = v # case _: else: print("cannot understand colorspace", v) # case "depth": elif k.lower() == "depth": #match v: # case "8" | "16" | "32": if v in ["8", "16", "32"]: hdata["depth"] = int(v) # case _: else: print("cannot understand depth", v) # case "colors": elif k.lower() == "colors": hdata["colors"] = int(v) # case "matte": elif k.lower() == "matte": #match v: # case "True": if v == "True": hdata["matte"] = True # case "False": elif v == "False": hdata["matte"] = False # case _: else: print("cannot understand matte", v) # case "columns" | "rows": elif k.lower() in ["columns", 
"rows"]: hdata[k.lower()] = int(v) # case "compression": elif k.lower() == "compression": print("compression not yet supported") # case "profile": elif k.lower() == "profile": assert v in ["icc", "exif"] hdata["profile"] = v # case "resolution": elif k.lower() == "resolution": dpix, dpiy = v.split("x", 1) hdata["resolution"] = (float(dpix), float(dpiy)) assert "depth" in hdata assert "columns" in hdata assert "rows" in hdata #match hdata["class"]: # case "DirectClass": if hdata["class"] == "DirectClass": if "colors" in hdata: assert hdata["colors"] == 0 #match hdata["colorspace"]: # case "sRGB": if hdata["colorspace"] == "sRGB": numchannels = 3 colorspace = Colorspace.RGB # case "CMYK": elif hdata["colorspace"] == "CMYK": numchannels = 4 colorspace = Colorspace.CMYK # case "Gray": elif hdata["colorspace"] == "Gray": numchannels = 1 colorspace = Colorspace.L if hdata.get("matte"): numchannels += 1 if hdata.get("profile"): # there is no key encoding the length of icc or exif data # according to the docs, the profile-icc key is supposed to do this print("FAIL: exif") else: lenimgdata = ( hdata["depth"] // 8 * numchannels * hdata["columns"] * hdata["rows"] ) assert len(rest) >= lenimgdata, ( len(rest), hdata["depth"], numchannels, hdata["columns"], hdata["rows"], lenimgdata, ) if colorspace == Colorspace.RGB and hdata["depth"] == 8: newimg = Image.frombytes("RGB", (hdata["columns"], hdata["rows"]), rest[:lenimgdata]) imgdata, palette, depth = to_png_data(newimg) assert palette == b"" assert depth == hdata["depth"] imgfmt = ImageFormat.PNG else: imgdata = zlib.compress(rest[:lenimgdata]) imgfmt = ImageFormat.MIFF results.append( ( colorspace, hdata.get("resolution") or (default_dpi, default_dpi), imgfmt, imgdata, None, # smask hdata["columns"], hdata["rows"], [], # palette False, # inverted hdata["depth"], 0, # rotation None, # icc profile ) ) if len(rest) > lenimgdata: # another image is here assert rest[lenimgdata:][:14].lower() == b"id=imagemagick" 
results.extend(parse_miff(rest[lenimgdata:])) # case "PseudoClass": elif hdata["class"] == "PseudoClass": assert "colors" in hdata if hdata.get("matte"): numchannels = 2 else: numchannels = 1 lenpal = 3 * hdata["colors"] * hdata["depth"] // 8 lenimgdata = numchannels * hdata["rows"] * hdata["columns"] assert len(rest) >= lenpal + lenimgdata, (len(rest), lenpal, lenimgdata) results.append( ( Colorspace.RGB, hdata.get("resolution") or (default_dpi, default_dpi), ImageFormat.MIFF, zlib.compress(rest[lenpal : lenpal + lenimgdata]), None, # FIXME: allow alpha channel smask hdata["columns"], hdata["rows"], rest[:lenpal], # palette False, # inverted hdata["depth"], 0, # rotation None, # icc profile ) ) if len(rest) > lenpal + lenimgdata: # another image is here assert rest[lenpal + lenimgdata :][:14].lower() == b"id=imagemagick", ( len(rest), lenpal, lenimgdata, ) results.extend(parse_miff(rest[lenpal + lenimgdata :])) return results # fmt: on def read_images( rawdata, colorspace, first_frame_only=False, rot=None, include_thumbnails=False ): im = BytesIO(rawdata) im.seek(0) imgdata = None try: imgdata = Image.open(im) except IOError as e: # test if it is a jpeg2000 image if rawdata[:12] == b"\x00\x00\x00\x0C\x6A\x50\x20\x20\x0D\x0A\x87\x0A": # image is jpeg2000 imgformat = ImageFormat.JPEG2000 elif rawdata[:8] == b"\x97\x4a\x42\x32\x0d\x0a\x1a\x0a": # For now we only support single-page generic coding of JBIG2, for example as generated by # https://github.com/agl/jbig2enc # # In fact, you can pipe an example image like `src/tests/input/mono.png` directly into img2pdf: # jbig2 src/tests/input/mono.png | img2pdf -o src/tests/output/mono.png.pdf # # For this we assume that the first 13 bytes are the JBIG2 file header describing a document with one page, # followed by a "page information" segment describing the dimensions of that page. # # The following annotated `hexdump -C 042.jb2` shows the first 40 bytes that we inspect directly.
# The first 24 bytes (until "||") have to match exactly, while the following 16 bytes are read by get_imgmetadata. # # 97 4a 42 32 0d 0a 1a 0a 01 00 00 00 01 00 00 00 # \_____________________/ | \_________/ \______ # magic-bytes org/unk pages seg-num # # 00 30 00 01 00 00 00 13 || 00 00 00 73 00 00 00 30 # _/ | | | \_________/ || \_________/ \_________/ # type refs page seg-size || width-px height-px # # 00 00 00 48 00 00 00 48 # \_________/ \_________/ # xres yres # # For more information on the data format, see: # * https://github.com/agl/jbig2enc/blob/ea05019/fcd14492.pdf # For more information about the generic coding, see: # * https://github.com/agl/jbig2enc/blob/ea05019/src/jbig2enc.cc#L898 imgformat = ImageFormat.JBIG2 if ( rawdata[:24] != b"\x97\x4a\x42\x32\x0d\x0a\x1a\x0a\x01\x00\x00\x00\x01\x00\x00\x00\x00\x30\x00\x01\x00\x00\x00\x13" ): raise ImageOpenError( "Unsupported JBIG2 format; only single-page generic coding is supported (e.g. from `jbig2enc`)." ) if ( rawdata[-22:] != b"\x00\x00\x00\x021\x00\x01\x00\x00\x00\x00\x00\x00\x00\x033\x00\x01\x00\x00\x00\x00" ): raise ImageOpenError( "Unsupported JBIG2 format; we expect end-of-page and end-of-file segments at the end (e.g. from `jbig2enc`)." ) elif rawdata[:14].lower() == b"id=imagemagick": # image is in MIFF format # this is useful for 16 bit CMYK because PNG cannot do CMYK and thus # we need PIL but PIL cannot do 16 bit imgformat = ImageFormat.MIFF else: raise ImageOpenError( "cannot read input image (not jpeg2000). 
" "PIL: error reading image: %s" % e ) else: logger.debug("PIL format = %s", imgdata.format) imgformat = getattr(ImageFormat, imgdata.format, ImageFormat.other) def cleanup(): if imgdata is not None: # the python-pil version 2.3.0-1ubuntu3 in Ubuntu does not have the # close() method try: imgdata.close() except AttributeError: pass im.close() logger.debug("imgformat = %s", imgformat.name) # depending on the input format, determine whether to pass the raw # image or the zlib compressed color information # JPEG and JPEG2000 can be embedded into the PDF as-is if imgformat == ImageFormat.JPEG or imgformat == ImageFormat.JPEG2000: color, ndpi, imgwidthpx, imgheightpx, rotation, iccp = get_imgmetadata( imgdata, imgformat, default_dpi, colorspace, rawdata, rot ) if color == Colorspace["1"]: raise JpegColorspaceError("jpeg can't be monochrome") if color == Colorspace["P"]: raise JpegColorspaceError("jpeg can't have a color palette") if color == Colorspace["RGBA"] and imgformat != ImageFormat.JPEG2000: raise JpegColorspaceError("jpeg can't have an alpha channel") logger.debug("read_images() embeds a JPEG") cleanup() depth = 8 if imgformat == ImageFormat.JPEG2000: *_, depth = jp2.parse(rawdata) return [ ( color, ndpi, imgformat, rawdata, None, imgwidthpx, imgheightpx, [], False, depth, rotation, iccp, ) ] # The MPO format is multiple JPEG images concatenated together # we use the offset and size information to dissect the MPO into its # individual JPEG images and then embed those into the PDF individually. # # The downside is, that this truncates the first JPEG as the MPO metadata # will still be in it but the referenced images are chopped off. We still # do it that way instead of adding the full MPO as the first image to not # store duplicate image data. 
if imgformat == ImageFormat.MPO: result = [] img_page_count = 0 assert len(imgdata._MpoImageFile__mpoffsets) == len(imgdata.mpinfo[0xB002]) num_frames = len(imgdata.mpinfo[0xB002]) # An MPO file can be a main image together with one or more thumbnails # if that is the case, then we only include all frames if the # --include-thumbnails option is given. If it is not, such an MPO file # will be embedded as is, so including its thumbnails but showing up # as a single image page in the resulting PDF. num_main_frames = 0 num_thumbnail_frames = 0 for i, mpent in enumerate(imgdata.mpinfo[0xB002]): # check only the first frame for being the main image if ( i == 0 and mpent["Attribute"]["DependentParentImageFlag"] and not mpent["Attribute"]["DependentChildImageFlag"] and mpent["Attribute"]["RepresentativeImageFlag"] and mpent["Attribute"]["MPType"] == "Baseline MP Primary Image" ): num_main_frames += 1 elif ( not mpent["Attribute"]["DependentParentImageFlag"] and mpent["Attribute"]["DependentChildImageFlag"] and not mpent["Attribute"]["RepresentativeImageFlag"] and mpent["Attribute"]["MPType"] in [ "Large Thumbnail (VGA Equivalent)", "Large Thumbnail (Full HD Equivalent)", ] ): num_thumbnail_frames += 1 logger.debug(f"number of frames: {num_frames}") logger.debug(f"number of main frames: {num_main_frames}") logger.debug(f"number of thumbnail frames: {num_thumbnail_frames}") # this MPO file is a main image plus zero or more thumbnails # embed as-is unless the --include-thumbnails option was given if num_frames == 1 or ( not include_thumbnails and num_main_frames == 1 and num_thumbnail_frames + 1 == num_frames ): color, ndpi, imgwidthpx, imgheightpx, rotation, iccp = get_imgmetadata( imgdata, imgformat, default_dpi, colorspace, rawdata, rot ) if color == Colorspace["1"]: raise JpegColorspaceError("jpeg can't be monochrome") if color == Colorspace["P"]: raise JpegColorspaceError("jpeg can't have a color palette") if color == Colorspace["RGBA"]: raise JpegColorspaceError("jpeg 
can't have an alpha channel") logger.debug("read_images() embeds an MPO verbatim") cleanup() return [ ( color, ndpi, ImageFormat.JPEG, rawdata, None, imgwidthpx, imgheightpx, [], False, 8, rotation, iccp, ) ] # If the control flow reaches here, the MPO has more than a single # frame but was not detected to be a main image followed by multiple # thumbnails. We thus treat this MPO as we do other multi-frame images # and include all its frames as individual pages. for offset, mpent in zip( imgdata._MpoImageFile__mpoffsets, imgdata.mpinfo[0xB002] ): if first_frame_only and img_page_count > 0: break with BytesIO(rawdata[offset : offset + mpent["Size"]]) as rawframe: with Image.open(rawframe) as imframe: # The first frame contains the data that makes the JPEG an MPO. # Could we thus embed an MPO into another MPO? Let's not support # such madness ;) if img_page_count > 0 and imframe.format != "JPEG": raise Exception("MPO payload must be a JPEG %s" % imframe.format) ( color, ndpi, imgwidthpx, imgheightpx, rotation, iccp, ) = get_imgmetadata( imframe, ImageFormat.JPEG, default_dpi, colorspace, rotreq=rot ) if color == Colorspace["1"]: raise JpegColorspaceError("jpeg can't be monochrome") if color == Colorspace["P"]: raise JpegColorspaceError("jpeg can't have a color palette") if color == Colorspace["RGBA"]: raise JpegColorspaceError("jpeg can't have an alpha channel") logger.debug("read_images() embeds a JPEG from MPO") result.append( ( color, ndpi, ImageFormat.JPEG, rawdata[offset : offset + mpent["Size"]], None, imgwidthpx, imgheightpx, [], False, 8, rotation, iccp, ) ) img_page_count += 1 cleanup() return result # We can directly embed the IDAT chunk of PNG images if the PNG is not # interlaced # # PIL does not provide the information whether a PNG was stored interlaced # or not. Thus, we retrieve that info manually by looking at byte 13 in the # IHDR chunk. We know where to find that in the file because the IHDR chunk # must be the first chunk.
if imgformat == ImageFormat.PNG and rawdata[28] == 0: color, ndpi, imgwidthpx, imgheightpx, rotation, iccp = get_imgmetadata( imgdata, imgformat, default_dpi, colorspace, rawdata, rot ) if ( color != Colorspace.RGBA and color != Colorspace.LA and color != Colorspace.PA and "transparency" not in imgdata.info ): pngidat, palette = parse_png(rawdata) # PIL does not provide the information about the original bits per # sample. Thus, we retrieve that info manually by looking at byte 9 in # the IHDR chunk. We know where to find that in the file because the # IHDR chunk must be the first chunk depth = rawdata[24] if depth not in [1, 2, 4, 8, 16]: raise ValueError("invalid bit depth: %d" % depth) # we embed the PNG only if it is not at the same time palette based # and has an icc profile because PDF doesn't support icc profiles # on palette images if palette == b"" or iccp is None: logger.debug("read_images() embeds a PNG") cleanup() return [ ( color, ndpi, imgformat, pngidat, None, imgwidthpx, imgheightpx, palette, False, depth, rotation, iccp, ) ] if imgformat == ImageFormat.JBIG2: color, ndpi, imgwidthpx, imgheightpx, rotation, iccp = get_imgmetadata( imgdata, imgformat, default_dpi, colorspace, rawdata, rot ) streamdata = rawdata[13:-22] # Strip file header and footer return [ ( color, ndpi, imgformat, streamdata, None, imgwidthpx, imgheightpx, [], False, 1, rotation, iccp, ) ] if imgformat == ImageFormat.MIFF: return parse_miff(rawdata) # If our input is not JPEG or PNG, then we might have a format that # supports multiple frames (like TIFF or GIF), so we need a loop to # iterate through all frames of the image. 
# # Each frame gets compressed using PNG compression *except* if: # # * The image is monochrome => encode using CCITT group 4 # # * The image is CMYK => zip plain RGB data # # * We are handling a CCITT encoded TIFF frame => embed data result = [] img_page_count = 0 # loop through all frames of the image (example: multipage TIFF) while True: try: imgdata.seek(img_page_count) except EOFError: break if first_frame_only and img_page_count > 0: break # PIL is unable to preserve the data of 16-bit RGB TIFF files and will # convert it to 8-bit without the possibility to retrieve the original # data # https://github.com/python-pillow/Pillow/issues/1888 # # Some tiff images do not have BITSPERSAMPLE set. Use this to create # such a tiff: tiffset -u 258 test.tif if ( imgformat == ImageFormat.TIFF and max(imgdata.tag_v2.get(TiffImagePlugin.BITSPERSAMPLE, [1])) > 8 ): raise ValueError("PIL is unable to preserve more than 8 bits per sample") # We can directly copy the data out of a CCITT Group 4 encoded TIFF, if it # only contains a single strip if ( imgformat == ImageFormat.TIFF and imgdata.info["compression"] == "group4" and len(imgdata.tag_v2[TiffImagePlugin.STRIPOFFSETS]) == 1 and len(imgdata.tag_v2[TiffImagePlugin.STRIPBYTECOUNTS]) == 1 ): photo = imgdata.tag_v2[TiffImagePlugin.PHOTOMETRIC_INTERPRETATION] inverted = False if photo == 0: inverted = True elif photo != 1: raise ValueError( "unsupported photometric interpretation for " "group4 tiff: %d" % photo ) color, ndpi, imgwidthpx, imgheightpx, rotation, iccp = get_imgmetadata( imgdata, imgformat, default_dpi, colorspace, rawdata, rot ) offset, length = ccitt_payload_location_from_pil(imgdata) im.seek(offset) rawdata = im.read(length) fillorder = imgdata.tag_v2.get(TiffImagePlugin.FILLORDER) if fillorder is None: # no FillOrder: nothing to do pass elif fillorder == 1: # msb-to-lsb: nothing to do pass elif fillorder == 2: logger.debug("fillorder is lsb-to-msb => reverse bits") # lsb-to-msb: reverse bits of each byte 
rawdata = bytearray(rawdata) for i in range(len(rawdata)): rawdata[i] = TIFFBitRevTable[rawdata[i]] rawdata = bytes(rawdata) else: raise ValueError("unsupported FillOrder: %d" % fillorder) logger.debug("read_images() embeds Group4 from TIFF") result.append( ( color, ndpi, ImageFormat.CCITTGroup4, rawdata, None, imgwidthpx, imgheightpx, [], inverted, 1, rotation, iccp, ) ) img_page_count += 1 continue logger.debug("Converting frame: %d" % img_page_count) color, ndpi, imgwidthpx, imgheightpx, rotation, iccp = get_imgmetadata( imgdata, imgformat, default_dpi, colorspace, rotreq=rot ) newimg = None if color == Colorspace["1"]: try: ccittdata = transcode_monochrome(imgdata) logger.debug("read_images() encoded a B/W image as CCITT group 4") result.append( ( color, ndpi, ImageFormat.CCITTGroup4, ccittdata, None, imgwidthpx, imgheightpx, [], False, 1, rotation, iccp, ) ) img_page_count += 1 continue except Exception as e: logger.debug(e) logger.debug("Converting colorspace 1 to L") newimg = imgdata.convert("L") color = Colorspace.L elif color in [ Colorspace.RGB, Colorspace.RGBA, Colorspace.L, Colorspace.LA, Colorspace.CMYK, Colorspace["CMYK;I"], Colorspace.P, ]: logger.debug("Colorspace is OK: %s", color) newimg = imgdata else: raise ValueError("unknown or unsupported colorspace: %s" % color.name) # the PNG format does not support CMYK, so we fall back to normal # compression if color in [Colorspace.CMYK, Colorspace["CMYK;I"]]: imggz = zlib.compress(newimg.tobytes()) logger.debug("read_images() encoded CMYK with flate compression") result.append( ( color, ndpi, imgformat, imggz, None, imgwidthpx, imgheightpx, [], False, 8, rotation, iccp, ) ) else: if color in [Colorspace.P, Colorspace.PA] and iccp is not None: # PDF does not support palette images with icc profile if color == Colorspace.P: newcolor = Colorspace.RGB newimg = newimg.convert(mode="RGB") elif color == Colorspace.PA: newcolor = Colorspace.RGBA newimg = newimg.convert(mode="RGBA") smaskidat = None elif ( color 
== Colorspace.RGBA or color == Colorspace.LA or color == Colorspace.PA or "transparency" in newimg.info ): if color == Colorspace.RGBA: newcolor = color r, g, b, a = newimg.split() newimg = Image.merge("RGB", (r, g, b)) elif color == Colorspace.LA: newcolor = color l, a = newimg.split() newimg = l elif color == Colorspace.PA or ( color == Colorspace.P and "transparency" in newimg.info ): newcolor = color a = newimg.convert(mode="RGBA").split()[-1] else: newcolor = Colorspace.RGBA r, g, b, a = newimg.convert(mode="RGBA").split() newimg = Image.merge("RGB", (r, g, b)) smaskidat, *_ = to_png_data(a) logger.warning( "Image contains an alpha channel. Computing a separate " "soft mask (/SMask) image to store transparency in PDF." ) else: newcolor = color smaskidat = None pngidat, palette, depth = to_png_data(newimg) logger.debug("read_images() encoded an image as PNG") result.append( ( newcolor, ndpi, ImageFormat.PNG, pngidat, smaskidat, imgwidthpx, imgheightpx, palette, False, depth, rotation, iccp, ) ) img_page_count += 1 cleanup() return result def to_png_data(img): # cheapo version to retrieve a PNG encoding of the payload is to # just save it with PIL. In the future this could be replaced by # dedicated function applying the Paeth PNG filter to the raw pixel pngbuffer = BytesIO() img.save(pngbuffer, format="png") pngidat, palette = parse_png(pngbuffer.getvalue()) # PIL does not provide the information about the original bits per # sample. Thus, we retrieve that info manually by looking at byte 9 in # the IHDR chunk. 
We know where to find that in the file because the # IHDR chunk must be the first chunk pngbuffer.seek(24) depth = ord(pngbuffer.read(1)) if depth not in [1, 2, 4, 8, 16]: raise ValueError("invalid bit depth: %d" % depth) return pngidat, palette, depth # converts a length in pixels to a length in PDF units (1/72 of an inch) def px_to_pt(length, dpi): return 72.0 * length / dpi def cm_to_pt(length): return (72.0 * length) / 2.54 def mm_to_pt(length): return (72.0 * length) / 25.4 def in_to_pt(length): return 72.0 * length def get_layout_fun( pagesize=None, imgsize=None, border=None, fit=None, auto_orient=False ): def fitfun(fit, imgwidth, imgheight, fitwidth, fitheight): if fitwidth is None and fitheight is None: raise ValueError("fitwidth and fitheight cannot both be None") # if fit is fill or enlarge then it is okay if one of the dimensions # is negative but one of them must still be positive # if fit is not fill or enlarge then both dimensions must be positive if ( fit in [FitMode.fill, FitMode.enlarge] and fitwidth is not None and fitwidth < 0 and fitheight is not None and fitheight < 0 ): raise ValueError( "cannot fit into a rectangle where both dimensions are negative" ) elif fit not in [FitMode.fill, FitMode.enlarge] and ( (fitwidth is not None and fitwidth < 0) or (fitheight is not None and fitheight < 0) ): raise Exception( "cannot fit into a rectangle where either dimension is negative" ) def default(): if fitwidth is not None and fitheight is not None: newimgwidth = fitwidth newimgheight = (newimgwidth * imgheight) / imgwidth if newimgheight > fitheight: newimgheight = fitheight newimgwidth = (newimgheight * imgwidth) / imgheight elif fitwidth is None and fitheight is not None: newimgheight = fitheight newimgwidth = (newimgheight * imgwidth) / imgheight elif fitheight is None and fitwidth is not None: newimgwidth = fitwidth newimgheight = (newimgwidth * imgheight) / imgwidth else: raise ValueError("fitwidth and fitheight cannot both be None") return
newimgwidth, newimgheight if fit is None or fit == FitMode.into: return default() elif fit == FitMode.fill: if fitwidth is not None and fitheight is not None: newimgwidth = fitwidth newimgheight = (newimgwidth * imgheight) / imgwidth if newimgheight < fitheight: newimgheight = fitheight newimgwidth = (newimgheight * imgwidth) / imgheight elif fitwidth is None and fitheight is not None: newimgheight = fitheight newimgwidth = (newimgheight * imgwidth) / imgheight elif fitheight is None and fitwidth is not None: newimgwidth = fitwidth newimgheight = (newimgwidth * imgheight) / imgwidth else: raise ValueError("fitwidth and fitheight cannot both be None") return newimgwidth, newimgheight elif fit == FitMode.exact: if fitwidth is not None and fitheight is not None: return fitwidth, fitheight elif fitwidth is None and fitheight is not None: newimgheight = fitheight newimgwidth = (newimgheight * imgwidth) / imgheight elif fitheight is None and fitwidth is not None: newimgwidth = fitwidth newimgheight = (newimgwidth * imgheight) / imgwidth else: raise ValueError("fitwidth and fitheight cannot both be None") return newimgwidth, newimgheight elif fit == FitMode.shrink: if fitwidth is not None and fitheight is not None: if imgwidth <= fitwidth and imgheight <= fitheight: return imgwidth, imgheight elif fitwidth is None and fitheight is not None: if imgheight <= fitheight: return imgwidth, imgheight elif fitheight is None and fitwidth is not None: if imgwidth <= fitwidth: return imgwidth, imgheight else: raise ValueError("fitwidth and fitheight cannot both be None") return default() elif fit == FitMode.enlarge: if fitwidth is not None and fitheight is not None: if imgwidth > fitwidth or imgheight > fitheight: return imgwidth, imgheight elif fitwidth is None and fitheight is not None: if imgheight > fitheight: return imgwidth, imgheight elif fitheight is None and fitwidth is not None: if imgwidth > fitwidth: return imgwidth, imgheight else: raise ValueError("fitwidth and 
fitheight cannot both be None") return default() else: raise NotImplementedError # if no layout arguments are given, then the image size is equal to the # page size and will be drawn with the default dpi if pagesize is None and imgsize is None and border is None: return default_layout_fun if pagesize is None and imgsize is None and border is not None: def layout_fun(imgwidthpx, imgheightpx, ndpi): imgwidthpdf = px_to_pt(imgwidthpx, ndpi[0]) imgheightpdf = px_to_pt(imgheightpx, ndpi[1]) pagewidth = imgwidthpdf + 2 * border[1] pageheight = imgheightpdf + 2 * border[0] return pagewidth, pageheight, imgwidthpdf, imgheightpdf return layout_fun if border is None: border = (0, 0) # if the pagesize is given but the imagesize is not, then the imagesize # will be calculated from the pagesize, taking into account the border # and the fitting if pagesize is not None and imgsize is None: def layout_fun(imgwidthpx, imgheightpx, ndpi): if ( pagesize[0] is not None and pagesize[1] is not None and auto_orient and ( (imgwidthpx > imgheightpx and pagesize[0] < pagesize[1]) or (imgwidthpx < imgheightpx and pagesize[0] > pagesize[1]) ) ): pagewidth, pageheight = pagesize[1], pagesize[0] newborder = border[1], border[0] else: pagewidth, pageheight = pagesize[0], pagesize[1] newborder = border if pagewidth is not None: fitwidth = pagewidth - 2 * newborder[1] else: fitwidth = None if pageheight is not None: fitheight = pageheight - 2 * newborder[0] else: fitheight = None if ( fit in [FitMode.fill, FitMode.enlarge] and fitwidth is not None and fitwidth < 0 and fitheight is not None and fitheight < 0 ): raise NegativeDimensionError( "at least one border dimension must be smaller than half " "the respective page dimension" ) elif fit not in [FitMode.fill, FitMode.enlarge] and ( (fitwidth is not None and fitwidth < 0) or (fitheight is not None and fitheight < 0) ): raise NegativeDimensionError( "one border dimension is larger than half of the " "respective page dimension" ) imgwidthpdf,
imgheightpdf = fitfun( fit, px_to_pt(imgwidthpx, ndpi[0]), px_to_pt(imgheightpx, ndpi[1]), fitwidth, fitheight, ) if pagewidth is None: pagewidth = imgwidthpdf + border[1] * 2 if pageheight is None: pageheight = imgheightpdf + border[0] * 2 return pagewidth, pageheight, imgwidthpdf, imgheightpdf return layout_fun def scale_imgsize(s, px, dpi): if s is None: return None mode, value = s if mode == ImgSize.abs: return value if mode == ImgSize.perc: return (px_to_pt(px, dpi) * value) / 100 if mode == ImgSize.dpi: return px_to_pt(px, value) raise NotImplementedError if pagesize is None and imgsize is not None: def layout_fun(imgwidthpx, imgheightpx, ndpi): imgwidthpdf, imgheightpdf = fitfun( fit, px_to_pt(imgwidthpx, ndpi[0]), px_to_pt(imgheightpx, ndpi[1]), scale_imgsize(imgsize[0], imgwidthpx, ndpi[0]), scale_imgsize(imgsize[1], imgheightpx, ndpi[1]), ) pagewidth = imgwidthpdf + 2 * border[1] pageheight = imgheightpdf + 2 * border[0] return pagewidth, pageheight, imgwidthpdf, imgheightpdf return layout_fun if pagesize is not None and imgsize is not None: def layout_fun(imgwidthpx, imgheightpx, ndpi): if ( pagesize[0] is not None and pagesize[1] is not None and auto_orient and ( (imgwidthpx > imgheightpx and pagesize[0] < pagesize[1]) or (imgwidthpx < imgheightpx and pagesize[0] > pagesize[1]) ) ): pagewidth, pageheight = pagesize[1], pagesize[0] else: pagewidth, pageheight = pagesize[0], pagesize[1] imgwidthpdf, imgheightpdf = fitfun( fit, px_to_pt(imgwidthpx, ndpi[0]), px_to_pt(imgheightpx, ndpi[1]), scale_imgsize(imgsize[0], imgwidthpx, ndpi[0]), scale_imgsize(imgsize[1], imgheightpx, ndpi[1]), ) return pagewidth, pageheight, imgwidthpdf, imgheightpdf return layout_fun raise NotImplementedError def default_layout_fun(imgwidthpx, imgheightpx, ndpi): imgwidthpdf = pagewidth = px_to_pt(imgwidthpx, ndpi[0]) imgheightpdf = pageheight = px_to_pt(imgheightpx, ndpi[1]) return pagewidth, pageheight, imgwidthpdf, imgheightpdf def get_fixed_dpi_layout_fun(fixed_dpi): """Layout 
function that overrides whatever DPI is claimed in input images. >>> layout_fun = get_fixed_dpi_layout_fun((300, 300)) >>> convert(image1, layout_fun=layout_fun, ... outputstream=...) """ def fixed_dpi_layout_fun(imgwidthpx, imgheightpx, ndpi): return default_layout_fun(imgwidthpx, imgheightpx, fixed_dpi) return fixed_dpi_layout_fun def find_scale(pagewidth, pageheight): """Find the power of 10 (10, 100, 1000...) that will reduce the scale below the PDF specification limit of 14400 PDF units (=200 inches). In principle we could also choose a scale that is not a power of 10. We use powers of 10 because numbers in the PDF format are represented in base-10 and using powers of 10 will thus just shift the comma and keep the numbers easily readable by humans as well.""" from math import log10, ceil major = max(pagewidth, pageheight) oversized = major / 14400.0 return 10 ** ceil(log10(oversized)) # Convert the image(s) to a `pdfdoc` object. # The `.writer` attribute holds the underlying engine document handle, and # `.output_version` the minimum version the caller should use when saving. # The main convert() wraps this implementation function. 
def convert_to_docobject(*images, **kwargs): _default_kwargs = dict( engine=None, title=None, author=None, creator=None, producer=None, creationdate=None, moddate=None, subject=None, keywords=None, colorspace=None, nodate=False, layout_fun=default_layout_fun, viewer_panes=None, viewer_initial_page=None, viewer_magnification=None, viewer_page_layout=None, viewer_fit_window=False, viewer_center_window=False, viewer_fullscreen=False, first_frame_only=False, allow_oversized=True, cropborder=None, bleedborder=None, trimborder=None, artborder=None, pdfa=None, rotation=None, include_thumbnails=False, ) for kwname, default in _default_kwargs.items(): if kwname not in kwargs: kwargs[kwname] = default pdf = pdfdoc( kwargs["engine"], "1.3", kwargs["title"], kwargs["author"], kwargs["creator"], kwargs["producer"], kwargs["creationdate"], kwargs["moddate"], kwargs["subject"], kwargs["keywords"], kwargs["nodate"], kwargs["viewer_panes"], kwargs["viewer_initial_page"], kwargs["viewer_magnification"], kwargs["viewer_page_layout"], kwargs["viewer_fit_window"], kwargs["viewer_center_window"], kwargs["viewer_fullscreen"], kwargs["pdfa"], ) # backwards compatibility with older img2pdf versions where the first # argument to the function had to be given as a list if len(images) == 1: # if only one argument was given and it is a list, expand it if isinstance(images[0], (list, tuple)): images = images[0] if not isinstance(images, (list, tuple)): images = [images] else: if len(images) == 0: raise ValueError("Unable to process empty list") for img in images: # img is allowed to be a path, a binary string representing image data # or a file-like object (really anything that implements read()) # or a pathlib.Path object (really anything that implements read_bytes()) rawdata = None for fun in "read", "read_bytes": try: rawdata = getattr(img, fun)() except AttributeError: pass if rawdata is None: if not isinstance(img, (str, bytes)): raise TypeError("Neither read(), read_bytes() nor is str or 
bytes") # the thing doesn't have a read() function, so try if we can treat # it as a file name try: f = open(img, "rb") except Exception: # whatever the exception is (string could contain NUL # characters or the path could just not exist) it's not a file # name so we now try treating it as raw image content rawdata = img else: # we are not using a "with" block here because we only want to # catch exceptions thrown by open(). The read() may throw its # own exceptions like MemoryError which should be handled # differently. rawdata = f.read() f.close() # md5 = hashlib.md5(rawdata).hexdigest() # with open("./testdata/" + md5, "wb") as f: # f.write(rawdata) for ( color, ndpi, imgformat, imgdata, smaskdata, imgwidthpx, imgheightpx, palette, inverted, depth, rotation, iccp, ) in read_images( rawdata, kwargs["colorspace"], kwargs["first_frame_only"], kwargs["rotation"], kwargs["include_thumbnails"], ): pagewidth, pageheight, imgwidthpdf, imgheightpdf = kwargs["layout_fun"]( imgwidthpx, imgheightpx, ndpi ) userunit = None if pagewidth < 3.00 or pageheight < 3.00: logger.warning( "pdf width or height is below 3.00 - too small for some viewers!" ) elif pagewidth > 14400.0 or pageheight > 14400.0: if kwargs["allow_oversized"]: userunit = find_scale(pagewidth, pageheight) pagewidth /= userunit pageheight /= userunit imgwidthpdf /= userunit imgheightpdf /= userunit else: raise PdfTooLargeError( "pdf width or height must not exceed 200 inches." 
) for border in ["crop", "bleed", "trim", "art"]: if kwargs[border + "border"] is None: continue if pagewidth < 2 * kwargs[border + "border"][1]: raise ValueError( "horizontal %s border larger than page width" % border ) if pageheight < 2 * kwargs[border + "border"][0]: raise ValueError( "vertical %s border larger than page height" % border ) # the image is always centered on the page imgxpdf = (pagewidth - imgwidthpdf) / 2.0 imgypdf = (pageheight - imgheightpdf) / 2.0 pdf.add_imagepage( color, imgwidthpx, imgheightpx, imgformat, imgdata, smaskdata, imgwidthpdf, imgheightpdf, imgxpdf, imgypdf, pagewidth, pageheight, userunit, palette, inverted, depth, rotation, kwargs["cropborder"], kwargs["bleedborder"], kwargs["trimborder"], kwargs["artborder"], iccp, ) pdf.finalize() return pdf # given one or more input image, depending on outputstream, either return a # string containing the whole PDF if outputstream is None or write the PDF # data to the given file-like object and return None # # Input images can be given as file like objects (they must implement read()), # as a binary string representing the image content or as filenames to the # images. 
def convert(*images, outputstream=None, **kwargs): pdf = convert_to_docobject(*images, **kwargs) if outputstream: pdf.tostream(outputstream) return return pdf.tostring() def parse_num(num, name): if num == "": return None unit = None if num.endswith("pt"): unit = Unit.pt elif num.endswith("cm"): unit = Unit.cm elif num.endswith("mm"): unit = Unit.mm elif num.endswith("in"): unit = Unit.inch else: try: num = float(num) except ValueError: msg = ( "%s is not a floating point number and doesn't have a " "valid unit: %s" % (name, num) ) raise argparse.ArgumentTypeError(msg) if unit is None: unit = Unit.pt else: num = num[:-2] try: num = float(num) except ValueError: msg = "%s is not a floating point number: %s" % (name, num) raise argparse.ArgumentTypeError(msg) if num < 0: msg = "%s must not be negative: %s" % (name, num) raise argparse.ArgumentTypeError(msg) if unit == Unit.cm: num = cm_to_pt(num) elif unit == Unit.mm: num = mm_to_pt(num) elif unit == Unit.inch: num = in_to_pt(num) return num def parse_imgsize_num(num, name): if num == "": return None unit = None if num.endswith("pt"): unit = ImgUnit.pt elif num.endswith("cm"): unit = ImgUnit.cm elif num.endswith("mm"): unit = ImgUnit.mm elif num.endswith("in"): unit = ImgUnit.inch elif num.endswith("dpi"): unit = ImgUnit.dpi elif num.endswith("%"): unit = ImgUnit.perc else: try: num = float(num) except ValueError: msg = ( "%s is not a floating point number and doesn't have a " "valid unit: %s" % (name, num) ) raise argparse.ArgumentTypeError(msg) if unit is None: unit = ImgUnit.pt else: # strip off unit from string if unit == ImgUnit.dpi: num = num[:-3] elif unit == ImgUnit.perc: num = num[:-1] else: num = num[:-2] try: num = float(num) except ValueError: msg = "%s is not a floating point number: %s" % (name, num) raise argparse.ArgumentTypeError(msg) if unit == ImgUnit.cm: num = (ImgSize.abs, cm_to_pt(num)) elif unit == ImgUnit.mm: num = (ImgSize.abs, mm_to_pt(num)) elif unit == ImgUnit.inch: num = (ImgSize.abs, 
in_to_pt(num)) elif unit == ImgUnit.pt: num = (ImgSize.abs, num) elif unit == ImgUnit.dpi: num = (ImgSize.dpi, num) elif unit == ImgUnit.perc: num = (ImgSize.perc, num) return num def parse_pagesize_rectarg(string): transposed = string.endswith("^T") if transposed: string = string[:-2] if papersizes.get(string.lower()): string = papersizes[string.lower()] if "x" not in string: # if there is no separating "x" in the string, then the string is # interpreted as the width w = parse_num(string, "width") h = None else: w, h = string.split("x", 1) w = parse_num(w, "width") h = parse_num(h, "height") if transposed: w, h = h, w if w is None and h is None: raise argparse.ArgumentTypeError("at least one dimension must be specified") return w, h def parse_imgsize_rectarg(string): transposed = string.endswith("^T") if transposed: string = string[:-2] if papersizes.get(string.lower()): string = papersizes[string.lower()] if "x" not in string: # if there is no separating "x" in the string, then the string is # interpreted as the width w = parse_imgsize_num(string, "width") h = None else: w, h = string.split("x", 1) w = parse_imgsize_num(w, "width") h = parse_imgsize_num(h, "height") if transposed: w, h = h, w if w is None and h is None: raise argparse.ArgumentTypeError("at least one dimension must be specified") return w, h def parse_colorspacearg(string): for c in Colorspace: if c.name == string: return c allowed = ", ".join([c.name for c in Colorspace]) raise argparse.ArgumentTypeError( "Unsupported colorspace: %s. Must be one of: %s." % (string, allowed) ) def parse_enginearg(string): for c in Engine: if c.name == string: return c allowed = ", ".join([c.name for c in Engine]) raise argparse.ArgumentTypeError( "Unsupported engine: %s. Must be one of: %s." 
% (string, allowed) ) def parse_borderarg(string): if ":" in string: h, v = string.split(":", 1) if h == "": raise argparse.ArgumentTypeError("missing value before colon") if v == "": raise argparse.ArgumentTypeError("missing value after colon") else: if string == "": raise argparse.ArgumentTypeError("border option cannot be empty") h, v = string, string h, v = parse_num(h, "left/right border"), parse_num(v, "top/bottom border") if h is None and v is None: raise argparse.ArgumentTypeError("missing value") return h, v def from_file(path): result = [] if path == "-": content = sys.stdin.buffer.read() else: with open(path, "rb") as f: content = f.read() for path in content.split(b"\0"): if path == b"": continue try: # test-read a byte from it so that we can abort early in case # we cannot read data from the file with open(path, "rb") as im: im.read(1) except IsADirectoryError: raise argparse.ArgumentTypeError('"%s" is a directory' % path) except PermissionError: raise argparse.ArgumentTypeError('"%s" permission denied' % path) except FileNotFoundError: raise argparse.ArgumentTypeError('"%s" does not exist' % path) result.append(path) return result def input_images(path_expr): if path_expr == "-": # we slurp in all data from stdin because we need to seek in it later result = [sys.stdin.buffer.read()] if len(result[0]) == 0: raise argparse.ArgumentTypeError('"%s" is empty' % path_expr) else: result = [] paths = [path_expr] if sys.platform == "win32" and ("*" in path_expr or "?"
in path_expr): # on windows, program is responsible for expanding wildcards such as *.jpg # glob won't return files that don't exist so we only use it for wildcards # paths without wildcards that do not exist will trigger "does not exist" from glob import glob paths = sorted(glob(path_expr)) for path in paths: try: if os.path.getsize(path) == 0: raise argparse.ArgumentTypeError('"%s" is empty' % path) # test-read a byte from it so that we can abort early in case # we cannot read data from the file with open(path, "rb") as im: im.read(1) except IsADirectoryError: raise argparse.ArgumentTypeError('"%s" is a directory' % path) except PermissionError: raise argparse.ArgumentTypeError('"%s" permission denied' % path) except FileNotFoundError: raise argparse.ArgumentTypeError('"%s" does not exist' % path) result.append(path) return result def parse_rotationarg(string): for m in Rotation: if m.name == string.lower(): return m raise argparse.ArgumentTypeError("unknown rotation value: %s" % string) def parse_fitarg(string): for m in FitMode: if m.name == string.lower(): return m raise argparse.ArgumentTypeError("unknown fit mode: %s" % string) def parse_panes(string): for m in PageMode: if m.name == string.lower(): return m allowed = ", ".join([m.name for m in PageMode]) raise argparse.ArgumentTypeError( "Unsupported page mode: %s. Must be one of: %s." % (string, allowed) ) def parse_magnification(string): for m in Magnification: if m.name == string.lower(): return m try: return float(string) except ValueError: pass allowed = ", ".join([m.name for m in Magnification]) raise argparse.ArgumentTypeError( "Unsupported magnification: %s. Must be " "a floating point number or one of: %s." % (string, allowed) ) def parse_layout(string): for l in PageLayout: if l.name == string.lower(): return l allowed = ", ".join([l.name for l in PageLayout]) raise argparse.ArgumentTypeError( "Unsupported page layout: %s. Must be one of: %s." 
% (string, allowed) ) def valid_date(string): # first try parsing in ISO8601 format try: return datetime.strptime(string, "%Y-%m-%d") except ValueError: pass try: return datetime.strptime(string, "%Y-%m-%dT%H:%M") except ValueError: pass try: return datetime.strptime(string, "%Y-%m-%dT%H:%M:%S") except ValueError: pass # then try dateutil try: from dateutil import parser except ImportError: pass else: try: return parser.parse(string) except: pass # as a last resort, try the local date utility try: import subprocess except ImportError: pass else: try: utime = subprocess.check_output(["date", "--date", string, "+%s"]) except subprocess.CalledProcessError: pass else: return datetime.fromtimestamp(int(utime)) raise argparse.ArgumentTypeError("cannot parse date: %s" % string) def gui(): import tkinter import tkinter.filedialog have_fitz = True try: import fitz except ImportError: have_fitz = False # from Python 3.7 Lib/idlelib/configdialog.py # Copyright 2015-2017 Terry Jan Reedy # Python License class VerticalScrolledFrame(tkinter.Frame): """A pure Tkinter vertically scrollable frame. * Use the 'interior' attribute to place widgets inside the scrollable frame * Construct and pack/place/grid normally * This frame only allows vertical scrolling """ def __init__(self, parent, *args, **kw): tkinter.Frame.__init__(self, parent, *args, **kw) # Create a canvas object and a vertical scrollbar for scrolling it. vscrollbar = tkinter.Scrollbar(self, orient=tkinter.VERTICAL) vscrollbar.pack(fill=tkinter.Y, side=tkinter.RIGHT, expand=tkinter.FALSE) canvas = tkinter.Canvas( self, borderwidth=0, highlightthickness=0, yscrollcommand=vscrollbar.set, width=240, ) canvas.pack(side=tkinter.LEFT, fill=tkinter.BOTH, expand=tkinter.TRUE) vscrollbar.config(command=canvas.yview) # Reset the view. canvas.xview_moveto(0) canvas.yview_moveto(0) # Create a frame inside the canvas which will be scrolled with it. 
self.interior = interior = tkinter.Frame(canvas) interior_id = canvas.create_window(0, 0, window=interior, anchor=tkinter.NW) # Track changes to the canvas and frame width and sync them, # also updating the scrollbar. def _configure_interior(event): # Update the scrollbars to match the size of the inner frame. size = (interior.winfo_reqwidth(), interior.winfo_reqheight()) canvas.config(scrollregion="0 0 %s %s" % size) interior.bind("<Configure>", _configure_interior) def _configure_canvas(event): if interior.winfo_reqwidth() != canvas.winfo_width(): # Update the inner frame's width to fill the canvas. canvas.itemconfigure(interior_id, width=canvas.winfo_width()) canvas.bind("<Configure>", _configure_canvas) return # From Python 3.7 Lib/tkinter/__init__.py # Copyright 2000 Fredrik Lundh # Python License # # add support for 'state' and 'name' kwargs # add support for updating list of options class OptionMenu(tkinter.Menubutton): """OptionMenu which allows the user to select a value from a menu.""" def __init__(self, master, variable, value, *values, **kwargs): """Construct an optionmenu widget with the parent MASTER, with the resource textvariable set to VARIABLE, the initially selected value VALUE, the other menu values VALUES and an additional keyword argument command.""" kw = { "borderwidth": 2, "textvariable": variable, "indicatoron": 1, "relief": tkinter.RAISED, "anchor": "c", "highlightthickness": 2, } if "state" in kwargs: kw["state"] = kwargs["state"] del kwargs["state"] if "name" in kwargs: kw["name"] = kwargs["name"] del kwargs["name"] tkinter.Widget.__init__(self, master, "menubutton", kw) self.widgetName = "tk_optionMenu" self.callback = kwargs.get("command") self.variable = variable if "command" in kwargs: del kwargs["command"] if kwargs: raise tkinter.TclError("unknown option -" + list(kwargs.keys())[0]) self.set_values([value] + list(values)) def __getitem__(self, name): if name == "menu": return self.__menu return tkinter.Widget.__getitem__(self, name) def set_values(self,
values): menu = self.__menu = tkinter.Menu(self, name="menu", tearoff=0) self.menuname = menu._w for v in values: menu.add_command( label=v, command=tkinter._setit(self.variable, v, self.callback) ) self["menu"] = menu def destroy(self): """Destroy this widget and the associated menu.""" tkinter.Menubutton.destroy(self) self.__menu = None root = tkinter.Tk() app = tkinter.Frame(master=root) infiles = [] maxpagewidth = 0 maxpageheight = 0 doc = None args = { "engine": tkinter.StringVar(), "auto_orient": tkinter.BooleanVar(), "fit": tkinter.StringVar(), "title": tkinter.StringVar(), "author": tkinter.StringVar(), "creator": tkinter.StringVar(), "producer": tkinter.StringVar(), "subject": tkinter.StringVar(), "keywords": tkinter.StringVar(), "nodate": tkinter.BooleanVar(), "creationdate": tkinter.StringVar(), "moddate": tkinter.StringVar(), "viewer_panes": tkinter.StringVar(), "viewer_initial_page": tkinter.IntVar(), "viewer_magnification": tkinter.StringVar(), "viewer_page_layout": tkinter.StringVar(), "viewer_fit_window": tkinter.BooleanVar(), "viewer_center_window": tkinter.BooleanVar(), "viewer_fullscreen": tkinter.BooleanVar(), "pagesize_dropdown": tkinter.StringVar(), "pagesize_width": tkinter.DoubleVar(), "pagesize_height": tkinter.DoubleVar(), "imgsize_dropdown": tkinter.StringVar(), "imgsize_width": tkinter.DoubleVar(), "imgsize_height": tkinter.DoubleVar(), "colorspace": tkinter.StringVar(), "first_frame_only": tkinter.BooleanVar(), } args["engine"].set("auto") args["title"].set("") args["auto_orient"].set(False) args["fit"].set("into") args["colorspace"].set("auto") args["viewer_panes"].set("auto") args["viewer_initial_page"].set(1) args["viewer_magnification"].set("auto") args["viewer_page_layout"].set("auto") args["first_frame_only"].set(False) args["pagesize_dropdown"].set("auto") args["imgsize_dropdown"].set("auto") def on_open_button(): nonlocal infiles nonlocal doc nonlocal maxpagewidth nonlocal maxpageheight infiles = 
tkinter.filedialog.askopenfilenames( parent=root, title="open image", filetypes=[ ( "images", "*.bmp *.eps *.gif *.ico *.jpeg *.jpg *.jp2 *.pcx *.png *.ppm *.tiff", ), ("all files", "*"), ], # initialdir="/home/josch/git/plakativ", # initialfile="test.pdf", ) if have_fitz: with BytesIO() as f: save_pdf(f) f.seek(0) doc = fitz.open(stream=f, filetype="pdf") for page in doc: if page.get_displaylist().rect.width > maxpagewidth: maxpagewidth = page.get_displaylist().rect.width if page.get_displaylist().rect.height > maxpageheight: maxpageheight = page.get_displaylist().rect.height draw() def save_pdf(stream): pagesizearg = None if args["pagesize_dropdown"].get() == "auto": # nothing to do pass elif args["pagesize_dropdown"].get() == "custom": pagesizearg = args["pagesize_width"].get(), args["pagesize_height"].get() elif args["pagesize_dropdown"].get() in papernames.values(): raise NotImplementedError() else: raise Exception("no such pagesize: %s" % args["pagesize_dropdown"].get()) imgsizearg = None if args["imgsize_dropdown"].get() == "auto": # nothing to do pass elif args["imgsize_dropdown"].get() == "custom": imgsizearg = args["imgsize_width"].get(), args["imgsize_height"].get() elif args["imgsize_dropdown"].get() in papernames.values(): raise NotImplementedError() else: raise Exception("no such imgsize: %s" % args["imgsize_dropdown"].get()) borderarg = None layout_fun = get_layout_fun( pagesizearg, imgsizearg, borderarg, args["fit"].get(), args["auto_orient"].get(), ) viewer_panesarg = None if args["viewer_panes"].get() == "auto": # nothing to do pass elif args["viewer_panes"].get() in PageMode: viewer_panesarg = args["viewer_panes"].get() else: raise Exception("no such viewer_panes: %s" % args["viewer_panes"].get()) viewer_magnificationarg = None if args["viewer_magnification"].get() == "auto": # nothing to do pass elif args["viewer_magnification"].get() in Magnification: viewer_magnificationarg = args["viewer_magnification"].get() else: raise Exception( "no such
viewer_magnification: %s" % args["viewer_magnification"].get() ) viewer_page_layoutarg = None if args["viewer_page_layout"].get() == "auto": # nothing to do pass elif args["viewer_page_layout"].get() in PageLayout: viewer_page_layoutarg = args["viewer_page_layout"].get() else: raise Exception( "no such viewer_page_layout: %s" % args["viewer_page_layout"].get() ) colorspacearg = None if args["colorspace"].get() != "auto": colorspacearg = next( v for v in Colorspace if v.name == args["colorspace"].get() ) enginearg = None if args["engine"].get() != "auto": enginearg = next(v for v in Engine if v.name == args["engine"].get()) convert( *infiles, engine=enginearg, title=args["title"].get() if args["title"].get() else None, author=args["author"].get() if args["author"].get() else None, creator=args["creator"].get() if args["creator"].get() else None, producer=args["producer"].get() if args["producer"].get() else None, creationdate=args["creationdate"].get() if args["creationdate"].get() else None, moddate=args["moddate"].get() if args["moddate"].get() else None, subject=args["subject"].get() if args["subject"].get() else None, keywords=args["keywords"].get() if args["keywords"].get() else None, colorspace=colorspacearg, nodate=args["nodate"].get(), layout_fun=layout_fun, viewer_panes=viewer_panesarg, viewer_initial_page=args["viewer_initial_page"].get() if args["viewer_initial_page"].get() > 1 else None, viewer_magnification=viewer_magnificationarg, viewer_page_layout=viewer_page_layoutarg, viewer_fit_window=(args["viewer_fit_window"].get() or None), viewer_center_window=(args["viewer_center_window"].get() or None), viewer_fullscreen=(args["viewer_fullscreen"].get() or None), outputstream=stream, first_frame_only=args["first_frame_only"].get(), cropborder=None, bleedborder=None, trimborder=None, artborder=None, ) def on_save_button(): filename = tkinter.filedialog.asksaveasfilename( parent=root, title="save PDF", defaultextension=".pdf", filetypes=[("pdf documents",
"*.pdf"), ("all files", "*")], # initialdir="/home/josch/git/plakativ", # initialfile=base + "_poster" + ext, ) with open(filename, "wb") as f: save_pdf(f) root.title("img2pdf") app.pack(fill=tkinter.BOTH, expand=tkinter.TRUE) canvas = tkinter.Canvas(app, bg="black") def draw(): canvas.delete(tkinter.ALL) if not infiles: canvas.create_text( canvas.size[0] / 2, canvas.size[1] / 2, text='Click on the "Open Image(s)" button in the upper right.', fill="white", ) return if not doc: canvas.create_text( canvas.size[0] / 2, canvas.size[1] / 2, text="PyMuPDF not available. Install the Python fitz module\n" + "for preview functionality.", fill="white", ) return canvas_padding = 10 # factor to convert from pdf dimensions (given in pt) into canvas # dimensions (given in pixels) zoom = min( (canvas.size[0] - canvas_padding) / maxpagewidth, (canvas.size[1] - canvas_padding) / maxpageheight, ) pagenum = 0 mat_0 = fitz.Matrix(zoom, zoom) canvas.image = tkinter.PhotoImage( data=doc[pagenum] .get_displaylist() .get_pixmap(matrix=mat_0, alpha=False) .tobytes("ppm") ) canvas.create_image( (canvas.size[0] - maxpagewidth * zoom) / 2, (canvas.size[1] - maxpageheight * zoom) / 2, anchor=tkinter.NW, image=canvas.image, ) canvas.create_rectangle( (canvas.size[0] - maxpagewidth * zoom) / 2, (canvas.size[1] - maxpageheight * zoom) / 2, (canvas.size[0] - maxpagewidth * zoom) / 2 + canvas.image.width(), (canvas.size[1] - maxpageheight * zoom) / 2 + canvas.image.height(), outline="red", ) def on_resize(event): canvas.size = (event.width, event.height) draw() canvas.pack(fill=tkinter.BOTH, side=tkinter.LEFT, expand=tkinter.TRUE) canvas.bind("<Configure>", on_resize) frame_right = tkinter.Frame(app) frame_right.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.Y) top_frame = tkinter.Frame(frame_right) top_frame.pack(fill=tkinter.X) tkinter.Button(top_frame, text="Open Image(s)", command=on_open_button).pack( side=tkinter.LEFT, expand=tkinter.TRUE, fill=tkinter.X ) tkinter.Button(top_frame,
text="Help", state=tkinter.DISABLED).pack( side=tkinter.RIGHT, expand=tkinter.TRUE, fill=tkinter.X ) frame1 = VerticalScrolledFrame(frame_right) frame1.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.Y) output_options = tkinter.LabelFrame(frame1.interior, text="Output Options") output_options.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.X) tkinter.Label(output_options, text="colorspace").grid( row=0, column=0, sticky=tkinter.W ) OptionMenu(output_options, args["colorspace"], "auto", state=tkinter.DISABLED).grid( row=0, column=1, sticky=tkinter.W ) tkinter.Label(output_options, text="engine").grid(row=1, column=0, sticky=tkinter.W) OptionMenu(output_options, args["engine"], "auto", state=tkinter.DISABLED).grid( row=1, column=1, sticky=tkinter.W ) tkinter.Checkbutton( output_options, text="Suppress timestamp", variable=args["nodate"], state=tkinter.DISABLED, ).grid(row=2, column=0, columnspan=2, sticky=tkinter.W) tkinter.Checkbutton( output_options, text="only first frame", variable=args["first_frame_only"], state=tkinter.DISABLED, ).grid(row=3, column=0, columnspan=2, sticky=tkinter.W) tkinter.Checkbutton( output_options, text="force large input", state=tkinter.DISABLED ).grid(row=4, column=0, columnspan=2, sticky=tkinter.W) image_size_frame = tkinter.LabelFrame(frame1.interior, text="Image size") image_size_frame.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.X) OptionMenu( image_size_frame, args["imgsize_dropdown"], *(["auto", "custom"] + sorted(papernames.values())), state=tkinter.DISABLED, ).grid(row=1, column=0, columnspan=3, sticky=tkinter.W) tkinter.Label( image_size_frame, text="Width:", state=tkinter.DISABLED, name="size_label_width" ).grid(row=2, column=0, sticky=tkinter.W) tkinter.Spinbox( image_size_frame, format="%.2f", increment=0.01, from_=0, to=100, width=5, state=tkinter.DISABLED, name="spinbox_width", ).grid(row=2, column=1, sticky=tkinter.W) tkinter.Label( image_size_frame, text="mm", state=tkinter.DISABLED, 
name="size_label_width_mm" ).grid(row=2, column=2, sticky=tkinter.W) tkinter.Label( image_size_frame, text="Height:", state=tkinter.DISABLED, name="size_label_height", ).grid(row=3, column=0, sticky=tkinter.W) tkinter.Spinbox( image_size_frame, format="%.2f", increment=0.01, from_=0, to=100, width=5, state=tkinter.DISABLED, name="spinbox_height", ).grid(row=3, column=1, sticky=tkinter.W) tkinter.Label( image_size_frame, text="mm", state=tkinter.DISABLED, name="size_label_height_mm" ).grid(row=3, column=2, sticky=tkinter.W) page_size_frame = tkinter.LabelFrame(frame1.interior, text="Page size") page_size_frame.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.X) OptionMenu( page_size_frame, args["pagesize_dropdown"], *(["auto", "custom"] + sorted(papernames.values())), state=tkinter.DISABLED, ).grid(row=1, column=0, columnspan=3, sticky=tkinter.W) tkinter.Label( page_size_frame, text="Width:", state=tkinter.DISABLED, name="size_label_width" ).grid(row=2, column=0, sticky=tkinter.W) tkinter.Spinbox( page_size_frame, format="%.2f", increment=0.01, from_=0, to=100, width=5, state=tkinter.DISABLED, name="spinbox_width", ).grid(row=2, column=1, sticky=tkinter.W) tkinter.Label( page_size_frame, text="mm", state=tkinter.DISABLED, name="size_label_width_mm" ).grid(row=2, column=2, sticky=tkinter.W) tkinter.Label( page_size_frame, text="Height:", state=tkinter.DISABLED, name="size_label_height", ).grid(row=3, column=0, sticky=tkinter.W) tkinter.Spinbox( page_size_frame, format="%.2f", increment=0.01, from_=0, to=100, width=5, state=tkinter.DISABLED, name="spinbox_height", ).grid(row=3, column=1, sticky=tkinter.W) tkinter.Label( page_size_frame, text="mm", state=tkinter.DISABLED, name="size_label_height_mm" ).grid(row=3, column=2, sticky=tkinter.W) layout_frame = tkinter.LabelFrame(frame1.interior, text="Layout") layout_frame.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.X) tkinter.Label(layout_frame, text="border", state=tkinter.DISABLED).grid( row=0, 
column=0, sticky=tkinter.W ) tkinter.Spinbox(layout_frame, state=tkinter.DISABLED).grid( row=0, column=1, sticky=tkinter.W ) tkinter.Label(layout_frame, text="fit", state=tkinter.DISABLED).grid( row=1, column=0, sticky=tkinter.W ) OptionMenu( layout_frame, args["fit"], *[v.name for v in FitMode], state=tkinter.DISABLED ).grid(row=1, column=1, sticky=tkinter.W) tkinter.Checkbutton( layout_frame, text="auto orient", state=tkinter.DISABLED, variable=args["auto_orient"], ).grid(row=2, column=0, columnspan=2, sticky=tkinter.W) tkinter.Label(layout_frame, text="crop border", state=tkinter.DISABLED).grid( row=3, column=0, sticky=tkinter.W ) tkinter.Spinbox(layout_frame, state=tkinter.DISABLED).grid( row=3, column=1, sticky=tkinter.W ) tkinter.Label(layout_frame, text="bleed border", state=tkinter.DISABLED).grid( row=4, column=0, sticky=tkinter.W ) tkinter.Spinbox(layout_frame, state=tkinter.DISABLED).grid( row=4, column=1, sticky=tkinter.W ) tkinter.Label(layout_frame, text="trim border", state=tkinter.DISABLED).grid( row=5, column=0, sticky=tkinter.W ) tkinter.Spinbox(layout_frame, state=tkinter.DISABLED).grid( row=5, column=1, sticky=tkinter.W ) tkinter.Label(layout_frame, text="art border", state=tkinter.DISABLED).grid( row=6, column=0, sticky=tkinter.W ) tkinter.Spinbox(layout_frame, state=tkinter.DISABLED).grid( row=6, column=1, sticky=tkinter.W ) metadata_frame = tkinter.LabelFrame(frame1.interior, text="PDF metadata") metadata_frame.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.X) tkinter.Label(metadata_frame, text="title", state=tkinter.DISABLED).grid( row=0, column=0, sticky=tkinter.W ) tkinter.Entry( metadata_frame, textvariable=args["title"], state=tkinter.DISABLED ).grid(row=0, column=1, sticky=tkinter.W) tkinter.Label(metadata_frame, text="author", state=tkinter.DISABLED).grid( row=1, column=0, sticky=tkinter.W ) tkinter.Entry( metadata_frame, textvariable=args["author"], state=tkinter.DISABLED ).grid(row=1, column=1, sticky=tkinter.W) 
tkinter.Label(metadata_frame, text="creator", state=tkinter.DISABLED).grid( row=2, column=0, sticky=tkinter.W ) tkinter.Entry( metadata_frame, textvariable=args["creator"], state=tkinter.DISABLED ).grid(row=2, column=1, sticky=tkinter.W) tkinter.Label(metadata_frame, text="producer", state=tkinter.DISABLED).grid( row=3, column=0, sticky=tkinter.W ) tkinter.Entry( metadata_frame, textvariable=args["producer"], state=tkinter.DISABLED ).grid(row=3, column=1, sticky=tkinter.W) tkinter.Label(metadata_frame, text="creation date", state=tkinter.DISABLED).grid( row=4, column=0, sticky=tkinter.W ) tkinter.Entry( metadata_frame, textvariable=args["creationdate"], state=tkinter.DISABLED ).grid(row=4, column=1, sticky=tkinter.W) tkinter.Label( metadata_frame, text="modification date", state=tkinter.DISABLED ).grid(row=5, column=0, sticky=tkinter.W) tkinter.Entry( metadata_frame, textvariable=args["moddate"], state=tkinter.DISABLED ).grid(row=5, column=1, sticky=tkinter.W) tkinter.Label(metadata_frame, text="subject", state=tkinter.DISABLED).grid( row=6, column=0, sticky=tkinter.W ) tkinter.Entry(metadata_frame, state=tkinter.DISABLED).grid( row=6, column=1, sticky=tkinter.W ) tkinter.Label(metadata_frame, text="keywords", state=tkinter.DISABLED).grid( row=7, column=0, sticky=tkinter.W ) tkinter.Entry(metadata_frame, state=tkinter.DISABLED).grid( row=7, column=1, sticky=tkinter.W ) viewer_frame = tkinter.LabelFrame(frame1.interior, text="PDF viewer options") viewer_frame.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.X) tkinter.Label(viewer_frame, text="panes", state=tkinter.DISABLED).grid( row=0, column=0, sticky=tkinter.W ) OptionMenu( viewer_frame, args["viewer_panes"], *(["auto"] + [v.name for v in PageMode]), state=tkinter.DISABLED, ).grid(row=0, column=1, sticky=tkinter.W) tkinter.Label(viewer_frame, text="initial page", state=tkinter.DISABLED).grid( row=1, column=0, sticky=tkinter.W ) tkinter.Spinbox( viewer_frame, increment=1, from_=1, to=10000, width=6, 
textvariable=args["viewer_initial_page"], state=tkinter.DISABLED, name="viewer_initial_page_spinbox", ).grid(row=1, column=1, sticky=tkinter.W) tkinter.Label(viewer_frame, text="magnification", state=tkinter.DISABLED).grid( row=2, column=0, sticky=tkinter.W ) OptionMenu( viewer_frame, args["viewer_magnification"], *(["auto", "custom"] + [v.name for v in Magnification]), state=tkinter.DISABLED, ).grid(row=2, column=1, sticky=tkinter.W) tkinter.Label(viewer_frame, text="page layout", state=tkinter.DISABLED).grid( row=3, column=0, sticky=tkinter.W ) OptionMenu( viewer_frame, args["viewer_page_layout"], *(["auto"] + [v.name for v in PageLayout]), state=tkinter.DISABLED, ).grid(row=3, column=1, sticky=tkinter.W) tkinter.Checkbutton( viewer_frame, text="fit window to page size", variable=args["viewer_fit_window"], state=tkinter.DISABLED, ).grid(row=4, column=0, columnspan=2, sticky=tkinter.W) tkinter.Checkbutton( viewer_frame, text="center window", variable=args["viewer_center_window"], state=tkinter.DISABLED, ).grid(row=5, column=0, columnspan=2, sticky=tkinter.W) tkinter.Checkbutton( viewer_frame, text="open in fullscreen", variable=args["viewer_fullscreen"], state=tkinter.DISABLED, ).grid(row=6, column=0, columnspan=2, sticky=tkinter.W) option_frame = tkinter.LabelFrame(frame1.interior, text="Program options") option_frame.pack(side=tkinter.TOP, expand=tkinter.TRUE, fill=tkinter.X) tkinter.Label(option_frame, text="Unit:", state=tkinter.DISABLED).grid( row=0, column=0, sticky=tkinter.W ) unit = tkinter.StringVar() unit.set("mm") OptionMenu(option_frame, unit, ["mm"], state=tkinter.DISABLED).grid( row=0, column=1, sticky=tkinter.W ) tkinter.Label(option_frame, text="Language:", state=tkinter.DISABLED).grid( row=1, column=0, sticky=tkinter.W ) language = tkinter.StringVar() language.set("English") OptionMenu(option_frame, language, ["English"], state=tkinter.DISABLED).grid( row=1, column=1, sticky=tkinter.W ) bottom_frame = tkinter.Frame(frame_right) 
bottom_frame.pack(fill=tkinter.X) tkinter.Button(bottom_frame, text="Save PDF", command=on_save_button).pack( side=tkinter.LEFT, expand=tkinter.TRUE, fill=tkinter.X ) tkinter.Button(bottom_frame, text="Exit", command=root.destroy).pack( side=tkinter.RIGHT, expand=tkinter.TRUE, fill=tkinter.X ) app.mainloop() def file_is_icc(fname): with open(fname, "rb") as f: data = f.read(40) if len(data) < 40: return False return data[36:] == b"acsp" def validate_icc(fname): if not file_is_icc(fname): raise argparse.ArgumentTypeError('"%s" is not an ICC profile' % fname) return fname def get_default_icc_profile(): for profile in [ "/usr/share/color/icc/sRGB.icc", "/usr/share/color/icc/OpenICC/sRGB.icc", "/usr/share/color/icc/colord/sRGB.icc", ]: if not os.path.exists(profile): continue if not file_is_icc(profile): continue return profile return "/usr/share/color/icc/sRGB.icc" def get_main_parser(): rendered_papersizes = "" for k, v in sorted(papersizes.items()): rendered_papersizes += " %-8s %s\n" % (papernames[k], v) parser = argparse.ArgumentParser( formatter_class=argparse.RawDescriptionHelpFormatter, description="""\ Losslessly convert raster images to PDF without re-encoding PNG, JPEG, and JPEG2000 images. This leads to a lossless conversion of PNG, JPEG and JPEG2000 images with the only added file size coming from the PDF container itself. Other raster graphics formats are losslessly stored using the same encoding that PNG uses. For images with transparency, the alpha channel will be stored as a separate soft mask. This is lossless, too. The output is sent to standard output so that it can be redirected into a file or to another program as part of a shell pipe. To directly write the output into a file, use the -o or --output option. Options: """, epilog="""\ Colorspace: Currently, the colorspace must be forced for JPEG 2000 images that are not in the RGB colorspace. Available colorspace options are based on Python Imaging Library (PIL) short handles. 
RGB RGB color L Grayscale 1 Black and white (internally converted to grayscale) CMYK CMYK color CMYK;I CMYK color with inversion (for CMYK JPEG files from Adobe) Paper sizes: You can specify the short hand paper size names shown in the first column in the table below as arguments to the --pagesize and --imgsize options. The width and height they are mapping to is shown in the second column. Giving the value in the second column has the same effect as giving the short hand in the first column. Appending ^T (a caret/circumflex followed by the letter T) turns the paper size from portrait into landscape. The postfix thus symbolizes the transpose. Note that on Windows cmd.exe the caret symbol is the escape character, so you need to put quotes around the option value. The values are case insensitive. %s Fit options: The img2pdf options for the --fit argument are shown in the first column in the table below. The function of these options can be mapped to the geometry operators of imagemagick. For users who are familiar with imagemagick, the corresponding operator is shown in the second column. The third column shows whether or not the aspect ratio is preserved for that option (same as in imagemagick). Just like imagemagick, img2pdf tries hard to preserve the aspect ratio, so if the --fit argument is not given, then the default is "into" which corresponds to the absence of any operator in imagemagick. The value of the --fit option is case insensitive. into | | Y | The default. Width and height values specify maximum | | | values. ---------+---+---+---------------------------------------------------------- fill | ^ | Y | Width and height values specify the minimum values. ---------+---+---+---------------------------------------------------------- exact | ! | N | Width and height emphatically given. 
---------+---+---+---------------------------------------------------------- shrink | > | Y | Shrinks an image with dimensions larger than the given | | | ones (and otherwise behaves like "into"). ---------+---+---+---------------------------------------------------------- enlarge | < | Y | Enlarges an image with dimensions smaller than the given | | | ones (and otherwise behaves like "into"). Argument parsing: Argument long options can be abbreviated to a prefix if the abbreviation is unambiguous. That is, the prefix must match a unique option. Beware of your shell interpreting argument values as special characters (like the semicolon in the CMYK;I colorspace option). If in doubt, put the argument values in single quotes. If you want an argument value to start with one or more minus characters, you must use the long option name and join them with an equal sign like so: $ img2pdf --author=--test-- If your input file name starts with one or more minus characters, either separate the input files from the other arguments by two minus signs: $ img2pdf -- --my-file-starts-with-two-minuses.jpg Or be more explicit about its relative path by prepending a ./: $ img2pdf ./--my-file-starts-with-two-minuses.jpg The order of non-positional arguments (all arguments other than the input images) does not matter. Examples: Lines starting with a dollar sign denote commands you can enter into your terminal. The dollar sign signifies your command prompt. It is not part of the command you type. Convert two scans in JPEG format to a PDF document. $ img2pdf --output out.pdf page1.jpg page2.jpg Use a custom dpi value for the input images: $ img2pdf --output out.pdf --imgsize 300dpi page1.jpg page2.jpg Convert a directory of JPEG images into a PDF with printable A4 pages in landscape mode. On each page, the photo takes the maximum amount of space while preserving its aspect ratio and a print border of 2 cm on the top and bottom and 2.5 cm on the left and right hand side. 
$ img2pdf --output out.pdf --pagesize "A4^T" --border 2cm:2.5cm *.jpg On each A4 page, fit images into a 10 cm times 15 cm rectangle but keep the original image size if the image is smaller than that. $ img2pdf --output out.pdf -S A4 --imgsize 10cmx15cm --fit shrink *.jpg Prepare a directory of photos to be printed borderless on photo paper with a 3:2 aspect ratio and rotate each page so that its orientation is the same as the input image. $ img2pdf --output out.pdf --pagesize 15cmx10cm --auto-orient *.jpg Encode a grayscale JPEG2000 image. The colorspace has to be forced as img2pdf cannot read it from the JPEG2000 file automatically. $ img2pdf --output out.pdf --colorspace L input.jp2 Written by Johannes Schauer Marin Rodrigues Report bugs at https://gitlab.mister-muffin.de/josch/img2pdf/issues """ % rendered_papersizes, ) parser.add_argument( "images", metavar="infile", type=input_images, nargs="*", help="Specifies the input file(s) in any format that can be read by " "the Python Imaging Library (PIL). If no input images are given, then " 'a single image is read from standard input. The special filename "-" ' "can be used once to read an image from standard input. To read a " 'file in the current directory with the filename "-" (or with a ' 'filename starting with "-"), pass it to img2pdf by explicitly ' 'stating its relative path like "./-". Cannot be used together with ' "--from-file.", ) parser.add_argument( "-v", "--verbose", action="store_true", help="Makes the program operate in verbose mode, printing messages on " "standard error.", ) parser.add_argument( "-V", "--version", action="version", version="%(prog)s " + __version__, help="Prints version information and exits.", ) parser.add_argument( "--gui", dest="gui", action="store_true", help="run experimental tkinter gui" ) parser.add_argument( "--from-file", metavar="FILE", type=from_file, default=[], help="Read the list of images from FILE instead of passing them as " "positional arguments. 
If this option is used, then the list of " "positional arguments must be empty. The paths to the input images " 'in FILE are separated by NUL bytes. If FILE is "-" then the paths ' "are expected on standard input. This option is useful if you want " "to pass more images than the maximum command length of your shell " "permits. This option can be used with commands like `find -print0`.", ) outargs = parser.add_argument_group( title="General output arguments", description="Arguments controlling the output format.", ) # In Python3 we have to output to sys.stdout.buffer because we write # bytes and not strings. In certain situations, like when the main # function is wrapped by contextlib.redirect_stdout(), sys.stdout does not # have the buffer attribute. Thus we write to sys.stdout by default and # to sys.stdout.buffer if it exists. outargs.add_argument( "-o", "--output", metavar="out", type=argparse.FileType("wb"), default=sys.stdout.buffer if hasattr(sys.stdout, "buffer") else sys.stdout, help="Makes the program output to a file instead of standard output.", ) outargs.add_argument( "-C", "--colorspace", metavar="colorspace", type=parse_colorspacearg, help=""" Forces the PIL colorspace. See the epilogue for a list of possible values. Usually the PDF colorspace would be derived from the color space of the input image. This option overrides the automatically detected colorspace from the input image and thus forces a certain colorspace in the output PDF /ColorSpace property. This is useful for JPEG 2000 images with a different colorspace than RGB.""", ) outargs.add_argument( "-D", "--nodate", action="store_true", help="Suppresses timestamps in the output and thus makes the output " "deterministic between individual runs. You can also manually " "set a date using the --moddate and --creationdate options.", ) outargs.add_argument( "--engine", metavar="engine", type=parse_enginearg, help="Choose PDF engine. Can be either internal, pikepdf or pdfrw. 
" "The internal engine does not have additional requirements and writes " "out a human readable PDF. The pikepdf engine requires the pikepdf " "Python module and qpdf library, is most featureful, can " 'linearize PDFs ("fast web view") and can compress more parts of it.' "The pdfrw engine requires the pdfrw Python " "module but does not support unicode metadata (See " "https://github.com/pmaupin/pdfrw/issues/39) or palette data (See " "https://github.com/pmaupin/pdfrw/issues/128).", ) outargs.add_argument( "--first-frame-only", action="store_true", help="By default, img2pdf will convert multi-frame images like " "multi-page TIFF or animated GIF images to one page per frame. " "This option will only let the first frame of every multi-frame " "input image be converted into a page in the resulting PDF.", ) outargs.add_argument( "--include-thumbnails", action="store_true", help="Some multi-frame formats like MPO carry a main image and " "one or more scaled-down copies of the main image (thumbnails). " "In such a case, img2pdf will only include the main image and " "not create additional pages for each of the thumbnails. If this " "option is set, img2pdf will instead create one page per frame and " "thus store each thumbnail on its own page.", ) outargs.add_argument( "--pillow-limit-break", action="store_true", help="img2pdf uses the Python Imaging Library Pillow to read input " "images. Pillow limits the maximum input image size to %d pixels " "to prevent decompression bomb denial of service attacks. If " "your input image contains more pixels than that, use this " "option to disable this safety measure during this run of img2pdf" % Image.MAX_IMAGE_PIXELS, ) if sys.platform == "win32": # on Windows, there are no default paths to search for an ICC profile # so make the argument required instead of optional outargs.add_argument( "--pdfa", type=validate_icc, help="Output a PDF/A-1b compliant document. 
The argument to this " "option is the path to the ICC profile that will be embedded into " "the resulting PDF.", ) else: outargs.add_argument( "--pdfa", nargs="?", const=get_default_icc_profile(), default=None, type=validate_icc, help="Output a PDF/A-1b compliant document. By default, this will " "embed either /usr/share/color/icc/sRGB.icc, " "/usr/share/color/icc/OpenICC/sRGB.icc or " "/usr/share/color/icc/colord/sRGB.icc as the color profile, whichever " "is found to exist first.", ) sizeargs = parser.add_argument_group( title="Image and page size and layout arguments", description="""\ Every input image will be placed on its own page. The image size is controlled by the dpi value of the input image or, if unset or missing, the default dpi of %.2f. By default, each page will have the same size as the image it shows. Thus, there will be no visible border between the image and the page border by default. If image size and page size are made different from each other by the options in this section, the image will always be centered in both dimensions. The image size and page size can be explicitly set using the --imgsize and --pagesize options, respectively. If either dimension of the image size is specified but the same dimension of the page size is not, then the latter will be derived from the former using an optional minimal distance between the image and the page border (given by the --border option) and/or a certain fitting strategy (given by the --fit option). The converse happens if a dimension of the page size is set but the same dimension of the image size is not. Any length value in below options is represented by the meta variable L which is a floating point value with an optional unit appended (without a space between them). The default unit is pt (1/72 inch, the PDF unit) and other allowed units are cm (centimeter), mm (millimeter), and in (inch). 
Any size argument of the format LxL in the options below specifies the width and height of a rectangle where the first L represents the width and the second L represents the height with an optional unit following each value as described above. Either width or height may be omitted. If the height is omitted, the separating x can be omitted as well. Omitting the width requires prefixing the height with the separating x. The missing dimension will be chosen so as not to change the image aspect ratio. Instead of giving the width and height explicitly, you may also specify some (case-insensitive) common page sizes such as letter and A4. See the epilogue at the bottom for a complete list of the valid sizes. The --fit option scales to fit the image into a rectangle that is either derived from the --imgsize option or otherwise from the --pagesize option. If the --border option is given in addition to the --imgsize option while the --pagesize option is not given, then the page size will be calculated from the image size, respecting the border setting. If the --border option is given in addition to the --pagesize option while the --imgsize option is not given, then the image size will be calculated from the page size, respecting the border setting. If the --border option is given while both the --pagesize and --imgsize options are passed, then the --border option will be ignored. The --pagesize option or the --imgsize option with the --border option will determine the MediaBox size of the resulting PDF document. """ % default_dpi, ) sizeargs.add_argument( "-S", "--pagesize", metavar="LxL", type=parse_pagesize_rectarg, help=""" Sets the size of the PDF pages. The short option is the upper case S because it is a mnemonic for being bigger than the image size.""", ) sizeargs.add_argument( "-s", "--imgsize", metavar="LxL", type=parse_imgsize_rectarg, help=""" Sets the size of the images on the PDF pages. 
In addition, the unit dpi is allowed which will set the image size as a value of dots per inch. Instead of a unit, width and height values may also have a percentage sign appended, indicating a resize of the image by that percentage. The short option is the lower case s because it is a mnemonic for being smaller than the page size. """, ) sizeargs.add_argument( "-b", "--border", metavar="L[:L]", type=parse_borderarg, help=""" Specifies the minimal distance between the image border and the PDF page border. This value is overridden by explicit values set by --pagesize or --imgsize. The value will be used when calculating page dimensions from the image dimensions or the other way round. One, or two length values can be given as an argument, separated by a colon. One value specifies the minimal border on all four sides. Two values specify the minimal border on the top/bottom and left/right, respectively. It is not possible to specify asymmetric borders because images will always be centered on the page. """, ) sizeargs.add_argument( "-f", "--fit", metavar="FIT", type=parse_fitarg, default=FitMode.into, help=""" If --imgsize is given, fits the image using these dimensions. Otherwise, fits the image into the dimensions given by --pagesize. FIT is one of into, fill, exact, shrink and enlarge. The default value is "into". See the epilogue at the bottom for a description of the FIT options. """, ) sizeargs.add_argument( "-a", "--auto-orient", action="store_true", help=""" If both dimensions of the page are given via --pagesize, conditionally swaps these dimensions such that the page orientation is the same as the orientation of the input image. If the orientation of a page gets flipped, then so do the values set via the --border option. """, ) sizeargs.add_argument( "-r", "--rotation", "--orientation", metavar="ROT", type=parse_rotationarg, default=Rotation.auto, help=""" Specifies how input images should be rotated. 
ROT can be one of auto, none, ifvalid, 0, 90, 180 and 270. The default value is auto and indicates that input images are rotated according to their EXIF Orientation tag. The values none and 0 ignore the EXIF Orientation values of the input images. The value ifvalid acts like auto but ignores invalid EXIF rotation values and only issues a warning instead of throwing an error. This is useful because many devices like Android phones, Canon cameras or scanners emit an invalid Orientation tag value of zero. The values 90, 180 and 270 perform a clockwise rotation of the image. """, ) sizeargs.add_argument( "--crop-border", metavar="L[:L]", type=parse_borderarg, help=""" Specifies the border between the CropBox and the MediaBox. One, or two length values can be given as an argument, separated by a colon. One value specifies the border on all four sides. Two values specify the border on the top/bottom and left/right, respectively. It is not possible to specify asymmetric borders. """, ) sizeargs.add_argument( "--bleed-border", metavar="L[:L]", type=parse_borderarg, help=""" Specifies the border between the BleedBox and the MediaBox. One, or two length values can be given as an argument, separated by a colon. One value specifies the border on all four sides. Two values specify the border on the top/bottom and left/right, respectively. It is not possible to specify asymmetric borders. """, ) sizeargs.add_argument( "--trim-border", metavar="L[:L]", type=parse_borderarg, help=""" Specifies the border between the TrimBox and the MediaBox. One, or two length values can be given as an argument, separated by a colon. One value specifies the border on all four sides. Two values specify the border on the top/bottom and left/right, respectively. It is not possible to specify asymmetric borders. """, ) sizeargs.add_argument( "--art-border", metavar="L[:L]", type=parse_borderarg, help=""" Specifies the border between the ArtBox and the MediaBox. 
One, or two length values can be given as an argument, separated by a colon. One value specifies the border on all four sides. Two values specify the border on the top/bottom and left/right, respectively. It is not possible to specify asymmetric borders. """, ) metaargs = parser.add_argument_group( title="Arguments setting metadata", description="Options handling embedded timestamps, title and author " "information.", ) metaargs.add_argument( "--title", metavar="title", type=str, help="Sets the title metadata value" ) metaargs.add_argument( "--author", metavar="author", type=str, help="Sets the author metadata value" ) metaargs.add_argument( "--creator", metavar="creator", type=str, help="Sets the creator metadata value" ) metaargs.add_argument( "--producer", metavar="producer", type=str, default="img2pdf " + __version__, help="Sets the producer metadata value " "(default is: img2pdf " + __version__ + ")", ) metaargs.add_argument( "--creationdate", metavar="creationdate", type=valid_date, help="Sets the UTC creation date metadata value in YYYY-MM-DD or " "YYYY-MM-DDTHH:MM or YYYY-MM-DDTHH:MM:SS format or any format " "understood by python dateutil module or any format understood " "by `date --date`", ) metaargs.add_argument( "--moddate", metavar="moddate", type=valid_date, help="Sets the UTC modification date metadata value in YYYY-MM-DD " "or YYYY-MM-DDTHH:MM or YYYY-MM-DDTHH:MM:SS format or any format " "understood by python dateutil module or any format understood " "by `date --date`", ) metaargs.add_argument( "--subject", metavar="subject", type=str, help="Sets the subject metadata value" ) metaargs.add_argument( "--keywords", metavar="kw", type=str, nargs="+", help="Sets the keywords metadata value (can be given multiple times)", ) viewerargs = parser.add_argument_group( title="PDF viewer arguments", description="PDF files can specify how they are meant to be " "presented to the user by a PDF viewer", ) viewerargs.add_argument( "--viewer-panes", 
metavar="PANES", type=parse_panes, help="Instruct the PDF viewer which side panes to show. Valid values " 'are "outlines" and "thumbs". It is not possible to specify both ' "at the same time.", ) viewerargs.add_argument( "--viewer-initial-page", metavar="NUM", type=int, help="Instead of showing the first page, instruct the PDF viewer to " "show the given page instead. Page numbers start with 1.", ) viewerargs.add_argument( "--viewer-magnification", metavar="MAG", type=parse_magnification, help="Instruct the PDF viewer to open the PDF with a certain zoom " "level. Valid values are either a floating point number giving " 'the exact zoom level, "fit" (zoom to fit whole page), "fith" ' '(zoom to fit page width) and "fitbh" (zoom to fit visible page ' "width).", ) viewerargs.add_argument( "--viewer-page-layout", metavar="LAYOUT", type=parse_layout, help="Instruct the PDF viewer how to arrange the pages on the screen. " 'Valid values are "single" (display single pages), "onecolumn" ' '(one continuous column), "twocolumnright" (two continuous ' 'columns with odd numbered pages on the right), "twocolumnleft" ' "(two continuous columns with odd numbered pages on the left), " '"twopageright" (two pages with odd numbered page on the right) ' 'and "twopageleft" (two pages with odd numbered page on the left)', ) viewerargs.add_argument( "--viewer-fit-window", action="store_true", help="Instruct the PDF viewer to resize the window to fit the page size", ) viewerargs.add_argument( "--viewer-center-window", action="store_true", help="Instruct the PDF viewer to center the PDF viewer window", ) viewerargs.add_argument( "--viewer-fullscreen", action="store_true", help="Instruct the PDF viewer to open the PDF in fullscreen mode", ) return parser def main(argv=sys.argv): # keep a reference to the parser because the error paths below use # parser.prog and parser.print_usage() parser = get_main_parser() args = parser.parse_args(argv[1:]) if args.verbose: logging.basicConfig(level=logging.DEBUG) if args.pillow_limit_break: Image.MAX_IMAGE_PIXELS = None if args.gui: gui() sys.exit(0) layout_fun = 
get_layout_fun( args.pagesize, args.imgsize, args.border, args.fit, args.auto_orient ) if len(args.images) > 0 and len(args.from_file) > 0: logger.error( "%s: error: cannot use --from-file with positional arguments" % parser.prog ) sys.exit(2) elif len(args.images) == 0 and len(args.from_file) == 0: # if no positional arguments were supplied, read a single image from # standard input print( "Reading image from standard input...\n" "Re-run with -h or --help for usage information.", file=sys.stderr, ) try: images = [sys.stdin.buffer.read()] except KeyboardInterrupt: sys.exit(0) elif len(args.images) > 0 and len(args.from_file) == 0: # On windows, each positional argument can expand into multiple paths # because we do globbing ourselves. Here we flatten the list of lists # again. images = list(chain.from_iterable(args.images)) elif len(args.images) == 0 and len(args.from_file) > 0: images = args.from_file # with the number of pages being equal to the number of images, the # value passed to --viewer-initial-page must be between 1 and that number if args.viewer_initial_page is not None: if args.viewer_initial_page < 1: parser.print_usage(file=sys.stderr) logger.error( "%s: error: argument --viewer-initial-page: must be " "greater than zero" % parser.prog ) sys.exit(2) if args.viewer_initial_page > len(images): parser.print_usage(file=sys.stderr) logger.error( "%s: error: argument --viewer-initial-page: must be " "less than or equal to the total number of pages" % parser.prog ) sys.exit(2) try: convert( *images, engine=args.engine, title=args.title, author=args.author, creator=args.creator, producer=args.producer, creationdate=args.creationdate, moddate=args.moddate, subject=args.subject, keywords=args.keywords, colorspace=args.colorspace, nodate=args.nodate, layout_fun=layout_fun, viewer_panes=args.viewer_panes, viewer_initial_page=args.viewer_initial_page, viewer_magnification=args.viewer_magnification, viewer_page_layout=args.viewer_page_layout, 
viewer_fit_window=args.viewer_fit_window, viewer_center_window=args.viewer_center_window, viewer_fullscreen=args.viewer_fullscreen, outputstream=args.output, first_frame_only=args.first_frame_only, cropborder=args.crop_border, bleedborder=args.bleed_border, trimborder=args.trim_border, artborder=args.art_border, pdfa=args.pdfa, rotation=args.rotation, include_thumbnails=args.include_thumbnails, ) except Exception as e: logger.error("error: " + str(e)) if logger.isEnabledFor(logging.DEBUG): import traceback traceback.print_exc(file=sys.stderr) sys.exit(1) if __name__ == "__main__": main() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1745766091.0 img2pdf-0.6.1/src/img2pdf_test.py0000755000175000017500000104111215003443313015633 0ustar00joschjosch#!/usr/bin/env python3 import sys import numpy import scipy.signal import zlib import struct import subprocess import pytest import re import pikepdf import hashlib import img2pdf import os from io import BytesIO from PIL import Image import decimal from packaging.version import parse as parse_version import warnings import json import pathlib import itertools import xml.etree.ElementTree as ET import platform img2pdfprog = os.getenv("img2pdfprog", default="src/img2pdf.py") ICC_PROFILE = None ICC_PROFILE_PATHS = ( # Debian "/usr/share/color/icc/ghostscript/srgb.icc", # Fedora "/usr/share/ghostscript/iccprofiles/srgb.icc", # Archlinux and Gentoo "/usr/share/ghostscript/*/iccprofiles/srgb.icc", ) for glob in ICC_PROFILE_PATHS: for path in pathlib.Path("/").glob(glob.lstrip("/")): if path.is_file(): ICC_PROFILE = path break HAVE_FAKETIME = True try: ver = subprocess.check_output(["faketime", "--version"]) if b"faketime: Version " not in ver: HAVE_FAKETIME = False except FileNotFoundError: HAVE_FAKETIME = False HAVE_MUTOOL = True try: ver = subprocess.check_output(["mutool", "-v"], stderr=subprocess.STDOUT) m = re.fullmatch(r"mutool version ([0-9.]+)\n", ver.decode("utf8")) if m is None: 
HAVE_MUTOOL = False else: if parse_version(m.group(1)) < parse_version("1.10.0"): HAVE_MUTOOL = False except FileNotFoundError: HAVE_MUTOOL = False if not HAVE_MUTOOL: warnings.warn("mutool >= 1.10.0 not available, skipping checks...") HAVE_PDFIMAGES_CMYK = True try: ver = subprocess.check_output(["pdfimages", "-v"], stderr=subprocess.STDOUT) m = re.fullmatch(r"pdfimages version ([0-9.]+)", ver.split(b"\n")[0].decode("utf8")) if m is None: HAVE_PDFIMAGES_CMYK = False else: if parse_version(m.group(1)) < parse_version("0.42.0"): HAVE_PDFIMAGES_CMYK = False except FileNotFoundError: HAVE_PDFIMAGES_CMYK = False if not HAVE_PDFIMAGES_CMYK: warnings.warn("pdfimages >= 0.42.0 not available, skipping CMYK checks...") for prog in ["convert", "compare", "identify"]: try: subprocess.check_call([prog] + ["-version"], stderr=subprocess.STDOUT) globals()[prog.upper()] = [prog] except subprocess.CalledProcessError: globals()[prog.upper()] = ["magick", prog] HAVE_IMAGEMAGICK_MODERN = True HAVE_EXACT_CMYK8 = True try: ver = subprocess.check_output(CONVERT + ["-version"], stderr=subprocess.STDOUT) m = re.fullmatch( r"Version: ImageMagick ([0-9.]+-[0-9]+) .*", ver.split(b"\n")[0].decode("utf8") ) if m is None: HAVE_IMAGEMAGICK_MODERN = False HAVE_EXACT_CMYK8 = False else: if parse_version(m.group(1)) < parse_version("6.9.10-12"): HAVE_IMAGEMAGICK_MODERN = False if parse_version(m.group(1)) < parse_version("7.1.0-48"): HAVE_EXACT_CMYK8 = False except FileNotFoundError: HAVE_IMAGEMAGICK_MODERN = False HAVE_EXACT_CMYK8 = False except subprocess.CalledProcessError: HAVE_IMAGEMAGICK_MODERN = False HAVE_EXACT_CMYK8 = False if not HAVE_IMAGEMAGICK_MODERN: warnings.warn("imagemagick >= 6.9.10-12 not available, skipping certain checks...") HAVE_JP2 = True try: ver = subprocess.check_output( IDENTIFY + ["-list", "format"], stderr=subprocess.STDOUT ) found = False for line in ver.split(b"\n"): if re.match(rb"\s+JP2\* JP2\s+rw-\s+JPEG-2000 File Format Syntax", line): found = True break if not 
found: HAVE_JP2 = False except FileNotFoundError: HAVE_JP2 = False except subprocess.CalledProcessError: HAVE_JP2 = False if not HAVE_JP2: warnings.warn("imagemagick has no jpeg 2000 support, skipping certain checks...") # the result of compare -metric PSNR is either just a floating point value or a # floating point value followed by the same value multiplied by 0.01, # surrounded by parentheses since ImageMagick 7.1.0-48: # https://github.com/ImageMagick/ImageMagick/commit/751829cd4c911d7a42953a47c1f73068d9e7da2f psnr_re = re.compile(rb"((?:inf|(?:0|[1-9][0-9]*)(?:\.[0-9]+)?))(?: \([0-9.]+\))?") ############################################################################### # HELPER FUNCTIONS # ############################################################################### # Interpret a datetime string in a given timezone and format it according to a # given format string in UTC. # We avoid using the Python datetime module for this job because doing so would # just replicate the code we want to test for correctness. 
def tz2utcstrftime(string, fmt, timezone): return ( subprocess.check_output( [ "date", "--utc", f'--date=TZ="{timezone}" {string}', f"+{fmt}", ] ) .decode("utf8") .removesuffix("\n") ) def find_closest_palette_color(color, palette): if color.ndim == 0: idx = (numpy.abs(palette - color)).argmin() else: # naive distance function by computing the euclidean distance in RGB space idx = ((palette - color) ** 2).sum(axis=-1).argmin() return palette[idx] def floyd_steinberg(img, palette): result = numpy.array(img, copy=True) for y in range(result.shape[0]): for x in range(result.shape[1]): oldpixel = result[y, x] newpixel = find_closest_palette_color(oldpixel, palette) quant_error = oldpixel - newpixel result[y, x] = newpixel if x + 1 < result.shape[1]: result[y, x + 1] += quant_error * 7 / 16 if y + 1 < result.shape[0]: result[y + 1, x - 1] += quant_error * 3 / 16 result[y + 1, x] += quant_error * 5 / 16 if x + 1 < result.shape[1] and y + 1 < result.shape[0]: result[y + 1, x + 1] += quant_error * 1 / 16 return result def convolve_rgba(img, kernel): return numpy.stack( ( scipy.signal.convolve2d(img[:, :, 0], kernel, "same"), scipy.signal.convolve2d(img[:, :, 1], kernel, "same"), scipy.signal.convolve2d(img[:, :, 2], kernel, "same"), scipy.signal.convolve2d(img[:, :, 3], kernel, "same"), ), axis=-1, ) def rgb2gray(img): result = numpy.zeros((60, 60), dtype=numpy.dtype("int64")) count = 0 for y in range(img.shape[0]): for x in range(img.shape[1]): clin = sum(img[y, x] * [0.2126, 0.7152, 0.0722]) / 0xFFFF if clin <= 0.0031308: csrgb = 12.92 * clin else: csrgb = 1.055 * clin ** (1 / 2.4) - 0.055 result[y, x] = csrgb * 0xFFFF count += 1 # if count == 24: # raise Exception(result[y, x]) return result def palettize(img, pal): result = numpy.zeros((img.shape[0], img.shape[1]), dtype=numpy.dtype("int64")) for y in range(img.shape[0]): for x in range(img.shape[1]): for i, col in enumerate(pal): if numpy.array_equal(img[y, x], col): result[y, x] = i break else: raise Exception() 
return result # we cannot use zlib.compress() because different compressors may compress the # same data differently, for example by using different optimizations on # different architectures: # https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/thread/R7GD4L5Z6HELCDAL2RDESWR2F3ZXHWVX/ # # to make the compressed representation of the uncompressed data bit-by-bit # identical on all platforms we make use of the compression method 0, that is, # no compression at all :) def compress(data): # two-byte zlib header (rfc1950) # common header for lowest compression level # bits 0-3: Compression info, base-2 logarithm of the LZ77 window size, # minus eight -- 7 indicates a 32K window size # bits 4-7: Compression method -- 8 is deflate # bits 8-9: Compression level -- 0 is fastest # bit 10: preset dictionary -- 0 is none # bits 11-15: check bits so that the 16-bit unsigned integer stored in MSB # order is a multiple of 31 result = b"\x78\x01" # content is stored in deflate format (rfc1951) # maximum chunk size is the largest 16 bit unsigned integer chunksize = 0xFFFF for i in range(0, len(data), chunksize): # bits 0-4 are unused # bits 5-6 indicate compression method -- 0 is no compression # bit 7 indicates the last chunk # i is already a byte offset, so another full chunk follows only if # i + chunksize is still smaller than the data length if i < len(data) - chunksize: result += b"\x00" else: # last chunk result += b"\x01" chunk = data[i : i + chunksize] # the chunk length as little endian 16 bit unsigned integer result += struct.pack("<H", len(chunk)) # the one's complement of the chunk length as little endian 16 bit # unsigned integer result += struct.pack("<H", len(chunk) ^ 0xFFFF) result += chunk # adler32 checksum of the uncompressed data as big endian 32 bit unsigned # integer result += struct.pack(">I", zlib.adler32(data)) return result def write_png(data, path, bitdepth, colortype, palette=None, iccp=None): with open(str(path), "wb") as f: f.write(b"\x89PNG\r\n\x1A\n") # PNG image type Colour type Allowed bit depths # Greyscale 0 1, 2, 4, 8, 16 # Truecolour 2 8, 16 # Indexed-colour 3 1, 2, 4, 8 # Greyscale with alpha 4 8, 16 # Truecolour with alpha 6 8, 16 block = b"IHDR" + struct.pack( ">IIBBBBB", data.shape[1], # width data.shape[0], # height bitdepth, # bitdepth colortype, # colortype 0, # compression 0, # 
filtertype
            0,  # interlaced
        )
        f.write(
            struct.pack(">I", len(block) - 4)
            + block
            + struct.pack(">I", zlib.crc32(block))
        )
        if iccp is not None:
            with open(iccp, "rb") as infh:
                iccdata = infh.read()
            block = b"iCCP"
            block += b"icc\0"  # arbitrary profile name
            block += b"\0"  # compression method (deflate)
            block += zlib.compress(iccdata)
            f.write(
                struct.pack(">I", len(block) - 4)
                + block
                + struct.pack(">I", zlib.crc32(block))
            )
        if palette is not None:
            block = b"PLTE"
            for col in palette:
                block += struct.pack(">BBB", col[0], col[1], col[2])
            f.write(
                struct.pack(">I", len(block) - 4)
                + block
                + struct.pack(">I", zlib.crc32(block))
            )
        raw = b""
        for y in range(data.shape[0]):
            raw += b"\0"
            if bitdepth == 16:
                raw += data[y].astype(">u2").tobytes()
            elif bitdepth == 8:
                raw += data[y].astype(">u1").tobytes()
            elif bitdepth in [4, 2, 1]:
                valsperbyte = 8 // bitdepth
                for x in range(0, data.shape[1], valsperbyte):
                    val = 0
                    for j in range(valsperbyte):
                        if x + j >= data.shape[1]:
                            break
                        val |= (data[y, x + j].astype(">u2") & (2**bitdepth - 1)) << (
                            (valsperbyte - j - 1) * bitdepth
                        )
                    raw += struct.pack(">B", val)
            else:
                raise Exception()
        compressed = compress(raw)
        block = b"IDAT" + compressed
        f.write(
            struct.pack(">I", len(compressed))
            + block
            + struct.pack(">I", zlib.crc32(block))
        )
        block = b"IEND"
        f.write(struct.pack(">I", 0) + block + struct.pack(">I", zlib.crc32(block)))


def compare(im1, im2, exact, icc, cmyk):
    if exact:
        if cmyk and not HAVE_EXACT_CMYK8:
            raise Exception("cmyk cannot be exact before ImageMagick 7.1.0-48")
        elif icc:
            raise Exception("icc cannot be exact")
        else:
            subprocess.check_call(
                COMPARE
                + [
                    "-metric",
                    "AE",
                    "-alpha",
                    "off",
                    im1,
                    im2,
                    "null:",
                ]
            )
    else:
        iccargs = []
        if icc:
            if ICC_PROFILE is None:
                pytest.skip("Could not locate an ICC profile")
            iccargs = ["-profile", ICC_PROFILE]
        psnr = subprocess.run(
            COMPARE
            + iccargs
            + [
                "-metric",
                "PSNR",
                im1,
                im2,
                "null:",
            ],
            check=False,
            stderr=subprocess.PIPE,
        ).stderr
        assert psnr != b"0"
        assert psnr != b"0 (0)"
        assert
psnr_re.fullmatch(psnr) is not None, psnr
        psnr = psnr_re.fullmatch(psnr).group(1)
        psnr = float(psnr)
        assert psnr != 0  # or otherwise we would use the exact variant
        assert psnr > 50


def compare_ghostscript(tmpdir, img, pdf, gsdevice="png16m", exact=True, icc=False):
    if gsdevice in ["png16m", "pnggray"]:
        ext = "png"
    elif gsdevice in ["tiff24nc", "tiff32nc", "tiff48nc"]:
        ext = "tiff"
    else:
        raise Exception("unknown gsdevice: " + gsdevice)
    subprocess.check_call(
        [
            "gs",
            "-dQUIET",
            "-dNOPAUSE",
            "-dBATCH",
            "-sDEVICE=" + gsdevice,
            "-r96",
            "-sOutputFile=" + str(tmpdir / "gs-") + "%00d." + ext,
            str(pdf),
        ]
    )
    compare(str(img), str(tmpdir / "gs-1.") + ext, exact, icc, False)
    (tmpdir / ("gs-1." + ext)).unlink()


def compare_poppler(tmpdir, img, pdf, exact=True, icc=False):
    subprocess.check_call(
        ["pdftocairo", "-r", "96", "-png", str(pdf), str(tmpdir / "poppler")]
    )
    compare(str(img), str(tmpdir / "poppler-1.png"), exact, icc, False)
    (tmpdir / "poppler-1.png").unlink()


def compare_mupdf(tmpdir, img, pdf, exact=True, cmyk=False):
    if not HAVE_MUTOOL:
        return
    if cmyk:
        out = tmpdir / "mupdf.pam"
        subprocess.check_call(
            ["mutool", "draw", "-r", "96", "-c", "cmyk", "-o", str(out), str(pdf)]
        )
    else:
        out = tmpdir / "mupdf.png"
        subprocess.check_call(
            ["mutool", "draw", "-r", "96", "-png", "-o", str(out), str(pdf)]
        )
    compare(str(img), str(out), exact, False, cmyk)
    out.unlink()


def compare_pdfimages_jpg(tmpdir, img, pdf):
    subprocess.check_call(["pdfimages", "-j", str(pdf), str(tmpdir / "images")])
    assert img.read_bytes() == (tmpdir / "images-000.jpg").read_bytes()
    (tmpdir / "images-000.jpg").unlink()


def compare_pdfimages_cmyk(tmpdir, img, pdf):
    if not HAVE_PDFIMAGES_CMYK:
        return
    subprocess.check_call(["pdfimages", "-j", str(pdf), str(tmpdir / "images")])
    assert img.read_bytes() == (tmpdir / "images-000.jpg").read_bytes()
    (tmpdir / "images-000.jpg").unlink()


def compare_pdfimages_jp2(tmpdir, img, pdf):
    subprocess.check_call(["pdfimages", "-jp2", str(pdf), str(tmpdir / "images")])
    assert
img.read_bytes() == (tmpdir / "images-000.jp2").read_bytes()
    (tmpdir / "images-000.jp2").unlink()


def compare_pdfimages_tiff(tmpdir, img, pdf):
    subprocess.check_call(["pdfimages", "-tiff", str(pdf), str(tmpdir / "images")])
    subprocess.check_call(
        COMPARE
        + [
            "-metric",
            "AE",
            str(img),
            str(tmpdir / "images-000.tif"),
            "null:",
        ]
    )
    (tmpdir / "images-000.tif").unlink()


def compare_pdfimages_png(tmpdir, img, pdf, exact=True, icc=False):
    subprocess.check_call(["pdfimages", "-png", str(pdf), str(tmpdir / "images")])
    # images-001.png is the grayscale SMask image (the original alpha channel)
    if os.path.isfile(tmpdir / "images-001.png"):
        subprocess.check_call(
            CONVERT
            + [
                str(tmpdir / "images-000.png"),
                str(tmpdir / "images-001.png"),
                "-compose",
                "copy-opacity",
                "-composite",
                str(tmpdir / "composite.png"),
            ]
        )
        (tmpdir / "images-000.png").unlink()
        (tmpdir / "images-001.png").unlink()
        os.rename(tmpdir / "composite.png", tmpdir / "images-000.png")

    if exact:
        if icc:
            raise Exception("not exact with icc")
        subprocess.check_call(
            COMPARE
            + [
                "-metric",
                "AE",
                str(img),
                str(tmpdir / "images-000.png"),
                "null:",
            ]
        )
    else:
        if icc:
            if ICC_PROFILE is None:
                pytest.skip("Could not locate an ICC profile")
            psnr = subprocess.run(
                COMPARE
                + [
                    "-metric",
                    "PSNR",
                    "(",
                    "-profile",
                    ICC_PROFILE,
                    "-depth",
                    "8",
                    str(img),
                    ")",
                    str(tmpdir / "images-000.png"),
                    "null:",
                ],
                check=False,
                stderr=subprocess.PIPE,
            ).stderr
        else:
            psnr = subprocess.run(
                COMPARE
                + [
                    "-metric",
                    "PSNR",
                    str(img),
                    str(tmpdir / "images-000.png"),
                    "null:",
                ],
                check=False,
                stderr=subprocess.PIPE,
            ).stderr
        assert psnr != b"0"
        assert psnr != b"0 (0)"
        psnr = psnr_re.fullmatch(psnr).group(1)
        psnr = float(psnr)
        assert psnr != 0  # or otherwise we would use the exact variant
        assert psnr > 50
    (tmpdir / "images-000.png").unlink()


def tiff_header_for_ccitt(width, height, img_size, ccitt_group=4):
    # Quick and dirty TIFF header builder from
    # https://stackoverflow.com/questions/2641770
    tiff_header_struct = "<" + "2s" + "h" + "l" + "h" + "hhll" * 8
+ "h"
    return struct.pack(
        # fmt: off
        tiff_header_struct,
        b'II',  # Byte order indication: little endian
        42,  # Version number (always 42)
        8,  # Offset to first IFD
        8,  # Number of tags in IFD
        256, 4, 1, width,  # ImageWidth, LONG, 1, width
        257, 4, 1, height,  # ImageLength, LONG, 1, length
        258, 3, 1, 1,  # BitsPerSample, SHORT, 1, 1
        259, 3, 1, ccitt_group,  # Compression, SHORT, 1, 4 = CCITT Group 4
        262, 3, 1, 1,  # PhotometricInterpretation, SHORT, 1, 1 = BlackIsZero
        273, 4, 1, struct.calcsize(
            tiff_header_struct),  # StripOffsets, LONG, 1, len of header
        278, 4, 1, height,  # RowsPerStrip, LONG, 1, length
        279, 4, 1, img_size,  # StripByteCounts, LONG, 1, size of image
        0  # last IFD
        # fmt: on
    )


pixel_R = [
    [1, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
]

pixel_G = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]

pixel_B = [
    [1, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
]


def alpha_value():
    # gaussian kernel with sigma=3
    kernel = numpy.array(
        [
            [0.011362, 0.014962, 0.017649, 0.018648, 0.017649, 0.014962, 0.011362],
            [0.014962, 0.019703, 0.02324, 0.024556, 0.02324, 0.019703, 0.014962],
            [0.017649, 0.02324, 0.027413, 0.028964, 0.027413, 0.02324, 0.017649],
            [0.018648, 0.024556, 0.028964, 0.030603, 0.028964, 0.024556, 0.018648],
            [0.017649, 0.02324, 0.027413, 0.028964, 0.027413, 0.02324, 0.017649],
            [0.014962, 0.019703, 0.02324, 0.024556, 0.02324, 0.019703, 0.014962],
            [0.011362, 0.014962, 0.017649, 0.018648, 0.017649, 0.014962, 0.011362],
        ],
        float,
    )

    # constructs a 2D array of a circle with a width of 36
    circle = list()
    offsets_36 = [14, 11, 9, 7, 6, 5, 4, 3, 3, 2, 2, 1, 1, 1, 0, 0, 0, 0]
    for offs in offsets_36 + offsets_36[::-1]:
        circle.append([0] * offs + [1] * (len(offsets_36) - offs) * 2 + [0] * offs)

    alpha = numpy.zeros((60, 60, 4), dtype=numpy.dtype("int64"))

    # draw three circles
    for xpos, ypos, color in [
        (12, 3, [0xFFFF, 0, 0, 0xFFFF]),
        (21, 21, [0, 0xFFFF, 0, 0xFFFF]),
        (3, 21, [0, 0, 0xFFFF, 0xFFFF]),
    ]:
        for x, row in enumerate(circle):
            for y, pos in enumerate(row):
                if pos:
                    alpha[y + ypos, x + xpos] += color
    alpha = numpy.clip(alpha, 0, 0xFFFF)
    alpha = convolve_rgba(alpha, kernel)

    # draw letters
    for y, row in enumerate(pixel_R):
        for x, pos in enumerate(row):
            if pos:
                alpha[13 + y, 28 + x] = [0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF]
    for y, row in enumerate(pixel_G):
        for x, pos in enumerate(row):
            if pos:
                alpha[39 + y, 40 + x] = [0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF]
    for y, row in enumerate(pixel_B):
        for x, pos in enumerate(row):
            if pos:
                alpha[39 + y, 15 + x] = [0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF]
    return alpha


def icc_profile():
    PCS = (0.96420288, 1.0, 0.82490540)  # D50 illuminant constants
    # approximate X,Y,Z values for white, red, green and blue
    white = (0.95, 1.0, 1.09)
    red = (0.44, 0.22, 0.014)
    green = (0.39, 0.72, 0.1)
    blue = (0.14, 0.06, 0.71)

    getxyz = lambda v: (round(65536 * v[0]), round(65536 * v[1]), round(65536 * v[2]))

    header = (
        # header
        +4 * b"\0"  # cmmsignatures
        + 4 * b"\0"  # version
        + b"mntr"  # device class
        + b"RGB "  # color space
        + b"XYZ "  # PCS
        + 12 * b"\0"  # datetime
        + b"\x61\x63\x73\x70"  # static signature
        + 4 * b"\0"  # platform
        + 4 * b"\0"  # flags
        + 4 * b"\0"  # device manufacturer
        + 4 * b"\0"  # device model
        + 8 * b"\0"  # device attributes
        + 4 * b"\0"  # rendering intents
        + struct.pack(">III", *getxyz(PCS))
        + 4 * b"\0"  # creator
        + 16 * b"\0"  # identifier
        + 28 * b"\0"  # reserved
    )

    def pad4(s):
        if len(s) % 4 == 0:
            return s
        else:
            return s + b"\x00" * (4 - len(s) % 4)

    tagdata = [
        b"desc\x00\x00\x00\x00" + struct.pack(">I", 5) + b"fake" + 79 * b"\x00",
        b"XYZ \x00\x00\x00\x00" + struct.pack(">III", *getxyz(white)),
        # by mixing up red, green and blue, we create a test profile
        b"XYZ \x00\x00\x00\x00" + struct.pack(">III", *getxyz(blue)),  # red
        b"XYZ \x00\x00\x00\x00" + struct.pack(">III", *getxyz(red)),  # green
        b"XYZ \x00\x00\x00\x00" + struct.pack(">III", *getxyz(green)),  # blue
        # by only supplying two
values, we create the most trivial "curve",
        # where the remaining values will be linearly interpolated between them
        b"curv\x00\x00\x00\x00" + struct.pack(">IHH", 2, 0, 65535),
        b"text\x00\x00\x00\x00" + b"no copyright, use freely" + 1 * b"\x00",
    ]

    table = [
        (b"desc", 0),
        (b"wtpt", 1),
        (b"rXYZ", 2),
        (b"gXYZ", 3),
        (b"bXYZ", 4),
        # we use the same curve for all three channels, so the same offset is referenced
        (b"rTRC", 5),
        (b"gTRC", 5),
        (b"bTRC", 5),
        (b"cprt", 6),
    ]

    offset = (
        lambda n: 4  # total size
        + len(header)  # header length
        + 4  # number table entries
        + len(table) * 12  # table length
        + sum([len(pad4(s)) for s in tagdata[:n]])
    )

    table = struct.pack(">I", len(table)) + b"".join(
        [t + struct.pack(">II", offset(o), len(tagdata[o])) for t, o in table]
    )

    data = b"".join([pad4(s) for s in tagdata])

    data = (
        struct.pack(">I", 4 + len(header) + len(table) + len(data))
        + header
        + table
        + data
    )

    return data


###############################################################################
#                               INPUT FIXTURES                                #
###############################################################################


@pytest.fixture(scope="session")
def alpha():
    return alpha_value()


@pytest.fixture(scope="session")
def tmp_alpha_png(tmp_path_factory, alpha):
    tmp_alpha_png = tmp_path_factory.mktemp("alpha_png") / "alpha.png"
    write_png(alpha, str(tmp_alpha_png), 16, 6)
    assert (
        hashlib.md5(tmp_alpha_png.read_bytes()).hexdigest()
        == "600bb4cffb039a022cec6ed55537deba"
    )
    return tmp_alpha_png


@pytest.fixture(scope="session")
def tmp_gray1_png(tmp_path_factory, alpha):
    normal16 = alpha[:, :, 0:3]
    gray16 = rgb2gray(normal16)
    tmp_gray1_png = tmp_path_factory.mktemp("gray1_png") / "gray1.png"
    write_png(
        floyd_steinberg(gray16, numpy.arange(2) / 0x1 * 0xFFFF) / 0xFFFF * 0x1,
        str(tmp_gray1_png),
        1,
        0,
    )
    assert (
        hashlib.md5(tmp_gray1_png.read_bytes()).hexdigest()
        == "dd2c528152d34324747355b73495a115"
    )
    return tmp_gray1_png


@pytest.fixture(scope="session")
def tmp_gray2_png(tmp_path_factory, alpha):
    normal16 = alpha[:, :,
0:3] gray16 = rgb2gray(normal16) tmp_gray2_png = tmp_path_factory.mktemp("gray2_png") / "gray2.png" write_png( floyd_steinberg(gray16, numpy.arange(4) / 0x3 * 0xFFFF) / 0xFFFF * 0x3, str(tmp_gray2_png), 2, 0, ) assert ( hashlib.md5(tmp_gray2_png.read_bytes()).hexdigest() == "68e614f4e6a85053d47098dad0ca3976" ) return tmp_gray2_png @pytest.fixture(scope="session") def tmp_gray4_png(tmp_path_factory, alpha): normal16 = alpha[:, :, 0:3] gray16 = rgb2gray(normal16) tmp_gray4_png = tmp_path_factory.mktemp("gray4_png") / "gray4.png" write_png( floyd_steinberg(gray16, numpy.arange(16) / 0xF * 0xFFFF) / 0xFFFF * 0xF, str(tmp_gray4_png), 4, 0, ) assert ( hashlib.md5(tmp_gray4_png.read_bytes()).hexdigest() == "ff04a6fea88133eb77bbb748692ae0fd" ) return tmp_gray4_png @pytest.fixture(scope="session") def tmp_gray8_png(tmp_path_factory, alpha): normal16 = alpha[:, :, 0:3] gray16 = rgb2gray(normal16) tmp_gray8_png = tmp_path_factory.mktemp("gray8_png") / "gray8.png" write_png(gray16 / 0xFFFF * 0xFF, tmp_gray8_png, 8, 0) assert ( hashlib.md5(tmp_gray8_png.read_bytes()).hexdigest() == "90b4ed9123f295dda7fde499744dede7" ) return tmp_gray8_png @pytest.fixture(scope="session") def tmp_gray16_png(tmp_path_factory, alpha): normal16 = alpha[:, :, 0:3] gray16 = rgb2gray(normal16) tmp_gray16_png = tmp_path_factory.mktemp("gray16_png") / "gray16.png" write_png(gray16, str(tmp_gray16_png), 16, 0) assert ( hashlib.md5(tmp_gray16_png.read_bytes()).hexdigest() == "f76153d2e72fada11d934c32c8168a57" ) return tmp_gray16_png @pytest.fixture(scope="session") def tmp_inverse_png(tmp_path_factory, alpha): normal16 = alpha[:, :, 0:3] tmp_inverse_png = tmp_path_factory.mktemp("inverse_png") / "inverse.png" write_png(0xFF - normal16 / 0xFFFF * 0xFF, str(tmp_inverse_png), 8, 2) assert ( hashlib.md5(tmp_inverse_png.read_bytes()).hexdigest() == "0a7d57dc09c4d8fd1ad3511b116c7dfa" ) return tmp_inverse_png @pytest.fixture(scope="session") def tmp_icc_profile(tmp_path_factory): tmp_icc_profile = 
tmp_path_factory.mktemp("icc_profile") / "fake.icc" tmp_icc_profile.write_bytes(icc_profile()) return tmp_icc_profile @pytest.fixture(scope="session") def tmp_icc_png(tmp_path_factory, alpha, tmp_icc_profile): normal16 = alpha[:, :, 0:3] tmp_icc_png = tmp_path_factory.mktemp("icc_png") / "icc.png" write_png( normal16 / 0xFFFF * 0xFF, str(tmp_icc_png), 8, 2, iccp=str(tmp_icc_profile), ) assert ( hashlib.md5(tmp_icc_png.read_bytes()).hexdigest() == "bf25f673c1617f5f9353b2a043747655" ) return tmp_icc_png @pytest.fixture(scope="session") def tmp_normal16_png(tmp_path_factory, alpha): normal16 = alpha[:, :, 0:3] tmp_normal16_png = tmp_path_factory.mktemp("normal16_png") / "normal16.png" write_png(normal16, str(tmp_normal16_png), 16, 2) assert ( hashlib.md5(tmp_normal16_png.read_bytes()).hexdigest() == "820dd30a2566775fc64c110e8ac65c7e" ) return tmp_normal16_png @pytest.fixture(scope="session") def tmp_normal_png(tmp_path_factory, alpha): normal16 = alpha[:, :, 0:3] tmp_normal_png = tmp_path_factory.mktemp("normal_png") / "normal.png" write_png(normal16 / 0xFFFF * 0xFF, str(tmp_normal_png), 8, 2) assert ( hashlib.md5(tmp_normal_png.read_bytes()).hexdigest() == "bc30c705f455991cd04be1c298063002" ) return tmp_normal_png @pytest.fixture(scope="session") def tmp_palette1_png(tmp_path_factory, alpha): normal16 = alpha[:, :, 0:3] tmp_palette1_png = tmp_path_factory.mktemp("palette1_png") / "palette1.png" # don't choose black and white or otherwise imagemagick will classify the # image as bilevel with 8/1-bit depth instead of palette with 8-bit color # don't choose gray colors or otherwise imagemagick will classify the # image as grayscale pal1 = numpy.array( [[0x01, 0x02, 0x03], [0xFE, 0xFD, 0xFC]], dtype=numpy.dtype("int64") ) write_png( palettize( floyd_steinberg(normal16, pal1 * 0xFFFF / 0xFF) / 0xFFFF * 0xFF, pal1 ), str(tmp_palette1_png), 1, 3, pal1, ) assert ( hashlib.md5(tmp_palette1_png.read_bytes()).hexdigest() == "3d065f731540e928fb730b3233e4e8a7" ) return 
tmp_palette1_png @pytest.fixture(scope="session") def tmp_palette2_png(tmp_path_factory, alpha): normal16 = alpha[:, :, 0:3] tmp_palette2_png = tmp_path_factory.mktemp("palette2_png") / "palette2.png" # choose values slightly off red, lime and blue because otherwise # imagemagick will classify the image as Depth: 8/1-bit pal2 = numpy.array( [[0, 0, 0], [0xFE, 0, 0], [0, 0xFE, 0], [0, 0, 0xFE]], dtype=numpy.dtype("int64"), ) write_png( palettize( floyd_steinberg(normal16, pal2 * 0xFFFF / 0xFF) / 0xFFFF * 0xFF, pal2 ), str(tmp_palette2_png), 2, 3, pal2, ) assert ( hashlib.md5(tmp_palette2_png.read_bytes()).hexdigest() == "0b0d4412c28da26163a622d218ee02ca" ) return tmp_palette2_png @pytest.fixture(scope="session") def tmp_palette4_png(tmp_path_factory, alpha): normal16 = alpha[:, :, 0:3] tmp_palette4_png = tmp_path_factory.mktemp("palette4_png") / "palette4.png" # windows 16 color palette pal4 = numpy.array( [ [0x00, 0x00, 0x00], [0x80, 0x00, 0x00], [0x00, 0x80, 0x00], [0x80, 0x80, 0x00], [0x00, 0x00, 0x80], [0x80, 0x00, 0x80], [0x00, 0x80, 0x80], [0xC0, 0xC0, 0xC0], [0x80, 0x80, 0x80], [0xFF, 0x00, 0x00], [0x00, 0xFF, 0x00], [0xFF, 0x00, 0x00], [0x00, 0xFF, 0x00], [0xFF, 0x00, 0xFF], [0x00, 0xFF, 0x00], [0xFF, 0xFF, 0xFF], ], dtype=numpy.dtype("int64"), ) write_png( palettize( floyd_steinberg(normal16, pal4 * 0xFFFF / 0xFF) / 0xFFFF * 0xFF, pal4 ), str(tmp_palette4_png), 4, 3, pal4, ) assert ( hashlib.md5(tmp_palette4_png.read_bytes()).hexdigest() == "163f6d7964b80eefa0dc6a48cb7315dd" ) return tmp_palette4_png @pytest.fixture(scope="session") def tmp_palette8_png(tmp_path_factory, alpha): normal16 = alpha[:, :, 0:3] tmp_palette8_png = tmp_path_factory.mktemp("palette8_png") / "palette8.png" # create a 256 color palette by first writing 16 shades of gray # and then writing an array of RGB colors with 6, 8 and 5 levels # for red, green and blue, respectively pal8 = numpy.zeros((256, 3), dtype=numpy.dtype("int64")) i = 0 for gray in range(15, 255, 15): pal8[i] = [gray, 
gray, gray] i += 1 for red in 0, 0x33, 0x66, 0x99, 0xCC, 0xFF: for green in 0, 0x24, 0x49, 0x6D, 0x92, 0xB6, 0xDB, 0xFF: for blue in 0, 0x40, 0x80, 0xBF, 0xFF: pal8[i] = [red, green, blue] i += 1 assert i == 256 write_png( palettize( floyd_steinberg(normal16, pal8 * 0xFFFF / 0xFF) / 0xFFFF * 0xFF, pal8 ), str(tmp_palette8_png), 8, 3, pal8, ) assert ( hashlib.md5(tmp_palette8_png.read_bytes()).hexdigest() == "8847bb734eba0e2d85e3f97fc2849dd4" ) return tmp_palette8_png @pytest.fixture(scope="session") def jpg_img(tmp_path_factory, tmp_normal_png): in_img = tmp_path_factory.mktemp("jpg") / "in.jpg" subprocess.check_call(CONVERT + [str(tmp_normal_png), str(in_img)]) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "JPEG", str(identify) assert identify[0]["image"].get("mimeType") == "image/jpeg", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert "resolution" not in identify[0]["image"] assert identify[0]["image"].get("units") == "Undefined", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify) endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness" assert identify[0]["image"].get(endian) == "Undefined", str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "JPEG", str(identify) assert identify[0]["image"].get("orientation") == "Undefined", 
str(identify) assert ( identify[0]["image"].get("properties", {}).get("jpeg:colorspace") == "2" ), str(identify) return in_img @pytest.fixture(scope="session") def jpg_rot_img(tmp_path_factory, tmp_normal_png): in_img = tmp_path_factory.mktemp("jpg_rot") / "in.jpg" subprocess.check_call(CONVERT + [str(tmp_normal_png), str(in_img)]) subprocess.check_call( ["exiftool", "-overwrite_original", "-all=", str(in_img), "-n"] ) subprocess.check_call( [ "exiftool", "-overwrite_original", "-Orientation=6", "-XResolution=96", "-YResolution=96", "-ResolutionUnit=2", "-n", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "JPEG", str(identify) assert identify[0]["image"].get("mimeType") == "image/jpeg", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("resolution") == {"x": 96, "y": 96} assert identify[0]["image"].get("units") == "PixelsPerInch", str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "JPEG", str(identify) assert identify[0]["image"].get("orientation") == "RightTop", str(identify) return in_img @pytest.fixture(scope="session") def jpg_cmyk_img(tmp_path_factory, tmp_normal_png): in_img = tmp_path_factory.mktemp("jpg_cmyk") / "in.jpg" subprocess.check_call( CONVERT + 
[str(tmp_normal_png), "-colorspace", "cmyk", str(in_img)] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "JPEG", str(identify) assert identify[0]["image"].get("mimeType") == "image/jpeg", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "CMYK", str(identify) assert identify[0]["image"].get("type") == "ColorSeparation", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "JPEG", str(identify) return in_img @pytest.fixture(scope="session") def jpg_2000_img(tmp_path_factory, tmp_normal_png): in_img = tmp_path_factory.mktemp("jpg_2000") / "in.jp2" subprocess.check_call(CONVERT + [str(tmp_normal_png), str(in_img)]) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "JP2", str(identify) assert identify[0]["image"].get("mimeType") == "image/jp2", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == 
"TrueColor", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "JPEG2000", str(identify) return in_img @pytest.fixture(scope="session") def jpg_2000_rgba8_img(tmp_path_factory, tmp_alpha_png): in_img = tmp_path_factory.mktemp("jpg_2000_rgba8") / "in.jp2" subprocess.check_call(CONVERT + [str(tmp_alpha_png), "-depth", "8", str(in_img)]) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "JP2", str(identify) assert identify[0]["image"].get("mimeType") == "image/jp2", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "JPEG2000", str(identify) return in_img @pytest.fixture(scope="session") def jpg_2000_rgba16_img(tmp_path_factory, tmp_alpha_png): in_img = tmp_path_factory.mktemp("jpg_2000_rgba16") / "in.jp2" subprocess.check_call(CONVERT + [str(tmp_alpha_png), str(in_img)]) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version 
containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "JP2", str(identify) assert identify[0]["image"].get("mimeType") == "image/jp2", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify) assert identify[0]["image"].get("depth") == 16, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "JPEG2000", str(identify) return in_img @pytest.fixture(scope="session") def png_rgb8_img(tmp_normal_png): in_img = tmp_normal_png identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "PNG", str(identify) assert identify[0]["image"].get("mimeType") == "image/png", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "8" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "8" ), str(identify) assert ( 
identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig") == "2" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color_type") == "2 (Truecolor)" ), str(identify) assert ( identify[0]["image"]["properties"]["png:IHDR.interlace_method"] == "0 (Not interlaced)" ), str(identify) return in_img @pytest.fixture(scope="session") def png_rgb16_img(tmp_normal16_png): in_img = tmp_normal16_png identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "PNG", str(identify) assert identify[0]["image"].get("mimeType") == "image/png", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("depth") == 16, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "16" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "16" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig") == "2" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color_type") == "2 (Truecolor)" ), str(identify) assert ( identify[0]["image"]["properties"]["png:IHDR.interlace_method"] == "0 (Not interlaced)" ), str(identify) return in_img @pytest.fixture(scope="session") def png_rgba8_img(tmp_path_factory, 
tmp_alpha_png): in_img = tmp_path_factory.mktemp("png_rgba8") / "in.png" subprocess.check_call( CONVERT + [str(tmp_alpha_png), "-depth", "8", "-strip", str(in_img)] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "PNG", str(identify) assert identify[0]["image"].get("mimeType") == "image/png", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "8" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "8" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig") == "6" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color_type") == "6 (RGBA)" ), str(identify) assert ( identify[0]["image"]["properties"]["png:IHDR.interlace_method"] == "0 (Not interlaced)" ), str(identify) return in_img @pytest.fixture(scope="session") def png_rgba16_img(tmp_alpha_png): in_img = tmp_alpha_png identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare 
dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "PNG", str(identify) assert identify[0]["image"].get("mimeType") == "image/png", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify) assert identify[0]["image"].get("depth") == 16, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "16" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "16" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig") == "6" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color_type") == "6 (RGBA)" ), str(identify) assert ( identify[0]["image"]["properties"]["png:IHDR.interlace_method"] == "0 (Not interlaced)" ), str(identify) return in_img @pytest.fixture(scope="session") def png_gray8a_img(tmp_path_factory, tmp_alpha_png): in_img = tmp_path_factory.mktemp("png_gray8a") / "in.png" subprocess.check_call( CONVERT + [ str(tmp_alpha_png), "-colorspace", "Gray", "-dither", "FloydSteinberg", "-colors", "256", "-depth", "8", "-strip", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "PNG", str(identify) assert identify[0]["image"].get("mimeType") == 
"image/png", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("type") == "GrayscaleAlpha", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "8" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "8" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig") == "4" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color_type") == "4 (GrayAlpha)" ), str(identify) assert ( identify[0]["image"]["properties"]["png:IHDR.interlace_method"] == "0 (Not interlaced)" ), str(identify) return in_img @pytest.fixture(scope="session") def png_gray16a_img(tmp_path_factory, tmp_alpha_png): in_img = tmp_path_factory.mktemp("png_gray16a") / "in.png" subprocess.check_call( CONVERT + [ str(tmp_alpha_png), "-colorspace", "Gray", "-depth", "16", "-strip", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "PNG", str(identify) assert identify[0]["image"].get("mimeType") == "image/png", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("type") == 
"GrayscaleAlpha", str(identify) assert identify[0]["image"].get("depth") == 16, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "16" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "16" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig") == "4" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color_type") == "4 (GrayAlpha)" ), str(identify) assert ( identify[0]["image"]["properties"]["png:IHDR.interlace_method"] == "0 (Not interlaced)" ), str(identify) return in_img @pytest.fixture(scope="session") def png_interlaced_img(tmp_path_factory, tmp_normal_png): in_img = tmp_path_factory.mktemp("png_interlaced") / "in.png" subprocess.check_call( CONVERT + [ str(tmp_normal_png), "-interlace", "PNG", "-strip", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "PNG", str(identify) assert identify[0]["image"].get("mimeType") == "image/png", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", 
{}).get("png:IHDR.bit-depth-orig") == "8" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "8" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig") == "2" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color_type") == "2 (Truecolor)" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.interlace_method") == "1 (Adam7 method)" ), str(identify) return in_img @pytest.fixture(scope="session") def png_gray1_img(tmp_path_factory, tmp_gray1_png): identify = json.loads( subprocess.check_output(CONVERT + [str(tmp_gray1_png), "json:"]) ) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "PNG", str(identify) assert identify[0]["image"].get("mimeType") == "image/png", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("type") in ["Bilevel", "Grayscale"], str(identify) assert identify[0]["image"].get("depth") == 1, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "1" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "1" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig") == "0" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color_type") == "0 (Grayscale)" ), str(identify) assert ( 
identify[0]["image"]["properties"]["png:IHDR.interlace_method"] == "0 (Not interlaced)" ), str(identify) return tmp_gray1_png @pytest.fixture(scope="session") def png_gray2_img(tmp_path_factory, tmp_gray2_png): identify = json.loads( subprocess.check_output(CONVERT + [str(tmp_gray2_png), "json:"]) ) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "PNG", str(identify) assert identify[0]["image"].get("mimeType") == "image/png", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("type") == "Grayscale", str(identify) assert identify[0]["image"].get("depth") == 2, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "2" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "2" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig") == "0" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color_type") == "0 (Grayscale)" ), str(identify) assert ( identify[0]["image"]["properties"]["png:IHDR.interlace_method"] == "0 (Not interlaced)" ), str(identify) return tmp_gray2_png @pytest.fixture(scope="session") def png_gray4_img(tmp_path_factory, tmp_gray4_png): identify = json.loads( subprocess.check_output(CONVERT + [str(tmp_gray4_png), "json:"]) ) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, 
here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "PNG", str(identify) assert identify[0]["image"].get("mimeType") == "image/png", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("type") == "Grayscale", str(identify) assert identify[0]["image"].get("depth") == 4, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "4" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "4" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig") == "0" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color_type") == "0 (Grayscale)" ), str(identify) assert ( identify[0]["image"]["properties"]["png:IHDR.interlace_method"] == "0 (Not interlaced)" ), str(identify) return tmp_gray4_png @pytest.fixture(scope="session") def png_gray8_img(tmp_path_factory, tmp_gray8_png): identify = json.loads( subprocess.check_output(CONVERT + [str(tmp_gray8_png), "json:"]) ) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "PNG", str(identify) assert identify[0]["image"].get("mimeType") == "image/png", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert 
identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("type") == "Grayscale", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "8" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "8" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig") == "0" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color_type") == "0 (Grayscale)" ), str(identify) assert ( identify[0]["image"]["properties"]["png:IHDR.interlace_method"] == "0 (Not interlaced)" ), str(identify) return tmp_gray8_png @pytest.fixture(scope="session") def png_gray16_img(tmp_path_factory, tmp_gray16_png): identify = json.loads( subprocess.check_output(CONVERT + [str(tmp_gray16_png), "json:"]) ) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "PNG", str(identify) assert identify[0]["image"].get("mimeType") == "image/png", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("type") == "Grayscale", str(identify) assert identify[0]["image"].get("depth") == 16, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "16" ), str(identify) 
assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "16" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig") == "0" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color_type") == "0 (Grayscale)" ), str(identify) assert ( identify[0]["image"]["properties"]["png:IHDR.interlace_method"] == "0 (Not interlaced)" ), str(identify) return tmp_gray16_png @pytest.fixture(scope="session") def png_palette1_img(tmp_path_factory, tmp_palette1_png): identify = json.loads( subprocess.check_output(CONVERT + [str(tmp_palette1_png), "json:"]) ) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "PNG", str(identify) assert identify[0]["image"].get("mimeType") == "image/png", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "Palette", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "1" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "1" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig") == "3" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color_type") == "3 (Indexed)" ), str(identify) assert ( identify[0]["image"]["properties"]["png:IHDR.interlace_method"] == "0 (Not 
interlaced)" ), str(identify) return tmp_palette1_png @pytest.fixture(scope="session") def png_palette2_img(tmp_path_factory, tmp_palette2_png): identify = json.loads( subprocess.check_output(CONVERT + [str(tmp_palette2_png), "json:"]) ) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "PNG", str(identify) assert identify[0]["image"].get("mimeType") == "image/png", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "Palette", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "2" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "2" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig") == "3" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color_type") == "3 (Indexed)" ), str(identify) assert ( identify[0]["image"]["properties"]["png:IHDR.interlace_method"] == "0 (Not interlaced)" ), str(identify) return tmp_palette2_png @pytest.fixture(scope="session") def png_palette4_img(tmp_path_factory, tmp_palette4_png): identify = json.loads( subprocess.check_output(CONVERT + [str(tmp_palette4_png), "json:"]) ) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # 
the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "PNG", str(identify) assert identify[0]["image"].get("mimeType") == "image/png", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "Palette", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "4" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "4" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig") == "3" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color_type") == "3 (Indexed)" ), str(identify) assert ( identify[0]["image"]["properties"]["png:IHDR.interlace_method"] == "0 (Not interlaced)" ), str(identify) return tmp_palette4_png @pytest.fixture(scope="session") def png_palette8_img(tmp_path_factory, tmp_palette8_png): identify = json.loads( subprocess.check_output(CONVERT + [str(tmp_palette8_png), "json:"]) ) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "PNG", str(identify) assert identify[0]["image"].get("mimeType") == "image/png", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", 
str(identify) assert identify[0]["image"].get("type") == "Palette", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "8" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "8" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig") == "3" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color_type") == "3 (Indexed)" ), str(identify) assert ( identify[0]["image"]["properties"]["png:IHDR.interlace_method"] == "0 (Not interlaced)" ), str(identify) return tmp_palette8_png @pytest.fixture(scope="session") def gif_transparent_img(tmp_path_factory, tmp_alpha_png): in_img = tmp_path_factory.mktemp("gif_transparent_img") / "in.gif" subprocess.check_call(CONVERT + [str(tmp_alpha_png), str(in_img)]) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "GIF", str(identify) assert identify[0]["image"].get("mimeType") == "image/gif", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "PaletteAlpha", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("colormapEntries") == 256, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, 
"height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "LZW", str(identify) return in_img @pytest.fixture(scope="session") def gif_palette1_img(tmp_path_factory, tmp_palette1_png): in_img = tmp_path_factory.mktemp("gif_palette1_img") / "in.gif" subprocess.check_call(CONVERT + [str(tmp_palette1_png), str(in_img)]) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "GIF", str(identify) assert identify[0]["image"].get("mimeType") == "image/gif", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "Palette", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("colormapEntries") == 2, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "LZW", str(identify) return in_img @pytest.fixture(scope="session") def gif_palette2_img(tmp_path_factory, tmp_palette2_png): in_img = tmp_path_factory.mktemp("gif_palette2_img") / "in.gif" subprocess.check_call(CONVERT + [str(tmp_palette2_png), str(in_img)]) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in 
identify[0] assert identify[0]["image"].get("format") == "GIF", str(identify) assert identify[0]["image"].get("mimeType") == "image/gif", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "Palette", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("colormapEntries") == 4, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "LZW", str(identify) return in_img @pytest.fixture(scope="session") def gif_palette4_img(tmp_path_factory, tmp_palette4_png): in_img = tmp_path_factory.mktemp("gif_palette4_img") / "in.gif" subprocess.check_call(CONVERT + [str(tmp_palette4_png), str(in_img)]) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "GIF", str(identify) assert identify[0]["image"].get("mimeType") == "image/gif", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "Palette", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("colormapEntries") == 16, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "LZW", 
str(identify) return in_img @pytest.fixture(scope="session") def gif_palette8_img(tmp_path_factory, tmp_palette8_png): in_img = tmp_path_factory.mktemp("gif_palette8_img") / "in.gif" subprocess.check_call(CONVERT + [str(tmp_palette8_png), str(in_img)]) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "GIF", str(identify) assert identify[0]["image"].get("mimeType") == "image/gif", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "Palette", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("colormapEntries") == 256, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "LZW", str(identify) return in_img @pytest.fixture(scope="session") def gif_animation_img(tmp_path_factory, tmp_normal_png, tmp_inverse_png): in_img = tmp_path_factory.mktemp("gif_animation_img") / "in.gif" pal_img = tmp_path_factory.mktemp("gif_animation_img") / "pal.gif" tmp_img = tmp_path_factory.mktemp("gif_animation_img") / "tmp.gif" subprocess.check_call( CONVERT + [ str(tmp_normal_png), str(tmp_inverse_png), str(tmp_img), ] ) # create palette image with all unique colors subprocess.check_call( CONVERT + [ str(tmp_img), "-unique-colors", str(pal_img), ] ) # make sure all frames have the same palette by using -remap subprocess.check_call( CONVERT + [str(tmp_img), "-strip", "-remap", 
str(pal_img), str(in_img)] ) pal_img.unlink() tmp_img.unlink() identify = json.loads( subprocess.check_output(CONVERT + [str(in_img) + "[0]", "json:"]) ) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "GIF", str(identify) assert identify[0]["image"].get("mimeType") == "image/gif", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "Palette", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("colormapEntries") == 256, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "LZW", str(identify) colormap_frame0 = identify[0]["image"].get("colormap") identify = json.loads( subprocess.check_output(CONVERT + [str(in_img) + "[1]", "json:"]) ) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "GIF", str(identify) assert identify[0]["image"].get("mimeType") == "image/gif", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "Palette", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) 
assert identify[0]["image"].get("colormapEntries") == 256, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "LZW", str(identify) assert identify[0]["image"].get("scene") == 1, str(identify) colormap_frame1 = identify[0]["image"].get("colormap") assert colormap_frame0 == colormap_frame1 return in_img @pytest.fixture(scope="session") def tiff_float_img(tmp_path_factory, tmp_normal_png): in_img = tmp_path_factory.mktemp("tiff_float_img") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_normal_png), "-depth", "32", "-define", "quantum:format=floating-point", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("baseDepth") == 32, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("quantum:format") == "floating-point" ), str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == 
"RGB" ), str(identify) return in_img @pytest.fixture(scope="session") def tiff_cmyk8_img(tmp_path_factory, tmp_normal_png): in_img = tmp_path_factory.mktemp("tiff_cmyk8") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_normal_png), "-colorspace", "cmyk", "-compress", "Zip", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "CMYK", str(identify) assert identify[0]["image"].get("type") == "ColorSeparation", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "separated" ), str(identify) return in_img @pytest.fixture(scope="session") def tiff_cmyk16_img(tmp_path_factory, tmp_normal_png): in_img = tmp_path_factory.mktemp("tiff_cmyk16") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_normal_png), "-depth", "16", "-colorspace", "cmyk", "-compress", "Zip", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the 
bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "CMYK", str(identify) assert identify[0]["image"].get("type") == "ColorSeparation", str(identify) assert identify[0]["image"].get("depth") == 16, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "separated" ), str(identify) return in_img @pytest.fixture(scope="session") def tiff_rgb8_img(tmp_path_factory, tmp_normal_png): in_img = tmp_path_factory.mktemp("tiff_rgb8") / "in.tiff" subprocess.check_call( CONVERT + [str(tmp_normal_png), "-compress", "Zip", str(in_img)] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 
60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB" ), str(identify) return in_img @pytest.fixture(scope="session") def tiff_rgb12_img(tmp_path_factory, tmp_normal16_png): in_img = tmp_path_factory.mktemp("tiff_rgb8") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_normal16_png), "-depth", "12", "-compress", "Zip", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("baseDepth") == 12, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB" ), str(identify) return in_img @pytest.fixture(scope="session") def tiff_rgb14_img(tmp_path_factory, tmp_normal16_png): in_img = tmp_path_factory.mktemp("tiff_rgb8") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_normal16_png), "-depth", "14", "-compress", "Zip", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + 
[str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("baseDepth") == 14, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB" ), str(identify) return in_img @pytest.fixture(scope="session") def tiff_rgb16_img(tmp_path_factory, tmp_normal16_png): in_img = tmp_path_factory.mktemp("tiff_rgb8") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_normal16_png), "-depth", "16", "-compress", "Zip", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert 
identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("depth") == 16, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB" ), str(identify) return in_img @pytest.fixture(scope="session") def tiff_rgba8_img(tmp_path_factory, tmp_alpha_png): in_img = tmp_path_factory.mktemp("tiff_rgba8") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_alpha_png), "-depth", "8", "-strip", "-compress", "Zip", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unassociated" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB" ), str(identify) return in_img @pytest.fixture(scope="session") def 
tiff_rgba16_img(tmp_path_factory, tmp_alpha_png): in_img = tmp_path_factory.mktemp("tiff_rgba16") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_alpha_png), "-depth", "16", "-strip", "-compress", "Zip", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColorAlpha", str(identify) assert identify[0]["image"].get("depth") == 16, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:alpha") == "unassociated" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB" ), str(identify) return in_img @pytest.fixture(scope="session") def tiff_gray1_img(tmp_path_factory, tmp_gray1_png): in_img = tmp_path_factory.mktemp("tiff_gray1") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_gray1_png), "-depth", "1", "-compress", "Zip", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert 
identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("type") == "Bilevel", str(identify) assert identify[0]["image"].get("depth") == 1, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "min-is-black" ), str(identify) return in_img @pytest.fixture(scope="session") def tiff_gray2_img(tmp_path_factory, tmp_gray2_png): in_img = tmp_path_factory.mktemp("tiff_gray2") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_gray2_png), "-depth", "2", "-compress", "Zip", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("type") == "Grayscale", str(identify) assert identify[0]["image"].get("depth") == 2, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert 
identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "min-is-black" ), str(identify) return in_img @pytest.fixture(scope="session") def tiff_gray4_img(tmp_path_factory, tmp_gray4_png): in_img = tmp_path_factory.mktemp("tiff_gray4") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_gray4_png), "-depth", "4", "-compress", "Zip", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("type") == "Grayscale", str(identify) assert identify[0]["image"].get("depth") == 4, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "min-is-black" ), str(identify) return in_img @pytest.fixture(scope="session") def tiff_gray8_img(tmp_path_factory, tmp_gray8_png): in_img = tmp_path_factory.mktemp("tiff_gray8") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_gray8_png), "-depth", "8", "-compress", "Zip", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # 
somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("type") == "Grayscale", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "min-is-black" ), str(identify) return in_img @pytest.fixture(scope="session") def tiff_gray16_img(tmp_path_factory, tmp_gray16_png): in_img = tmp_path_factory.mktemp("tiff_gray16") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_gray16_png), "-depth", "16", "-compress", "Zip", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert 
identify[0]["image"].get("type") == "Grayscale", str(identify) assert identify[0]["image"].get("depth") == 16, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "min-is-black" ), str(identify) return in_img @pytest.fixture(scope="session") def tiff_multipage_img(tmp_path_factory, tmp_normal_png, tmp_inverse_png): in_img = tmp_path_factory.mktemp("tiff_multipage_img") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_normal_png), str(tmp_inverse_png), "-strip", "-compress", "Zip", str(in_img), ] ) identify = json.loads( subprocess.check_output(CONVERT + [str(in_img) + "[0]", "json:"]) ) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB" ), str(identify) identify = json.loads( subprocess.check_output(CONVERT + [str(in_img) + "[1]", 
"json:"]) ) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "RGB" ), str(identify) assert identify[0]["image"].get("scene") == 1, str(identify) return in_img @pytest.fixture(scope="session") def tiff_palette1_img(tmp_path_factory, tmp_palette1_png): in_img = tmp_path_factory.mktemp("tiff_palette1_img") / "in.tiff" subprocess.check_call( CONVERT + [str(tmp_palette1_png), "-compress", "Zip", str(in_img)] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert 
identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "Palette", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("baseDepth") == 1, str(identify) assert identify[0]["image"].get("colormapEntries") == 2, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "palette" ), str(identify) return in_img @pytest.fixture(scope="session") def tiff_palette2_img(tmp_path_factory, tmp_palette2_png): in_img = tmp_path_factory.mktemp("tiff_palette2_img") / "in.tiff" subprocess.check_call( CONVERT + [str(tmp_palette2_png), "-compress", "Zip", str(in_img)] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "Palette", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("baseDepth") == 2, str(identify) assert identify[0]["image"].get("colormapEntries") == 4, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert 
identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "palette" ), str(identify) return in_img @pytest.fixture(scope="session") def tiff_palette4_img(tmp_path_factory, tmp_palette4_png): in_img = tmp_path_factory.mktemp("tiff_palette4_img") / "in.tiff" subprocess.check_call( CONVERT + [str(tmp_palette4_png), "-compress", "Zip", str(in_img)] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "Palette", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("baseDepth") == 4, str(identify) assert identify[0]["image"].get("colormapEntries") == 16, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "palette" ), str(identify) return in_img @pytest.fixture(scope="session") def tiff_palette8_img(tmp_path_factory, tmp_palette8_png): in_img = tmp_path_factory.mktemp("tiff_palette8_img") / "in.tiff" subprocess.check_call( CONVERT + [str(tmp_palette8_png), "-compress", "Zip", 
str(in_img)] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "Palette", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("colormapEntries") == 256, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "palette" ), str(identify) return in_img @pytest.fixture(scope="session") def tiff_ccitt_lsb_m2l_white_img(tmp_path_factory, tmp_gray1_png): in_img = tmp_path_factory.mktemp("tiff_ccitt_lsb_m2l_white_img") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_gray1_png), "-compress", "group4", "-define", "tiff:endian=lsb", "-define", "tiff:fill-order=msb", "-define", "quantum:polarity=min-is-white", "-compress", "Group4", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert 
identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("type") == "Bilevel", str(identify) endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness" assert identify[0]["image"].get(endian) in [ "Undefined", "LSB", ], str(identify) assert identify[0]["image"].get("depth") == 1, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "Group4", str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str( identify ) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "min-is-white" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:rows-per-strip") == "60" ), str(identify) tiffinfo = subprocess.check_output(["tiffinfo", str(in_img)]) expected = [ r"^ Image Width: 60 Image Length: 60", r"^ Bits/Sample: 1", r"^ Compression Scheme: CCITT Group 4", r"^ Photometric Interpretation: min-is-white", r"^ FillOrder: msb-to-lsb", r"^ Samples/Pixel: 1", r"^ Rows/Strip: 60", ] for e in expected: assert re.search(e, tiffinfo.decode("utf8"), re.MULTILINE), identify.decode( "utf8" ) return in_img @pytest.fixture(scope="session") def tiff_ccitt_msb_m2l_white_img(tmp_path_factory, tmp_gray1_png): in_img = tmp_path_factory.mktemp("tiff_ccitt_msb_m2l_white_img") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_gray1_png), "-compress", "group4", "-define", "tiff:endian=msb", "-define", "tiff:fill-order=msb", "-define", "quantum:polarity=min-is-white", 
"-compress", "Group4", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("type") == "Bilevel", str(identify) endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness" assert identify[0]["image"].get(endian) in [ "Undefined", "MSB", ] # FIXME: should be MSB assert identify[0]["image"].get("depth") == 1, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "Group4", str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "msb", str( identify ) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "min-is-white" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:rows-per-strip") == "60" ), str(identify) tiffinfo = subprocess.check_output(["tiffinfo", str(in_img)]) expected = [ r"^ Image Width: 60 Image Length: 60", r"^ Bits/Sample: 1", r"^ Compression Scheme: CCITT Group 4", r"^ Photometric Interpretation: min-is-white", r"^ FillOrder: msb-to-lsb", r"^ Samples/Pixel: 1", r"^ Rows/Strip: 60", ] for e in expected: assert re.search(e, tiffinfo.decode("utf8"), re.MULTILINE), 
identify.decode( "utf8" ) return in_img @pytest.fixture(scope="session") def tiff_ccitt_msb_l2m_white_img(tmp_path_factory, tmp_gray1_png): in_img = tmp_path_factory.mktemp("tiff_ccitt_msb_l2m_white_img") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_gray1_png), "-compress", "group4", "-define", "tiff:endian=msb", "-define", "tiff:fill-order=lsb", "-define", "quantum:polarity=min-is-white", "-compress", "Group4", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("type") == "Bilevel", str(identify) endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness" assert identify[0]["image"].get(endian) in [ "Undefined", "MSB", ] # FIXME: should be MSB assert identify[0]["image"].get("depth") == 1, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "Group4", str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "msb", str( identify ) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "min-is-white" ), str(identify) assert ( identify[0]["image"].get("properties", 
{}).get("tiff:rows-per-strip") == "60" ), str(identify) tiffinfo = subprocess.check_output(["tiffinfo", str(in_img)]) expected = [ r"^ Image Width: 60 Image Length: 60", r"^ Bits/Sample: 1", r"^ Compression Scheme: CCITT Group 4", r"^ Photometric Interpretation: min-is-white", r"^ FillOrder: lsb-to-msb", r"^ Samples/Pixel: 1", r"^ Rows/Strip: 60", ] for e in expected: assert re.search(e, tiffinfo.decode("utf8"), re.MULTILINE), identify.decode( "utf8" ) return in_img @pytest.fixture(scope="session") def tiff_ccitt_lsb_m2l_black_img(tmp_path_factory, tmp_gray1_png): in_img = tmp_path_factory.mktemp("tiff_ccitt_lsb_m2l_black_img") / "in.tiff" # "-define quantum:polarity=min-is-black" requires ImageMagick with: # https://github.com/ImageMagick/ImageMagick/commit/00730551f0a34328685c59d0dde87dd9e366103a # or at least 7.0.8-11 from Aug 29, 2018 # or at least 6.9.10-12 from Sep 7, 2018 (for the ImageMagick6 branch) # also see: https://www.imagemagick.org/discourse-server/viewtopic.php?f=1&t=34605 subprocess.check_call( CONVERT + [ str(tmp_gray1_png), "-compress", "group4", "-define", "tiff:endian=lsb", "-define", "tiff:fill-order=msb", "-define", "quantum:polarity=min-is-black", "-compress", "Group4", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("type") == "Bilevel", str(identify) endian = "endianess" if 
identify[0].get("version", "0") < "1.0" else "endianness" assert identify[0]["image"].get(endian) in [ "Undefined", "LSB", ], str(identify) assert identify[0]["image"].get("depth") == 1, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "Group4", str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str( identify ) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "min-is-black" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:rows-per-strip") == "60" ), str(identify) tiffinfo = subprocess.check_output(["tiffinfo", str(in_img)]) expected = [ r"^ Image Width: 60 Image Length: 60", r"^ Bits/Sample: 1", r"^ Compression Scheme: CCITT Group 4", r"^ Photometric Interpretation: min-is-black", r"^ FillOrder: msb-to-lsb", r"^ Samples/Pixel: 1", r"^ Rows/Strip: 60", ] for e in expected: assert re.search(e, tiffinfo.decode("utf8"), re.MULTILINE), identify.decode( "utf8" ) return in_img @pytest.fixture(scope="session") def tiff_ccitt_nometa1_img(tmp_path_factory, tmp_gray1_png): in_img = tmp_path_factory.mktemp("tiff_ccitt_nometa1_img") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_gray1_png), "-compress", "group4", "-define", "tiff:endian=lsb", "-define", "tiff:fill-order=msb", "-define", "quantum:polarity=min-is-white", "-compress", "Group4", str(in_img), ] ) subprocess.check_call( ["tiffset", "-u", "258", str(in_img)] ) # remove BitsPerSample (258) subprocess.check_call( ["tiffset", "-u", "266", str(in_img)] ) # remove FillOrder (266) subprocess.check_call( ["tiffset", "-u", "277", str(in_img)] ) # remove SamplesPerPixel (277) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # 
somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("type") == "Bilevel", str(identify) endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness" assert identify[0]["image"].get(endian) in [ "Undefined", "LSB", ], str(identify) assert identify[0]["image"].get("depth") == 1, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("compression") == "Group4", str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str( identify ) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "min-is-white" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("tiff:rows-per-strip") == "60" ), str(identify) tiffinfo = subprocess.check_output(["tiffinfo", str(in_img)]) expected = [ r"^ Image Width: 60 Image Length: 60", r"^ Compression Scheme: CCITT Group 4", r"^ Photometric Interpretation: min-is-white", r"^ Rows/Strip: 60", ] for e in expected: assert re.search(e, tiffinfo.decode("utf8"), re.MULTILINE), str(identify) unexpected = [" Bits/Sample: ", " FillOrder: ", " Samples/Pixel: "] for e in unexpected: assert e not in tiffinfo.decode("utf8") return in_img @pytest.fixture(scope="session") def
tiff_ccitt_nometa2_img(tmp_path_factory, tmp_gray1_png): in_img = tmp_path_factory.mktemp("tiff_ccitt_nometa2_img") / "in.tiff" subprocess.check_call( CONVERT + [ str(tmp_gray1_png), "-compress", "group4", "-define", "tiff:endian=lsb", "-define", "tiff:fill-order=msb", "-define", "quantum:polarity=min-is-white", "-compress", "Group4", str(in_img), ] ) subprocess.check_call( ["tiffset", "-u", "278", str(in_img)] ) # remove RowsPerStrip (278) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "TIFF", str(identify) assert identify[0]["image"].get("mimeType") == "image/tiff", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("units") == "PixelsPerInch", str(identify) assert identify[0]["image"].get("type") == "Bilevel", str(identify) endian = "endianess" if identify[0].get("version", "0") < "1.0" else "endianness" assert identify[0]["image"].get(endian) in [ "Undefined", "LSB", ], str(identify) assert identify[0]["image"].get("colorspace") == "Gray", str(identify) assert identify[0]["image"].get("depth") == 1, str(identify) assert identify[0]["image"].get("compression") == "Group4", str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:alpha") in [ "unspecified", None, ], str(identify) assert identify[0]["image"].get("properties", {}).get("tiff:endian") == "lsb", str( identify ) assert ( identify[0]["image"].get("properties", {}).get("tiff:photometric") == "min-is-white" ), str(identify) assert "tiff:rows-per-strip" not in identify[0]["image"]["properties"] tiffinfo = subprocess.check_output(["tiffinfo", 
str(in_img)]) expected = [ r"^ Image Width: 60 Image Length: 60", r"^ Bits/Sample: 1", r"^ Compression Scheme: CCITT Group 4", r"^ Photometric Interpretation: min-is-white", r"^ FillOrder: msb-to-lsb", r"^ Samples/Pixel: 1", ] for e in expected: assert re.search(e, tiffinfo.decode("utf8"), re.MULTILINE), str(identify) unexpected = [" Rows/Strip: "] for e in unexpected: assert e not in tiffinfo.decode("utf8") return in_img @pytest.fixture(scope="session") def miff_cmyk8_img(tmp_path_factory, tmp_normal_png): in_img = tmp_path_factory.mktemp("miff_cmyk8") / "in.miff" subprocess.check_call( CONVERT + [ str(tmp_normal_png), "-colorspace", "cmyk", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "MIFF", str(identify) assert identify[0]["image"].get("class") == "DirectClass" assert identify[0]["image"].get("type") == "ColorSeparation" assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "CMYK", str(identify) assert identify[0]["image"].get("type") == "ColorSeparation", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) return in_img @pytest.fixture(scope="session") def miff_cmyk16_img(tmp_path_factory, tmp_normal_png): in_img = tmp_path_factory.mktemp("miff_cmyk16") / "in.miff" subprocess.check_call( CONVERT + [ str(tmp_normal_png), "-depth", "16", "-colorspace", "cmyk", str(in_img), ] ) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img),
"json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "MIFF", str(identify) assert identify[0]["image"].get("class") == "DirectClass" assert identify[0]["image"].get("type") == "ColorSeparation" assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "CMYK", str(identify) assert identify[0]["image"].get("type") == "ColorSeparation", str(identify) assert identify[0]["image"].get("depth") == 16, str(identify) assert identify[0]["image"].get("baseDepth") == 16, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) return in_img @pytest.fixture(scope="session") def miff_rgb8_img(tmp_path_factory, tmp_normal_png): in_img = tmp_path_factory.mktemp("miff_rgb8") / "in.miff" subprocess.check_call(CONVERT + [str(tmp_normal_png), str(in_img)]) identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "MIFF", str(identify) assert identify[0]["image"].get("class") == "DirectClass" assert identify[0]["image"].get("type") == "TrueColor" assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert 
identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) return in_img @pytest.fixture(scope="session") def png_icc_img(tmp_icc_png): in_img = tmp_icc_png identify = json.loads(subprocess.check_output(CONVERT + [str(in_img), "json:"])) assert len(identify) == 1 # somewhere between imagemagick 6.9.7.4 and 6.9.9.34, the json output was # put into an array, here we cater for the older version containing just # the bare dictionary if "image" in identify: identify = [identify] assert "image" in identify[0] assert identify[0]["image"].get("format") == "PNG", str(identify) assert identify[0]["image"].get("mimeType") == "image/png", str(identify) assert identify[0]["image"].get("geometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert identify[0]["image"].get("colorspace") == "sRGB", str(identify) assert identify[0]["image"].get("type") == "TrueColor", str(identify) assert identify[0]["image"].get("depth") == 8, str(identify) assert identify[0]["image"].get("pageGeometry") == { "width": 60, "height": 60, "x": 0, "y": 0, }, str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit-depth-orig") == "8" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.bit_depth") == "8" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color-type-orig") == "2" ), str(identify) assert ( identify[0]["image"].get("properties", {}).get("png:IHDR.color_type") == "2 (Truecolor)" ), str(identify) assert ( identify[0]["image"]["properties"]["png:IHDR.interlace_method"] == "0 (Not interlaced)" ), str(identify) return in_img ############################################################################### # OUTPUT FIXTURES # ############################################################################### @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def 
jpg_pdf(tmp_path_factory, jpg_img, request): out_pdf = tmp_path_factory.mktemp("jpg_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(jpg_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.Filter == "/DCTDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def jpg_rot_pdf(tmp_path_factory, jpg_rot_img, request): out_pdf = tmp_path_factory.mktemp("jpg_rot_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(jpg_rot_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.Filter == "/DCTDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 assert p.pages[0].Rotate == 90 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def jpg_cmyk_pdf(tmp_path_factory, jpg_cmyk_img, request): out_pdf = tmp_path_factory.mktemp("jpg_cmyk_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(jpg_cmyk_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert 
p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceCMYK" assert p.pages[0].Resources.XObject.Im0.Decode == pikepdf.Array( [1, 0, 1, 0, 1, 0, 1, 0] ) assert p.pages[0].Resources.XObject.Im0.Filter == "/DCTDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def jpg_2000_pdf(tmp_path_factory, jpg_2000_img, request): out_pdf = tmp_path_factory.mktemp("jpg_2000_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), jpg_2000_img, ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.Filter == "/JPXDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def jpg_2000_rgba8_pdf(tmp_path_factory, jpg_2000_rgba8_img, request): out_pdf = tmp_path_factory.mktemp("jpg_2000_rgba8_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), jpg_2000_rgba8_img, ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert not hasattr(p.pages[0].Resources.XObject.Im0, "ColorSpace") assert p.pages[0].Resources.XObject.Im0.Filter == "/JPXDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf 
@pytest.fixture(scope="session", params=["internal", "pikepdf"]) def jpg_2000_rgba16_pdf(tmp_path_factory, jpg_2000_rgba16_img, request): out_pdf = tmp_path_factory.mktemp("jpg_2000_rgba16_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), jpg_2000_rgba16_img, ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 16 assert not hasattr(p.pages[0].Resources.XObject.Im0, "ColorSpace") assert p.pages[0].Resources.XObject.Im0.Filter == "/JPXDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def png_rgb8_pdf(tmp_path_factory, png_rgb8_img, request): out_pdf = tmp_path_factory.mktemp("png_rgb8_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(png_rgb8_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 3 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def png_rgba8_pdf(tmp_path_factory, png_rgba8_img, request): out_pdf = tmp_path_factory.mktemp("png_rgba8_pdf") / "out.pdf" 
subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(png_rgba8_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 3 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 assert p.pages[0].Resources.XObject.Im0.SMask is not None assert p.pages[0].Resources.XObject.Im0.SMask.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.SMask.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.SMask.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.SMask.Height == 60 assert p.pages[0].Resources.XObject.Im0.SMask.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def gif_transparent_pdf(tmp_path_factory, gif_transparent_img, request): out_pdf = tmp_path_factory.mktemp("gif_transparent_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(gif_transparent_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert 
p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed" assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 assert p.pages[0].Resources.XObject.Im0.SMask is not None assert p.pages[0].Resources.XObject.Im0.SMask.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.SMask.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.SMask.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.SMask.Height == 60 assert p.pages[0].Resources.XObject.Im0.SMask.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def png_rgb16_pdf(tmp_path_factory, png_rgb16_img, request): out_pdf = tmp_path_factory.mktemp("png_rgb16_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(png_rgb16_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 16 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 16 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 3 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == 
"/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def png_interlaced_pdf(tmp_path_factory, png_interlaced_img, request): out_pdf = tmp_path_factory.mktemp("png_interlaced_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(png_interlaced_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 3 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def png_gray1_pdf(tmp_path_factory, tmp_gray1_png, request): out_pdf = tmp_path_factory.mktemp("png_gray1_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tmp_gray1_png), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert 
p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def png_gray2_pdf(tmp_path_factory, tmp_gray2_png, request): out_pdf = tmp_path_factory.mktemp("png_gray2_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tmp_gray2_png), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 2 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 2 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def png_gray4_pdf(tmp_path_factory, tmp_gray4_png, request): out_pdf = tmp_path_factory.mktemp("png_gray4_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tmp_gray4_png), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 4 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 4 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert 
p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def png_gray8_pdf(tmp_path_factory, tmp_gray8_png, request): out_pdf = tmp_path_factory.mktemp("png_gray8_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tmp_gray8_png), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def png_gray8a_pdf(tmp_path_factory, png_gray8a_img, request): out_pdf = tmp_path_factory.mktemp("png_gray8a_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(png_gray8a_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert 
p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 assert p.pages[0].Resources.XObject.Im0.SMask is not None assert p.pages[0].Resources.XObject.Im0.SMask.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.SMask.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.SMask.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.SMask.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.SMask.Height == 60 assert p.pages[0].Resources.XObject.Im0.SMask.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def png_gray16_pdf(tmp_path_factory, tmp_gray16_png, request): out_pdf = tmp_path_factory.mktemp("png_gray16_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tmp_gray16_png), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 16 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 16 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def png_palette1_pdf(tmp_path_factory, tmp_palette1_png, request): out_pdf = tmp_path_factory.mktemp("png_palette1_pdf") / "out.pdf" 
subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tmp_palette1_png), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1 assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed" assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def png_palette2_pdf(tmp_path_factory, tmp_palette2_png, request): out_pdf = tmp_path_factory.mktemp("png_palette2_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tmp_palette2_png), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 2 assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed" assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 2 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", 
params=["internal", "pikepdf"]) def png_palette4_pdf(tmp_path_factory, tmp_palette4_png, request): out_pdf = tmp_path_factory.mktemp("png_palette4_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tmp_palette4_png), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 4 assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed" assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 4 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def png_palette8_pdf(tmp_path_factory, tmp_palette8_png, request): out_pdf = tmp_path_factory.mktemp("png_palette8_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tmp_palette8_png), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed" assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == 
"/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def png_icc_pdf(tmp_path_factory, tmp_icc_png, tmp_icc_profile, request): out_pdf = tmp_path_factory.mktemp("png_icc_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tmp_icc_png), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/ICCBased" assert p.pages[0].Resources.XObject.Im0.ColorSpace[1].N == 3 assert p.pages[0].Resources.XObject.Im0.ColorSpace[1].Alternate == "/DeviceRGB" assert ( p.pages[0].Resources.XObject.Im0.ColorSpace[1].read_bytes() == tmp_icc_profile.read_bytes() ) assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 3 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def gif_palette1_pdf(tmp_path_factory, gif_palette1_img, request): out_pdf = tmp_path_factory.mktemp("gif_palette1_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(gif_palette1_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1 assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == 
"/Indexed" assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def gif_palette2_pdf(tmp_path_factory, gif_palette2_img, request): out_pdf = tmp_path_factory.mktemp("gif_palette2_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(gif_palette2_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 2 assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed" assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 2 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def gif_palette4_pdf(tmp_path_factory, gif_palette4_img, request): out_pdf = tmp_path_factory.mktemp("gif_palette4_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(gif_palette4_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 
45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 4 assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed" assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 4 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def gif_palette8_pdf(tmp_path_factory, gif_palette8_img, request): out_pdf = tmp_path_factory.mktemp("gif_palette8_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(gif_palette8_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed" assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def gif_animation_pdf(tmp_path_factory, gif_animation_img, request): tmpdir = tmp_path_factory.mktemp("gif_animation_pdf") out_pdf = tmpdir / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", 
"--engine=" + request.param, "--output=" + str(out_pdf), str(gif_animation_img), ] ) pdfinfo = subprocess.check_output(["pdfinfo", str(out_pdf)]) assert re.search( "^Pages: +2$", pdfinfo.decode("utf8"), re.MULTILINE ), identify.decode("utf8") subprocess.check_call(["pdfseparate", str(out_pdf), str(tmpdir / "page-%d.pdf")]) for page in [1, 2]: gif_animation_pdf_nr = tmpdir / ("page-%d.pdf" % page) with pikepdf.open(gif_animation_pdf_nr) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed" assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 gif_animation_pdf_nr.unlink() return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def tiff_cmyk8_pdf(tmp_path_factory, tiff_cmyk8_img, request): out_pdf = tmp_path_factory.mktemp("tiff_cmyk8_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tiff_cmyk8_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceCMYK" assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", 
params=["internal", "pikepdf"]) def tiff_rgb8_pdf(tmp_path_factory, tiff_rgb8_img, request): out_pdf = tmp_path_factory.mktemp("tiff_rgb8_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tiff_rgb8_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 3 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def tiff_gray1_pdf(tmp_path_factory, tiff_gray1_img, request): out_pdf = tmp_path_factory.mktemp("tiff_gray1_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tiff_gray1_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].BlackIs1 == True assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Columns == 60 assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].K == -1 assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Rows == 60 assert p.pages[0].Resources.XObject.Im0.Filter[0] == "/CCITTFaxDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert 
p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def tiff_gray2_pdf(tmp_path_factory, tiff_gray2_img, request): out_pdf = tmp_path_factory.mktemp("tiff_gray2_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tiff_gray2_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def tiff_gray4_pdf(tmp_path_factory, tiff_gray4_img, request): out_pdf = tmp_path_factory.mktemp("tiff_gray4_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tiff_gray4_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert 
p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def tiff_gray8_pdf(tmp_path_factory, tiff_gray8_img, request): out_pdf = tmp_path_factory.mktemp("tiff_gray8_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tiff_gray8_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def tiff_multipage_pdf(tmp_path_factory, tiff_multipage_img, request): tmpdir = tmp_path_factory.mktemp("tiff_multipage_pdf") out_pdf = tmpdir / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tiff_multipage_img), ] ) pdfinfo = subprocess.check_output(["pdfinfo", str(out_pdf)]) assert re.search( "^Pages: +2$", pdfinfo.decode("utf8"), re.MULTILINE ), pdfinfo.decode("utf8") subprocess.check_call(["pdfseparate", str(out_pdf), str(tmpdir / "page-%d.pdf")]) for page in [1, 2]: tiff_multipage_pdf_nr = tmpdir / ("page-%d.pdf" % page) with pikepdf.open(tiff_multipage_pdf_nr) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert 
p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 3 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 tiff_multipage_pdf_nr.unlink() return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def tiff_palette1_pdf(tmp_path_factory, tiff_palette1_img, request): out_pdf = tmp_path_factory.mktemp("tiff_palette1_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tiff_palette1_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1 assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed" assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def tiff_palette2_pdf(tmp_path_factory, tiff_palette2_img, request): out_pdf = tmp_path_factory.mktemp("tiff_palette2_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tiff_palette2_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == 
b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 2 assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed" assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 2 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def tiff_palette4_pdf(tmp_path_factory, tiff_palette4_img, request): out_pdf = tmp_path_factory.mktemp("tiff_palette4_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tiff_palette4_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 4 assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed" assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 4 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def tiff_palette8_pdf(tmp_path_factory, tiff_palette8_img, request): out_pdf = tmp_path_factory.mktemp("tiff_palette8_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", 
"--engine=" + request.param, "--output=" + str(out_pdf), str(tiff_palette8_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace[0] == "/Indexed" assert p.pages[0].Resources.XObject.Im0.ColorSpace[1] == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 1 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def tiff_ccitt_lsb_m2l_white_pdf( tmp_path_factory, tiff_ccitt_lsb_m2l_white_img, request ): out_pdf = tmp_path_factory.mktemp("tiff_ccitt_lsb_m2l_white_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tiff_ccitt_lsb_m2l_white_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].BlackIs1 == False assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Columns == 60 assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].K == -1 assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Rows == 60 assert p.pages[0].Resources.XObject.Im0.Filter[0] == "/CCITTFaxDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", 
"pikepdf"]) def tiff_ccitt_msb_m2l_white_pdf( tmp_path_factory, tiff_ccitt_msb_m2l_white_img, request ): out_pdf = tmp_path_factory.mktemp("tiff_ccitt_msb_m2l_white_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tiff_ccitt_msb_m2l_white_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].BlackIs1 == False assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Columns == 60 assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].K == -1 assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Rows == 60 assert p.pages[0].Resources.XObject.Im0.Filter[0] == "/CCITTFaxDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def tiff_ccitt_msb_l2m_white_pdf( tmp_path_factory, tiff_ccitt_msb_l2m_white_img, request ): out_pdf = tmp_path_factory.mktemp("tiff_ccitt_msb_l2m_white_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tiff_ccitt_msb_l2m_white_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].BlackIs1 == False assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Columns == 60 assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].K == -1 assert 
p.pages[0].Resources.XObject.Im0.DecodeParms[0].Rows == 60 assert p.pages[0].Resources.XObject.Im0.Filter[0] == "/CCITTFaxDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def tiff_ccitt_lsb_m2l_black_pdf( tmp_path_factory, tiff_ccitt_lsb_m2l_black_img, request ): out_pdf = tmp_path_factory.mktemp("tiff_ccitt_lsb_m2l_black_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tiff_ccitt_lsb_m2l_black_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].BlackIs1 == True assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Columns == 60 assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].K == -1 assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Rows == 60 assert p.pages[0].Resources.XObject.Im0.Filter[0] == "/CCITTFaxDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def tiff_ccitt_nometa1_pdf(tmp_path_factory, tiff_ccitt_nometa1_img, request): out_pdf = tmp_path_factory.mktemp("tiff_ccitt_nometa1_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tiff_ccitt_nometa1_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1 assert 
p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].BlackIs1 == False assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Columns == 60 assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].K == -1 assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Rows == 60 assert p.pages[0].Resources.XObject.Im0.Filter[0] == "/CCITTFaxDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def tiff_ccitt_nometa2_pdf(tmp_path_factory, tiff_ccitt_nometa2_img, request): out_pdf = tmp_path_factory.mktemp("tiff_ccitt_nometa2_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(tiff_ccitt_nometa2_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 1 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceGray" assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].BlackIs1 == False assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Columns == 60 assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].K == -1 assert p.pages[0].Resources.XObject.Im0.DecodeParms[0].Rows == 60 assert p.pages[0].Resources.XObject.Im0.Filter[0] == "/CCITTFaxDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def miff_cmyk8_pdf(tmp_path_factory, miff_cmyk8_img, request): out_pdf = tmp_path_factory.mktemp("miff_cmyk8_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(miff_cmyk8_img), ] ) with pikepdf.open(str(out_pdf)) as 
p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceCMYK" assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def miff_cmyk16_pdf(tmp_path_factory, miff_cmyk16_img, request): out_pdf = tmp_path_factory.mktemp("miff_cmyk16_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(miff_cmyk16_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 16 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceCMYK" assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf @pytest.fixture(scope="session", params=["internal", "pikepdf"]) def miff_rgb8_pdf(tmp_path_factory, miff_rgb8_img, request): out_pdf = tmp_path_factory.mktemp("miff_rgb8_pdf") / "out.pdf" subprocess.check_call( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + request.param, "--output=" + str(out_pdf), str(miff_rgb8_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert ( p.pages[0].Contents.read_bytes() == b"q\n45.0000 0 0 45.0000 0.0000 0.0000 cm\n/Im0 Do\nQ" ) assert p.pages[0].Resources.XObject.Im0.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.ColorSpace == "/DeviceRGB" assert p.pages[0].Resources.XObject.Im0.DecodeParms.BitsPerComponent == 8 assert p.pages[0].Resources.XObject.Im0.DecodeParms.Colors == 3 assert 
p.pages[0].Resources.XObject.Im0.DecodeParms.Predictor == 15 assert p.pages[0].Resources.XObject.Im0.Filter == "/FlateDecode" assert p.pages[0].Resources.XObject.Im0.Height == 60 assert p.pages[0].Resources.XObject.Im0.Width == 60 return out_pdf ############################################################################### # TEST CASES # ############################################################################### @pytest.mark.skipif( sys.platform in ["darwin", "win32"], reason="test utilities not available on Windows and MacOS", ) def test_jpg(tmp_path_factory, jpg_img, jpg_pdf): tmpdir = tmp_path_factory.mktemp("jpg") pnm = tmpdir / "jpg.pnm" # We have to use jpegtopnm with the original JPG before being able to compare # it with imagemagick because imagemagick will decode the JPG slightly # differently than ghostscript, poppler and mupdf do it. # We have to use jpegtopnm and cannot use djpeg because the latter produces # slightly different results as well when called like this: # djpeg -dct int -pnm "$tempdir/normal.jpg" > "$tempdir/normal.pnm" # An alternative way to compare the JPG would be to require a different DCT # method when decoding by setting -define jpeg:dct-method=ifast in the # compare command. pnm.write_bytes(subprocess.check_output(["jpegtopnm", "-dct", "int", str(jpg_img)])) compare_ghostscript(tmpdir, pnm, jpg_pdf) compare_poppler(tmpdir, pnm, jpg_pdf) compare_mupdf(tmpdir, pnm, jpg_pdf) pnm.unlink() compare_pdfimages_jpg(tmpdir, jpg_img, jpg_pdf) @pytest.mark.skipif( sys.platform in ["darwin", "win32"], reason="test utilities not available on Windows and MacOS", ) def test_jpg_rot(tmp_path_factory, jpg_rot_img, jpg_rot_pdf): tmpdir = tmp_path_factory.mktemp("jpg_rot") # We have to use jpegtopnm with the original JPG before being able to compare # it with imagemagick because imagemagick will decode the JPG slightly # differently than ghostscript, poppler and mupdf do it. 
# We have to use jpegtopnm and cannot use djpeg because the latter produces # slightly different results as well when called like this: # djpeg -dct int -pnm "$tempdir/normal.jpg" > "$tempdir/normal.pnm" # An alternative way to compare the JPG would be to require a different DCT # method when decoding by setting -define jpeg:dct-method=ifast in the # compare command. jpg_rot_pnm = tmpdir / "jpg_rot.pnm" jpg_rot_pnm.write_bytes( subprocess.check_output(["jpegtopnm", "-dct", "int", str(jpg_rot_img)]) ) jpg_rot_png = tmpdir / "jpg_rot.png" subprocess.check_call( CONVERT + ["-rotate", "90", str(jpg_rot_pnm), str(jpg_rot_png)] ) jpg_rot_pnm.unlink() compare_ghostscript(tmpdir, jpg_rot_png, jpg_rot_pdf) compare_poppler(tmpdir, jpg_rot_png, jpg_rot_pdf) compare_mupdf(tmpdir, jpg_rot_png, jpg_rot_pdf) jpg_rot_png.unlink() compare_pdfimages_jpg(tmpdir, jpg_rot_img, jpg_rot_pdf) @pytest.mark.skipif( sys.platform in ["darwin", "win32"], reason="test utilities not available on Windows and MacOS", ) def test_jpg_cmyk(tmp_path_factory, jpg_cmyk_img, jpg_cmyk_pdf): tmpdir = tmp_path_factory.mktemp("jpg_cmyk") compare_ghostscript( tmpdir, jpg_cmyk_img, jpg_cmyk_pdf, gsdevice="tiff32nc", exact=HAVE_EXACT_CMYK8 ) # not testing with poppler as it cannot write CMYK images compare_mupdf(tmpdir, jpg_cmyk_img, jpg_cmyk_pdf, exact=HAVE_EXACT_CMYK8, cmyk=True) compare_pdfimages_cmyk(tmpdir, jpg_cmyk_img, jpg_cmyk_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) @pytest.mark.skipif( not HAVE_JP2, reason="requires imagemagick with support for jpeg2000" ) def test_jpg_2000(tmp_path_factory, jpg_2000_img, jpg_2000_pdf): tmpdir = tmp_path_factory.mktemp("jpg_2000") compare_ghostscript(tmpdir, jpg_2000_img, jpg_2000_pdf) compare_poppler(tmpdir, jpg_2000_img, jpg_2000_pdf) compare_mupdf(tmpdir, jpg_2000_img, jpg_2000_pdf) compare_pdfimages_jp2(tmpdir, jpg_2000_img, jpg_2000_pdf) @pytest.mark.skipif( sys.platform in ["win32"], 
reason="test utilities not available on Windows and MacOS", ) @pytest.mark.skipif( not HAVE_JP2, reason="requires imagemagick with support for jpeg2000" ) def test_jpg_2000_rgba8(tmp_path_factory, jpg_2000_rgba8_img, jpg_2000_rgba8_pdf): tmpdir = tmp_path_factory.mktemp("jpg_2000_rgba8") compare_ghostscript(tmpdir, jpg_2000_rgba8_img, jpg_2000_rgba8_pdf) # compare_poppler(tmpdir, jpg_2000_rgba8_img, jpg_2000_rgba8_pdf) # compare_mupdf(tmpdir, jpg_2000_rgba8_img, jpg_2000_rgba8_pdf) compare_pdfimages_jp2(tmpdir, jpg_2000_rgba8_img, jpg_2000_rgba8_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) @pytest.mark.skipif( not HAVE_JP2, reason="requires imagemagick with support for jpeg2000" ) def test_jpg_2000_rgba16(tmp_path_factory, jpg_2000_rgba16_img, jpg_2000_rgba16_pdf): tmpdir = tmp_path_factory.mktemp("jpg_2000_rgba16") compare_ghostscript( tmpdir, jpg_2000_rgba16_img, jpg_2000_rgba16_pdf, gsdevice="tiff48nc" ) # poppler outputs 8-bit RGB so the comparison will not be exact # compare_poppler(tmpdir, jpg_2000_rgba16_img, jpg_2000_rgba16_pdf, exact=False) # compare_mupdf(tmpdir, jpg_2000_rgba16_img, jpg_2000_rgba16_pdf) compare_pdfimages_jp2(tmpdir, jpg_2000_rgba16_img, jpg_2000_rgba16_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_png_rgb8(tmp_path_factory, png_rgb8_img, png_rgb8_pdf): tmpdir = tmp_path_factory.mktemp("png_rgb8") compare_ghostscript(tmpdir, png_rgb8_img, png_rgb8_pdf) compare_poppler(tmpdir, png_rgb8_img, png_rgb8_pdf) compare_mupdf(tmpdir, png_rgb8_img, png_rgb8_pdf) compare_pdfimages_png(tmpdir, png_rgb8_img, png_rgb8_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_png_rgb16(tmp_path_factory, png_rgb16_img, png_rgb16_pdf): tmpdir = tmp_path_factory.mktemp("png_rgb16") compare_ghostscript(tmpdir, png_rgb16_img, png_rgb16_pdf, 
gsdevice="tiff48nc") # poppler outputs 8-bit RGB so the comparison will not be exact compare_poppler(tmpdir, png_rgb16_img, png_rgb16_pdf, exact=False) # pdfimages is unable to write 16 bit output @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_png_rgba8(tmp_path_factory, png_rgba8_img, png_rgba8_pdf): tmpdir = tmp_path_factory.mktemp("png_rgba8") compare_pdfimages_png(tmpdir, png_rgba8_img, png_rgba8_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) @pytest.mark.parametrize("engine", ["internal", "pikepdf"]) def test_png_rgba16(tmp_path_factory, png_rgba16_img, engine): out_pdf = tmp_path_factory.mktemp("png_rgba16") / "out.pdf" assert ( 0 != subprocess.run( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + engine, "--output=" + str(out_pdf), str(png_rgba16_img), ] ).returncode ) out_pdf.unlink() @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_png_gray8a(tmp_path_factory, png_gray8a_img, png_gray8a_pdf): tmpdir = tmp_path_factory.mktemp("png_gray8a") compare_pdfimages_png(tmpdir, png_gray8a_img, png_gray8a_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) @pytest.mark.parametrize("engine", ["internal", "pikepdf"]) def test_png_gray16a(tmp_path_factory, png_gray16a_img, engine): out_pdf = tmp_path_factory.mktemp("png_gray16a") / "out.pdf" assert ( 0 != subprocess.run( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + engine, "--output=" + str(out_pdf), str(png_gray16a_img), ] ).returncode ) out_pdf.unlink() @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_png_interlaced(tmp_path_factory, png_interlaced_img, png_interlaced_pdf): tmpdir = tmp_path_factory.mktemp("png_interlaced") compare_ghostscript(tmpdir, 
png_interlaced_img, png_interlaced_pdf) compare_poppler(tmpdir, png_interlaced_img, png_interlaced_pdf) compare_mupdf(tmpdir, png_interlaced_img, png_interlaced_pdf) compare_pdfimages_png(tmpdir, png_interlaced_img, png_interlaced_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_png_gray1(tmp_path_factory, png_gray1_img, png_gray1_pdf): tmpdir = tmp_path_factory.mktemp("png_gray1") compare_ghostscript(tmpdir, png_gray1_img, png_gray1_pdf, gsdevice="pnggray") compare_poppler(tmpdir, png_gray1_img, png_gray1_pdf) compare_mupdf(tmpdir, png_gray1_img, png_gray1_pdf) compare_pdfimages_png(tmpdir, png_gray1_img, png_gray1_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_png_gray2(tmp_path_factory, png_gray2_img, png_gray2_pdf): tmpdir = tmp_path_factory.mktemp("png_gray2") compare_ghostscript(tmpdir, png_gray2_img, png_gray2_pdf, gsdevice="pnggray") compare_poppler(tmpdir, png_gray2_img, png_gray2_pdf) compare_mupdf(tmpdir, png_gray2_img, png_gray2_pdf) compare_pdfimages_png(tmpdir, png_gray2_img, png_gray2_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_png_gray4(tmp_path_factory, png_gray4_img, png_gray4_pdf): tmpdir = tmp_path_factory.mktemp("png_gray4") compare_ghostscript(tmpdir, png_gray4_img, png_gray4_pdf, gsdevice="pnggray") compare_poppler(tmpdir, png_gray4_img, png_gray4_pdf) compare_mupdf(tmpdir, png_gray4_img, png_gray4_pdf) compare_pdfimages_png(tmpdir, png_gray4_img, png_gray4_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_png_gray8(tmp_path_factory, png_gray8_img, png_gray8_pdf): tmpdir = tmp_path_factory.mktemp("png_gray8") compare_ghostscript(tmpdir, png_gray8_img, png_gray8_pdf, gsdevice="pnggray") compare_poppler(tmpdir, png_gray8_img, png_gray8_pdf) 
compare_mupdf(tmpdir, png_gray8_img, png_gray8_pdf) compare_pdfimages_png(tmpdir, png_gray8_img, png_gray8_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_png_gray16(tmp_path_factory, png_gray16_img, png_gray16_pdf): tmpdir = tmp_path_factory.mktemp("png_gray16") # ghostscript outputs 8-bit grayscale, so the comparison will not be exact compare_ghostscript( tmpdir, png_gray16_img, png_gray16_pdf, gsdevice="pnggray", exact=False ) # poppler outputs 8-bit grayscale so the comparison will not be exact compare_poppler(tmpdir, png_gray16_img, png_gray16_pdf, exact=False) # pdfimages outputs 8-bit grayscale so the comparison will not be exact compare_pdfimages_png(tmpdir, png_gray16_img, png_gray16_pdf, exact=False) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_png_palette1(tmp_path_factory, png_palette1_img, png_palette1_pdf): tmpdir = tmp_path_factory.mktemp("png_palette1") compare_ghostscript(tmpdir, png_palette1_img, png_palette1_pdf) compare_poppler(tmpdir, png_palette1_img, png_palette1_pdf) compare_mupdf(tmpdir, png_palette1_img, png_palette1_pdf) # pdfimages cannot export palette based images @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_png_palette2(tmp_path_factory, png_palette2_img, png_palette2_pdf): tmpdir = tmp_path_factory.mktemp("png_palette2") compare_ghostscript(tmpdir, png_palette2_img, png_palette2_pdf) compare_poppler(tmpdir, png_palette2_img, png_palette2_pdf) compare_mupdf(tmpdir, png_palette2_img, png_palette2_pdf) # pdfimages cannot export palette based images @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_png_palette4(tmp_path_factory, png_palette4_img, png_palette4_pdf): tmpdir = tmp_path_factory.mktemp("png_palette4") compare_ghostscript(tmpdir, 
png_palette4_img, png_palette4_pdf) compare_poppler(tmpdir, png_palette4_img, png_palette4_pdf) compare_mupdf(tmpdir, png_palette4_img, png_palette4_pdf) # pdfimages cannot export palette based images @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_png_palette8(tmp_path_factory, png_palette8_img, png_palette8_pdf): tmpdir = tmp_path_factory.mktemp("png_palette8") compare_ghostscript(tmpdir, png_palette8_img, png_palette8_pdf) compare_poppler(tmpdir, png_palette8_img, png_palette8_pdf) compare_mupdf(tmpdir, png_palette8_img, png_palette8_pdf) # pdfimages cannot export palette based images @pytest.mark.skipif( sys.platform in ["darwin", "win32"], reason="test utilities not available on Windows and MacOS", ) def test_png_icc(tmp_path_factory, png_icc_img, png_icc_pdf): tmpdir = tmp_path_factory.mktemp("png_icc") compare_ghostscript(tmpdir, png_icc_img, png_icc_pdf, exact=False, icc=True) compare_poppler(tmpdir, png_icc_img, png_icc_pdf, exact=False, icc=True) # mupdf ignores the ICC profile in Debian (needs patched thirdparty liblcms2-art) compare_pdfimages_png(tmpdir, png_icc_img, png_icc_pdf, exact=False, icc=True) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_gif_transparent(tmp_path_factory, gif_transparent_img, gif_transparent_pdf): tmpdir = tmp_path_factory.mktemp("gif_transparent") compare_pdfimages_png(tmpdir, gif_transparent_img, gif_transparent_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_gif_palette1(tmp_path_factory, gif_palette1_img, gif_palette1_pdf): tmpdir = tmp_path_factory.mktemp("gif_palette1") compare_ghostscript(tmpdir, gif_palette1_img, gif_palette1_pdf) compare_poppler(tmpdir, gif_palette1_img, gif_palette1_pdf) compare_mupdf(tmpdir, gif_palette1_img, gif_palette1_pdf) # pdfimages cannot export palette based images 
@pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_gif_palette2(tmp_path_factory, gif_palette2_img, gif_palette2_pdf): tmpdir = tmp_path_factory.mktemp("gif_palette2") compare_ghostscript(tmpdir, gif_palette2_img, gif_palette2_pdf) compare_poppler(tmpdir, gif_palette2_img, gif_palette2_pdf) compare_mupdf(tmpdir, gif_palette2_img, gif_palette2_pdf) # pdfimages cannot export palette based images @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_gif_palette4(tmp_path_factory, gif_palette4_img, gif_palette4_pdf): tmpdir = tmp_path_factory.mktemp("gif_palette4") compare_ghostscript(tmpdir, gif_palette4_img, gif_palette4_pdf) compare_poppler(tmpdir, gif_palette4_img, gif_palette4_pdf) compare_mupdf(tmpdir, gif_palette4_img, gif_palette4_pdf) # pdfimages cannot export palette based images @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_gif_palette8(tmp_path_factory, gif_palette8_img, gif_palette8_pdf): tmpdir = tmp_path_factory.mktemp("gif_palette8") compare_ghostscript(tmpdir, gif_palette8_img, gif_palette8_pdf) compare_poppler(tmpdir, gif_palette8_img, gif_palette8_pdf) compare_mupdf(tmpdir, gif_palette8_img, gif_palette8_pdf) # pdfimages cannot export palette based images @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_gif_animation(tmp_path_factory, gif_animation_img, gif_animation_pdf): tmpdir = tmp_path_factory.mktemp("gif_animation") subprocess.check_call( ["pdfseparate", str(gif_animation_pdf), str(tmpdir / "page-%d.pdf")] ) for page in [1, 2]: gif_animation_pdf_nr = tmpdir / ("page-%d.pdf" % page) compare_ghostscript( tmpdir, str(gif_animation_img) + "[%d]" % (page - 1), gif_animation_pdf_nr ) compare_poppler( tmpdir, str(gif_animation_img) + "[%d]" % (page - 1), gif_animation_pdf_nr ) 
compare_mupdf( tmpdir, str(gif_animation_img) + "[%d]" % (page - 1), gif_animation_pdf_nr ) # pdfimages cannot export palette based images gif_animation_pdf_nr.unlink() @pytest.mark.skipif( sys.platform in ["darwin", "win32"], reason="test utilities not available on Windows and MacOS", ) @pytest.mark.skipif( platform.machine() == "s390x", reason="https://github.com/ImageMagick/ImageMagick/issues/8054", ) @pytest.mark.parametrize("engine", ["internal", "pikepdf"]) def test_tiff_float(tmp_path_factory, tiff_float_img, engine): out_pdf = tmp_path_factory.mktemp("tiff_float") / "out.pdf" assert ( 0 != subprocess.run( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + engine, "--output=" + str(out_pdf), str(tiff_float_img), ] ).returncode ) out_pdf.unlink() @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_tiff_cmyk8(tmp_path_factory, tiff_cmyk8_img, tiff_cmyk8_pdf): tmpdir = tmp_path_factory.mktemp("tiff_cmyk8") compare_ghostscript( tmpdir, tiff_cmyk8_img, tiff_cmyk8_pdf, gsdevice="tiff32nc", exact=HAVE_EXACT_CMYK8, ) # not testing with poppler as it cannot write CMYK images compare_mupdf( tmpdir, tiff_cmyk8_img, tiff_cmyk8_pdf, exact=HAVE_EXACT_CMYK8, cmyk=True ) compare_pdfimages_tiff(tmpdir, tiff_cmyk8_img, tiff_cmyk8_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) @pytest.mark.parametrize("engine", ["internal", "pikepdf"]) def test_tiff_cmyk16(tmp_path_factory, tiff_cmyk16_img, engine): out_pdf = tmp_path_factory.mktemp("tiff_cmyk16") / "out.pdf" # PIL is unable to read 16 bit CMYK images assert ( 0 != subprocess.run( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + engine, "--output=" + str(out_pdf), str(tiff_cmyk16_img), ] ).returncode ) out_pdf.unlink() @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_tiff_rgb8(tmp_path_factory, tiff_rgb8_img, 
tiff_rgb8_pdf): tmpdir = tmp_path_factory.mktemp("tiff_rgb8") compare_ghostscript(tmpdir, tiff_rgb8_img, tiff_rgb8_pdf, gsdevice="tiff24nc") compare_poppler(tmpdir, tiff_rgb8_img, tiff_rgb8_pdf) compare_mupdf(tmpdir, tiff_rgb8_img, tiff_rgb8_pdf) compare_pdfimages_tiff(tmpdir, tiff_rgb8_img, tiff_rgb8_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) @pytest.mark.parametrize("engine", ["internal", "pikepdf"]) def test_tiff_rgb12(tmp_path_factory, tiff_rgb12_img, engine): out_pdf = tmp_path_factory.mktemp("tiff_rgb12") / "out.pdf" # PIL is unable to preserve more than 8 bits per sample assert ( 0 != subprocess.run( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + engine, "--output=" + str(out_pdf), str(tiff_rgb12_img), ] ).returncode ) out_pdf.unlink() @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) @pytest.mark.parametrize("engine", ["internal", "pikepdf"]) def test_tiff_rgb14(tmp_path_factory, tiff_rgb14_img, engine): out_pdf = tmp_path_factory.mktemp("tiff_rgb14") / "out.pdf" # PIL is unable to preserve more than 8 bits per sample assert ( 0 != subprocess.run( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + engine, "--output=" + str(out_pdf), str(tiff_rgb14_img), ] ).returncode ) out_pdf.unlink() @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) @pytest.mark.parametrize("engine", ["internal", "pikepdf"]) def test_tiff_rgb16(tmp_path_factory, tiff_rgb16_img, engine): out_pdf = tmp_path_factory.mktemp("tiff_rgb16") / "out.pdf" # PIL is unable to preserve more than 8 bits per sample assert ( 0 != subprocess.run( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + engine, "--output=" + str(out_pdf), str(tiff_rgb16_img), ] ).returncode ) out_pdf.unlink() @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) 
@pytest.mark.parametrize("engine", ["internal", "pikepdf"]) def test_tiff_rgba8(tmp_path_factory, tiff_rgba8_img, engine): out_pdf = tmp_path_factory.mktemp("tiff_rgba8") / "out.pdf" assert ( 0 != subprocess.run( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + engine, "--output=" + str(out_pdf), str(tiff_rgba8_img), ] ).returncode ) out_pdf.unlink() @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) @pytest.mark.parametrize("engine", ["internal", "pikepdf"]) def test_tiff_rgba16(tmp_path_factory, tiff_rgba16_img, engine): out_pdf = tmp_path_factory.mktemp("tiff_rgba16") / "out.pdf" assert ( 0 != subprocess.run( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + engine, "--output=" + str(out_pdf), str(tiff_rgba16_img), ] ).returncode ) out_pdf.unlink() @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_tiff_gray1(tmp_path_factory, tiff_gray1_img, tiff_gray1_pdf): tmpdir = tmp_path_factory.mktemp("tiff_gray1") compare_ghostscript(tmpdir, tiff_gray1_img, tiff_gray1_pdf, gsdevice="pnggray") compare_poppler(tmpdir, tiff_gray1_img, tiff_gray1_pdf) compare_mupdf(tmpdir, tiff_gray1_img, tiff_gray1_pdf) compare_pdfimages_tiff(tmpdir, tiff_gray1_img, tiff_gray1_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_tiff_gray2(tmp_path_factory, tiff_gray2_img, tiff_gray2_pdf): tmpdir = tmp_path_factory.mktemp("tiff_gray2") compare_ghostscript(tmpdir, tiff_gray2_img, tiff_gray2_pdf, gsdevice="pnggray") compare_poppler(tmpdir, tiff_gray2_img, tiff_gray2_pdf) compare_mupdf(tmpdir, tiff_gray2_img, tiff_gray2_pdf) compare_pdfimages_tiff(tmpdir, tiff_gray2_img, tiff_gray2_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_tiff_gray4(tmp_path_factory, tiff_gray4_img, tiff_gray4_pdf): tmpdir = 
tmp_path_factory.mktemp("tiff_gray4") compare_ghostscript(tmpdir, tiff_gray4_img, tiff_gray4_pdf, gsdevice="pnggray") compare_poppler(tmpdir, tiff_gray4_img, tiff_gray4_pdf) compare_mupdf(tmpdir, tiff_gray4_img, tiff_gray4_pdf) compare_pdfimages_tiff(tmpdir, tiff_gray4_img, tiff_gray4_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_tiff_gray8(tmp_path_factory, tiff_gray8_img, tiff_gray8_pdf): tmpdir = tmp_path_factory.mktemp("tiff_gray8") compare_ghostscript(tmpdir, tiff_gray8_img, tiff_gray8_pdf, gsdevice="pnggray") compare_poppler(tmpdir, tiff_gray8_img, tiff_gray8_pdf) compare_mupdf(tmpdir, tiff_gray8_img, tiff_gray8_pdf) compare_pdfimages_tiff(tmpdir, tiff_gray8_img, tiff_gray8_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) @pytest.mark.parametrize("engine", ["internal", "pikepdf"]) def test_tiff_gray16(tmp_path_factory, tiff_gray16_img, engine): out_pdf = tmp_path_factory.mktemp("tiff_gray16") / "out.pdf" assert ( 0 != subprocess.run( [ img2pdfprog, "--producer=", "--nodate", "--engine=" + engine, "--output=" + str(out_pdf), str(tiff_gray16_img), ] ).returncode ) out_pdf.unlink() @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_tiff_multipage(tmp_path_factory, tiff_multipage_img, tiff_multipage_pdf): tmpdir = tmp_path_factory.mktemp("tiff_multipage") subprocess.check_call( ["pdfseparate", str(tiff_multipage_pdf), str(tmpdir / "page-%d.pdf")] ) for page in [1, 2]: tiff_multipage_pdf_nr = tmpdir / ("page-%d.pdf" % page) compare_ghostscript( tmpdir, str(tiff_multipage_img) + "[%d]" % (page - 1), tiff_multipage_pdf_nr ) compare_poppler( tmpdir, str(tiff_multipage_img) + "[%d]" % (page - 1), tiff_multipage_pdf_nr ) compare_mupdf( tmpdir, str(tiff_multipage_img) + "[%d]" % (page - 1), tiff_multipage_pdf_nr ) compare_pdfimages_tiff( tmpdir, 
str(tiff_multipage_img) + "[%d]" % (page - 1), tiff_multipage_pdf_nr ) tiff_multipage_pdf_nr.unlink() @pytest.mark.skipif( not HAVE_IMAGEMAGICK_MODERN, reason="requires imagemagick with support for keeping the palette depth", ) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_tiff_palette1(tmp_path_factory, tiff_palette1_img, tiff_palette1_pdf): tmpdir = tmp_path_factory.mktemp("tiff_palette1") compare_ghostscript(tmpdir, tiff_palette1_img, tiff_palette1_pdf) compare_poppler(tmpdir, tiff_palette1_img, tiff_palette1_pdf) compare_mupdf(tmpdir, tiff_palette1_img, tiff_palette1_pdf) # pdfimages cannot export palette based images @pytest.mark.skipif( not HAVE_IMAGEMAGICK_MODERN, reason="requires imagemagick with support for keeping the palette depth", ) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_tiff_palette2(tmp_path_factory, tiff_palette2_img, tiff_palette2_pdf): tmpdir = tmp_path_factory.mktemp("tiff_palette2") compare_ghostscript(tmpdir, tiff_palette2_img, tiff_palette2_pdf) compare_poppler(tmpdir, tiff_palette2_img, tiff_palette2_pdf) compare_mupdf(tmpdir, tiff_palette2_img, tiff_palette2_pdf) # pdfimages cannot export palette based images @pytest.mark.skipif( not HAVE_IMAGEMAGICK_MODERN, reason="requires imagemagick with support for keeping the palette depth", ) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_tiff_palette4(tmp_path_factory, tiff_palette4_img, tiff_palette4_pdf): tmpdir = tmp_path_factory.mktemp("tiff_palette4") compare_ghostscript(tmpdir, tiff_palette4_img, tiff_palette4_pdf) compare_poppler(tmpdir, tiff_palette4_img, tiff_palette4_pdf) compare_mupdf(tmpdir, tiff_palette4_img, tiff_palette4_pdf) # pdfimages cannot export palette based images @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on 
Windows and MacOS", ) def test_tiff_palette8(tmp_path_factory, tiff_palette8_img, tiff_palette8_pdf): tmpdir = tmp_path_factory.mktemp("tiff_palette8") compare_ghostscript(tmpdir, tiff_palette8_img, tiff_palette8_pdf) compare_poppler(tmpdir, tiff_palette8_img, tiff_palette8_pdf) compare_mupdf(tmpdir, tiff_palette8_img, tiff_palette8_pdf) # pdfimages cannot export palette based images @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_tiff_ccitt_lsb_m2l_white( tmp_path_factory, tiff_ccitt_lsb_m2l_white_img, tiff_ccitt_lsb_m2l_white_pdf ): tmpdir = tmp_path_factory.mktemp("tiff_ccitt_lsb_m2l_white") compare_ghostscript( tmpdir, tiff_ccitt_lsb_m2l_white_img, tiff_ccitt_lsb_m2l_white_pdf, gsdevice="pnggray", ) compare_poppler(tmpdir, tiff_ccitt_lsb_m2l_white_img, tiff_ccitt_lsb_m2l_white_pdf) compare_mupdf(tmpdir, tiff_ccitt_lsb_m2l_white_img, tiff_ccitt_lsb_m2l_white_pdf) compare_pdfimages_tiff( tmpdir, tiff_ccitt_lsb_m2l_white_img, tiff_ccitt_lsb_m2l_white_pdf ) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_tiff_ccitt_msb_m2l_white( tmp_path_factory, tiff_ccitt_msb_m2l_white_img, tiff_ccitt_msb_m2l_white_pdf ): tmpdir = tmp_path_factory.mktemp("tiff_ccitt_msb_m2l_white") compare_ghostscript( tmpdir, tiff_ccitt_msb_m2l_white_img, tiff_ccitt_msb_m2l_white_pdf, gsdevice="pnggray", ) compare_poppler(tmpdir, tiff_ccitt_msb_m2l_white_img, tiff_ccitt_msb_m2l_white_pdf) compare_mupdf(tmpdir, tiff_ccitt_msb_m2l_white_img, tiff_ccitt_msb_m2l_white_pdf) compare_pdfimages_tiff( tmpdir, tiff_ccitt_msb_m2l_white_img, tiff_ccitt_msb_m2l_white_pdf ) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_tiff_ccitt_msb_l2m_white( tmp_path_factory, tiff_ccitt_msb_l2m_white_img, tiff_ccitt_msb_l2m_white_pdf ): tmpdir = tmp_path_factory.mktemp("tiff_ccitt_msb_l2m_white") 
compare_ghostscript( tmpdir, tiff_ccitt_msb_l2m_white_img, tiff_ccitt_msb_l2m_white_pdf, gsdevice="pnggray", ) compare_poppler(tmpdir, tiff_ccitt_msb_l2m_white_img, tiff_ccitt_msb_l2m_white_pdf) compare_mupdf(tmpdir, tiff_ccitt_msb_l2m_white_img, tiff_ccitt_msb_l2m_white_pdf) compare_pdfimages_tiff( tmpdir, tiff_ccitt_msb_l2m_white_img, tiff_ccitt_msb_l2m_white_pdf ) @pytest.mark.skipif( not HAVE_IMAGEMAGICK_MODERN, reason="requires imagemagick with support for min-is-black", ) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_tiff_ccitt_lsb_m2l_black( tmp_path_factory, tiff_ccitt_lsb_m2l_black_img, tiff_ccitt_lsb_m2l_black_pdf ): tmpdir = tmp_path_factory.mktemp("tiff_ccitt_lsb_m2l_black") compare_ghostscript( tmpdir, tiff_ccitt_lsb_m2l_black_img, tiff_ccitt_lsb_m2l_black_pdf, gsdevice="pnggray", ) compare_poppler(tmpdir, tiff_ccitt_lsb_m2l_black_img, tiff_ccitt_lsb_m2l_black_pdf) compare_mupdf(tmpdir, tiff_ccitt_lsb_m2l_black_img, tiff_ccitt_lsb_m2l_black_pdf) compare_pdfimages_tiff( tmpdir, tiff_ccitt_lsb_m2l_black_img, tiff_ccitt_lsb_m2l_black_pdf ) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_tiff_ccitt_nometa1( tmp_path_factory, tiff_ccitt_nometa1_img, tiff_ccitt_nometa1_pdf ): tmpdir = tmp_path_factory.mktemp("tiff_ccitt_nometa1") compare_ghostscript( tmpdir, tiff_ccitt_nometa1_img, tiff_ccitt_nometa1_pdf, gsdevice="pnggray" ) compare_poppler(tmpdir, tiff_ccitt_nometa1_img, tiff_ccitt_nometa1_pdf) compare_mupdf(tmpdir, tiff_ccitt_nometa1_img, tiff_ccitt_nometa1_pdf) compare_pdfimages_tiff(tmpdir, tiff_ccitt_nometa1_img, tiff_ccitt_nometa1_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_tiff_ccitt_nometa2( tmp_path_factory, tiff_ccitt_nometa2_img, tiff_ccitt_nometa2_pdf ): tmpdir = tmp_path_factory.mktemp("tiff_ccitt_nometa2") 
compare_ghostscript( tmpdir, tiff_ccitt_nometa2_img, tiff_ccitt_nometa2_pdf, gsdevice="pnggray" ) compare_poppler(tmpdir, tiff_ccitt_nometa2_img, tiff_ccitt_nometa2_pdf) compare_mupdf(tmpdir, tiff_ccitt_nometa2_img, tiff_ccitt_nometa2_pdf) compare_pdfimages_tiff(tmpdir, tiff_ccitt_nometa2_img, tiff_ccitt_nometa2_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_miff_cmyk8(tmp_path_factory, miff_cmyk8_img, tiff_cmyk8_img, miff_cmyk8_pdf): tmpdir = tmp_path_factory.mktemp("miff_cmyk8") compare_ghostscript(tmpdir, tiff_cmyk8_img, miff_cmyk8_pdf, gsdevice="tiff32nc") # not testing with poppler as it cannot write CMYK images compare_mupdf(tmpdir, tiff_cmyk8_img, miff_cmyk8_pdf, cmyk=True) compare_pdfimages_tiff(tmpdir, tiff_cmyk8_img, miff_cmyk8_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) @pytest.mark.skipif( platform.machine() == "s390x", reason="https://github.com/ImageMagick/ImageMagick/issues/8055", ) def test_miff_cmyk16( tmp_path_factory, miff_cmyk16_img, tiff_cmyk16_img, miff_cmyk16_pdf ): tmpdir = tmp_path_factory.mktemp("miff_cmyk16") compare_ghostscript( tmpdir, tiff_cmyk16_img, miff_cmyk16_pdf, gsdevice="tiff32nc", exact=False ) # not testing with poppler as it cannot write CMYK images compare_mupdf(tmpdir, tiff_cmyk16_img, miff_cmyk16_pdf, exact=False, cmyk=True) # compare_pdfimages_tiff(tmpdir, tiff_cmyk16_img, miff_cmyk16_pdf) @pytest.mark.skipif( sys.platform in ["win32"], reason="test utilities not available on Windows and MacOS", ) def test_miff_rgb8(tmp_path_factory, miff_rgb8_img, tiff_rgb8_img, miff_rgb8_pdf): tmpdir = tmp_path_factory.mktemp("miff_rgb8") compare_ghostscript(tmpdir, tiff_rgb8_img, miff_rgb8_pdf, gsdevice="tiff24nc") compare_poppler(tmpdir, tiff_rgb8_img, miff_rgb8_pdf) compare_mupdf(tmpdir, tiff_rgb8_img, miff_rgb8_pdf) compare_pdfimages_tiff(tmpdir, tiff_rgb8_img, miff_rgb8_pdf) # we define 
# some variables so that the table below can be narrower
psl = (972, 504)  # --pagesize landscape
psp = (504, 972)  # --pagesize portrait
isl = (756, 324)  # --imgsize landscape
isp = (324, 756)  # --imgsize portrait
border = (162, 270)  # --border
poster = (97200, 50400)

# shortcuts for fit modes
f_into = img2pdf.FitMode.into
f_fill = img2pdf.FitMode.fill
f_exact = img2pdf.FitMode.exact
f_shrink = img2pdf.FitMode.shrink
f_enlarge = img2pdf.FitMode.enlarge


@pytest.mark.parametrize(
    "layout_test_cases",
    [
        # fmt: off
        # psl=972x504, psp=504x972, isl=756x324, isp=324x756, border=162:270
        #     --pagesize         --border          -a    pagepdf       imgpdf
        #           --imgsize            --fit
        (None, None, None, f_into, 0, (648, 216), (648, 216),  # 000
            (864, 432), (864, 432)),
        (None, None, None, f_into, 1, (648, 216), (648, 216),  # 001
            (864, 432), (864, 432)),
        (None, None, None, f_fill, 0, (648, 216), (648, 216),  # 002
            (864, 432), (864, 432)),
        (None, None, None, f_fill, 1, (648, 216), (648, 216),  # 003
            (864, 432), (864, 432)),
        (None, None, None, f_exact, 0, (648, 216), (648, 216),  # 004
            (864, 432), (864, 432)),
        (None, None, None, f_exact, 1, (648, 216), (648, 216),  # 005
            (864, 432), (864, 432)),
        (None, None, None, f_shrink, 0, (648, 216), (648, 216),  # 006
            (864, 432), (864, 432)),
        (None, None, None, f_shrink, 1, (648, 216), (648, 216),  # 007
            (864, 432), (864, 432)),
        (None, None, None, f_enlarge, 0, (648, 216), (648, 216),  # 008
            (864, 432), (864, 432)),
        (None, None, None, f_enlarge, 1, (648, 216), (648, 216),  # 009
            (864, 432), (864, 432)),
        (None, None, border, f_into, 0, (1188, 540), (648, 216),  # 010
            (1404, 756), (864, 432)),
        (None, None, border, f_into, 1, (1188, 540), (648, 216),  # 011
            (1404, 756), (864, 432)),
        (None, None, border, f_fill, 0, (1188, 540), (648, 216),  # 012
            (1404, 756), (864, 432)),
        (None, None, border, f_fill, 1, (1188, 540), (648, 216),  # 013
            (1404, 756), (864, 432)),
        (None, None, border, f_exact, 0, (1188, 540), (648, 216),  # 014
            (1404, 756), (864, 432)),
        (None, None, border, f_exact, 1, (1188, 540),
(648, 216), # 015 (1404, 756), (864, 432)), (None, None, border, f_shrink, 0, (1188, 540), (648, 216), # 016 (1404, 756), (864, 432)), (None, None, border, f_shrink, 1, (1188, 540), (648, 216), # 017 (1404, 756), (864, 432)), (None, None, border, f_enlarge, 0, (1188, 540), (648, 216), # 018 (1404, 756), (864, 432)), (None, None, border, f_enlarge, 1, (1188, 540), (648, 216), # 019 (1404, 756), (864, 432)), (None, isp, None, f_into, 0, (324, 108), (324, 108), # 020 (324, 162), (324, 162)), (None, isp, None, f_into, 1, (324, 108), (324, 108), # 021 (324, 162), (324, 162)), (None, isp, None, f_fill, 0, (2268, 756), (2268, 756), # 022 (1512, 756), (1512, 756)), (None, isp, None, f_fill, 1, (2268, 756), (2268, 756), # 023 (1512, 756), (1512, 756)), (None, isp, None, f_exact, 0, (324, 756), (324, 756), # 024 (324, 756), (324, 756)), (None, isp, None, f_exact, 1, (324, 756), (324, 756), # 025 (324, 756), (324, 756)), (None, isp, None, f_shrink, 0, (324, 108), (324, 108), # 026 (324, 162), (324, 162)), (None, isp, None, f_shrink, 1, (324, 108), (324, 108), # 027 (324, 162), (324, 162)), (None, isp, None, f_enlarge, 0, (648, 216), (648, 216), # 028 (864, 432), (864, 432)), (None, isp, None, f_enlarge, 1, (648, 216), (648, 216), # 029 (864, 432), (864, 432)), (None, isp, border, f_into, 0, (864, 432), (324, 108), # 030 (864, 486), (324, 162)), (None, isp, border, f_into, 1, (864, 432), (324, 108), # 031 (864, 486), (324, 162)), (None, isp, border, f_fill, 0, (2808, 1080), (2268, 756), # 032 (2052, 1080), (1512, 756)), (None, isp, border, f_fill, 1, (2808, 1080), (2268, 756), # 033 (2052, 1080), (1512, 756)), (None, isp, border, f_exact, 0, (864, 1080), (324, 756), # 034 (864, 1080), (324, 756)), (None, isp, border, f_exact, 1, (864, 1080), (324, 756), # 035 (864, 1080), (324, 756)), (None, isp, border, f_shrink, 0, (864, 432), (324, 108), # 036 (864, 486), (324, 162)), (None, isp, border, f_shrink, 1, (864, 432), (324, 108), # 037 (864, 486), (324, 162)), (None, isp, border, 
        f_enlarge, 0, (1188, 540), (648, 216),  # 038
            (1404, 756), (864, 432)),
        (None, isp, border, f_enlarge, 1, (1188, 540), (648, 216),  # 039
            (1404, 756), (864, 432)),
        (None, isl, None, f_into, 0, (756, 252), (756, 252),  # 040
            (648, 324), (648, 324)),
        (None, isl, None, f_into, 1, (756, 252), (756, 252),  # 041
            (648, 324), (648, 324)),
        (None, isl, None, f_fill, 0, (972, 324), (972, 324),  # 042
            (756, 378), (756, 378)),
        (None, isl, None, f_fill, 1, (972, 324), (972, 324),  # 043
            (756, 378), (756, 378)),
        (None, isl, None, f_exact, 0, (756, 324), (756, 324),  # 044
            (756, 324), (756, 324)),
        (None, isl, None, f_exact, 1, (756, 324), (756, 324),  # 045
            (756, 324), (756, 324)),
        (None, isl, None, f_shrink, 0, (648, 216), (648, 216),  # 046
            (648, 324), (648, 324)),
        (None, isl, None, f_shrink, 1, (648, 216), (648, 216),  # 047
            (648, 324), (648, 324)),
        (None, isl, None, f_enlarge, 0, (756, 252), (756, 252),  # 048
            (864, 432), (864, 432)),
        (None, isl, None, f_enlarge, 1, (756, 252), (756, 252),  # 049
            (864, 432), (864, 432)),
        # psl=972x504, psp=504x972, isl=756x324, isp=324x756, border=162:270
        #     --pagesize         --border          -a    pagepdf       imgpdf
        #           --imgsize            --fit                         imgpx
        (None, isl, border, f_into, 0, (1296, 576), (756, 252),  # 050
            (1188, 648), (648, 324)),
        (None, isl, border, f_into, 1, (1296, 576), (756, 252),  # 051
            (1188, 648), (648, 324)),
        (None, isl, border, f_fill, 0, (1512, 648), (972, 324),  # 052
            (1296, 702), (756, 378)),
        (None, isl, border, f_fill, 1, (1512, 648), (972, 324),  # 053
            (1296, 702), (756, 378)),
        (None, isl, border, f_exact, 0, (1296, 648), (756, 324),  # 054
            (1296, 648), (756, 324)),
        (None, isl, border, f_exact, 1, (1296, 648), (756, 324),  # 055
            (1296, 648), (756, 324)),
        (None, isl, border, f_shrink, 0, (1188, 540), (648, 216),  # 056
            (1188, 648), (648, 324)),
        (None, isl, border, f_shrink, 1, (1188, 540), (648, 216),  # 057
            (1188, 648), (648, 324)),
        (None, isl, border, f_enlarge, 0, (1296, 576), (756, 252),  # 058
            (1404, 756), (864, 432)),
        (None, isl, border, f_enlarge, 1, (1296, 576),
(756, 252), # 059 (1404, 756), (864, 432)), (psp, None, None, f_into, 0, (504, 972), (504, 168), # 060 (504, 972), (504, 252)), (psp, None, None, f_into, 1, (972, 504), (972, 324), # 061 (972, 504), (972, 486)), (psp, None, None, f_fill, 0, (504, 972), (2916, 972), # 062 (504, 972), (1944, 972)), (psp, None, None, f_fill, 1, (972, 504), (1512, 504), # 063 (972, 504), (1008, 504)), (psp, None, None, f_exact, 0, (504, 972), (504, 972), # 064 (504, 972), (504, 972)), (psp, None, None, f_exact, 1, (972, 504), (972, 504), # 065 (972, 504), (972, 504)), (psp, None, None, f_shrink, 0, (504, 972), (504, 168), # 066 (504, 972), (504, 252)), (psp, None, None, f_shrink, 1, (972, 504), (648, 216), # 067 (972, 504), (864, 432)), (psp, None, None, f_enlarge, 0, (504, 972), (648, 216), # 068 (504, 972), (864, 432)), (psp, None, None, f_enlarge, 1, (972, 504), (972, 324), # 069 (972, 504), (972, 486)), (psp, None, border, f_into, 0, None, None, None, None), # 070 (psp, None, border, f_into, 1, None, None, None, None), # 071 (psp, None, border, f_fill, 0, (504, 972), (1944, 648), # 072 (504, 972), (1296, 648)), (psp, None, border, f_fill, 1, (972, 504), (648, 216), # 073 (972, 504), (648, 324)), (psp, None, border, f_exact, 0, None, None, None, None), # 074 (psp, None, border, f_exact, 1, None, None, None, None), # 075 (psp, None, border, f_shrink, 0, None, None, None, None), # 076 (psp, None, border, f_shrink, 1, None, None, None, None), # 077 (psp, None, border, f_enlarge, 0, (504, 972), (648, 216), # 078 (504, 972), (864, 432)), (psp, None, border, f_enlarge, 1, (972, 504), (648, 216), # 079 (972, 504), (864, 432)), (psp, isp, None, f_into, 0, (504, 972), (324, 108), # 080 (504, 972), (324, 162)), (psp, isp, None, f_into, 1, (972, 504), (324, 108), # 081 (972, 504), (324, 162)), (psp, isp, None, f_fill, 0, (504, 972), (2268, 756), # 082 (504, 972), (1512, 756)), (psp, isp, None, f_fill, 1, (972, 504), (2268, 756), # 083 (972, 504), (1512, 756)), (psp, isp, None, f_exact, 0, 
        (504, 972), (324, 756),  # 084
            (504, 972), (324, 756)),
        (psp, isp, None, f_exact, 1, (972, 504), (324, 756),  # 085
            (972, 504), (324, 756)),
        (psp, isp, None, f_shrink, 0, (504, 972), (324, 108),  # 086
            (504, 972), (324, 162)),
        (psp, isp, None, f_shrink, 1, (972, 504), (324, 108),  # 087
            (972, 504), (324, 162)),
        (psp, isp, None, f_enlarge, 0, (504, 972), (648, 216),  # 088
            (504, 972), (864, 432)),
        (psp, isp, None, f_enlarge, 1, (972, 504), (648, 216),  # 089
            (972, 504), (864, 432)),
        (psp, isp, border, f_into, 0, (504, 972), (324, 108),  # 090
            (504, 972), (324, 162)),
        (psp, isp, border, f_into, 1, (972, 504), (324, 108),  # 091
            (972, 504), (324, 162)),
        (psp, isp, border, f_fill, 0, (504, 972), (2268, 756),  # 092
            (504, 972), (1512, 756)),
        (psp, isp, border, f_fill, 1, (972, 504), (2268, 756),  # 093
            (972, 504), (1512, 756)),
        (psp, isp, border, f_exact, 0, (504, 972), (324, 756),  # 094
            (504, 972), (324, 756)),
        (psp, isp, border, f_exact, 1, (972, 504), (324, 756),  # 095
            (972, 504), (324, 756)),
        (psp, isp, border, f_shrink, 0, (504, 972), (324, 108),  # 096
            (504, 972), (324, 162)),
        (psp, isp, border, f_shrink, 1, (972, 504), (324, 108),  # 097
            (972, 504), (324, 162)),
        (psp, isp, border, f_enlarge, 0, (504, 972), (648, 216),  # 098
            (504, 972), (864, 432)),
        (psp, isp, border, f_enlarge, 1, (972, 504), (648, 216),  # 099
            (972, 504), (864, 432)),
        # psl=972x504, psp=504x972, isl=756x324, isp=324x756, border=162:270
        #     --pagesize         --border          -a    pagepdf       imgpdf
        #           --imgsize            --fit                         imgpx
        (psp, isl, None, f_into, 0, (504, 972), (756, 252),  # 100
            (504, 972), (648, 324)),
        (psp, isl, None, f_into, 1, (972, 504), (756, 252),  # 101
            (972, 504), (648, 324)),
        (psp, isl, None, f_fill, 0, (504, 972), (972, 324),  # 102
            (504, 972), (756, 378)),
        (psp, isl, None, f_fill, 1, (972, 504), (972, 324),  # 103
            (972, 504), (756, 378)),
        (psp, isl, None, f_exact, 0, (504, 972), (756, 324),  # 104
            (504, 972), (756, 324)),
        (psp, isl, None, f_exact, 1, (972, 504), (756, 324),  # 105
            (972, 504), (756, 324)),
        (psp, isl, None,
f_shrink, 0, (504, 972), (648, 216), # 106 (504, 972), (648, 324)), (psp, isl, None, f_shrink, 1, (972, 504), (648, 216), # 107 (972, 504), (648, 324)), (psp, isl, None, f_enlarge, 0, (504, 972), (756, 252), # 108 (504, 972), (864, 432)), (psp, isl, None, f_enlarge, 1, (972, 504), (756, 252), # 109 (972, 504), (864, 432)), (psp, isl, border, f_into, 0, (504, 972), (756, 252), # 110 (504, 972), (648, 324)), (psp, isl, border, f_into, 1, (972, 504), (756, 252), # 111 (972, 504), (648, 324)), (psp, isl, border, f_fill, 0, (504, 972), (972, 324), # 112 (504, 972), (756, 378)), (psp, isl, border, f_fill, 1, (972, 504), (972, 324), # 113 (972, 504), (756, 378)), (psp, isl, border, f_exact, 0, (504, 972), (756, 324), # 114 (504, 972), (756, 324)), (psp, isl, border, f_exact, 1, (972, 504), (756, 324), # 115 (972, 504), (756, 324)), (psp, isl, border, f_shrink, 0, (504, 972), (648, 216), # 116 (504, 972), (648, 324)), (psp, isl, border, f_shrink, 1, (972, 504), (648, 216), # 117 (972, 504), (648, 324)), (psp, isl, border, f_enlarge, 0, (504, 972), (756, 252), # 118 (504, 972), (864, 432)), (psp, isl, border, f_enlarge, 1, (972, 504), (756, 252), # 119 (972, 504), (864, 432)), (psl, None, None, f_into, 0, (972, 504), (972, 324), # 120 (972, 504), (972, 486)), (psl, None, None, f_into, 1, (972, 504), (972, 324), # 121 (972, 504), (972, 486)), (psl, None, None, f_fill, 0, (972, 504), (1512, 504), # 122 (972, 504), (1008, 504)), (psl, None, None, f_fill, 1, (972, 504), (1512, 504), # 123 (972, 504), (1008, 504)), (psl, None, None, f_exact, 0, (972, 504), (972, 504), # 124 (972, 504), (972, 504)), (psl, None, None, f_exact, 1, (972, 504), (972, 504), # 125 (972, 504), (972, 504)), (psl, None, None, f_shrink, 0, (972, 504), (648, 216), # 126 (972, 504), (864, 432)), (psl, None, None, f_shrink, 1, (972, 504), (648, 216), # 127 (972, 504), (864, 432)), (psl, None, None, f_enlarge, 0, (972, 504), (972, 324), # 128 (972, 504), (972, 486)), (psl, None, None, f_enlarge, 1, (972, 504), 
(972, 324), # 129 (972, 504), (972, 486)), (psl, None, border, f_into, 0, (972, 504), (432, 144), # 130 (972, 504), (360, 180)), (psl, None, border, f_into, 1, (972, 504), (432, 144), # 131 (972, 504), (360, 180)), (psl, None, border, f_fill, 0, (972, 504), (540, 180), # 132 (972, 504), (432, 216)), (psl, None, border, f_fill, 1, (972, 504), (540, 180), # 133 (972, 504), (432, 216)), (psl, None, border, f_exact, 0, (972, 504), (432, 180), # 134 (972, 504), (432, 180)), (psl, None, border, f_exact, 1, (972, 504), (432, 180), # 135 (972, 504), (432, 180)), (psl, None, border, f_shrink, 0, (972, 504), (432, 144), # 136 (972, 504), (360, 180)), (psl, None, border, f_shrink, 1, (972, 504), (432, 144), # 137 (972, 504), (360, 180)), (psl, None, border, f_enlarge, 0, (972, 504), (648, 216), # 138 (972, 504), (864, 432)), (psl, None, border, f_enlarge, 1, (972, 504), (648, 216), # 139 (972, 504), (864, 432)), (psl, isp, None, f_into, 0, (972, 504), (324, 108), # 140 (972, 504), (324, 162)), (psl, isp, None, f_into, 1, (972, 504), (324, 108), # 141 (972, 504), (324, 162)), (psl, isp, None, f_fill, 0, (972, 504), (2268, 756), # 142 (972, 504), (1512, 756)), (psl, isp, None, f_fill, 1, (972, 504), (2268, 756), # 143 (972, 504), (1512, 756)), (psl, isp, None, f_exact, 0, (972, 504), (324, 756), # 144 (972, 504), (324, 756)), (psl, isp, None, f_exact, 1, (972, 504), (324, 756), # 145 (972, 504), (324, 756)), (psl, isp, None, f_shrink, 0, (972, 504), (324, 108), # 146 (972, 504), (324, 162)), (psl, isp, None, f_shrink, 1, (972, 504), (324, 108), # 147 (972, 504), (324, 162)), (psl, isp, None, f_enlarge, 0, (972, 504), (648, 216), # 148 (972, 504), (864, 432)), (psl, isp, None, f_enlarge, 1, (972, 504), (648, 216), # 149 (972, 504), (864, 432)), # psl=972x504, psp=504x972, isl=756x324, isp=324x756, border=162:270 # --pagesize --border -a pagepdf imgpdf # --imgsize --fit imgpx (psl, isp, border, f_into, 0, (972, 504), (324, 108), # 150 (972, 504), (324, 162)), (psl, isp, border, 
f_into, 1, (972, 504), (324, 108), # 151 (972, 504), (324, 162)), (psl, isp, border, f_fill, 0, (972, 504), (2268, 756), # 152 (972, 504), (1512, 756)), (psl, isp, border, f_fill, 1, (972, 504), (2268, 756), # 153 (972, 504), (1512, 756)), (psl, isp, border, f_exact, 0, (972, 504), (324, 756), # 154 (972, 504), (324, 756)), (psl, isp, border, f_exact, 1, (972, 504), (324, 756), # 155 (972, 504), (324, 756)), (psl, isp, border, f_shrink, 0, (972, 504), (324, 108), # 156 (972, 504), (324, 162)), (psl, isp, border, f_shrink, 1, (972, 504), (324, 108), # 157 (972, 504), (324, 162)), (psl, isp, border, f_enlarge, 0, (972, 504), (648, 216), # 158 (972, 504), (864, 432)), (psl, isp, border, f_enlarge, 1, (972, 504), (648, 216), # 159 (972, 504), (864, 432)), (psl, isl, None, f_into, 0, (972, 504), (756, 252), # 160 (972, 504), (648, 324)), (psl, isl, None, f_into, 1, (972, 504), (756, 252), # 161 (972, 504), (648, 324)), (psl, isl, None, f_fill, 0, (972, 504), (972, 324), # 162 (972, 504), (756, 378)), (psl, isl, None, f_fill, 1, (972, 504), (972, 324), # 163 (972, 504), (756, 378)), (psl, isl, None, f_exact, 0, (972, 504), (756, 324), # 164 (972, 504), (756, 324)), (psl, isl, None, f_exact, 1, (972, 504), (756, 324), # 165 (972, 504), (756, 324)), (psl, isl, None, f_shrink, 0, (972, 504), (648, 216), # 166 (972, 504), (648, 324)), (psl, isl, None, f_shrink, 1, (972, 504), (648, 216), # 167 (972, 504), (648, 324)), (psl, isl, None, f_enlarge, 0, (972, 504), (756, 252), # 168 (972, 504), (864, 432)), (psl, isl, None, f_enlarge, 1, (972, 504), (756, 252), # 169 (972, 504), (864, 432)), (psl, isl, border, f_into, 0, (972, 504), (756, 252), # 170 (972, 504), (648, 324)), (psl, isl, border, f_into, 1, (972, 504), (756, 252), # 171 (972, 504), (648, 324)), (psl, isl, border, f_fill, 0, (972, 504), (972, 324), # 172 (972, 504), (756, 378)), (psl, isl, border, f_fill, 1, (972, 504), (972, 324), # 173 (972, 504), (756, 378)), (psl, isl, border, f_exact, 0, (972, 504), (756, 324), 
# 174 (972, 504), (756, 324)), (psl, isl, border, f_exact, 1, (972, 504), (756, 324), # 175 (972, 504), (756, 324)), (psl, isl, border, f_shrink, 0, (972, 504), (648, 216), # 176 (972, 504), (648, 324)), (psl, isl, border, f_shrink, 1, (972, 504), (648, 216), # 177 (972, 504), (648, 324)), (psl, isl, border, f_enlarge, 0, (972, 504), (756, 252), # 178 (972, 504), (864, 432)), (psl, isl, border, f_enlarge, 1, (972, 504), (756, 252), # 179 (972, 504), (864, 432)), (poster, None, None, f_fill, 0, (97200, 50400), (151200, 50400), (97200, 50400), (100800, 50400)), ] # fmt: on ) def test_layout(layout_test_cases): # there is no need to have test cases with the same images with inverted # orientation (landscape/portrait) because --pagesize and --imgsize are # already inverted im1 = (864, 288) # imgpx #1 => 648x216 im2 = (1152, 576) # imgpx #2 => 864x432 psopt, isopt, border, fit, ao, pspdf1, ispdf1, pspdf2, ispdf2 = layout_test_cases if isopt is not None: isopt = ((img2pdf.ImgSize.abs, isopt[0]), (img2pdf.ImgSize.abs, isopt[1])) layout_fun = img2pdf.get_layout_fun(psopt, isopt, border, fit, ao) try: pwpdf, phpdf, iwpdf, ihpdf = layout_fun( im1[0], im1[1], (img2pdf.default_dpi, img2pdf.default_dpi) ) assert (pwpdf, phpdf) == pspdf1 assert (iwpdf, ihpdf) == ispdf1 except img2pdf.NegativeDimensionError: assert pspdf1 is None assert ispdf1 is None try: pwpdf, phpdf, iwpdf, ihpdf = layout_fun( im2[0], im2[1], (img2pdf.default_dpi, img2pdf.default_dpi) ) assert (pwpdf, phpdf) == pspdf2 assert (iwpdf, ihpdf) == ispdf2 except img2pdf.NegativeDimensionError: assert pspdf2 is None assert ispdf2 is None @pytest.fixture( scope="session", params=os.listdir(os.path.join(os.path.dirname(__file__), "tests", "input")), ) def general_input(request): assert os.path.isfile( os.path.join(os.path.dirname(__file__), "tests", "input", request.param) ) return request.param @pytest.mark.skipif(not HAVE_FAKETIME, reason="requires faketime") @pytest.mark.parametrize( "engine,testdata,timezone,pdfa", 
itertools.product( ["internal", "pikepdf"], ["2021-02-05 17:49:00"], ["Europe/Berlin", "GMT+12"], [True, False], ), ) def test_faketime(tmp_path_factory, jpg_img, engine, testdata, timezone, pdfa): expected = tz2utcstrftime(testdata, "D:%Y%m%d%H%M%SZ", timezone) out_pdf = tmp_path_factory.mktemp("faketime") / "out.pdf" subprocess.check_call( ["env", f"TZ={timezone}", "faketime", "-f", testdata, img2pdfprog] + (["--pdfa"] if pdfa else []) + [ "--producer=", "--engine=" + engine, "--output=" + str(out_pdf), str(jpg_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert p.docinfo.CreationDate == expected assert p.docinfo.ModDate == expected if pdfa: assert p.Root.Metadata.Subtype == "/XML" assert p.Root.Metadata.Type == "/Metadata" expected = tz2utcstrftime(testdata, "%Y-%m-%dT%H:%M:%SZ", timezone) root = ET.fromstring(p.Root.Metadata.read_bytes()) for k in ["ModifyDate", "CreateDate"]: assert ( root.find( f".//xmp:{k}", {"xmp": "http://ns.adobe.com/xap/1.0/"} ).text == expected ) out_pdf.unlink() @pytest.mark.parametrize( "engine,testdata,timezone,pdfa", itertools.product( ["internal", "pikepdf"], [ "2021-02-05 17:49:00", "2021-02-05T17:49:00", "Fri, 05 Feb 2021 17:49:00 +0100", "last year 12:00", ], ["Europe/Berlin", "GMT+12"], [True, False], ), ) def test_date(tmp_path_factory, jpg_img, engine, testdata, timezone, pdfa): # we use the date utility to convert the timestamp from the local # timezone into UTC with the format used by PDF expected = tz2utcstrftime(testdata, "D:%Y%m%d%H%M%SZ", timezone) out_pdf = tmp_path_factory.mktemp("faketime") / "out.pdf" subprocess.check_call( ["env", f"TZ={timezone}", img2pdfprog] + (["--pdfa"] if pdfa else []) + [ f"--moddate={testdata}", f"--creationdate={testdata}", "--producer=", "--engine=" + engine, "--output=" + str(out_pdf), str(jpg_img), ] ) with pikepdf.open(str(out_pdf)) as p: assert p.docinfo.CreationDate == expected assert p.docinfo.ModDate == expected if pdfa: assert p.Root.Metadata.Subtype == "/XML" assert 
p.Root.Metadata.Type == "/Metadata" expected = tz2utcstrftime(testdata, "%Y-%m-%dT%H:%M:%SZ", timezone) root = ET.fromstring(p.Root.Metadata.read_bytes()) for k in ["ModifyDate", "CreateDate"]: assert ( root.find( f".//xmp:{k}", {"xmp": "http://ns.adobe.com/xap/1.0/"} ).text == expected ) out_pdf.unlink() @pytest.mark.parametrize("engine", ["internal", "pikepdf"]) def test_general(general_input, engine): inputf = os.path.join(os.path.dirname(__file__), "tests", "input", general_input) outputf = os.path.join( os.path.dirname(__file__), "tests", "output", general_input + ".pdf" ) assert os.path.isfile(outputf) f = inputf out = outputf engine = getattr(img2pdf.Engine, engine) with open(f, "rb") as inf: orig_imgdata = inf.read() output = img2pdf.convert(orig_imgdata, nodate=True, engine=engine) x = pikepdf.open(BytesIO(output)) assert x.Root.Pages.Count in (1, 2) if len(x.Root.Pages.Kids) == "1": assert x.Size == "7" assert len(x.Root.Pages.Kids) == 1 elif len(x.Root.Pages.Kids) == "2": assert x.Size == "10" assert len(x.Root.Pages.Kids) == 2 assert sorted(x.Root.keys()) == ["/Pages", "/Type"] assert x.Root.Type == "/Catalog" assert sorted(x.Root.Pages.keys()) == ["/Count", "/Kids", "/Type"] assert x.Root.Pages.Type == "/Pages" if f.endswith(".jb2"): # PIL doesn't support .jb2, so we load the original .png, which # was converted to the .jb2 using `jbig2enc`. orig_img = Image.open(f.replace(".jb2", ".png")) else: orig_img = Image.open(f) for pagenum in range(len(x.Root.Pages.Kids)): # retrieve the original image frame that this page was # generated from orig_img.seek(pagenum) cur_page = x.Root.Pages.Kids[pagenum] ndpi = orig_img.info.get("dpi", (96.0, 96.0)) if ndpi[0] <= 0.001 or ndpi[1] <= 0.001: ndpi = (96.0, 96.0) # In python3, the returned dpi value for some tiff images will # not be an integer but a float. To make the behaviour of # img2pdf the same between python2 and python3, we convert that # float into an integer by rounding. 
# Search online for the 72.009 dpi problem for more info. ndpi = (int(round(ndpi[0])), int(round(ndpi[1]))) imgwidthpx, imgheightpx = orig_img.size pagewidth = 72.0 * imgwidthpx / ndpi[0] pageheight = 72.0 * imgheightpx / ndpi[1] def format_float(f): if int(f) == f: return int(f) else: return decimal.Decimal("%.4f" % f) assert sorted(cur_page.keys()) == [ "/Contents", "/MediaBox", "/Parent", "/Resources", "/Type", ] assert cur_page.MediaBox == pikepdf.Array( [0, 0, format_float(pagewidth), format_float(pageheight)] ) assert cur_page.Parent == x.Root.Pages assert cur_page.Type == "/Page" assert cur_page.Resources.keys() == {"/XObject"} assert cur_page.Resources.XObject.keys() == {"/Im0"} if engine != img2pdf.Engine.pikepdf: assert cur_page.Contents.Length == len(cur_page.Contents.read_bytes()) assert ( cur_page.Contents.read_bytes() == b"q\n%.4f 0 0 %.4f 0.0000 0.0000 cm\n/Im0 Do\nQ" % ( pagewidth, pageheight, ) ) imgprops = cur_page.Resources.XObject.Im0 # test if the filter is valid: assert imgprops.Filter in [ "/DCTDecode", "/JPXDecode", "/FlateDecode", pikepdf.Array([pikepdf.Name.CCITTFaxDecode]), "/JBIG2Decode", ] # test if the image has correct size assert imgprops.Width == orig_img.size[0] assert imgprops.Height == orig_img.size[1] # if the input file is a jpeg then it should've been copied # verbatim into the PDF if imgprops.Filter in ["/DCTDecode", "/JPXDecode"]: assert cur_page.Resources.XObject.Im0.read_raw_bytes() == orig_imgdata elif imgprops.Filter == "/JBIG2Decode": assert ( cur_page.Resources.XObject.Im0.read_raw_bytes() == orig_imgdata[13:-22] ) # Strip file header and footer. 
elif imgprops.Filter == pikepdf.Array([pikepdf.Name.CCITTFaxDecode]): tiff_header = tiff_header_for_ccitt( int(imgprops.Width), int(imgprops.Height), int(imgprops.Length), 4 ) imgio = BytesIO() imgio.write(tiff_header) imgio.write(cur_page.Resources.XObject.Im0.read_raw_bytes()) imgio.seek(0) im = Image.open(imgio) assert im.tobytes() == orig_img.tobytes() try: im.close() except AttributeError: pass elif imgprops.Filter == "/FlateDecode": # otherwise, the data is flate encoded and has to be equal # to the pixel data of the input image imgdata = zlib.decompress(cur_page.Resources.XObject.Im0.read_raw_bytes()) if hasattr(imgprops, "DecodeParms"): if orig_img.format == "PNG": pngidat, palette = img2pdf.parse_png(orig_imgdata) elif ( orig_img.format == "TIFF" and orig_img.info["compression"] == "group4" ): offset, length = img2pdf.ccitt_payload_location_from_pil(orig_img) pngidat = orig_imgdata[offset : offset + length] else: pngbuffer = BytesIO() orig_img.save(pngbuffer, format="png") pngidat, palette = img2pdf.parse_png(pngbuffer.getvalue()) assert zlib.decompress(pngidat) == imgdata else: colorspace = imgprops.ColorSpace if colorspace == "/DeviceGray": colorspace = "L" elif colorspace == "/DeviceRGB": colorspace = "RGB" elif colorspace == "/DeviceCMYK": colorspace = "CMYK" else: raise Exception("invalid colorspace") im = Image.frombytes( colorspace, (int(imgprops.Width), int(imgprops.Height)), imgdata ) if orig_img.mode == "1": assert im.tobytes() == orig_img.convert("L").tobytes() elif orig_img.mode not in ("RGB", "L", "CMYK", "CMYK;I"): assert im.tobytes() == orig_img.convert("RGB").tobytes() # the python-pil version 2.3.0-1ubuntu3 in Ubuntu does # not have the close() method try: im.close() except AttributeError: pass else: raise Exception("unknown filter") def rec(obj): if isinstance(obj, pikepdf.Dictionary): return {k: rec(v) for k, v in obj.items() if k != "/Parent"} elif isinstance(obj, pikepdf.Array): return [rec(v) for v in obj] elif isinstance(obj, 
pikepdf.Stream): ret = rec(obj.stream_dict) stream = obj.read_raw_bytes() assert len(stream) == ret["/Length"] del ret["/Length"] if ret.get("/Filter") == "/FlateDecode": stream = obj.read_bytes() del ret["/Filter"] ret["stream"] = stream return ret elif isinstance(obj, pikepdf.Name) or isinstance(obj, pikepdf.String): return str(obj) elif isinstance(obj, decimal.Decimal) or isinstance(obj, str): return obj elif isinstance(obj, int): return decimal.Decimal(obj) raise Exception("unhandled: %s" % (type(obj))) y = pikepdf.open(out) pydictx = rec(x.Root) pydicty = rec(y.Root) assert pydictx == pydicty # the python-pil version 2.3.0-1ubuntu3 in Ubuntu does not have the # close() method try: orig_img.close() except AttributeError: pass def test_return_engine_doc(tmp_path_factory): inputf = os.path.join(os.path.dirname(__file__), "tests", "input", "normal.jpg") outputf = tmp_path_factory.mktemp("return_engine_doc") / "normal.jpg.pdf" pdf_wrapper = img2pdf.convert_to_docobject(inputf, engine=img2pdf.Engine.pikepdf) pdf = pdf_wrapper.writer assert isinstance(pdf, pikepdf.Pdf) pdf.save(outputf, min_version=pdf_wrapper.output_version, linearize=True) assert os.path.isfile(outputf) def main(): normal16 = alpha_value()[:, :, 0:3] pathlib.Path("test.icc").write_bytes(icc_profile()) write_png( normal16 / 0xFFFF * 0xFF, "icc.png", 8, 2, iccp="test.icc", ) write_png( normal16 / 0xFFFF * 0xFF, "normal.png", 8, 2, ) if __name__ == "__main__": main() ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1723007641.0 img2pdf-0.6.1/src/jp2.py0000644000175000017500000001225514654601231013747 0ustar00joschjosch#!/usr/bin/env python # # Copyright (C) 2013 Johannes Schauer Marin Rodrigues # # this module is heavily based upon jpylyzer which is # KB / National Library of the Netherlands, Open Planets Foundation # and released under the same license conditions # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU 
Lesser General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU Lesser General Public License for more details. # # You should have received a copy of the GNU Lesser General Public License # along with this program. If not, see . import struct def getBox(data, byteStart, noBytes): boxLengthValue = struct.unpack(">I", data[byteStart : byteStart + 4])[0] boxType = data[byteStart + 4 : byteStart + 8] contentsStartOffset = 8 if boxLengthValue == 1: boxLengthValue = struct.unpack(">Q", data[byteStart + 8 : byteStart + 16])[0] contentsStartOffset = 16 if boxLengthValue == 0: boxLengthValue = noBytes - byteStart byteEnd = byteStart + boxLengthValue boxContents = data[byteStart + contentsStartOffset : byteEnd] return (boxLengthValue, boxType, byteEnd, boxContents) def parse_ihdr(data): height, width, channels, bpp = struct.unpack(">IIHB", data[:11]) return width, height, channels, bpp + 1 def parse_colr(data): meth = struct.unpack(">B", data[0:1])[0] if meth != 1: raise Exception("only enumerated color method supported") enumCS = struct.unpack(">I", data[3:])[0] if enumCS == 16: return "RGB" elif enumCS == 17: return "L" else: raise Exception( "only sRGB and greyscale color space is supported, " "got %d" % enumCS ) def parse_resc(data): hnum, hden, vnum, vden, hexp, vexp = struct.unpack(">HHHHBB", data) hdpi = ((hnum / hden) * (10**hexp) * 100) / 2.54 vdpi = ((vnum / vden) * (10**vexp) * 100) / 2.54 return hdpi, vdpi def parse_res(data): hdpi, vdpi = None, None noBytes = len(data) byteStart = 0 boxLengthValue = 1 # dummy value for while loop condition while byteStart < noBytes and boxLengthValue != 0: boxLengthValue, boxType, byteEnd, boxContents = getBox(data, byteStart, noBytes) if 
boxType == b"resc": hdpi, vdpi = parse_resc(boxContents) break byteStart = byteEnd # advance to the next box return hdpi, vdpi def parse_jp2h(data): width, height, colorspace, hdpi, vdpi = None, None, None, None, None noBytes = len(data) byteStart = 0 boxLengthValue = 1 # dummy value for while loop condition while byteStart < noBytes and boxLengthValue != 0: boxLengthValue, boxType, byteEnd, boxContents = getBox(data, byteStart, noBytes) if boxType == b"ihdr": width, height, channels, bpp = parse_ihdr(boxContents) elif boxType == b"colr": colorspace = parse_colr(boxContents) elif boxType == b"res ": hdpi, vdpi = parse_res(boxContents) byteStart = byteEnd return (width, height, colorspace, hdpi, vdpi, channels, bpp) def parsejp2(data): noBytes = len(data) byteStart = 0 boxLengthValue = 1 # dummy value for while loop condition width, height, colorspace, hdpi, vdpi = None, None, None, None, None while byteStart < noBytes and boxLengthValue != 0: boxLengthValue, boxType, byteEnd, boxContents = getBox(data, byteStart, noBytes) if boxType == b"jp2h": width, height, colorspace, hdpi, vdpi, channels, bpp = parse_jp2h( boxContents ) break byteStart = byteEnd if not width: raise Exception("no width in jp2 header") if not height: raise Exception("no height in jp2 header") if not colorspace: raise Exception("no colorspace in jp2 header") # retrieving the dpi is optional so we do not error out if not present return (width, height, colorspace, hdpi, vdpi, channels, bpp) def parsej2k(data): lsiz, rsiz, xsiz, ysiz, xosiz, yosiz, _, _, _, _, csiz = struct.unpack( ">HHIIIIIIIIH", data[4:42] ) ssiz = [None] * csiz xrsiz = [None] * csiz yrsiz = [None] * csiz for i in range(csiz): ssiz[i], xrsiz[i], yrsiz[i] = struct.unpack( "BBB", data[42 + 3 * i : 42 + 3 * (i + 1)] ) assert ssiz == [7, 7, 7] return xsiz - xosiz, ysiz - yosiz, None, None, None, csiz, 8 def parse(data): if data[:4] == b"\xff\x4f\xff\x51": return parsej2k(data) else: return parsejp2(data) if __name__ == "__main__": import sys width, height, colorspace, hdpi, vdpi, 
channels, bpp = parse( open(sys.argv[1], "rb").read() ) print("width = %d" % width) print("height = %d" % height) print("colorspace = %s" % colorspace) print("hdpi = %s" % hdpi) print("vdpi = %s" % vdpi) print("channels = %s" % channels) print("bpp = %s" % bpp)
0ustar00joschjosch%PDF-1.3 % 1 0 obj <<>> endobj 2 0 obj <> endobj 3 0 obj <> endobj 4 0 obj <>>> /Type /Page>> endobj 5 0 obj <> stream q 115.0000 0 0 48.0000 0.0000 0.0000 cm /Im0 Do Q endstream endobj 6 0 obj <> /Filter /FlateDecode /Height 48 /Length 138 /Subtype /Image /Type /XObject /Width 115>> stream ( !пGKJ(,aKs |Q>W50 ?\ᘖ^nl҂vlg> 3Vsꊳa_Aeu}o?WmLfL endstream endobj xref 0 7 0000000000 65535 f 0000000015 00000 n 0000000035 00000 n 0000000082 00000 n 0000000137 00000 n 0000000262 00000 n 0000000359 00000 n trailer <> startxref 744 %%EOF ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1705430659.0 img2pdf-0.6.1/src/tests/output/mono.tif.pdf0000644000175000017500000000162314551547203017631 0ustar00joschjosch%PDF-1.3 % 1 0 obj <<>> endobj 2 0 obj <> endobj 3 0 obj <> endobj 4 0 obj <>>> /Type /Page>> endobj 5 0 obj <> stream q 115.0000 0 0 48.0000 0.0000 0.0000 cm /Im0 Do Q endstream endobj 6 0 obj <>] /Filter [/CCITTFaxDecode] /Height 48 /Length 102 /Subtype /Image /Type /XObject /Width 115>> stream &x)d3Ob@Ab? 
`͆XQCN ƿ+"#@ endstream endobj xref 0 7 0000000000 65535 f 0000000015 00000 n 0000000035 00000 n 0000000082 00000 n 0000000137 00000 n 0000000262 00000 n 0000000359 00000 n trailer <> startxref 701 %%EOF ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1705430659.0 img2pdf-0.6.1/src/tests/output/normal.jpg.pdf0000644000175000017500000000602114551547203020144 0ustar00joschjosch%PDF-1.3 % 1 0 obj <<>> endobj 2 0 obj <> endobj 3 0 obj <> endobj 4 0 obj <>>> /Type /Page>> endobj 5 0 obj <> stream q 115.0000 0 0 48.0000 0.0000 0.0000 cm /Im0 Do Q endstream endobj 6 0 obj <> stream JFIFHHCreated with GIMPC     C   0s ~MLU=fHS(,*rQfsTT P4Bzua%EU%E䣢Pzd=dC(`'Q9#y !60fP-YYs ǓTwIyq12NG|m(BȆޫ:($N2˻ JW{Wqn81PAiW%m[a Nv5m'rm/9EMiCceZ'EɛM%cʰįPRuO;-O7jP8'(T_K-SƶíۄloEb g-iP ^D& 8x3)qNSAT&)iPNPMV@9 ^ɂ()JTc͖-AL*K[#!1AQaq 0?!{k $JI84iL/ [CCIB׆|܋mxGRer}1GqQ, F=5%aMc4B ٗ|܋mJpdE(SFADԥǰf4^;vjQ"sI5!nx3',n׼w"Hz*ÁUt2KM 5I{;iȗ D.p3 er~W3K-22KMlI8넀wjKyׄe | endstream endobj xref 0 7 0000000000 65535 f 0000000015 00000 n 0000000035 00000 n 0000000082 00000 n 0000000137 00000 n 0000000262 00000 n 0000000359 00000 n trailer <> startxref 2874 %%EOF ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1705430659.0 img2pdf-0.6.1/src/tests/output/normal.png.pdf0000644000175000017500000000320614551547203020152 0ustar00joschjosch%PDF-1.3 % 1 0 obj <<>> endobj 2 0 obj <> endobj 3 0 obj <> endobj 4 0 obj <>>> /Type /Page>> endobj 5 0 obj <> stream q 115.0000 0 0 48.0000 0.0000 0.0000 cm /Im0 Do Q endstream endobj 6 0 obj <> /Filter /FlateDecode /Height 48 /Length 850 /Subtype /Image /Type /XObject /Width 115>> stream xK2Q߱HacDHC-]\hU#&w& ɂ`B 2Ibޅ0 ej9gu=s7{9#0? H@P@YP@YP@Ӵ绽 EeYD2:::555;;VVV0 CDа].@C'r|ggQeF^__ONNvJ(6Û@ LRDd([XX輲 iGĚkZ֡lNu 2lQ*{||zh4bYHduu:i-| vxwh#禄LPX+JV>d&7i(KKKvp8q2zR鯟`R8~qqXX,FqqqðΞ`bUi#wJ^7kkk*Jxe;P)^Pjr!b2(k0./// _b(JEY)\h)W*f*nbb7YIV. 
>/}:`o|+NOOa$r9^?ﳗ777T*P(O;\z]7qyEYj ʶ 877fyy9 5of={iۅRYWՏh<<>.J> startxref 1455 %%EOF ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1705430659.0 img2pdf-0.6.1/test_comp.sh0000755000175000017500000000161114551547203014445 0ustar00joschjosch#!/bin/sh if [ $# -ne 1 ]; then echo "usage: $0 image" exit fi echo "converting image to pdf, trying all compressions imagemagick has to offer" echo "if, as a result, Zip/FlateDecode should NOT be the lossless compression with the lowest size ratio, contact me j [dot] schauer [at] email [dot] de" echo "also, send me the image in question" echo imsize=`stat -c "%s" "$1"` for a in `convert -list compress`; do echo "encode:\t$a" convert "$1" -compress $a "`basename $1 .jpg`.pdf" pdfimages "`basename $1 .jpg`.pdf" "`basename $1 .jpg`" printf "diff:\t" diff=`compare -metric AE "$1" "\`basename $1 .jpg\`-000.ppm" null: 2>&1` if [ "$diff" != "0" ]; then echo "lossy" else echo "lossless" fi printf "size:\t" pdfsize=`stat -c "%s" "\`basename $1 .jpg\`.pdf"` echo "scale=1;$pdfsize/$imsize" | bc printf "pdf:\t" grep --max-count=1 --text /Filter "`basename $1 .jpg`.pdf" echo done