simplejson-4.1.1/CHANGES.txt

Version 4.1.1 released 2026-04-24

* The ``build_wheels_py27`` CI job now also builds Python 2.7 wheels for
  Windows AMD64 and Windows x86, joining the existing Py2.7 manylinux1 /
  manylinux2010 x86_64 wheels. This unblocks offline / ``--no-index``
  installs on Py2.7-on-Windows (the original reporter's case), which
  previously had no matching binary wheel on PyPI, fell through to the
  sdist, and failed on the PEP 517 isolated-build step complaining that
  ``setuptools>=42`` was not in the wheelhouse.
  https://github.com/simplejson/simplejson/issues/377

Version 4.1.0 released 2026-04-22

* The C extension now accelerates encoding when ``indent=`` is set.
  Previously the encoder fell back to the pure-Python implementation
  whenever a non-None ``indent`` was passed; now the C encoder emits the
  newline-plus-indent prefix, the level-aware item separator, and the
  closing indent directly. A representative nested-dict workload
  benchmarks about 4-5x faster end-to-end, and the ``indent=0`` and
  empty-container edge cases continue to match the Python output
  byte-for-byte.
* The C extension now emits PEP 678 ``exc.add_note()`` annotations on
  serialization failures, matching the pure-Python encoder. A chained
  error on ``{'a': [1, object(), 3]}`` produces the same three notes
  (``when serializing object object``, ``when serializing list item 1``,
  ``when serializing dict item 'a'``) whether the speedups are loaded or
  not, so the add_note assertions in ``test_errors.py`` no longer need
  ``indent=2`` to force the Python path.
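The notes described in the 4.1.0 entry can be inspected from user code. The sketch below is illustrative only: the helper name ``serialization_notes`` is hypothetical, the fallback to the standard-library ``json`` module is an assumption made so the snippet runs without simplejson installed, and whether any notes appear depends on the Python version (3.11+) and on the encoder in use.

```python
try:
    import simplejson as json  # preferred when installed
except ImportError:  # assumption: fall back to the stdlib module
    import json


def serialization_notes(obj):
    """Return (exception, notes) for a failed dumps() call, or (None, [])."""
    try:
        json.dumps(obj)
    except TypeError as exc:
        # __notes__ exists only if the encoder called exc.add_note(),
        # so default to an empty list when it is absent.
        return exc, list(getattr(exc, "__notes__", []))
    return None, []


exc, notes = serialization_notes({"a": [1, object(), 3]})
print(type(exc).__name__)
for note in notes:
    print(note)
```

Reading ``__notes__`` through ``getattr`` keeps the helper usable on interpreters and encoder versions that do not attach notes.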

Version 4.0.1 released 2026-04-18

* Skip uploading Pyodide/wasm wheels to PyPI, which rejects them with
  "unsupported platform tag 'pyodide_2024_0_wasm32'". The wheels are
  still built in CI and preserved as workflow artifacts.
  https://github.com/simplejson/simplejson/pull/375

Version 4.0.0 released 2026-04-18

* simplejson 4 requires Python 2.7 or Python 3.8+. Older Python versions
  (2.5, 2.6, 3.0-3.7) are no longer supported. pip will not install
  simplejson 4 on unsupported versions.
* The C extension now uses heap types and per-module state instead of
  static types and global state. This is required for free-threading
  support and sub-interpreter isolation. The Python-level API is unchanged.
* Full support for Python 3.13+ free-threading (PEP 703). The C extension
  is now safe to use with the GIL disabled (python3.14t):
  - Converted all static types to heap types with per-module state
  - Added per-object critical sections to scanner and encoder
  - Added free-threading-safe dict operations for Python 3.13+
  - Unified per-module state management and templated parser
  https://github.com/simplejson/simplejson/pull/363
  https://github.com/simplejson/simplejson/pull/364
  https://github.com/simplejson/simplejson/pull/365
  https://github.com/simplejson/simplejson/pull/367
  https://github.com/simplejson/simplejson/pull/369
* Numerous C extension memory safety fixes:
  - Fix use-after-free and leak in encoder ident handling
  - Fix NULL dereferences on OOM in module init and static string init
  - Fix reference leaks in dict encoder (skipkeys item, variable shadowing)
  - Fix member table copy-paste, exception clobbering, missing Py_VISIT
  - Fix error-as-truthy bugs in maybe_quote_bigint and is_raw_json
  - Fix iterable_as_array swallowing MemoryError and KeyboardInterrupt
  - Fix for_json and _asdict swallowing MemoryError, KeyboardInterrupt,
    and other non-AttributeError exceptions raised by user __getattr__
  https://github.com/simplejson/simplejson/pull/355
  https://github.com/simplejson/simplejson/pull/356
  https://github.com/simplejson/simplejson/pull/357
  https://github.com/simplejson/simplejson/pull/358
  https://github.com/simplejson/simplejson/pull/359
  https://github.com/simplejson/simplejson/pull/360
  https://github.com/simplejson/simplejson/pull/373
* C/Python parity fixes:
  - Fix C scanstring off-by-one bounds checks that caused truncated or
    boundary \uXXXX escapes to raise "Invalid \\uXXXX escape sequence"
    instead of "Unterminated string", and report error position at the
    'u' instead of the leading backslash. The C and Python decoders now
    agree on exception class, message, and position across all tested
    edge cases.
  - Align the Python encoder's dispatch order with the C encoder for
    objects that define _asdict(). Previously a list/tuple/dict subclass
    with an _asdict() method encoded as its container type under the
    Python encoder and as the _asdict() return value under the C encoder;
    both now check _asdict() before list/tuple/dict. for_json() continues
    to outrank _asdict() in both.
  - Fix C scanstring raising a plain ValueError ("end is out of bounds")
    instead of JSONDecodeError for out-of-range end indices. User code
    with `except JSONDecodeError:` now catches both the C and pure-Python
    paths consistently.
  https://github.com/simplejson/simplejson/pull/372
* C extension performance and correctness improvements:
  - Add PyDict_Next fast path for unsorted exact-dict encoding, avoiding
    intermediate items list and N tuple allocations
  - Add indexed fast path for exact list/tuple encoding, avoiding
    iterator allocation and per-item PyIter_Next overhead
  - Use PyUnicodeWriter as JSON_Accu backend on Python 3.14+, eliminating
    intermediate string objects and ''.join calls
  - Fix integer overflow in ascii_escape output_size calculation that
    could cause buffer overwrite on pathologically large strings
  - Fix list encoder separator counter overflow (int to Py_ssize_t)
  - Dead code cleanup (unreachable NULL checks, do-while wrappers)
  https://github.com/simplejson/simplejson/pull/370
* Added Python 3.14 support and updated to cibuildwheel 3.2.1.
  CI now tests free-threaded (3.14t) and debug builds with -Werror,
  refcount leak detection, and GIL-disabled mode.
  https://github.com/simplejson/simplejson/pull/343
* Added a ThreadSanitizer (TSan) stress test CI job. Builds a
  TSan-instrumented free-threaded CPython (cached between runs) and runs
  a concurrent stress test script against the C extension to catch data
  races under free-threading.
  https://github.com/simplejson/simplejson/pull/373
* Replace deprecated license classifiers with SPDX license expression
  https://github.com/simplejson/simplejson/pull/347
* Documented RawJSON usage with examples and caveats
  https://github.com/simplejson/simplejson/pull/346
* Added pyproject.toml for PEP 517 build support. setup.py is retained
  for Python 2.7 wheel builds and backwards compatibility.
* Migrated build_ext import from distutils to setuptools in setup.py.
  The distutils.errors imports are kept since setuptools vendors
  distutils on Python 3.12+ where stdlib distutils was removed.
* CI now tests PEP 517 builds (pyproject.toml) alongside the existing
  setup.py-based builds.
* Added Pyodide (wasm32) wheel builds with C speedups via cibuildwheel.
  Previously Pyodide users fell back to the pure-Python wheel; now they
  get the compiled C extension cross-compiled to WebAssembly. Thread and
  subprocess tests are skipped on Emscripten where those APIs are
  unavailable.
* Test suite now fails (instead of skipping) when C speedups are missing
  during cibuildwheel runs, catching broken extension builds early.
* New ``array_hook`` parameter for ``loads()``, ``load()``, and
  ``JSONDecoder``. It is called with each decoded JSON array (as a list),
  and its return value replaces the list. Analogous to ``object_hook``
  for dicts. Works with both the Python decoder and C scanner.
  (Matches CPython 3.15 json module.)
* Trailing comma detection: the decoder now raises ``JSONDecodeError``
  with "Illegal trailing comma before end of object/array" for inputs
  like ``[1,]`` and ``{"a": 1,}`` instead of generic error messages.
  Both the Python decoder and C scanner are updated.
  (Matches CPython 3.13+ json module.)
* ``frozendict`` encoding support: when ``frozendict`` is available
  (CPython 3.15+ PEP 814), it is encoded as a JSON object just like
  ``dict``. No effect on older Python versions.
* Serialization errors now include ``add_note()`` context on Python 3.11+
  (PEP 678), annotating exceptions with the path to the error, e.g.
  "when serializing list item 1" / "when serializing dict item 'key'".
  Only applies to the Python encoder.
* New C fast path for ``encode_basestring`` (``ensure_ascii=False``).
  Previously the non-ASCII string encoder fell back to pure Python; now
  it has a C implementation matching the existing
  ``encode_basestring_ascii`` fast path.
  https://github.com/simplejson/simplejson/issues/207
* The Python decoder now rejects non-ASCII digits (e.g. fullwidth
  ``\uff10``) in JSON numbers, matching the C scanner behavior. The
  ``NUMBER_RE`` regex was changed from ``\d`` to ``[0-9]``.
* Removed dead single-phase init code for Python 3.3/3.4 from the
  C extension (these versions are no longer supported).
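The trailing comma entry above is easiest to see from the caller's side. The following is a minimal sketch, not taken from the simplejson sources: ``try_loads`` is a hypothetical helper, the fallback to the standard-library ``json`` module is an assumption so the snippet runs either way, and the exact wording of ``err.msg`` depends on the installed version.

```python
try:
    import simplejson as json  # preferred when installed
except ImportError:  # assumption: fall back to the stdlib module
    import json


def try_loads(text):
    """Return (value, None) on success or (None, error) on a decode failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as err:
        return None, err


for doc in ('[1, 2]', '[1,]', '{"a": 1,}'):
    value, err = try_loads(doc)
    if err is None:
        print(doc, "->", value)
    else:
        # err.pos is the character offset; err.msg is version-dependent.
        print(doc, "-> error at position", err.pos, ":", err.msg)
```

Catching ``JSONDecodeError`` rather than matching on the message text keeps calling code stable across the message changes described in this release.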

Version 3.20.2 released 2025-09-24

* Disable speedups on GraalPy same as on PyPy
  https://github.com/simplejson/simplejson/pull/339

Version 3.20.1 released 2025-02-14

* Do not memoize keys before they are coerced to string
  https://github.com/simplejson/simplejson/pull/329

Version 3.19.3 released 2024-08-14

* Updated test & build matrix to include Python 3.13.
  Dropped wheel support for Python 2.7 on macOS.
  https://github.com/simplejson/simplejson/pull/326

Version 3.19.2 released 2023-10-05

* Updated test & build matrix to include Python 3.12 and use
  GitHub Actions as a Trusted Publisher (OIDC)
  https://github.com/simplejson/simplejson/pull/317

Version 3.19.1 released 2023-04-06

* This release contains security hardening measures based on
  recommendations by a security audit sponsored by OSTIF and conducted
  by X41 D-Sec GmbH. Several of these measures change defaults to be
  stricter: by default simplejson will now only consume and produce
  compliant JSON, but the flags still exist for any backwards
  compatibility needs. No high priority issues were discovered; the
  reference count leak is thought to be unreachable since the digits of
  the float are checked before PyOS_string_to_double is called.
  A link to the public version of this report will be included in a
  future release of simplejson.
  The following fixes were implemented in one PR:
  https://github.com/simplejson/simplejson/pull/313
* Fix invalid handling of unicode escape sequences in the pure Python
  implementation of the decoder (SJ-PT-23-01)
* Fix missing reference count decrease if PyOS_string_to_double raises
  an exception in Python 2.x; was probably unreachable (SJ-PT-23-02)
* Backport the integer string length limitation from Python 3.11 to
  limit quadratic number parsing (SJ-PT-23-03)
* Fix inconsistencies with error messages between the C and Python
  implementations (SJ-PT-23-100)
* Remove unused unichr import from encoder (SJ-PT-23-101)
* Remove unused namedtuple_as_object and tuple_as_array arguments from
  simplejson.load (SJ-PT-23-102)
* Remove vestigial _one_shot code from iterencode (SJ-PT-23-103)
* Change default of allow_nan from True to False and add allow_nan
  to decoder (SJ-PT-23-107)

Version 3.18.4 released 2023-03-14

* Test the sdist to prevent future regressions
  https://github.com/simplejson/simplejson/pull/311
* Enable ppc64le wheels
  https://github.com/simplejson/simplejson/pull/312

Version 3.18.3 released 2023-02-05

* Fix regression in sdist archive
  https://github.com/simplejson/simplejson/pull/310

Version 3.18.2 released 2023-02-04

* Distribute a pure python wheel for Pyodide
  https://github.com/simplejson/simplejson/pull/308

Version 3.18.1 released 2023-01-03

* Remove unnecessary `i` variable from encoder module namespace
  https://github.com/simplejson/simplejson/pull/303
* Declare support for Python 3.11 and add wheels
  https://github.com/simplejson/simplejson/pull/305

Version 3.18.0 released 2022-11-14

* Allow serialization of classes that implement for_json or _asdict by
  ignoring TypeError when those methods are called
  https://github.com/simplejson/simplejson/pull/302
* Raise JSONDecodeError instead of ValueError in invalid unicode escape
  sequence edge case
  https://github.com/simplejson/simplejson/pull/298

Version 3.17.6 released 2021-11-15

* Declare support for Python 3.10
  and add wheels
  https://github.com/simplejson/simplejson/pull/291
  https://github.com/simplejson/simplejson/pull/292

Version 3.17.5 released 2021-08-23

* Fix the C extension module to harden is_namedtuple against look-alikes
  such as Mocks. Also prevent dict encoding from causing an unraised
  SystemError when encountering a non-dict. Noticed by running user
  tests against a CPython interpreter with C asserts enabled
  (COPTS += -UNDEBUG).
  https://github.com/simplejson/simplejson/pull/284

Version 3.17.4 released 2021-08-19

* Upgrade cibuildwheel
  https://github.com/simplejson/simplejson/pull/287

Version 3.17.3 released 2021-07-09

* Replaced Travis-CI and AppVeyor with GitHub Actions,
  adding wheels for Python 3.9.
  https://github.com/simplejson/simplejson/pull/283

Version 3.17.2 released 2020-07-16

* Added arm64 to build matrix and reintroduced manylinux wheels
  https://github.com/simplejson/simplejson/pull/264
* No more bdist_wininst builds per PEP 527
  https://github.com/simplejson/simplejson/pull/260
* Minor grammatical issue fixed in README
  https://github.com/simplejson/simplejson/pull/261

Version 3.17.0 released 2019-11-17

* Updated documentation to be Python 3 first, and
  have removed documentation notes about version changes
  that occurred more than five years ago.
  https://github.com/simplejson/simplejson/pull/257
  https://github.com/simplejson/simplejson/pull/254
* Update build matrix for Python 3.8
  https://github.com/simplejson/simplejson/pull/255
  https://github.com/simplejson/simplejson/pull/256

Version 3.16.1 released 2018-09-07

* Added examples for JSON lines use cases
  https://github.com/simplejson/simplejson/pull/236
* Add wheels for more Python versions and platforms
  https://github.com/simplejson/simplejson/pull/234
  https://github.com/simplejson/simplejson/pull/233
  https://github.com/simplejson/simplejson/pull/231

Version 3.16.0 released 2018-06-28

* Restore old behavior with regard to the type of decoded empty
  strings with speedups enabled on Python 2.x
  https://github.com/simplejson/simplejson/pull/225
* Add python_requires to setup.py to help pip
  https://github.com/simplejson/simplejson/pull/224
* Fix CSS in docs when built locally
  https://github.com/simplejson/simplejson/pull/222

Version 3.15.0 released 2018-05-12

* Clean up the C code
  https://github.com/simplejson/simplejson/pull/220
* Bypass the decode() method in bytes subclasses
  https://github.com/simplejson/simplejson/pull/219
* Support builds without cStringIO
  https://github.com/simplejson/simplejson/pull/217
* Allow disabling the serialization of bytes by default in Python 3
  https://github.com/simplejson/simplejson/pull/216
* Simplify the compatibility code
  https://github.com/simplejson/simplejson/pull/215
* Fix tests in Python 2.5
  https://github.com/simplejson/simplejson/pull/214

Version 3.14.0 released 2018-04-21

* Defer is_raw_json test (performance improvement)
  https://github.com/simplejson/simplejson/pull/212
* Avoid escaping U+2028 and U+2029 without ensure_ascii
  https://github.com/simplejson/simplejson/pull/211
* Fix an incorrect type test in Python 2, avoiding an unnecessary
  unicode copy.
  https://github.com/simplejson/simplejson/pull/210

Version 3.13.2 released 2017-11-24

* Fix additional Python 2.x compilation issue on Windows

Version 3.13.1 released 2017-11-24

* Improve CI to catch speedups build regressions
* Fix speedups build regression in Python 2.x
  https://github.com/simplejson/simplejson/issues/193

Version 3.13.0 released 2017-11-23

* Workarounds for NamedTemporaryFile issues with Windows for tool tests
* Make TypeError messages contain type name instead of a repr.
  https://github.com/simplejson/simplejson/pull/191
* Ensure that encoding of text subtypes is consistent with or without
  speedups
  https://github.com/simplejson/simplejson/issues/185

Version 3.12.1 released 2017-11-23

* Misc updates to build infrastructure
* Fix an assertion failure when make_encoder receives a bad encoder
  argument
  https://github.com/simplejson/simplejson/pull/188
* Fix potential crash during GC
  https://github.com/simplejson/simplejson/pull/187
* Fix a reference leak when sorting keys
  https://github.com/simplejson/simplejson/pull/186

Version 3.12.0 released 2017-11-05

* Fix threaded import race condition
  https://github.com/simplejson/simplejson/issues/184
* Move RawJSON implementation to simplejson.raw_json module
* Move JSONDecodeError implementation to simplejson.errors module

Version 3.11.1 released 2017-06-19

* Fix issue with item_sort_key when speedups are available, and add
  auto-discovery to test suites to prevent similar regressions
  https://github.com/simplejson/simplejson/issues/173

Version 3.11.0 released 2017-06-18

* docstring fix in JSONEncoder
  https://github.com/simplejson/simplejson/pull/172
* Call PyObject_IsTrue() only once for the strict argument of scanner
  https://github.com/simplejson/simplejson/pull/170
* Fix a crash with unencodable encoding in the encoder
  https://github.com/simplejson/simplejson/pull/171
* Remove unused imports
  https://github.com/simplejson/simplejson/pull/162
* Remove remnants of Python 2.4 support
  https://github.com/simplejson/simplejson/pull/168
* Fix argument checking errors in _speedups.c
  https://github.com/simplejson/simplejson/pull/169
* Remove the `__init__` methods in extension classes
  https://github.com/simplejson/simplejson/pull/166
* Fix typo in the doc for loads
  https://github.com/simplejson/simplejson/issues/161
* Add Python 3.6 to testing matrix and PyPI metadata
  https://github.com/simplejson/simplejson/pull/153
  https://github.com/simplejson/simplejson/pull/152

Version 3.10.0 released 2016-10-28

* Add RawJSON class to allow a faster path for already encoded JSON.
  https://github.com/simplejson/simplejson/pull/143

Version 3.9.0 released 2016-10-21

* Workaround for bad behavior in string subclasses
  https://github.com/simplejson/simplejson/issues/144
* Fix warnings flagged by -3
  https://github.com/simplejson/simplejson/pull/146
* Update readthedocs documentation links
  https://github.com/simplejson/simplejson/pull/137
* Add build status badge to README
  https://github.com/simplejson/simplejson/pull/134

Version 3.8.2 released 2016-02-14

* Fix implicit cast compiler warning in _speedups.c
* simplejson is now available as wheels for OS X and Windows thanks to
  Travis-CI and AppVeyor respectively! Many thanks to @aebrahim for
  getting this party started.
  https://github.com/simplejson/simplejson/pull/130
  https://github.com/simplejson/simplejson/issues/122

Version 3.8.1 released 2015-10-27

* Fix issue with iterable_as_array and indent option
  https://github.com/simplejson/simplejson/issues/128
* Fix typo in keyword argument name introduced in 3.8.0
  https://github.com/simplejson/simplejson/pull/123

Version 3.8.0 released 2015-07-18

* New iterable_as_array encoder option to perform lazy serialization of
  any iterable objects, without having to convert to tuple or list.

Version 3.7.3 released 2015-05-31

* Fix typo introduced in 3.7.0 (behavior should be indistinguishable)
  https://github.com/simplejson/simplejson/commit/e18cc09b688ea1f3305c27616fd3cadd2adc6d31#commitcomment-11443842

Version 3.7.2 released 2015-05-22

* Do not cache Decimal class in encoder, only reference the decimal module.
  This may make reload work in more common scenarios.

Version 3.7.1 released 2015-05-18

* Fix compilation with MSVC
  https://github.com/simplejson/simplejson/pull/119

Version 3.7.0 released 2015-05-18

* simplejson no longer trusts custom str/repr methods for int, long, float
  subclasses. These instances are now formatted as if they were exact
  instances of those types.
  https://github.com/simplejson/simplejson/issues/118

Version 3.6.5 released 2014-10-24

* Important bug fix for reference leak when an error occurs during
  dict encoding
  https://github.com/simplejson/simplejson/issues/109

Version 3.6.4 released 2014-09-29

* Important bug fix for dump when only sort_keys is set
  https://github.com/simplejson/simplejson/issues/106

Version 3.6.3 released 2014-08-18

* Documentation updates
  https://github.com/simplejson/simplejson/issues/103

Version 3.6.2 released 2014-08-09

* Documentation updates
  http://bugs.python.org/issue21514

Version 3.6.1 released 2014-08-09

* Documentation updates
  https://github.com/simplejson/simplejson/issues/102

Version 3.6.0 released 2014-07-21

* Automatically strip any UTF-8 BOM from input to more closely
  follow the latest specs
  https://github.com/simplejson/simplejson/pull/101

Version 3.5.3 released 2014-06-24

* Fix lower bound checking in scan_once / raw_decode API
  https://github.com/simplejson/simplejson/issues/98

Version 3.5.2 released 2014-05-22

* Fix Windows build with VS2008
  https://github.com/simplejson/simplejson/pull/97

Version 3.5.1 released 2014-05-21

* Consistently reject int_as_string_bitcount settings that are not
  positive integers

Version 3.5.0 released 2014-05-20

* Added int_as_string_bitcount encoder option
  https://github.com/simplejson/pull/96
* Fixed potential crash when encoder created with incorrect options

Version 3.4.1 released 2014-04-30

* Fixed tests to run on Python 3.4

Version 3.4.0 released 2014-04-02

* Native setuptools support re-introduced
  https://github.com/simplejson/simplejson/pull/92

Version 3.3.3 released 2014-02-14

* Improve test suite's Python 3.4 compatibility
  https://github.com/simplejson/simplejson/issues/87

Version 3.3.2 released 2014-01-06

* Docstring fix for decoded string types
  https://github.com/simplejson/simplejson/pull/82

Version 3.3.1 released 2013-10-05

* JSONDecodeError exceptions can now be pickled
  https://github.com/simplejson/simplejson/pull/78

Version 3.3.0 released 2013-05-07

* Unpaired surrogates once again pass through the decoder, to match older
  behavior and the RFC-4627 spec.
  https://github.com/simplejson/simplejson/issues/62

Version 3.2.0 released 2013-05-01

* New ignore_nan kwarg in encoder that serializes out
  of range floats (Infinity, -Infinity, NaN) as null for ECMA-262
  compliance.
  https://github.com/simplejson/simplejson/pull/63
* New for_json kwarg in encoder to make it possible for
  subclasses of dict and list to be specialized.
  https://github.com/simplejson/simplejson/pull/69

Version 3.1.3 released 2013-04-06

* Updated documentation to discourage subclassing whenever possible.
  default, object_hook, and object_pairs_hook provide almost all of
  the functionality of subclassing.
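As an illustration of the 3.1.3 note, the sketch below round-trips a ``date`` using only the ``default`` and ``object_hook`` parameters, with no encoder or decoder subclass. The ``__date__`` tag and both helper names are hypothetical conventions chosen for this example, and the fallback to the standard-library ``json`` module is an assumption so the snippet runs without simplejson installed.

```python
from datetime import date

try:
    import simplejson as json  # preferred when installed
except ImportError:  # assumption: fall back to the stdlib module
    import json


def encode_extra(obj):
    """default= hook: called only for objects the encoder cannot serialize."""
    if isinstance(obj, date):
        return {"__date__": obj.isoformat()}  # hypothetical tagging scheme
    raise TypeError("%r is not JSON serializable" % (obj,))


def decode_extra(dct):
    """object_hook: called with every decoded JSON object (a dict)."""
    if "__date__" in dct:
        return date.fromisoformat(dct["__date__"])  # requires Python 3.7+
    return dct


text = json.dumps({"released": date(2013, 4, 6)}, default=encode_extra)
restored = json.loads(text, object_hook=decode_extra)
print(text)
print(restored)
```

Because the hooks are plain functions, the same pair can be passed to ``dump``/``load`` as well without defining new classes.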

Version 3.1.2 released 2013-03-20

* Updated documentation to reflect separators behavior when indent is
  not None
  https://github.com/simplejson/simplejson/issues/59
* Test suite should be compatible with debug builds of Python 2.x and 3.x
  https://github.com/simplejson/simplejson/pull/65

Version 3.1.1 released 2013-02-21

* setup.py now has another workaround for Windows machines without
  MSVC installed
  http://bugs.python.org/issue7511

Version 3.1.0 released 2013-02-21

* Updated JSON conformance test suite
  http://bugs.python.org/issue16559
* simplejson.tool tests and bugfix for Python 3.x
  http://bugs.python.org/issue16549
* Improve error messages for certain kinds of truncated input
  http://bugs.python.org/issue16009
* Moved JSONDecodeError to json.scanner (still available for import
  from json.decoder)
* Changed scanner to use JSONDecodeError directly rather than
  StopIteration to improve error messages

Version 3.0.9 released 2013-02-21

* Fix an off-by-one error in the colno property of JSONDecodeError
  (when lineno == 1)
  http://bugs.python.org/issue17225

Version 3.0.8 released 2013-02-19

* Fix a Python 2.x compiler warning for narrow unicode builds
  https://github.com/simplejson/simplejson/issues/56

Version 3.0.7 released 2013-01-11

* NOTE: this release only changes the license.
* simplejson is now dual-licensed software, MIT or AFL v2.1. It is
  also made explicit that this code is also licensed to the PSF under
  a Contributor Agreement.

Version 3.0.6 released 2013-01-11

* Fix for major Python 2.x ensure_ascii=False encoding regression
  introduced in simplejson 3.0.0. If you use this setting, please
  upgrade immediately.
  https://github.com/simplejson/simplejson/issues/50

Version 3.0.5 released 2013-01-03

* NOTE: this release only changes the tests, it is not essential to upgrade
* Tests now run with deprecation warnings printed
* Fixed Python 3 syntax error in simplejson.tool
  https://github.com/simplejson/simplejson/issues/49
* Fixed Python 3.3 deprecation warnings in test suite
  https://github.com/simplejson/simplejson/issues/48

Version 3.0.4 released 2013-01-02

* MSVC compatibility for Python 3.3
  https://github.com/simplejson/simplejson/pull/47

Version 3.0.3 released 2013-01-01

* Fixes for bugs introduced in 3.0.2
* Fixes for Python 2.5 compatibility
* MSVC compatibility for Python 2.x
  https://github.com/simplejson/simplejson/pull/46

Version 3.0.2 released 2013-01-01

* THIS VERSION HAS BEEN REMOVED
* Missed a changeset to _speedups.c in the 3.0.1 branch cut

Version 3.0.1 released 2013-01-01

* THIS VERSION HAS BEEN REMOVED
* Add accumulator optimization to encoder, equivalent to the usage of
  `_Py_Accu` in the Python 3.3 json library. Only relevant if encoding
  very large JSON documents.

Version 3.0.0 released 2012-12-30

* Python 3.3 is now supported, thanks to Vinay Sajip
  https://github.com/simplejson/simplejson/issues/8
* `sort_keys`/`item_sort_key` now sort on the stringified version of the
  key, rather than the original object. This ensures that the sort
  only compares string types and makes the behavior consistent between
  Python 2.x and Python 3.x.
* Like other number types, Decimal instances used as keys are now
  coerced to strings when use_decimal is True.

Version 2.6.2 released 2012-09-21

* JSONEncoderForHTML was not exported in the simplejson module
  https://github.com/simplejson/simplejson/issues/41

Version 2.6.1 released 2012-07-27

* raw_decode() now skips whitespace before the object
  https://github.com/simplejson/simplejson/pull/38

Version 2.6.0 released 2012-06-26

* Error messages changed to match proposal for Python 3.3.1
  http://bugs.python.org/issue5067

Version 2.5.2 released 2012-05-10

* Fix for regression introduced in 2.5.1
  https://github.com/simplejson/simplejson/issues/35

Version 2.5.1 released 2012-05-10

* Support for use_decimal=True in environments that use Python
  sub-interpreters such as uWSGI
  https://github.com/simplejson/simplejson/issues/34

Version 2.5.0 released 2012-03-29

* New item_sort_key option for encoder to allow fine grained control of
  sorted output

Version 2.4.0 released 2012-03-06

* New bigint_as_string option for encoder to trade JavaScript number
  precision issues for type issues.
  https://github.com/simplejson/simplejson/issues/31

Version 2.3.3 released 2012-02-27

* Allow unknown numerical types for indent parameter
  https://github.com/simplejson/simplejson/pull/29

Version 2.3.2 released 2011-12-30

* Fix crashing regression in speedups introduced in 2.3.1

Version 2.3.1 released 2011-12-29

* namedtuple_as_object now checks _asdict to ensure that it
  is callable.
  https://github.com/simplejson/simplejson/issues/26

Version 2.3.0 released 2011-12-05

* Any objects with _asdict() methods are now considered for
  namedtuple_as_object.
  https://github.com/simplejson/simplejson/pull/22

Version 2.2.1 released 2011-09-06

* Fix MANIFEST.in issue when building a sdist from a sdist.
  https://github.com/simplejson/simplejson/issues/16

Version 2.2.0 released 2011-09-04

* Remove setuptools requirement, reverted to pure distutils
* use_decimal default for encoding (dump, dumps, JSONEncoder) is now True
* tuple encoding as JSON arrays can be turned off with the new
  tuple_as_array=False option.
  https://github.com/simplejson/simplejson/pull/6
* namedtuple (or other tuple subclasses with _asdict methods) are now
  encoded as JSON objects rather than arrays by default. Can be disabled
  and treated as a tuple with the new namedtuple_as_object=False option.
  https://github.com/simplejson/simplejson/pull/6
* JSONDecodeError is now raised instead of ValueError when a document
  ends with an opening quote and the C speedups are in use.
  https://github.com/simplejson/simplejson/issues/15
* Updated documentation with information about JSONDecodeError
* Force unicode linebreak characters to be escaped (U+2028 and U+2029)
  http://timelessrepo.com/json-isnt-a-javascript-subset
* Moved documentation from a git submodule to
  https://simplejson.readthedocs.io/

Version 2.1.6 released 2011-05-08

* Prevent segfaults with deeply nested JSON documents
  https://github.com/simplejson/simplejson/issues/11
* Fix compatibility with Python 2.5
  https://github.com/simplejson/simplejson/issues/5

Version 2.1.5 released 2011-04-17

* Built sdist tarball with setuptools_git installed. Argh.

Version 2.1.4 released 2011-04-17

* Does not try to build the extension when using PyPy
* Trailing whitespace after commas no longer emitted when indent is used
* Migrated to github http://github.com/simplejson/simplejson

Version 2.1.3 released 2011-01-17

* Support the sort_keys option in C encoding speedups
  http://code.google.com/p/simplejson/issues/detail?id=86
* Allow use_decimal to work with dump()
  http://code.google.com/p/simplejson/issues/detail?id=87

Version 2.1.2 released 2010-11-01

* Correct wrong end when object_pairs_hook is used
  http://code.google.com/p/simplejson/issues/detail?id=85
* Correct output for indent=0
  http://bugs.python.org/issue10019
* Correctly raise TypeError when non-string keys are used with speedups
  http://code.google.com/p/simplejson/issues/detail?id=82
* Fix the endlineno, endcolno attributes of the JSONDecodeError exception.
  http://code.google.com/p/simplejson/issues/detail?id=81

Version 2.1.1 released 2010-03-31

* Change how setup.py imports ez_setup.py to try to work around old
  versions of setuptools.
  http://code.google.com/p/simplejson/issues/detail?id=75
* Fix compilation on Windows platform (and other platforms with very
  picky compilers)
* Corrected simplejson.__version__ and other minor doc changes.
* Do not fail speedups tests if speedups could not be built.
  http://code.google.com/p/simplejson/issues/detail?id=73

Version 2.1.0 released 2010-03-10

* Decimal serialization officially supported for encoding with
  use_decimal=True. For encoding this encodes Decimal objects and
  for decoding it implies parse_float=Decimal
* Python 2.4 no longer supported (may still work, but no longer tested)
* Decoding performance and memory utilization enhancements
  http://bugs.python.org/issue7451
* JSONEncoderForHTML class for escaping &, <, >
  http://code.google.com/p/simplejson/issues/detail?id=66
* Memoization of object keys during encoding (when using speedups)
* Encoder changed to use PyIter_Next for list iteration to avoid
  potential threading issues
* Encoder changed to use iteritems rather than PyDict_Next in order to
  support dict subclasses that have a well defined ordering
  http://bugs.python.org/issue6105
* indent encoding parameter changed to be a string rather than an integer
  (integer use still supported for backwards compatibility)
  http://code.google.com/p/simplejson/issues/detail?id=56
* Test suite (python setup.py test) now automatically runs with and
  without speedups
  http://code.google.com/p/simplejson/issues/detail?id=55
* Fixed support for older versions of easy_install (e.g.
  stock Mac OS X config)
  http://code.google.com/p/simplejson/issues/detail?id=54
* Fixed str/unicode mismatches when using ensure_ascii=False
  http://code.google.com/p/simplejson/issues/detail?id=48
* Fixed error message when parsing an array with trailing comma with
  speedups
  http://code.google.com/p/simplejson/issues/detail?id=46
* Refactor decoder errors to raise JSONDecodeError instead of ValueError
  http://code.google.com/p/simplejson/issues/detail?id=45
* New object_pairs_hook feature in decoder which makes it possible to
  preserve key order.
  http://bugs.python.org/issue5381
* Fixed containerless unicode float decoding (same bug as 2.0.4, oops!)
  http://code.google.com/p/simplejson/issues/detail?id=43
* Share PosInf definition between encoder and decoder
* Minor reformatting to make it easier to backport simplejson changes
  to Python 2.7/3.1 json module

Version 2.0.9 released 2009-02-18

* Adds cyclic GC to the Encoder and Scanner speedups, which could've
  caused uncollectible cycles in some cases when using custom parser
  or encoder functions

Version 2.0.8 released 2009-02-15

* Documentation fixes
* Fixes encoding True and False as keys
* Fixes checking for True and False by identity for several parameters

Version 2.0.7 released 2009-01-04

* Documentation fixes
* C extension now always returns unicode strings when the input string is
  unicode, even for empty strings

Version 2.0.6 released 2008-12-19

* Windows build fixes

Version 2.0.5 released 2008-11-23

* Fixes a segfault in the C extension when using check_circular=False and
  encoding an invalid document

Version 2.0.4 released 2008-10-24

* Fixes a parsing error in the C extension when the JSON document is (only)
  a floating point number. It would consume one too few characters in that
  case, and claim the document invalid.

Version 2.0.3 released 2008-10-11

* Fixes reference leaks in the encoding speedups (sorry about that!)
* Fixes doctest suite for Python 2.6
* More optimizations for the decoder

Version 2.0.2 released 2008-10-06

* Fixes MSVC2003 build regression
* Fixes Python 2.4 compatibility in _speedups.c

Version 2.0.1 released 2008-09-29

* Fixes long encoding regression introduced in 2.0.0
* Fixes MinGW build regression introduced in 2.0.0

Version 2.0.0 released 2008-09-27

* optimized Python encoding path
* optimized Python decoding path
* optimized C encoding path
* optimized C decoding path
* switched to sphinx docs (nearly the same as the json module in python 2.6)

Version 1.9.3 released 2008-09-23

* Decoding is significantly faster (for our internal benchmarks)
* Pretty-printing tool changed from simplejson to simplejson.tool for better
  Python 2.6 compatibility
* Misc. bug fixes

Version 1.9 released 2008-05-03

* Rewrote test suite with unittest and doctest (no more nosetest dependency)
* Better PEP 7 and PEP 8 source compliance
* Removed simplejson.jsonfilter demo module
* simplejson.jsonfilter is no longer included

Version 1.8.1 released 2008-03-24

* Optional C extension for accelerating the decoding of JSON strings
* Command line interface for pretty-printing JSON (via python -msimplejson)
* Decoding of integers and floats is now extensible (e.g. to use Decimal) via
  parse_int, parse_float options.
* Subversion and issue tracker moved to google code:
  http://code.google.com/p/simplejson/
* "/" is no longer escaped, so if you're embedding JSON directly in HTML
  you'll want to use .replace("/", "\\/") to prevent a close-tag attack.

Version 1.7 released 2007-03-18

* Improves encoding performance with an optional C extension to speed up
  str/unicode encoding (by 10-150x or so), which yields an overall speed
  boost of 2x+ (JSON is string-heavy).
* Support for encoding unicode code points outside the BMP to UTF-16
  surrogate code pairs (specified by the Strings section of RFC 4627).

Version 1.6 released 2007-03-03

* Improved str support for encoding.
  Previous versions of simplejson integrated strings directly into the
  output stream; this version ensures they're of a particular encoding
  (default is UTF-8) so that the output stream is valid.

Version 1.5 released 2007-01-18

* Better Python 2.5 compatibility
* Better Windows compatibility
* indent encoding parameter for pretty printing
* separators encoding parameter for generating optimally compact JSON

Version 1.3 released 2006-04-01

* The optional object_hook function is called upon decoding of any JSON
  object literal, and its return value is used instead of the dict that
  would normally be used. This can be used to efficiently implement
  features such as JSON-RPC class hinting, or other custom decodings of
  JSON. See the documentation for more information.

Version 1.1 released 2005-12-31

* Renamed from simple_json to simplejson to comply with PEP 8 module naming
  guidelines
* Full set of documentation
* More tests
* The encoder and decoder have been extended to understand NaN, Infinity, and
  -Infinity (but this can be turned off via allow_nan=False for strict JSON
  compliance)
* The decoder's scanner has been fixed so that it no longer accepts invalid
  JSON documents
* The decoder now reports line and column information as well as character
  numbers for easier debugging
* The encoder now has a circular reference checker, which can be optionally
  disabled with check_circular=False
* dump, dumps, load, loads now accept an optional cls kwarg to use an
  alternate JSONEncoder or JSONDecoder class for convenience.
* The read/write compatibility shim for json-py now has deprecation warnings

Version 1.0 released 2005-12-25

* Initial release

simplejson-4.1.1/LICENSE.txt

simplejson is dual-licensed software. It is available under the terms
of the MIT license, or the Academic Free License version 2.1.
The full text of each license agreement is included below. This code is also licensed to the Python Software Foundation (PSF) under a Contributor Agreement. MIT License =========== Copyright (c) 2006 Bob Ippolito Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. Academic Free License v. 2.1 ============================ Copyright (c) 2006 Bob Ippolito. All rights reserved. This Academic Free License (the "License") applies to any original work of authorship (the "Original Work") whose owner (the "Licensor") has placed the following notice immediately following the copyright notice for the Original Work: Licensed under the Academic Free License version 2.1 1) Grant of Copyright License. 
Licensor hereby grants You a world-wide, royalty-free, non-exclusive, perpetual, sublicenseable license to do the following: a) to reproduce the Original Work in copies; b) to prepare derivative works ("Derivative Works") based upon the Original Work; c) to distribute copies of the Original Work and Derivative Works to the public; d) to perform the Original Work publicly; and e) to display the Original Work publicly. 2) Grant of Patent License. Licensor hereby grants You a world-wide, royalty-free, non-exclusive, perpetual, sublicenseable license, under patent claims owned or controlled by the Licensor that are embodied in the Original Work as furnished by the Licensor, to make, use, sell and offer for sale the Original Work and Derivative Works. 3) Grant of Source Code License. The term "Source Code" means the preferred form of the Original Work for making modifications to it and all available documentation describing how to modify the Original Work. Licensor hereby agrees to provide a machine-readable copy of the Source Code of the Original Work along with each copy of the Original Work that Licensor distributes. Licensor reserves the right to satisfy this obligation by placing a machine-readable copy of the Source Code in an information repository reasonably calculated to permit inexpensive and convenient access by You for as long as Licensor continues to distribute the Original Work, and by publishing the address of that information repository in a notice immediately following the copyright notice that applies to the Original Work. 4) Exclusions From License Grant. Neither the names of Licensor, nor the names of any contributors to the Original Work, nor any of their trademarks or service marks, may be used to endorse or promote products derived from this Original Work without express prior written permission of the Licensor. 
Nothing in this License shall be deemed to grant any rights to trademarks, copyrights, patents, trade secrets or any other intellectual property of Licensor except as expressly stated herein. No patent license is granted to make, use, sell or offer to sell embodiments of any patent claims other than the licensed claims defined in Section 2. No right is granted to the trademarks of Licensor even if such marks are included in the Original Work. Nothing in this License shall be interpreted to prohibit Licensor from licensing under different terms from this License any Original Work that Licensor otherwise would have a right to license. 5) This section intentionally omitted. 6) Attribution Rights. You must retain, in the Source Code of any Derivative Works that You create, all copyright, patent or trademark notices from the Source Code of the Original Work, as well as any notices of licensing and any descriptive text identified therein as an "Attribution Notice." You must cause the Source Code for any Derivative Works that You create to carry a prominent Attribution Notice reasonably calculated to inform recipients that You have modified the Original Work. 7) Warranty of Provenance and Disclaimer of Warranty. Licensor warrants that the copyright in and to the Original Work and the patent rights granted herein by Licensor are owned by the Licensor or are sublicensed to You under the terms of this License with the permission of the contributor(s) of those copyrights and patent rights. Except as expressly stated in the immediately proceeding sentence, the Original Work is provided under this License on an "AS IS" BASIS and WITHOUT WARRANTY, either express or implied, including, without limitation, the warranties of NON-INFRINGEMENT, MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY OF THE ORIGINAL WORK IS WITH YOU. This DISCLAIMER OF WARRANTY constitutes an essential part of this License. 
No license to Original Work is granted hereunder except under this disclaimer. 8) Limitation of Liability. Under no circumstances and under no legal theory, whether in tort (including negligence), contract, or otherwise, shall the Licensor be liable to any person for any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or the use of the Original Work including, without limitation, damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses. This limitation of liability shall not apply to liability for death or personal injury resulting from Licensor's negligence to the extent applicable law prohibits such limitation. Some jurisdictions do not allow the exclusion or limitation of incidental or consequential damages, so this exclusion and limitation may not apply to You. 9) Acceptance and Termination. If You distribute copies of the Original Work or a Derivative Work, You must make a reasonable effort under the circumstances to obtain the express assent of recipients to the terms of this License. Nothing else but this License (or another written agreement between Licensor and You) grants You permission to create Derivative Works based upon the Original Work or to exercise any of the rights granted in Section 1 herein, and any attempt to do so except under the terms of this License (or another written agreement between Licensor and You) is expressly prohibited by U.S. copyright law, the equivalent laws of other countries, and by international treaty. Therefore, by exercising any of the rights granted to You in Section 1 herein, You indicate Your acceptance of this License and all of its terms and conditions. 10) Termination for Patent Action. 
This License shall terminate automatically and You may no longer exercise any of the rights granted to You by this License as of the date You commence an action, including a cross-claim or counterclaim, against Licensor or any licensee alleging that the Original Work infringes a patent. This termination provision shall not apply for an action alleging patent infringement by combinations of the Original Work with other software or hardware.

11) Jurisdiction, Venue and Governing Law. Any action or suit relating to this License may be brought only in the courts of a jurisdiction wherein the Licensor resides or in which Licensor conducts its primary business, and under the laws of that jurisdiction excluding its conflict-of-law provisions. The application of the United Nations Convention on Contracts for the International Sale of Goods is expressly excluded. Any use of the Original Work outside the scope of this License or after its termination shall be subject to the requirements and penalties of the U.S. Copyright Act, 17 U.S.C. § 101 et seq., the equivalent laws of other countries, and international treaty. This section shall survive the termination of this License.

12) Attorneys Fees. In any action to enforce the terms of this License or seeking damages relating thereto, the prevailing party shall be entitled to recover its costs and expenses, including, without limitation, reasonable attorneys' fees and costs incurred in connection with such action, including any appeal of such action. This section shall survive the termination of this License.

13) Miscellaneous. This License represents the complete agreement concerning the subject matter hereof. If any provision of this License is held to be unenforceable, such provision shall be reformed only to the extent necessary to make it enforceable.

14) Definition of "You" in This License.
"You" throughout this License, whether in upper or lower case, means an individual or a legal entity exercising rights under, and complying with all of the terms of, this License. For legal entities, "You" includes any entity that controls, is controlled by, or is under common control with you. For purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. 15) Right to Use. You may use the Original Work in all ways not otherwise restricted or conditioned by this License or by law, and Licensor promises not to interfere with or be responsible for such uses by You. This license is Copyright (C) 2003-2004 Lawrence E. Rosen. All rights reserved. Permission is hereby granted to copy and distribute this license without modification. This license may not be modified without the express written permission of its copyright owner. 
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/MANIFEST.in0000644000175100017510000000017015172736046015115 0ustar00runnerrunnerinclude *.py include *.txt include *.rst include *.toml include scripts/*.py include MANIFEST.in include simplejson/*.h ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1777056810.6008108 simplejson-4.1.1/PKG-INFO0000644000175100017510000000730015172736053014454 0ustar00runnerrunnerMetadata-Version: 2.4 Name: simplejson Version: 4.1.1 Summary: Simple, fast, extensible JSON encoder/decoder for Python Home-page: https://github.com/simplejson/simplejson Author: Bob Ippolito Author-email: bob@redivi.com License: MIT OR AFL-2.1 Platform: any Classifier: Development Status :: 5 - Production/Stable Classifier: Environment :: WebAssembly :: Emscripten Classifier: Intended Audience :: Developers Classifier: Programming Language :: Python Classifier: Programming Language :: Python :: 2 Classifier: Programming Language :: Python :: 2.7 Classifier: Programming Language :: Python :: 3 Classifier: Programming Language :: Python :: 3.8 Classifier: Programming Language :: Python :: 3.9 Classifier: Programming Language :: Python :: 3.10 Classifier: Programming Language :: Python :: 3.11 Classifier: Programming Language :: Python :: 3.12 Classifier: Programming Language :: Python :: 3.13 Classifier: Programming Language :: Python :: 3.14 Classifier: Programming Language :: Python :: Implementation :: CPython Classifier: Programming Language :: Python :: Implementation :: GraalPy Classifier: Programming Language :: Python :: Implementation :: PyPy Classifier: Topic :: Software Development :: Libraries :: Python Modules Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*, !=3.6.*, !=3.7.* License-File: LICENSE.txt Dynamic: author Dynamic: author-email Dynamic: classifier Dynamic: description Dynamic: home-page Dynamic: license Dynamic: 
license-file Dynamic: platform Dynamic: requires-python Dynamic: summary simplejson ---------- simplejson is a simple, fast, complete, correct and extensible JSON encoder and decoder for Python 3.8+ with legacy support for Python 2.7. It is pure Python code with no dependencies, but includes an optional C extension for a serious speed boost. The latest documentation for simplejson can be read online here: https://simplejson.readthedocs.io/ simplejson is the externally maintained development version of the json library included with Python (since 2.6). This version is tested with Python 3.14 (including free-threaded builds) and maintains backwards compatibility with Python 3.8+. A legacy Python 2.7 wheel is also published. The encoder can be specialized to provide serialization in any kind of situation, without any special support by the objects to be serialized (somewhat like pickle). This is best done with the ``default`` kwarg to dumps. The decoder can handle incoming JSON strings of any specified encoding (UTF-8 by default). It can also be specialized to post-process JSON objects with the ``object_hook`` or ``object_pairs_hook`` kwargs. This is particularly useful for implementing protocols such as JSON-RPC that have a richer type system than JSON itself. For those of you that have legacy systems to maintain, there is a very old fork of simplejson in the `python2.2`_ branch that supports Python 2.2. This is based on a very old version of simplejson, is not maintained, and should only be used as a last resort. .. _python2.2: https://github.com/simplejson/simplejson/tree/python2.2 RawJSON ~~~~~~~ ``RawJSON`` allows embedding pre-encoded JSON strings into output without re-encoding them. This can be useful in advanced cases where JSON content is already serialized and re-encoding would be unnecessary. 
Example usage:: from simplejson import dumps, RawJSON payload = { "status": "ok", "data": RawJSON('{"a": 1, "b": 2}') } print(dumps(payload)) # Output: {"status": "ok", "data": {"a": 1, "b": 2}} **Caveat:** ``RawJSON`` should be used with care. It bypasses normal serialization and validation, and is not recommended for general use unless the embedded JSON content is fully trusted. ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/README.rst0000644000175100017510000000424615172736046015056 0ustar00runnerrunnersimplejson ---------- simplejson is a simple, fast, complete, correct and extensible JSON encoder and decoder for Python 3.8+ with legacy support for Python 2.7. It is pure Python code with no dependencies, but includes an optional C extension for a serious speed boost. The latest documentation for simplejson can be read online here: https://simplejson.readthedocs.io/ simplejson is the externally maintained development version of the json library included with Python (since 2.6). This version is tested with Python 3.14 (including free-threaded builds) and maintains backwards compatibility with Python 3.8+. A legacy Python 2.7 wheel is also published. The encoder can be specialized to provide serialization in any kind of situation, without any special support by the objects to be serialized (somewhat like pickle). This is best done with the ``default`` kwarg to dumps. The decoder can handle incoming JSON strings of any specified encoding (UTF-8 by default). It can also be specialized to post-process JSON objects with the ``object_hook`` or ``object_pairs_hook`` kwargs. This is particularly useful for implementing protocols such as JSON-RPC that have a richer type system than JSON itself. For those of you that have legacy systems to maintain, there is a very old fork of simplejson in the `python2.2`_ branch that supports Python 2.2. 
This is based on a very old version of simplejson, is not maintained, and should only be used as a last resort. .. _python2.2: https://github.com/simplejson/simplejson/tree/python2.2 RawJSON ~~~~~~~ ``RawJSON`` allows embedding pre-encoded JSON strings into output without re-encoding them. This can be useful in advanced cases where JSON content is already serialized and re-encoding would be unnecessary. Example usage:: from simplejson import dumps, RawJSON payload = { "status": "ok", "data": RawJSON('{"a": 1, "b": 2}') } print(dumps(payload)) # Output: {"status": "ok", "data": {"a": 1, "b": 2}} **Caveat:** ``RawJSON`` should be used with care. It bypasses normal serialization and validation, and is not recommended for general use unless the embedded JSON content is fully trusted. ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/conf.py0000644000175100017510000001306215172736046014662 0ustar00runnerrunner# -*- coding: utf-8 -*- # # simplejson documentation build configuration file, created by # sphinx-quickstart on Fri Sep 26 18:58:30 2008. # # This file is execfile()d with the current directory set to its containing dir. # # The contents of this file are pickled, so don't put values in the namespace # that aren't pickleable (module imports are okay, they're removed automatically). # # All configuration values have a default value; values that are commented out # serve to show the default value. import sys, os # If your extensions are in another directory, add it here. If the directory # is relative to the documentation root, use os.path.abspath to make it # absolute, like shown here. #sys.path.append(os.path.abspath('some/directory')) # General configuration # --------------------- # Add any Sphinx extension module names here, as strings. They can be extensions # coming with Sphinx (named 'sphinx.ext.*') or your custom ones. 
extensions = [] # Add any paths that contain templates here, relative to this directory. templates_path = ['_templates'] # The suffix of source filenames. source_suffix = '.rst' # The master toctree document. master_doc = 'index' # General substitutions. project = 'simplejson' copyright = '2025, Bob Ippolito' # The default replacements for |version| and |release|, also used in various # other places throughout the built documents. # # The short X.Y version. version = '4.1' # The full version, including alpha/beta/rc tags. release = '4.1.1' # There are two options for replacing |today|: either, you set today to some # non-false value, then it is used: #today = '' # Else, today_fmt is used as the format for a strftime call. today_fmt = '%B %d, %Y' # List of documents that shouldn't be included in the build. #unused_docs = [] # List of directories, relative to source directories, that shouldn't be searched # for source files. #exclude_dirs = [] # The reST default role (used for this markup: `text`) to use for all documents. #default_role = None # If true, '()' will be appended to :func: etc. cross-reference text. #add_function_parentheses = True # If true, the current module name will be prepended to all description # unit titles (such as .. function::). #add_module_names = True # If true, sectionauthor and moduleauthor directives will be shown in the # output. They are ignored by default. #show_authors = False # The name of the Pygments (syntax highlighting) style to use. pygments_style = 'sphinx' # Options for HTML output # ----------------------- # The style sheet to use for HTML and HTML Help pages. A file of that name # must exist either in Sphinx' static/ path, or in one of the custom paths # given in html_static_path. #html_style = 'default.css' # The name for this set of Sphinx documents. If None, it defaults to # " v documentation". #html_title = None # A shorter title for the navigation bar. Default is the same as html_title. 
#html_short_title = None # The name of an image file (within the static path) to place at the top of # the sidebar. #html_logo = None # The name of an image file (within the static path) to use as favicon of the # docs. This file should be a Windows icon file (.ico) being 16x16 or 32x32 # pixels large. #html_favicon = None # Add any paths that contain custom static files (such as style sheets) here, # relative to this directory. They are copied after the builtin static files, # so a file named "default.css" will overwrite the builtin "default.css". html_static_path = ['_static'] # If not '', a 'Last updated on:' timestamp is inserted at every page bottom, # using the given strftime format. html_last_updated_fmt = '%b %d, %Y' # If true, SmartyPants will be used to convert quotes and dashes to # typographically correct entities. #html_use_smartypants = True # Custom sidebar templates, maps document names to template names. #html_sidebars = {} # Additional templates that should be rendered to pages, maps page names to # template names. #html_additional_pages = {} # If false, no module index is generated. html_use_modindex = False # If false, no index is generated. #html_use_index = True # If true, the index is split into individual pages for each letter. #html_split_index = False # If true, the reST sources are included in the HTML build as _sources/. #html_copy_source = True # If true, an OpenSearch description file will be output, and all pages will # contain a tag referring to it. The value of this option must be the # base URL from which the finished HTML is served. #html_use_opensearch = '' # If nonempty, this is the file name suffix for HTML files (e.g. ".xhtml"). html_file_suffix = '.html' # Output file base name for HTML help builder. htmlhelp_basename = 'simplejsondoc' # Options for LaTeX output # ------------------------ # The paper size ('letter' or 'a4'). #latex_paper_size = 'letter' # The font size ('10pt', '11pt' or '12pt'). 
#latex_font_size = '10pt' # Grouping the document tree into LaTeX files. List of tuples # (source start file, target name, title, author, document class [howto/manual]). latex_documents = [ ('index', 'simplejson.tex', 'simplejson Documentation', 'Bob Ippolito', 'manual'), ] # The name of an image file (relative to this directory) to place at the top of # the title page. #latex_logo = None # For "manual" documents, if this is true, then toplevel headings are parts, # not chapters. #latex_use_parts = False # Additional stuff for the LaTeX preamble. #latex_preamble = '' # Documents to append as an appendix to all manuals. #latex_appendices = [] # If false, no module index is generated. #latex_use_modindex = True ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/index.rst0000644000175100017510000012046115172736046015226 0ustar00runnerrunner:mod:`simplejson` --- JSON encoder and decoder ============================================== .. module:: simplejson :synopsis: Encode and decode the JSON format. .. moduleauthor:: Bob Ippolito .. sectionauthor:: Bob Ippolito `JSON (JavaScript Object Notation) `_, specified by :rfc:`7159` (which obsoletes :rfc:`4627`) and by `ECMA-404 `_, is a lightweight data interchange format inspired by `JavaScript `_ object literal syntax (although it is not a strict subset of JavaScript [#rfc-errata]_ ). :mod:`simplejson` is a simple, fast, complete, correct and extensible JSON encoder and decoder for Python. It is pure Python code with no dependencies, but includes an optional C extension for a serious speed boost. :mod:`simplejson` exposes an API familiar to users of the standard library :mod:`marshal` and :mod:`pickle` modules. It is the externally maintained version of the :mod:`json` library, but maintains compatibility with Python 3.8 through 3.14 (including free-threaded builds) as well as legacy Python 2.7. 
Development of simplejson happens on GitHub:
http://github.com/simplejson/simplejson

Encoding basic Python object hierarchies::

    >>> import simplejson as json
    >>> json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
    '["foo", {"bar": ["baz", null, 1.0, 2]}]'
    >>> print(json.dumps("\"foo\bar"))
    "\"foo\bar"
    >>> print(json.dumps(u'\u1234'))
    "\u1234"
    >>> print(json.dumps('\\'))
    "\\"
    >>> print(json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True))
    {"a": 0, "b": 0, "c": 0}
    >>> from simplejson.compat import StringIO
    >>> io = StringIO()
    >>> json.dump(['streaming API'], io)
    >>> io.getvalue()
    '["streaming API"]'

Compact encoding::

    >>> import simplejson as json
    >>> obj = [1,2,3,{'4': 5, '6': 7}]
    >>> json.dumps(obj, separators=(',', ':'), sort_keys=True)
    '[1,2,3,{"4":5,"6":7}]'

Pretty printing::

    >>> import simplejson as json
    >>> print(json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4 * ' '))
    {
        "4": 5,
        "6": 7
    }

Decoding JSON::

    >>> import simplejson as json
    >>> obj = ['foo', {'bar': ['baz', None, 1.0, 2]}]
    >>> json.loads('["foo", {"bar":["baz", null, 1.0, 2]}]') == obj
    True
    >>> json.loads('"\\"foo\\bar"') == '"foo\x08ar'
    True
    >>> from simplejson.compat import StringIO
    >>> io = StringIO('["streaming API"]')
    >>> json.load(io)[0] == 'streaming API'
    True

Using Decimal instead of float::

    >>> import simplejson as json
    >>> from decimal import Decimal
    >>> json.loads('1.1', use_decimal=True) == Decimal('1.1')
    True
    >>> json.dumps(Decimal('1.1'), use_decimal=True) == '1.1'
    True

Specializing JSON object decoding::

    >>> import simplejson as json
    >>> def as_complex(dct):
    ...     if '__complex__' in dct:
    ...         return complex(dct['real'], dct['imag'])
    ...     return dct
    ...
    >>> json.loads('{"__complex__": true, "real": 1, "imag": 2}',
    ...     object_hook=as_complex)
    (1+2j)
    >>> import decimal
    >>> json.loads('1.1', parse_float=decimal.Decimal) == decimal.Decimal('1.1')
    True

Specializing JSON object encoding::

    >>> import simplejson as json
    >>> def encode_complex(obj):
    ...     if isinstance(obj, complex):
    ...         return [obj.real, obj.imag]
    ...     raise TypeError(repr(obj) + " is not JSON serializable")
    ...
    >>> json.dumps(2 + 1j, default=encode_complex)
    '[2.0, 1.0]'
    >>> json.JSONEncoder(default=encode_complex).encode(2 + 1j)
    '[2.0, 1.0]'
    >>> ''.join(json.JSONEncoder(default=encode_complex).iterencode(2 + 1j))
    '[2.0, 1.0]'

.. highlight:: bash

Using :mod:`simplejson.tool` from the shell to validate and pretty-print::

    $ echo '{"json":"obj"}' | python -m simplejson.tool
    {
        "json": "obj"
    }
    $ echo '{ 1.2:3.4}' | python -m simplejson.tool
    Expecting property name enclosed in double quotes: line 1 column 3 (char 2)

.. highlight:: python

Parsing multiple documents serialized as JSON lines (newline-delimited JSON)::

    >>> import simplejson as json
    >>> def loads_lines(docs):
    ...     for doc in docs.splitlines():
    ...         yield json.loads(doc)
    ...
    >>> sum(doc["count"] for doc in loads_lines('{"count":1}\n{"count":2}\n{"count":3}\n'))
    6

Serializing multiple objects to JSON lines (newline-delimited JSON)::

    >>> import simplejson as json
    >>> def dumps_lines(objs):
    ...     for obj in objs:
    ...         yield json.dumps(obj, separators=(',',':')) + '\n'
    ...
    >>> ''.join(dumps_lines([{'count': 1}, {'count': 2}, {'count': 3}]))
    '{"count":1}\n{"count":2}\n{"count":3}\n'


Basic Usage
-----------

.. function:: dump(obj, fp, skipkeys=False, ensure_ascii=True, \
                   check_circular=True, allow_nan=False, cls=None, \
                   indent=None, separators=None, encoding='utf-8', \
                   default=None, use_decimal=True, \
                   namedtuple_as_object=True, tuple_as_array=True, \
                   bigint_as_string=False, sort_keys=False, \
                   item_sort_key=None, for_json=None, ignore_nan=False, \
                   int_as_string_bitcount=None, iterable_as_array=False, **kw)

   Serialize *obj* as a JSON formatted stream to *fp* (a
   ``.write()``-supporting file-like object) using this
   :ref:`conversion table `.

   The :mod:`simplejson` module will produce :class:`str` objects in Python 3,
   not :class:`bytes` objects. Therefore, ``fp.write()`` must support
   :class:`str` input.
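
The :class:`str` requirement above can be illustrated with a small sketch for Python 3. It writes once to a text buffer and once to a binary buffer wrapped in :class:`io.TextIOWrapper`. The helper names ``dump_to_text`` and ``dump_to_bytes`` are hypothetical, and the fallback to the standard library :mod:`json` is an assumption made only so the snippet runs where simplejson is not installed.

```python
import io

try:
    import simplejson as json
except ImportError:  # assumption: stdlib fallback so the sketch runs anywhere
    import json


def dump_to_text(obj):
    """Serialize obj into a text buffer, which accepts str chunks directly."""
    buf = io.StringIO()
    json.dump(obj, buf)
    return buf.getvalue()


def dump_to_bytes(obj, encoding="utf-8"):
    """Serialize obj into a binary buffer by wrapping it in a text layer."""
    raw = io.BytesIO()
    # dump() writes str, so the binary buffer needs an encoding wrapper.
    wrapper = io.TextIOWrapper(raw, encoding=encoding, newline="")
    json.dump(obj, wrapper)
    wrapper.flush()      # push buffered text down into the BytesIO
    data = raw.getvalue()
    wrapper.detach()     # keep the wrapper from closing raw when collected
    return data


text = dump_to_text(["streaming API"])
payload = dump_to_bytes({"a": 1})
```

Wrapping the binary stream keeps the encoding decision in one place instead of encoding each chunk by hand.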
See :func:`dumps` for a description of each argument. The only difference is that this function writes the resulting JSON document to *fp* instead of returning it. .. note:: When using Python 2, if *ensure_ascii* is set to false, some chunks written to *fp* may be :class:`unicode` instances, subject to normal Python :class:`str` to :class:`unicode` coercion rules. Unless ``fp.write()`` explicitly understands :class:`unicode` (as in :func:`codecs.getwriter`) this is likely to cause an error. It's best to leave the default settings, because they are safe and it is highly optimized. .. function:: dumps(obj, skipkeys=False, ensure_ascii=True, \ check_circular=True, allow_nan=False, cls=None, \ indent=None, separators=None, encoding='utf-8', \ default=None, use_decimal=True, \ namedtuple_as_object=True, tuple_as_array=True, \ bigint_as_string=False, sort_keys=False, \ item_sort_key=None, for_json=None, ignore_nan=False, \ int_as_string_bitcount=None, iterable_as_array=False, **kw) Serialize *obj* to a JSON formatted :class:`str`. If *skipkeys* is true (default: ``False``), then dict keys that are not of a basic type (:class:`str`, :class:`int`, :class:`long`, :class:`float`, :class:`bool`, ``None``) will be skipped instead of raising a :exc:`TypeError`. .. note:: When using Python 2, both :class:`str` and :class:`unicode` are considered to be basic types that represent text. If *ensure_ascii* is false (default: ``True``), then the output may contain non-ASCII characters, so long as they do not need to be escaped by JSON. When it is true, all non-ASCII characters are escaped. .. note:: When using Python 2, if *ensure_ascii* is set to false, the result may be a :class:`unicode` object. By default, as a memory optimization, the result would be a :class:`str` object. If *check_circular* is false (default: ``True``), then the circular reference check for container types will be skipped and a circular reference will result in an :exc:`OverflowError` (or worse). 
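
As a rough illustration of the default behaviour (with *check_circular* left at ``True``), the sketch below shows a self-referencing list being rejected with a :exc:`ValueError` while a merely nested one is accepted. The ``raises_on_cycle`` helper is hypothetical, and the fallback to the standard library :mod:`json` is an assumption made only so the snippet runs where simplejson is not installed. The inputs contain no floats, so the *allow_nan* check cannot be the source of the error.

```python
try:
    import simplejson as json
except ImportError:  # assumption: stdlib fallback so the sketch runs anywhere
    import json


def raises_on_cycle(obj):
    """Return True if dumps() rejects obj with the default check_circular=True."""
    try:
        json.dumps(obj)
    except ValueError:
        # A container that contains itself is reported as ValueError.
        return True
    return False


nested = [1, [2, [3]]]   # deep, but no container refers back to itself
cyclic = [1, 2]
cyclic.append(cyclic)    # the list now contains itself

nested_rejected = raises_on_cycle(nested)
cyclic_rejected = raises_on_cycle(cyclic)
```

Leaving the check enabled turns an unbounded recursion into an ordinary exception that calling code can handle.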
If *allow_nan* is false (default: ``False``), then it will be a :exc:`ValueError` to serialize out of range :class:`float` values (``nan``, ``inf``, ``-inf``) in strict compliance of the original JSON specification. If *allow_nan* is true, their JavaScript equivalents will be used (``NaN``, ``Infinity``, ``-Infinity``). See also *ignore_nan* for ECMA-262 compliant behavior. .. versionchanged:: 3.19.0 The default for *allow_nan* was changed to False for better spec compliance. If *indent* is a string, then JSON array elements and object members will be pretty-printed with a newline followed by that string repeated for each level of nesting. ``None`` (the default) selects the most compact representation without any newlines. For backwards compatibility with versions of simplejson earlier than 2.1.0, an integer is also accepted and is converted to a string with that many spaces. If specified, *separators* should be an ``(item_separator, key_separator)`` tuple. The default is ``(', ', ': ')`` if *indent* is ``None`` and ``(',', ': ')`` otherwise. To get the most compact JSON representation, you should specify ``(',', ':')`` to eliminate whitespace. If *encoding* is not ``None``, then all input :class:`bytes` objects in Python 3 and 8-bit strings in Python 2 will be transformed into unicode using that encoding prior to JSON-encoding. The default is ``'utf-8'``. If *encoding* is ``None``, then all :class:`bytes` objects will be passed to the *default* function in Python 3 .. versionchanged:: 3.15.0 ``encoding=None`` disables serializing :class:`bytes` by default in Python 3. *default(obj)* is a function that should return a serializable version of *obj* or raise :exc:`TypeError`. The default implementation always raises :exc:`TypeError`. To use a custom :class:`JSONEncoder` subclass (e.g. one that overrides the :meth:`default` method to serialize additional types), specify it with the *cls* kwarg. .. note:: Subclassing is not recommended. 
Use the *default* kwarg or *for_json* instead. This is faster and more portable. If *use_decimal* is true (default: ``True``) then :class:`decimal.Decimal` will be natively serialized to JSON with full precision. If *namedtuple_as_object* is true (default: ``True``), objects with ``_asdict()`` methods will be encoded as JSON objects. If *tuple_as_array* is true (default: ``True``), :class:`tuple` (and subclasses) will be encoded as JSON arrays. If *iterable_as_array* is true (default: ``False``), any object not in the above table that implements ``__iter__()`` will be encoded as a JSON array. .. versionchanged:: 3.8.0 *iterable_as_array* is new in 3.8.0. If *bigint_as_string* is true (default: ``False``), :class:`int` ``2**53`` and higher or lower than ``-2**53`` will be encoded as strings. This is to avoid the rounding that happens in Javascript otherwise. Note that this option loses type information, so use with extreme caution. See also *int_as_string_bitcount*. If *sort_keys* is true (not the default), then the output of dictionaries will be sorted by key; this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis. If *item_sort_key* is a callable (not the default), then the output of dictionaries will be sorted with it. The callable will be used like this: ``sorted(dct.items(), key=item_sort_key)``. This option takes precedence over *sort_keys*. If *for_json* is true (not the default), objects with a ``for_json()`` method will use the return value of that method for encoding as JSON instead of the object. If *ignore_nan* is true (default: ``False``), then out of range :class:`float` values (``nan``, ``inf``, ``-inf``) will be serialized as ``null`` in compliance with the ECMA-262 specification. If true, this will override *allow_nan*. If *int_as_string_bitcount* is a positive number ``n`` (default: ``None``), :class:`int` ``2**n`` and higher or lower than ``-2**n`` will be encoded as strings. 
This is to avoid the rounding that happens in JavaScript otherwise. Note that this option loses type information, so use with extreme caution. See also *bigint_as_string* (which is equivalent to ``int_as_string_bitcount=53``). .. note:: JSON is not a framed protocol so unlike :mod:`pickle` or :mod:`marshal` it does not make sense to serialize more than one JSON document without some container protocol to delimit them. .. function:: load(fp, encoding='utf-8', cls=None, object_hook=None, \ parse_float=None, parse_int=None, \ parse_constant=None, object_pairs_hook=None, \ use_decimal=None, allow_nan=False, **kw) Deserialize *fp* (a ``.read()``-supporting file-like object containing a JSON document) to a Python object using this :ref:`conversion table <json-to-py-table>`. :exc:`JSONDecodeError` will be raised if the given JSON document is not valid. If *fp.read()* returns :class:`bytes`, such as a file opened in binary mode, then an appropriate *encoding* should be specified (the default is UTF-8). .. note:: :func:`load` will read the rest of the file-like object as a string and then call :func:`loads`. It does not stop at the end of the first valid JSON document it finds and it will raise an error if there is anything other than whitespace after the document. Except for files containing only one JSON document, it is recommended to use :func:`loads`. .. note:: In Python 2, :class:`str` is considered to be :class:`bytes` and this is the default behavior of all :class:`file` objects. If the contents of *fp* are encoded with an ASCII based encoding other than UTF-8 (e.g. latin-1), then an appropriate *encoding* name must be specified. Encodings that are not ASCII based (such as UCS-2) are not allowed, and should be wrapped with ``codecs.getreader(encoding)(fp)``, or decoded to a :class:`unicode` object and passed to :func:`loads`. The default setting of ``'utf-8'`` is fastest and should be used whenever possible.
If *fp.read()* returns :class:`str` then decoded JSON strings that contain only ASCII characters may be parsed as :class:`str` for performance and memory reasons. If your code expects only :class:`unicode` the appropriate solution is to wrap fp with a reader as demonstrated above. See :func:`loads` for a description of each argument. The only difference is that this function reads the JSON document from a file-like object *fp* instead of a :class:`str` or :class:`bytes`. .. function:: loads(s, encoding='utf-8', cls=None, object_hook=None, \ parse_float=None, parse_int=None, \ parse_constant=None, object_pairs_hook=None, \ use_decimal=None, allow_nan=False, **kw) Deserialize *s* (a :class:`str` or :class:`unicode` instance containing a JSON document) to a Python object. :exc:`JSONDecodeError` will be raised if the given JSON document is not valid. .. note:: In Python 2, :class:`str` is considered to be :class:`bytes` as above, if your JSON is using an encoding that is not ASCII based, then you must decode to :class:`unicode` first. If *s* is a :class:`str` instance and is encoded with an ASCII based encoding other than UTF-8 (e.g. latin-1), then an appropriate *encoding* name must be specified. Encodings that are not ASCII based (such as UCS-2) are not allowed and should be decoded to :class:`unicode` first. Additionally, decoded JSON strings that contain only ASCII characters may be parsed as :class:`str` instead of :class:`unicode` for performance and memory reasons. If your code expects only :class:`unicode` the appropriate solution is decode *s* to :class:`unicode` prior to calling :func:`loads`. *object_hook* is an optional function that will be called with the result of any object literal decode (a :class:`dict`). The return value of *object_hook* will be used instead of the :class:`dict`. This feature can be used to implement custom decoders (e.g. `JSON-RPC `_ class hinting). 
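A minimal sketch of class hinting with *object_hook* follows. The ``__type__`` marker and the ``decode_date`` helper are hypothetical names chosen for illustration, not part of simplejson; the standard library fallback is an assumption made only so the example runs without simplejson installed.

```python
import datetime

try:
    import simplejson as json
except ImportError:
    # Assumption: object_hook works the same way in the stdlib module.
    import json


def decode_date(dct):
    """Hypothetical hook: turn hinted dicts into datetime.date objects."""
    if dct.get("__type__") == "date":
        return datetime.date(dct["year"], dct["month"], dct["day"])
    # Anything without the hint is returned unchanged.
    return dct


doc = '{"released": {"__type__": "date", "year": 2026, "month": 4, "day": 24}}'
result = json.loads(doc, object_hook=decode_date)
print(result)
```

The hook is called for the innermost object first, so the outer dict already contains the converted value when the hook sees it.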
*object_pairs_hook* is an optional function that will be called with the result of any object literal decode with an ordered list of pairs. The return value of *object_pairs_hook* will be used instead of the :class:`dict`. This feature can be used to implement custom decoders that rely on the order that the key and value pairs are decoded (for example, :class:`collections.OrderedDict` will remember the order of insertion). If *object_hook* is also defined, the *object_pairs_hook* takes priority. *parse_float*, if specified, will be called with the string of every JSON float to be decoded. By default, this is equivalent to ``float(num_str)``. This can be used to use another datatype or parser for JSON floats (e.g. :class:`decimal.Decimal`). *parse_int*, if specified, will be called with the string of every JSON int to be decoded. By default, this is equivalent to ``int(num_str)``. This can be used to use another datatype or parser for JSON integers (e.g. :class:`float`). .. versionchanged:: 3.19.0 The integer to string conversion length limitation introduced in Python 3.11 has been backported. An attempt to parse an integer with more than 4300 digits will result in an exception unless a suitable alternative parser is specified (e.g. :class:`decimal.Decimal`). If *use_decimal* is true (default: ``False``) then *parse_float* is set to :class:`decimal.Decimal`. This is a convenience for parity with the :func:`dump` parameter. To use a custom :class:`JSONDecoder` subclass, specify it with the ``cls`` kwarg. Additional keyword arguments will be passed to the constructor of the class. You probably shouldn't do this. .. note:: Subclassing is not recommended. You should use *object_hook* or *object_pairs_hook*. This is faster and more portable than subclassing.
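The hooks above can be combined. The following sketch uses *object_pairs_hook* to reject repeated names and *parse_float* to keep decimal precision. ``reject_duplicates`` is a hypothetical helper written for this example, and the standard library fallback is an assumption made only so the example runs without simplejson installed.

```python
from decimal import Decimal

try:
    import simplejson as json
except ImportError:
    # Assumption: both hooks behave the same in the stdlib module.
    import json


def reject_duplicates(pairs):
    """Hypothetical hook: build a dict, failing on a repeated name."""
    seen = {}
    for key, value in pairs:
        if key in seen:
            raise ValueError("duplicate key: %r" % (key,))
        seen[key] = value
    return seen


# parse_float receives the literal text, so trailing zeros survive.
price = json.loads('{"sku": 1, "price": 2.50}',
                   object_pairs_hook=reject_duplicates,
                   parse_float=Decimal)

try:
    json.loads('{"x": 1, "x": 2}', object_pairs_hook=reject_duplicates)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

print(price, duplicate_rejected)
```

Because the hook sees every pair in document order, it is also the place to build an ordered mapping when order matters.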
*allow_nan*, if True (default false), will allow the parser to accept the non-standard floats ``NaN``, ``Infinity``, and ``-Infinity``. .. versionchanged:: 3.19.0 This argument was added to make it possible to use the legacy behavior now that the parser is more strict about compliance to the standard. *parse_constant*, if specified, will be called with one of the following strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``. It is not recommended to use this feature, as it is rare to parse non-compliant JSON containing these values. Encoders and decoders --------------------- .. class:: JSONDecoder(encoding='utf-8', object_hook=None, parse_float=None, \ parse_int=None, parse_constant=None, \ object_pairs_hook=None, strict=True, allow_nan=False) Simple JSON decoder. Performs the following translations in decoding by default: .. _json-to-py-table: +---------------+-----------+-----------+ | JSON | Python 2 | Python 3 | +===============+===========+===========+ | object | dict | dict | +---------------+-----------+-----------+ | array | list | list | +---------------+-----------+-----------+ | string | unicode | str | +---------------+-----------+-----------+ | number (int) | int, long | int | +---------------+-----------+-----------+ | number (real) | float | float | +---------------+-----------+-----------+ | true | True | True | +---------------+-----------+-----------+ | false | False | False | +---------------+-----------+-----------+ | null | None | None | +---------------+-----------+-----------+ When *allow_nan* is True, it also understands ``NaN``, ``Infinity``, and ``-Infinity`` as their corresponding ``float`` values, which is outside the JSON spec. *encoding* determines the encoding used to interpret any :class:`str` objects decoded by this instance (``'utf-8'`` by default). It has no effect when decoding :class:`unicode` objects. 
Note that currently only encodings that are a superset of ASCII work, strings of other encodings should be passed in as :class:`unicode`. *object_hook* is an optional function that will be called with the result of every JSON object decoded and its return value will be used in place of the given :class:`dict`. This can be used to provide custom deserializations (e.g. to support JSON-RPC class hinting). *object_pairs_hook* is an optional function that will be called with the result of any object literal decode with an ordered list of pairs. The return value of *object_pairs_hook* will be used instead of the :class:`dict`. This feature can be used to implement custom decoders that rely on the order that the key and value pairs are decoded (for example, :class:`collections.OrderedDict` will remember the order of insertion). If *object_hook* is also defined, the *object_pairs_hook* takes priority. *parse_float*, if specified, will be called with the string of every JSON float to be decoded. By default, this is equivalent to ``float(num_str)``. This can be used to use another datatype or parser for JSON floats (e.g. :class:`decimal.Decimal`). *parse_int*, if specified, will be called with the string of every JSON int to be decoded. By default, this is equivalent to ``int(num_str)``. This can be used to use another datatype or parser for JSON integers (e.g. :class:`float`). .. versionchanged:: 3.19.0 The integer to string conversion length limitation introduced in Python 3.11 has been backported. An attempt to parse an integer with more than 4300 digits will result in an exception unless a suitable alternative parser is specified (e.g. :class:`decimal.Decimal`) *parse_constant*, if specified, will be called with one of the following strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``. It is not recommended to use this feature, as it is rare to parse non-compliant JSON containing these values. 
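As a sketch of the "suitable alternative parser" mentioned above, a reusable :class:`JSONDecoder` can be built with ``parse_int=Decimal`` so that very long integer literals are decoded without going through ``int``. The 5000-digit input is illustrative, and the standard library fallback is an assumption made only so the example runs without simplejson installed.

```python
from decimal import Decimal

try:
    import simplejson as json
except ImportError:
    # Assumption: parse_int is honored the same way by the stdlib decoder.
    import json

# A 5000-digit integer literal, longer than the 4300-digit limit.
digits = "9" * 5000

# The decoder instance can be reused for many documents.
decoder = json.JSONDecoder(parse_int=Decimal)
value = decoder.decode(digits)

print(type(value).__name__, len(str(value)))
```

Constructing a :class:`decimal.Decimal` from a string is exact, so no digits are lost regardless of the active decimal context.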
*strict* controls the parser's behavior when it encounters an invalid control character in a string. The default setting of ``True`` means that unescaped control characters are parse errors; if ``False``, control characters will be allowed in strings. *allow_nan* when True (not the default), the decoder will allow ``NaN``, ``Infinity``, and ``-Infinity`` as their corresponding floats. .. versionchanged:: 3.19.0 This argument was added to make it behave closer to the spec by default. The previous behavior can be restored by setting this to True. .. method:: decode(s) Return the Python representation of the JSON document *s*. See :func:`loads` for details. It is preferable to use that rather than this class. .. method:: raw_decode(s[, idx=0]) Decode a JSON document from *s* (a :class:`str` or :class:`unicode` beginning with a JSON document) starting from the index *idx* and return a 2-tuple of the Python representation and the index in *s* where the document ended. This can be used to decode a JSON document from a string that may have extraneous data at the end, or to decode a string that has a series of JSON objects. :exc:`JSONDecodeError` will be raised if the given JSON document is not valid. .. class:: JSONEncoder(skipkeys=False, ensure_ascii=True, \ check_circular=True, allow_nan=False, sort_keys=False, \ indent=None, separators=None, encoding='utf-8', \ default=None, use_decimal=True, \ namedtuple_as_object=True, tuple_as_array=True, \ bigint_as_string=False, item_sort_key=None, \ for_json=False, ignore_nan=False, \ int_as_string_bitcount=None, iterable_as_array=False) Extensible JSON encoder for Python data structures. Supports the following objects and types by default: ..
_py-to-json-table: +-------------------+---------------+ | Python | JSON | +===================+===============+ | dict, namedtuple | object | +-------------------+---------------+ | list, tuple | array | +-------------------+---------------+ | str, unicode | string | +-------------------+---------------+ | int, long, float | number | +-------------------+---------------+ | True | true | +-------------------+---------------+ | False | false | +-------------------+---------------+ | None | null | +-------------------+---------------+ .. note:: The JSON format only permits strings to be used as object keys, thus any Python dicts to be encoded should only have string keys. For backwards compatibility, several other types are automatically coerced to strings: int, long, float, Decimal, bool, and None. It is error-prone to rely on this behavior, so avoid it when possible. Dictionaries with other types used as keys should be pre-processed or wrapped in another type with an appropriate `for_json` method to transform the keys during encoding. When *allow_nan* is True, it also understands ``NaN``, ``Infinity``, and ``-Infinity`` as their corresponding ``float`` values, which is outside the JSON spec. To extend this to recognize other objects, subclass and implement a :meth:`default` method with another method that returns a serializable object for ``o`` if possible, otherwise it should call the superclass implementation (to raise :exc:`TypeError`). .. note:: Subclassing is not recommended. You should use the *default* or *for_json* kwarg. This is faster and more portable than subclassing. If *skipkeys* is false (the default), then it is a :exc:`TypeError` to attempt encoding of keys that are not str, int, long, float, Decimal, bool, or None. If *skipkeys* is true, such items are simply skipped. If *ensure_ascii* is true (the default), the output is guaranteed to be :class:`str` objects with all incoming unicode characters escaped. 
If *ensure_ascii* is false, the output will be a unicode object. If *check_circular* is true (the default), then lists, dicts, and custom encoded objects will be checked for circular references during encoding to prevent an infinite recursion (which would cause an :exc:`OverflowError`). Otherwise, no such check takes place. If *allow_nan* is true (not the default), then ``NaN``, ``Infinity``, and ``-Infinity`` will be encoded as such. This behavior is not JSON specification compliant. Otherwise, it will be a :exc:`ValueError` to encode such floats. See also *ignore_nan* for ECMA-262 compliant behavior. .. versionchanged:: 3.19.0 This default is now False to make it behave closer to the spec. The previous behavior can be restored by setting this to True. If *sort_keys* is true (not the default), then the output of dictionaries will be sorted by key; this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis. If *item_sort_key* is a callable (not the default), then the output of dictionaries will be sorted with it. The callable will be used like this: ``sorted(dct.items(), key=item_sort_key)``. This option takes precedence over *sort_keys*. If *indent* is a string, then JSON array elements and object members will be pretty-printed with a newline followed by that string repeated for each level of nesting. ``None`` (the default) selects the most compact representation without any newlines. For backwards compatibility with versions of simplejson earlier than 2.1.0, an integer is also accepted and is converted to a string with that many spaces. If specified, *separators* should be an ``(item_separator, key_separator)`` tuple. The default is ``(', ', ': ')`` if *indent* is ``None`` and ``(',', ': ')`` otherwise. To get the most compact JSON representation, you should specify ``(',', ':')`` to eliminate whitespace. If specified, *default* should be a function that gets called for objects that can't otherwise be serialized.
It should return a JSON encodable version of the object or raise a :exc:`TypeError`. If *encoding* is not ``None``, then all input :class:`bytes` objects in Python 3 and 8-bit strings in Python 2 will be transformed into unicode using that encoding prior to JSON-encoding. The default is ``'utf-8'``. If *encoding* is ``None``, then all :class:`bytes` objects will be passed to the :meth:`default` method in Python 3. .. versionchanged:: 3.15.0 ``encoding=None`` disables serializing :class:`bytes` by default in Python 3. If *namedtuple_as_object* is true (default: ``True``), objects with ``_asdict()`` methods will be encoded as JSON objects. If *tuple_as_array* is true (default: ``True``), :class:`tuple` (and subclasses) will be encoded as JSON arrays. If *iterable_as_array* is true (default: ``False``), any object not in the above table that implements ``__iter__()`` will be encoded as a JSON array. .. versionchanged:: 3.8.0 *iterable_as_array* is new in 3.8.0. If *bigint_as_string* is true (default: ``False``), :class:`int` ``2**53`` and higher or lower than ``-2**53`` will be encoded as strings. This is to avoid the rounding that happens in JavaScript otherwise. Note that this option loses type information, so use with extreme caution. If *for_json* is true (default: ``False``), objects with a ``for_json()`` method will use the return value of that method for encoding as JSON instead of the object. If *ignore_nan* is true (default: ``False``), then out of range :class:`float` values (``nan``, ``inf``, ``-inf``) will be serialized as ``null`` in compliance with the ECMA-262 specification. If true, this will override *allow_nan*. .. method:: default(o) Implement this method in a subclass such that it returns a serializable object for *o*, or calls the base implementation (to raise a :exc:`TypeError`).
For example, to support arbitrary iterators, you could implement default like this:: def default(self, o): try: iterable = iter(o) except TypeError: pass else: return list(iterable) return JSONEncoder.default(self, o) .. note:: Subclassing is not recommended. You should implement this as a function and pass it to the *default* kwarg of :func:`dumps`. This is faster and more portable than subclassing. The semantics are the same, but without the self argument or the call to the super implementation. .. method:: encode(o) Return a JSON string representation of a Python data structure, *o*. For example:: >>> import simplejson as json >>> json.JSONEncoder().encode({"foo": ["bar", "baz"]}) '{"foo": ["bar", "baz"]}' .. method:: iterencode(o) Encode the given object, *o*, and yield each string representation as available. For example:: for chunk in JSONEncoder().iterencode(bigobject): mysocket.write(chunk) Note that :meth:`encode` has much better performance than :meth:`iterencode`. .. class:: JSONEncoderForHTML(skipkeys=False, ensure_ascii=True, \ check_circular=True, allow_nan=False, \ sort_keys=False, indent=None, separators=None, \ encoding='utf-8', \ default=None, use_decimal=True, \ namedtuple_as_object=True, \ tuple_as_array=True, \ bigint_as_string=False, item_sort_key=None, \ for_json=True, ignore_nan=False, \ int_as_string_bitcount=None) Subclass of :class:`JSONEncoder` that escapes &, <, and > for embedding in HTML. It also escapes the characters U+2028 (LINE SEPARATOR) and U+2029 (PARAGRAPH SEPARATOR), irrespective of the *ensure_ascii* setting, as these characters are not valid in JavaScript strings (see http://timelessrepo.com/json-isnt-a-javascript-subset). Exceptions ---------- .. exception:: JSONDecodeError(msg, doc, pos, end=None) Subclass of :exc:`ValueError` with the following additional attributes: .. attribute:: msg The unformatted error message .. attribute:: doc The JSON document being parsed .. 
attribute:: pos The start index of doc where parsing failed .. attribute:: end The end index of doc where parsing failed (may be ``None``) .. attribute:: lineno The line corresponding to pos .. attribute:: colno The column corresponding to pos .. attribute:: endlineno The line corresponding to end (may be ``None``) .. attribute:: endcolno The column corresponding to end (may be ``None``) Standard Compliance and Interoperability ---------------------------------------- The JSON format is specified by :rfc:`7159` and by `ECMA-404 `_. This section details this module's level of compliance with the RFC. For simplicity, :class:`JSONEncoder` and :class:`JSONDecoder` subclasses, and parameters other than those explicitly mentioned, are not considered. This module does not comply with the RFC in a strict fashion, implementing some extensions that are valid JavaScript but not valid JSON. In particular: - Infinite and NaN number values are accepted and output; - Repeated names within an object are accepted, and only the value of the last name-value pair is used. Since the RFC permits RFC-compliant parsers to accept input texts that are not RFC-compliant, this module's deserializer is technically RFC-compliant under default settings. Character Encodings ^^^^^^^^^^^^^^^^^^^ The RFC recommends that JSON be represented using either UTF-8, UTF-16, or UTF-32, with UTF-8 being the recommended default for maximum interoperability. As permitted, though not required, by the RFC, this module's serializer sets *ensure_ascii=True* by default, thus escaping the output so that the resulting strings only contain ASCII characters. Other than the *ensure_ascii* parameter, this module is defined strictly in terms of conversion between Python objects and :class:`Unicode strings `, and thus does not otherwise directly address the issue of character encodings. 
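The split of responsibilities described above can be sketched as follows on Python 3: the serializer returns text, and turning that text into UTF-8 bytes is left to the caller. The sample data is illustrative, and the standard library fallback is an assumption made only so the example runs without simplejson installed.

```python
try:
    import simplejson as json
except ImportError:
    # Assumption: ensure_ascii behaves the same in the stdlib module.
    import json

record = {"city": "Z\u00fcrich"}

# Default: non-ASCII characters are escaped, so the result is pure ASCII.
ascii_text = json.dumps(record)

# ensure_ascii=False keeps the character; the result is still str, not bytes.
unicode_text = json.dumps(record, ensure_ascii=False)

# Encoding to bytes is an explicit, separate step chosen by the caller.
payload = unicode_text.encode("utf-8")

# Both forms decode to the same Python object.
same = json.loads(payload.decode("utf-8")) == json.loads(ascii_text)
print(ascii_text, unicode_text, same)
```

The escaped form is larger but survives any ASCII-safe transport; the unescaped form is more compact when the channel is known to carry UTF-8.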
The RFC prohibits adding a byte order mark (BOM) to the start of a JSON text,
and this module's serializer does not add a BOM to its output.
The RFC permits, but does not require, JSON deserializers to ignore an initial
BOM in their input. This module's deserializer will ignore an initial BOM, if
present.

The RFC does not explicitly forbid JSON strings which contain byte sequences
that don't correspond to valid Unicode characters (e.g. unpaired UTF-16
surrogates), but it does note that they may cause interoperability problems.
By default, this module accepts and outputs (when present in the original
:class:`str`) code points for such sequences.

Infinite and NaN Number Values
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The RFC does not permit the representation of infinite or NaN number values.
By default, this module follows the RFC and rejects them. When *allow_nan* is
true, it accepts and outputs ``Infinity``, ``-Infinity``, and ``NaN`` as if
they were valid JSON number literal values::

    >>> # Neither of these calls raises an exception, but the results are not valid JSON
    >>> json.dumps(float('-inf'), allow_nan=True)
    '-Infinity'
    >>> json.dumps(float('nan'), allow_nan=True)
    'NaN'
    >>> # Same when deserializing
    >>> json.loads('-Infinity', allow_nan=True)
    -inf
    >>> json.loads('NaN', allow_nan=True)
    nan
    >>> # ignore_nan uses the ECMA-262 behavior to serialize these as null
    >>> json.dumps(float('-inf'), ignore_nan=True)
    'null'
    >>> json.dumps(float('nan'), ignore_nan=True)
    'null'

In the serializer, the *allow_nan* and *ignore_nan* parameters can be used to
alter this behavior. In the deserializer, the *allow_nan* and *parse_constant*
parameters can be used to alter this behavior.

Repeated Names Within an Object
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The RFC specifies that the names within a JSON object should be unique, but
does not mandate how repeated names in JSON objects should be handled.
By default, this module does not raise an exception; instead, it ignores all but the last name-value pair for a given name:: >>> weird_json = '{"x": 1, "x": 2, "x": 3}' >>> json.loads(weird_json) == {'x': 3} True The *object_pairs_hook* parameter can be used to alter this behavior. Top-level Non-Object, Non-Array Values ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ The old version of JSON specified by the obsolete :rfc:`4627` required that the top-level value of a JSON text must be either a JSON object or array (Python :class:`dict` or :class:`list`), and could not be a JSON null, boolean, number, or string value. :rfc:`7159` removed that restriction, and this module does not and has never implemented that restriction in either its serializer or its deserializer. Regardless, for maximum interoperability, you may wish to voluntarily adhere to the restriction yourself. Implementation Limitations ^^^^^^^^^^^^^^^^^^^^^^^^^^ Some JSON deserializer implementations may set limits on: * the size of accepted JSON texts * the maximum level of nesting of JSON objects and arrays * the range and precision of JSON numbers * the content and maximum length of JSON strings This module does not impose any such limits beyond those of the relevant Python datatypes themselves or the Python interpreter itself. When serializing to JSON, beware any such limitations in applications that may consume your JSON. In particular, it is common for JSON numbers to be deserialized into IEEE 754 double precision numbers and thus subject to that representation's range and precision limitations. This is especially relevant when serializing Python :class:`int` values of extremely large magnitude, or when serializing instances of "exotic" numerical types such as :class:`decimal.Decimal`. .. highlight:: bash .. _json-commandline: Command Line Interface ---------------------- The :mod:`simplejson.tool` module provides a simple command line interface to validate and pretty-print JSON. 
If the optional :option:`infile` and :option:`outfile` arguments are not specified, :attr:`sys.stdin` and :attr:`sys.stdout` will be used respectively:: $ echo '{"json": "obj"}' | python -m simplejson.tool { "json": "obj" } $ echo '{1.2:3.4}' | python -m simplejson.tool Expecting property name enclosed in double quotes: line 1 column 2 (char 1) Command line options ^^^^^^^^^^^^^^^^^^^^ .. cmdoption:: infile The JSON file to be validated or pretty-printed:: $ python -m simplejson.tool mp_films.json [ { "title": "And Now for Something Completely Different", "year": 1971 }, { "title": "Monty Python and the Holy Grail", "year": 1975 } ] If *infile* is not specified, read from :attr:`sys.stdin`. .. cmdoption:: outfile Write the output of the *infile* to the given *outfile*. Otherwise, write it to :attr:`sys.stdout`. .. rubric:: Footnotes .. [#rfc-errata] As noted in `the errata for RFC 7159 `_, JSON permits literal U+2028 (LINE SEPARATOR) and U+2029 (PARAGRAPH SEPARATOR) characters in strings, whereas JavaScript (as of ECMAScript Edition 5.1) does not. 
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/pyproject.toml0000644000175100017510000000013615172736046016275 0ustar00runnerrunner[build-system] requires = ["setuptools>=42", "wheel"] build-backend = "setuptools.build_meta" ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1777056810.5946422 simplejson-4.1.1/scripts/0000755000175100017510000000000015172736053015046 5ustar00runnerrunner././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/scripts/make_docs.py0000755000175100017510000000062615172736046017356 0ustar00runnerrunner#!/usr/bin/env python import os import subprocess SPHINX_BUILD = 'sphinx-build' DOCTREES_DIR = 'build/doctrees' HTML_DIR = 'docs' for dirname in DOCTREES_DIR, HTML_DIR: if not os.path.exists(dirname): os.makedirs(dirname) open(os.path.join(HTML_DIR, '.nojekyll'), 'w').close() res = subprocess.call([ SPHINX_BUILD, '-d', DOCTREES_DIR, '-b', 'html', '.', 'docs', ]) raise SystemExit(res) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1777056810.6010413 simplejson-4.1.1/setup.cfg0000644000175100017510000000004615172736053015200 0ustar00runnerrunner[egg_info] tag_build = tag_date = 0 ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/setup.py0000644000175100017510000001036515172736046015100 0ustar00runnerrunner#!/usr/bin/env python import os import sys try: from setuptools import setup, Extension, Command from setuptools.command.build_ext import build_ext except ImportError: from distutils.core import setup, Extension, Command from distutils.command.build_ext import build_ext from distutils.errors import CCompilerError, DistutilsExecError, \ DistutilsPlatformError IS_PYPY = hasattr(sys, 'pypy_translation_info') IS_GRAALPY = getattr(getattr(sys, "implementation", None), "name", None) == "graalpy" VERSION = 
'4.1.1' DESCRIPTION = "Simple, fast, extensible JSON encoder/decoder for Python" with open('README.rst', 'r') as f: LONG_DESCRIPTION = f.read() PYTHON_REQUIRES = '>=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*, !=3.6.*, !=3.7.*' CLASSIFIERS = [ 'Development Status :: 5 - Production/Stable', 'Environment :: WebAssembly :: Emscripten', 'Intended Audience :: Developers', 'Programming Language :: Python', 'Programming Language :: Python :: 2', 'Programming Language :: Python :: 2.7', 'Programming Language :: Python :: 3', 'Programming Language :: Python :: 3.8', 'Programming Language :: Python :: 3.9', 'Programming Language :: Python :: 3.10', 'Programming Language :: Python :: 3.11', 'Programming Language :: Python :: 3.12', 'Programming Language :: Python :: 3.13', 'Programming Language :: Python :: 3.14', 'Programming Language :: Python :: Implementation :: CPython', 'Programming Language :: Python :: Implementation :: GraalPy', 'Programming Language :: Python :: Implementation :: PyPy', 'Topic :: Software Development :: Libraries :: Python Modules', ] ext_errors = (CCompilerError, DistutilsExecError, DistutilsPlatformError) class BuildFailed(Exception): pass class ve_build_ext(build_ext): # This class allows C extension building to fail. 
def run(self): try: build_ext.run(self) except DistutilsPlatformError: raise BuildFailed() def build_extension(self, ext): try: build_ext.build_extension(self, ext) except ext_errors: raise BuildFailed() class TestCommand(Command): user_options = [] def initialize_options(self): pass def finalize_options(self): pass def run(self): import sys import subprocess raise SystemExit( subprocess.call([sys.executable, # Turn on deprecation warnings '-Wd', 'simplejson/tests/__init__.py'])) def run_setup(with_binary): cmdclass = dict(test=TestCommand) if with_binary: kw = dict( ext_modules=[ Extension( "simplejson._speedups", sources=["simplejson/_speedups.c"], depends=["simplejson/_speedups_scan.h"], ), ], cmdclass=dict(cmdclass, build_ext=ve_build_ext), ) else: kw = dict(cmdclass=cmdclass) setup( name="simplejson", version=VERSION, description=DESCRIPTION, long_description=LONG_DESCRIPTION, classifiers=CLASSIFIERS, python_requires=PYTHON_REQUIRES, author="Bob Ippolito", author_email="bob@redivi.com", url="https://github.com/simplejson/simplejson", license="MIT OR AFL-2.1", packages=['simplejson', 'simplejson.tests'], platforms=['any'], **kw) DISABLE_SPEEDUPS = IS_PYPY or IS_GRAALPY or os.environ.get('DISABLE_SPEEDUPS') == '1' CIBUILDWHEEL = os.environ.get('CIBUILDWHEEL') == '1' REQUIRE_SPEEDUPS = CIBUILDWHEEL or os.environ.get('REQUIRE_SPEEDUPS') == '1' try: run_setup(not DISABLE_SPEEDUPS) except BuildFailed: if REQUIRE_SPEEDUPS: raise BUILD_EXT_WARNING = ("WARNING: The C extension could not be compiled, " "speedups are not enabled.") print('*' * 75) print(BUILD_EXT_WARNING) print("Failure information, if any, is above.") print("I'm retrying the build without the C extension now.") print('*' * 75) run_setup(False) print('*' * 75) print(BUILD_EXT_WARNING) print("Plain-Python installation succeeded.") print('*' * 75) ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1777056810.5961206 
simplejson-4.1.1/simplejson/0000755000175100017510000000000015172736053015542 5ustar00runnerrunner././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/__init__.py0000644000175100017510000005637215172736046017672 0ustar00runnerrunnerr"""JSON (JavaScript Object Notation) is a subset of JavaScript syntax (ECMA-262 3rd edition) used as a lightweight data interchange format. :mod:`simplejson` exposes an API familiar to users of the standard library :mod:`marshal` and :mod:`pickle` modules. It is the externally maintained version of the :mod:`json` library contained in Python 2.6+, supporting Python 2.7 and Python 3.8+, and has significant performance advantages, even without using the optional C extension for speedups. Encoding basic Python object hierarchies:: >>> import simplejson as json >>> json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}]) '["foo", {"bar": ["baz", null, 1.0, 2]}]' >>> print(json.dumps("\"foo\bar")) "\"foo\bar" >>> print(json.dumps(u'\u1234')) "\u1234" >>> print(json.dumps('\\')) "\\" >>> print(json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True)) {"a": 0, "b": 0, "c": 0} >>> from simplejson.compat import StringIO >>> io = StringIO() >>> json.dump(['streaming API'], io) >>> io.getvalue() '["streaming API"]' Compact encoding:: >>> import simplejson as json >>> obj = [1,2,3,{'4': 5, '6': 7}] >>> json.dumps(obj, separators=(',',':'), sort_keys=True) '[1,2,3,{"4":5,"6":7}]' Pretty printing:: >>> import simplejson as json >>> print(json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=' ')) { "4": 5, "6": 7 } Decoding JSON:: >>> import simplejson as json >>> obj = [u'foo', {u'bar': [u'baz', None, 1.0, 2]}] >>> json.loads('["foo", {"bar":["baz", null, 1.0, 2]}]') == obj True >>> json.loads('"\\"foo\\bar"') == u'"foo\x08ar' True >>> from simplejson.compat import StringIO >>> io = StringIO('["streaming API"]') >>> json.load(io)[0] == 'streaming API' True Specializing JSON object decoding:: 
>>> import simplejson as json >>> def as_complex(dct): ... if '__complex__' in dct: ... return complex(dct['real'], dct['imag']) ... return dct ... >>> json.loads('{"__complex__": true, "real": 1, "imag": 2}', ... object_hook=as_complex) (1+2j) >>> from decimal import Decimal >>> json.loads('1.1', parse_float=Decimal) == Decimal('1.1') True Specializing JSON object encoding:: >>> import simplejson as json >>> def encode_complex(obj): ... if isinstance(obj, complex): ... return [obj.real, obj.imag] ... raise TypeError('Object of type %s is not JSON serializable' % ... obj.__class__.__name__) ... >>> json.dumps(2 + 1j, default=encode_complex) '[2.0, 1.0]' >>> json.JSONEncoder(default=encode_complex).encode(2 + 1j) '[2.0, 1.0]' >>> ''.join(json.JSONEncoder(default=encode_complex).iterencode(2 + 1j)) '[2.0, 1.0]' Using simplejson.tool from the shell to validate and pretty-print:: $ echo '{"json":"obj"}' | python -m simplejson.tool { "json": "obj" } $ echo '{ 1.2:3.4}' | python -m simplejson.tool Expecting property name: line 1 column 3 (char 2) Parsing multiple documents serialized as JSON lines (newline-delimited JSON):: >>> import simplejson as json >>> def loads_lines(docs): ... for doc in docs.splitlines(): ... yield json.loads(doc) ... >>> sum(doc["count"] for doc in loads_lines('{"count":1}\n{"count":2}\n{"count":3}\n')) 6 Serializing multiple objects to JSON lines (newline-delimited JSON):: >>> import simplejson as json >>> def dumps_lines(objs): ... for obj in objs: ... yield json.dumps(obj, separators=(',',':')) + '\n' ... 
>>> ''.join(dumps_lines([{'count': 1}, {'count': 2}, {'count': 3}])) '{"count":1}\n{"count":2}\n{"count":3}\n' """ from __future__ import absolute_import __version__ = '4.1.1' __all__ = [ 'dump', 'dumps', 'load', 'loads', 'JSONDecoder', 'JSONDecodeError', 'JSONEncoder', 'OrderedDict', 'simple_first', 'RawJSON' ] __author__ = 'Bob Ippolito ' from decimal import Decimal from .errors import JSONDecodeError from .raw_json import RawJSON from .decoder import JSONDecoder from .encoder import JSONEncoder, JSONEncoderForHTML def _import_OrderedDict(): import collections try: return collections.OrderedDict except AttributeError: from . import ordered_dict return ordered_dict.OrderedDict OrderedDict = _import_OrderedDict() def _import_c_make_encoder(): try: from ._speedups import make_encoder return make_encoder except ImportError: return None _default_encoder = JSONEncoder() def dump(obj, fp, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=False, cls=None, indent=None, separators=None, encoding='utf-8', default=None, use_decimal=True, namedtuple_as_object=True, tuple_as_array=True, bigint_as_string=False, sort_keys=False, item_sort_key=None, for_json=False, ignore_nan=False, int_as_string_bitcount=None, iterable_as_array=False, **kw): """Serialize ``obj`` as a JSON formatted stream to ``fp`` (a ``.write()``-supporting file-like object). If *skipkeys* is true then ``dict`` keys that are not basic types (``str``, ``int``, ``long``, ``float``, ``bool``, ``None``) will be skipped instead of raising a ``TypeError``. If *ensure_ascii* is false (default: ``True``), then the output may contain non-ASCII characters, so long as they do not need to be escaped by JSON. When it is true, all non-ASCII characters are escaped. If *allow_nan* is true (default: ``False``), then out of range ``float`` values (``nan``, ``inf``, ``-inf``) will be serialized to their JavaScript equivalents (``NaN``, ``Infinity``, ``-Infinity``) instead of raising a ValueError. 
See *ignore_nan* for ECMA-262 compliant behavior. If *indent* is a string, then JSON array elements and object members will be pretty-printed with a newline followed by that string repeated for each level of nesting. ``None`` (the default) selects the most compact representation without any newlines. If specified, *separators* should be an ``(item_separator, key_separator)`` tuple. The default is ``(', ', ': ')`` if *indent* is ``None`` and ``(',', ': ')`` otherwise. To get the most compact JSON representation, you should specify ``(',', ':')`` to eliminate whitespace. *encoding* is the character encoding for str instances, default is UTF-8. *default(obj)* is a function that should return a serializable version of obj or raise ``TypeError``. The default simply raises ``TypeError``. If *use_decimal* is true (default: ``True``) then decimal.Decimal will be natively serialized to JSON with full precision. If *namedtuple_as_object* is true (default: ``True``), :class:`tuple` subclasses with ``_asdict()`` methods will be encoded as JSON objects. If *tuple_as_array* is true (default: ``True``), :class:`tuple` (and subclasses) will be encoded as JSON arrays. If *iterable_as_array* is true (default: ``False``), any object not in the above table that implements ``__iter__()`` will be encoded as a JSON array. If *bigint_as_string* is true (default: ``False``), ints 2**53 and higher or lower than -2**53 will be encoded as strings. This is to avoid the rounding that happens in Javascript otherwise. Note that this is still a lossy operation that will not round-trip correctly and should be used sparingly. If *int_as_string_bitcount* is a positive number (n), then int of size greater than or equal to 2**n or lower than or equal to -2**n will be encoded as strings. If specified, *item_sort_key* is a callable used to sort the items in each dictionary. This is useful if you want to sort items other than in alphabetical order by key. This option takes precedence over *sort_keys*. 
If *sort_keys* is true (default: ``False``), the output of dictionaries will be sorted by item. If *for_json* is true (default: ``False``), objects with a ``for_json()`` method will use the return value of that method for encoding as JSON instead of the object. If *ignore_nan* is true (default: ``False``), then out of range :class:`float` values (``nan``, ``inf``, ``-inf``) will be serialized as ``null`` in compliance with the ECMA-262 specification. If true, this will override *allow_nan*. To use a custom ``JSONEncoder`` subclass (e.g. one that overrides the ``.default()`` method to serialize additional types), specify it with the ``cls`` kwarg. NOTE: You should use *default* or *for_json* instead of subclassing whenever possible. """ # cached encoder if (not skipkeys and ensure_ascii and check_circular and not allow_nan and cls is None and indent is None and separators is None and encoding == 'utf-8' and default is None and use_decimal and namedtuple_as_object and tuple_as_array and not iterable_as_array and not bigint_as_string and not sort_keys and not item_sort_key and not for_json and not ignore_nan and int_as_string_bitcount is None and not kw ): iterable = _default_encoder.iterencode(obj) else: if cls is None: cls = JSONEncoder iterable = cls(skipkeys=skipkeys, ensure_ascii=ensure_ascii, check_circular=check_circular, allow_nan=allow_nan, indent=indent, separators=separators, encoding=encoding, default=default, use_decimal=use_decimal, namedtuple_as_object=namedtuple_as_object, tuple_as_array=tuple_as_array, iterable_as_array=iterable_as_array, bigint_as_string=bigint_as_string, sort_keys=sort_keys, item_sort_key=item_sort_key, for_json=for_json, ignore_nan=ignore_nan, int_as_string_bitcount=int_as_string_bitcount, **kw).iterencode(obj) # could accelerate with writelines in some versions of Python, at # a debuggability cost for chunk in iterable: fp.write(chunk) def dumps(obj, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=False, 
cls=None, indent=None, separators=None, encoding='utf-8', default=None, use_decimal=True, namedtuple_as_object=True, tuple_as_array=True, bigint_as_string=False, sort_keys=False, item_sort_key=None, for_json=False, ignore_nan=False, int_as_string_bitcount=None, iterable_as_array=False, **kw): """Serialize ``obj`` to a JSON formatted ``str``. If ``skipkeys`` is true then ``dict`` keys that are not basic types (``str``, ``int``, ``long``, ``float``, ``bool``, ``None``) will be skipped instead of raising a ``TypeError``. If *ensure_ascii* is false (default: ``True``), then the output may contain non-ASCII characters, so long as they do not need to be escaped by JSON. When it is true, all non-ASCII characters are escaped. If ``check_circular`` is false, then the circular reference check for container types will be skipped and a circular reference will result in an ``OverflowError`` (or worse). If *allow_nan* is true (default: ``False``), then out of range ``float`` values (``nan``, ``inf``, ``-inf``) will be serialized to their JavaScript equivalents (``NaN``, ``Infinity``, ``-Infinity``) instead of raising a ValueError. See *ignore_nan* for ECMA-262 compliant behavior. If ``indent`` is a string, then JSON array elements and object members will be pretty-printed with a newline followed by that string repeated for each level of nesting. ``None`` (the default) selects the most compact representation without any newlines. For backwards compatibility with versions of simplejson earlier than 2.1.0, an integer is also accepted and is converted to a string with that many spaces. If specified, ``separators`` should be an ``(item_separator, key_separator)`` tuple. The default is ``(', ', ': ')`` if *indent* is ``None`` and ``(',', ': ')`` otherwise. To get the most compact JSON representation, you should specify ``(',', ':')`` to eliminate whitespace. ``encoding`` is the character encoding for bytes instances, default is UTF-8. 
``default(obj)`` is a function that should return a serializable version of obj or raise TypeError. The default simply raises TypeError. If *use_decimal* is true (default: ``True``) then decimal.Decimal will be natively serialized to JSON with full precision. If *namedtuple_as_object* is true (default: ``True``), :class:`tuple` subclasses with ``_asdict()`` methods will be encoded as JSON objects. If *tuple_as_array* is true (default: ``True``), :class:`tuple` (and subclasses) will be encoded as JSON arrays. If *iterable_as_array* is true (default: ``False``), any object not in the above table that implements ``__iter__()`` will be encoded as a JSON array. If *bigint_as_string* is true (not the default), ints 2**53 and higher or lower than -2**53 will be encoded as strings. This is to avoid the rounding that happens in Javascript otherwise. If *int_as_string_bitcount* is a positive number (n), then int of size greater than or equal to 2**n or lower than or equal to -2**n will be encoded as strings. If specified, *item_sort_key* is a callable used to sort the items in each dictionary. This is useful if you want to sort items other than in alphabetical order by key. This option takes precedence over *sort_keys*. If *sort_keys* is true (default: ``False``), the output of dictionaries will be sorted by item. If *for_json* is true (default: ``False``), objects with a ``for_json()`` method will use the return value of that method for encoding as JSON instead of the object. If *ignore_nan* is true (default: ``False``), then out of range :class:`float` values (``nan``, ``inf``, ``-inf``) will be serialized as ``null`` in compliance with the ECMA-262 specification. If true, this will override *allow_nan*. To use a custom ``JSONEncoder`` subclass (e.g. one that overrides the ``.default()`` method to serialize additional types), specify it with the ``cls`` kwarg. NOTE: You should use *default* instead of subclassing whenever possible. 
""" # cached encoder if (not skipkeys and ensure_ascii and check_circular and not allow_nan and cls is None and indent is None and separators is None and encoding == 'utf-8' and default is None and use_decimal and namedtuple_as_object and tuple_as_array and not iterable_as_array and not bigint_as_string and not sort_keys and not item_sort_key and not for_json and not ignore_nan and int_as_string_bitcount is None and not kw ): return _default_encoder.encode(obj) if cls is None: cls = JSONEncoder return cls( skipkeys=skipkeys, ensure_ascii=ensure_ascii, check_circular=check_circular, allow_nan=allow_nan, indent=indent, separators=separators, encoding=encoding, default=default, use_decimal=use_decimal, namedtuple_as_object=namedtuple_as_object, tuple_as_array=tuple_as_array, iterable_as_array=iterable_as_array, bigint_as_string=bigint_as_string, sort_keys=sort_keys, item_sort_key=item_sort_key, for_json=for_json, ignore_nan=ignore_nan, int_as_string_bitcount=int_as_string_bitcount, **kw).encode(obj) _default_decoder = JSONDecoder() def load(fp, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, use_decimal=False, allow_nan=False, array_hook=None, **kw): """Deserialize ``fp`` (a ``.read()``-supporting file-like object containing a JSON document as `str` or `bytes`) to a Python object. *encoding* determines the encoding used to interpret any `bytes` objects decoded by this instance (``'utf-8'`` by default). It has no effect when decoding `str` objects. *object_hook*, if specified, will be called with the result of every JSON object decoded and its return value will be used in place of the given :class:`dict`. This can be used to provide custom deserializations (e.g. to support JSON-RPC class hinting). *object_pairs_hook* is an optional function that will be called with the result of any object literal decode with an ordered list of pairs. 
The return value of *object_pairs_hook* will be used instead of the :class:`dict`. This feature can be used to implement custom decoders that rely on the order that the key and value pairs are decoded (for example, :func:`collections.OrderedDict` will remember the order of insertion). If *object_hook* is also defined, the *object_pairs_hook* takes priority. *parse_float*, if specified, will be called with the string of every JSON float to be decoded. By default, this is equivalent to ``float(num_str)``. This can be used to use another datatype or parser for JSON floats (e.g. :class:`decimal.Decimal`). *parse_int*, if specified, will be called with the string of every JSON int to be decoded. By default, this is equivalent to ``int(num_str)``. This can be used to use another datatype or parser for JSON integers (e.g. :class:`float`). *allow_nan*, if True (default false), will allow the parser to accept the non-standard floats ``NaN``, ``Infinity``, and ``-Infinity`` and enable the use of the deprecated *parse_constant*. If *use_decimal* is true (default: ``False``) then it implies parse_float=decimal.Decimal for parity with ``dump``. *parse_constant*, if specified, will be called with one of the following strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``. It is not recommended to use this feature, as it is rare to parse non-compliant JSON containing these values. To use a custom ``JSONDecoder`` subclass, specify it with the ``cls`` kwarg. NOTE: You should use *object_hook* or *object_pairs_hook* instead of subclassing whenever possible. 
""" return loads(fp.read(), encoding=encoding, cls=cls, object_hook=object_hook, parse_float=parse_float, parse_int=parse_int, parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, use_decimal=use_decimal, allow_nan=allow_nan, array_hook=array_hook, **kw) def loads(s, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, use_decimal=False, allow_nan=False, array_hook=None, **kw): """Deserialize ``s`` (a ``str`` or ``unicode`` instance containing a JSON document) to a Python object. *encoding* determines the encoding used to interpret any :class:`bytes` objects decoded by this instance (``'utf-8'`` by default). It has no effect when decoding :class:`unicode` objects. *object_hook*, if specified, will be called with the result of every JSON object decoded and its return value will be used in place of the given :class:`dict`. This can be used to provide custom deserializations (e.g. to support JSON-RPC class hinting). *object_pairs_hook* is an optional function that will be called with the result of any object literal decode with an ordered list of pairs. The return value of *object_pairs_hook* will be used instead of the :class:`dict`. This feature can be used to implement custom decoders that rely on the order that the key and value pairs are decoded (for example, :func:`collections.OrderedDict` will remember the order of insertion). If *object_hook* is also defined, the *object_pairs_hook* takes priority. *parse_float*, if specified, will be called with the string of every JSON float to be decoded. By default, this is equivalent to ``float(num_str)``. This can be used to use another datatype or parser for JSON floats (e.g. :class:`decimal.Decimal`). *parse_int*, if specified, will be called with the string of every JSON int to be decoded. By default, this is equivalent to ``int(num_str)``. This can be used to use another datatype or parser for JSON integers (e.g. :class:`float`). 
*allow_nan*, if True (default false), will allow the parser to accept the non-standard floats ``NaN``, ``Infinity``, and ``-Infinity`` and enable the use of the deprecated *parse_constant*. If *use_decimal* is true (default: ``False``) then it implies parse_float=decimal.Decimal for parity with ``dump``. *parse_constant*, if specified, will be called with one of the following strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``. It is not recommended to use this feature, as it is rare to parse non-compliant JSON containing these values. To use a custom ``JSONDecoder`` subclass, specify it with the ``cls`` kwarg. NOTE: You should use *object_hook* or *object_pairs_hook* instead of subclassing whenever possible. """ if (cls is None and encoding is None and object_hook is None and parse_int is None and parse_float is None and parse_constant is None and object_pairs_hook is None and array_hook is None and not use_decimal and not allow_nan and not kw): return _default_decoder.decode(s) if cls is None: cls = JSONDecoder if object_hook is not None: kw['object_hook'] = object_hook if object_pairs_hook is not None: kw['object_pairs_hook'] = object_pairs_hook if array_hook is not None: kw['array_hook'] = array_hook if parse_float is not None: kw['parse_float'] = parse_float if parse_int is not None: kw['parse_int'] = parse_int if parse_constant is not None: kw['parse_constant'] = parse_constant if use_decimal: if parse_float is not None: raise TypeError("use_decimal=True implies parse_float=Decimal") kw['parse_float'] = Decimal if allow_nan: kw['allow_nan'] = True return cls(encoding=encoding, **kw).decode(s) def _toggle_speedups(enabled): from . import decoder as dec from . import encoder as enc from . 
import scanner as scan c_make_encoder = _import_c_make_encoder() if enabled: dec.scanstring = dec.c_scanstring or dec.py_scanstring enc.c_make_encoder = c_make_encoder enc.encode_basestring_ascii = (enc.c_encode_basestring_ascii or enc.py_encode_basestring_ascii) enc.encode_basestring = (enc.c_encode_basestring or enc.py_encode_basestring) scan.make_scanner = scan.c_make_scanner or scan.py_make_scanner else: dec.scanstring = dec.py_scanstring enc.c_make_encoder = None enc.encode_basestring_ascii = enc.py_encode_basestring_ascii enc.encode_basestring = enc.py_encode_basestring scan.make_scanner = scan.py_make_scanner dec.make_scanner = scan.make_scanner global _default_decoder _default_decoder = JSONDecoder() global _default_encoder _default_encoder = JSONEncoder() def simple_first(kv): """Helper function to pass to item_sort_key to sort simple elements to the top, then container elements. """ return (isinstance(kv[1], (list, dict, tuple)), kv[0]) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/_speedups.c0000644000175100017510000041524515172736046017712 0ustar00runnerrunner/* -*- mode: C; c-file-style: "python"; c-basic-offset: 4 -*- */ #include "Python.h" #include "structmember.h" #include <limits.h> /* CHAR_BIT */ #if PY_MAJOR_VERSION >= 3 #define PyInt_FromSsize_t PyLong_FromSsize_t #define PyInt_AsSsize_t PyLong_AsSsize_t #define PyInt_Check(obj) 0 #define PyInt_CheckExact(obj) 0 #define JSON_UNICHR Py_UCS4 #define JSON_InternFromString PyUnicode_InternFromString #define PyString_GET_SIZE PyUnicode_GET_LENGTH #define JSON_StringCheck PyUnicode_Check #define PY2_UNUSED #if PY_VERSION_HEX >= 0x030C0000 /* PyUnicode_READY was deprecated in 3.10 and is a no-op since 3.12 * (PEP 623). Skip calling it on modern Python to avoid the deprecation * warning and the eventual removal.
*/ #undef PyUnicode_READY #define PyUnicode_READY(obj) 0 #endif #else /* PY_MAJOR_VERSION >= 3 */ #define PY2_UNUSED UNUSED #define JSON_StringCheck(obj) (PyString_Check(obj) || PyUnicode_Check(obj)) #define PyBytes_Check PyString_Check #define PyUnicode_READY(obj) 0 #define PyUnicode_KIND(obj) (sizeof(Py_UNICODE)) #define PyUnicode_DATA(obj) ((void *)(PyUnicode_AS_UNICODE(obj))) #define PyUnicode_READ(kind, data, index) ((JSON_UNICHR)((const Py_UNICODE *)(data))[(index)]) #define PyUnicode_GET_LENGTH PyUnicode_GET_SIZE #define JSON_UNICHR Py_UNICODE #define JSON_InternFromString PyString_InternFromString #endif /* PY_MAJOR_VERSION < 3 */ #if PY_VERSION_HEX < 0x03090000 #if !defined(PyObject_CallNoArgs) #define PyObject_CallNoArgs(callable) PyObject_CallFunctionObjArgs(callable, NULL) #endif #if !defined(PyObject_CallOneArg) #define PyObject_CallOneArg(callable, arg) PyObject_CallFunctionObjArgs(callable, arg, NULL) #endif #endif /* PY_VERSION_HEX < 0x03090000 */ #if PY_VERSION_HEX < 0x02070000 #if !defined(PyOS_string_to_double) #define PyOS_string_to_double json_PyOS_string_to_double static double json_PyOS_string_to_double(const char *s, char **endptr, PyObject *overflow_exception); static double json_PyOS_string_to_double(const char *s, char **endptr, PyObject *overflow_exception) { double x; assert(endptr == NULL); assert(overflow_exception == NULL); PyFPE_START_PROTECT("json_PyOS_string_to_double", return -1.0;) x = PyOS_ascii_atof(s); PyFPE_END_PROTECT(x) return x; } #endif #endif /* PY_VERSION_HEX < 0x02070000 */ #if PY_VERSION_HEX < 0x02060000 #if !defined(Py_TYPE) #define Py_TYPE(ob) (((PyObject*)(ob))->ob_type) #endif #if !defined(Py_SIZE) #define Py_SIZE(ob) (((PyVarObject*)(ob))->ob_size) #endif #if !defined(PyVarObject_HEAD_INIT) #define PyVarObject_HEAD_INIT(type, size) PyObject_HEAD_INIT(type) size, #endif #endif /* PY_VERSION_HEX < 0x02060000 */ #ifdef __GNUC__ #define UNUSED __attribute__((__unused__)) #else #define UNUSED #endif /* Py_T_OBJECT_EX 
is the stable public name added in Python 3.12 for the * member descriptor type that raises AttributeError when the underlying * slot is NULL. Pre-3.12 headers spell it T_OBJECT_EX (via the * internal-ish <structmember.h>), with identical semantics. Use the * stable name everywhere in this file and fall back to the legacy * spelling on older Pythons. The previous spelling was plain T_OBJECT, * which returned Py_None for NULL slots and is deprecated in 3.12+. */ #if !defined(Py_T_OBJECT_EX) # define Py_T_OBJECT_EX T_OBJECT_EX #endif /* Py_BEGIN_CRITICAL_SECTION was added in Python 3.13. On older versions, define as no-ops. */ #if PY_VERSION_HEX < 0x030d0000 #define Py_BEGIN_CRITICAL_SECTION(op) #define Py_END_CRITICAL_SECTION() #endif #define DEFAULT_ENCODING "utf-8" /* Unified module state. On Python 3.13+ this is stored per-module (PEP 489) so that each subinterpreter gets its own copy. On older Python versions a single static instance is shared by the whole process. Either way, code accesses it via get_speedups_state(module_ref) so that call sites look identical on all versions. */ typedef struct { PyObject *PyScannerType; PyObject *PyEncoderType; PyObject *JSON_Infinity; PyObject *JSON_NegInfinity; PyObject *JSON_NaN; PyObject *JSON_EmptyUnicode; #if PY_MAJOR_VERSION < 3 PyObject *JSON_EmptyStr; PyObject *JSON_EmptyStr_join; /* bound method: ''.join */ #endif PyObject *JSON_s_null; PyObject *JSON_s_true; PyObject *JSON_s_false; PyObject *JSON_open_dict; PyObject *JSON_close_dict; PyObject *JSON_empty_dict; PyObject *JSON_open_array; PyObject *JSON_close_array; PyObject *JSON_empty_array; PyObject *JSON_newline; /* "\n", prepended before each indent */ PyObject *JSON_sortargs; PyObject *JSON_itemgetter0; /* Interned attribute-name strings used in hot paths. Caching them * here lets the scanner/encoder use PyObject_GetAttr (which takes * a PyObject *) instead of PyObject_GetAttrString (which interns * the C string every call).
*/ PyObject *JSON_attr_for_json; /* "for_json" */ PyObject *JSON_attr_asdict; /* "_asdict" */ PyObject *JSON_attr_sort; /* "sort" */ PyObject *JSON_attr_encoded_json; /* "encoded_json" */ PyObject *JSON_attr_add_note; /* "add_note" (PEP 678, 3.11+) */ PyObject *RawJSONType; PyObject *JSONDecodeError; } _speedups_state; #if PY_VERSION_HEX >= 0x030D0000 /* Forward declaration - defined later with multi-phase init */ static struct PyModuleDef moduledef; #else /* Pre-3.13: a single static state instance serves the whole process, and a borrowed reference to the module object so that Scanner and Encoder instances can store module_ref uniformly. The module object is kept alive by sys.modules for the entire interpreter lifetime. PyScannerType and PyEncoderType are defined later in this file (the full PyTypeObject bodies); their addresses are cached in _speedups_static_state.{PyScannerType,PyEncoderType} by module_exec, so the PyScanner_Check / PyEncoder_Check macros don't need a forward declaration of those symbols. */ static _speedups_state _speedups_static_state; static PyObject *_speedups_module = NULL; /* borrowed */ #define PyScanner_Check(op) \ PyObject_TypeCheck(op, (PyTypeObject *)_speedups_static_state.PyScannerType) #define PyScanner_CheckExact(op) \ (Py_TYPE(op) == (PyTypeObject *)_speedups_static_state.PyScannerType) #define PyEncoder_Check(op) \ PyObject_TypeCheck(op, (PyTypeObject *)_speedups_static_state.PyEncoderType) #define PyEncoder_CheckExact(op) \ (Py_TYPE(op) == (PyTypeObject *)_speedups_static_state.PyEncoderType) #endif static inline _speedups_state * get_speedups_state(PyObject *module) { /* Every call site passes either the module object (from module-level * methods) or Scanner/Encoder->module_ref (set during instance * construction). Both must be non-NULL; catch any regression where an * uninitialized instance leaks into the hot path. 
*/ assert(module != NULL); #if PY_VERSION_HEX >= 0x030D0000 { /* Wrapped in an inner block so `state` is declared at the top * of a scope, keeping the file C89-clean under * -Wdeclaration-after-statement. */ void *state = PyModule_GetState(module); assert(state != NULL); return (_speedups_state *)state; } #else (void)module; return &_speedups_static_state; #endif } #define JSON_ALLOW_NAN 1 #define JSON_IGNORE_NAN 2 #if PY_VERSION_HEX >= 0x030E0000 /* Python 3.14+: JSON_Accu is backed by a PyUnicodeWriter, building the * entire output in one contiguous buffer. The FinishAsList wrapper * returns a single-element list so the Python caller's ''.join(chunks) * is effectively a no-op. */ typedef struct { PyUnicodeWriter *writer; } JSON_Accu; #else typedef struct { PyObject *large_strings; /* A list of previously accumulated large strings */ PyObject *small_strings; /* Pending small strings */ } JSON_Accu; #endif static int JSON_Accu_Init(JSON_Accu *acc); static int JSON_Accu_Accumulate(_speedups_state *state, JSON_Accu *acc, PyObject *unicode); static PyObject * JSON_Accu_FinishAsList(_speedups_state *state, JSON_Accu *acc); static void JSON_Accu_Destroy(JSON_Accu *acc); #define ERR_EXPECTING_VALUE "Expecting value" #define ERR_ARRAY_DELIMITER "Expecting ',' delimiter or ']'" #define ERR_ARRAY_VALUE_FIRST "Expecting value or ']'" #define ERR_OBJECT_DELIMITER "Expecting ',' delimiter or '}'" #define ERR_OBJECT_PROPERTY "Expecting property name enclosed in double quotes" #define ERR_OBJECT_PROPERTY_FIRST "Expecting property name enclosed in double quotes or '}'" #define ERR_OBJECT_PROPERTY_DELIMITER "Expecting ':' delimiter" #define ERR_STRING_UNTERMINATED "Unterminated string starting at" #define ERR_STRING_CONTROL "Invalid control character %r at" #define ERR_STRING_ESC1 "Invalid \\X escape sequence %r" #define ERR_STRING_ESC4 "Invalid \\uXXXX escape sequence" #define ERR_TRAILING_COMMA_OBJECT "Illegal trailing comma before end of object" #define ERR_TRAILING_COMMA_ARRAY 
"Illegal trailing comma before end of array" typedef struct _PyScannerObject { PyObject_HEAD PyObject *module_ref; PyObject *encoding; PyObject *strict_bool; int strict; PyObject *object_hook; PyObject *pairs_hook; PyObject *array_hook; PyObject *parse_float; PyObject *parse_int; PyObject *parse_constant; PyObject *memo; } PyScannerObject; /* X-macro listing every PyObject* field in PyScannerObject that must * be visited by tp_traverse and released by tp_clear. Keep in sync with * the struct above; adding a new Py_Object* field here is sufficient to * make both scanner_traverse and scanner_clear handle it. Fields of * plain int/bool types (`strict`) don't participate in GC and are * intentionally omitted. */ #define JSON_SCANNER_OBJECT_FIELDS(X) \ X(module_ref) \ X(encoding) \ X(strict_bool) \ X(object_hook) \ X(pairs_hook) \ X(array_hook) \ X(parse_float) \ X(parse_int) \ X(parse_constant) \ X(memo) static PyMemberDef scanner_members[] = { {"encoding", Py_T_OBJECT_EX, offsetof(PyScannerObject, encoding), READONLY, "encoding"}, {"strict", Py_T_OBJECT_EX, offsetof(PyScannerObject, strict_bool), READONLY, "strict"}, {"object_hook", Py_T_OBJECT_EX, offsetof(PyScannerObject, object_hook), READONLY, "object_hook"}, {"object_pairs_hook", Py_T_OBJECT_EX, offsetof(PyScannerObject, pairs_hook), READONLY, "object_pairs_hook"}, {"array_hook", Py_T_OBJECT_EX, offsetof(PyScannerObject, array_hook), READONLY, "array_hook"}, {"parse_float", Py_T_OBJECT_EX, offsetof(PyScannerObject, parse_float), READONLY, "parse_float"}, {"parse_int", Py_T_OBJECT_EX, offsetof(PyScannerObject, parse_int), READONLY, "parse_int"}, {"parse_constant", Py_T_OBJECT_EX, offsetof(PyScannerObject, parse_constant), READONLY, "parse_constant"}, {NULL} }; typedef struct _PyEncoderObject { PyObject_HEAD PyObject *module_ref; PyObject *markers; PyObject *defaultfn; PyObject *encoder; PyObject *indent; PyObject *key_separator; PyObject *item_separator; PyObject *sort_keys; PyObject *key_memo; PyObject *encoding; 
PyObject *Decimal; PyObject *skipkeys_bool; int skipkeys; int fast_encode; /* 0, JSON_ALLOW_NAN, JSON_IGNORE_NAN */ int allow_or_ignore_nan; int use_decimal; int namedtuple_as_object; int tuple_as_array; int iterable_as_array; PyObject *max_long_size; PyObject *min_long_size; PyObject *item_sort_key; PyObject *item_sort_kw; int for_json; } PyEncoderObject; /* X-macro listing every PyObject* field in PyEncoderObject that must * be visited by tp_traverse and released by tp_clear. See the comment * on JSON_SCANNER_OBJECT_FIELDS above. Int flag fields (skipkeys, * fast_encode, for_json, etc.) are omitted because they don't * participate in GC. */ #define JSON_ENCODER_OBJECT_FIELDS(X) \ X(module_ref) \ X(markers) \ X(defaultfn) \ X(encoder) \ X(encoding) \ X(indent) \ X(key_separator) \ X(item_separator) \ X(key_memo) \ X(skipkeys_bool) \ X(sort_keys) \ X(item_sort_kw) \ X(item_sort_key) \ X(max_long_size) \ X(min_long_size) \ X(Decimal) static PyMemberDef encoder_members[] = { {"markers", Py_T_OBJECT_EX, offsetof(PyEncoderObject, markers), READONLY, "markers"}, {"default", Py_T_OBJECT_EX, offsetof(PyEncoderObject, defaultfn), READONLY, "default"}, {"encoder", Py_T_OBJECT_EX, offsetof(PyEncoderObject, encoder), READONLY, "encoder"}, {"encoding", Py_T_OBJECT_EX, offsetof(PyEncoderObject, encoding), READONLY, "encoding"}, {"indent", Py_T_OBJECT_EX, offsetof(PyEncoderObject, indent), READONLY, "indent"}, {"key_separator", Py_T_OBJECT_EX, offsetof(PyEncoderObject, key_separator), READONLY, "key_separator"}, {"item_separator", Py_T_OBJECT_EX, offsetof(PyEncoderObject, item_separator), READONLY, "item_separator"}, {"sort_keys", Py_T_OBJECT_EX, offsetof(PyEncoderObject, sort_keys), READONLY, "sort_keys"}, /* Python 2.5 does not support T_BOOL */ {"skipkeys", Py_T_OBJECT_EX, offsetof(PyEncoderObject, skipkeys_bool), READONLY, "skipkeys"}, {"key_memo", Py_T_OBJECT_EX, offsetof(PyEncoderObject, key_memo), READONLY, "key_memo"}, {"item_sort_key", Py_T_OBJECT_EX, 
offsetof(PyEncoderObject, item_sort_key), READONLY, "item_sort_key"}, {"max_long_size", Py_T_OBJECT_EX, offsetof(PyEncoderObject, max_long_size), READONLY, "max_long_size"}, {"min_long_size", Py_T_OBJECT_EX, offsetof(PyEncoderObject, min_long_size), READONLY, "min_long_size"}, {NULL} }; #if PY_VERSION_HEX < 0x030E0000 static PyObject * join_list_unicode(_speedups_state *state, PyObject *lst); #endif static PyObject * JSON_ParseEncoding(PyObject *encoding); static PyObject * maybe_quote_bigint(PyEncoderObject* s, PyObject *encoded, PyObject *obj); #if PY_VERSION_HEX < 0x030E0000 || PY_MAJOR_VERSION < 3 static Py_ssize_t ascii_char_size(JSON_UNICHR c); #endif static Py_ssize_t ascii_escape_char(JSON_UNICHR c, char *output, Py_ssize_t chars); static PyObject * ascii_escape_unicode(PyObject *pystr); static PyObject * ascii_escape_str(PyObject *pystr); static PyObject * py_encode_basestring_ascii(PyObject* self UNUSED, PyObject *pystr); static PyObject * py_encode_basestring(PyObject* self UNUSED, PyObject *pystr); #if PY_MAJOR_VERSION < 3 static PyObject * join_list_string(_speedups_state *state, PyObject *lst); static PyObject * scan_once_str(PyScannerObject *s, PyObject *pystr, Py_ssize_t idx, Py_ssize_t *next_idx_ptr); static PyObject * scanstring_str(_speedups_state *state, PyObject *pystr, Py_ssize_t end, const char *encoding, int strict, Py_ssize_t *next_end_ptr); static PyObject * _parse_object_str(PyScannerObject *s, PyObject *pystr, Py_ssize_t idx, Py_ssize_t *next_idx_ptr); #endif static PyObject * scanstring_unicode(_speedups_state *state, PyObject *pystr, Py_ssize_t end, int strict, Py_ssize_t *next_end_ptr); static PyObject * scan_once_unicode(PyScannerObject *s, PyObject *pystr, Py_ssize_t idx, Py_ssize_t *next_idx_ptr); static PyObject * _build_rval_index_tuple(PyObject *rval, Py_ssize_t idx) { /* return (rval, idx) tuple, stealing reference to rval */ if (rval == NULL) { assert(PyErr_Occurred()); return NULL; } return Py_BuildValue("(Nn)", rval, idx); } 
static PyObject * scanner_new(PyTypeObject *type, PyObject *args, PyObject *kwds); static void scanner_dealloc(PyObject *self); static int scanner_clear(PyObject *self); static PyObject * encoder_new(PyTypeObject *type, PyObject *args, PyObject *kwds); static void encoder_dealloc(PyObject *self); static int encoder_clear(PyObject *self); static int is_raw_json(_speedups_state *state, PyObject *obj); static PyObject * encoder_stringify_key(PyEncoderObject *s, PyObject *key); static int encoder_listencode_list(PyEncoderObject *s, JSON_Accu *rval, PyObject *seq, Py_ssize_t indent_level); static int encoder_listencode_obj(PyEncoderObject *s, JSON_Accu *rval, PyObject *obj, Py_ssize_t indent_level); static int encoder_listencode_dict(PyEncoderObject *s, JSON_Accu *rval, PyObject *dct, Py_ssize_t indent_level); static PyObject * _encoded_const(_speedups_state *state, PyObject *obj); static void raise_errmsg(_speedups_state *state, const char *msg, PyObject *s, Py_ssize_t end); static PyObject * encoder_encode_string(PyEncoderObject *s, PyObject *obj); static int _call_json_method(PyObject *obj, PyObject *method_name, PyObject **result); static PyObject * encoder_long_to_str(PyObject *obj); static PyObject * encoder_encode_float(PyEncoderObject *s, PyObject *obj); static int encoder_accumulate_newline_indent(PyEncoderObject *s, _speedups_state *state, JSON_Accu *rval, Py_ssize_t indent_level); #if PY_VERSION_HEX >= 0x030B0000 static void encoder_annotate_exception(_speedups_state *state, const char *format, ...); #endif static int init_speedups_state(_speedups_state *state, PyObject *module); static PyObject * import_dependency(const char *module_name, const char *attr_name); #define S_CHAR(c) (c >= ' ' && c <= '~' && c != '\\' && c != '"') /* NEEDS_ESCAPE: character needs escaping in ensure_ascii=False mode. * Only control characters (0x00-0x1f), backslash and double-quote. 
*/ #define NEEDS_ESCAPE(c) ((c) <= 0x1f || (c) == '\\' || (c) == '"') #define IS_WHITESPACE(c) (((c) == ' ') || ((c) == '\t') || ((c) == '\n') || ((c) == '\r')) #define MIN_EXPANSION 6 /* Cross-version helpers for dict ops used in the scanner/encoder hot * paths. On Python 3.13+ these forward to the new APIs that atomically * return strong references (avoiding borrowed-ref races under free * threading); on older Python versions they fall back to the legacy * borrowed-ref APIs with explicit Py_INCREF. */ static inline int json_PyDict_GetItemRef(PyObject *dict, PyObject *key, PyObject **result) { /* Atomically fetch a strong reference to dict[key]. Returns 1 if * found (with *result set to a new strong reference), 0 if not * found (with *result set to NULL), -1 on error. */ #if PY_VERSION_HEX >= 0x030D0000 return PyDict_GetItemRef(dict, key, result); #elif PY_VERSION_HEX >= 0x03040000 /* PyDict_GetItemWithError was added in Python 3.4. */ PyObject *obj = PyDict_GetItemWithError(dict, key); if (obj != NULL) { Py_INCREF(obj); *result = obj; return 1; } *result = NULL; return PyErr_Occurred() ? -1 : 0; #else /* Python 2.7: PyDict_GetItem returns NULL without setting an * exception on missing keys and suppresses errors during lookup. */ PyObject *obj = PyDict_GetItem(dict, key); /* borrowed, no error */ if (obj != NULL) { Py_INCREF(obj); *result = obj; return 1; } *result = NULL; return 0; #endif } static inline int json_memo_intern_key(PyObject *memo, PyObject **key_ptr) { /* Intern *key_ptr into memo with a single atomic lookup, replacing * *key_ptr with a strong reference to the canonical entry (the * existing one if already present, or *key_ptr itself if it was * freshly inserted). The original reference in *key_ptr is always * dropped on success. Returns 0 on success, -1 on error. 
*/ PyObject *old = *key_ptr; #if PY_VERSION_HEX >= 0x030D0000 PyObject *canonical = NULL; if (PyDict_SetDefaultRef(memo, old, old, &canonical) < 0) return -1; Py_DECREF(old); *key_ptr = canonical; return 0; #elif PY_VERSION_HEX >= 0x03040000 /* PyDict_SetDefault was added in Python 3.4 and returns a borrowed * reference to the canonical entry. */ PyObject *canonical = PyDict_SetDefault(memo, old, old); if (canonical == NULL) return -1; Py_INCREF(canonical); Py_DECREF(old); *key_ptr = canonical; return 0; #else /* Python 2.7: no PyDict_SetDefault, use GetItem + SetItem. */ PyObject *canonical = PyDict_GetItem(memo, old); /* borrowed, no error */ if (canonical == NULL) { if (PyDict_SetItem(memo, old, old) < 0) return -1; canonical = old; } Py_INCREF(canonical); Py_DECREF(old); *key_ptr = canonical; return 0; #endif } /* Check if obj is a dict or frozendict (Python 3.15+). * PyAnyDict_Check is provided by CPython 3.15; on older versions * fall back to plain PyDict_Check. */ #if PY_VERSION_HEX >= 0x030F0000 #define JSON_AnyDict_Check(obj) PyAnyDict_Check(obj) #else #define JSON_AnyDict_Check(obj) PyDict_Check(obj) #endif static int is_raw_json(_speedups_state *state, PyObject *obj) { int r = PyObject_IsInstance(obj, state->RawJSONType); if (r < 0) return -1; return r; } #if PY_VERSION_HEX >= 0x030E0000 /* ---- PyUnicodeWriter-backed JSON_Accu (Python 3.14+) ---- */ static int JSON_Accu_Init(JSON_Accu *acc) { acc->writer = PyUnicodeWriter_Create(0); if (acc->writer == NULL) return -1; return 0; } static int JSON_Accu_Accumulate(_speedups_state *state, JSON_Accu *acc, PyObject *unicode) { (void)state; assert(PyUnicode_Check(unicode)); return PyUnicodeWriter_WriteStr(acc->writer, unicode); } static PyObject * JSON_Accu_FinishAsList(_speedups_state *state, JSON_Accu *acc) { PyObject *str; PyObject *list; (void)state; str = PyUnicodeWriter_Finish(acc->writer); acc->writer = NULL; /* Finish consumed the writer */ if (str == NULL) return NULL; list = PyList_New(1); if (list 
== NULL) { Py_DECREF(str); return NULL; } PyList_SET_ITEM(list, 0, str); return list; } static void JSON_Accu_Destroy(JSON_Accu *acc) { if (acc->writer != NULL) { PyUnicodeWriter_Discard(acc->writer); acc->writer = NULL; } } #else /* PY_VERSION_HEX < 0x030E0000 */ /* ---- List-backed JSON_Accu (Python < 3.14) ---- */ static int JSON_Accu_Init(JSON_Accu *acc) { /* Lazily allocated */ acc->large_strings = NULL; acc->small_strings = PyList_New(0); if (acc->small_strings == NULL) return -1; return 0; } static int flush_accumulator(_speedups_state *state, JSON_Accu *acc) { Py_ssize_t nsmall = PyList_GET_SIZE(acc->small_strings); if (nsmall) { int ret; PyObject *joined; if (acc->large_strings == NULL) { acc->large_strings = PyList_New(0); if (acc->large_strings == NULL) return -1; } #if PY_MAJOR_VERSION >= 3 joined = join_list_unicode(state, acc->small_strings); #else joined = join_list_string(state, acc->small_strings); #endif if (joined == NULL) return -1; if (PyList_SetSlice(acc->small_strings, 0, nsmall, NULL)) { Py_DECREF(joined); return -1; } ret = PyList_Append(acc->large_strings, joined); Py_DECREF(joined); return ret; } return 0; } static int JSON_Accu_Accumulate(_speedups_state *state, JSON_Accu *acc, PyObject *unicode) { Py_ssize_t nsmall; #if PY_MAJOR_VERSION >= 3 assert(PyUnicode_Check(unicode)); #else /* PY_MAJOR_VERSION >= 3 */ assert(PyString_Check(unicode) || PyUnicode_Check(unicode)); #endif /* PY_MAJOR_VERSION < 3 */ if (PyList_Append(acc->small_strings, unicode)) return -1; nsmall = PyList_GET_SIZE(acc->small_strings); /* Each item in a list of unicode objects has an overhead (in 64-bit * builds) of: * - 8 bytes for the list slot * - 56 bytes for the header of the unicode object * that is, 64 bytes. 100000 such objects waste more than 6MB * compared to a single concatenated string. 
*/ if (nsmall < 100000) return 0; return flush_accumulator(state, acc); } static PyObject * JSON_Accu_FinishAsList(_speedups_state *state, JSON_Accu *acc) { int ret; PyObject *res; ret = flush_accumulator(state, acc); Py_CLEAR(acc->small_strings); if (ret) { Py_CLEAR(acc->large_strings); return NULL; } res = acc->large_strings; acc->large_strings = NULL; if (res == NULL) return PyList_New(0); return res; } static void JSON_Accu_Destroy(JSON_Accu *acc) { /* Safe to call unconditionally, including after JSON_Accu_FinishAsList * (which clears small_strings and transfers ownership of * large_strings to its return value). Py_CLEAR handles the NULL * case, so repeat calls are no-ops. */ Py_CLEAR(acc->small_strings); Py_CLEAR(acc->large_strings); } #endif /* PY_VERSION_HEX >= 0x030E0000 */ static int IS_DIGIT(JSON_UNICHR c) { return c >= '0' && c <= '9'; } static PyObject * maybe_quote_bigint(PyEncoderObject* s, PyObject *encoded, PyObject *obj) { int ge, le; PyObject *quoted; /* int_as_string_bitcount is not set: fast path, return as-is. */ if (s->max_long_size == Py_None || s->min_long_size == Py_None) return encoded; ge = PyObject_RichCompareBool(obj, s->max_long_size, Py_GE); if (ge < 0) { Py_DECREF(encoded); return NULL; } le = PyObject_RichCompareBool(obj, s->min_long_size, Py_LE); if (le < 0) { Py_DECREF(encoded); return NULL; } if (!(ge || le)) return encoded; #if PY_MAJOR_VERSION >= 3 quoted = PyUnicode_FromFormat("\"%U\"", encoded); #else quoted = PyString_FromFormat("\"%s\"", PyString_AsString(encoded)); #endif Py_DECREF(encoded); return quoted; } /* Stringify an int/long to its JSON decimal form. For int/long subclasses * we first normalize through PyLong_Type so custom __str__ / __repr__ * overrides don't inject garbage into the JSON output (see #118). 
*/ static PyObject * encoder_long_to_str(PyObject *obj) { PyObject *encoded; PyObject *tmp; if (PyInt_CheckExact(obj) || PyLong_CheckExact(obj)) return PyObject_Str(obj); tmp = PyObject_CallOneArg((PyObject *)&PyLong_Type, obj); if (tmp == NULL) return NULL; encoded = PyObject_Str(tmp); Py_DECREF(tmp); return encoded; } static int _call_json_method(PyObject *obj, PyObject *method_name, PyObject **result) { int rval = 0; /* method_name is an interned PyObject string cached in module state * (state->JSON_attr_for_json or state->JSON_attr_asdict), so this * avoids the char-to-interned-unicode conversion on every call. */ PyObject *method = PyObject_GetAttr(obj, method_name); if (method == NULL) { if (PyErr_ExceptionMatches(PyExc_AttributeError)) { PyErr_Clear(); return 0; } /* Non-AttributeError from __getattr__ (e.g. MemoryError, * KeyboardInterrupt): propagate via NULL result so the caller * forwards the pending exception to encoder_steal_encode. */ *result = NULL; return 1; } if (PyCallable_Check(method)) { PyObject *tmp = PyObject_CallNoArgs(method); if (tmp == NULL && PyErr_ExceptionMatches(PyExc_TypeError)) { PyErr_Clear(); } else { /* This will set result to NULL if a TypeError occurred, * which must be checked by the caller */ *result = tmp; rval = 1; } } Py_DECREF(method); return rval; } static Py_ssize_t ascii_escape_char(JSON_UNICHR c, char *output, Py_ssize_t chars) { /* Escape unicode code point c to ASCII escape sequences in char *output. 
output must have at least 12 bytes unused to accommodate an escaped surrogate pair "\uXXXX\uXXXX" */ if (S_CHAR(c)) { output[chars++] = (char)c; } else { output[chars++] = '\\'; switch (c) { case '\\': output[chars++] = (char)c; break; case '"': output[chars++] = (char)c; break; case '\b': output[chars++] = 'b'; break; case '\f': output[chars++] = 'f'; break; case '\n': output[chars++] = 'n'; break; case '\r': output[chars++] = 'r'; break; case '\t': output[chars++] = 't'; break; default: #if PY_MAJOR_VERSION >= 3 || defined(Py_UNICODE_WIDE) if (c >= 0x10000) { /* UTF-16 surrogate pair */ JSON_UNICHR v = c - 0x10000; c = 0xd800 | ((v >> 10) & 0x3ff); output[chars++] = 'u'; output[chars++] = "0123456789abcdef"[(c >> 12) & 0xf]; output[chars++] = "0123456789abcdef"[(c >> 8) & 0xf]; output[chars++] = "0123456789abcdef"[(c >> 4) & 0xf]; output[chars++] = "0123456789abcdef"[(c ) & 0xf]; c = 0xdc00 | (v & 0x3ff); output[chars++] = '\\'; } #endif output[chars++] = 'u'; output[chars++] = "0123456789abcdef"[(c >> 12) & 0xf]; output[chars++] = "0123456789abcdef"[(c >> 8) & 0xf]; output[chars++] = "0123456789abcdef"[(c >> 4) & 0xf]; output[chars++] = "0123456789abcdef"[(c ) & 0xf]; } } return chars; } #if PY_VERSION_HEX < 0x030E0000 || PY_MAJOR_VERSION < 3 /* Only needed by the two-pass ascii_escape_unicode (pre-3.14) and * ascii_escape_str (Python 2). The PyUnicodeWriter path on 3.14+ * computes sizes implicitly, so this would be unused there. 
*/ static Py_ssize_t ascii_char_size(JSON_UNICHR c) { if (S_CHAR(c)) { return 1; } else if (c == '\\' || c == '"' || c == '\b' || c == '\f' || c == '\n' || c == '\r' || c == '\t') { return 2; } #if PY_MAJOR_VERSION >= 3 || defined(Py_UNICODE_WIDE) else if (c >= 0x10000U) { return 2 * MIN_EXPANSION; } #endif else { return MIN_EXPANSION; } } #endif /* PY_VERSION_HEX < 0x030E0000 || PY_MAJOR_VERSION < 3 */ #if PY_VERSION_HEX >= 0x030E0000 static PyObject * ascii_escape_unicode(PyObject *pystr) { /* Single-pass implementation using PyUnicodeWriter (Python 3.14+). * Writes runs of safe characters via WriteSubstring and escape * sequences via WriteUTF8 (all escape output is pure ASCII). */ Py_ssize_t i; Py_ssize_t input_chars = PyUnicode_GET_LENGTH(pystr); int kind = PyUnicode_KIND(pystr); void *data = PyUnicode_DATA(pystr); Py_ssize_t run_start = 0; PyUnicodeWriter *writer = PyUnicodeWriter_Create(input_chars + 2); if (writer == NULL) return NULL; if (PyUnicodeWriter_WriteChar(writer, '"') < 0) goto bail; for (i = 0; i < input_chars; i++) { JSON_UNICHR c = PyUnicode_READ(kind, data, i); if (S_CHAR(c)) continue; /* Flush run of safe characters */ if (i > run_start) { if (PyUnicodeWriter_WriteSubstring(writer, pystr, run_start, i) < 0) goto bail; } /* Write escape sequence */ { char buf[12]; Py_ssize_t len = ascii_escape_char(c, buf, 0); if (PyUnicodeWriter_WriteUTF8(writer, buf, len) < 0) goto bail; } run_start = i + 1; } /* Flush remaining safe characters */ if (i > run_start) { if (PyUnicodeWriter_WriteSubstring(writer, pystr, run_start, i) < 0) goto bail; } if (PyUnicodeWriter_WriteChar(writer, '"') < 0) goto bail; return PyUnicodeWriter_Finish(writer); bail: PyUnicodeWriter_Discard(writer); return NULL; } #else /* PY_VERSION_HEX < 0x030E0000 */ static PyObject * ascii_escape_unicode(PyObject *pystr) { /* Two-pass implementation: calculate exact output size, then fill. 
*/ Py_ssize_t i; Py_ssize_t input_chars = PyUnicode_GET_LENGTH(pystr); Py_ssize_t output_size = 2; Py_ssize_t chars; PY2_UNUSED int kind = PyUnicode_KIND(pystr); void *data = PyUnicode_DATA(pystr); PyObject *rval; char *output; output_size = 2; for (i = 0; i < input_chars; i++) { Py_ssize_t charsize = ascii_char_size(PyUnicode_READ(kind, data, i)); if (output_size > PY_SSIZE_T_MAX - charsize) { PyErr_SetString(PyExc_OverflowError, "string is too long to escape"); return NULL; } output_size += charsize; } #if PY_MAJOR_VERSION >= 3 rval = PyUnicode_New(output_size, 127); if (rval == NULL) { return NULL; } assert(PyUnicode_KIND(rval) == PyUnicode_1BYTE_KIND); output = (char *)PyUnicode_DATA(rval); #else rval = PyString_FromStringAndSize(NULL, output_size); if (rval == NULL) { return NULL; } output = PyString_AS_STRING(rval); #endif chars = 0; output[chars++] = '"'; for (i = 0; i < input_chars; i++) { chars = ascii_escape_char(PyUnicode_READ(kind, data, i), output, chars); } output[chars++] = '"'; assert(chars == output_size); return rval; } #endif /* PY_VERSION_HEX >= 0x030E0000 */ #if PY_MAJOR_VERSION >= 3 static PyObject * ascii_escape_str(PyObject *pystr) { PyObject *rval; PyObject *input = PyUnicode_DecodeUTF8(PyBytes_AS_STRING(pystr), PyBytes_GET_SIZE(pystr), NULL); if (input == NULL) return NULL; rval = ascii_escape_unicode(input); Py_DECREF(input); return rval; } #else /* PY_MAJOR_VERSION >= 3 */ static PyObject * ascii_escape_str(PyObject *pystr) { /* Take a PyString pystr and return a new ASCII-only escaped PyString */ Py_ssize_t i; Py_ssize_t input_chars; Py_ssize_t output_size; Py_ssize_t chars; PyObject *rval; char *output; char *input_str; input_chars = PyString_GET_SIZE(pystr); input_str = PyString_AS_STRING(pystr); output_size = 2; /* Fast path for a string that's already ASCII */ for (i = 0; i < input_chars; i++) { JSON_UNICHR c = (JSON_UNICHR)input_str[i]; if (c > 0x7f) { /* We hit a non-ASCII character, bail to unicode mode */ PyObject *uni; uni = 
PyUnicode_DecodeUTF8(input_str, input_chars, "strict"); if (uni == NULL) { return NULL; } rval = ascii_escape_unicode(uni); Py_DECREF(uni); return rval; } { Py_ssize_t charsize = ascii_char_size(c); if (output_size > PY_SSIZE_T_MAX - charsize) { PyErr_SetString(PyExc_OverflowError, "string is too long to escape"); return NULL; } output_size += charsize; } } rval = PyString_FromStringAndSize(NULL, output_size); if (rval == NULL) { return NULL; } chars = 0; output = PyString_AS_STRING(rval); output[chars++] = '"'; for (i = 0; i < input_chars; i++) { chars = ascii_escape_char((JSON_UNICHR)input_str[i], output, chars); } output[chars++] = '"'; assert(chars == output_size); return rval; } #endif /* PY_MAJOR_VERSION < 3 */ static PyObject * encoder_stringify_key(PyEncoderObject *s, PyObject *key) { _speedups_state *state = get_speedups_state(s->module_ref); if (PyUnicode_Check(key)) { Py_INCREF(key); return key; } #if PY_MAJOR_VERSION >= 3 else if (PyBytes_Check(key) && s->encoding != Py_None) { const char *encoding = PyUnicode_AsUTF8(s->encoding); if (encoding == NULL) return NULL; return PyUnicode_Decode( PyBytes_AS_STRING(key), PyBytes_GET_SIZE(key), encoding, NULL); } #else /* PY_MAJOR_VERSION >= 3 */ else if (PyString_Check(key)) { Py_INCREF(key); return key; } #endif /* PY_MAJOR_VERSION < 3 */ else if (PyFloat_Check(key)) { return encoder_encode_float(s, key); } else if (key == Py_True || key == Py_False || key == Py_None) { /* This must come before the PyInt_Check because True and False are also 1 and 0.*/ return _encoded_const(state, key); } else if (PyInt_Check(key) || PyLong_Check(key)) { return encoder_long_to_str(key); } else if (s->use_decimal && PyObject_TypeCheck(key, (PyTypeObject *)s->Decimal)) { return PyObject_Str(key); } if (s->skipkeys) { Py_INCREF(Py_None); return Py_None; } PyErr_Format(PyExc_TypeError, "keys must be str, int, float, bool or None, " "not %.100s", key->ob_type->tp_name); return NULL; } /* Call list.sort(**item_sort_kw) on `lst`. 
Returns 0 on success, * -1 on error. Factored out so the fast and slow paths of * encoder_dict_iteritems share one implementation. */ static int encoder_sort_items_inplace(PyEncoderObject *s, PyObject *lst) { _speedups_state *state = get_speedups_state(s->module_ref); PyObject *sortfun; PyObject *sortres; sortfun = PyObject_GetAttr(lst, state->JSON_attr_sort); if (sortfun == NULL) return -1; sortres = PyObject_Call(sortfun, state->JSON_sortargs, s->item_sort_kw); Py_DECREF(sortfun); if (sortres == NULL) return -1; Py_DECREF(sortres); return 0; } /* True iff `key` is a Python string type that can be used verbatim * as a JSON object key (PyUnicode on all versions, PyString also on * Python 2). */ static inline int is_json_string_key(PyObject *key) { #if PY_MAJOR_VERSION < 3 if (PyString_Check(key)) return 1; #endif return PyUnicode_Check(key); } static PyObject * encoder_dict_iteritems(PyEncoderObject *s, PyObject *dct) { PyObject *items; PyObject *iter = NULL; PyObject *lst = NULL; PyObject *item = NULL; PyObject *kstr = NULL; Py_ssize_t size; Py_ssize_t i; if (PyDict_CheckExact(dct)) items = PyDict_Items(dct); else items = PyMapping_Items(dct); if (items == NULL) return NULL; /* Unsorted path: return iter(items) directly. */ if (s->item_sort_kw == Py_None) { iter = PyObject_GetIter(items); Py_DECREF(items); return iter; } /* Sorted path. Fast sub-path: if every key is already a JSON- * compatible string, sort the items list in place and return * iter(items). No per-item tuple rebuild, no list alloc, no * stringify branch in the hot loop. This is the overwhelmingly * common case: JSON object keys are typically strings. Scan the * list once to establish it; on any non-string key fall through * to the general path below. 
*/ size = PyList_GET_SIZE(items); for (i = 0; i < size; i++) { PyObject *it = PyList_GET_ITEM(items, i); PyObject *key; if (!PyTuple_Check(it) || Py_SIZE(it) != 2) { PyErr_SetString(PyExc_ValueError, "items must return 2-tuples"); Py_DECREF(items); return NULL; } key = PyTuple_GET_ITEM(it, 0); if (!is_json_string_key(key)) break; } if (i == size) { if (encoder_sort_items_inplace(s, items) < 0) { Py_DECREF(items); return NULL; } iter = PyObject_GetIter(items); Py_DECREF(items); return iter; } /* Slow path: at least one key needs to be stringified before the * sort. Walk the items list from the first offending index `i`, * accumulating a new list with replacement tuples as needed. */ iter = PyObject_GetIter(items); Py_DECREF(items); if (iter == NULL) return NULL; lst = PyList_New(0); if (lst == NULL) goto bail; while ((item = PyIter_Next(iter))) { PyObject *key, *value; /* items comes from the original iter we built; the tuple shape * was already validated by the fast-path pre-scan above, but * a user-defined mapping could return non-tuples out of order * here if the 2-tuple check above happened to succeed on * other entries, so revalidate. 
*/ if (!PyTuple_Check(item) || Py_SIZE(item) != 2) { PyErr_SetString(PyExc_ValueError, "items must return 2-tuples"); goto bail; } key = PyTuple_GET_ITEM(item, 0); if (!is_json_string_key(key)) { PyObject *tpl; kstr = encoder_stringify_key(s, key); if (kstr == NULL) goto bail; if (kstr == Py_None) { /* skipkeys */ Py_CLEAR(kstr); Py_CLEAR(item); continue; } value = PyTuple_GET_ITEM(item, 1); tpl = PyTuple_Pack(2, kstr, value); if (tpl == NULL) goto bail; Py_CLEAR(kstr); Py_DECREF(item); item = tpl; } if (PyList_Append(lst, item)) goto bail; Py_CLEAR(item); } Py_CLEAR(iter); if (PyErr_Occurred()) goto bail; if (encoder_sort_items_inplace(s, lst) < 0) goto bail; iter = PyObject_GetIter(lst); Py_CLEAR(lst); return iter; bail: Py_XDECREF(kstr); Py_XDECREF(item); Py_XDECREF(lst); Py_XDECREF(iter); return NULL; } /* Use JSONDecodeError exception to raise a nice looking ValueError subclass */ static void raise_errmsg(_speedups_state *state, const char *msg, PyObject *s, Py_ssize_t end) { PyObject *JSONDecodeError = state->JSONDecodeError; PyObject *exc = PyObject_CallFunction(JSONDecodeError, "(zOn)", msg, s, end); if (exc) { PyErr_SetObject(JSONDecodeError, exc); Py_DECREF(exc); } } #if PY_VERSION_HEX < 0x030E0000 static PyObject * join_list_unicode(_speedups_state *state, PyObject *lst) { /* return u''.join(lst) */ return PyUnicode_Join(state->JSON_EmptyUnicode, lst); } #endif #if PY_MAJOR_VERSION < 3 static PyObject * join_list_string(_speedups_state *state, PyObject *lst) { /* return ''.join(lst) */ return PyObject_CallOneArg(state->JSON_EmptyStr_join, lst); } #endif /* PY_MAJOR_VERSION < 3 */ #define APPEND_OLD_CHUNK \ if (chunk != NULL) { \ if (chunks == NULL) { \ chunks = PyList_New(0); \ if (chunks == NULL) { \ goto bail; \ } \ } \ if (PyList_Append(chunks, chunk)) { \ goto bail; \ } \ Py_CLEAR(chunk); \ } #if PY_MAJOR_VERSION < 3 static PyObject * scanstring_str(_speedups_state *state, PyObject *pystr, Py_ssize_t end, const char *encoding, int strict, Py_ssize_t 
*next_end_ptr) { /* Read the JSON string from PyString pystr. end is the index of the first character after the quote. encoding is the encoding of pystr (must be an ASCII superset) if strict is zero then literal control characters are allowed *next_end_ptr is a return-by-reference index of the character after the end quote Return value is a new PyString (if ASCII-only) or PyUnicode */ PyObject *rval; Py_ssize_t len = PyString_GET_SIZE(pystr); Py_ssize_t begin = end - 1; Py_ssize_t next = begin; int has_unicode = 0; char *buf = PyString_AS_STRING(pystr); PyObject *chunks = NULL; PyObject *chunk = NULL; PyObject *strchunk = NULL; if (len == end) { raise_errmsg(state, ERR_STRING_UNTERMINATED, pystr, begin); goto bail; } else if (end < 0 || len < end) { /* Out-of-range end: match py_scanstring, which raises * JSONDecodeError("Unterminated string starting at") so that * user code using `except JSONDecodeError` catches the C path * the same way it catches the pure-Python path. */ raise_errmsg(state, ERR_STRING_UNTERMINATED, pystr, begin); goto bail; } while (1) { /* Find the end of the string or the next escape */ Py_UNICODE c = 0; for (next = end; next < len; next++) { c = (unsigned char)buf[next]; if (c == '"' || c == '\\') { break; } else if (strict && c <= 0x1f) { raise_errmsg(state, ERR_STRING_CONTROL, pystr, next); goto bail; } else if (c > 0x7f) { has_unicode = 1; } } if (!(c == '"' || c == '\\')) { raise_errmsg(state, ERR_STRING_UNTERMINATED, pystr, begin); goto bail; } /* Pick up this chunk if it's not zero length */ if (next != end) { APPEND_OLD_CHUNK strchunk = PyString_FromStringAndSize(&buf[end], next - end); if (strchunk == NULL) { goto bail; } if (has_unicode) { chunk = PyUnicode_FromEncodedObject(strchunk, encoding, NULL); Py_DECREF(strchunk); if (chunk == NULL) { goto bail; } } else { chunk = strchunk; } } next++; if (c == '"') { end = next; break; } if (next == len) { raise_errmsg(state, ERR_STRING_UNTERMINATED, pystr, begin); goto bail; } c = 
buf[next]; if (c != 'u') { /* Non-unicode backslash escapes */ end = next + 1; switch (c) { case '"': break; case '\\': break; case '/': break; case 'b': c = '\b'; break; case 'f': c = '\f'; break; case 'n': c = '\n'; break; case 'r': c = '\r'; break; case 't': c = '\t'; break; default: c = 0; } if (c == 0) { raise_errmsg(state, ERR_STRING_ESC1, pystr, end - 2); goto bail; } } else { c = 0; next++; end = next + 4; if (end > len) { raise_errmsg(state, ERR_STRING_ESC4, pystr, next - 2); goto bail; } /* Decode 4 hex digits */ for (; next < end; next++) { JSON_UNICHR hex_digit = (JSON_UNICHR)buf[next]; c <<= 4; switch (hex_digit) { case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': c |= (hex_digit - '0'); break; case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': c |= (hex_digit - 'a' + 10); break; case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': c |= (hex_digit - 'A' + 10); break; default: raise_errmsg(state, ERR_STRING_ESC4, pystr, end - 6); goto bail; } } #if defined(Py_UNICODE_WIDE) /* Surrogate pair */ if ((c & 0xfc00) == 0xd800) { if (end + 6 <= len && buf[next] == '\\' && buf[next+1] == 'u') { JSON_UNICHR c2 = 0; end += 6; /* Decode 4 hex digits */ for (next += 2; next < end; next++) { c2 <<= 4; JSON_UNICHR hex_digit = buf[next]; switch (hex_digit) { case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': c2 |= (hex_digit - '0'); break; case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': c2 |= (hex_digit - 'a' + 10); break; case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': c2 |= (hex_digit - 'A' + 10); break; default: raise_errmsg(state, ERR_STRING_ESC4, pystr, end - 6); goto bail; } } if ((c2 & 0xfc00) != 0xdc00) { /* not a low surrogate, rewind */ end -= 6; next = end; } else { c = 0x10000 + (((c - 0xd800) << 10) | (c2 - 0xdc00)); } } } #endif /* Py_UNICODE_WIDE */ } if (c > 0x7f) { has_unicode = 1; } APPEND_OLD_CHUNK if 
(has_unicode) { chunk = PyUnicode_FromOrdinal(c); if (chunk == NULL) { goto bail; } } else { char c_char = Py_CHARMASK(c); chunk = PyString_FromStringAndSize(&c_char, 1); if (chunk == NULL) { goto bail; } } } if (chunks == NULL) { if (chunk != NULL) rval = chunk; else { rval = state->JSON_EmptyStr; Py_INCREF(rval); } } else { APPEND_OLD_CHUNK rval = join_list_string(state, chunks); if (rval == NULL) { goto bail; } Py_CLEAR(chunks); } *next_end_ptr = end; return rval; bail: *next_end_ptr = -1; Py_XDECREF(chunk); Py_XDECREF(chunks); return NULL; } #endif /* PY_MAJOR_VERSION < 3 */ #if PY_VERSION_HEX >= 0x030E0000 static PyObject * scanstring_unicode(_speedups_state *state, PyObject *pystr, Py_ssize_t end, int strict, Py_ssize_t *next_end_ptr) { /* Python 3.14+: use PyUnicodeWriter instead of a chunks list. * The writer is lazily created on the first escape sequence so that * the common no-escape path returns a cheap PyUnicode_Substring. */ PyObject *rval; Py_ssize_t begin = end - 1; Py_ssize_t next = begin; int kind = PyUnicode_KIND(pystr); Py_ssize_t len = PyUnicode_GET_LENGTH(pystr); void *buf = PyUnicode_DATA(pystr); PyUnicodeWriter *writer = NULL; Py_ssize_t literal_start; if (len == end) { raise_errmsg(state, ERR_STRING_UNTERMINATED, pystr, begin); goto bail; } else if (end < 0 || len < end) { /* Out-of-range end: match py_scanstring, which raises * JSONDecodeError("Unterminated string starting at") so that * user code using `except JSONDecodeError` catches the C path * the same way it catches the pure-Python path. 
*/ raise_errmsg(state, ERR_STRING_UNTERMINATED, pystr, begin); goto bail; } literal_start = end; while (1) { /* Find the end of the string or the next escape */ JSON_UNICHR c = 0; for (next = end; next < len; next++) { c = PyUnicode_READ(kind, buf, next); if (c == '"' || c == '\\') { break; } else if (strict && c <= 0x1f) { raise_errmsg(state, ERR_STRING_CONTROL, pystr, next); goto bail; } } if (!(c == '"' || c == '\\')) { raise_errmsg(state, ERR_STRING_UNTERMINATED, pystr, begin); goto bail; } next++; if (c == '"') { end = next; break; } /* Backslash escape: ensure writer exists and flush the * literal span [literal_start, next-1). */ if (writer == NULL) { writer = PyUnicodeWriter_Create(len - begin); if (writer == NULL) goto bail; } if (next - 1 > literal_start) { if (PyUnicodeWriter_WriteSubstring(writer, pystr, literal_start, next - 1) < 0) goto bail; } if (next == len) { raise_errmsg(state, ERR_STRING_UNTERMINATED, pystr, begin); goto bail; } c = PyUnicode_READ(kind, buf, next); if (c != 'u') { /* Non-unicode backslash escapes */ end = next + 1; switch (c) { case '"': break; case '\\': break; case '/': break; case 'b': c = '\b'; break; case 'f': c = '\f'; break; case 'n': c = '\n'; break; case 'r': c = '\r'; break; case 't': c = '\t'; break; default: c = 0; } if (c == 0) { raise_errmsg(state, ERR_STRING_ESC1, pystr, end - 2); goto bail; } } else { c = 0; next++; end = next + 4; if (end > len) { raise_errmsg(state, ERR_STRING_ESC4, pystr, next - 2); goto bail; } /* Decode 4 hex digits */ for (; next < end; next++) { JSON_UNICHR hex_digit = PyUnicode_READ(kind, buf, next); c <<= 4; switch (hex_digit) { case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': c |= (hex_digit - '0'); break; case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': c |= (hex_digit - 'a' + 10); break; case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': c |= (hex_digit - 'A' + 10); break; default: raise_errmsg(state,
ERR_STRING_ESC4, pystr, end - 6); goto bail; } } /* Surrogate pair */ if ((c & 0xfc00) == 0xd800) { JSON_UNICHR c2 = 0; if (end + 6 <= len && PyUnicode_READ(kind, buf, next) == '\\' && PyUnicode_READ(kind, buf, next + 1) == 'u') { end += 6; /* Decode 4 hex digits */ for (next += 2; next < end; next++) { JSON_UNICHR hex_digit = PyUnicode_READ(kind, buf, next); c2 <<= 4; switch (hex_digit) { case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': c2 |= (hex_digit - '0'); break; case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': c2 |= (hex_digit - 'a' + 10); break; case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': c2 |= (hex_digit - 'A' + 10); break; default: raise_errmsg(state, ERR_STRING_ESC4, pystr, end - 6); goto bail; } } if ((c2 & 0xfc00) != 0xdc00) { /* not a low surrogate, rewind */ end -= 6; next = end; } else { c = 0x10000 + (((c - 0xd800) << 10) | (c2 - 0xdc00)); } } } } if (PyUnicodeWriter_WriteChar(writer, c) < 0) goto bail; literal_start = end; } /* Finalize */ if (writer == NULL) { /* No escape sequences: return a substring directly. */ if (end - 1 > literal_start) rval = PyUnicode_Substring(pystr, literal_start, end - 1); else { rval = state->JSON_EmptyUnicode; Py_INCREF(rval); } } else { /* Flush trailing literal span after the last escape. */ if (end - 1 > literal_start) { if (PyUnicodeWriter_WriteSubstring(writer, pystr, literal_start, end - 1) < 0) goto bail; } rval = PyUnicodeWriter_Finish(writer); writer = NULL; /* Finish consumed the writer */ if (rval == NULL) goto bail; } *next_end_ptr = end; return rval; bail: if (writer != NULL) PyUnicodeWriter_Discard(writer); *next_end_ptr = -1; return NULL; } #else /* PY_VERSION_HEX < 0x030E0000 */ static PyObject * scanstring_unicode(_speedups_state *state, PyObject *pystr, Py_ssize_t end, int strict, Py_ssize_t *next_end_ptr) { /* Read the JSON string from PyUnicode pystr. end is the index of the first character after the quote. 
if strict is zero then literal control characters are allowed *next_end_ptr is a return-by-reference index of the character after the end quote Return value is a new PyUnicode */ PyObject *rval; Py_ssize_t begin = end - 1; Py_ssize_t next = begin; PY2_UNUSED int kind = PyUnicode_KIND(pystr); Py_ssize_t len = PyUnicode_GET_LENGTH(pystr); void *buf = PyUnicode_DATA(pystr); PyObject *chunks = NULL; PyObject *chunk = NULL; if (len == end) { raise_errmsg(state, ERR_STRING_UNTERMINATED, pystr, begin); goto bail; } else if (end < 0 || len < end) { /* Out-of-range end: match py_scanstring, which raises * JSONDecodeError("Unterminated string starting at") so that * user code using `except JSONDecodeError` catches the C path * the same way it catches the pure-Python path. */ raise_errmsg(state, ERR_STRING_UNTERMINATED, pystr, begin); goto bail; } while (1) { /* Find the end of the string or the next escape */ JSON_UNICHR c = 0; for (next = end; next < len; next++) { c = PyUnicode_READ(kind, buf, next); if (c == '"' || c == '\\') { break; } else if (strict && c <= 0x1f) { raise_errmsg(state, ERR_STRING_CONTROL, pystr, next); goto bail; } } if (!(c == '"' || c == '\\')) { raise_errmsg(state, ERR_STRING_UNTERMINATED, pystr, begin); goto bail; } /* Pick up this chunk if it's not zero length */ if (next != end) { APPEND_OLD_CHUNK #if PY_MAJOR_VERSION < 3 chunk = PyUnicode_FromUnicode(&((const Py_UNICODE *)buf)[end], next - end); #else chunk = PyUnicode_Substring(pystr, end, next); #endif if (chunk == NULL) { goto bail; } } next++; if (c == '"') { end = next; break; } if (next == len) { raise_errmsg(state, ERR_STRING_UNTERMINATED, pystr, begin); goto bail; } c = PyUnicode_READ(kind, buf, next); if (c != 'u') { /* Non-unicode backslash escapes */ end = next + 1; switch (c) { case '"': break; case '\\': break; case '/': break; case 'b': c = '\b'; break; case 'f': c = '\f'; break; case 'n': c = '\n'; break; case 'r': c = '\r'; break; case 't': c = '\t'; break; default: c = 0; } if (c 
== 0) { raise_errmsg(state, ERR_STRING_ESC1, pystr, end - 2); goto bail; } } else { c = 0; next++; end = next + 4; if (end > len) { raise_errmsg(state, ERR_STRING_ESC4, pystr, next - 2); goto bail; } /* Decode 4 hex digits */ for (; next < end; next++) { JSON_UNICHR hex_digit = PyUnicode_READ(kind, buf, next); c <<= 4; switch (hex_digit) { case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': c |= (hex_digit - '0'); break; case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': c |= (hex_digit - 'a' + 10); break; case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': c |= (hex_digit - 'A' + 10); break; default: raise_errmsg(state, ERR_STRING_ESC4, pystr, end - 6); goto bail; } } #if PY_MAJOR_VERSION >= 3 || defined(Py_UNICODE_WIDE) /* Surrogate pair */ if ((c & 0xfc00) == 0xd800) { JSON_UNICHR c2 = 0; if (end + 6 <= len && PyUnicode_READ(kind, buf, next) == '\\' && PyUnicode_READ(kind, buf, next + 1) == 'u') { end += 6; /* Decode 4 hex digits */ for (next += 2; next < end; next++) { JSON_UNICHR hex_digit = PyUnicode_READ(kind, buf, next); c2 <<= 4; switch (hex_digit) { case '0': case '1': case '2': case '3': case '4': case '5': case '6': case '7': case '8': case '9': c2 |= (hex_digit - '0'); break; case 'a': case 'b': case 'c': case 'd': case 'e': case 'f': c2 |= (hex_digit - 'a' + 10); break; case 'A': case 'B': case 'C': case 'D': case 'E': case 'F': c2 |= (hex_digit - 'A' + 10); break; default: raise_errmsg(state, ERR_STRING_ESC4, pystr, end - 6); goto bail; } } if ((c2 & 0xfc00) != 0xdc00) { /* not a low surrogate, rewind */ end -= 6; next = end; } else { c = 0x10000 + (((c - 0xd800) << 10) | (c2 - 0xdc00)); } } } #endif } APPEND_OLD_CHUNK chunk = PyUnicode_FromOrdinal(c); if (chunk == NULL) { goto bail; } } if (chunks == NULL) { if (chunk != NULL) rval = chunk; else { rval = state->JSON_EmptyUnicode; Py_INCREF(rval); } } else { APPEND_OLD_CHUNK rval = join_list_unicode(state, chunks); if (rval == NULL) { 
goto bail; } Py_CLEAR(chunks); } *next_end_ptr = end; return rval; bail: *next_end_ptr = -1; Py_XDECREF(chunk); Py_XDECREF(chunks); return NULL; } #endif /* PY_VERSION_HEX >= 0x030E0000 */ PyDoc_STRVAR(pydoc_scanstring, "scanstring(basestring, end, encoding, strict=True) -> (str, end)\n" "\n" "Scan the string s for a JSON string. End is the index of the\n" "character in s after the quote that started the JSON string.\n" "Unescapes all valid JSON string escape sequences and raises ValueError\n" "on attempt to decode an invalid string. If strict is False then literal\n" "control characters are allowed in the string.\n" "\n" "Returns a tuple of the decoded string and the index of the character in s\n" "after the end quote." ); static PyObject * py_scanstring(PyObject* self UNUSED, PyObject *args) { PyObject *pystr; PyObject *rval; Py_ssize_t end; Py_ssize_t next_end = -1; char *encoding = NULL; int strict = 1; if (!PyArg_ParseTuple(args, "On|zi:scanstring", &pystr, &end, &encoding, &strict)) { return NULL; } if (encoding == NULL) { encoding = DEFAULT_ENCODING; } if (PyUnicode_Check(pystr)) { if (PyUnicode_READY(pystr)) return NULL; rval = scanstring_unicode(get_speedups_state(self), pystr, end, strict, &next_end); } #if PY_MAJOR_VERSION < 3 /* Using a bytes input is unsupported for scanning in Python 3. It is coerced to str in the decoder before it gets here. 
*/ else if (PyString_Check(pystr)) { rval = scanstring_str(get_speedups_state(self), pystr, end, encoding, strict, &next_end); } #endif else { PyErr_Format(PyExc_TypeError, "first argument must be a string, not %.80s", Py_TYPE(pystr)->tp_name); return NULL; } return _build_rval_index_tuple(rval, next_end); } PyDoc_STRVAR(pydoc_encode_basestring_ascii, "encode_basestring_ascii(basestring) -> str\n" "\n" "Return an ASCII-only JSON representation of a Python string" ); PyDoc_STRVAR(pydoc_encode_basestring, "encode_basestring(basestring) -> str\n" "\n" "Return a JSON representation of a Python string" ); static PyObject * py_encode_basestring_ascii(PyObject* self UNUSED, PyObject *pystr) { /* Return an ASCII-only JSON representation of a Python string */ /* METH_O */ if (PyBytes_Check(pystr)) { return ascii_escape_str(pystr); } else if (PyUnicode_Check(pystr)) { if (PyUnicode_READY(pystr)) return NULL; return ascii_escape_unicode(pystr); } else { PyErr_Format(PyExc_TypeError, "first argument must be a string, not %.80s", Py_TYPE(pystr)->tp_name); return NULL; } } /* encode_basestring: escape only control chars, backslash, and quote. * Non-ASCII characters pass through unchanged (ensure_ascii=False). */ #if PY_VERSION_HEX < 0x030E0000 || PY_MAJOR_VERSION < 3 /* Only needed by the two-pass escape_unicode_noascii (pre-3.14 and * Python 2). The PyUnicodeWriter path on 3.14+ writes escapes inline. 
*/ static Py_ssize_t escape_char_noascii(JSON_UNICHR c, JSON_UNICHR *output, Py_ssize_t chars) { if (!NEEDS_ESCAPE(c)) { output[chars++] = c; } else { output[chars++] = '\\'; switch (c) { case '\\': output[chars++] = '\\'; break; case '"': output[chars++] = '"'; break; case '\b': output[chars++] = 'b'; break; case '\f': output[chars++] = 'f'; break; case '\n': output[chars++] = 'n'; break; case '\r': output[chars++] = 'r'; break; case '\t': output[chars++] = 't'; break; default: /* Control character: \u00XX */ output[chars++] = 'u'; output[chars++] = '0'; output[chars++] = '0'; output[chars++] = "0123456789abcdef"[(c >> 4) & 0xf]; output[chars++] = "0123456789abcdef"[(c ) & 0xf]; } } return chars; } static Py_ssize_t escape_char_noascii_size(JSON_UNICHR c) { if (!NEEDS_ESCAPE(c)) return 1; switch (c) { case '\\': case '"': case '\b': case '\f': case '\n': case '\r': case '\t': return 2; default: return 6; /* \u00XX */ } } #endif /* PY_VERSION_HEX < 0x030E0000 || PY_MAJOR_VERSION < 3 */ #if PY_VERSION_HEX >= 0x030E0000 static PyObject * escape_unicode_noascii(PyObject *pystr) { /* Single-pass using PyUnicodeWriter (Python 3.14+). 
*/ Py_ssize_t i; Py_ssize_t input_chars = PyUnicode_GET_LENGTH(pystr); int kind = PyUnicode_KIND(pystr); void *data = PyUnicode_DATA(pystr); Py_ssize_t run_start = 0; PyUnicodeWriter *writer = PyUnicodeWriter_Create(input_chars + 2); if (writer == NULL) return NULL; if (PyUnicodeWriter_WriteChar(writer, '"') < 0) goto bail; for (i = 0; i < input_chars; i++) { JSON_UNICHR c = PyUnicode_READ(kind, data, i); if (!NEEDS_ESCAPE(c)) continue; /* Flush run of safe characters */ if (i > run_start) { if (PyUnicodeWriter_WriteSubstring(writer, pystr, run_start, i) < 0) goto bail; } { char buf[6]; /* longest escape: \u00XX */ Py_ssize_t len = 0; buf[len++] = '\\'; switch (c) { case '\\': buf[len++] = '\\'; break; case '"': buf[len++] = '"'; break; case '\b': buf[len++] = 'b'; break; case '\f': buf[len++] = 'f'; break; case '\n': buf[len++] = 'n'; break; case '\r': buf[len++] = 'r'; break; case '\t': buf[len++] = 't'; break; default: buf[len++] = 'u'; buf[len++] = '0'; buf[len++] = '0'; buf[len++] = "0123456789abcdef"[(c >> 4) & 0xf]; buf[len++] = "0123456789abcdef"[(c ) & 0xf]; } if (PyUnicodeWriter_WriteUTF8(writer, buf, len) < 0) goto bail; } run_start = i + 1; } if (i > run_start) { if (PyUnicodeWriter_WriteSubstring(writer, pystr, run_start, i) < 0) goto bail; } if (PyUnicodeWriter_WriteChar(writer, '"') < 0) goto bail; return PyUnicodeWriter_Finish(writer); bail: PyUnicodeWriter_Discard(writer); return NULL; } #else /* PY_VERSION_HEX < 0x030E0000 */ static PyObject * escape_unicode_noascii(PyObject *pystr) { /* Two-pass: compute size, then fill. 
*/ Py_ssize_t i; Py_ssize_t input_chars = PyUnicode_GET_LENGTH(pystr); PY2_UNUSED int kind = PyUnicode_KIND(pystr); void *data = PyUnicode_DATA(pystr); Py_ssize_t output_size = 2; /* opening and closing quotes */ PyObject *rval; for (i = 0; i < input_chars; i++) { JSON_UNICHR c = PyUnicode_READ(kind, data, i); Py_ssize_t charsize = escape_char_noascii_size(c); if (output_size > PY_SSIZE_T_MAX - charsize) { PyErr_SetString(PyExc_OverflowError, "string is too long to escape"); return NULL; } output_size += charsize; } #if PY_MAJOR_VERSION >= 3 { /* Escapes are all ASCII, so maxchar doesn't increase */ Py_UCS4 maxchar = PyUnicode_MAX_CHAR_VALUE(pystr); if (maxchar < 127) maxchar = 127; rval = PyUnicode_New(output_size, maxchar); if (rval == NULL) return NULL; } { int out_kind = PyUnicode_KIND(rval); void *out_data = PyUnicode_DATA(rval); Py_ssize_t chars = 0; PyUnicode_WRITE(out_kind, out_data, chars++, '"'); for (i = 0; i < input_chars; i++) { JSON_UNICHR c = PyUnicode_READ(kind, data, i); if (!NEEDS_ESCAPE(c)) { PyUnicode_WRITE(out_kind, out_data, chars++, c); } else { JSON_UNICHR buf[6]; Py_ssize_t j, n = escape_char_noascii(c, buf, 0); for (j = 0; j < n; j++) PyUnicode_WRITE(out_kind, out_data, chars++, buf[j]); } } PyUnicode_WRITE(out_kind, out_data, chars++, '"'); assert(chars == output_size); } #else /* Python 2: return a unicode object */ rval = PyUnicode_FromUnicode(NULL, output_size); if (rval == NULL) return NULL; { Py_UNICODE *output = PyUnicode_AS_UNICODE(rval); Py_ssize_t chars = 0; output[chars++] = '"'; for (i = 0; i < input_chars; i++) { chars = escape_char_noascii( PyUnicode_READ(kind, data, i), output, chars); } output[chars++] = '"'; assert(chars == output_size); } #endif return rval; } #endif /* PY_VERSION_HEX >= 0x030E0000 */ static PyObject * py_encode_basestring(PyObject* self UNUSED, PyObject *pystr) { /* Return a JSON representation of a Python string (ensure_ascii=False) */ /* METH_O */ if (PyBytes_Check(pystr)) { PyObject *uni; PyObject 
*rval; uni = PyUnicode_DecodeUTF8( PyBytes_AS_STRING(pystr), PyBytes_GET_SIZE(pystr), NULL); if (uni == NULL) return NULL; rval = escape_unicode_noascii(uni); Py_DECREF(uni); return rval; } else if (PyUnicode_Check(pystr)) { if (PyUnicode_READY(pystr)) return NULL; return escape_unicode_noascii(pystr); } else { PyErr_Format(PyExc_TypeError, "first argument must be a string, not %.80s", Py_TYPE(pystr)->tp_name); return NULL; } } static void scanner_dealloc(PyObject *self) { /* bpo-31095: UnTrack is needed before calling any callbacks */ #if PY_VERSION_HEX >= 0x030D0000 PyTypeObject *tp = Py_TYPE(self); #endif PyObject_GC_UnTrack(self); scanner_clear(self); Py_TYPE(self)->tp_free(self); #if PY_VERSION_HEX >= 0x030D0000 Py_DECREF(tp); #endif } static int scanner_traverse(PyObject *self, visitproc visit, void *arg) { PyScannerObject *s = (PyScannerObject *)self; #if PY_VERSION_HEX >= 0x030D0000 /* Heap types must visit their type for GC. */ Py_VISIT(Py_TYPE(self)); #else assert(PyScanner_Check(self)); #endif #define JSON_VISIT_FIELD(f) Py_VISIT(s->f); JSON_SCANNER_OBJECT_FIELDS(JSON_VISIT_FIELD) #undef JSON_VISIT_FIELD return 0; } static int scanner_clear(PyObject *self) { PyScannerObject *s = (PyScannerObject *)self; #if PY_VERSION_HEX < 0x030D0000 assert(PyScanner_Check(self)); #endif #define JSON_CLEAR_FIELD(f) Py_CLEAR(s->f); JSON_SCANNER_OBJECT_FIELDS(JSON_CLEAR_FIELD) #undef JSON_CLEAR_FIELD return 0; } static PyObject * _parse_constant(PyScannerObject *s, PyObject *pystr, PyObject *constant, Py_ssize_t idx, Py_ssize_t *next_idx_ptr) { /* Read a JSON constant from pystr. `constant` is the Python string that was found ("NaN", "Infinity", "-Infinity"). Returns the result of s->parse_constant(constant). 
*/ PyObject *rval; if (s->parse_constant == Py_None) { raise_errmsg(get_speedups_state(s->module_ref), ERR_EXPECTING_VALUE, pystr, idx); return NULL; } rval = PyObject_CallOneArg(s->parse_constant, constant); idx += PyString_GET_SIZE(constant); *next_idx_ptr = idx; return rval; } /* -- Helper functions for _match_number fast paths (used by the _speedups_scan.h template). Factored out so the template can stay agnostic about PyFloat / PyInt vs PyObject_CallOneArg details. */ static inline PyObject * _match_number_float_fast_unicode(PyObject *numstr) { #if PY_MAJOR_VERSION >= 3 return PyFloat_FromString(numstr); #else return PyFloat_FromString(numstr, NULL); #endif } static inline PyObject * _match_number_int_fast_unicode(PyScannerObject *s, PyObject *numstr) { /* No fast path for unicode -> int; always call parse_int. */ return PyObject_CallOneArg(s->parse_int, numstr); } #if PY_MAJOR_VERSION < 3 static inline PyObject * _match_number_float_fast_str(PyObject *numstr) { double d = PyOS_string_to_double(PyString_AS_STRING(numstr), NULL, NULL); if (d == -1.0 && PyErr_Occurred()) return NULL; return PyFloat_FromDouble(d); } static inline PyObject * _match_number_int_fast_str(PyScannerObject *s, PyObject *numstr) { if (s->parse_int != (PyObject *)&PyInt_Type) { return PyObject_CallOneArg(s->parse_int, numstr); } return PyInt_FromString(PyString_AS_STRING(numstr), NULL, 10); } #endif /* -- Generate scan_once_unicode, _parse_object_unicode, _parse_array_unicode, _match_number_unicode from the shared template. 
-- */ #define JSON_SCAN_SUFFIX _unicode #define JSON_SCAN_DATA_INIT(p) \ PY2_UNUSED int kind = PyUnicode_KIND(p); \ void *str = PyUnicode_DATA(p); \ Py_ssize_t end_idx = PyUnicode_GET_LENGTH(p) - 1 #define JSON_SCAN_READ(i) PyUnicode_READ(kind, str, (i)) #define JSON_SCAN_SCANSTRING_CALL(pos, nextp) \ scanstring_unicode(state, pystr, (pos), s->strict, (nextp)) #if PY_MAJOR_VERSION >= 3 #define JSON_SCAN_NUMSTR_CREATE(sidx, eidx) \ PyUnicode_Substring(pystr, (sidx), (eidx)) #else #define JSON_SCAN_NUMSTR_CREATE(sidx, eidx) \ PyUnicode_FromUnicode(&((Py_UNICODE *)str)[(sidx)], (eidx) - (sidx)) #endif #define JSON_SCAN_PARSE_FLOAT_FAST(ns) _match_number_float_fast_unicode(ns) #define JSON_SCAN_PARSE_INT_FAST(ns) _match_number_int_fast_unicode(s, ns) #define JSON_SPEEDUPS_SCAN_INCLUDING 1 #include "_speedups_scan.h" #undef JSON_SPEEDUPS_SCAN_INCLUDING /* -- Generate the corresponding _str variants on Python 2. -- */ #if PY_MAJOR_VERSION < 3 #define JSON_SCAN_SUFFIX _str #define JSON_SCAN_DATA_INIT(p) \ char *str = PyString_AS_STRING(p); \ Py_ssize_t end_idx = PyString_GET_SIZE(p) - 1 #define JSON_SCAN_READ(i) ((unsigned char)str[(i)]) #define JSON_SCAN_SCANSTRING_CALL(pos, nextp) \ scanstring_str(state, pystr, (pos), \ PyString_AS_STRING(s->encoding), s->strict, (nextp)) #define JSON_SCAN_NUMSTR_CREATE(sidx, eidx) \ PyString_FromStringAndSize(&str[(sidx)], (eidx) - (sidx)) #define JSON_SCAN_PARSE_FLOAT_FAST(ns) _match_number_float_fast_str(ns) #define JSON_SCAN_PARSE_INT_FAST(ns) _match_number_int_fast_str(s, ns) #define JSON_SPEEDUPS_SCAN_INCLUDING 1 #include "_speedups_scan.h" #undef JSON_SPEEDUPS_SCAN_INCLUDING #endif /* PY_MAJOR_VERSION < 3 */ static PyObject * scanner_call(PyObject *self, PyObject *args, PyObject *kwds) { /* Python callable interface to scan_once_{str,unicode} */ PyObject *pystr; PyObject *rval = NULL; Py_ssize_t idx; Py_ssize_t next_idx = -1; static char *kwlist[] = {"string", "idx", NULL}; PyScannerObject *s; #if PY_VERSION_HEX < 0x030D0000 
assert(PyScanner_Check(self)); #endif s = (PyScannerObject *)self; if (!PyArg_ParseTupleAndKeywords(args, kwds, "On:scan_once", kwlist, &pystr, &idx)) return NULL; if (PyUnicode_Check(pystr)) { if (PyUnicode_READY(pystr)) return NULL; } #if PY_MAJOR_VERSION < 3 else if (!PyString_Check(pystr)) { #else else { #endif PyErr_Format(PyExc_TypeError, "first argument must be a string, not %.80s", Py_TYPE(pystr)->tp_name); return NULL; } Py_BEGIN_CRITICAL_SECTION(self); if (PyUnicode_Check(pystr)) { rval = scan_once_unicode(s, pystr, idx, &next_idx); } #if PY_MAJOR_VERSION < 3 else { rval = scan_once_str(s, pystr, idx, &next_idx); } #endif /* PY_MAJOR_VERSION < 3 */ PyDict_Clear(s->memo); Py_END_CRITICAL_SECTION(); return _build_rval_index_tuple(rval, next_idx); } static PyObject * JSON_ParseEncoding(PyObject *encoding) { if (encoding == Py_None) return JSON_InternFromString(DEFAULT_ENCODING); #if PY_MAJOR_VERSION >= 3 if (PyUnicode_Check(encoding)) { if (PyUnicode_AsUTF8(encoding) == NULL) { return NULL; } Py_INCREF(encoding); return encoding; } #else /* PY_MAJOR_VERSION >= 3 */ if (PyString_Check(encoding)) { Py_INCREF(encoding); return encoding; } if (PyUnicode_Check(encoding)) return PyUnicode_AsEncodedString(encoding, NULL, NULL); #endif /* PY_MAJOR_VERSION >= 3 */ PyErr_SetString(PyExc_TypeError, "encoding must be a string"); return NULL; } static PyObject * scanner_new(PyTypeObject *type, PyObject *args, PyObject *kwds) { /* Initialize Scanner object */ PyObject *ctx; static char *kwlist[] = {"context", NULL}; PyScannerObject *s; PyObject *encoding; if (!PyArg_ParseTupleAndKeywords(args, kwds, "O:make_scanner", kwlist, &ctx)) return NULL; s = (PyScannerObject *)type->tp_alloc(type, 0); if (s == NULL) return NULL; #if PY_VERSION_HEX >= 0x030D0000 s->module_ref = PyType_GetModuleByDef(type, &moduledef); if (s->module_ref == NULL) goto bail; #else s->module_ref = _speedups_module; /* borrowed */ #endif Py_INCREF(s->module_ref); if (s->memo == NULL) { s->memo = 
PyDict_New(); if (s->memo == NULL) goto bail; } /* Load required attributes from the Python-side JSONDecoder context. * Each getattr failure is a hard error; goto bail lets scanner_dealloc * release whatever we managed to set on s. */ #define LOAD_ATTR(field, name) \ do { \ s->field = PyObject_GetAttrString(ctx, name); \ if (s->field == NULL) \ goto bail; \ } while (0) encoding = PyObject_GetAttrString(ctx, "encoding"); if (encoding == NULL) goto bail; s->encoding = JSON_ParseEncoding(encoding); Py_XDECREF(encoding); if (s->encoding == NULL) goto bail; LOAD_ATTR(strict_bool, "strict"); s->strict = PyObject_IsTrue(s->strict_bool); if (s->strict < 0) goto bail; LOAD_ATTR(object_hook, "object_hook"); LOAD_ATTR(pairs_hook, "object_pairs_hook"); LOAD_ATTR(array_hook, "array_hook"); LOAD_ATTR(parse_float, "parse_float"); LOAD_ATTR(parse_int, "parse_int"); LOAD_ATTR(parse_constant, "parse_constant"); #undef LOAD_ATTR return (PyObject *)s; bail: Py_DECREF(s); return NULL; } PyDoc_STRVAR(scanner_doc, "JSON scanner object"); #if PY_VERSION_HEX >= 0x030D0000 /* Heap type slots and spec for Python 3.13+ */ static PyType_Slot PyScannerType_slots[] = { {Py_tp_doc, (void *)scanner_doc}, {Py_tp_dealloc, scanner_dealloc}, {Py_tp_call, scanner_call}, {Py_tp_traverse, scanner_traverse}, {Py_tp_clear, scanner_clear}, {Py_tp_members, scanner_members}, {Py_tp_new, scanner_new}, {0, NULL} }; static PyType_Spec PyScannerType_spec = { .name = "simplejson._speedups.Scanner", .basicsize = sizeof(PyScannerObject), .flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC, .slots = PyScannerType_slots, }; #else static PyTypeObject PyScannerType = { PyVarObject_HEAD_INIT(NULL, 0) "simplejson._speedups.Scanner", /* tp_name */ sizeof(PyScannerObject), /* tp_basicsize */ 0, /* tp_itemsize */ scanner_dealloc, /* tp_dealloc */ 0, /* tp_print */ 0, /* tp_getattr */ 0, /* tp_setattr */ 0, /* tp_compare */ 0, /* tp_repr */ 0, /* tp_as_number */ 0, /* tp_as_sequence */ 0, /* tp_as_mapping */ 0, /* tp_hash */ 
scanner_call, /* tp_call */ 0, /* tp_str */ 0,/* PyObject_GenericGetAttr, */ /* tp_getattro */ 0,/* PyObject_GenericSetAttr, */ /* tp_setattro */ 0, /* tp_as_buffer */ Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC, /* tp_flags */ scanner_doc, /* tp_doc */ scanner_traverse, /* tp_traverse */ scanner_clear, /* tp_clear */ 0, /* tp_richcompare */ 0, /* tp_weaklistoffset */ 0, /* tp_iter */ 0, /* tp_iternext */ 0, /* tp_methods */ scanner_members, /* tp_members */ 0, /* tp_getset */ 0, /* tp_base */ 0, /* tp_dict */ 0, /* tp_descr_get */ 0, /* tp_descr_set */ 0, /* tp_dictoffset */ 0, /* tp_init */ 0,/* PyType_GenericAlloc, */ /* tp_alloc */ scanner_new, /* tp_new */ 0,/* PyObject_GC_Del, */ /* tp_free */ }; #endif static PyObject * encoder_new(PyTypeObject *type, PyObject *args, PyObject *kwds) { static char *kwlist[] = { "markers", "default", "encoder", "indent", "key_separator", "item_separator", "sort_keys", "skipkeys", "allow_nan", "key_memo", "use_decimal", "namedtuple_as_object", "tuple_as_array", "int_as_string_bitcount", "item_sort_key", "encoding", "for_json", "ignore_nan", "Decimal", "iterable_as_array", NULL}; PyEncoderObject *s; PyObject *markers, *defaultfn, *encoder, *indent, *key_separator; PyObject *item_separator, *sort_keys, *skipkeys, *allow_nan, *key_memo; PyObject *use_decimal, *namedtuple_as_object, *tuple_as_array, *iterable_as_array; PyObject *int_as_string_bitcount, *item_sort_key, *encoding, *for_json; PyObject *ignore_nan, *Decimal; int is_true; /* Build the format string from per-argument pieces so that each "O" * has a comment and adding/removing an argument only touches one * line instead of counting letters in a 20-char string literal. The * order here must match kwlist[] above and the &var argument list * to PyArg_ParseTupleAndKeywords below. 
*/ static const char *const fmt = "O" /* markers */ "O" /* default */ "O" /* encoder */ "O" /* indent */ "O" /* key_separator */ "O" /* item_separator */ "O" /* sort_keys */ "O" /* skipkeys */ "O" /* allow_nan */ "O" /* key_memo */ "O" /* use_decimal */ "O" /* namedtuple_as_object */ "O" /* tuple_as_array */ "O" /* int_as_string_bitcount */ "O" /* item_sort_key */ "O" /* encoding */ "O" /* for_json */ "O" /* ignore_nan */ "O" /* Decimal */ "O" /* iterable_as_array */ ":make_encoder"; if (!PyArg_ParseTupleAndKeywords(args, kwds, fmt, kwlist, &markers, &defaultfn, &encoder, &indent, &key_separator, &item_separator, &sort_keys, &skipkeys, &allow_nan, &key_memo, &use_decimal, &namedtuple_as_object, &tuple_as_array, &int_as_string_bitcount, &item_sort_key, &encoding, &for_json, &ignore_nan, &Decimal, &iterable_as_array)) return NULL; s = (PyEncoderObject *)type->tp_alloc(type, 0); if (s == NULL) return NULL; #if PY_VERSION_HEX >= 0x030D0000 s->module_ref = PyType_GetModuleByDef(type, &moduledef); if (s->module_ref == NULL) goto bail; #else s->module_ref = _speedups_module; /* borrowed */ #endif Py_INCREF(s->module_ref); Py_INCREF(markers); s->markers = markers; Py_INCREF(defaultfn); s->defaultfn = defaultfn; Py_INCREF(encoder); s->encoder = encoder; #if PY_MAJOR_VERSION >= 3 if (encoding == Py_None) { /* Py3: encoding=None means "don't decode bytes keys/values". * Store Py_None rather than NULL so Py_T_OBJECT_EX exposes the * attribute as None (not AttributeError) and so tp_traverse / * tp_clear handle the slot uniformly. The bytes-path sentinel * checks in encoder_stringify_key and encoder_listencode_obj * compare against Py_None. 
*/ Py_INCREF(Py_None); s->encoding = Py_None; } else #endif /* PY_MAJOR_VERSION >= 3 */ { s->encoding = JSON_ParseEncoding(encoding); if (s->encoding == NULL) goto bail; } Py_INCREF(indent); s->indent = indent; Py_INCREF(key_separator); s->key_separator = key_separator; Py_INCREF(item_separator); s->item_separator = item_separator; Py_INCREF(skipkeys); s->skipkeys_bool = skipkeys; s->skipkeys = PyObject_IsTrue(skipkeys); if (s->skipkeys < 0) goto bail; Py_INCREF(key_memo); s->key_memo = key_memo; s->fast_encode = (PyCFunction_Check(s->encoder) && PyCFunction_GetFunction(s->encoder) == (PyCFunction)py_encode_basestring_ascii); is_true = PyObject_IsTrue(ignore_nan); if (is_true < 0) goto bail; s->allow_or_ignore_nan = is_true ? JSON_IGNORE_NAN : 0; is_true = PyObject_IsTrue(allow_nan); if (is_true < 0) goto bail; s->allow_or_ignore_nan |= is_true ? JSON_ALLOW_NAN : 0; s->use_decimal = PyObject_IsTrue(use_decimal); if (s->use_decimal < 0) goto bail; s->namedtuple_as_object = PyObject_IsTrue(namedtuple_as_object); if (s->namedtuple_as_object < 0) goto bail; s->tuple_as_array = PyObject_IsTrue(tuple_as_array); if (s->tuple_as_array < 0) goto bail; s->iterable_as_array = PyObject_IsTrue(iterable_as_array); if (s->iterable_as_array < 0) goto bail; if (PyInt_Check(int_as_string_bitcount) || PyLong_Check(int_as_string_bitcount)) { static const unsigned long long_long_bitsize = sizeof(long long) * CHAR_BIT; long int_as_string_bitcount_val = PyLong_AsLong(int_as_string_bitcount); if (int_as_string_bitcount_val == -1 && PyErr_Occurred()) goto bail; if (int_as_string_bitcount_val > 0 && int_as_string_bitcount_val < (long)long_long_bitsize) { int n = (int)int_as_string_bitcount_val; /* Compute 2^n as unsigned (well-defined for n < 64) and * -(2^n) as signed without UB. Naive "-1LL << n" is a * shift of a negative value, which is undefined, and * "-(1LL << n)" overflows when n == 63. 
The expression * below avoids both: (1ULL << n) - 1 is always >= 0, so * negating it and subtracting 1 stays in range and * produces LLONG_MIN at n == 63. */ s->max_long_size = PyLong_FromUnsignedLongLong(1ULL << n); s->min_long_size = PyLong_FromLongLong( -(long long)((1ULL << n) - 1ULL) - 1LL); if (s->min_long_size == NULL || s->max_long_size == NULL) { goto bail; } } else { PyErr_Format(PyExc_TypeError, "int_as_string_bitcount (%ld) must be greater than 0 and less than the number of bits of a `long long` type (%lu bits)", int_as_string_bitcount_val, long_long_bitsize); goto bail; } } else if (int_as_string_bitcount == Py_None) { Py_INCREF(Py_None); s->max_long_size = Py_None; Py_INCREF(Py_None); s->min_long_size = Py_None; } else { PyErr_SetString(PyExc_TypeError, "int_as_string_bitcount must be None or an integer"); goto bail; } if (item_sort_key != Py_None) { if (!PyCallable_Check(item_sort_key)) { PyErr_SetString(PyExc_TypeError, "item_sort_key must be None or callable"); goto bail; } } else { is_true = PyObject_IsTrue(sort_keys); if (is_true < 0) goto bail; if (is_true) { _speedups_state *state = get_speedups_state(s->module_ref); item_sort_key = state->JSON_itemgetter0; if (!item_sort_key) goto bail; } } if (item_sort_key == Py_None) { Py_INCREF(Py_None); s->item_sort_kw = Py_None; } else { s->item_sort_kw = PyDict_New(); if (s->item_sort_kw == NULL) goto bail; if (PyDict_SetItemString(s->item_sort_kw, "key", item_sort_key)) goto bail; } Py_INCREF(sort_keys); s->sort_keys = sort_keys; Py_INCREF(item_sort_key); s->item_sort_key = item_sort_key; Py_INCREF(Decimal); s->Decimal = Decimal; s->for_json = PyObject_IsTrue(for_json); if (s->for_json < 0) goto bail; return (PyObject *)s; bail: Py_DECREF(s); return NULL; } static PyObject * encoder_call(PyObject *self, PyObject *args, PyObject *kwds) { /* Python callable interface to encoder_listencode_obj */ static char *kwlist[] = {"obj", "_current_indent_level", NULL}; PyObject *obj; Py_ssize_t indent_level;
PyEncoderObject *s; JSON_Accu rval; _speedups_state *state; int encode_rv; #if PY_VERSION_HEX < 0x030D0000 assert(PyEncoder_Check(self)); #endif s = (PyEncoderObject *)self; state = get_speedups_state(s->module_ref); if (!PyArg_ParseTupleAndKeywords(args, kwds, "On:_iterencode", kwlist, &obj, &indent_level)) return NULL; if (JSON_Accu_Init(&rval)) return NULL; Py_BEGIN_CRITICAL_SECTION(self); encode_rv = encoder_listencode_obj(s, &rval, obj, indent_level); Py_END_CRITICAL_SECTION(); if (encode_rv) { JSON_Accu_Destroy(&rval); return NULL; } return JSON_Accu_FinishAsList(state, &rval); } static PyObject * _encoded_const(_speedups_state *state, PyObject *obj) { /* Return the JSON string representation of None, True, False */ if (obj == Py_None) { Py_INCREF(state->JSON_s_null); return state->JSON_s_null; } else if (obj == Py_True) { Py_INCREF(state->JSON_s_true); return state->JSON_s_true; } else if (obj == Py_False) { Py_INCREF(state->JSON_s_false); return state->JSON_s_false; } else { PyErr_SetString(PyExc_ValueError, "not a const"); return NULL; } } static PyObject * encoder_encode_float(PyEncoderObject *s, PyObject *obj) { /* Return the JSON representation of a PyFloat */ _speedups_state *state = get_speedups_state(s->module_ref); double i = PyFloat_AS_DOUBLE(obj); if (!Py_IS_FINITE(i)) { if (!s->allow_or_ignore_nan) { PyErr_SetString(PyExc_ValueError, "Out of range float values are not JSON compliant"); return NULL; } if (s->allow_or_ignore_nan & JSON_IGNORE_NAN) { return _encoded_const(state, Py_None); } /* JSON_ALLOW_NAN is set */ else if (i > 0) { Py_INCREF(state->JSON_Infinity); return state->JSON_Infinity; } else if (i < 0) { Py_INCREF(state->JSON_NegInfinity); return state->JSON_NegInfinity; } else { Py_INCREF(state->JSON_NaN); return state->JSON_NaN; } } /* Use a better float format here? 
*/
    if (PyFloat_CheckExact(obj)) {
        return PyObject_Repr(obj);
    } else {
        /* See #118, do not trust custom str/repr */
        PyObject *res;
        PyObject *tmp = PyObject_CallOneArg((PyObject *)&PyFloat_Type, obj);
        if (tmp == NULL) {
            return NULL;
        }
        res = PyObject_Repr(tmp);
        Py_DECREF(tmp);
        return res;
    }
}

static PyObject *
encoder_encode_string(PyEncoderObject *s, PyObject *obj)
{
    /* Return the JSON representation of a string */
    PyObject *encoded;

    if (s->fast_encode) {
        return py_encode_basestring_ascii(NULL, obj);
    }
    encoded = PyObject_CallOneArg(s->encoder, obj);
    if (encoded != NULL &&
#if PY_MAJOR_VERSION < 3
        !PyString_Check(encoded) &&
#endif /* PY_MAJOR_VERSION < 3 */
        !PyUnicode_Check(encoded))
    {
        PyErr_Format(PyExc_TypeError,
                     "encoder() must return a string, not %.80s",
                     Py_TYPE(encoded)->tp_name);
        Py_DECREF(encoded);
        return NULL;
    }
    return encoded;
}

static int
_steal_accumulate(_speedups_state *state, JSON_Accu *accu, PyObject *stolen)
{
    /* Append stolen and then decrement its reference count */
    int rval = JSON_Accu_Accumulate(state, accu, stolen);
    Py_DECREF(stolen);
    return rval;
}

/* Push a reference to `obj` into the encoder's circular-reference marker
 * dict, keyed by the object's address. Allocates a fresh PyLong ident and
 * stores it in *ident_ptr (caller owns the reference, and must pass it to
 * encoder_markers_pop to remove the entry). If markers tracking is off
 * (s->markers == Py_None), stores NULL in *ident_ptr and returns 0 with
 * no side effects; that NULL is the sentinel markers_pop understands.
 * On circular reference sets the ValueError; on any other failure sets
 * the corresponding exception. Returns 0 on success, -1 on error.
*/ static int encoder_markers_push(PyEncoderObject *s, PyObject *obj, PyObject **ident_ptr) { PyObject *ident; int has_key; *ident_ptr = NULL; if (s->markers == Py_None) return 0; ident = PyLong_FromVoidPtr(obj); if (ident == NULL) return -1; has_key = PyDict_Contains(s->markers, ident); if (has_key) { if (has_key != -1) PyErr_SetString(PyExc_ValueError, "Circular reference detected"); Py_DECREF(ident); return -1; } if (PyDict_SetItem(s->markers, ident, obj) < 0) { Py_DECREF(ident); return -1; } *ident_ptr = ident; return 0; } /* Counterpart to encoder_markers_push. Removes the ident entry from * s->markers and drops the caller's reference. Passing NULL is a no-op * so callers can invoke it unconditionally on the happy path (when * markers tracking was off, push wrote NULL into their local). Returns * 0 on success, -1 if PyDict_DelItem failed (the reference is still * released in that case). */ static int encoder_markers_pop(PyEncoderObject *s, PyObject *ident) { int rv; if (ident == NULL) return 0; rv = PyDict_DelItem(s->markers, ident); Py_DECREF(ident); return rv; } /* Helper for the for_json / _asdict paths in encoder_listencode_obj. * Steals the reference to `newobj` (returned by _call_json_method), * handles recursion-depth tracking, and dispatches to the right * sub-encoder: * - if as_dict is 0, encodes newobj as a generic JSON value via * encoder_listencode_obj (for_json contract: return any JSON- * compatible value); * - if as_dict is 1, encodes newobj as a dict via * encoder_listencode_dict after a TypeError-on-mismatch check * (_asdict contract: must return a dict). * Cleans up on every exit path. 
*/
static int
encoder_steal_encode(PyEncoderObject *s, JSON_Accu *rval, PyObject *newobj,
                     Py_ssize_t indent_level, int as_dict)
{
    int rv;

    if (newobj == NULL)
        return -1;
    if (Py_EnterRecursiveCall(" while encoding a JSON object")) {
        Py_DECREF(newobj);
        return -1;
    }
    if (as_dict) {
        if (JSON_AnyDict_Check(newobj)) {
            rv = encoder_listencode_dict(s, rval, newobj, indent_level);
        } else {
            PyErr_Format(PyExc_TypeError,
                         "_asdict() must return a dict, not %.80s",
                         Py_TYPE(newobj)->tp_name);
            rv = -1;
        }
    } else {
        rv = encoder_listencode_obj(s, rval, newobj, indent_level);
    }
    Py_DECREF(newobj);
    Py_LeaveRecursiveCall();
    return rv;
}

/* Fallback encoder path used when obj is not one of the directly-
 * supported JSON types (const, string, int, float, list, dict,
 * Decimal, etc.) and is not a _asdict / for_json candidate. Handles
 * three sub-cases in order:
 *   1. RawJSON: emit the already-encoded string verbatim.
 *   2. iterable_as_array: treat any iterable object as a JSON array.
 *   3. default(obj): call the user-supplied default hook and recurse
 *      on its result, with circular-reference tracking via markers.
 * Returns 0 on success, -1 on error.
*/ static int encoder_listencode_default(PyEncoderObject *s, JSON_Accu *rval, PyObject *obj, Py_ssize_t indent_level) { _speedups_state *state = get_speedups_state(s->module_ref); PyObject *ident = NULL; PyObject *newobj; int raw; int rv; raw = is_raw_json(state, obj); if (raw < 0) return -1; if (raw) { PyObject *encoded = PyObject_GetAttr(obj, state->JSON_attr_encoded_json); if (encoded == NULL) return -1; return _steal_accumulate(state, rval, encoded); } if (s->iterable_as_array) { newobj = PyObject_GetIter(obj); if (newobj == NULL) { if (!PyErr_ExceptionMatches(PyExc_TypeError)) return -1; PyErr_Clear(); } else { rv = encoder_listencode_list(s, rval, newobj, indent_level); Py_DECREF(newobj); return rv; } } if (encoder_markers_push(s, obj, &ident)) return -1; if (Py_EnterRecursiveCall(" while encoding a JSON object")) { Py_XDECREF(ident); return -1; } newobj = PyObject_CallOneArg(s->defaultfn, obj); if (newobj == NULL) { #if PY_VERSION_HEX >= 0x030B0000 /* Annotate before unwinding; the "when serializing X object" * note uses the original obj's type name, matching the Python * encoder which binds type(o).__name__ before `o = default(o)` * runs. */ encoder_annotate_exception(state, "when serializing %s object", Py_TYPE(obj)->tp_name); #endif Py_LeaveRecursiveCall(); Py_XDECREF(ident); return -1; } rv = encoder_listencode_obj(s, rval, newobj, indent_level); Py_LeaveRecursiveCall(); if (rv) { #if PY_VERSION_HEX >= 0x030B0000 /* default() succeeded but encoding its return value failed; in * the Python encoder `o` has been rebound to newobj at this * point, so the note reflects that type. 
*/
        encoder_annotate_exception(state, "when serializing %s object",
                                   Py_TYPE(newobj)->tp_name);
#endif
    }
    Py_DECREF(newobj);
    if (rv == 0) {
        if (encoder_markers_pop(s, ident) < 0)
            rv = -1;
    } else {
        Py_XDECREF(ident);
    }
    return rv;
}

static int
encoder_listencode_obj(PyEncoderObject *s, JSON_Accu *rval, PyObject *obj,
                       Py_ssize_t indent_level)
{
    /* Encode Python object obj to a JSON term, rval is a PyList */
    _speedups_state *state = get_speedups_state(s->module_ref);
    PyObject *newobj;
    int rv = -1;

    /* Check strings first: they are the most common JSON value type. */
    if ((PyBytes_Check(obj) && s->encoding != Py_None) ||
        PyUnicode_Check(obj))
    {
        PyObject *encoded = encoder_encode_string(s, obj);
        if (encoded != NULL)
            rv = _steal_accumulate(state, rval, encoded);
    } else if (obj == Py_None || obj == Py_True || obj == Py_False) {
        PyObject *cstr = _encoded_const(state, obj);
        if (cstr != NULL)
            rv = _steal_accumulate(state, rval, cstr);
    } else if (PyInt_Check(obj) || PyLong_Check(obj)) {
        PyObject *encoded = encoder_long_to_str(obj);
        if (encoded != NULL) {
            encoded = maybe_quote_bigint(s, encoded, obj);
            if (encoded != NULL)
                rv = _steal_accumulate(state, rval, encoded);
        }
    } else if (PyFloat_Check(obj)) {
        PyObject *encoded = encoder_encode_float(s, obj);
        if (encoded != NULL)
            rv = _steal_accumulate(state, rval, encoded);
    } else if (s->for_json &&
               _call_json_method(obj, state->JSON_attr_for_json, &newobj)) {
        rv = encoder_steal_encode(s, rval, newobj, indent_level, /*as_dict=*/0);
    } else if (s->namedtuple_as_object &&
               _call_json_method(obj, state->JSON_attr_asdict, &newobj)) {
        rv = encoder_steal_encode(s, rval, newobj, indent_level, /*as_dict=*/1);
    } else if (PyList_Check(obj) ||
               (s->tuple_as_array && PyTuple_Check(obj))) {
        if (Py_EnterRecursiveCall(" while encoding a JSON object"))
            return rv;
        rv = encoder_listencode_list(s, rval, obj, indent_level);
        Py_LeaveRecursiveCall();
    } else if (JSON_AnyDict_Check(obj)) {
        if (Py_EnterRecursiveCall(" while encoding a JSON object"))
            return rv;
        rv =
            encoder_listencode_dict(s, rval, obj, indent_level);
        Py_LeaveRecursiveCall();
    } else if (s->use_decimal &&
               PyObject_TypeCheck(obj, (PyTypeObject *)s->Decimal)) {
        PyObject *encoded = PyObject_Str(obj);
        if (encoded != NULL)
            rv = _steal_accumulate(state, rval, encoded);
    } else {
        rv = encoder_listencode_default(s, rval, obj, indent_level);
    }
    return rv;
}

/* Stringify and encode a dict key to its JSON representation, using the
 * key_memo cache for string keys. Returns a new reference to the encoded
 * key string on success, Py_None (borrowed, no new reference) for
 * skipkeys, or NULL on error. */
static PyObject *
encoder_encode_dict_key(PyEncoderObject *s, PyObject *key)
{
    PyObject *kstr;
    PyObject *encoded;

    kstr = encoder_stringify_key(s, key);
    if (kstr == NULL)
        return NULL;
    if (kstr == Py_None) {
        Py_DECREF(kstr);
        return Py_None; /* skipkeys */
    }
    /* For string keys (PyUnicode on Py3, PyString on Py2),
     * encoder_stringify_key returns Py_INCREF(key), i.e. kstr IS key.
     * For non-string keys it returns a freshly created string, so
     * kstr != key. Use this identity test to decide whether the
     * key_memo cache applies: caching under a non-string original key
     * would be write-only (the lookup uses kstr, not key). */
    if (kstr == key) {
        int cached = json_PyDict_GetItemRef(s->key_memo, kstr, &encoded);
        if (cached < 0) {
            Py_DECREF(kstr);
            return NULL;
        }
        if (cached == 0) {
            encoded = encoder_encode_string(s, kstr);
            if (encoded == NULL) {
                Py_DECREF(kstr);
                return NULL;
            }
            if (PyDict_SetItem(s->key_memo, key, encoded)) {
                Py_DECREF(kstr);
                Py_DECREF(encoded);
                return NULL;
            }
        }
        Py_DECREF(kstr);
    } else {
        encoded = encoder_encode_string(s, kstr);
        Py_DECREF(kstr);
        if (encoded == NULL)
            return NULL;
    }
    return encoded; /* new reference */
}

/* Write '\n' followed by indent_level copies of s->indent directly to
 * the accumulator, without materializing the combined string as an
 * intermediate PyObject.
This is the newline-plus-indent prefix * emitted before each item of an indented array or object and before * the matching closing bracket. * * Writing pieces rather than building a concatenated string saves a * PyUnicode_FromStringAndSize + PySequence_Repeat + PyNumber_Add per * container (versus the previous encoder_build_indent_string helper), * which matters most for deeply nested structures and on 3.14+ where * JSON_Accu is a PyUnicodeWriter that appends in-place at near-zero * per-write cost. On older Python the cost is one PyList_Append per * piece; the extra list entries are cheap compared to the allocation * traffic and the PySequence_Repeat's proportional memcpy work. * * Returns 0 on success, -1 on allocation failure. Only called when * s->indent is not Py_None. */ static int encoder_accumulate_newline_indent(PyEncoderObject *s, _speedups_state *state, JSON_Accu *rval, Py_ssize_t indent_level) { Py_ssize_t i; if (JSON_Accu_Accumulate(state, rval, state->JSON_newline)) return -1; for (i = 0; i < indent_level; i++) { if (JSON_Accu_Accumulate(state, rval, s->indent)) return -1; } return 0; } #if PY_VERSION_HEX >= 0x030B0000 /* Attach a PEP 678 note formatted via PyUnicode_FromFormatV to the in- * flight exception. Mirrors the pure-Python encoder's * `except BaseException as exc: exc.add_note(...); raise` wrappers * around each recursive encode step. * * Crucially, PyErr_Fetch runs BEFORE the note is built: the `%R` * formatter walks into PyObject_Repr, which on debug builds * (cpython-3.14-debug and cpython-3.14t+debug) asserts that no * exception is currently set. Building the note while our caller's * exception is still live would core-dump those jobs, as the CI * failure on job 72447820967 / 72447820982 demonstrated. Fetch first, * format second, Restore last. 
 *
 * Failures on the note path (a repr() that raises, PyUnicode_FromFormatV
 * returning NULL, a BaseException subclass whose add_note override
 * raises) are swallowed so the in-flight exception survives intact.
 *
 * Only compiled on Python 3.11+, where PEP 678 exists. */
static void
encoder_annotate_exception(_speedups_state *state, const char *format, ...)
{
    PyObject *exc_type;
    PyObject *exc_value;
    PyObject *exc_tb;
    PyObject *note;
    PyObject *result;
    va_list args;

    PyErr_Fetch(&exc_type, &exc_value, &exc_tb);
    if (exc_value == NULL) {
        /* No live exception to annotate; callsites always have one,
         * so this is a defensive no-op. */
        PyErr_Restore(exc_type, exc_value, exc_tb);
        return;
    }
    PyErr_NormalizeException(&exc_type, &exc_value, &exc_tb);

    va_start(args, format);
    note = PyUnicode_FromFormatV(format, args);
    va_end(args);

    if (note != NULL) {
        result = PyObject_CallMethodObjArgs(exc_value,
                                            state->JSON_attr_add_note,
                                            note, NULL);
        Py_DECREF(note);
        if (result == NULL)
            PyErr_Clear();
        else
            Py_DECREF(result);
    } else {
        PyErr_Clear();
    }
    PyErr_Restore(exc_type, exc_value, exc_tb);
}
#endif

static int
encoder_listencode_dict(PyEncoderObject *s, JSON_Accu *rval, PyObject *dct,
                        Py_ssize_t indent_level)
{
    /* Encode Python dict dct to a JSON term */
    _speedups_state *state = get_speedups_state(s->module_ref);
    PyObject *ident = NULL;
    PyObject *encoded = NULL;
    PyObject *iter = NULL;
    PyObject *item = NULL;
    /* When s->indent is not Py_None, each inter-item boundary is emitted
     * as s->item_separator + '\n' + (s->indent * inner_indent_level) via
     * multiple JSON_Accu_Accumulate calls; no combined PyObject is
     * materialized. */
    int indented = (s->indent != Py_None);
    Py_ssize_t inner_indent_level = indented ? indent_level + 1 : indent_level;
    Py_ssize_t idx;

    {
        Py_ssize_t dct_size = PyDict_Check(dct) ?
PyDict_Size(dct) : PyObject_Length(dct); if (dct_size == 0) return JSON_Accu_Accumulate(state, rval, state->JSON_empty_dict); if (dct_size < 0) return -1; } if (encoder_markers_push(s, dct, &ident)) goto bail; if (JSON_Accu_Accumulate(state, rval, state->JSON_open_dict)) goto bail; if (indented) { if (encoder_accumulate_newline_indent(s, state, rval, inner_indent_level)) goto bail; } /* Fast path: when sort_keys is off and dct is an exact dict, * iterate with PyDict_Next to avoid allocating an items list. * Py_BEGIN_CRITICAL_SECTION prevents concurrent dict mutation * on free-threaded builds; on default builds it is a no-op. */ if (s->item_sort_kw == Py_None && PyDict_CheckExact(dct)) { Py_ssize_t pos = 0; PyObject *key, *value; int err = 0; idx = 0; Py_BEGIN_CRITICAL_SECTION(dct); while (PyDict_Next(dct, &pos, &key, &value)) { Py_INCREF(key); Py_INCREF(value); encoded = encoder_encode_dict_key(s, key); Py_DECREF(key); if (encoded == NULL) { Py_DECREF(value); err = 1; break; } if (encoded == Py_None) { /* skipkeys */ encoded = NULL; Py_DECREF(value); continue; } if (idx) { if (JSON_Accu_Accumulate(state, rval, s->item_separator)) { Py_DECREF(value); err = 1; break; } if (indented && encoder_accumulate_newline_indent( s, state, rval, inner_indent_level)) { Py_DECREF(value); err = 1; break; } } if (JSON_Accu_Accumulate(state, rval, encoded)) { Py_DECREF(value); err = 1; break; } Py_CLEAR(encoded); if (JSON_Accu_Accumulate(state, rval, s->key_separator)) { Py_DECREF(value); err = 1; break; } if (encoder_listencode_obj(s, rval, value, inner_indent_level)) { #if PY_VERSION_HEX >= 0x030B0000 encoder_annotate_exception(state, "when serializing %s item %R", Py_TYPE(dct)->tp_name, key); #endif Py_DECREF(value); err = 1; break; } Py_DECREF(value); idx++; } Py_END_CRITICAL_SECTION(); if (err || PyErr_Occurred()) goto bail; } else { /* Slow path: sorted iteration, dict subclasses, or non-dict * mappings. Build an items list via encoder_dict_iteritems. 
*/ iter = encoder_dict_iteritems(s, dct); if (iter == NULL) goto bail; idx = 0; while ((item = PyIter_Next(iter))) { PyObject *key, *value; if (!PyTuple_Check(item) || Py_SIZE(item) != 2) { PyErr_SetString(PyExc_ValueError, "items must return 2-tuples"); goto bail; } key = PyTuple_GET_ITEM(item, 0); value = PyTuple_GET_ITEM(item, 1); encoded = encoder_encode_dict_key(s, key); if (encoded == NULL) goto bail; if (encoded == Py_None) { /* skipkeys */ encoded = NULL; Py_CLEAR(item); continue; } if (idx) { if (JSON_Accu_Accumulate(state, rval, s->item_separator)) goto bail; if (indented && encoder_accumulate_newline_indent( s, state, rval, inner_indent_level)) goto bail; } if (JSON_Accu_Accumulate(state, rval, encoded)) goto bail; Py_CLEAR(encoded); if (JSON_Accu_Accumulate(state, rval, s->key_separator)) goto bail; if (encoder_listencode_obj(s, rval, value, inner_indent_level)) { #if PY_VERSION_HEX >= 0x030B0000 encoder_annotate_exception(state, "when serializing %s item %R", Py_TYPE(dct)->tp_name, key); #endif goto bail; } Py_CLEAR(item); idx++; } Py_CLEAR(iter); if (PyErr_Occurred()) goto bail; } if (encoder_markers_pop(s, ident)) goto bail; ident = NULL; if (indented) { if (encoder_accumulate_newline_indent(s, state, rval, indent_level)) goto bail; } if (JSON_Accu_Accumulate(state, rval, state->JSON_close_dict)) goto bail; return 0; bail: Py_XDECREF(encoded); Py_XDECREF(item); Py_XDECREF(iter); Py_XDECREF(ident); return -1; } static int encoder_listencode_list(PyEncoderObject *s, JSON_Accu *rval, PyObject *seq, Py_ssize_t indent_level) { /* Encode Python list seq to a JSON term */ _speedups_state *state = get_speedups_state(s->module_ref); PyObject *ident = NULL; PyObject *iter = NULL; PyObject *obj = NULL; /* See encoder_listencode_dict: inter-item indent prefixes are * written piece-by-piece via encoder_accumulate_newline_indent, * no intermediate PyObject is materialized. */ int indented = (s->indent != Py_None); Py_ssize_t inner_indent_level = indented ? 
        indent_level + 1 : indent_level;
    Py_ssize_t i = 0;
    int is_exact_fast;

    /* Emptiness check: use direct size for exact types to skip the
     * __bool__/__len__ method dispatch of PyObject_IsTrue. */
    if (PyList_CheckExact(seq)) {
        if (PyList_GET_SIZE(seq) == 0)
            return JSON_Accu_Accumulate(state, rval, state->JSON_empty_array);
        is_exact_fast = 1;
    } else if (PyTuple_CheckExact(seq)) {
        if (PyTuple_GET_SIZE(seq) == 0)
            return JSON_Accu_Accumulate(state, rval, state->JSON_empty_array);
        is_exact_fast = 1;
    } else {
        /* For non-exact types (list/tuple subclasses or other iterables)
         * we cannot short-circuit on emptiness without consuming the
         * iterator. iterable_as_array in particular reaches us with a
         * bare iterator whose PyObject_IsTrue always returns 1, so the
         * emptiness is only discovered on the first PyIter_Next call
         * inside the slow path below. */
        is_exact_fast = 0;
    }

    if (encoder_markers_push(s, seq, &ident))
        goto bail;

    if (is_exact_fast) {
        /* Fast path: exact list or exact tuple: iterate by index to
         * avoid allocating an iterator object. Known non-empty from
         * the size check above, so the opening bracket and the first
         * newline_indent can be emitted unconditionally.
         * Py_BEGIN_CRITICAL_SECTION prevents concurrent list mutation
         * on free-threaded builds (tuples are immutable so the lock is
         * uncontested).
         *
         * Uses a local `item` variable (not the outer `obj`) so that
         * the bail handler's Py_XDECREF(obj) stays a no-op for this
         * path. */
        PyObject *item;
        Py_ssize_t size;
        int is_list = PyList_CheckExact(seq);
        int err = 0;

        if (JSON_Accu_Accumulate(state, rval, state->JSON_open_array))
            goto bail;
        if (indented) {
            if (encoder_accumulate_newline_indent(s, state, rval,
                                                  inner_indent_level))
                goto bail;
        }
        Py_BEGIN_CRITICAL_SECTION(seq);
        size = is_list ? PyList_GET_SIZE(seq) : PyTuple_GET_SIZE(seq);
        for (i = 0; i < size; i++) {
            item = is_list ?
PyList_GET_ITEM(seq, i) : PyTuple_GET_ITEM(seq, i); Py_INCREF(item); if (i) { if (JSON_Accu_Accumulate(state, rval, s->item_separator)) { Py_DECREF(item); err = 1; break; } if (indented && encoder_accumulate_newline_indent( s, state, rval, inner_indent_level)) { Py_DECREF(item); err = 1; break; } } if (encoder_listencode_obj(s, rval, item, inner_indent_level)) { #if PY_VERSION_HEX >= 0x030B0000 encoder_annotate_exception(state, "when serializing %s item %zd", Py_TYPE(seq)->tp_name, i); #endif Py_DECREF(item); err = 1; break; } Py_DECREF(item); } Py_END_CRITICAL_SECTION(); if (err) goto bail; if (indented) { if (encoder_accumulate_newline_indent(s, state, rval, indent_level)) goto bail; } if (JSON_Accu_Accumulate(state, rval, state->JSON_close_array)) goto bail; } else { /* Slow path: list/tuple subclasses or arbitrary iterables. * The opening '[' and newline_indent are deferred until we * know there is at least one item, so an empty iterable still * emits a compact "[]" under an indent= setting. 
*/ int emitted_open = 0; iter = PyObject_GetIter(seq); if (iter == NULL) goto bail; while ((obj = PyIter_Next(iter))) { if (!emitted_open) { if (JSON_Accu_Accumulate(state, rval, state->JSON_open_array)) goto bail; if (indented && encoder_accumulate_newline_indent( s, state, rval, inner_indent_level)) goto bail; emitted_open = 1; } else { if (JSON_Accu_Accumulate(state, rval, s->item_separator)) goto bail; if (indented && encoder_accumulate_newline_indent( s, state, rval, inner_indent_level)) goto bail; } if (encoder_listencode_obj(s, rval, obj, inner_indent_level)) { #if PY_VERSION_HEX >= 0x030B0000 encoder_annotate_exception(state, "when serializing %s item %zd", Py_TYPE(seq)->tp_name, i); #endif goto bail; } i++; Py_CLEAR(obj); } Py_CLEAR(iter); if (PyErr_Occurred()) goto bail; if (!emitted_open) { if (JSON_Accu_Accumulate(state, rval, state->JSON_empty_array)) goto bail; } else { if (indented) { if (encoder_accumulate_newline_indent(s, state, rval, indent_level)) goto bail; } if (JSON_Accu_Accumulate(state, rval, state->JSON_close_array)) goto bail; } } if (encoder_markers_pop(s, ident)) goto bail; ident = NULL; return 0; bail: Py_XDECREF(obj); Py_XDECREF(iter); Py_XDECREF(ident); return -1; } static void encoder_dealloc(PyObject *self) { /* bpo-31095: UnTrack is needed before calling any callbacks */ #if PY_VERSION_HEX >= 0x030D0000 PyTypeObject *tp = Py_TYPE(self); #endif PyObject_GC_UnTrack(self); encoder_clear(self); Py_TYPE(self)->tp_free(self); #if PY_VERSION_HEX >= 0x030D0000 Py_DECREF(tp); #endif } static int encoder_traverse(PyObject *self, visitproc visit, void *arg) { PyEncoderObject *s = (PyEncoderObject *)self; #if PY_VERSION_HEX >= 0x030D0000 /* Heap types must visit their type for GC. 
*/ Py_VISIT(Py_TYPE(self)); #else assert(PyEncoder_Check(self)); #endif #define JSON_VISIT_FIELD(f) Py_VISIT(s->f); JSON_ENCODER_OBJECT_FIELDS(JSON_VISIT_FIELD) #undef JSON_VISIT_FIELD return 0; } static int encoder_clear(PyObject *self) { /* Deallocate Encoder */ PyEncoderObject *s = (PyEncoderObject *)self; #if PY_VERSION_HEX < 0x030D0000 assert(PyEncoder_Check(self)); #endif #define JSON_CLEAR_FIELD(f) Py_CLEAR(s->f); JSON_ENCODER_OBJECT_FIELDS(JSON_CLEAR_FIELD) #undef JSON_CLEAR_FIELD return 0; } PyDoc_STRVAR(encoder_doc, "_iterencode(obj, _current_indent_level) -> iterable"); #if PY_VERSION_HEX >= 0x030D0000 /* Heap type slots and spec for Python 3.13+ */ static PyType_Slot PyEncoderType_slots[] = { {Py_tp_doc, (void *)encoder_doc}, {Py_tp_dealloc, encoder_dealloc}, {Py_tp_call, encoder_call}, {Py_tp_traverse, encoder_traverse}, {Py_tp_clear, encoder_clear}, {Py_tp_members, encoder_members}, {Py_tp_new, encoder_new}, {0, NULL} }; static PyType_Spec PyEncoderType_spec = { .name = "simplejson._speedups.Encoder", .basicsize = sizeof(PyEncoderObject), .flags = Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC, .slots = PyEncoderType_slots, }; #else static PyTypeObject PyEncoderType = { PyVarObject_HEAD_INIT(NULL, 0) "simplejson._speedups.Encoder", /* tp_name */ sizeof(PyEncoderObject), /* tp_basicsize */ 0, /* tp_itemsize */ encoder_dealloc, /* tp_dealloc */ 0, /* tp_print */ 0, /* tp_getattr */ 0, /* tp_setattr */ 0, /* tp_compare */ 0, /* tp_repr */ 0, /* tp_as_number */ 0, /* tp_as_sequence */ 0, /* tp_as_mapping */ 0, /* tp_hash */ encoder_call, /* tp_call */ 0, /* tp_str */ 0, /* tp_getattro */ 0, /* tp_setattro */ 0, /* tp_as_buffer */ Py_TPFLAGS_DEFAULT | Py_TPFLAGS_HAVE_GC, /* tp_flags */ encoder_doc, /* tp_doc */ encoder_traverse, /* tp_traverse */ encoder_clear, /* tp_clear */ 0, /* tp_richcompare */ 0, /* tp_weaklistoffset */ 0, /* tp_iter */ 0, /* tp_iternext */ 0, /* tp_methods */ encoder_members, /* tp_members */ 0, /* tp_getset */ 0, /* tp_base */ 0, /* 
tp_dict */ 0, /* tp_descr_get */ 0, /* tp_descr_set */ 0, /* tp_dictoffset */ 0, /* tp_init */ 0, /* tp_alloc */ encoder_new, /* tp_new */ 0, /* tp_free */ }; #endif static PyMethodDef speedups_methods[] = { {"encode_basestring_ascii", (PyCFunction)py_encode_basestring_ascii, METH_O, pydoc_encode_basestring_ascii}, {"encode_basestring", (PyCFunction)py_encode_basestring, METH_O, pydoc_encode_basestring}, {"scanstring", (PyCFunction)py_scanstring, METH_VARARGS, pydoc_scanstring}, {NULL, NULL, 0, NULL} }; PyDoc_STRVAR(module_doc, "simplejson speedups\n"); /* Clear every state field that init_speedups_state may populate. * Called at the start of init_speedups_state so that re-initialization * (e.g. importlib.reload on pre-3.13 where the static state lives for * the lifetime of the interpreter) releases the previous references * instead of leaking them. Type fields are NOT touched here: on 3.13+ * module_exec creates the heap types before calling this function, and * on older versions the type fields hold borrowed pointers to static * PyTypeObjects that must not be cleared. 
*/ static void reset_speedups_state_constants(_speedups_state *state) { Py_CLEAR(state->JSON_Infinity); Py_CLEAR(state->JSON_NegInfinity); Py_CLEAR(state->JSON_NaN); Py_CLEAR(state->JSON_EmptyUnicode); #if PY_MAJOR_VERSION < 3 Py_CLEAR(state->JSON_EmptyStr); Py_CLEAR(state->JSON_EmptyStr_join); #endif Py_CLEAR(state->JSON_s_null); Py_CLEAR(state->JSON_s_true); Py_CLEAR(state->JSON_s_false); Py_CLEAR(state->JSON_open_dict); Py_CLEAR(state->JSON_close_dict); Py_CLEAR(state->JSON_empty_dict); Py_CLEAR(state->JSON_open_array); Py_CLEAR(state->JSON_close_array); Py_CLEAR(state->JSON_empty_array); Py_CLEAR(state->JSON_newline); Py_CLEAR(state->JSON_sortargs); Py_CLEAR(state->JSON_itemgetter0); Py_CLEAR(state->JSON_attr_for_json); Py_CLEAR(state->JSON_attr_asdict); Py_CLEAR(state->JSON_attr_sort); Py_CLEAR(state->JSON_attr_encoded_json); Py_CLEAR(state->JSON_attr_add_note); Py_CLEAR(state->RawJSONType); Py_CLEAR(state->JSONDecodeError); } /* Shared initializer for per-module state. Called from module_exec on Python 3 and from init_speedups on Python 2. Assumes the type fields in state have already been populated. */ static int init_speedups_state(_speedups_state *state, PyObject *module) { /* Release any prior values. A no-op on the first call (fields are * already NULL from per-module zeroed storage on 3.13+ or from the * static BSS on older versions); on reload this releases the * previous references to avoid a refcount leak. 
*/ reset_speedups_state_constants(state); state->JSON_NaN = JSON_InternFromString("NaN"); if (state->JSON_NaN == NULL) return -1; state->JSON_Infinity = JSON_InternFromString("Infinity"); if (state->JSON_Infinity == NULL) return -1; state->JSON_NegInfinity = JSON_InternFromString("-Infinity"); if (state->JSON_NegInfinity == NULL) return -1; #if PY_MAJOR_VERSION >= 3 state->JSON_EmptyUnicode = PyUnicode_New(0, 127); #else state->JSON_EmptyStr = PyString_FromString(""); if (state->JSON_EmptyStr == NULL) return -1; state->JSON_EmptyStr_join = PyObject_GetAttrString(state->JSON_EmptyStr, "join"); if (state->JSON_EmptyStr_join == NULL) return -1; state->JSON_EmptyUnicode = PyUnicode_FromUnicode(NULL, 0); #endif if (state->JSON_EmptyUnicode == NULL) return -1; state->JSON_s_null = JSON_InternFromString("null"); if (state->JSON_s_null == NULL) return -1; state->JSON_s_true = JSON_InternFromString("true"); if (state->JSON_s_true == NULL) return -1; state->JSON_s_false = JSON_InternFromString("false"); if (state->JSON_s_false == NULL) return -1; state->JSON_open_dict = JSON_InternFromString("{"); if (state->JSON_open_dict == NULL) return -1; state->JSON_close_dict = JSON_InternFromString("}"); if (state->JSON_close_dict == NULL) return -1; state->JSON_empty_dict = JSON_InternFromString("{}"); if (state->JSON_empty_dict == NULL) return -1; state->JSON_open_array = JSON_InternFromString("["); if (state->JSON_open_array == NULL) return -1; state->JSON_close_array = JSON_InternFromString("]"); if (state->JSON_close_array == NULL) return -1; state->JSON_empty_array = JSON_InternFromString("[]"); if (state->JSON_empty_array == NULL) return -1; state->JSON_newline = JSON_InternFromString("\n"); if (state->JSON_newline == NULL) return -1; state->JSON_sortargs = PyTuple_New(0); if (state->JSON_sortargs == NULL) return -1; state->RawJSONType = import_dependency("simplejson.raw_json", "RawJSON"); if (state->RawJSONType == NULL) return -1; state->JSONDecodeError = 
import_dependency("simplejson.errors", "JSONDecodeError"); if (state->JSONDecodeError == NULL) return -1; { PyObject *operator_mod = PyImport_ImportModule("operator"); if (!operator_mod) return -1; state->JSON_itemgetter0 = PyObject_CallMethod(operator_mod, "itemgetter", "i", 0); Py_DECREF(operator_mod); if (!state->JSON_itemgetter0) return -1; } /* Interned attribute names used in encoder hot paths. */ state->JSON_attr_for_json = JSON_InternFromString("for_json"); if (state->JSON_attr_for_json == NULL) return -1; state->JSON_attr_asdict = JSON_InternFromString("_asdict"); if (state->JSON_attr_asdict == NULL) return -1; state->JSON_attr_sort = JSON_InternFromString("sort"); if (state->JSON_attr_sort == NULL) return -1; state->JSON_attr_encoded_json = JSON_InternFromString("encoded_json"); if (state->JSON_attr_encoded_json == NULL) return -1; state->JSON_attr_add_note = JSON_InternFromString("add_note"); if (state->JSON_attr_add_note == NULL) return -1; (void)module; return 0; } /* Multi-phase initialization (PEP 489) for Python 3.5+. On 3.13+ this * path creates heap types and allocates per-module state so that each * interpreter gets its own copy; on 3.5-3.12 the type fields just point * at the statically-allocated PyTypeObjects and state lives in the * single _speedups_static_state instance. Either way, module_exec does * the work and get_speedups_state() gives uniform access. 
*/ static int module_exec(PyObject *m) { _speedups_state *state = get_speedups_state(m); #if PY_VERSION_HEX >= 0x030D0000 /* Create heap types from specs, bound to this module */ state->PyScannerType = PyType_FromModuleAndSpec(m, &PyScannerType_spec, NULL); if (state->PyScannerType == NULL) return -1; state->PyEncoderType = PyType_FromModuleAndSpec(m, &PyEncoderType_spec, NULL); if (state->PyEncoderType == NULL) return -1; #else if (PyType_Ready(&PyScannerType) < 0) return -1; if (PyType_Ready(&PyEncoderType) < 0) return -1; /* Static types are eternal, so these are borrowed pointers kept * in the state struct for layout uniformity with the 3.13+ path. * There is nothing to refcount and no GC tracking here. */ state->PyScannerType = (PyObject *)&PyScannerType; state->PyEncoderType = (PyObject *)&PyEncoderType; /* Scanner/Encoder instance construction needs a borrowed reference * to the module to store in module_ref; capture it here, before * anything else that might trigger instance creation. 
     */
    _speedups_module = m;
#endif

#if PY_VERSION_HEX >= 0x030A0000
    if (PyModule_AddObjectRef(m, "make_scanner", state->PyScannerType) < 0)
        return -1;
    if (PyModule_AddObjectRef(m, "make_encoder", state->PyEncoderType) < 0)
        return -1;
#else
    Py_INCREF(state->PyScannerType);
    if (PyModule_AddObject(m, "make_scanner", state->PyScannerType) < 0) {
        Py_DECREF(state->PyScannerType);
        return -1;
    }
    Py_INCREF(state->PyEncoderType);
    if (PyModule_AddObject(m, "make_encoder", state->PyEncoderType) < 0) {
        Py_DECREF(state->PyEncoderType);
        return -1;
    }
#endif

    return init_speedups_state(state, m);
}

#if PY_VERSION_HEX >= 0x030D0000
static int
speedups_traverse(PyObject *m, visitproc visit, void *arg)
{
    _speedups_state *state = get_speedups_state(m);
    Py_VISIT(state->PyScannerType);
    Py_VISIT(state->PyEncoderType);
    Py_VISIT(state->JSON_Infinity);
    Py_VISIT(state->JSON_NegInfinity);
    Py_VISIT(state->JSON_NaN);
    Py_VISIT(state->JSON_EmptyUnicode);
    Py_VISIT(state->JSON_s_null);
    Py_VISIT(state->JSON_s_true);
    Py_VISIT(state->JSON_s_false);
    Py_VISIT(state->JSON_open_dict);
    Py_VISIT(state->JSON_close_dict);
    Py_VISIT(state->JSON_empty_dict);
    Py_VISIT(state->JSON_open_array);
    Py_VISIT(state->JSON_close_array);
    Py_VISIT(state->JSON_empty_array);
    Py_VISIT(state->JSON_sortargs);
    Py_VISIT(state->JSON_itemgetter0);
    Py_VISIT(state->JSON_attr_for_json);
    Py_VISIT(state->JSON_attr_asdict);
    Py_VISIT(state->JSON_attr_sort);
    Py_VISIT(state->JSON_attr_encoded_json);
    Py_VISIT(state->JSON_attr_add_note);
    Py_VISIT(state->RawJSONType);
    Py_VISIT(state->JSONDecodeError);
    return 0;
}

static int
speedups_clear(PyObject *m)
{
    _speedups_state *state = get_speedups_state(m);
    Py_CLEAR(state->PyScannerType);
    Py_CLEAR(state->PyEncoderType);
    reset_speedups_state_constants(state);
    return 0;
}
#endif /* PY_VERSION_HEX >= 0x030D0000 */

#if PY_MAJOR_VERSION >= 3
static PyModuleDef_Slot module_slots[] = {
    {Py_mod_exec, module_exec},
#if PY_VERSION_HEX >= 0x030D0000
    {Py_mod_gil, Py_MOD_GIL_NOT_USED},
#endif
    {0, NULL}
};

static struct PyModuleDef moduledef = {
PyModuleDef_HEAD_INIT, "_speedups", /* m_name */ module_doc, /* m_doc */ #if PY_VERSION_HEX >= 0x030D0000 sizeof(_speedups_state), /* m_size */ #else 0, /* m_size: no per-module state on <3.13 */ #endif speedups_methods, /* m_methods */ module_slots, /* m_slots (multi-phase init, PEP 489) */ #if PY_VERSION_HEX >= 0x030D0000 speedups_traverse, /* m_traverse */ speedups_clear, /* m_clear */ #else NULL, /* m_traverse */ NULL, /* m_clear */ #endif NULL, /* m_free */ }; #endif static PyObject * import_dependency(const char *module_name, const char *attr_name) { PyObject *rval; PyObject *module = PyImport_ImportModule(module_name); if (module == NULL) return NULL; rval = PyObject_GetAttrString(module, attr_name); Py_DECREF(module); return rval; } #if PY_MAJOR_VERSION >= 3 PyMODINIT_FUNC PyInit__speedups(void) { /* Multi-phase init (PEP 489): Python runs module_exec via Py_mod_exec slot */ return PyModuleDef_Init(&moduledef); } #else /* Python 2.7: single-phase init via Py_InitModule3 */ void init_speedups(void) { _speedups_state *state = &_speedups_static_state; PyObject *m; if (PyType_Ready(&PyScannerType) < 0) return; if (PyType_Ready(&PyEncoderType) < 0) return; state->PyScannerType = (PyObject *)&PyScannerType; state->PyEncoderType = (PyObject *)&PyEncoderType; m = Py_InitModule3("_speedups", speedups_methods, module_doc); if (m == NULL) return; _speedups_module = m; /* borrowed; sys.modules keeps it alive */ Py_INCREF(state->PyScannerType); if (PyModule_AddObject(m, "make_scanner", state->PyScannerType) < 0) { Py_DECREF(state->PyScannerType); return; } Py_INCREF(state->PyEncoderType); if (PyModule_AddObject(m, "make_encoder", state->PyEncoderType) < 0) { Py_DECREF(state->PyEncoderType); return; } if (init_speedups_state(state, m) < 0) return; } #endif ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/_speedups_scan.h0000644000175100017510000004537315172736046020724 0ustar00runnerrunner/* * 
_speedups_scan.h -- templated JSON scanner function bodies. * * This file is NOT a traditional header and must not be used as one. * It contains function *definitions* and is #included multiple times * from _speedups.c with different macro settings to generate both * the Py2 bytes (_str) and the universal unicode (_unicode) variants * of scan_once, _parse_object, _parse_array, and _match_number * without code duplication. The caller must #define the following * macros before each #include, and must wrap the inclusion with * * #define JSON_SPEEDUPS_SCAN_INCLUDING 1 * #include "_speedups_scan.h" * #undef JSON_SPEEDUPS_SCAN_INCLUDING * * so that accidental inclusion from any other file (or any other * code path in _speedups.c) is caught at compile time. The * JSON_SPEEDUPS_SCAN_INCLUDING gate is the strong form of the * sanity check; missing JSON_SCAN_SUFFIX is a weaker symptom and is * also diagnosed below. * * Expected macros (must be defined before each #include): * * JSON_SCAN_SUFFIX - Either _str or _unicode * JSON_SCAN_DATA_INIT(pystr) - Statements to set up `str` (data * pointer) and `end_idx` locals * JSON_SCAN_READ(idx) - Read char at idx, returns JSON_UNICHR * JSON_SCAN_SCANSTRING_CALL(...) - scanstring_* call with the right args * JSON_SCAN_NUMSTR_CREATE(s, e) - Create a PyObject holding the numeric * substring from start..end * JSON_SCAN_PARSE_FLOAT_FAST(ns) - Fast-path float parse (or fallback) * JSON_SCAN_PARSE_INT_FAST(ns) - Fast-path int parse (or fallback) * * The macros are #undef'd at the bottom of the file so the caller can * redefine them for the next #include. * * Example: * * #define JSON_SCAN_SUFFIX _unicode * #define JSON_SCAN_DATA_INIT(p) \ * PY2_UNUSED int kind = PyUnicode_KIND(p); \ * void *str = PyUnicode_DATA(p); \ * Py_ssize_t end_idx = PyUnicode_GET_LENGTH(p) - 1 * #define JSON_SCAN_READ(i) PyUnicode_READ(kind, str, (i)) * ... 
* #define JSON_SPEEDUPS_SCAN_INCLUDING 1 * #include "_speedups_scan.h" * #undef JSON_SPEEDUPS_SCAN_INCLUDING */ #ifndef JSON_SPEEDUPS_SCAN_INCLUDING #error "_speedups_scan.h must only be included by _speedups.c. See the header comment." #endif #ifndef JSON_SCAN_SUFFIX #error "JSON_SCAN_SUFFIX must be defined before including _speedups_scan.h" #endif #define JSON_SCAN_CONCAT_(a, b) a##b #define JSON_SCAN_CONCAT(a, b) JSON_SCAN_CONCAT_(a, b) #define JSON_SCAN_FN(base) JSON_SCAN_CONCAT(base, JSON_SCAN_SUFFIX) /* Advance idx past any JSON whitespace (space, tab, CR, LF). The * idx <= end_idx guard matches the parser's convention that end_idx * is the index of the LAST valid byte (i.e. length - 1). */ #define SKIP_WHITESPACE() \ do { \ while (idx <= end_idx && IS_WHITESPACE(JSON_SCAN_READ(idx))) \ idx++; \ } while (0) static PyObject * JSON_SCAN_FN(_match_number)(PyScannerObject *s, PyObject *pystr, Py_ssize_t start, Py_ssize_t *next_idx_ptr) { /* Read a JSON number from pystr. *next_idx_ptr is a return-by-reference index to the first character after the number. Returns a new PyObject representation of that number: PyInt/PyLong or PyFloat, or whatever parse_int/parse_float return if those are set. 
*/ _speedups_state *state = get_speedups_state(s->module_ref); JSON_SCAN_DATA_INIT(pystr); Py_ssize_t idx = start; int is_float = 0; JSON_UNICHR c; PyObject *rval; PyObject *numstr; /* read a sign if it's there, make sure it's not the end of the string */ if (JSON_SCAN_READ(idx) == '-') { if (idx >= end_idx) { raise_errmsg(state, ERR_EXPECTING_VALUE, pystr, start); return NULL; } idx++; } /* read as many integer digits as we find as long as it doesn't start with 0 */ c = JSON_SCAN_READ(idx); if (c == '0') { /* if it starts with 0 we only expect one integer digit */ idx++; } else if (IS_DIGIT(c)) { idx++; while (idx <= end_idx && IS_DIGIT(JSON_SCAN_READ(idx))) { idx++; } } else { /* no integer digits, error */ raise_errmsg(state, ERR_EXPECTING_VALUE, pystr, start); return NULL; } /* if the next char is '.' followed by a digit then read all float digits */ if (idx < end_idx && JSON_SCAN_READ(idx) == '.' && IS_DIGIT(JSON_SCAN_READ(idx + 1))) { is_float = 1; idx += 2; while (idx <= end_idx && IS_DIGIT(JSON_SCAN_READ(idx))) idx++; } /* if the next char is 'e' or 'E' then maybe read the exponent (or backtrack) */ if (idx < end_idx && (JSON_SCAN_READ(idx) == 'e' || JSON_SCAN_READ(idx) == 'E')) { Py_ssize_t e_start = idx; idx++; /* read an exponent sign if present */ if (idx < end_idx && (JSON_SCAN_READ(idx) == '-' || JSON_SCAN_READ(idx) == '+')) idx++; /* read all digits */ while (idx <= end_idx && IS_DIGIT(JSON_SCAN_READ(idx))) idx++; /* if we got a digit, then parse as float. 
if not, backtrack */ if (IS_DIGIT(JSON_SCAN_READ(idx - 1))) { is_float = 1; } else { idx = e_start; } } /* copy the section we determined to be a number */ numstr = JSON_SCAN_NUMSTR_CREATE(start, idx); if (numstr == NULL) return NULL; if (is_float) { /* parse as a float using a fast path if available, otherwise call user-defined method */ if (s->parse_float != (PyObject *)&PyFloat_Type) { rval = PyObject_CallOneArg(s->parse_float, numstr); } else { rval = JSON_SCAN_PARSE_FLOAT_FAST(numstr); } } else { /* parse as an int using a fast path if available, otherwise call user-defined method */ rval = JSON_SCAN_PARSE_INT_FAST(numstr); } Py_DECREF(numstr); *next_idx_ptr = idx; return rval; } static PyObject * JSON_SCAN_FN(_parse_object)(PyScannerObject *s, PyObject *pystr, Py_ssize_t idx, Py_ssize_t *next_idx_ptr) { /* Read a JSON object from pystr. idx is the index of the first character after the opening curly brace. *next_idx_ptr is a return-by-reference index to the first character after the closing curly brace. */ _speedups_state *state = get_speedups_state(s->module_ref); JSON_SCAN_DATA_INIT(pystr); PyObject *rval = NULL; PyObject *pairs = NULL; PyObject *item; PyObject *key = NULL; PyObject *val = NULL; int has_pairs_hook = (s->pairs_hook != Py_None); int did_parse = 0; Py_ssize_t next_idx; if (has_pairs_hook) { pairs = PyList_New(0); if (pairs == NULL) return NULL; } else { rval = PyDict_New(); if (rval == NULL) return NULL; } /* skip whitespace after { */ SKIP_WHITESPACE(); /* only loop if the object is non-empty */ if (idx <= end_idx && JSON_SCAN_READ(idx) != '}') { int trailing_delimiter = 0; while (idx <= end_idx) { trailing_delimiter = 0; /* read key */ if (JSON_SCAN_READ(idx) != '"') { raise_errmsg(state, ERR_OBJECT_PROPERTY, pystr, idx); goto bail; } key = JSON_SCAN_SCANSTRING_CALL(idx + 1, &next_idx); if (key == NULL) goto bail; /* Intern the key through s->memo so repeated key strings * share one PyObject across this decode. 
Using SetDefault * collapses what used to be separate Get/Set lookups into * a single atomic call. */ if (json_memo_intern_key(s->memo, &key) < 0) goto bail; idx = next_idx; /* skip whitespace between key and : delimiter, read :, skip whitespace */ SKIP_WHITESPACE(); if (idx > end_idx || JSON_SCAN_READ(idx) != ':') { raise_errmsg(state, ERR_OBJECT_PROPERTY_DELIMITER, pystr, idx); goto bail; } idx++; SKIP_WHITESPACE(); /* read any JSON term */ val = JSON_SCAN_FN(scan_once)(s, pystr, idx, &next_idx); if (val == NULL) goto bail; if (has_pairs_hook) { item = PyTuple_Pack(2, key, val); if (item == NULL) goto bail; Py_CLEAR(key); Py_CLEAR(val); if (PyList_Append(pairs, item) == -1) { Py_DECREF(item); goto bail; } Py_DECREF(item); } else { if (PyDict_SetItem(rval, key, val) < 0) goto bail; Py_CLEAR(key); Py_CLEAR(val); } idx = next_idx; /* skip whitespace before } or , */ SKIP_WHITESPACE(); /* bail if the object is closed or we didn't get the , delimiter */ did_parse = 1; if (idx > end_idx) break; if (JSON_SCAN_READ(idx) == '}') { break; } else if (JSON_SCAN_READ(idx) != ',') { raise_errmsg(state, ERR_OBJECT_DELIMITER, pystr, idx); goto bail; } idx++; /* skip whitespace after , delimiter */ SKIP_WHITESPACE(); trailing_delimiter = 1; /* check for trailing comma before } */ if (idx <= end_idx && JSON_SCAN_READ(idx) == '}') { raise_errmsg(state, ERR_TRAILING_COMMA_OBJECT, pystr, idx); goto bail; } } if (trailing_delimiter) { /* Truncated input after comma (e.g. 
               '{"a":1,') */
            raise_errmsg(state, ERR_OBJECT_PROPERTY, pystr, idx);
            goto bail;
        }
    }

    /* verify that idx <= end_idx, str[idx] should be '}' */
    if (idx > end_idx || JSON_SCAN_READ(idx) != '}') {
        if (did_parse) {
            raise_errmsg(state, ERR_OBJECT_DELIMITER, pystr, idx);
        } else {
            raise_errmsg(state, ERR_OBJECT_PROPERTY_FIRST, pystr, idx);
        }
        goto bail;
    }

    /* if pairs_hook is not None: rval = object_pairs_hook(pairs) */
    if (s->pairs_hook != Py_None) {
        val = PyObject_CallOneArg(s->pairs_hook, pairs);
        if (val == NULL)
            goto bail;
        Py_DECREF(pairs);
        *next_idx_ptr = idx + 1;
        return val;
    }

    /* if object_hook is not None: rval = object_hook(rval) */
    if (s->object_hook != Py_None) {
        val = PyObject_CallOneArg(s->object_hook, rval);
        if (val == NULL)
            goto bail;
        Py_DECREF(rval);
        rval = val;
        val = NULL;
    }
    *next_idx_ptr = idx + 1;
    return rval;
bail:
    Py_XDECREF(rval);
    Py_XDECREF(key);
    Py_XDECREF(val);
    Py_XDECREF(pairs);
    return NULL;
}

static PyObject *
JSON_SCAN_FN(_parse_array)(PyScannerObject *s, PyObject *pystr, Py_ssize_t idx, Py_ssize_t *next_idx_ptr)
{
    /* Read a JSON array from pystr. idx is the index of the first character after
    the opening bracket.
    *next_idx_ptr is a return-by-reference index to the first character after
    the closing bracket.
*/ _speedups_state *state = get_speedups_state(s->module_ref); JSON_SCAN_DATA_INIT(pystr); PyObject *val = NULL; PyObject *rval = PyList_New(0); Py_ssize_t next_idx; if (rval == NULL) return NULL; /* skip whitespace after [ */ SKIP_WHITESPACE(); /* only loop if the array is non-empty */ if (idx <= end_idx && JSON_SCAN_READ(idx) != ']') { int trailing_delimiter = 0; while (idx <= end_idx) { trailing_delimiter = 0; /* read any JSON term and de-tuplefy the (rval, idx) */ val = JSON_SCAN_FN(scan_once)(s, pystr, idx, &next_idx); if (val == NULL) { goto bail; } if (PyList_Append(rval, val) == -1) goto bail; Py_CLEAR(val); idx = next_idx; /* skip whitespace between term and , */ SKIP_WHITESPACE(); /* bail if the array is closed or we didn't get the , delimiter */ if (idx > end_idx) break; if (JSON_SCAN_READ(idx) == ']') { break; } else if (JSON_SCAN_READ(idx) != ',') { raise_errmsg(state, ERR_ARRAY_DELIMITER, pystr, idx); goto bail; } idx++; /* skip whitespace after , */ SKIP_WHITESPACE(); trailing_delimiter = 1; /* check for trailing comma before ] */ if (idx <= end_idx && JSON_SCAN_READ(idx) == ']') { raise_errmsg(state, ERR_TRAILING_COMMA_ARRAY, pystr, idx); goto bail; } } if (trailing_delimiter) { /* Truncated input after comma (e.g. 
"[42,") */ raise_errmsg(state, ERR_EXPECTING_VALUE, pystr, idx); goto bail; } } /* verify that idx < end_idx, str[idx] should be ']' */ if (idx > end_idx || JSON_SCAN_READ(idx) != ']') { if (PyList_GET_SIZE(rval)) { raise_errmsg(state, ERR_ARRAY_DELIMITER, pystr, idx); } else { raise_errmsg(state, ERR_ARRAY_VALUE_FIRST, pystr, idx); } goto bail; } /* apply array_hook if set */ if (s->array_hook != Py_None) { val = PyObject_CallOneArg(s->array_hook, rval); if (val == NULL) goto bail; Py_DECREF(rval); rval = val; val = NULL; } *next_idx_ptr = idx + 1; return rval; bail: Py_XDECREF(val); Py_DECREF(rval); return NULL; } static PyObject * JSON_SCAN_FN(scan_once)(PyScannerObject *s, PyObject *pystr, Py_ssize_t idx, Py_ssize_t *next_idx_ptr) { /* Read one JSON term (of any kind) from pystr. idx is the index of the first character of the term. *next_idx_ptr is a return-by-reference index to the first character after the term. */ _speedups_state *state = get_speedups_state(s->module_ref); JSON_SCAN_DATA_INIT(pystr); Py_ssize_t length = end_idx + 1; PyObject *rval = NULL; int fallthrough = 0; if (idx < 0 || idx >= length) { raise_errmsg(state, ERR_EXPECTING_VALUE, pystr, idx); return NULL; } switch (JSON_SCAN_READ(idx)) { case '"': /* string */ rval = JSON_SCAN_SCANSTRING_CALL(idx + 1, next_idx_ptr); break; case '{': /* object */ if (Py_EnterRecursiveCall(" while decoding a JSON object " "from a string")) return NULL; rval = JSON_SCAN_FN(_parse_object)(s, pystr, idx + 1, next_idx_ptr); Py_LeaveRecursiveCall(); break; case '[': /* array */ if (Py_EnterRecursiveCall(" while decoding a JSON array " "from a string")) return NULL; rval = JSON_SCAN_FN(_parse_array)(s, pystr, idx + 1, next_idx_ptr); Py_LeaveRecursiveCall(); break; case 'n': /* null */ if ((idx + 3 < length) && JSON_SCAN_READ(idx + 1) == 'u' && JSON_SCAN_READ(idx + 2) == 'l' && JSON_SCAN_READ(idx + 3) == 'l') { Py_INCREF(Py_None); *next_idx_ptr = idx + 4; rval = Py_None; } else fallthrough = 1; break; case 't': /* 
true */ if ((idx + 3 < length) && JSON_SCAN_READ(idx + 1) == 'r' && JSON_SCAN_READ(idx + 2) == 'u' && JSON_SCAN_READ(idx + 3) == 'e') { Py_INCREF(Py_True); *next_idx_ptr = idx + 4; rval = Py_True; } else fallthrough = 1; break; case 'f': /* false */ if ((idx + 4 < length) && JSON_SCAN_READ(idx + 1) == 'a' && JSON_SCAN_READ(idx + 2) == 'l' && JSON_SCAN_READ(idx + 3) == 's' && JSON_SCAN_READ(idx + 4) == 'e') { Py_INCREF(Py_False); *next_idx_ptr = idx + 5; rval = Py_False; } else fallthrough = 1; break; case 'N': /* NaN */ if ((idx + 2 < length) && JSON_SCAN_READ(idx + 1) == 'a' && JSON_SCAN_READ(idx + 2) == 'N') { rval = _parse_constant(s, pystr, state->JSON_NaN, idx, next_idx_ptr); } else fallthrough = 1; break; case 'I': /* Infinity */ if ((idx + 7 < length) && JSON_SCAN_READ(idx + 1) == 'n' && JSON_SCAN_READ(idx + 2) == 'f' && JSON_SCAN_READ(idx + 3) == 'i' && JSON_SCAN_READ(idx + 4) == 'n' && JSON_SCAN_READ(idx + 5) == 'i' && JSON_SCAN_READ(idx + 6) == 't' && JSON_SCAN_READ(idx + 7) == 'y') { rval = _parse_constant(s, pystr, state->JSON_Infinity, idx, next_idx_ptr); } else fallthrough = 1; break; case '-': /* -Infinity */ if ((idx + 8 < length) && JSON_SCAN_READ(idx + 1) == 'I' && JSON_SCAN_READ(idx + 2) == 'n' && JSON_SCAN_READ(idx + 3) == 'f' && JSON_SCAN_READ(idx + 4) == 'i' && JSON_SCAN_READ(idx + 5) == 'n' && JSON_SCAN_READ(idx + 6) == 'i' && JSON_SCAN_READ(idx + 7) == 't' && JSON_SCAN_READ(idx + 8) == 'y') { rval = _parse_constant(s, pystr, state->JSON_NegInfinity, idx, next_idx_ptr); } else fallthrough = 1; break; default: fallthrough = 1; } /* Didn't find a string, object, array, or named constant. Look for a number. 
*/ if (fallthrough) rval = JSON_SCAN_FN(_match_number)(s, pystr, idx, next_idx_ptr); return rval; } #undef JSON_SCAN_FN #undef JSON_SCAN_CONCAT #undef JSON_SCAN_CONCAT_ #undef JSON_SCAN_SUFFIX #undef JSON_SCAN_DATA_INIT #undef JSON_SCAN_READ #undef JSON_SCAN_SCANSTRING_CALL #undef JSON_SCAN_NUMSTR_CREATE #undef JSON_SCAN_PARSE_FLOAT_FAST #undef JSON_SCAN_PARSE_INT_FAST #undef SKIP_WHITESPACE ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/compat.py0000644000175100017510000000131315172736046017377 0ustar00runnerrunner"""Python 3 compatibility shims """ import sys if sys.version_info[0] < 3: PY3 = False def b(s): return s try: from cStringIO import StringIO except ImportError: from StringIO import StringIO BytesIO = StringIO text_type = unicode binary_type = str string_types = (basestring,) integer_types = (int, long) unichr = unichr reload_module = reload else: PY3 = True from importlib import reload as reload_module def b(s): return bytes(s, 'latin1') from io import StringIO, BytesIO text_type = str binary_type = bytes string_types = (str,) integer_types = (int,) unichr = chr long_type = integer_types[-1] ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/decoder.py0000644000175100017510000003622015172736046017526 0ustar00runnerrunner"""Implementation of JSONDecoder """ from __future__ import absolute_import import re import sys from .compat import PY3, unichr from .scanner import make_scanner, JSONDecodeError def _import_c_scanstring(): try: from ._speedups import scanstring return scanstring except ImportError: return None c_scanstring = _import_c_scanstring() # NOTE (3.1.0): JSONDecodeError may still be imported from this module for # compatibility, but it was never in the __all__ __all__ = ['JSONDecoder'] FLAGS = re.VERBOSE | re.MULTILINE | re.DOTALL def _floatconstants(): return float('nan'), float('inf'), 
float('-inf') NaN, PosInf, NegInf = _floatconstants() _CONSTANTS = { '-Infinity': NegInf, 'Infinity': PosInf, 'NaN': NaN, } STRINGCHUNK = re.compile(r'(.*?)(["\\\x00-\x1f])', FLAGS) BACKSLASH = { '"': u'"', '\\': u'\\', '/': u'/', 'b': u'\b', 'f': u'\f', 'n': u'\n', 'r': u'\r', 't': u'\t', } DEFAULT_ENCODING = "utf-8" if hasattr(sys, 'get_int_max_str_digits'): bounded_int = int else: def bounded_int(s, INT_MAX_STR_DIGITS=4300): """Backport of the integer string length conversion limitation https://docs.python.org/3/library/stdtypes.html#int-max-str-digits """ if len(s) > INT_MAX_STR_DIGITS: raise ValueError("Exceeds the limit (%s) for integer string conversion: value has %s digits" % (INT_MAX_STR_DIGITS, len(s))) return int(s) def scan_four_digit_hex(s, end, _m=re.compile(r'^[0-9a-fA-F]{4}$').match): """Scan a four digit hex number from s[end:end + 4] """ msg = "Invalid \\uXXXX escape sequence" esc = s[end:end + 4] if not _m(esc): raise JSONDecodeError(msg, s, end - 2) try: return int(esc, 16), end + 4 except ValueError: raise JSONDecodeError(msg, s, end - 2) def py_scanstring(s, end, encoding=None, strict=True, _b=BACKSLASH, _m=STRINGCHUNK.match, _join=u''.join, _PY3=PY3, _maxunicode=sys.maxunicode, _scan_four_digit_hex=scan_four_digit_hex): """Scan the string s for a JSON string. End is the index of the character in s after the quote that started the JSON string. Unescapes all valid JSON string escape sequences and raises ValueError on attempt to decode an invalid string. If strict is False then literal control characters are allowed in the string. 
    Returns a tuple of the decoded string and the index of the character in s
    after the end quote."""
    if encoding is None:
        encoding = DEFAULT_ENCODING
    chunks = []
    _append = chunks.append
    begin = end - 1
    while 1:
        chunk = _m(s, end)
        if chunk is None:
            raise JSONDecodeError(
                "Unterminated string starting at", s, begin)
        prev_end = end
        end = chunk.end()
        content, terminator = chunk.groups()
        # Content contains zero or more unescaped string characters
        if content:
            if not _PY3 and not isinstance(content, unicode):
                content = unicode(content, encoding)
            _append(content)
        # Terminator is the end of string, a literal control character,
        # or a backslash denoting that an escape sequence follows
        if terminator == '"':
            break
        elif terminator != '\\':
            if strict:
                msg = "Invalid control character %r at"
                raise JSONDecodeError(msg, s, prev_end)
            else:
                _append(terminator)
                continue
        try:
            esc = s[end]
        except IndexError:
            raise JSONDecodeError(
                "Unterminated string starting at", s, begin)
        # If not a unicode escape sequence, must be in the lookup table
        if esc != 'u':
            try:
                char = _b[esc]
            except KeyError:
                msg = "Invalid \\X escape sequence %r"
                raise JSONDecodeError(msg, s, end)
            end += 1
        else:
            # Unicode escape sequence
            uni, end = _scan_four_digit_hex(s, end + 1)
            # Check for surrogate pair on UCS-4 systems
            # Note that this will join high/low surrogate pairs
            # but will also pass unpaired surrogates through
            if (_maxunicode > 65535 and
                uni & 0xfc00 == 0xd800 and
                s[end:end + 2] == '\\u'):
                uni2, end2 = _scan_four_digit_hex(s, end + 2)
                if uni2 & 0xfc00 == 0xdc00:
                    uni = 0x10000 + (((uni - 0xd800) << 10) |
                                     (uni2 - 0xdc00))
                    end = end2
            char = unichr(uni)
        # Append the unescaped character
        _append(char)
    return _join(chunks), end

# Use speedup if available
scanstring = c_scanstring or py_scanstring

WHITESPACE = re.compile(r'[ \t\n\r]*', FLAGS)
WHITESPACE_STR = ' \t\n\r'

def JSONObject(state, encoding, strict, scan_once, object_hook,
        object_pairs_hook, memo=None,
        _w=WHITESPACE.match, _ws=WHITESPACE_STR):
    (s, end) =
state # Backwards compatibility if memo is None: memo = {} memo_get = memo.setdefault pairs = [] # Use a slice to prevent IndexError from being raised, the following # check will raise a more specific ValueError if the string is empty nextchar = s[end:end + 1] # Normally we expect nextchar == '"' if nextchar != '"': if nextchar in _ws: end = _w(s, end).end() nextchar = s[end:end + 1] # Trivial empty object if nextchar == '}': if object_pairs_hook is not None: result = object_pairs_hook(pairs) return result, end + 1 pairs = {} if object_hook is not None: pairs = object_hook(pairs) return pairs, end + 1 elif nextchar != '"': raise JSONDecodeError( "Expecting property name enclosed in double quotes or '}'", s, end) end += 1 while True: key, end = scanstring(s, end, encoding, strict) key = memo_get(key, key) # To skip some function call overhead we optimize the fast paths where # the JSON key separator is ": " or just ":". if s[end:end + 1] != ':': end = _w(s, end).end() if s[end:end + 1] != ':': raise JSONDecodeError("Expecting ':' delimiter", s, end) end += 1 try: if s[end] in _ws: end += 1 if s[end] in _ws: end = _w(s, end + 1).end() except IndexError: pass value, end = scan_once(s, end) pairs.append((key, value)) try: nextchar = s[end] if nextchar in _ws: end = _w(s, end + 1).end() nextchar = s[end] except IndexError: nextchar = '' end += 1 if nextchar == '}': break elif nextchar != ',': raise JSONDecodeError("Expecting ',' delimiter or '}'", s, end - 1) try: nextchar = s[end] if nextchar in _ws: end += 1 nextchar = s[end] if nextchar in _ws: end = _w(s, end + 1).end() nextchar = s[end] except IndexError: nextchar = '' end += 1 if nextchar != '"': if nextchar == '}': raise JSONDecodeError( "Illegal trailing comma before end of object", s, end - 1) raise JSONDecodeError( "Expecting property name enclosed in double quotes", s, end - 1) if object_pairs_hook is not None: result = object_pairs_hook(pairs) return result, end pairs = dict(pairs) if object_hook is not 
None:
        pairs = object_hook(pairs)
    return pairs, end

def JSONArray(state, scan_once, array_hook=None,
        _w=WHITESPACE.match, _ws=WHITESPACE_STR):
    (s, end) = state
    values = []
    nextchar = s[end:end + 1]
    if nextchar in _ws:
        end = _w(s, end + 1).end()
        nextchar = s[end:end + 1]
    # Look-ahead for trivial empty array
    if nextchar == ']':
        if array_hook is not None:
            values = array_hook(values)
        return values, end + 1
    elif nextchar == '':
        raise JSONDecodeError("Expecting value or ']'", s, end)
    _append = values.append
    while True:
        value, end = scan_once(s, end)
        _append(value)
        nextchar = s[end:end + 1]
        if nextchar in _ws:
            end = _w(s, end + 1).end()
            nextchar = s[end:end + 1]
        end += 1
        if nextchar == ']':
            break
        elif nextchar != ',':
            raise JSONDecodeError("Expecting ',' delimiter or ']'", s, end - 1)

        try:
            if s[end] in _ws:
                end += 1
                if s[end] in _ws:
                    end = _w(s, end + 1).end()
        except IndexError:
            pass

        if s[end:end + 1] == ']':
            raise JSONDecodeError(
                "Illegal trailing comma before end of array", s, end - 1)

    if array_hook is not None:
        values = array_hook(values)
    return values, end

class JSONDecoder(object):
    """Simple JSON decoder

    Performs the following translations in decoding by default:

    +---------------+-------------------+
    | JSON          | Python            |
    +===============+===================+
    | object        | dict              |
    +---------------+-------------------+
    | array         | list              |
    +---------------+-------------------+
    | string        | str, unicode      |
    +---------------+-------------------+
    | number (int)  | int, long         |
    +---------------+-------------------+
    | number (real) | float             |
    +---------------+-------------------+
    | true          | True              |
    +---------------+-------------------+
    | false         | False             |
    +---------------+-------------------+
    | null          | None              |
    +---------------+-------------------+

    When allow_nan=True, it also understands ``NaN``, ``Infinity``, and
    ``-Infinity`` as their corresponding ``float`` values, which is outside
    the JSON spec.
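
    A minimal sketch of how the ``allow_nan`` switch described above could
    select the handler for the three non-standard constants. The helper name
    ``resolve_parse_constant`` is hypothetical and is not part of simplejson;
    the table mirrors the module-level ``_CONSTANTS`` mapping:

```python
# Illustrative only: resolve_parse_constant is a hypothetical helper,
# not a simplejson API. It mirrors how the decoder picks a handler for
# NaN, Infinity and -Infinity.
_CONSTANTS = {
    '-Infinity': float('-inf'),
    'Infinity': float('inf'),
    'NaN': float('nan'),
}


def resolve_parse_constant(parse_constant=None, allow_nan=False):
    # An explicit (truthy) parse_constant callable always wins.
    if parse_constant:
        return parse_constant
    # With allow_nan=True the three names map to float values.
    if allow_nan:
        return _CONSTANTS.__getitem__
    # Otherwise no handler is installed for the constants.
    return None


handler = resolve_parse_constant(allow_nan=True)
print(handler('Infinity'))         # inf
print(resolve_parse_constant())    # None
```

    With the defaults no handler is returned, which matches the idea that
    the constants are only understood when the caller opts in.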
""" def __init__(self, encoding=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, strict=True, object_pairs_hook=None, allow_nan=False, array_hook=None): """ *encoding* determines the encoding used to interpret any :class:`str` objects decoded by this instance (``'utf-8'`` by default). It has no effect when decoding :class:`unicode` objects. Note that currently only encodings that are a superset of ASCII work, strings of other encodings should be passed in as :class:`unicode`. *object_hook*, if specified, will be called with the result of every JSON object decoded and its return value will be used in place of the given :class:`dict`. This can be used to provide custom deserializations (e.g. to support JSON-RPC class hinting). *object_pairs_hook* is an optional function that will be called with the result of any object literal decode with an ordered list of pairs. The return value of *object_pairs_hook* will be used instead of the :class:`dict`. This feature can be used to implement custom decoders that rely on the order that the key and value pairs are decoded (for example, :func:`collections.OrderedDict` will remember the order of insertion). If *object_hook* is also defined, the *object_pairs_hook* takes priority. *parse_float*, if specified, will be called with the string of every JSON float to be decoded. By default, this is equivalent to ``float(num_str)``. This can be used to use another datatype or parser for JSON floats (e.g. :class:`decimal.Decimal`). *parse_int*, if specified, will be called with the string of every JSON int to be decoded. By default, this is equivalent to ``int(num_str)``. This can be used to use another datatype or parser for JSON integers (e.g. :class:`float`). *allow_nan*, if True (default false), will allow the parser to accept the non-standard floats ``NaN``, ``Infinity``, and ``-Infinity``. 
*parse_constant*, if specified, will be called with one of the following strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``. It is not recommended to use this feature, as it is rare to parse non-compliant JSON containing these values. *strict* controls the parser's behavior when it encounters an invalid control character in a string. The default setting of ``True`` means that unescaped control characters are parse errors, if ``False`` then control characters will be allowed in strings. """ if encoding is None: encoding = DEFAULT_ENCODING self.encoding = encoding self.object_hook = object_hook self.object_pairs_hook = object_pairs_hook self.parse_float = parse_float or float self.parse_int = parse_int or bounded_int self.parse_constant = parse_constant or (allow_nan and _CONSTANTS.__getitem__ or None) self.strict = strict self.array_hook = array_hook self.parse_object = JSONObject self.parse_array = JSONArray self.parse_string = scanstring self.memo = {} self.scan_once = make_scanner(self) def decode(self, s, _w=WHITESPACE.match, _PY3=PY3): """Return the Python representation of ``s`` (a ``str`` or ``unicode`` instance containing a JSON document) """ if _PY3 and isinstance(s, bytes): s = str(s, self.encoding) obj, end = self.raw_decode(s) end = _w(s, end).end() if end != len(s): raise JSONDecodeError("Extra data", s, end, len(s)) return obj def raw_decode(self, s, idx=0, _w=WHITESPACE.match, _PY3=PY3): """Decode a JSON document from ``s`` (a ``str`` or ``unicode`` beginning with a JSON document) and return a 2-tuple of the Python representation and the index in ``s`` where the document ended. Optionally, ``idx`` can be used to specify an offset in ``s`` where the JSON document begins. This can be used to decode a JSON document from a string that may have extraneous data at the end. """ if idx < 0: # Ensure that raw_decode bails on negative indexes, the regex # would otherwise mask this behavior. 
#98 raise JSONDecodeError('Expecting value', s, idx) if _PY3 and not isinstance(s, str): raise TypeError("Input string must be text, not bytes") # strip UTF-8 bom if len(s) > idx: ord0 = ord(s[idx]) if ord0 == 0xfeff: idx += 1 elif ord0 == 0xef and s[idx:idx + 3] == '\xef\xbb\xbf': idx += 3 return self.scan_once(s, idx=_w(s, idx).end()) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/encoder.py0000644000175100017510000007365315172736046017553 0ustar00runnerrunner"""Implementation of JSONEncoder """ from __future__ import absolute_import import re from operator import itemgetter # Do not import Decimal directly to avoid reload issues import decimal import sys from .compat import binary_type, text_type, string_types, integer_types, PY3 # PEP 678 add_note() is available on Python 3.11+ _HAS_ADD_NOTE = sys.version_info >= (3, 11) def _import_speedups(): try: from . import _speedups return (_speedups.encode_basestring_ascii, _speedups.encode_basestring, _speedups.make_encoder) except ImportError: return None, None, None c_encode_basestring_ascii, c_encode_basestring, c_make_encoder = ( _import_speedups()) from .decoder import PosInf from .raw_json import RawJSON ESCAPE = re.compile(r'[\x00-\x1f\\"]') ESCAPE_ASCII = re.compile(r'([\\"]|[^\ -~])') HAS_UTF8 = re.compile(r'[\x80-\xff]') ESCAPE_DCT = { '\\': '\\\\', '"': '\\"', '\b': '\\b', '\f': '\\f', '\n': '\\n', '\r': '\\r', '\t': '\\t', } for i in range(0x20): ESCAPE_DCT.setdefault(chr(i), '\\u%04x' % (i,)) del i FLOAT_REPR = repr # dict-like types that should be encoded as JSON objects. # frozendict is a builtin added in CPython 3.15 (PEP 814). 
if sys.version_info >= (3, 15): _dict_types = (dict, frozendict) else: _dict_types = dict def py_encode_basestring(s, _PY3=PY3, _q=u'"'): """Return a JSON representation of a Python string """ if _PY3: if isinstance(s, bytes): s = str(s, 'utf-8') elif type(s) is not str: # convert an str subclass instance to exact str # raise a TypeError otherwise s = str.__str__(s) else: if isinstance(s, str) and HAS_UTF8.search(s) is not None: s = unicode(s, 'utf-8') elif type(s) not in (str, unicode): # convert an str subclass instance to exact str # convert a unicode subclass instance to exact unicode # raise a TypeError otherwise if isinstance(s, str): s = str.__str__(s) else: s = unicode.__getnewargs__(s)[0] def replace(match): return ESCAPE_DCT[match.group(0)] return _q + ESCAPE.sub(replace, s) + _q def py_encode_basestring_ascii(s, _PY3=PY3): """Return an ASCII-only JSON representation of a Python string """ if _PY3: if isinstance(s, bytes): s = str(s, 'utf-8') elif type(s) is not str: # convert an str subclass instance to exact str # raise a TypeError otherwise s = str.__str__(s) else: if isinstance(s, str) and HAS_UTF8.search(s) is not None: s = unicode(s, 'utf-8') elif type(s) not in (str, unicode): # convert an str subclass instance to exact str # convert a unicode subclass instance to exact unicode # raise a TypeError otherwise if isinstance(s, str): s = str.__str__(s) else: s = unicode.__getnewargs__(s)[0] def replace(match): s = match.group(0) try: return ESCAPE_DCT[s] except KeyError: n = ord(s) if n < 0x10000: return '\\u%04x' % (n,) else: # surrogate pair n -= 0x10000 s1 = 0xd800 | ((n >> 10) & 0x3ff) s2 = 0xdc00 | (n & 0x3ff) return '\\u%04x\\u%04x' % (s1, s2) return '"' + str(ESCAPE_ASCII.sub(replace, s)) + '"' encode_basestring_ascii = ( c_encode_basestring_ascii or py_encode_basestring_ascii) encode_basestring = ( c_encode_basestring or py_encode_basestring) class JSONEncoder(object): """Extensible JSON encoder for Python data structures. 
    Supports the following objects and types by default:

    +-------------------+---------------+
    | Python            | JSON          |
    +===================+===============+
    | dict, namedtuple  | object        |
    +-------------------+---------------+
    | list, tuple       | array         |
    +-------------------+---------------+
    | str, unicode      | string        |
    +-------------------+---------------+
    | int, long, float  | number        |
    +-------------------+---------------+
    | True              | true          |
    +-------------------+---------------+
    | False             | false         |
    +-------------------+---------------+
    | None              | null          |
    +-------------------+---------------+

    To extend this to recognize other objects, subclass and implement a
    ``.default()`` method that returns a serializable object for ``o`` if
    possible; otherwise it should call the superclass implementation (to
    raise ``TypeError``).

    """
    item_separator = ', '
    key_separator = ': '

    def __init__(self, skipkeys=False, ensure_ascii=True,
                 check_circular=True, allow_nan=False, sort_keys=False,
                 indent=None, separators=None, encoding='utf-8', default=None,
                 use_decimal=True, namedtuple_as_object=True,
                 tuple_as_array=True, bigint_as_string=False,
                 item_sort_key=None, for_json=False, ignore_nan=False,
                 int_as_string_bitcount=None, iterable_as_array=False):
        """Constructor for JSONEncoder, with sensible defaults.

        If skipkeys is false, then it is a TypeError to attempt
        encoding of keys that are not str, int, long, float or None. If
        skipkeys is True, such items are simply skipped.

        If ensure_ascii is true, the output is guaranteed to be str
        objects with all incoming unicode characters escaped. If
        ensure_ascii is false, the output will be a unicode object.

        If check_circular is true, then lists, dicts, and custom encoded
        objects will be checked for circular references during encoding to
        prevent an infinite recursion (which would cause a
        ``RecursionError``, or a ``RuntimeError`` on Python 2.7).
        Otherwise, no such check takes place.
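
As an illustration of the three options above, the sketch below exercises
them through ``dumps``. It is a minimal example, not part of the encoder.
The fallback import of the standard library ``json`` module is an assumption
made so the snippet runs without simplejson installed; that module accepts
the same three keyword arguments.

```python
try:
    import simplejson as json
except ImportError:
    # Assumption: the stdlib module accepts the same three options.
    import json

# skipkeys=True drops the tuple key instead of raising TypeError.
skipped = json.dumps({"ok": 1, ("a", "b"): 2}, skipkeys=True)

# ensure_ascii=True (the default) escapes non-ASCII characters;
# ensure_ascii=False passes them through unchanged.
word = "caf" + chr(233)
escaped = json.dumps(word)
unescaped = json.dumps(word, ensure_ascii=False)

# check_circular=True (the default) reports a self-referencing list
# as a ValueError instead of recursing until the stack is exhausted.
loop = []
loop.append(loop)
try:
    json.dumps(loop)
    circular_error = None
except ValueError as exc:
    circular_error = exc

print(skipped)
print(escaped.isascii(), unescaped.isascii())
print(type(circular_error).__name__)
```

Passing ``check_circular=False`` skips the bookkeeping but leaves the caller
responsible for never handing the encoder a self-referencing container.
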
If allow_nan is true (default: False), then out of range float values (nan, inf, -inf) will be serialized to their JavaScript equivalents (NaN, Infinity, -Infinity) instead of raising a ValueError. See ignore_nan for ECMA-262 compliant behavior. If sort_keys is true, then the output of dictionaries will be sorted by key; this is useful for regression tests to ensure that JSON serializations can be compared on a day-to-day basis. If indent is a string, then JSON array elements and object members will be pretty-printed with a newline followed by that string repeated for each level of nesting. ``None`` (the default) selects the most compact representation without any newlines. For backwards compatibility with versions of simplejson earlier than 2.1.0, an integer is also accepted and is converted to a string with that many spaces. If specified, separators should be an (item_separator, key_separator) tuple. The default is (', ', ': ') if *indent* is ``None`` and (',', ': ') otherwise. To get the most compact JSON representation, you should specify (',', ':') to eliminate whitespace. If specified, default is a function that gets called for objects that can't otherwise be serialized. It should return a JSON encodable version of the object or raise a ``TypeError``. If encoding is not None, then all input strings will be transformed into unicode using that encoding prior to JSON-encoding. The default is UTF-8. If use_decimal is true (default: ``True``), ``decimal.Decimal`` will be supported directly by the encoder. For the inverse, decode JSON with ``parse_float=decimal.Decimal``. If namedtuple_as_object is true (the default), objects with ``_asdict()`` methods will be encoded as JSON objects. If tuple_as_array is true (the default), tuple (and subclasses) will be encoded as JSON arrays. If *iterable_as_array* is true (default: ``False``), any object not in the above table that implements ``__iter__()`` will be encoded as a JSON array. 
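
The formatting options described above can be sketched as follows. This is
an illustrative example, not part of the encoder. The fallback to the
standard library ``json`` module is an assumption so the snippet runs without
simplejson installed, and it only uses options that both modules share.

```python
try:
    import simplejson as json
except ImportError:
    # Assumption: the stdlib module shares the options used below.
    import json

record = {"b": [1, 2], "a": None}

# sort_keys plus the most compact separators: no whitespace at all.
compact = json.dumps(record, sort_keys=True, separators=(",", ":"))

# An integer indent pretty-prints with that many spaces per level and
# switches the default item separator to "," (no trailing space).
pretty = json.dumps({"a": [1, 2]}, indent=2)

# default is called for objects the encoder cannot serialize itself;
# here a set is turned into a sorted list.
from_set = json.dumps({3, 1, 2}, default=sorted)

print(compact)
print(pretty)
print(from_set)
```

Options such as ``use_decimal`` and ``iterable_as_array`` are specific to
simplejson and are left out of this sketch for that reason.
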
        If bigint_as_string is true (not the default), ints 2**53 and higher
        or -2**53 and lower will be encoded as strings. This is to avoid the
        rounding that happens in JavaScript otherwise.

        If int_as_string_bitcount is a positive number (n), then ints
        greater than or equal to 2**n or lower than or equal to -2**n will be
        encoded as strings.

        If specified, item_sort_key is a callable used to sort the items in
        each dictionary. This is useful if you want to sort items other than
        in alphabetical order by key.

        If for_json is true (not the default), objects with a ``for_json()``
        method will use the return value of that method for encoding as JSON
        instead of the object.

        If *ignore_nan* is true (default: ``False``), then out of range
        :class:`float` values (``nan``, ``inf``, ``-inf``) will be serialized
        as ``null`` in compliance with the ECMA-262 specification. If true,
        this will override *allow_nan*.

        """

        self.skipkeys = skipkeys
        self.ensure_ascii = ensure_ascii
        self.check_circular = check_circular
        self.allow_nan = allow_nan
        self.sort_keys = sort_keys
        self.use_decimal = use_decimal
        self.namedtuple_as_object = namedtuple_as_object
        self.tuple_as_array = tuple_as_array
        self.iterable_as_array = iterable_as_array
        self.bigint_as_string = bigint_as_string
        self.item_sort_key = item_sort_key
        self.for_json = for_json
        self.ignore_nan = ignore_nan
        self.int_as_string_bitcount = int_as_string_bitcount
        if indent is not None and not isinstance(indent, string_types):
            indent = indent * ' '
        self.indent = indent
        if separators is not None:
            self.item_separator, self.key_separator = separators
        elif indent is not None:
            self.item_separator = ','
        if default is not None:
            self.default = default
        self.encoding = encoding

    def default(self, o):
        """Implement this method in a subclass such that it returns
        a serializable object for ``o``, or calls the base implementation
        (to raise a ``TypeError``).
For example, to support arbitrary iterators, you could implement default like this:: def default(self, o): try: iterable = iter(o) except TypeError: pass else: return list(iterable) return JSONEncoder.default(self, o) """ raise TypeError('Object of type %s is not JSON serializable' % o.__class__.__name__) def encode(self, o): """Return a JSON string representation of a Python data structure. >>> from simplejson import JSONEncoder >>> JSONEncoder().encode({"foo": ["bar", "baz"]}) '{"foo": ["bar", "baz"]}' """ # This is for extremely simple cases and benchmarks. if isinstance(o, binary_type): _encoding = self.encoding if (_encoding is not None and not (_encoding == 'utf-8')): o = text_type(o, _encoding) if isinstance(o, string_types): if self.ensure_ascii: return encode_basestring_ascii(o) else: return encode_basestring(o) # This doesn't pass the iterator directly to ''.join() because the # exceptions aren't as detailed. The list call should be roughly # equivalent to the PySequence_Fast that ''.join() would do. chunks = self.iterencode(o) if not isinstance(chunks, (list, tuple)): chunks = list(chunks) if self.ensure_ascii: return ''.join(chunks) else: return u''.join(chunks) def iterencode(self, o): """Encode the given object and yield each string representation as available. For example:: for chunk in JSONEncoder().iterencode(bigobject): mysocket.write(chunk) """ if self.check_circular: markers = {} else: markers = None if self.ensure_ascii: _encoder = encode_basestring_ascii else: _encoder = encode_basestring if self.encoding != 'utf-8' and self.encoding is not None: def _encoder(o, _orig_encoder=_encoder, _encoding=self.encoding): if isinstance(o, binary_type): o = text_type(o, _encoding) return _orig_encoder(o) def floatstr(o, allow_nan=self.allow_nan, ignore_nan=self.ignore_nan, _repr=FLOAT_REPR, _inf=PosInf, _neginf=-PosInf): # Check for specials. 
Note that this type of test is processor # and/or platform-specific, so do tests which don't depend on # the internals. if o != o: text = 'NaN' elif o == _inf: text = 'Infinity' elif o == _neginf: text = '-Infinity' else: if type(o) != float: # See #118, do not trust custom str/repr o = float(o) return _repr(o) if ignore_nan: text = 'null' elif not allow_nan: raise ValueError( "Out of range float values are not JSON compliant: " + repr(o)) return text key_memo = {} int_as_string_bitcount = ( 53 if self.bigint_as_string else self.int_as_string_bitcount) if c_make_encoder is not None: _iterencode = c_make_encoder( markers, self.default, _encoder, self.indent, self.key_separator, self.item_separator, self.sort_keys, self.skipkeys, self.allow_nan, key_memo, self.use_decimal, self.namedtuple_as_object, self.tuple_as_array, int_as_string_bitcount, self.item_sort_key, self.encoding, self.for_json, self.ignore_nan, decimal.Decimal, self.iterable_as_array) else: _iterencode = _make_iterencode( markers, self.default, _encoder, self.indent, floatstr, self.key_separator, self.item_separator, self.sort_keys, self.skipkeys, self.use_decimal, self.namedtuple_as_object, self.tuple_as_array, int_as_string_bitcount, self.item_sort_key, self.encoding, self.for_json, self.iterable_as_array, Decimal=decimal.Decimal) try: return _iterencode(o, 0) finally: key_memo.clear() class JSONEncoderForHTML(JSONEncoder): """An encoder that produces JSON safe to embed in HTML. To embed JSON content in, say, a script tag on a web page, the characters &, < and > should be escaped. They cannot be escaped with the usual entities (e.g. 
&) because they are not expanded within ' self.assertEqual( r'"\u003c/script\u003e\u003cscript\u003e' r'alert(\"gotcha\")\u003c/script\u003e"', self.encoder.encode(bad_string)) self.assertEqual( bad_string, self.decoder.decode( self.encoder.encode(bad_string))) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_errors.py0000644000175100017510000000665715172736046021651 0ustar00runnerrunnerimport sys, pickle import unittest from unittest import TestCase import simplejson as json from simplejson.compat import text_type, b class TestErrors(TestCase): def test_string_keys_error(self): data = [{'a': 'A', 'b': (2, 4), 'c': 3.0, ('d',): 'D tuple'}] try: json.dumps(data) except TypeError: err = sys.exc_info()[1] else: self.fail('Expected TypeError') self.assertEqual(str(err), 'keys must be str, int, float, bool or None, not tuple') def test_not_serializable(self): try: json.dumps(json) except TypeError: err = sys.exc_info()[1] else: self.fail('Expected TypeError') self.assertEqual(str(err), 'Object of type module is not JSON serializable') def test_decode_error(self): err = None try: json.loads('{}\na\nb') except json.JSONDecodeError: err = sys.exc_info()[1] else: self.fail('Expected JSONDecodeError') self.assertEqual(err.lineno, 2) self.assertEqual(err.colno, 1) self.assertEqual(err.endlineno, 3) self.assertEqual(err.endcolno, 2) def test_scan_error(self): err = None for t in (text_type, b): try: json.loads(t('{"asdf": "')) except json.JSONDecodeError: err = sys.exc_info()[1] else: self.fail('Expected JSONDecodeError') self.assertEqual(err.lineno, 1) self.assertEqual(err.colno, 10) def test_error_is_pickable(self): err = None try: json.loads('{}\na\nb') except json.JSONDecodeError: err = sys.exc_info()[1] else: self.fail('Expected JSONDecodeError') s = pickle.dumps(err) e = pickle.loads(s) self.assertEqual(err.msg, e.msg) self.assertEqual(err.doc, e.doc) self.assertEqual(err.pos, e.pos) 
self.assertEqual(err.end, e.end) @unittest.skipIf(sys.version_info < (3, 11), 'add_note requires Python 3.11+') def test_add_note_list_recursion(self): x = [] x.append(x) try: json.dumps(x) except ValueError as exc: self.assertEqual( exc.__notes__, ['when serializing list item 0']) else: self.fail('Expected ValueError') @unittest.skipIf(sys.version_info < (3, 11), 'add_note requires Python 3.11+') def test_add_note_dict_recursion(self): x = {} x['test'] = x try: json.dumps(x) except ValueError as exc: self.assertEqual( exc.__notes__, ["when serializing dict item 'test'"]) else: self.fail('Expected ValueError') @unittest.skipIf(sys.version_info < (3, 11), 'add_note requires Python 3.11+') def test_add_note_nested_error(self): try: json.dumps({'a': [1, object(), 3]}) except TypeError as exc: self.assertEqual(len(exc.__notes__), 3) self.assertEqual(exc.__notes__[0], 'when serializing object object') self.assertEqual(exc.__notes__[1], 'when serializing list item 1') self.assertEqual(exc.__notes__[2], "when serializing dict item 'a'") else: self.fail('Expected TypeError') ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_fail.py0000644000175100017510000001446615172736046021245 0ustar00runnerrunnerimport sys from unittest import TestCase import simplejson as json # 2007-10-05 JSONDOCS = [ # http://json.org/JSON_checker/test/fail1.json '"A JSON payload should be an object or array, not a string."', # http://json.org/JSON_checker/test/fail2.json '["Unclosed array"', # http://json.org/JSON_checker/test/fail3.json '{unquoted_key: "keys must be quoted"}', # http://json.org/JSON_checker/test/fail4.json '["extra comma",]', # http://json.org/JSON_checker/test/fail5.json '["double extra comma",,]', # http://json.org/JSON_checker/test/fail6.json '[ , "<-- missing value"]', # http://json.org/JSON_checker/test/fail7.json '["Comma after the close"],', # http://json.org/JSON_checker/test/fail8.json 
'["Extra close"]]', # http://json.org/JSON_checker/test/fail9.json '{"Extra comma": true,}', # http://json.org/JSON_checker/test/fail10.json '{"Extra value after close": true} "misplaced quoted value"', # http://json.org/JSON_checker/test/fail11.json '{"Illegal expression": 1 + 2}', # http://json.org/JSON_checker/test/fail12.json '{"Illegal invocation": alert()}', # http://json.org/JSON_checker/test/fail13.json '{"Numbers cannot have leading zeroes": 013}', # http://json.org/JSON_checker/test/fail14.json '{"Numbers cannot be hex": 0x14}', # http://json.org/JSON_checker/test/fail15.json '["Illegal backslash escape: \\x15"]', # http://json.org/JSON_checker/test/fail16.json '[\\naked]', # http://json.org/JSON_checker/test/fail17.json '["Illegal backslash escape: \\017"]', # http://json.org/JSON_checker/test/fail18.json '[[[[[[[[[[[[[[[[[[[["Too deep"]]]]]]]]]]]]]]]]]]]]', # http://json.org/JSON_checker/test/fail19.json '{"Missing colon" null}', # http://json.org/JSON_checker/test/fail20.json '{"Double colon":: null}', # http://json.org/JSON_checker/test/fail21.json '{"Comma instead of colon", null}', # http://json.org/JSON_checker/test/fail22.json '["Colon instead of comma": false]', # http://json.org/JSON_checker/test/fail23.json '["Bad value", truth]', # http://json.org/JSON_checker/test/fail24.json "['single quote']", # http://json.org/JSON_checker/test/fail25.json '["\ttab\tcharacter\tin\tstring\t"]', # http://json.org/JSON_checker/test/fail26.json '["tab\\ character\\ in\\ string\\ "]', # http://json.org/JSON_checker/test/fail27.json '["line\nbreak"]', # http://json.org/JSON_checker/test/fail28.json '["line\\\nbreak"]', # http://json.org/JSON_checker/test/fail29.json '[0e]', # http://json.org/JSON_checker/test/fail30.json '[0e+]', # http://json.org/JSON_checker/test/fail31.json '[0e+-1]', # http://json.org/JSON_checker/test/fail32.json '{"Comma instead if closing brace": true,', # http://json.org/JSON_checker/test/fail33.json '["mismatch"}', # 
http://code.google.com/p/simplejson/issues/detail?id=3 u'["A\u001FZ control characters in string"]', # misc based on coverage '{', '{]', '{"foo": "bar"]', '{"foo": "bar"', 'nul', 'nulx', '-', '-x', '-e', '-e0', '-Infinite', '-Inf', 'Infinit', 'Infinite', 'NaM', 'NuN', 'falsy', 'fal', 'trug', 'tru', '1e', '1ex', '1e-', '1e-x', ] SKIPS = { 1: "why not have a string payload?", 18: "spec doesn't specify any nesting limitations", } class TestFail(TestCase): def test_failures(self): for idx, doc in enumerate(JSONDOCS): idx = idx + 1 if idx in SKIPS: json.loads(doc) continue try: json.loads(doc) except json.JSONDecodeError: pass else: self.fail("Expected failure for fail%d.json: %r" % (idx, doc)) def test_array_decoder_issue46(self): # http://code.google.com/p/simplejson/issues/detail?id=46 for doc in [u'[,]', '[,]']: try: json.loads(doc) except json.JSONDecodeError: e = sys.exc_info()[1] self.assertEqual(e.pos, 1) self.assertEqual(e.lineno, 1) self.assertEqual(e.colno, 2) except Exception: e = sys.exc_info()[1] self.fail("Unexpected exception raised %r %s" % (e, e)) else: self.fail("Unexpected success parsing '[,]'") def test_truncated_input(self): test_cases = [ ('', 'Expecting value', 0), ('[', "Expecting value or ']'", 1), ('[42', "Expecting ',' delimiter", 3), ('[42,', 'Expecting value', 4), ('["', 'Unterminated string starting at', 1), ('["spam', 'Unterminated string starting at', 1), ('["spam"', "Expecting ',' delimiter", 7), ('["spam",', 'Expecting value', 8), ('{', "Expecting property name enclosed in double quotes or '}'", 1), ('{"', 'Unterminated string starting at', 1), ('{"spam', 'Unterminated string starting at', 1), ('{"spam"', "Expecting ':' delimiter", 7), ('{"spam":', 'Expecting value', 8), ('{"spam":42', "Expecting ',' delimiter", 10), ('{"spam":42,', 'Expecting property name enclosed in double quotes', 11), ('"', 'Unterminated string starting at', 0), ('"spam', 'Unterminated string starting at', 0), ('[,', "Expecting value", 1), ('--', 'Expecting 
value', 0), ('"\x18d', "Invalid control character %r", 1), ] for data, msg, idx in test_cases: try: json.loads(data) except json.JSONDecodeError: e = sys.exc_info()[1] self.assertEqual( e.msg[:len(msg)], msg, "%r doesn't start with %r for %r" % (e.msg, msg, data)) self.assertEqual( e.pos, idx, "pos %r != %r for %r" % (e.pos, idx, data)) except Exception: e = sys.exc_info()[1] self.fail("Unexpected exception raised %r %s" % (e, e)) else: self.fail("Unexpected success parsing '%r'" % (data,)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_float.py0000644000175100017510000000367715172736046021441 0ustar00runnerrunnerimport sys import math from unittest import TestCase from simplejson.compat import long_type, text_type import simplejson as json from simplejson.decoder import NaN, PosInf, NegInf class TestFloat(TestCase): def test_degenerates_allow(self): for inf in (PosInf, NegInf): self.assertEqual(json.loads(json.dumps(inf, allow_nan=True), allow_nan=True), inf) # Python 2.5 doesn't have math.isnan nan = json.loads(json.dumps(NaN, allow_nan=True), allow_nan=True) self.assertTrue((0 + nan) != nan) def test_degenerates_ignore(self): for f in (PosInf, NegInf, NaN): self.assertEqual(json.loads(json.dumps(f, ignore_nan=True)), None) def test_degenerates_deny(self): for f in (PosInf, NegInf, NaN): self.assertRaises(ValueError, json.dumps, f, allow_nan=False) for s in ('Infinity', '-Infinity', 'NaN'): self.assertRaises(ValueError, json.loads, s, allow_nan=False) self.assertRaises(ValueError, json.loads, s) def test_floats(self): for num in [1617161771.7650001, math.pi, math.pi**100, math.pi**-100, 3.1]: self.assertEqual(float(json.dumps(num)), num) self.assertEqual(json.loads(json.dumps(num)), num) self.assertEqual(json.loads(text_type(json.dumps(num))), num) def test_ints(self): for num in [1, long_type(1), 1<<32, 1<<64]: self.assertEqual(json.dumps(num), str(num)) 
self.assertEqual(int(json.dumps(num)), num) self.assertEqual(json.loads(json.dumps(num)), num) self.assertEqual(json.loads(text_type(json.dumps(num))), num) def test_float_range(self): try: float_range = [sys.float_info.min, sys.float_info.max] except AttributeError: float_range = [2.2250738585072014e-308, 1.7976931348623157e+308] self.assertEqual(json.loads(json.dumps(float_range)), float_range) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_for_json.py0000644000175100017510000001003615172736046022136 0ustar00runnerrunnerimport unittest import simplejson as json class ForJson(object): def for_json(self): return {'for_json': 1} class NestedForJson(object): def for_json(self): return {'nested': ForJson()} class ForJsonList(object): def for_json(self): return ['list'] class DictForJson(dict): def for_json(self): return {'alpha': 1} class ListForJson(list): def for_json(self): return ['list'] class TestForJson(unittest.TestCase): def assertRoundTrip(self, obj, other, for_json=True): if for_json is None: # None will use the default s = json.dumps(obj) else: s = json.dumps(obj, for_json=for_json) self.assertEqual( json.loads(s), other) def test_for_json_encodes_stand_alone_object(self): self.assertRoundTrip( ForJson(), ForJson().for_json()) def test_for_json_encodes_object_nested_in_dict(self): self.assertRoundTrip( {'hooray': ForJson()}, {'hooray': ForJson().for_json()}) def test_for_json_encodes_object_nested_in_list_within_dict(self): self.assertRoundTrip( {'list': [0, ForJson(), 2, 3]}, {'list': [0, ForJson().for_json(), 2, 3]}) def test_for_json_encodes_object_nested_within_object(self): self.assertRoundTrip( NestedForJson(), {'nested': {'for_json': 1}}) def test_for_json_encodes_list(self): self.assertRoundTrip( ForJsonList(), ForJsonList().for_json()) def test_for_json_encodes_list_within_object(self): self.assertRoundTrip( {'nested': ForJsonList()}, {'nested': 
ForJsonList().for_json()}) def test_for_json_encodes_dict_subclass(self): self.assertRoundTrip( DictForJson(a=1), DictForJson(a=1).for_json()) def test_for_json_encodes_list_subclass(self): self.assertRoundTrip( ListForJson(['l']), ListForJson(['l']).for_json()) def test_for_json_ignored_if_not_true_with_dict_subclass(self): for for_json in (None, False): self.assertRoundTrip( DictForJson(a=1), {'a': 1}, for_json=for_json) def test_for_json_ignored_if_not_true_with_list_subclass(self): for for_json in (None, False): self.assertRoundTrip( ListForJson(['l']), ['l'], for_json=for_json) def test_raises_typeerror_if_for_json_not_true_with_object(self): self.assertRaises(TypeError, json.dumps, ForJson()) self.assertRaises(TypeError, json.dumps, ForJson(), for_json=False) def test_getattr_exception_propagates(self): # Regression: _call_json_method used to PyErr_Clear() unconditionally # after PyObject_GetAttr() failed, swallowing MemoryError, # KeyboardInterrupt, and any other non-AttributeError raised by a # user __getattr__. Same bug class as commit d3ecab5 (iterable_as_array). class BadAttr(object): def __init__(self, attr, exc): self.attr = attr self.exc = exc def __getattr__(self, name): if name == self.attr: raise self.exc('from __getattr__: ' + name) raise AttributeError(name) for exc_type in (MemoryError, RuntimeError): self.assertRaises( exc_type, json.dumps, BadAttr('for_json', exc_type), for_json=True) self.assertRaises( exc_type, json.dumps, BadAttr('_asdict', exc_type), namedtuple_as_object=True) # AttributeError from __getattr__ must still be treated as # "method not present" and fall through silently. 
class NoJsonMethods(object): def __getattr__(self, name): raise AttributeError(name) self.assertRaises( TypeError, json.dumps, NoJsonMethods(), for_json=True) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_free_threading.py0000644000175100017510000000614115172736046023267 0ustar00runnerrunner"""Tests that exercise the C extension from multiple threads. These tests pass on any Python build but their real purpose is to catch data races on free-threaded builds (PEP 703) where the GIL is disabled. The test_free_threading CI job runs these with ``PYTHON_GIL=0`` on a free-threaded interpreter. """ import sys import threading from unittest import TestCase import simplejson from simplejson.tests._helpers import skip_if_speedups_missing class TestFreeThreading(TestCase): """Exercise the C extension from multiple threads simultaneously.""" N_THREADS = 8 N_ITER = 500 def _run_threads(self, worker): if sys.platform == 'emscripten': self.skipTest("threads not available on Emscripten") errors = [] def wrapped(): try: worker() except BaseException as e: errors.append(e) threads = [threading.Thread(target=wrapped) for _ in range(self.N_THREADS)] for t in threads: t.start() for t in threads: t.join() if errors: raise errors[0] @skip_if_speedups_missing def test_concurrent_encode(self): data = { "numbers": list(range(64)), "nested": {"key": "value", "list": [1, 2.5, None, True, False]}, "string": "hello \u00e9 world", } expected = simplejson.dumps(data, sort_keys=True) def worker(): for _ in range(self.N_ITER): self.assertEqual( simplejson.dumps(data, sort_keys=True), expected) self._run_threads(worker) @skip_if_speedups_missing def test_concurrent_decode(self): raw = ( '{"numbers": [1, 2, 3, 4, 5], ' '"nested": {"a": "b", "c": [true, false, null]}, ' '"string": "hello"}' ) expected = simplejson.loads(raw) def worker(): for _ in range(self.N_ITER): self.assertEqual(simplejson.loads(raw), expected) 
self._run_threads(worker) @skip_if_speedups_missing def test_concurrent_encode_decode(self): """Mix encode and decode on the same data across threads.""" data = {"items": list(range(32)), "flag": True, "name": "mix"} raw = simplejson.dumps(data, sort_keys=True) def worker(): for _ in range(self.N_ITER): s = simplejson.dumps(data, sort_keys=True) self.assertEqual(s, raw) self.assertEqual(simplejson.loads(s), data) self._run_threads(worker) @skip_if_speedups_missing def test_shared_encoder_instance(self): """A single encoder/decoder instance used by many threads.""" enc = simplejson.JSONEncoder(sort_keys=True) dec = simplejson.JSONDecoder() data = {"a": 1, "b": [1, 2, 3], "c": {"nested": True}} raw = enc.encode(data) def worker(): for _ in range(self.N_ITER): self.assertEqual(enc.encode(data), raw) self.assertEqual(dec.decode(raw), data) self._run_threads(worker) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_indent.py0000644000175100017510000000501015172736046021574 0ustar00runnerrunnerfrom unittest import TestCase import textwrap import simplejson as json from simplejson.compat import StringIO class TestIndent(TestCase): def test_indent(self): h = [['blorpie'], ['whoops'], [], 'd-shtaeou', 'd-nthiouh', 'i-vhbjkhnth', {'nifty': 87}, {'field': 'yes', 'morefield': False} ] expect = textwrap.dedent("""\ [ \t[ \t\t"blorpie" \t], \t[ \t\t"whoops" \t], \t[], \t"d-shtaeou", \t"d-nthiouh", \t"i-vhbjkhnth", \t{ \t\t"nifty": 87 \t}, \t{ \t\t"field": "yes", \t\t"morefield": false \t} ]""") d1 = json.dumps(h) d2 = json.dumps(h, indent='\t', sort_keys=True, separators=(',', ': ')) d3 = json.dumps(h, indent=' ', sort_keys=True, separators=(',', ': ')) d4 = json.dumps(h, indent=2, sort_keys=True, separators=(',', ': ')) h1 = json.loads(d1) h2 = json.loads(d2) h3 = json.loads(d3) h4 = json.loads(d4) self.assertEqual(h1, h) self.assertEqual(h2, h) self.assertEqual(h3, h) self.assertEqual(h4, h) 
self.assertEqual(d3, expect.replace('\t', ' ')) self.assertEqual(d4, expect.replace('\t', ' ')) # NOTE: Python 2.4 textwrap.dedent converts tabs to spaces, # so the following is expected to fail. Python 2.4 is not a # supported platform in simplejson 2.1.0+. self.assertEqual(d2, expect) def test_indent0(self): h = {3: 1} def check(indent, expected): d1 = json.dumps(h, indent=indent) self.assertEqual(d1, expected) sio = StringIO() json.dump(h, sio, indent=indent) self.assertEqual(sio.getvalue(), expected) # indent=0 should emit newlines check(0, '{\n"3": 1\n}') # indent=None is more compact check(None, '{"3": 1}') def test_separators(self): lst = [1,2,3,4] expect = '[\n1,\n2,\n3,\n4\n]' expect_spaces = '[\n1, \n2, \n3, \n4\n]' # Ensure that separators still works self.assertEqual( expect_spaces, json.dumps(lst, indent=0, separators=(', ', ': '))) # Force the new defaults self.assertEqual( expect, json.dumps(lst, indent=0, separators=(',', ': '))) # Added in 2.1.4 self.assertEqual( expect, json.dumps(lst, indent=0)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_item_sort_key.py0000644000175100017510000000254015172736046023175 0ustar00runnerrunnerfrom unittest import TestCase import simplejson as json from operator import itemgetter class TestItemSortKey(TestCase): def test_simple_first(self): a = {'a': 1, 'c': 5, 'jack': 'jill', 'pick': 'axe', 'array': [1, 5, 6, 9], 'tuple': (83, 12, 3), 'crate': 'dog', 'zeak': 'oh'} self.assertEqual( '{"a": 1, "c": 5, "crate": "dog", "jack": "jill", "pick": "axe", "zeak": "oh", "array": [1, 5, 6, 9], "tuple": [83, 12, 3]}', json.dumps(a, item_sort_key=json.simple_first)) def test_case(self): a = {'a': 1, 'c': 5, 'Jack': 'jill', 'pick': 'axe', 'Array': [1, 5, 6, 9], 'tuple': (83, 12, 3), 'crate': 'dog', 'zeak': 'oh'} self.assertEqual( '{"Array": [1, 5, 6, 9], "Jack": "jill", "a": 1, "c": 5, "crate": "dog", "pick": "axe", "tuple": [83, 12, 3], 
"zeak": "oh"}', json.dumps(a, item_sort_key=itemgetter(0))) self.assertEqual( '{"a": 1, "Array": [1, 5, 6, 9], "c": 5, "crate": "dog", "Jack": "jill", "pick": "axe", "tuple": [83, 12, 3], "zeak": "oh"}', json.dumps(a, item_sort_key=lambda kv: kv[0].lower())) def test_item_sort_key_value(self): # https://github.com/simplejson/simplejson/issues/173 a = {'a': 1, 'b': 0} self.assertEqual( '{"b": 0, "a": 1}', json.dumps(a, item_sort_key=lambda kv: kv[1])) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_iterable.py0000644000175100017510000000502415172736046022107 0ustar00runnerrunnerimport unittest from simplejson.compat import StringIO import simplejson as json def iter_dumps(obj, **kw): return ''.join(json.JSONEncoder(**kw).iterencode(obj)) def sio_dump(obj, **kw): sio = StringIO() json.dumps(obj, **kw) return sio.getvalue() class BadIter: """Object whose __iter__ raises a non-TypeError exception.""" def __init__(self, exc_type): self.exc_type = exc_type def __iter__(self): raise self.exc_type("from __iter__") class TestIterable(unittest.TestCase): def test_iterable(self): for l in ([], [1], [1, 2], [1, 2, 3]): for opts in [{}, {'indent': 2}]: for dumps in (json.dumps, iter_dumps, sio_dump): expect = dumps(l, **opts) default_expect = dumps(sum(l), **opts) # Default is False self.assertRaises(TypeError, dumps, iter(l), **opts) self.assertRaises(TypeError, dumps, iter(l), iterable_as_array=False, **opts) self.assertEqual(expect, dumps(iter(l), iterable_as_array=True, **opts)) # Ensure that the "default" gets called self.assertEqual(default_expect, dumps(iter(l), default=sum, **opts)) self.assertEqual(default_expect, dumps(iter(l), iterable_as_array=False, default=sum, **opts)) # Ensure that the "default" does not get called self.assertEqual( expect, dumps(iter(l), iterable_as_array=True, default=sum, **opts)) def test_iterable_as_array_propagates_non_typeerror(self): # Regression test: 
MemoryError from __iter__ must not be # swallowed by PyErr_Clear and replaced with TypeError. # Test against the C encoder directly since json.dumps may # use the Python fallback encoder. try: import simplejson._speedups as sp import decimal except ImportError: return # C extension not available def noop_default(obj): raise TypeError('not serializable') c_enc = sp.make_encoder( {}, noop_default, sp.encode_basestring_ascii, None, ', ', ': ', False, False, True, {}, False, False, False, None, None, 'utf-8', False, False, decimal.Decimal, True, # iterable_as_array=True ) for exc_type in (MemoryError, RuntimeError): with self.assertRaises(exc_type): c_enc(BadIter(exc_type), 0) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_namedtuple.py0000644000175100017510000001633715172736046022467 0ustar00runnerrunnerfrom __future__ import absolute_import import unittest import simplejson as json from simplejson.compat import StringIO try: from unittest import mock except ImportError: mock = None from collections import namedtuple Value = namedtuple('Value', ['value']) Point = namedtuple('Point', ['x', 'y']) class DuckValue(object): def __init__(self, *args): self.value = Value(*args) def _asdict(self): return self.value._asdict() class DuckPoint(object): def __init__(self, *args): self.point = Point(*args) def _asdict(self): return self.point._asdict() class DeadDuck(object): _asdict = None class DeadDict(dict): _asdict = None CONSTRUCTORS = [ lambda v: v, lambda v: [v], lambda v: [{'key': v}], ] class TestNamedTuple(unittest.TestCase): def test_namedtuple_dumps(self): for v in [Value(1), Point(1, 2), DuckValue(1), DuckPoint(1, 2)]: d = v._asdict() self.assertEqual(d, json.loads(json.dumps(v))) self.assertEqual( d, json.loads(json.dumps(v, namedtuple_as_object=True))) self.assertEqual(d, json.loads(json.dumps(v, tuple_as_array=False))) self.assertEqual( d, json.loads(json.dumps(v, 
namedtuple_as_object=True, tuple_as_array=False))) def test_namedtuple_dumps_false(self): for v in [Value(1), Point(1, 2)]: l = list(v) self.assertEqual( l, json.loads(json.dumps(v, namedtuple_as_object=False))) self.assertRaises(TypeError, json.dumps, v, tuple_as_array=False, namedtuple_as_object=False) def test_namedtuple_dump(self): for v in [Value(1), Point(1, 2), DuckValue(1), DuckPoint(1, 2)]: d = v._asdict() sio = StringIO() json.dump(v, sio) self.assertEqual(d, json.loads(sio.getvalue())) sio = StringIO() json.dump(v, sio, namedtuple_as_object=True) self.assertEqual( d, json.loads(sio.getvalue())) sio = StringIO() json.dump(v, sio, tuple_as_array=False) self.assertEqual(d, json.loads(sio.getvalue())) sio = StringIO() json.dump(v, sio, namedtuple_as_object=True, tuple_as_array=False) self.assertEqual( d, json.loads(sio.getvalue())) def test_namedtuple_dump_false(self): for v in [Value(1), Point(1, 2)]: l = list(v) sio = StringIO() json.dump(v, sio, namedtuple_as_object=False) self.assertEqual( l, json.loads(sio.getvalue())) self.assertRaises(TypeError, json.dump, v, StringIO(), tuple_as_array=False, namedtuple_as_object=False) def test_asdict_not_callable_dump(self): for f in CONSTRUCTORS: self.assertRaises( TypeError, json.dump, f(DeadDuck()), StringIO(), namedtuple_as_object=True ) sio = StringIO() json.dump(f(DeadDict()), sio, namedtuple_as_object=True) self.assertEqual( json.dumps(f({})), sio.getvalue()) self.assertRaises( TypeError, json.dump, f(Value), StringIO(), namedtuple_as_object=True ) def test_asdict_not_callable_dumps(self): for f in CONSTRUCTORS: self.assertRaises(TypeError, json.dumps, f(DeadDuck()), namedtuple_as_object=True) self.assertRaises( TypeError, json.dumps, f(Value), namedtuple_as_object=True ) self.assertEqual( json.dumps(f({})), json.dumps(f(DeadDict()), namedtuple_as_object=True)) def test_asdict_unbound_method_dumps(self): for f in CONSTRUCTORS: self.assertEqual( json.dumps(f(Value), default=lambda v: v.__name__), 
json.dumps(f(Value.__name__)) ) def test_asdict_does_not_return_dict(self): if not mock: raise unittest.SkipTest("unittest.mock required") fake = mock.Mock() self.assertTrue(hasattr(fake, '_asdict')) self.assertTrue(callable(fake._asdict)) self.assertFalse(isinstance(fake._asdict(), dict)) # https://github.com/simplejson/simplejson/pull/284 # When running under a debug build of CPython (COPTS=-UNDEBUG) # a C assertion could fire due to an unchecked error from a PyDict # API call on a non-dict internally in _speedups.c. Without a debug # build of CPython this test likely passes either way despite the # potential for internal data corruption. Getting it to crash in # a debug build is not always easy either, as it requires an # assert(!PyErr_Occurred()) that could fire later on. with self.assertRaises(TypeError): json.dumps({23: fake}, namedtuple_as_object=True, for_json=False) def test_asdict_dispatch_order(self): # Regression: the Python encoder used to check isinstance(o, list) # BEFORE _asdict(), so a list subclass that defines _asdict() # encoded as a plain list under Py and as an object under C. # Consolidated on C's order: _asdict() takes precedence over # list/tuple/dict container checks. for_json still wins over # _asdict in both backends. # # The _cibw_runner harness runs the whole test suite once with # the C speedups active and once with them disabled, so a single # assertion here covers both backends.
class ListAsdict(list): def _asdict(self): return {'from_asdict': list(self)} class TupleAsdict(tuple): def _asdict(self): return {'from_asdict': list(self)} class DictAsdict(dict): def _asdict(self): return {'from_asdict': dict(self)} class ListBoth(list): def _asdict(self): return {'from_asdict': True} def for_json(self): return {'from_for_json': True} self.assertEqual( {'from_asdict': [10, 20]}, json.loads(json.dumps(ListAsdict([10, 20])))) self.assertEqual( {'from_asdict': [1, 2]}, json.loads(json.dumps(TupleAsdict((1, 2))))) self.assertEqual( {'from_asdict': {'x': 1}}, json.loads(json.dumps(DictAsdict(x=1)))) # Nested inside a plain list. self.assertEqual( [{'from_asdict': [10]}], json.loads(json.dumps([ListAsdict([10])]))) # Nested inside a plain dict. self.assertEqual( {'k': {'from_asdict': [10]}}, json.loads(json.dumps({'k': ListAsdict([10])}))) # for_json must still outrank _asdict. self.assertEqual( {'from_for_json': True}, json.loads(json.dumps(ListBoth([1]), for_json=True))) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_pass1.py0000644000175100017510000000332215172736046021346 0ustar00runnerrunnerfrom unittest import TestCase import simplejson as json # from http://json.org/JSON_checker/test/pass1.json JSON = r''' [ "JSON Test Pattern pass1", {"object with 1 member":["array with 1 element"]}, {}, [], -42, true, false, null, { "integer": 1234567890, "real": -9876.543210, "e": 0.123456789e-12, "E": 1.234567890E+34, "": 23456789012E66, "zero": 0, "one": 1, "space": " ", "quote": "\"", "backslash": "\\", "controls": "\b\f\n\r\t", "slash": "/ & \/", "alpha": "abcdefghijklmnopqrstuvwyz", "ALPHA": "ABCDEFGHIJKLMNOPQRSTUVWYZ", "digit": "0123456789", "special": "`1~!@#$%^&*()_+-={':[,]}|;.?", "hex": "\u0123\u4567\u89AB\uCDEF\uabcd\uef4A", "true": true, "false": false, "null": null, "array":[ ], "object":{ }, "address": "50 St. 
James Street", "url": "http://www.JSON.org/", "comment": "// /* */": " ", " s p a c e d " :[1,2 , 3 , 4 , 5 , 6 ,7 ],"compact": [1,2,3,4,5,6,7], "jsontext": "{\"object with 1 member\":[\"array with 1 element\"]}", "quotes": "" \u0022 %22 0x22 034 "", "\/\\\"\uCAFE\uBABE\uAB98\uFCDE\ubcda\uef4A\b\f\n\r\t`1~!@#$%^&*()_+-=[]{}|;:',./<>?" : "A key can be any string" }, 0.5 ,98.6 , 99.44 , 1066, 1e1, 0.1e1, 1e-1, 1e00,2e+00,2e-00 ,"rosebud"] ''' class TestPass1(TestCase): def test_parse(self): # test in/out equivalence and parsing res = json.loads(JSON) out = json.dumps(res) self.assertEqual(res, json.loads(out)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_pass2.py0000644000175100017510000000060215172736046021345 0ustar00runnerrunnerfrom unittest import TestCase import simplejson as json # from http://json.org/JSON_checker/test/pass2.json JSON = r''' [[[[[[[[[[[[[[[[[[["Not too deep"]]]]]]]]]]]]]]]]]]] ''' class TestPass2(TestCase): def test_parse(self): # test in/out equivalence and parsing res = json.loads(JSON) out = json.dumps(res) self.assertEqual(res, json.loads(out)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_pass3.py0000644000175100017510000000074215172736046021353 0ustar00runnerrunnerfrom unittest import TestCase import simplejson as json # from http://json.org/JSON_checker/test/pass3.json JSON = r''' { "JSON Test Pattern pass3": { "The outermost value": "must be an object or array.", "In this test": "It is an object." 
} } ''' class TestPass3(TestCase): def test_parse(self): # test in/out equivalence and parsing res = json.loads(JSON) out = json.dumps(res) self.assertEqual(res, json.loads(out)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_raw_json.py0000644000175100017510000000204615172736046022143 0ustar00runnerrunnerimport unittest import simplejson as json dct1 = { 'key1': 'value1' } dct2 = { 'key2': 'value2', 'd1': dct1 } dct3 = { 'key2': 'value2', 'd1': json.dumps(dct1) } dct4 = { 'key2': 'value2', 'd1': json.RawJSON(json.dumps(dct1)) } class TestRawJson(unittest.TestCase): def test_normal_str(self): self.assertNotEqual(json.dumps(dct2), json.dumps(dct3)) def test_raw_json_str(self): self.assertEqual(json.dumps(dct2), json.dumps(dct4)) self.assertEqual(dct2, json.loads(json.dumps(dct4))) def test_list(self): self.assertEqual( json.dumps([dct2]), json.dumps([json.RawJSON(json.dumps(dct2))])) self.assertEqual( [dct2], json.loads(json.dumps([json.RawJSON(json.dumps(dct2))]))) def test_direct(self): self.assertEqual( json.dumps(dct2), json.dumps(json.RawJSON(json.dumps(dct2)))) self.assertEqual( dct2, json.loads(json.dumps(json.RawJSON(json.dumps(dct2))))) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_recursion.py0000644000175100017510000000321715172736046022333 0ustar00runnerrunnerfrom unittest import TestCase import simplejson as json class JSONTestObject: pass class RecursiveJSONEncoder(json.JSONEncoder): recurse = False def default(self, o): if o is JSONTestObject: if self.recurse: return [JSONTestObject] else: return 'JSONTestObject' # Delegate to the base class; the unbound call needs an explicit self. return json.JSONEncoder.default(self, o) class TestRecursion(TestCase): def test_listrecursion(self): x = [] x.append(x) try: json.dumps(x) except ValueError: pass else: self.fail("didn't raise ValueError on list recursion") x = [] y = [x] x.append(y) try: json.dumps(x) except
ValueError: pass else: self.fail("didn't raise ValueError on alternating list recursion") y = [] x = [y, y] # ensure that the marker is cleared json.dumps(x) def test_dictrecursion(self): x = {} x["test"] = x try: json.dumps(x) except ValueError: pass else: self.fail("didn't raise ValueError on dict recursion") x = {} y = {"a": x, "b": x} # ensure that the marker is cleared json.dumps(y) def test_defaultrecursion(self): enc = RecursiveJSONEncoder() self.assertEqual(enc.encode(JSONTestObject), '"JSONTestObject"') enc.recurse = True try: enc.encode(JSONTestObject) except ValueError: pass else: self.fail("didn't raise ValueError on default recursion") ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_scanstring.py0000644000175100017510000002750215172736046022500 0ustar00runnerrunnerimport sys from unittest import TestCase import simplejson as json import simplejson.decoder from simplejson.compat import b, PY3 class TestScanString(TestCase): # The bytes type is intentionally not used in most of these tests # under Python 3 because the decoder immediately coerces to str before # calling scanstring. In Python 2 we are testing the code paths # for both unicode and str. # # The reason this is done is because Python 3 would require # entirely different code paths for parsing bytes and str. 
# def test_py_scanstring(self): self._test_scanstring(simplejson.decoder.py_scanstring) def test_c_scanstring(self): if not simplejson.decoder.c_scanstring: return self._test_scanstring(simplejson.decoder.c_scanstring) self.assertTrue(isinstance(simplejson.decoder.c_scanstring('""', 0)[0], str)) def _test_scanstring(self, scanstring): if sys.maxunicode == 65535: self.assertEqual( scanstring(u'"z\U0001d120x"', 1, None, True), (u'z\U0001d120x', 6)) else: self.assertEqual( scanstring(u'"z\U0001d120x"', 1, None, True), (u'z\U0001d120x', 5)) self.assertEqual( scanstring('"\\u007b"', 1, None, True), (u'{', 8)) self.assertEqual( scanstring('"A JSON payload should be an object or array, not a string."', 1, None, True), (u'A JSON payload should be an object or array, not a string.', 60)) self.assertEqual( scanstring('["Unclosed array"', 2, None, True), (u'Unclosed array', 17)) self.assertEqual( scanstring('["extra comma",]', 2, None, True), (u'extra comma', 14)) self.assertEqual( scanstring('["double extra comma",,]', 2, None, True), (u'double extra comma', 21)) self.assertEqual( scanstring('["Comma after the close"],', 2, None, True), (u'Comma after the close', 24)) self.assertEqual( scanstring('["Extra close"]]', 2, None, True), (u'Extra close', 14)) self.assertEqual( scanstring('{"Extra comma": true,}', 2, None, True), (u'Extra comma', 14)) self.assertEqual( scanstring('{"Extra value after close": true} "misplaced quoted value"', 2, None, True), (u'Extra value after close', 26)) self.assertEqual( scanstring('{"Illegal expression": 1 + 2}', 2, None, True), (u'Illegal expression', 21)) self.assertEqual( scanstring('{"Illegal invocation": alert()}', 2, None, True), (u'Illegal invocation', 21)) self.assertEqual( scanstring('{"Numbers cannot have leading zeroes": 013}', 2, None, True), (u'Numbers cannot have leading zeroes', 37)) self.assertEqual( scanstring('{"Numbers cannot be hex": 0x14}', 2, None, True), (u'Numbers cannot be hex', 24)) self.assertEqual( 
scanstring('[[[[[[[[[[[[[[[[[[[["Too deep"]]]]]]]]]]]]]]]]]]]]', 21, None, True), (u'Too deep', 30)) self.assertEqual( scanstring('{"Missing colon" null}', 2, None, True), (u'Missing colon', 16)) self.assertEqual( scanstring('{"Double colon":: null}', 2, None, True), (u'Double colon', 15)) self.assertEqual( scanstring('{"Comma instead of colon", null}', 2, None, True), (u'Comma instead of colon', 25)) self.assertEqual( scanstring('["Colon instead of comma": false]', 2, None, True), (u'Colon instead of comma', 25)) self.assertEqual( scanstring('["Bad value", truth]', 2, None, True), (u'Bad value', 12)) for c in map(chr, range(0x00, 0x1f)): self.assertEqual( scanstring(c + '"', 0, None, False), (c, 2)) self.assertRaises( ValueError, scanstring, c + '"', 0, None, True) self.assertRaises(ValueError, scanstring, '', 0, None, True) self.assertRaises(ValueError, scanstring, 'a', 0, None, True) self.assertRaises(ValueError, scanstring, '\\', 0, None, True) self.assertRaises(ValueError, scanstring, '\\u', 0, None, True) self.assertRaises(ValueError, scanstring, '\\u0', 0, None, True) self.assertRaises(ValueError, scanstring, '\\u01', 0, None, True) self.assertRaises(ValueError, scanstring, '\\u012', 0, None, True) self.assertRaises(ValueError, scanstring, '\\u0123', 0, None, True) if sys.maxunicode > 65535: self.assertRaises(ValueError, scanstring, '\\ud834\\u"', 0, None, True) self.assertRaises(ValueError, scanstring, '\\ud834\\x0123"', 0, None, True) self.assertRaises(json.JSONDecodeError, scanstring, '\\u-123"', 0, None, True) # SJ-PT-23-01: Invalid Handling of Broken Unicode Escape Sequences self.assertRaises(json.JSONDecodeError, scanstring, '\\u EDD"', 0, None, True) def test_issue3623(self): self.assertRaises(ValueError, json.decoder.scanstring, "xxx", 1, "xxx") self.assertRaises(UnicodeDecodeError, json.encoder.encode_basestring_ascii, b("xx\xff")) def test_overflow(self): # Python 2.5 does not have maxsize, Python 3 does not have maxint maxsize = getattr(sys, 
'maxsize', getattr(sys, 'maxint', None)) assert maxsize is not None self.assertRaises(OverflowError, json.decoder.scanstring, "xxx", maxsize + 1) def test_end_out_of_bounds_is_jsondecodeerror(self): # Regression: C scanstring used to raise a plain ValueError for # out-of-range end indices, while py_scanstring raises # JSONDecodeError. User code with `except JSONDecodeError:` missed # the C path. Both backends now raise JSONDecodeError with the # "Unterminated string starting at" message at pos = end - 1. for s, end in ( (u'"abc"', 100), (u'abc', 100), (u'', 100), (u'abc', -1), (u'', -1), ): with self.assertRaises(json.JSONDecodeError) as cm: json.decoder.scanstring(s, end, None, True) self.assertEqual(cm.exception.pos, end - 1, 'scanstring(%r, %r) pos=%r, expected %r' % (s, end, cm.exception.pos, end - 1)) self.assertIn('Unterminated string', str(cm.exception)) def test_surrogates(self): scanstring = json.decoder.scanstring def assertScan(given, expect, test_utf8=True): givens = [given] if not PY3 and test_utf8: givens.append(given.encode('utf8')) for given in givens: (res, count) = scanstring(given, 1, None, True) self.assertEqual(len(given), count) self.assertEqual(res, expect) assertScan( u'"z\\ud834\\u0079x"', u'z\ud834yx') assertScan( u'"z\\ud834\\udd20x"', u'z\U0001d120x') assertScan( u'"z\\ud834\\ud834\\udd20x"', u'z\ud834\U0001d120x') assertScan( u'"z\\ud834x"', u'z\ud834x') assertScan( u'"z\\udd20x"', u'z\udd20x') assertScan( u'"z\ud834x"', u'z\ud834x') # It may look strange to join strings together, but Python is drunk. 
# https://gist.github.com/etrepum/5538443 assertScan( u'"z\\ud834\udd20x12345"', u''.join([u'z\ud834', u'\udd20x12345'])) assertScan( u'"z\ud834\\udd20x"', u''.join([u'z\ud834', u'\udd20x'])) # these have different behavior given UTF8 input, because the surrogate # pair may be joined (in maxunicode > 65535 builds) assertScan( u''.join([u'"z\ud834', u'\udd20x"']), u''.join([u'z\ud834', u'\udd20x']), test_utf8=False) self.assertRaises(ValueError, scanstring, u'"z\\ud83x"', 1, None, True) self.assertRaises(ValueError, scanstring, u'"z\\ud834\\udd2x"', 1, None, True) def test_escape_error_parity(self): # Regression: the C scanstring bounds check was `end >= len` / the # surrogate-pair bounds check was `end + 6 < len`. Both were # off-by-one, causing C to raise "Invalid \\uXXXX escape sequence" # where pure-Python correctly raised "Unterminated string starting # at" when a \\uXXXX escape used the last bytes of the buffer. The # error-position offset also differed: C reported the position of # the 'u' while Python reported the position of the leading '\'. # This test asserts exact parity (exception class, position, and # message prefix) across a matrix of edge cases. if simplejson.decoder.c_scanstring is None: return def get_exc(scanstring, s): try: scanstring(s, 0, None, True) except json.JSONDecodeError as e: return (e.pos, str(e).split(':')[0]) return None # Each case: (input, expected), where expected is the # (pos, message_prefix) pair returned by get_exc. UNTERMINATED is # (-1, 'Unterminated string starting at'); INVALID(pos) is the # positional 'Invalid \\uXXXX escape sequence' error. UNTERMINATED = (-1, 'Unterminated string starting at') def INVALID(pos): return (pos, 'Invalid \\uXXXX escape sequence') cases = [ # Not enough room for 4 hex digits after \u. (u'\\u', INVALID(0)), (u'\\u0', INVALID(0)), (u'\\u01', INVALID(0)), (u'\\u012', INVALID(0)), # 4 non-hex chars after \u -- C used to raise at the 'u'.
(u'\\uXXXX', INVALID(0)), # Exactly 4 hex digits at buffer end -- C used to mis-report # 'Invalid \\uXXXX escape' instead of 'Unterminated string'. (u'\\u0123', UNTERMINATED), # Lone high surrogate with no room for a second escape. (u'\\ud834', UNTERMINATED), # High surrogate followed by a truncated second escape. (u'\\ud834\\u', INVALID(6)), (u'\\ud834\\ux', INVALID(6)), (u'\\ud834\\udd2', INVALID(6)), (u'\\ud834\\udd2x', INVALID(6)), # High surrogate followed by a valid low surrogate that ends # exactly at the buffer edge -- must combine before the outer # loop reports an unterminated string. (u'\\ud834\\udd1e', UNTERMINATED), (u'prefix\\ud834\\udd1e', UNTERMINATED), ] for s, expected in cases: py = get_exc(simplejson.decoder.py_scanstring, s) c = get_exc(simplejson.decoder.c_scanstring, s) self.assertEqual(py, expected, 'py_scanstring(%r) expected %r, got %r' % (s, expected, py)) self.assertEqual(c, expected, 'c_scanstring(%r) expected %r, got %r' % (s, expected, c)) # Success paths: valid escape or surrogate pair ending at the # closing quote must still parse correctly.
for scanstring in (simplejson.decoder.py_scanstring, simplejson.decoder.c_scanstring): self.assertEqual( scanstring(u'\\u0123"', 0, None, True), (u'\u0123', 7)) self.assertEqual( scanstring(u'\\ud834\\udd1e"', 0, None, True), (u'\U0001d11e', 13)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_separators.py0000644000175100017510000000165615172736046022512 0ustar00runnerrunnerimport textwrap from unittest import TestCase import simplejson as json class TestSeparators(TestCase): def test_separators(self): h = [['blorpie'], ['whoops'], [], 'd-shtaeou', 'd-nthiouh', 'i-vhbjkhnth', {'nifty': 87}, {'field': 'yes', 'morefield': False} ] expect = textwrap.dedent("""\ [ [ "blorpie" ] , [ "whoops" ] , [] , "d-shtaeou" , "d-nthiouh" , "i-vhbjkhnth" , { "nifty" : 87 } , { "field" : "yes" , "morefield" : false } ]""") d1 = json.dumps(h) d2 = json.dumps(h, indent=' ', sort_keys=True, separators=(' ,', ' : ')) h1 = json.loads(d1) h2 = json.loads(d2) self.assertEqual(h1, h) self.assertEqual(h2, h) self.assertEqual(d2, expect) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_speedups.py0000644000175100017510000003045615172736046022157 0ustar00runnerrunnerimport sys import unittest from unittest import TestCase import simplejson from simplejson import encoder, decoder, scanner from simplejson.compat import PY3, long_type, b from simplejson.tests._helpers import has_speedups, skip_if_speedups_missing class BadBool: def __bool__(self): 1/0 __nonzero__ = __bool__ class TestDecode(TestCase): @skip_if_speedups_missing def test_make_scanner(self): self.assertRaises(AttributeError, scanner.c_make_scanner, 1) @skip_if_speedups_missing def test_bad_bool_args(self): def test(value): decoder.JSONDecoder(strict=BadBool()).decode(value) self.assertRaises(ZeroDivisionError, test, '""') self.assertRaises(ZeroDivisionError, test, 
'{}') if not PY3: self.assertRaises(ZeroDivisionError, test, u'""') self.assertRaises(ZeroDivisionError, test, u'{}') class TestEncode(TestCase): @skip_if_speedups_missing def test_make_encoder(self): self.assertRaises( TypeError, encoder.c_make_encoder, None, ("\xCD\x7D\x3D\x4E\x12\x4C\xF9\x79\xD7" "\x52\xBA\x82\xF2\x27\x4A\x7D\xA0\xCA\x75"), None ) @skip_if_speedups_missing def test_bad_str_encoder(self): # Issue #31505: There shouldn't be an assertion failure in case # c_make_encoder() receives a bad encoder() argument. import decimal def bad_encoder1(*args): return None enc = encoder.c_make_encoder( None, lambda obj: str(obj), bad_encoder1, None, ': ', ', ', False, False, False, {}, False, False, False, None, None, 'utf-8', False, False, decimal.Decimal, False) self.assertRaises(TypeError, enc, 'spam', 4) self.assertRaises(TypeError, enc, {'spam': 42}, 4) def bad_encoder2(*args): 1/0 enc = encoder.c_make_encoder( None, lambda obj: str(obj), bad_encoder2, None, ': ', ', ', False, False, False, {}, False, False, False, None, None, 'utf-8', False, False, decimal.Decimal, False) self.assertRaises(ZeroDivisionError, enc, 'spam', 4) @skip_if_speedups_missing def test_bad_bool_args(self): def test(name): encoder.JSONEncoder(**{name: BadBool()}).encode({}) self.assertRaises(ZeroDivisionError, test, 'skipkeys') self.assertRaises(ZeroDivisionError, test, 'ensure_ascii') self.assertRaises(ZeroDivisionError, test, 'check_circular') self.assertRaises(ZeroDivisionError, test, 'allow_nan') self.assertRaises(ZeroDivisionError, test, 'sort_keys') self.assertRaises(ZeroDivisionError, test, 'use_decimal') self.assertRaises(ZeroDivisionError, test, 'namedtuple_as_object') self.assertRaises(ZeroDivisionError, test, 'tuple_as_array') self.assertRaises(ZeroDivisionError, test, 'bigint_as_string') self.assertRaises(ZeroDivisionError, test, 'for_json') self.assertRaises(ZeroDivisionError, test, 'ignore_nan') self.assertRaises(ZeroDivisionError, test, 'iterable_as_array') 
@skip_if_speedups_missing def test_int_as_string_bitcount_overflow(self): long_count = long_type(2)**32+31 def test(): encoder.JSONEncoder(int_as_string_bitcount=long_count).encode(0) self.assertRaises((TypeError, OverflowError), test) if PY3: @skip_if_speedups_missing def test_bad_encoding(self): with self.assertRaises(UnicodeEncodeError): encoder.JSONEncoder(encoding='\udcff').encode({b('key'): 123}) @unittest.skipIf(sys.version_info < (3, 13), "heap types require Python 3.13+") class TestHeapTypes(TestCase): """Verify that Scanner and Encoder are heap types on Python 3.13+.""" @skip_if_speedups_missing def test_scanner_is_heap_type(self): from simplejson._speedups import make_scanner # Py_TPFLAGS_HEAPTYPE = 1 << 9 self.assertTrue(make_scanner.__flags__ & (1 << 9), "Scanner should be a heap type on 3.13+") @skip_if_speedups_missing def test_encoder_is_heap_type(self): from simplejson._speedups import make_encoder self.assertTrue(make_encoder.__flags__ & (1 << 9), "Encoder should be a heap type on 3.13+") @skip_if_speedups_missing def test_scanner_type_is_gc_tracked(self): """Heap types must be GC-tracked so they can be collected.""" import gc from simplejson._speedups import make_scanner self.assertTrue(gc.is_tracked(make_scanner)) @skip_if_speedups_missing def test_encoder_type_is_gc_tracked(self): import gc from simplejson._speedups import make_encoder self.assertTrue(gc.is_tracked(make_encoder)) @skip_if_speedups_missing def test_scanner_instances_work(self): """Verify Scanner heap type instances decode correctly.""" result = simplejson.loads('{"a": 1}') self.assertEqual(result, {"a": 1}) @skip_if_speedups_missing def test_encoder_instances_work(self): """Verify Encoder heap type instances encode correctly.""" result = simplejson.dumps({"a": 1}, sort_keys=True) self.assertEqual(result, '{"a": 1}') @unittest.skipUnless(hasattr(sys, "gettotalrefcount"), "debug build required (sys.gettotalrefcount)") class TestRefcountLeaks(TestCase): """Catch refcount leaks in 
the C extension. These tests only run on debug builds of CPython, which expose sys.gettotalrefcount(). On release builds they skip silently. """ ITER = 2000 WARMUP = 200 def _assert_no_leak(self, func): """Run `func` in two measurement phases and verify the second phase's refcount delta stays near zero. A real per-call leak (1 ref per call) grows linearly with the iteration count, so both phase1 and phase2 would be ~ITER. But front-loaded noise -- specializer inline caches, dict resize, gc generation bumps, etc. -- shows up entirely in phase1 and leaves phase2 near zero. Asserting on phase2 only is thus both more sensitive (catches smaller linear leaks) and more robust (no false positives from CPython internals). """ import gc # Stabilize caches, specializer, intern pools, etc. for _ in range(self.WARMUP): func() gc.collect() # Collect every iteration so cyclic garbage doesn't accumulate # across GC generations and cause noisy refcount deltas. start = sys.gettotalrefcount() for _ in range(self.ITER): func() gc.collect() mid = sys.gettotalrefcount() for _ in range(self.ITER): func() gc.collect() end = sys.gettotalrefcount() phase1 = mid - start phase2 = end - mid msg = ("phase1=%d, phase2=%d, iterations=%d. A real per-call " "leak would make phase2 grow linearly with iterations." % (phase1, phase2, self.ITER)) # phase2 observed as 1-24 on CPython 3.14 debug when clean; # 100 is a generous ceiling that still catches any leak # producing more than ~0.05 refs/call. 
self.assertLess(abs(phase2), 100, msg) @skip_if_speedups_missing def test_dumps_no_leak(self): data = {"a": [1, 2, 3], "b": "hello", "c": None, "d": True} self._assert_no_leak(lambda: simplejson.dumps(data)) @skip_if_speedups_missing def test_loads_no_leak(self): raw = '{"a": [1, 2, 3], "b": "hello", "c": null, "d": true}' self._assert_no_leak(lambda: simplejson.loads(raw)) @skip_if_speedups_missing def test_scanner_construction_no_leak(self): self._assert_no_leak(lambda: simplejson.JSONDecoder()) @skip_if_speedups_missing def test_encoder_construction_no_leak(self): self._assert_no_leak(lambda: simplejson.JSONEncoder()) @skip_if_speedups_missing def test_failed_construction_no_leak(self): """Error path in scanner_new/encoder_new must release module_ref.""" class BadBool: def __bool__(self): raise ZeroDivisionError() __nonzero__ = __bool__ def try_bad_scanner(): try: decoder.JSONDecoder(strict=BadBool()).decode('{}') except ZeroDivisionError: pass def try_bad_encoder(): try: encoder.JSONEncoder(skipkeys=BadBool()).encode({}) except ZeroDivisionError: pass self._assert_no_leak(try_bad_scanner) self._assert_no_leak(try_bad_encoder) @skip_if_speedups_missing def test_circular_reference_no_leak(self): """ValueError mid-encode must not leak the partial accumulator, markers dict entry, or the ident PyLong.""" def circular(): d = {} d["self"] = d try: simplejson.dumps(d) except ValueError: pass self._assert_no_leak(circular) @skip_if_speedups_missing def test_asdict_returning_non_dict_no_leak(self): """encoder_steal_encode's TypeError path on _asdict() returning a non-dict must release the stolen newobj reference.""" class BadNT: def _asdict(self): return "not a dict" def bad_asdict(): try: simplejson.dumps(BadNT(), namedtuple_as_object=True) except TypeError: pass self._assert_no_leak(bad_asdict) @skip_if_speedups_missing def test_for_json_raising_no_leak(self): """for_json() raising inside its body must not leak the method binding or partial accumulator state.""" class 
Explodes: def for_json(self): raise RuntimeError("boom") def explode(): try: simplejson.dumps(Explodes(), for_json=True) except RuntimeError: pass self._assert_no_leak(explode) @skip_if_speedups_missing def test_non_string_dict_keys_no_leak(self): """Dict keys that aren't already strings go through encoder_stringify_key and the non-cached branch of the key_memo logic. Both paths must release the transient stringified key.""" data = {1: "a", 2: "b", 3: "c", True: "x", False: "y"} self._assert_no_leak(lambda: simplejson.dumps(data, sort_keys=True)) @skip_if_speedups_missing def test_bigint_as_string_no_leak(self): """maybe_quote_bigint's comparison path must release `encoded` on the RichCompareBool error branch and on the quoted-return path that replaces the unquoted string.""" big = 1 << 40 self._assert_no_leak( lambda: simplejson.dumps(big, int_as_string_bitcount=31)) @skip_if_speedups_missing def test_dict_fast_path_no_leak(self): """The PyDict_Next fast path (unsorted exact dict) must not leak references compared to the iterator slow path.""" data = {"a": 1, "b": "two", "c": [3], "d": None, "e": True} self._assert_no_leak(lambda: simplejson.dumps(data)) @skip_if_speedups_missing def test_dict_slow_path_no_leak(self): """The iterator slow path (sorted or dict subclass) must not leak.""" data = {"z": 1, "a": 2, "m": 3} self._assert_no_leak( lambda: simplejson.dumps(data, sort_keys=True)) @skip_if_speedups_missing def test_skipkeys_fast_path_no_leak(self): """skipkeys on the PyDict_Next fast path must not leak skipped keys.""" data = {"ok": 1, 42: 2, True: 3, None: 4} self._assert_no_leak( lambda: simplejson.dumps(data, skipkeys=True)) @skip_if_speedups_missing def test_list_fast_path_no_leak(self): """The indexed fast path for exact lists must not leak.""" data = [1, "two", 3.0, True, None, [4], {"k": "v"}] self._assert_no_leak(lambda: simplejson.dumps(data)) @skip_if_speedups_missing def test_tuple_fast_path_no_leak(self): """The indexed fast path for exact tuples 
must not leak.""" data = (1, "two", 3.0, True, None) self._assert_no_leak( lambda: simplejson.dumps(data, tuple_as_array=True)) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_str_subclass.py0000644000175100017510000000134415172736046023030 0ustar00runnerrunnerfrom unittest import TestCase import simplejson from simplejson.compat import text_type # Tests for issue demonstrated in https://github.com/simplejson/simplejson/issues/144 class WonkyTextSubclass(text_type): def __getslice__(self, start, end): return self.__class__('not what you wanted!') class TestStrSubclass(TestCase): def test_dump_load(self): for s in ['', '"hello"', 'text', u'\u005c']: self.assertEqual( s, simplejson.loads(simplejson.dumps(WonkyTextSubclass(s)))) self.assertEqual( s, simplejson.loads(simplejson.dumps(WonkyTextSubclass(s), ensure_ascii=False))) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_subclass.py0000644000175100017510000000214415172736046022137 0ustar00runnerrunnerfrom unittest import TestCase import simplejson as json from decimal import Decimal class AlternateInt(int): def __repr__(self): return 'invalid json' __str__ = __repr__ class AlternateFloat(float): def __repr__(self): return 'invalid json' __str__ = __repr__ # class AlternateDecimal(Decimal): # def __repr__(self): # return 'invalid json' class TestSubclass(TestCase): def test_int(self): self.assertEqual(json.dumps(AlternateInt(1)), '1') self.assertEqual(json.dumps(AlternateInt(-1)), '-1') self.assertEqual(json.loads(json.dumps({AlternateInt(1): 1})), {'1': 1}) def test_float(self): self.assertEqual(json.dumps(AlternateFloat(1.0)), '1.0') self.assertEqual(json.dumps(AlternateFloat(-1.0)), '-1.0') self.assertEqual(json.loads(json.dumps({AlternateFloat(1.0): 1})), {'1.0': 1}) # NOTE: Decimal subclasses are not supported as-is # def test_decimal(self): # 
self.assertEqual(json.dumps(AlternateDecimal('1.0')), '1.0') # self.assertEqual(json.dumps(AlternateDecimal('-1.0')), '-1.0') ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_subinterpreters.py0000644000175100017510000001103715172736046023561 0ustar00runnerrunner"""Tests that verify the C extension works in subinterpreters. Subinterpreters became usable for third-party C extensions in Python 3.12 (PEP 684). On 3.13+ the extension uses per-module state and heap types so that each interpreter gets its own copy. """ import sys import unittest from unittest import TestCase from simplejson.tests._helpers import skip_if_speedups_missing @unittest.skipIf(sys.version_info < (3, 12), "subinterpreters require Python 3.12+") class TestSubinterpreters(TestCase): """Test that the C extension can be loaded in subinterpreters.""" def _run_in_subinterp(self, code): """Helper to run code in a fresh subinterpreter.""" try: import _interpreters except ImportError: raise unittest.SkipTest("_interpreters not available") interp = _interpreters.create() try: _interpreters.run_string(interp, code) finally: _interpreters.destroy(interp) @skip_if_speedups_missing def test_import_in_subinterpreter(self): """Verify _speedups can be imported in a subinterpreter.""" self._run_in_subinterp( "import simplejson; simplejson.dumps({'a': 1})") @skip_if_speedups_missing def test_encode_in_subinterpreter(self): """Verify encoding works correctly in a subinterpreter.""" self._run_in_subinterp(""" import simplejson assert simplejson.dumps(None) == 'null' assert simplejson.dumps(True) == 'true' assert simplejson.dumps(False) == 'false' assert simplejson.dumps(42) == '42' assert simplejson.dumps(3.14) == '3.14' assert simplejson.dumps("hello") == '"hello"' assert simplejson.dumps([1, 2, 3]) == '[1, 2, 3]' assert simplejson.dumps({"a": 1}, sort_keys=True) == '{"a": 1}' """) @skip_if_speedups_missing def 
test_decode_in_subinterpreter(self): """Verify decoding works correctly in a subinterpreter.""" self._run_in_subinterp(""" import simplejson assert simplejson.loads('null') is None assert simplejson.loads('true') is True assert simplejson.loads('42') == 42 assert simplejson.loads('"hello"') == 'hello' assert simplejson.loads('[1, 2, 3]') == [1, 2, 3] assert simplejson.loads('{"a": 1}') == {"a": 1} """) @skip_if_speedups_missing def test_multiple_subinterpreters(self): """Verify multiple subinterpreters can use simplejson concurrently.""" try: import _interpreters except ImportError: raise unittest.SkipTest("_interpreters not available") interps = [_interpreters.create() for _ in range(3)] try: for i, interp in enumerate(interps): _interpreters.run_string(interp, """ import simplejson result = simplejson.dumps({"interp": %d}) assert '"interp": %d' in result """ % (i, i)) finally: for interp in interps: _interpreters.destroy(interp) @skip_if_speedups_missing def test_subinterpreter_state_independent(self): """Verify destroying one subinterpreter doesn't affect another.""" try: import _interpreters except ImportError: raise unittest.SkipTest("_interpreters not available") interp1 = _interpreters.create() interp2 = _interpreters.create() try: # Both interpreters load and use simplejson _interpreters.run_string(interp1, "import simplejson; simplejson.dumps([1])") _interpreters.run_string(interp2, "import simplejson; simplejson.dumps([2])") # Destroy the first interpreter _interpreters.destroy(interp1) interp1 = None # Second interpreter must still work correctly _interpreters.run_string(interp2, """ import simplejson assert simplejson.dumps({"still": "works"}) == '{"still": "works"}' assert simplejson.loads('{"still": "works"}') == {"still": "works"} """) finally: if interp1 is not None: _interpreters.destroy(interp1) _interpreters.destroy(interp2) @skip_if_speedups_missing @unittest.skipIf(sys.version_info < (3, 13), "heap types require Python 3.13+") def 
test_subinterpreter_heap_types(self): """Verify types are heap types inside subinterpreters.""" self._run_in_subinterp(""" from simplejson._speedups import make_scanner, make_encoder # Py_TPFLAGS_HEAPTYPE = 1 << 9 assert make_scanner.__flags__ & (1 << 9), "Scanner should be heap type" assert make_encoder.__flags__ & (1 << 9), "Encoder should be heap type" """) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_tool.py0000644000175100017510000000566415172736046021307 0ustar00runnerrunnerimport os import re import sys import textwrap import unittest import subprocess import tempfile def strip_python_stderr(stderr): """Strip debug-build refcount output from stderr.""" return re.sub(b"\\[\\d+ refs\\]\\r?\\n?$", b"", stderr).strip() def open_temp_file(): file = tempfile.NamedTemporaryFile(delete=False) filename = file.name return file, filename class TestTool(unittest.TestCase): data = """ [["blorpie"],[ "whoops" ] , [ ],\t"d-shtaeou",\r"d-nthiouh", "i-vhbjkhnth", {"nifty":87}, {"morefield" :\tfalse,"field" :"yes"} ] """ expect = textwrap.dedent("""\ [ [ "blorpie" ], [ "whoops" ], [], "d-shtaeou", "d-nthiouh", "i-vhbjkhnth", { "nifty": 87 }, { "field": "yes", "morefield": false } ] """) def runTool(self, args=None, data=None): if sys.platform == 'emscripten': self.skipTest("subprocess not available on Emscripten") argv = [sys.executable, '-m', 'simplejson.tool'] if args: argv.extend(args) proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stderr=subprocess.PIPE, stdout=subprocess.PIPE) out, err = proc.communicate(data) self.assertEqual(strip_python_stderr(err), ''.encode()) self.assertEqual(proc.returncode, 0) return out.decode('utf8').splitlines() def test_stdin_stdout(self): self.assertEqual( self.runTool(data=self.data.encode()), self.expect.splitlines()) def test_infile_stdout(self): infile, infile_name = open_temp_file() try: infile.write(self.data.encode()) infile.close() 
self.assertEqual( self.runTool(args=[infile_name]), self.expect.splitlines()) finally: os.unlink(infile_name) def test_infile_outfile(self): infile, infile_name = open_temp_file() try: infile.write(self.data.encode()) infile.close() # outfile will get overwritten by tool, so the delete # may not work on some platforms. Do it manually. outfile, outfile_name = open_temp_file() try: outfile.close() self.assertEqual( self.runTool(args=[infile_name, outfile_name]), []) with open(outfile_name, 'rb') as f: self.assertEqual( f.read().decode('utf8').splitlines(), self.expect.splitlines() ) finally: os.unlink(outfile_name) finally: os.unlink(infile_name) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_tuple.py0000644000175100017510000000344715172736046021460 0ustar00runnerrunnerimport unittest from simplejson.compat import StringIO import simplejson as json class TestTuples(unittest.TestCase): def test_tuple_array_dumps(self): t = (1, 2, 3) expect = json.dumps(list(t)) # Default is True self.assertEqual(expect, json.dumps(t)) self.assertEqual(expect, json.dumps(t, tuple_as_array=True)) self.assertRaises(TypeError, json.dumps, t, tuple_as_array=False) # Ensure that the "default" does not get called self.assertEqual(expect, json.dumps(t, default=repr)) self.assertEqual(expect, json.dumps(t, tuple_as_array=True, default=repr)) # Ensure that the "default" gets called self.assertEqual( json.dumps(repr(t)), json.dumps(t, tuple_as_array=False, default=repr)) def test_tuple_array_dump(self): t = (1, 2, 3) expect = json.dumps(list(t)) # Default is True sio = StringIO() json.dump(t, sio) self.assertEqual(expect, sio.getvalue()) sio = StringIO() json.dump(t, sio, tuple_as_array=True) self.assertEqual(expect, sio.getvalue()) self.assertRaises(TypeError, json.dump, t, StringIO(), tuple_as_array=False) # Ensure that the "default" does not get called sio = StringIO() json.dump(t, sio, default=repr) 
self.assertEqual(expect, sio.getvalue()) sio = StringIO() json.dump(t, sio, tuple_as_array=True, default=repr) self.assertEqual(expect, sio.getvalue()) # Ensure that the "default" gets called sio = StringIO() json.dump(t, sio, tuple_as_array=False, default=repr) self.assertEqual( json.dumps(repr(t)), sio.getvalue()) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tests/test_unicode.py0000644000175100017510000001562015172736046021751 0ustar00runnerrunnerimport sys import codecs from unittest import TestCase import simplejson as json from simplejson.compat import unichr, text_type, b, BytesIO class TestUnicode(TestCase): def test_encoding1(self): encoder = json.JSONEncoder(encoding='utf-8') u = u'\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}' s = u.encode('utf-8') ju = encoder.encode(u) js = encoder.encode(s) self.assertEqual(ju, js) def test_encoding2(self): u = u'\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}' s = u.encode('utf-8') ju = json.dumps(u, encoding='utf-8') js = json.dumps(s, encoding='utf-8') self.assertEqual(ju, js) def test_encoding3(self): u = u'\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}' j = json.dumps(u) self.assertEqual(j, '"\\u03b1\\u03a9"') def test_encoding4(self): u = u'\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}' j = json.dumps([u]) self.assertEqual(j, '["\\u03b1\\u03a9"]') def test_encoding5(self): u = u'\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}' j = json.dumps(u, ensure_ascii=False) self.assertEqual(j, u'"' + u + u'"') def test_encoding6(self): u = u'\N{GREEK SMALL LETTER ALPHA}\N{GREEK CAPITAL LETTER OMEGA}' j = json.dumps([u], ensure_ascii=False) self.assertEqual(j, u'["' + u + u'"]') def test_big_unicode_encode(self): u = u'\U0001d120' self.assertEqual(json.dumps(u), '"\\ud834\\udd20"') self.assertEqual(json.dumps(u, ensure_ascii=False), u'"\U0001d120"') def test_big_unicode_decode(self): u = 
u'z\U0001d120x' self.assertEqual(json.loads('"' + u + '"'), u) self.assertEqual(json.loads('"z\\ud834\\udd20x"'), u) def test_unicode_decode(self): for i in range(0, 0xd7ff): u = unichr(i) #s = '"\\u{0:04x}"'.format(i) s = '"\\u%04x"' % (i,) self.assertEqual(json.loads(s), u) def test_object_pairs_hook_with_unicode(self): s = u'{"xkd":1, "kcw":2, "art":3, "hxm":4, "qrt":5, "pad":6, "hoy":7}' p = [(u"xkd", 1), (u"kcw", 2), (u"art", 3), (u"hxm", 4), (u"qrt", 5), (u"pad", 6), (u"hoy", 7)] self.assertEqual(json.loads(s), eval(s)) self.assertEqual(json.loads(s, object_pairs_hook=lambda x: x), p) od = json.loads(s, object_pairs_hook=json.OrderedDict) self.assertEqual(od, json.OrderedDict(p)) self.assertEqual(type(od), json.OrderedDict) # the object_pairs_hook takes priority over the object_hook self.assertEqual(json.loads(s, object_pairs_hook=json.OrderedDict, object_hook=lambda x: None), json.OrderedDict(p)) def test_default_encoding(self): self.assertEqual(json.loads(u'{"a": "\xe9"}'.encode('utf-8')), {'a': u'\xe9'}) def test_unicode_preservation(self): self.assertEqual(type(json.loads(u'""')), text_type) self.assertEqual(type(json.loads(u'"a"')), text_type) self.assertEqual(type(json.loads(u'["a"]')[0]), text_type) def test_ensure_ascii_false_returns_unicode(self): # http://code.google.com/p/simplejson/issues/detail?id=48 self.assertEqual(type(json.dumps([], ensure_ascii=False)), text_type) self.assertEqual(type(json.dumps(0, ensure_ascii=False)), text_type) self.assertEqual(type(json.dumps({}, ensure_ascii=False)), text_type) self.assertEqual(type(json.dumps("", ensure_ascii=False)), text_type) def test_ensure_ascii_false_bytestring_encoding(self): # http://code.google.com/p/simplejson/issues/detail?id=48 doc1 = {u'quux': b('Arr\xc3\xaat sur images')} doc2 = {u'quux': u'Arr\xeat sur images'} doc_ascii = '{"quux": "Arr\\u00eat sur images"}' doc_unicode = u'{"quux": "Arr\xeat sur images"}' self.assertEqual(json.dumps(doc1), doc_ascii) self.assertEqual(json.dumps(doc2), 
doc_ascii) self.assertEqual(json.dumps(doc1, ensure_ascii=False), doc_unicode) self.assertEqual(json.dumps(doc2, ensure_ascii=False), doc_unicode) def test_ensure_ascii_linebreak_encoding(self): # http://timelessrepo.com/json-isnt-a-javascript-subset s1 = u'\u2029\u2028' s2 = s1.encode('utf8') expect = '"\\u2029\\u2028"' expect_non_ascii = u'"\u2029\u2028"' self.assertEqual(json.dumps(s1), expect) self.assertEqual(json.dumps(s2), expect) self.assertEqual(json.dumps(s1, ensure_ascii=False), expect_non_ascii) self.assertEqual(json.dumps(s2, ensure_ascii=False), expect_non_ascii) def test_invalid_escape_sequences(self): # incomplete escape sequence self.assertRaises(json.JSONDecodeError, json.loads, '"\\u') self.assertRaises(json.JSONDecodeError, json.loads, '"\\u1') self.assertRaises(json.JSONDecodeError, json.loads, '"\\u12') self.assertRaises(json.JSONDecodeError, json.loads, '"\\u123') self.assertRaises(json.JSONDecodeError, json.loads, '"\\u1234') # invalid escape sequence self.assertRaises(json.JSONDecodeError, json.loads, '"\\u123x"') self.assertRaises(json.JSONDecodeError, json.loads, '"\\u12x4"') self.assertRaises(json.JSONDecodeError, json.loads, '"\\u1x34"') self.assertRaises(json.JSONDecodeError, json.loads, '"\\ux234"') if sys.maxunicode > 65535: # invalid escape sequence for low surrogate self.assertRaises(json.JSONDecodeError, json.loads, '"\\ud800\\u"') self.assertRaises(json.JSONDecodeError, json.loads, '"\\ud800\\u0"') self.assertRaises(json.JSONDecodeError, json.loads, '"\\ud800\\u00"') self.assertRaises(json.JSONDecodeError, json.loads, '"\\ud800\\u000"') self.assertRaises(json.JSONDecodeError, json.loads, '"\\ud800\\u000x"') self.assertRaises(json.JSONDecodeError, json.loads, '"\\ud800\\u00x0"') self.assertRaises(json.JSONDecodeError, json.loads, '"\\ud800\\u0x00"') self.assertRaises(json.JSONDecodeError, json.loads, '"\\ud800\\ux000"') def test_ensure_ascii_still_works(self): # in the ascii range, ensure that everything is the same for c in 
map(unichr, range(0, 127)): self.assertEqual( json.dumps(c, ensure_ascii=False), json.dumps(c)) snowman = u'\N{SNOWMAN}' self.assertEqual( json.dumps(snowman, ensure_ascii=False), '"' + snowman + '"') def test_strip_bom(self): content = u"\u3053\u3093\u306b\u3061\u308f" json_doc = codecs.BOM_UTF8 + b(json.dumps(content)) self.assertEqual(json.load(BytesIO(json_doc)), content) for doc in json_doc, json_doc.decode('utf8'): self.assertEqual(json.loads(doc), content) ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/simplejson/tool.py0000644000175100017510000000211215172736046017067 0ustar00runnerrunnerr"""Command-line tool to validate and pretty-print JSON Usage:: $ echo '{"json":"obj"}' | python -m simplejson.tool { "json": "obj" } $ echo '{ 1.2:3.4}' | python -m simplejson.tool Expecting property name: line 1 column 2 (char 2) """ import sys import simplejson as json def main(): if len(sys.argv) == 1: infile = sys.stdin outfile = sys.stdout elif len(sys.argv) == 2: infile = open(sys.argv[1], 'r') outfile = sys.stdout elif len(sys.argv) == 3: infile = open(sys.argv[1], 'r') outfile = open(sys.argv[2], 'w') else: raise SystemExit(sys.argv[0] + " [infile [outfile]]") with infile: try: obj = json.load(infile, object_pairs_hook=json.OrderedDict, use_decimal=True) except ValueError: raise SystemExit(sys.exc_info()[1]) with outfile: json.dump(obj, outfile, sort_keys=True, indent=' ', use_decimal=True) outfile.write('\n') if __name__ == '__main__': main() ././@PaxHeader0000000000000000000000000000003400000000000010212 xustar0028 mtime=1777056810.6005166 simplejson-4.1.1/simplejson.egg-info/0000755000175100017510000000000015172736053017234 5ustar00runnerrunner././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056810.0 simplejson-4.1.1/simplejson.egg-info/PKG-INFO0000644000175100017510000000730015172736052020330 0ustar00runnerrunnerMetadata-Version: 2.4 Name: simplejson Version: 4.1.1 Summary: 
Simple, fast, extensible JSON encoder/decoder for Python Home-page: https://github.com/simplejson/simplejson Author: Bob Ippolito Author-email: bob@redivi.com License: MIT OR AFL-2.1 Platform: any Classifier: Development Status :: 5 - Production/Stable Classifier: Environment :: WebAssembly :: Emscripten Classifier: Intended Audience :: Developers Classifier: Programming Language :: Python Classifier: Programming Language :: Python :: 2 Classifier: Programming Language :: Python :: 2.7 Classifier: Programming Language :: Python :: 3 Classifier: Programming Language :: Python :: 3.8 Classifier: Programming Language :: Python :: 3.9 Classifier: Programming Language :: Python :: 3.10 Classifier: Programming Language :: Python :: 3.11 Classifier: Programming Language :: Python :: 3.12 Classifier: Programming Language :: Python :: 3.13 Classifier: Programming Language :: Python :: 3.14 Classifier: Programming Language :: Python :: Implementation :: CPython Classifier: Programming Language :: Python :: Implementation :: GraalPy Classifier: Programming Language :: Python :: Implementation :: PyPy Classifier: Topic :: Software Development :: Libraries :: Python Modules Requires-Python: >=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*, !=3.5.*, !=3.6.*, !=3.7.* License-File: LICENSE.txt Dynamic: author Dynamic: author-email Dynamic: classifier Dynamic: description Dynamic: home-page Dynamic: license Dynamic: license-file Dynamic: platform Dynamic: requires-python Dynamic: summary simplejson ---------- simplejson is a simple, fast, complete, correct and extensible JSON encoder and decoder for Python 3.8+ with legacy support for Python 2.7. It is pure Python code with no dependencies, but includes an optional C extension for a serious speed boost. The latest documentation for simplejson can be read online here: https://simplejson.readthedocs.io/ simplejson is the externally maintained development version of the json library included with Python (since 2.6). 
This version is tested with Python 3.14 (including free-threaded builds) and maintains backwards compatibility with Python 3.8+. A legacy Python 2.7 wheel is also published. The encoder can be specialized to provide serialization in any kind of situation, without any special support by the objects to be serialized (somewhat like pickle). This is best done with the ``default`` kwarg to dumps. The decoder can handle incoming JSON strings of any specified encoding (UTF-8 by default). It can also be specialized to post-process JSON objects with the ``object_hook`` or ``object_pairs_hook`` kwargs. This is particularly useful for implementing protocols such as JSON-RPC that have a richer type system than JSON itself. For those of you that have legacy systems to maintain, there is a very old fork of simplejson in the `python2.2`_ branch that supports Python 2.2. This is based on a very old version of simplejson, is not maintained, and should only be used as a last resort. .. _python2.2: https://github.com/simplejson/simplejson/tree/python2.2 RawJSON ~~~~~~~ ``RawJSON`` allows embedding pre-encoded JSON strings into output without re-encoding them. This can be useful in advanced cases where JSON content is already serialized and re-encoding would be unnecessary. Example usage:: from simplejson import dumps, RawJSON payload = { "status": "ok", "data": RawJSON('{"a": 1, "b": 2}') } print(dumps(payload)) # Output: {"status": "ok", "data": {"a": 1, "b": 2}} **Caveat:** ``RawJSON`` should be used with care. It bypasses normal serialization and validation, and is not recommended for general use unless the embedded JSON content is fully trusted. 
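The ``default`` and ``object_hook`` kwargs described earlier in this description can be sketched roughly as follows. This is a minimal illustration, not code from the simplejson sources: the helper names ``encode_extra`` and ``decode_extra`` and the ``"__complex__"`` tag are hypothetical, and the fallback to the standard library ``json`` module is an assumption made only so the snippet runs where simplejson is not installed (both modules accept these two kwargs).

```python
try:
    import simplejson as json  # the library this description covers
except ImportError:
    import json  # assumption: stdlib fallback, which accepts the same two kwargs


def encode_extra(obj):
    """Hypothetical ``default`` hook: only called for objects the encoder
    does not know how to serialize on its own."""
    if isinstance(obj, complex):
        return {"__complex__": True, "real": obj.real, "imag": obj.imag}
    raise TypeError("%r is not JSON serializable" % (obj,))


def decode_extra(dct):
    """Hypothetical ``object_hook``: called with every decoded JSON object,
    innermost objects first."""
    if dct.get("__complex__"):
        return complex(dct["real"], dct["imag"])
    return dct


# Round-trip a value that plain JSON has no type for.
encoded = json.dumps({"z": 1 + 2j}, default=encode_extra, sort_keys=True)
decoded = json.loads(encoded, object_hook=decode_extra)
```

Tagging the intermediate dict (here with ``"__complex__"``) is one common way to let the decoder hook recognize which objects to convert back; any unambiguous marker would do.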
././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056810.0 simplejson-4.1.1/simplejson.egg-info/SOURCES.txt0000644000175100017510000000331715172736052021123 0ustar00runnerrunnerCHANGES.txt LICENSE.txt MANIFEST.in README.rst conf.py index.rst pyproject.toml setup.py tsan_stress_simplejson.py scripts/make_docs.py simplejson/__init__.py simplejson/_speedups.c simplejson/_speedups_scan.h simplejson/compat.py simplejson/decoder.py simplejson/encoder.py simplejson/errors.py simplejson/ordered_dict.py simplejson/raw_json.py simplejson/scanner.py simplejson/tool.py simplejson.egg-info/PKG-INFO simplejson.egg-info/SOURCES.txt simplejson.egg-info/dependency_links.txt simplejson.egg-info/top_level.txt simplejson/tests/__init__.py simplejson/tests/_cibw_runner.py simplejson/tests/_helpers.py simplejson/tests/test_bigint_as_string.py simplejson/tests/test_bitsize_int_as_string.py simplejson/tests/test_check_circular.py simplejson/tests/test_decimal.py simplejson/tests/test_decode.py simplejson/tests/test_default.py simplejson/tests/test_dump.py simplejson/tests/test_encode_basestring_ascii.py simplejson/tests/test_encode_for_html.py simplejson/tests/test_errors.py simplejson/tests/test_fail.py simplejson/tests/test_float.py simplejson/tests/test_for_json.py simplejson/tests/test_free_threading.py simplejson/tests/test_indent.py simplejson/tests/test_item_sort_key.py simplejson/tests/test_iterable.py simplejson/tests/test_namedtuple.py simplejson/tests/test_pass1.py simplejson/tests/test_pass2.py simplejson/tests/test_pass3.py simplejson/tests/test_raw_json.py simplejson/tests/test_recursion.py simplejson/tests/test_scanstring.py simplejson/tests/test_separators.py simplejson/tests/test_speedups.py simplejson/tests/test_str_subclass.py simplejson/tests/test_subclass.py simplejson/tests/test_subinterpreters.py simplejson/tests/test_tool.py simplejson/tests/test_tuple.py 
simplejson/tests/test_unicode.py././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056810.0 simplejson-4.1.1/simplejson.egg-info/dependency_links.txt0000644000175100017510000000000115172736052023301 0ustar00runnerrunner ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056810.0 simplejson-4.1.1/simplejson.egg-info/top_level.txt0000644000175100017510000000001315172736052021757 0ustar00runnerrunnersimplejson ././@PaxHeader0000000000000000000000000000002600000000000010213 xustar0022 mtime=1777056806.0 simplejson-4.1.1/tsan_stress_simplejson.py0000644000175100017510000004006115172736046020547 0ustar00runnerrunner#!/usr/bin/env python3 """TSan stress test for simplejson._speedups. ================================================================================ Building a TSan + free-threaded CPython (one-time) ================================================================================ git clone https://github.com/python/cpython.git cpython-tsan cd cpython-tsan ./configure --disable-gil --with-thread-sanitizer \ --prefix=$HOME/py-tsan-ft make -j$(nproc) && make install $HOME/py-tsan-ft/bin/python3 -m pip install -e /home/bob/src/simplejson ================================================================================ Running this script under TSan ================================================================================ cd /home/bob/src/simplejson PYTHON_GIL=0 \ TSAN_OPTIONS='halt_on_error=0 second_deadlock_stack=1 history_size=7' \ $HOME/py-tsan-ft/bin/python3 tsan_stress_simplejson.py \ 2> tsan_report.txt # Triage: /ft-review-toolkit:explore . 
tsan tsan_report.txt ================================================================================ Configuration (environment variables) ================================================================================ TSAN_THREADS concurrent workers (default: cpu_count or 8) TSAN_ITERATIONS calls per worker per scenario (default 2000; auto-lowered on TSan builds) TSAN_DURATION approximate wall-clock seconds per scenario (default 2.5) TSAN_TIMEOUT per-scenario hard timeout in seconds (default 60) What this exercises ------------------- - Scenario 1: N threads share ONE make_scanner() instance; each parses different-but-overlapping JSON (dict/list/string cases). Targets scanner s->memo (PyDict_SetItem/Clear) under concurrent scan_once(). - Scenario 2: N threads share ONE make_encoder() instance; each encodes distinct nested dict/list objects. Targets self->markers and self->key_memo mutation on concurrent calls. - Scenario 3: N threads encode the SAME shared dict. Stresses the Py_BEGIN_CRITICAL_SECTION(dct) path in encoder_listencode_dict. - Scenario 4: N threads encode the SAME shared list. Stresses the Py_BEGIN_CRITICAL_SECTION(seq) path in encoder_listencode_list. - Scenario 5: Mutator thread adds/removes keys while N readers encode the same dict (read-write contention on the input container). - Scenario 6: Module-level hammer for scanstring + encode_basestring_ascii. 
""" import os import signal import sys import threading import time import warnings warnings.filterwarnings("ignore", ".*GIL.*") # --------------------------- configuration ---------------------------------- # def _env_int(name, default): try: v = os.environ.get(name) return int(v) if v else default except ValueError: return default def _env_float(name, default): try: v = os.environ.get(name) return float(v) if v else default except ValueError: return default def _is_tsan_build(): try: import sysconfig cflags = (sysconfig.get_config_var("CFLAGS") or "").lower() ldflags = (sysconfig.get_config_var("LDFLAGS") or "").lower() return "fsanitize=thread" in cflags or "fsanitize=thread" in ldflags except Exception: return False THREADS = _env_int("TSAN_THREADS", os.cpu_count() or 8) ITERATIONS = _env_int("TSAN_ITERATIONS", 2000) DURATION = _env_float("TSAN_DURATION", 2.5) SCENARIO_TIMEOUT = _env_int("TSAN_TIMEOUT", 60) if _is_tsan_build(): # TSan finds races on first occurrence; cap work to keep runtime sane. THREADS = min(THREADS, 6) ITERATIONS = min(ITERATIONS, 400) # --------------------------- import guard ----------------------------------- # try: from simplejson import _speedups except ImportError as exc: print(f"SKIP: cannot import simplejson._speedups: {exc}", file=sys.stderr) sys.exit(0) missing = [ n for n in ("make_scanner", "make_encoder", "encode_basestring_ascii", "scanstring") if not hasattr(_speedups, n) ] if missing: print(f"SKIP: simplejson._speedups missing symbols: {missing}", file=sys.stderr) sys.exit(0) # --------------------------- shared fixtures -------------------------------- # # A minimal context object that satisfies the fields read by make_scanner(). # Mirrors simplejson.scanner.JSONDecoder attributes accessed in Scanner init. 
import decimal class _ScanCtx: __slots__ = ( "encoding", "strict", "object_hook", "object_pairs_hook", "array_hook", "parse_float", "parse_int", "parse_constant", "memo", ) def __init__(self): self.encoding = "utf-8" self.strict = True self.object_hook = None self.object_pairs_hook = None self.array_hook = None self.parse_float = float self.parse_int = int self.parse_constant = float self.memo = {} def _make_shared_scanner(): return _speedups.make_scanner(_ScanCtx()) def _make_shared_encoder(): # Matches the argument order in simplejson/encoder.py c_make_encoder call. markers = {} key_memo = {} return _speedups.make_encoder( markers, # markers (cycle detection) lambda o: str(o), # default _speedups.encode_basestring_ascii, # _encoder (ascii) None, # indent ": ", ", ", # key_sep, item_sep False, # sort_keys False, # skipkeys True, # allow_nan key_memo, # key_memo False, # use_decimal False, # namedtuple_as_object True, # tuple_as_array None, # int_as_string_bitcount (None or positive int) None, # item_sort_key "utf-8", # encoding False, # for_json False, # ignore_nan decimal.Decimal, # Decimal False, # iterable_as_array ) # Varied JSON documents exercising dict / list / string / number / escapes. 
JSON_SAMPLES = [
    b'{"a": 1, "b": [1,2,3], "c": "hello"}',
    b'[1, 2, 3, 4, "five", null, true, false]',
    b'{"nested": {"x": [1, {"y": [2, {"z": 3}]}]}}',
    b'"a plain \\"quoted\\" string with \\u00e9 escapes and \\n newlines"',
    b'{"k1":"v1","k2":"v2","k3":"v3","k4":"v4","k5":"v5"}',
    b'[{"id":1,"name":"alice"},{"id":2,"name":"bob"},{"id":3,"name":"carol"}]',
    b'{"a":1,"b":2,"c":3,"d":4,"e":5,"f":6,"g":7,"h":8}',
    b'{"list":[1.5, 2.5, 3.5, -0.25, 1e10, -2.3e-5]}',
]
JSON_SAMPLES = [s.decode("utf-8") for s in JSON_SAMPLES]


def _make_distinct_object(i):
    return {
        "id": i,
        "name": f"item-{i}",
        "tags": ["alpha", "beta", "gamma", f"n{i}"],
        "nested": {"x": i, "y": [i, i + 1, i + 2], "s": "x" * (i % 16)},
        "vals": [1, 2, 3, 4, 5, i, i * 2, i * 3],
    }


SHARED_DICT = {
    "a": 1,
    "b": 2,
    "c": [1, 2, 3, 4, 5],
    "d": {"dd": "deep", "ee": [10, 20, 30]},
    "e": "some string with \"escapes\" and \n newlines",
    "f": True,
    "g": None,
    "h": 3.14159,
}

SHARED_LIST = [
    1, 2, 3, "four", 5.0, None, True, False,
    {"a": 1, "b": 2},
    [10, 20, 30, 40],
    "a \"quoted\" thing",
    {"nested": {"deeper": [1, 2, 3]}},
]


# --------------------------- scenario plumbing ------------------------------ #

def run_scenario(name, target_fns, thread_counts=None):
    """Run a scenario in a forked child so a SEGV can't kill the parent."""
    print(f" Running: {name} ...", end=" ", flush=True)
    sys.stdout.flush()
    sys.stderr.flush()
    pid = os.fork()
    if pid == 0:
        try:
            _run_scenario_threads(target_fns, thread_counts)
            os._exit(0)
        except SystemExit as e:
            code = e.code if isinstance(e.code, int) else 1
            os._exit(code)
        except BaseException:
            import traceback
            traceback.print_exc()
            os._exit(1)

    deadline = time.monotonic() + SCENARIO_TIMEOUT
    wait_status = None
    while time.monotonic() < deadline:
        r_pid, status = os.waitpid(pid, os.WNOHANG)
        if r_pid != 0:
            wait_status = status
            break
        time.sleep(0.1)

    if wait_status is None:
        try:
            os.kill(pid, signal.SIGKILL)
        except ProcessLookupError:
            pass
        os.waitpid(pid, 0)
        print(f"TIMEOUT ({SCENARIO_TIMEOUT}s)")
    elif os.WIFSIGNALED(wait_status):
        sig = os.WTERMSIG(wait_status)
        name_ = (signal.Signals(sig).name
                 if sig in signal.Signals._value2member_map_
                 else str(sig))
        print(f"CRASH ({name_})")
    elif os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) != 0:
        print(f"FAIL (exit {os.WEXITSTATUS(wait_status)})")
    else:
        print("OK")


def _run_scenario_threads(target_fns, thread_counts=None):
    if thread_counts is None:
        thread_counts = [THREADS] * len(target_fns)
    total = sum(thread_counts)
    barrier = threading.Barrier(total)
    errors = []
    errors_lock = threading.Lock()

    def wrapper(fn):
        def wrapped():
            try:
                barrier.wait()
                fn()
            except Exception as exc:
                with errors_lock:
                    errors.append(repr(exc))
        return wrapped

    threads = []
    for fn, count in zip(target_fns, thread_counts):
        for _ in range(count):
            threads.append(threading.Thread(target=wrapper(fn), daemon=True))
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout=SCENARIO_TIMEOUT)
    if errors:
        # Print a few; data races remain the interesting output on stderr.
        for e in errors[:5]:
            print(f" worker error: {e}", file=sys.stderr)
        sys.exit(1)


# --------------------------- scenarios -------------------------------------- #

def scenario_shared_scanner():
    """N threads share ONE scanner; race on s->memo (SetItem/Clear)."""
    scanner = _make_shared_scanner()
    samples = JSON_SAMPLES

    def worker():
        deadline = time.monotonic() + DURATION
        i = 0
        for _ in range(ITERATIONS):
            doc = samples[i % len(samples)]
            try:
                scanner(doc, 0)
            except Exception:
                pass
            i += 1
            if time.monotonic() > deadline:
                break

    run_scenario("shared scanner (scanner s->memo races)", [worker])


def scenario_shared_encoder_distinct_inputs():
    """N threads share ONE encoder; distinct objects.
    markers/key_memo races."""
    encoder = _make_shared_encoder()

    def make_worker(tid):
        def worker():
            deadline = time.monotonic() + DURATION
            for i in range(ITERATIONS):
                obj = _make_distinct_object(tid * 1_000_000 + i)
                try:
                    encoder(obj, 0)
                except Exception:
                    pass
                if time.monotonic() > deadline:
                    break
        return worker

    run_scenario(
        "shared encoder, distinct inputs (markers / key_memo races)",
        [make_worker(i) for i in range(THREADS)],
        [1] * THREADS,
    )


def scenario_shared_encoder_shared_dict():
    """All threads encode the SAME dict -> encoder_listencode_dict crit-sect."""
    encoder = _make_shared_encoder()
    shared = SHARED_DICT

    def worker():
        deadline = time.monotonic() + DURATION
        for _ in range(ITERATIONS):
            try:
                encoder(shared, 0)
            except Exception:
                pass
            if time.monotonic() > deadline:
                break

    run_scenario(
        "shared encoder + shared dict (encoder_listencode_dict CS)",
        [worker],
    )


def scenario_shared_encoder_shared_list():
    """All threads encode the SAME list -> encoder_listencode_list crit-sect."""
    encoder = _make_shared_encoder()
    shared = SHARED_LIST

    def worker():
        deadline = time.monotonic() + DURATION
        for _ in range(ITERATIONS):
            try:
                encoder(shared, 0)
            except Exception:
                pass
            if time.monotonic() > deadline:
                break

    run_scenario(
        "shared encoder + shared list (encoder_listencode_list CS)",
        [worker],
    )


def scenario_mutator_vs_readers():
    """Mutator mutates dict while readers encode it. Stresses input CS."""
    encoder = _make_shared_encoder()
    target = {"base": 0, "a": 1, "b": 2, "c": 3, "d": [1, 2, 3]}

    def reader():
        deadline = time.monotonic() + DURATION
        for _ in range(ITERATIONS):
            try:
                encoder(target, 0)
            except Exception:
                pass
            if time.monotonic() > deadline:
                break

    def mutator():
        deadline = time.monotonic() + DURATION
        i = 0
        while i < ITERATIONS * 4 and time.monotonic() < deadline:
            key = f"k{i % 64}"
            try:
                if i & 1:
                    target[key] = [i, i + 1, {"x": i}]
                else:
                    target.pop(key, None)
            except Exception:
                pass
            i += 1

    # One mutator, rest readers.
    reader_count = max(THREADS - 1, 1)
    run_scenario(
        "mutator vs readers on shared dict",
        [reader, mutator],
        [reader_count, 1],
    )


def scenario_module_functions():
    """Concurrent scanstring + encode_basestring_ascii (module-level)."""
    scanstring = _speedups.scanstring
    enc_ascii = _speedups.encode_basestring_ascii

    strings_to_encode = [
        "hello world",
        "\"quoted\" with \\ backslash",
        "\u00e9\u00e8\u00ea \u4e2d\u6587 emoji-\U0001F600",
        "control\n\t\r chars",
        "x" * 256,
    ]
    # For scanstring: the opening quote has been consumed; pass end=1.
    scan_sources = [
        r'"a simple string"',
        r'"with \"escapes\" and \n newlines"',
        r'"unicode \u00e9\u00e8\u4e2d\u6587 stuff"',
        r'"backslashes \\ and slashes \/"',
    ]

    def enc_worker():
        deadline = time.monotonic() + DURATION
        for i in range(ITERATIONS):
            try:
                enc_ascii(strings_to_encode[i % len(strings_to_encode)])
            except Exception:
                pass
            if time.monotonic() > deadline:
                break

    def scan_worker():
        deadline = time.monotonic() + DURATION
        for i in range(ITERATIONS):
            src = scan_sources[i % len(scan_sources)]
            try:
                # scanstring(basestring, end, encoding, strict)
                scanstring(src, 1, "utf-8", 1)
            except Exception:
                pass
            if time.monotonic() > deadline:
                break

    half = max(THREADS // 2, 1)
    run_scenario(
        "module-level scanstring + encode_basestring_ascii",
        [enc_worker, scan_worker],
        [half, max(THREADS - half, 1)],
    )


# --------------------------- main ------------------------------------------- #

SCENARIOS = [
    scenario_shared_scanner,
    scenario_shared_encoder_distinct_inputs,
    scenario_shared_encoder_shared_dict,
    scenario_shared_encoder_shared_list,
    scenario_mutator_vs_readers,
    scenario_module_functions,
]


def main():
    print("TSan stress test for simplejson._speedups")
    print(f" Python: {sys.version.splitlines()[0]}")
    print(f" TSan build: {_is_tsan_build()}")
    print(f" PYTHON_GIL: {os.environ.get('PYTHON_GIL', '')}")
    print(f" Threads: {THREADS}")
    print(f" Iterations: {ITERATIONS}")
    print(f" Duration: {DURATION}s per scenario")
    print(f" Timeout: {SCENARIO_TIMEOUT}s per scenario")
    print()
    for sc in SCENARIOS:
        sc()
    print("\nDone. Inspect stderr (e.g. tsan_report.txt) for TSan warnings.")


if __name__ == "__main__":
    main()