
Python:json

JSON (JavaScript Object Notation), specified by RFC 7159 (which obsoletes RFC 4627) and by ECMA-404, is a lightweight data interchange format inspired by JavaScript object literal syntax (although it is not a strict subset of JavaScript).

json exposes an API familiar to users of the standard library marshal and pickle modules.

Encoding example

>>> import json
>>> json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])
'["foo", {"bar": ["baz", null, 1.0, 2]}]'
>>> print(json.dumps("\"foo\bar"))
"\"foo\bar"
>>> print(json.dumps('\u1234'))
"\u1234"
>>> print(json.dumps('\\'))
"\\"
>>> print(json.dumps({"c": 0, "b": 0, "a": 0}, sort_keys=True))
{"a": 0, "b": 0, "c": 0}
>>> from io import StringIO
>>> io = StringIO()
>>> json.dump(['streaming API'], io)
>>> io.getvalue()
'["streaming API"]'

Decoding example

>>> import json
>>> json.loads('["foo", {"bar":["baz", null, 1.0, 2]}]')
['foo', {'bar': ['baz', None, 1.0, 2]}]
>>> json.loads('"\\"foo\\bar"')
'"foo\x08ar'
>>> from io import StringIO
>>> io = StringIO('["streaming API"]')
>>> json.load(io)
['streaming API']

JSONEncoder

JSON encoding.

Custom encoding callback

To serialize datetime or timedelta objects:

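from datetime import datetime, timedelta
from json import JSONEncoder, dumps
from typing import Any

def dumps_default(o: Any) -> Any:
    # Called by dumps() only for objects it cannot serialize natively.
    if isinstance(o, datetime):
        return o.isoformat()
    elif isinstance(o, timedelta):
        return o.total_seconds()
    try:
        # The base JSONEncoder.default() raises TypeError for any object,
        # so every other type ends up in the str(o) fallback below.
        return JSONEncoder().default(o)
    except TypeError:
        return str(o)

# 'o' is the object to serialize
text = dumps(o, indent=4, sort_keys=True, default=dumps_default)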
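
The same conversion can also be written as a JSONEncoder subclass and passed through the cls argument of dumps. The following is only a minimal sketch: the class name DateTimeEncoder and the sample payload are hypothetical and chosen purely for illustration.

```python
import json
from datetime import datetime, timedelta


class DateTimeEncoder(json.JSONEncoder):
    """Hypothetical encoder that converts datetime and timedelta values."""

    def default(self, o):
        if isinstance(o, datetime):
            return o.isoformat()
        if isinstance(o, timedelta):
            return o.total_seconds()
        # Let the base class raise TypeError for unsupported types.
        return super().default(o)


# Sample payload, made up for this example.
payload = {
    "started": datetime(2019, 4, 26, 13, 1, 13),
    "elapsed": timedelta(seconds=90),
}
encoded = json.dumps(payload, cls=DateTimeEncoder, sort_keys=True)
print(encoded)  # {"elapsed": 90.0, "started": "2019-04-26T13:01:13"}
```

Unlike the callback above, this variant does not fall back to str(o), so an unsupported type still raises TypeError instead of being silently stringified.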

JSONDecoder

JSON decoding.

from json import JSONDecoder, loads

# ...
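
As a sketch of how decoding might be customized, the object_hook parameter (accepted by both loads and JSONDecoder) is called with every decoded JSON object. The hook name parse_timestamps and the "started" key are assumptions made for this example, not part of the original page. It relies on datetime.fromisoformat, which requires Python 3.7 or later.

```python
import json
from datetime import datetime


def parse_timestamps(obj):
    """Hypothetical hook: turn the 'started' field back into a datetime."""
    if "started" in obj:
        obj["started"] = datetime.fromisoformat(obj["started"])
    return obj


text = '{"started": "2019-04-26T13:01:13", "elapsed": 90.0}'

# object_hook is called with each decoded JSON object (a dict).
decoded = json.loads(text, object_hook=parse_timestamps)

# The same hook can be given to a JSONDecoder instance directly.
decoder = json.JSONDecoder(object_hook=parse_timestamps)
same = decoder.decode(text)
```

Here decoded and same hold equal dictionaries, with "started" restored to a datetime value and "elapsed" left as a float.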

Python JSON Libraries

Benchmarking Libraries

import time
import json
import orjson
import rapidjson
import hyperjson

m = {
    "timestamp": 1556283673.1523004,
    "task_uuid": "0ed1a1c3-050c-4fb9-9426-a7e72d0acfc7",
    "task_level": [1, 2, 1],
    "action_status": "started",
    "action_type": "main",
    "key": "value",
    "another_key": 123,
    "and_another": ["a", "b"],
}

def benchmark(name, dumps):
    # perf_counter() is a monotonic, high-resolution clock suited to timing.
    start = time.perf_counter()
    for _ in range(1000000):
        dumps(m)
    print(name, time.perf_counter() - start)

benchmark("Python", json.dumps)
# orjson only outputs bytes, but often we need unicode:
benchmark("orjson", lambda s: str(orjson.dumps(s), "utf-8"))
benchmark("rapidjson", rapidjson.dumps)
benchmark("hyperjson", hyperjson.dumps)
