Pydantic

Official documentation
This article is a work in progress.

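pydantic is primarily a parsing library, not a validation library: it guarantees the types and constraints of the resulting model rather than of the input data. With pydantic, data interfaces can be defined and used in a more disciplined way.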

from pydantic import BaseModel

class User(BaseModel):
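    id: int                # type declared by annotation
    name = 'Jane Doe'      # type inferred from the default value

# Instantiating the model performs parsing and validation; if no ValidationError is raised, the resulting model is valid.
user = User(id='123')

# Print a dictionary containing all of the model's fields and values
print(user.dict())  # {'id': 123, 'name': 'Jane Doe'}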

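Model properties:

  • dict()

    Returns a dictionary of the model's fields and values. cf. exporting models

  • json()

    Returns the dict() representation as a JSON string. cf. exporting models

  • copy()

    Returns a copy of the model (a shallow copy by default; pass deep=True for a deep copy). cf. exporting models

  • parse_obj()

    A class method for loading any object into a model, with error handling if the object is not a dictionary. cf. helper functions

  • parse_raw()

    A class method for loading strings in a number of formats. cf. helper functions

  • parse_file()

    Like parse_raw(), but takes a file path. cf. helper functions

  • from_orm()

    Loads data into a model from an arbitrary class. cf. ORM mode

  • schema()

    Returns a dictionary representing the model as JSON Schema. cf. schema

  • schema_json()

    Returns the result of schema() as a JSON string. cf. schema

  • construct()

    A class method that creates a model directly, without running validation. cf. Creating models without validation

  • __fields_set__

    The set of field names that were set when the model instance was initialised.

  • __fields__

    A dictionary of the model's fields.

  • __config__

    The configuration class of the model. cf. model config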
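The members listed above can be exercised with a short sketch. It assumes the pydantic 1.x API that this article describes; the try/except import is an assumption added here so that the snippet also runs where pydantic 2.x is installed (which ships the 1.x API under pydantic.v1). The User model mirrors the earlier example, with the annotation on name written out.

```python
import json

try:
    from pydantic.v1 import BaseModel  # pydantic 2.x keeps the 1.x API here
except ImportError:
    from pydantic import BaseModel     # plain pydantic 1.x


class User(BaseModel):
    id: int
    name: str = 'Jane Doe'


user = User(id='123')                        # '123' is parsed to the int 123

as_dict = user.dict()                        # fields and values as a dict
as_json = user.json()                        # the same data as a JSON string
clone = user.copy(update={'name': 'John'})   # shallow copy with one field changed
parsed = User.parse_obj({'id': 1, 'name': 'Ann'})
from_raw = User.parse_raw('{"id": 2}')       # JSON text is the default format
schema = User.schema()                       # JSON Schema as a dict
unchecked = User.construct(id='not-an-int')  # no validation is performed
fields_set = user.__fields_set__             # only 'id' was passed explicitly
```

Note that construct() happily stores a string in an int field, which is why it should only be used with data that is already trusted.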

Field Types

Standard Library Types

  • None, type(None) or Literal[None] (equivalent according to PEP 484)

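    allows only the value None

  • bool

    accepts only the following values:

    • a valid boolean (i.e. True or False),
    • the integers 0 or 1,
    • a str which when converted to lower case is one of '0', 'off', 'f', 'false', 'n', 'no', '1', 'on', 't', 'true', 'y', 'yes'
    • a bytes which is valid (per the previous rule) when decoded to str

  • int

    pydantic uses int(v) to coerce values to an int; note that this can lose information, e.g. the fractional part of a float is discarded.

  • float

    similarly, float(v) is used to coerce values to a float.

  • str

    strings are accepted as-is, int and float are coerced using str(v), bytes and bytearray are converted using v.decode(), enums inheriting from str are converted using v.value, and all other values cause an error.

  • bytes

    bytes are accepted as-is, bytearray is converted using bytes(v), str is converted using v.encode(), and int, float, and Decimal are coerced using str(v).encode().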

  • list

    allows list, tuple, set, frozenset, deque, or generators and casts to a list; see typing.List below for sub-type constraints

  • tuple

    allows list, tuple, set, frozenset, deque, or generators and casts to a tuple; see typing.Tuple below for sub-type constraints

  • dict

    dict(v) is used to attempt to convert a dictionary; see typing.Dict below for sub-type constraints

  • set

    allows list, tuple, set, frozenset, deque, or generators and casts to a set; see typing.Set below for sub-type constraints

  • frozenset

    allows list, tuple, set, frozenset, deque, or generators and casts to a frozen set; see typing.FrozenSet below for sub-type constraints

  • deque

    allows list, tuple, set, frozenset, deque, or generators and casts to a deque; see typing.Deque below for sub-type constraints

  • datetime.date

    see Datetime Types below for more detail on parsing and validation

  • datetime.time

    see Datetime Types below for more detail on parsing and validation

  • datetime.datetime

    see Datetime Types below for more detail on parsing and validation

  • datetime.timedelta

    see Datetime Types below for more detail on parsing and validation

  • typing.Any

    allows any value including None, thus an Any field is optional

  • typing.Annotated

    allows wrapping another type with arbitrary metadata, as per PEP-593. The Annotated hint may contain a single call to the Field function, but otherwise the additional metadata is ignored and the root type is used.

  • typing.TypeVar

    constrains the values allowed based on constraints or bound, see TypeVar

  • typing.Union

    see Unions below for more detail on parsing and validation

  • typing.Optional

    Optional[x] is simply short hand for Union[x, None]; see Unions below for more detail on parsing and validation and Required Fields for details about required fields that can receive None as a value.

  • typing.List

    see Typing Iterables below for more detail on parsing and validation

  • typing.Tuple

    see Typing Iterables below for more detail on parsing and validation

  • subclass of typing.NamedTuple

    Same as tuple but instantiates with the given namedtuple and validates fields since they are annotated. See Annotated Types below for more detail on parsing and validation

  • subclass of collections.namedtuple

    Same as subclass of typing.NamedTuple but all fields will have type Any since they are not annotated

  • typing.Dict

    see Typing Iterables below for more detail on parsing and validation

  • subclass of typing.TypedDict

    Same as dict but pydantic will validate the dictionary since keys are annotated. See Annotated Types below for more detail on parsing and validation

  • typing.Set

    see Typing Iterables below for more detail on parsing and validation

  • typing.FrozenSet

    see Typing Iterables below for more detail on parsing and validation

  • typing.Deque

    see Typing Iterables below for more detail on parsing and validation

  • typing.Sequence

    see Typing Iterables below for more detail on parsing and validation

  • typing.Iterable

    this is reserved for iterables that shouldn't be consumed. See Infinite Generators below for more detail on parsing and validation

  • typing.Type

    see Type below for more detail on parsing and validation

  • typing.Callable

    see Callable below for more detail on parsing and validation

  • typing.Pattern

    will cause the input value to be passed to re.compile(v) to create a regex pattern

  • ipaddress.IPv4Address

    simply uses the type itself for validation by passing the value to IPv4Address(v); see Pydantic Types for other custom IP address types

  • ipaddress.IPv4Interface

    simply uses the type itself for validation by passing the value to IPv4Interface(v); see Pydantic Types for other custom IP address types

  • ipaddress.IPv4Network

    simply uses the type itself for validation by passing the value to IPv4Network(v); see Pydantic Types for other custom IP address types

  • ipaddress.IPv6Address

    simply uses the type itself for validation by passing the value to IPv6Address(v); see Pydantic Types for other custom IP address types

  • ipaddress.IPv6Interface

    simply uses the type itself for validation by passing the value to IPv6Interface(v); see Pydantic Types for other custom IP address types

  • ipaddress.IPv6Network

    simply uses the type itself for validation by passing the value to IPv6Network(v); see Pydantic Types for other custom IP address types

  • enum.Enum

    checks that the value is a valid Enum instance

  • subclass of enum.Enum

    checks that the value is a valid member of the enum; see Enums and Choices for more details

  • enum.IntEnum

    checks that the value is a valid IntEnum instance

  • subclass of enum.IntEnum

    checks that the value is a valid member of the integer enum; see Enums and Choices for more details

  • decimal.Decimal

    pydantic attempts to convert the value to a string, then passes the string to Decimal(v)

  • pathlib.Path

    simply uses the type itself for validation by passing the value to Path(v); see Pydantic Types for other more strict path types

  • uuid.UUID

    strings and bytes (converted to strings) are passed to UUID(v), with a fallback to UUID(bytes=v) for bytes and bytearray; see Pydantic Types for other stricter UUID types

  • ByteSize

    converts a bytes string with units to bytes
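A few of the coercion rules in the list above can be checked with a small sketch. It assumes pydantic 1.x semantics (the import fallback is an assumption added here so that the snippet also runs under pydantic 2.x via pydantic.v1); the Sample model and its field names are hypothetical.

```python
from typing import List

try:
    from pydantic.v1 import BaseModel, ValidationError  # 1.x API inside pydantic 2.x
except ImportError:
    from pydantic import BaseModel, ValidationError     # plain pydantic 1.x


class Sample(BaseModel):
    flag: bool
    count: int
    label: str
    items: List[int]


# 'yes' is one of the accepted bool strings, 3.9 goes through int(v),
# 42 goes through str(v), and the tuple is cast to a list of ints.
s = Sample(flag='yes', count=3.9, label=42, items=('1', 2))

# A string outside the accepted set is rejected for a bool field.
bool_errors = None
try:
    Sample(flag='maybe', count=1, label='x', items=[])
except ValidationError as exc:
    bool_errors = exc.errors()
```

The int case shows the information loss mentioned above: the fractional part of 3.9 is dropped without any error.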

Literal Type (Python 3.8 and later)

from typing import Literal

from pydantic import BaseModel, ValidationError

class Pie(BaseModel):
    flavor: Literal['apple', 'pumpkin']

Pie(flavor='apple')
Pie(flavor='pumpkin')
try:
    Pie(flavor='cherry')
except ValidationError as e:
    print(str(e)) # unexpected value; permitted: 'apple', 'pumpkin'

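Validators

Validators are class methods, so the first argument they receive is the class (cls), not an instance (self). The second argument is always the field value to validate.

The following arguments can also be added to the signature (the names must match exactly):

  • values: a dict containing the fields that have already been validated.
  • config: the model config.
  • field: the field being validated; its type is pydantic.fields.ModelField.
  • **kwargs: a dict containing whichever of the above arguments are not listed explicitly in the signature.

A validator should either return the parsed value or raise ValueError, TypeError, or AssertionError (assert statements may be used).

If you use assert, be aware that running Python with the -O optimization flag disables assert statements, and the validators will stop working.

If a validator depends on other values, note the following: validation runs in the order in which the fields are defined, and all fields with an annotation (whether annotation-only or with a default value) precede all fields without an annotation. If validation of another field fails, or that field is missing, it will not be included in values, so the validator must check that the key is present before using it.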
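As a minimal sketch of a validator that depends on a previously validated field, the model below checks that two passwords match. It assumes the pydantic 1.x API (with an import fallback, added here, for environments where pydantic 2.x is installed); the UserModel class and its field names are illustrative.

```python
try:
    from pydantic.v1 import BaseModel, ValidationError, validator  # 1.x API inside pydantic 2.x
except ImportError:
    from pydantic import BaseModel, ValidationError, validator     # plain pydantic 1.x


class UserModel(BaseModel):
    password1: str
    password2: str  # defined after password1, so password1 is validated first

    @validator('password2')
    def passwords_match(cls, v, values):
        # password1 is absent from values if its own validation failed
        if 'password1' in values and v != values['password1']:
            raise ValueError('passwords do not match')
        return v


ok = UserModel(password1='secret', password2='secret')

mismatch_errors = None
try:
    UserModel(password1='secret', password2='other')
except ValidationError as exc:
    mismatch_errors = exc.errors()
```

Raising ValueError instead of using assert keeps the check active even when Python runs with -O.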

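A single validator can be applied to several fields by passing multiple field names to @validator(); passing the special value '*' applies it to every field. @validator(..., pre=True) makes the validator run before the other validation. @validator(..., each_item=True) applies the validator to the individual values of a container (e.g. of a List, Dict, or Set) rather than to the whole object. However, if a subclass attaches such a validator to a List field declared on a parent class, the each_item=True validator will not run; in that case the validator must iterate over the elements itself.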
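The pre=True and each_item=True options can be sketched as follows, assuming pydantic 1.x semantics (the import fallback is an assumption added here for pydantic 2.x installs). The Order model, its fields, and the comma-splitting rule are hypothetical.

```python
from typing import List

try:
    from pydantic.v1 import BaseModel, ValidationError, validator  # 1.x API inside pydantic 2.x
except ImportError:
    from pydantic import BaseModel, ValidationError, validator     # plain pydantic 1.x


class Order(BaseModel):
    tags: List[str]
    quantities: List[int]

    @validator('tags', pre=True)
    def split_tags(cls, v):
        # Runs before the List[str] check, so it may receive a raw string.
        if isinstance(v, str):
            return v.split(',')
        return v

    @validator('quantities', each_item=True)
    def must_be_positive(cls, v):
        # Called once per list element, after the element was parsed to int.
        if v <= 0:
            raise ValueError('must be positive')
        return v


order = Order(tags='a,b', quantities=['1', 2])

bad_item_loc = None
try:
    Order(tags=['a'], quantities=[1, 0])
except ValidationError as exc:
    bad_item_loc = exc.errors()[0]['loc']  # field name plus index of the bad item
```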

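By default, a validator is not called for a field whose value is not supplied, unless @validator(..., always=True) is set. always=True is usually best combined with pre=True; otherwise pydantic tries to validate the default value (often None), which can cause an error.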
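A common use of always=True is a dynamic default. The sketch below assumes the pydantic 1.x API (with an import fallback, added here, for pydantic 2.x installs); the Event model and its ts field are illustrative.

```python
from datetime import datetime
from typing import Optional

try:
    from pydantic.v1 import BaseModel, validator  # 1.x API inside pydantic 2.x
except ImportError:
    from pydantic import BaseModel, validator     # plain pydantic 1.x


class Event(BaseModel):
    ts: Optional[datetime] = None

    @validator('ts', pre=True, always=True)
    def set_ts_now(cls, v):
        # Called even when ts is omitted; fall back to the current time.
        return v or datetime.now()


implicit = Event()                       # validator supplies the timestamp
explicit = Event(ts='2017-11-08T14:00')  # supplied value is parsed as usual
```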

Setting allow_reuse=True makes it possible to share one validator function between several models, as in the following code:

from pydantic import BaseModel, validator

def normalize(name: str) -> str:
    return ' '.join((word.capitalize()) for word in name.split(' '))

class Producer(BaseModel):
    name: str

    _normalize_name = validator('name', allow_reuse=True)(normalize) # validators

class Consumer(BaseModel):
    name: str

    _normalize_name = validator('name', allow_reuse=True)(normalize) # validators

jane_doe = Producer(name='JaNe DOE')
john_doe = Consumer(name='joHN dOe')
assert jane_doe.name == 'Jane Doe'
assert john_doe.name == 'John Doe'

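The @root_validator decorator defines root validators, which check the data of the model as a whole. With pre=True a root validator is called before field validation (and receives the raw input data); otherwise it is called after field validation. If a pre=True root validator raises an error, field validation does not take place. A pre=False root validator is called by default even if earlier validators have failed; this behaviour can be changed by setting skip_on_failure=True. See the official documentation for details.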
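Both kinds of root validator can be sketched on one model, assuming the pydantic 1.x API (the import fallback is an assumption added here for pydantic 2.x installs). The Signup model and the card_number rule are illustrative.

```python
try:
    from pydantic.v1 import BaseModel, ValidationError, root_validator  # 1.x API inside pydantic 2.x
except ImportError:
    from pydantic import BaseModel, ValidationError, root_validator     # plain pydantic 1.x


class Signup(BaseModel):
    username: str
    password1: str
    password2: str

    @root_validator(pre=True)
    def reject_card_number(cls, values):
        # Sees the raw input before any field has been validated.
        if 'card_number' in values:
            raise ValueError('card_number should not be included')
        return values

    @root_validator(skip_on_failure=True)
    def passwords_match(cls, values):
        # Runs after field validation, and only if no field failed.
        if values.get('password1') != values.get('password2'):
            raise ValueError('passwords do not match')
        return values


ok = Signup(username='jane', password1='pw', password2='pw')

root_errors = None
try:
    Signup(username='jane', password1='pw', password2='other')
except ValidationError as exc:
    root_errors = exc.errors()
```

Errors raised by a root validator are reported under the '__root__' location rather than under a field name.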

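When a class is created, its validators are checked to confirm that the fields they name actually exist on the model. Sometimes this check is undesirable, for example when a validator is defined for a field that only exists on models inheriting from the class. In that case, set check_fields=False on the validator.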
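One situation where check_fields=False is needed can be sketched like this, assuming pydantic 1.x behaviour (with an import fallback, added here, for pydantic 2.x installs). NameMixin and Person are hypothetical names: the base class declares a validator for a field that only its subclasses define.

```python
try:
    from pydantic.v1 import BaseModel, validator  # 1.x API inside pydantic 2.x
except ImportError:
    from pydantic import BaseModel, validator     # plain pydantic 1.x


class NameMixin(BaseModel):
    # 'name' does not exist on this class, so the field check must be skipped.
    @validator('name', check_fields=False)
    def strip_name(cls, v):
        return v.strip()


class Person(NameMixin):
    name: str  # the inherited validator applies to this field


person = Person(name='  Ann ')
```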