DataType

Every data type used in a DataChain must be a DataType. DataType covers most Python types that Pydantic supports as field types, as well as any class that inherits from Pydantic's BaseModel.

Pydantic models can be used to group and nest multiple fields together into a single type object. Any Pydantic model must be registered so that the chain knows the expected schema of the model. Alternatively, models may inherit from DataModel, which is a lightweight wrapper around Pydantic's BaseModel that automatically handles registering the model.

DataModel

Bases: BaseModel

Pydantic model wrapper that registers the model with DataChain.

__pydantic_init_subclass__ classmethod

__pydantic_init_subclass__()

It automatically registers every declared DataModel child class.

Source code in datachain/lib/data_model.py
@classmethod
def __pydantic_init_subclass__(cls):
    """It automatically registers every declared DataModel child class."""
    ModelStore.register(cls)
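
The registration hook above can be illustrated without DataChain or Pydantic installed. The following is a minimal sketch that uses plain Python's `__init_subclass__` in place of Pydantic's `__pydantic_init_subclass__`; `Registry` and `AutoRegistered` are hypothetical stand-ins for `ModelStore` and `DataModel`, not the library's actual classes.

```python
class Registry:
    """Hypothetical stand-in for ModelStore: remembers classes by name."""

    models: dict[str, type] = {}

    @classmethod
    def register(cls, model: type) -> None:
        cls.models[model.__name__] = model


class AutoRegistered:
    """Hypothetical stand-in for DataModel."""

    def __init_subclass__(cls, **kwargs):
        # Runs once for every subclass, at class-definition time.
        super().__init_subclass__(**kwargs)
        Registry.register(cls)


class Pose(AutoRegistered):
    x: float
    y: float


print(sorted(Registry.models))
```

Declaring `Pose` is enough to register it; no explicit call is needed. The base class itself is not registered, because the hook fires only for subclasses.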

register staticmethod

register(models: Union[DataType, Sequence[DataType]])

For registering classes manually. It accepts a single class or a sequence of classes.

Source code in datachain/lib/data_model.py
@staticmethod
def register(models: Union[DataType, Sequence[DataType]]):
    """For registering classes manually. It accepts a single class or a sequence of
    classes."""
    if not isinstance(models, Sequence):
        models = [models]
    for val in models:
        ModelStore.register(val)
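
The single-or-sequence handling can be sketched in isolation as follows. This is an illustration only: the `registered` dictionary is a hypothetical stand-in for `ModelStore`, and the `Image` and `Caption` classes are placeholders rather than real models.

```python
from collections.abc import Sequence

# Hypothetical stand-in for ModelStore: maps class name to class.
registered: dict[str, type] = {}


def register(models) -> None:
    """Accept one class or a sequence of classes, mirroring the check above."""
    if not isinstance(models, Sequence):
        models = [models]  # wrap a single class so the loop is uniform
    for model in models:
        registered[model.__name__] = model


class Image: ...
class Caption: ...


register(Image)             # a single class
register([Image, Caption])  # a list; in this sketch Image is simply overwritten
register((Caption,))        # a tuple is also a Sequence
```

Because the check is `isinstance(models, Sequence)`, a set or a generator does not count as a sequence and would be wrapped as if it were a single item. Passing a list or tuple avoids that.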

DataType module-attribute

DataType = Union[type[BaseModel], StandardType]

is_chain_type

is_chain_type(t: type) -> bool

Return true if type is supported by DataChain.

Source code in datachain/lib/data_model.py
def is_chain_type(t: type) -> bool:
    """Return true if type is supported by `DataChain`."""
    if ModelStore.is_pydantic(t):
        return True
    if any(t is ft or t is get_args(ft)[0] for ft in get_args(StandardType)):
        return True

    orig = get_origin(t)
    args = get_args(t)
    if orig is list and len(args) == 1:
        return is_chain_type(get_args(t)[0])

    if orig is Union and len(args) == 2 and (type(None) in args):
        return is_chain_type(args[0])

    return False
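
To see how the recursion handles nested types, here is a self-contained sketch of the same logic. It makes two assumptions: `StandardType` is replaced by a small stand-in union (the real definition is not shown on this page and may list other types), and `is_model` with the placeholder class `FakeModel` stands in for `ModelStore.is_pydantic`.

```python
from typing import Optional, Union, get_args, get_origin

# Stand-in for DataChain's StandardType; the real union may differ.
StandardType = Union[type[int], type[str], type[float], type[bool]]


class FakeModel:
    """Hypothetical placeholder for a registered Pydantic model."""


def is_model(t) -> bool:
    # Stand-in for ModelStore.is_pydantic.
    return isinstance(t, type) and issubclass(t, FakeModel)


def is_supported(t) -> bool:
    if is_model(t):
        return True
    # Each member of StandardType is type[X]; compare t against X.
    if any(t is ft or t is get_args(ft)[0] for ft in get_args(StandardType)):
        return True

    orig = get_origin(t)
    args = get_args(t)
    if orig is list and len(args) == 1:
        return is_supported(args[0])  # list[X] is supported if X is
    if orig is Union and len(args) == 2 and type(None) in args:
        return is_supported(args[0])  # Optional[X] is supported if X is
    return False


print(is_supported(list[Optional[FakeModel]]))
print(is_supported(tuple[int, str]))
```

The first call unwraps `list`, then `Optional`, and finally reaches the model class. The second falls through every branch because `tuple` is neither a list nor an optional. A two-member union without `None`, such as `Union[int, str]`, is rejected for the same reason.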