Converts a given python_val to a Flyte Literal, assuming the given python_val matches the declared python_type.
Implementers should refrain from using type(python_val) and instead rely on the passed-in python_type. If these
do not match (or are not allowed), the Transformer implementer should raise an AssertionError that clearly states
what the mismatch was.
| Parameter | Type | Description |
|-|-|-|
| ctx | flytekit.core.context_manager.FlyteContext | A FlyteContext, useful in accessing the filesystem and other attributes |
| python_val | pandas.DataFrame | The actual value to be transformed |
| python_type | typing.Type[pandas.DataFrame] | The assumed type of the value (this matches the declared type on the function) |
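
The type-checking guidance above can be sketched in plain Python. This is a minimal illustration, not flytekit's implementation: `check_declared_type` is a hypothetical helper name, and it assumes `python_type` is a plain class that `isinstance` accepts (parameterized generics such as `typing.List[int]` would need different handling).

```python
from typing import Any, Type


def check_declared_type(python_val: Any, python_type: Type) -> None:
    """Hypothetical helper: validate python_val against the declared python_type."""
    # Decide based on the declared python_type, not on type(python_val).
    if not isinstance(python_val, python_type):
        # type(python_val) is used only to make the error message descriptive.
        raise AssertionError(
            f"Expected a value of type {python_type.__name__}, "
            f"but received {type(python_val).__name__}: {python_val!r}"
        )


# A matching value passes silently; a mismatch raises AssertionError.
check_declared_type(3, int)
```

A transformer's to_literal could call a check like this before doing any conversion work, so that a mismatch is reported with both the declared and the received type.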
This function primarily handles deserialization for untyped dicts, dataclasses, Pydantic BaseModels, and attribute access.
For untyped dicts, dataclasses, and Pydantic BaseModels:
Life Cycle (Untyped Dict as example):
python val -> msgpack bytes -> binary literal scalar -> msgpack bytes -> python val
(to_literal) (from_binary_idl)
For attribute access:
Life Cycle:
python val -> msgpack bytes -> binary literal scalar -> resolved golang value -> binary literal scalar -> msgpack bytes -> python val
(to_literal) (propeller attribute access) (from_binary_idl)
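
The msgpack round trip in the first life cycle can be sketched as follows. This is only an illustration under stated assumptions: it uses the third-party `msgpack` package directly, and the variable `binary_scalar_value` is a hypothetical stand-in for the bytes carried by the binary literal scalar, not a flytekit object.

```python
import msgpack

# An untyped dict, as in the "Untyped Dict" life cycle example.
python_val = {"a": 1, "b": {"c": [1, 2, 3]}}

# to_literal side: python val -> msgpack bytes.
msgpack_bytes = msgpack.packb(python_val)

# Hypothetical stand-in for the payload of the binary literal scalar.
binary_scalar_value = msgpack_bytes

# from_binary_idl side: msgpack bytes -> python val.
restored = msgpack.unpackb(binary_scalar_value)

# The round trip preserves the nested structure.
assert restored == python_val
```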
Converts a given python_val to a Flyte Literal, assuming the given python_val matches the declared python_type.
Implementers should refrain from using type(python_val) and instead rely on the passed-in python_type. If these
do not match (or are not allowed), the Transformer implementer should raise an AssertionError that clearly states
what the mismatch was.
| Parameter | Type | Description |
|-|-|-|
| ctx | FlyteContext | A FlyteContext, useful in accessing the filesystem and other attributes |
| python_val | typing.Any | The actual value to be transformed |
| python_type | Type[T] | The assumed type of the value (this matches the declared type on the function) |
Writes the data frame as a chunk to the local directory owned by the Schema object. The chunk is later uploaded to S3.
| Parameter | Type | Description |
|-|-|-|
| df | pandas.DataFrame | Data frame to write as Parquet |
| to_file | os.PathLike | Sink file to write the data frame to |
| coerce_timestamps | str | Format in which to store timestamps in Parquet. 'us', 'ms', and 's' are allowed values. Note: if your timestamps would lose data due to the coercion, your write will fail! Nanoseconds are problematic in the Parquet format and will not work. See allow_truncated_timestamps. |
| allow_truncated_timestamps | bool | Default False. Allows truncation when coercing timestamps to a coarser resolution. |
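
The interaction between coerce_timestamps and allow_truncated_timestamps can be illustrated with a small pure-Python sketch. It is not the Parquet writer: `coerce_ns_timestamps` and `_NS_PER_UNIT` are hypothetical names, and timestamps are assumed to be integer nanosecond counts. It only models the rule described above, that a lossy coercion fails unless truncation is explicitly allowed.

```python
from typing import List

# Nanoseconds per unit for the three allowed target resolutions.
_NS_PER_UNIT = {"s": 1_000_000_000, "ms": 1_000_000, "us": 1_000}


def coerce_ns_timestamps(
    values_ns: List[int], unit: str, allow_truncated: bool = False
) -> List[int]:
    """Hypothetical helper: coerce nanosecond timestamps to a coarser unit."""
    if unit not in _NS_PER_UNIT:
        raise ValueError(f"unit must be one of {sorted(_NS_PER_UNIT)}, got {unit!r}")
    factor = _NS_PER_UNIT[unit]
    coerced = []
    for ns in values_ns:
        # A non-zero remainder means the coarser unit cannot represent the value exactly.
        if ns % factor != 0 and not allow_truncated:
            raise ValueError(f"Coercing {ns} ns to '{unit}' would lose data")
        # Floor division drops the sub-unit remainder.
        coerced.append(ns // factor)
    return coerced


# Exact coercion succeeds; lossy coercion needs allow_truncated=True.
exact = coerce_ns_timestamps([1_000_000, 2_000_000], "ms")
truncated = coerce_ns_timestamps([1_500], "us", allow_truncated=True)
```

With the default allow_truncated=False, calling `coerce_ns_timestamps([1_500], "us")` raises ValueError, mirroring the failing write described in the table.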