| | Bundle Size Reduction | Perf Score Improvement | Blocking Time | Scripting Time Improvement |
| --- | --- | --- | --- | --- |
| Execution Details | 63% | 223% | 18x faster | 162% |
| List Projects | 81.3% | 122% | 32x faster | 160% |
| List Executions | 77% | 129% | 1.5x faster | 127% |
| List Workflows | 64% | 334% | 66x faster | 217% |
| List Tasks | 64% | 257% | 24x faster | 217% |
| Accelerator device | Supported key |
| --- | --- |
| NVIDIA A10 Tensor Core GPU | "nvidia-a10g" |
| NVIDIA L4 Tensor Core GPU | "nvidia-l4-vws" |
| NVIDIA Tesla K80 GPU | "nvidia-tesla-k80" |
| NVIDIA Tesla M60 GPU | "nvidia-tesla-m60" |
| NVIDIA Tesla P4 GPU | "nvidia-tesla-p4" |
| NVIDIA Tesla P100 GPU | "nvidia-tesla-p100" |
| NVIDIA T4 Tensor Core GPU | "nvidia-tesla-t4" |
| NVIDIA Tesla V100 GPU | "nvidia-tesla-v100" |

Model Deployment without SageMaker Inference Agent
  • Store model artifacts in an S3 bucket
  • Write a Dockerfile
  • Build an image
  • Push it to a registry
  • Create a SageMaker model
  • Create a SageMaker endpoint configuration
  • Create a SageMaker endpoint
  • In the event of a rollback, redo all of these steps for a previous version, either from memory or from manually maintained records

Model Deployment with SageMaker Inference Agent
  • Define an ImageSpec (see the sketch after this list)
  • Provide the model, endpoint-config and endpoint configurations (the model path can point to the Flyte workflow that generates the model)
  • Execute locally or on a Flyte cluster
  • To roll back, simply re-run the previous version!
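To make the first step concrete, here is a minimal sketch of an ImageSpec for a serving image. The registry, image name and package list are placeholders, not the official Flyte SageMaker example, and the exact plugin package name may differ by flytekit version.

```python
from flytekit import ImageSpec

# Hypothetical serving image; registry, name and packages are illustrative only.
serving_image = ImageSpec(
    name="sagemaker-xgboost-serving",
    registry="ghcr.io/<your-org>",
    packages=["xgboost", "fastapi", "uvicorn", "flytekitplugins-awssagemaker"],
    python_version="3.11",
)
```

The model, endpoint-config and endpoint configurations are then supplied to the agent's deployment tasks in code, so the entire deployment is versioned alongside the workflow that produced the model.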
| Best Practices ⬇️ / Principles ➡️ | Artifact Immutability | Transformation Immutability | Transformation Determinism | Completeness & Accessibility |
| --- | --- | --- | --- | --- |
| Containerization (Kubernetes) | 🔴 No first-class data abstraction | 🟢 Integrated code, config, & infra | 🟠 User code must be deterministic | 🔴 Mgmt friction can slow adoption |
| Experiment Tracking (W&B, MLFlow) | 🟢 Versioned model artifacts | 🟠 Possible but not guaranteed | 🟠 User code must be deterministic | 🔴 Typically excludes data workflows |
| Data Management (DVC) | 🟢 Versioned data artifacts | 🔴 Outside the scope of data versioning | 🟠 User code must be deterministic | 🔴 Typically excludes model artifacts |
| AI Orchestrators (Flyte, Kubeflow) | 🟢 Strongly-typed inputs & outputs | 🟢 Everything is containerized | 🟠 User code must be deterministic | 🟠 OSS requires K8s knowledge |
User question
UP4D9EY6T: ```hi, any reason that we don’t generate rpc stub for java here <https://github.com/lyft/flyteidl/tree/master/gen/pb-java/flyteidl>```

RedPajama
Here are some common reasons why Java isn't used within the FlyDeck project:
1. The project uses Go instead of Python because it is a multi-language project that uses the golang toolchain. Using Python would require using a different toolchain.
2. Some FlyDeck components use Rust instead of Python because Rust is intended to be more efficient and easier to maintain than Python.
3. Some FlyDeck components use TypeScript instead of JavaScript because it is a superset of ES5 that includes native types, static type checking, and class-based programming. This makes it easier to write

Mosaic
<|im_start|> I'm sorry I cannot find information about what exactly is blocking you regarding the FlyteIDL Java RPC generation code. Please let me know more details about the issue you're facing so I may assist better. If needed, please feel free to contact [email protected] for further assistance. Thank you!<|im_end|>

User question
UQ5CB8G06: ```Hey everyone! \nThanks for the invite Ketan Umare \nI'm Adhita and I work for Cisco on Kubeflow.\nCurious about the typed SDK and plugins for Spark, k8s in Flyte  ```

RedPajama
Context: Hey everyone! Adhita Selvaraj here, and I work for Cisco on Kubeflow. I'm curious about the Typed Kubernetes API (tkap). Can you give me a quick overview of what tkap is and why it's useful?
Answer: Typed KubeAPIs (tkAp) are a way to define custom resource types (for example, Deployment, DaemonSet, etc.) in your cluster using JSON APIs. These resources can then be accessed and manipulated using tools such as kubectl. TkAp helps to make Kubernetes more flexible and scalable by allowing

Mosaic
<|im_start|> Sure thing, thanks for reaching out! If you're looking into creating custom plugins for Kubernetes and Spark operators within KubeFlow, there are currently two options - Typed SDK and Plugins. The TypeScript based Typed SDK provides more type safety and better error reporting compared to traditional Python code, while plugins offer greater flexibility with regards to deployment environments and operator functionality. It sounds like you may want to explore both options further depending upon your specific requirements.<|im_end|>
User question
Llama 2 7B
UQSUQEVLL: ```Curious to know what people see as the differences between kubeflow and flyte?```
Result: 🤖 Hi UQSUQEVLL! 🤖
Based on the context you provided, it seems that Flyte and kubeflow are both related to machine learning and pipelines, but they have different focuses and features. Flyte is opinionated about pipelines and offers a battle-tested product, while kubeflow has a broader scope and includes components beyond compute portions of model training and pipelines. Flyte is ahead of kubeflow in terms of features, but it's more focused on a specific problem. 🤖
UQDU9EBH8: ```Thanks for the in-depth responses, Ketan Umare and Matt Smith! Is there any documentation on your authentication / authorization workflow? Do you have tie-ins to dex / LDAP? Is there group-based authorization so that multiple people can collaborate on the same project?```
Result: Hello UQDU9EBH8! 🤖
Thank you for reaching out to us with your questions about authentication and authorization in Flyte. 😊
To answer your questions:
1. Documentation on Authentication/Authorization Workflow: We do have documentation on our authentication and authorization workflow, which can be found in the Flyte Admin documentation. 📖 Specifically, you can refer to the section on "Authenticating Users" and "Authorization Policies" in the "Security" chapter.
2. Tie-ins to Dex/LDAP: Flyte does not have built-in support for tying into Dex or LDAP directly. However, we do provide an IDP (Identity Provider) configuration file that you can use to configure your IDP. This file is available in the Flyte Admin repository. 📁
3. Group-Based Authorization: Yes, Flyte does support group-based authorization. You can create groups and assign users to those groups, and then use those groups to enforce authorization policies. 🤝

We hope this helps answer your questions! If you have any more questions or would like to contribute to the project, please feel free to reach out to us. 😊
User question
Llama 2 13B
hi, any reason that we don’t generate rpc stub for java here <https://github.com/lyft/flyteidl/tree/master/gen/pb-java/flyteidl>
Result:  Hello! I'm here to assist you with any questions you may have. In response to your inquiry, it appears that you are discussing a problem with generating RPC stubs for Java using the flyteidl repository. Is there anything specific you would like to know or discuss regarding this issue? I'm here to help and provide any information or guidance I can.
Hey everyone!
Thanks for the invite Ketan Umare
I'm Adhita and I work for Cisco on Kubeflow.
Curious about the typed SDK and plugins for Spark, k8s in Flyte
Result: Hey Adhita Selvaraj,
I'd love to chat with you about authoring plugins for the operators in Kubeflow. I'm curious about the typed SDK and plugins for Spark, k8s in Flyte. Do you have any availability for a virtual call? I think it would be super interesting to work on. Let me know if you're available sometime today or next week, and I can move things around in my schedule accordingly.
Hi Adhita, welcome to Flyte. Awesome to know you work on Kubeflow. We would love to build support for various distributed Ml operators like katib, mpioperator (or maybe just podgroup), tf operator into Flyte
Result: Hello! As a helpful Slack bot, I'm here to answer your questions. In this conversation, you've discussed integrating TF-Operator into Flyte and creating a plugin for each operator.

To directly address your question, having a plugin for each operator might be a good way of interacting with the operators. This approach allows for more flexibility and customization, enabling you to tailor the integration to each operator's specific needs.

However, it's important to consider the potential drawbacks of this approach. Maintaining and updating multiple plugins might be more challenging than a single, all-encompassing plugin. Additionally, ensuring compatibility and consistency across all plugins could be a hurdle.

If you'd like more information on creating plugins for TF-Operator or other operators in Flyte, I can provide guidance on the process and best practices. Please let me know if there's anything else I can help with!
User question
GPT-3.5 Turbo
UP4D9EY6T: ```hi, any reason that we don’t generate rpc stub for java here <https://github.com/lyft/flyteidl/tree/master/gen/pb-java/flyteidl>```
Bot: The reason for not generating RPC stub for Java in the given repository is that Java is not currently being used. The docker image producing the stub is not open-sourced, but you can modify it directly. However, it should be noted that the image will become larger due to the addition of openjdk. The team has discussed this issue and plans to pull those tools into flytetools, which is open source. From there, they can contribute easily to the same docker image.
UQ5CB8G06: ```Hey everyone! \nThanks for the invite Ketan Umare \nI'm Adhita and I work for Cisco on Kubeflow.\nCurious about the typed SDK and plugins for Spark, k8s in Flyte  ```
Bot: Hello Adhita, I can help answer your question about typed SDK and plugins for Spark and k8s in Flyte. Our Python SDK is currently being developed and we have plans to expand its capabilities in the future. As for plugins, our team is knowledgeable about authoring new ones and would be happy to discuss this further with you. If you're interested in setting up a time to chat, please let us know.
UP4D9EY6T: ```i checked the docker image and there is a special treatment to ignore `java`.```
Bot: Thank you for sharing your experience with the docker image. It's great to hear that you were able to make it work. If the image is open-sourced in the future, we will definitely consider your suggestion for a PR. Is there anything else I can assist you with?
| Model | Observations |
| --- | --- |
| RedPajama 7B Chat | Hallucinated most of the time. |
| Mosaic 7B Chat | Mosaic outperformed RedPajama, giving responses that, while not always perfect, were more detailed and better understood the context provided in the prompt. |
| Llama 13B Chat | Compared to Mosaic, Llama performed better by providing comprehensive responses and understanding the context in the prompt. But the model consistently cited the context in the prompt, even when explicitly asked not to. |
| GPT-3.5 | GPT-3.5 produced responses that closely aligned with the prompts, showing no signs of hallucination. It didn't reference the context as often and its responses were less comprehensive than those of the other two models, although they remained concise and to the point. |
Question
Response
quick question — is there a way to register Flyte tasks that are defined in a separate imported repository or package? i.e., repository A installs the package produced by repository B and uses the tasks written in repository B and when I register Flyte workflows in repo A I want it to register all the tasks from both repos
> quick question — is there a way to register Flyte tasks that are defined in a separate imported repository or package?

Yes, you can use the `register_task` decorator to register tasks defined in a separate repository. You can also use the `register_workflow` decorator to register workflows defined in a separate repository.

> i.e., repository A installs the package produced by repository B and uses the tasks written in repository B and when I register Flyte workflows in repo A I want it to register all the tasks from both repos

You can use the `
Hi! Is there a way to shorten `ttlSecondsAfterFinished`? By default, it is 3600s (1 hour) and we’d like to tear down a cluster right after a job is complete. Thanks for your help! ```$ k describe rayjobs feb5da8c2a2394fb4ac8-n0-0 -n flytesnacks-development ... Ttl Seconds After Finished: 3600```
cc Yuvraj (union.ai) <|endoftext|>
Dataset Type
What capabilities does it give the model?
🔤 Token-Based Dataset
Think of this as an unstructured pile of text. When training on this kind of dataset, you’re simply conditioning the model to produce text more like what’s contained in it. At inference time, you get a model that, for example, can sound more like Shakespeare if you train it on his body of work.
👉 Instruction Dataset
If you’re familiar with ChatGPT’s system messages, instruction datasets are composed of examples containing an “instruction,” an “input” and an “output.” At inference time, this dataset type allows you to provide meta information about the task that you want it to perform. 
💬 Human Feedback Dataset
This typically comes in the form of human preference comparisons of two responses: a winning response and a losing response. This type of data is the most complex; the RLHF framework can use human feedback data to train a reward model, which can then be used to update the base language model via reinforcement learning.
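To make the distinction concrete, here are two hypothetical records, one instruction-style and one human-preference-style. The field names follow common conventions (Alpaca-style instruction data, pairwise preference data) rather than any specific dataset:

```python
# Hypothetical instruction-tuning record (Alpaca-style field names).
instruction_record = {
    "instruction": "Summarize the following Slack thread in one sentence.",
    "input": "UP4D9EY6T: hi, any reason that we don't generate rpc stubs for java?",
    "output": "A user asks why Java RPC stubs are not generated in flyteidl.",
}

# Hypothetical human-feedback record: a prompt with a preferred ("chosen")
# and a rejected response, as used to train a reward model for RLHF.
preference_record = {
    "prompt": "What is Flyte?",
    "chosen": "Flyte is an open-source workflow orchestrator for data and ML pipelines.",
    "rejected": "Flyte is a database.",
}
```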
Kubeflow Pipelines v2
Flyte
Multi-Tenancy
Type Checking
Caching
Versioning and Reproducibility
Sub DAG
Data Lineage
Scalability
Map Tasks
Dynamic DAGs ~
Retries
Reruns
Scheduling
Branching
Task Timeout
Spark Support
Extensible
Data Visualization
Model Serving ~ (Check out UnionML!)
Notifications
Recovery
Ease of Development
Ease of Local Deployment
Human-in-the-Loop ~ (UI in progress)
Intratask Checkpointing
Code

Kubeflow Pipelines v2:
from kfp import dsl
from kfp import client


@dsl.component
def addition_component(num1: int, num2: int) -> int:
    return num1 + num2


@dsl.pipeline(name='addition-pipeline')
def my_pipeline(a: int, b: int, c: int = 10):
    add_task_1 = addition_component(num1=a, num2=b)
    add_task_2 = addition_component(num1=add_task_1.output, num2=c)
Flyte:
from flytekit import task, workflow



@task
def addition_component(num1: int, num2: int) -> int:
   return num1 + num2



@workflow
def my_pipeline(a: int, b: int, c: int = 10):
   add_task_1 = addition_component(num1=a, num2=b)
   add_task_2 = addition_component(num1=add_task_1, num2=c)
Trigger (CLI)
Kubeflow Pipelines v2:
kfp dsl compile --py path/to/pipeline.py --output path/to/output.yaml

kfp run create --experiment-name my-experiment --package-file path/to/output.yaml 
Flyte:

pyflyte run --remote example.py my_pipeline --a 1 --b 2
Trigger (Python)
Kubeflow Pipelines v2:
endpoint = '<KFP_ENDPOINT>'
kfp_client = client.Client(host=endpoint)
run = kfp_client.create_run_from_pipeline_func(
    my_pipeline,
    arguments={
        'a': 1,
        'b': 2,
    },
)
url = f'{endpoint}/#/runs/details/{run.run_id}'
print(url)
Flyte:
from flytekit.configuration import Config
from flytekit.remote import FlyteRemote

from <your-module> import my_pipeline

remote = FlyteRemote(
   config=Config.auto(),
   default_project="flytesnacks",
   default_domain="development",
)

registered_workflow = remote.register_script(
   my_pipeline,
   source_path="../../", # depends on where __init__.py file is present
   module_name="<your-module>",
)

execution = remote.execute(
   registered_workflow,
   inputs={"a": 100, "b": 19},
)
print(f"Execution successfully started: {execution.id.name}")
Code

Kubeflow Pipelines v2:
from typing import List

from kfp import client
from kfp import dsl
from kfp.dsl import Dataset
from kfp.dsl import Input
from kfp.dsl import Model
from kfp.dsl import Output


@dsl.component(packages_to_install=['pandas==1.3.5'])
def create_dataset(iris_dataset: Output[Dataset]):
    import pandas as pd

    csv_url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data'
    col_names = [
        'Sepal_Length', 'Sepal_Width', 'Petal_Length', 'Petal_Width', 'Labels'
    ]
    df = pd.read_csv(csv_url, names=col_names)

    with open(iris_dataset.path, 'w') as f:
        df.to_csv(f)


@dsl.component(packages_to_install=['pandas==1.3.5', 'scikit-learn==1.0.2'])
def normalize_dataset(
    input_iris_dataset: Input[Dataset],
    normalized_iris_dataset: Output[Dataset],
    standard_scaler: bool,
    min_max_scaler: bool,
):
    if standard_scaler is min_max_scaler:
        raise ValueError(
            'Exactly one of standard_scaler or min_max_scaler must be True.')

    import pandas as pd
    from sklearn.preprocessing import MinMaxScaler
    from sklearn.preprocessing import StandardScaler

    with open(input_iris_dataset.path) as f:
        df = pd.read_csv(f)
    labels = df.pop('Labels')

    if standard_scaler:
        scaler = StandardScaler()
    if min_max_scaler:
        scaler = MinMaxScaler()

    df = pd.DataFrame(scaler.fit_transform(df))
    df['Labels'] = labels
    with open(normalized_iris_dataset.path, 'w') as f:
        df.to_csv(f)


@dsl.component(packages_to_install=['pandas==1.3.5', 'scikit-learn==1.0.2'])
def train_model(
    normalized_iris_dataset: Input[Dataset],
    model: Output[Model],
    n_neighbors: int,
):
    import pickle

    import pandas as pd
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    with open(normalized_iris_dataset.path) as f:
        df = pd.read_csv(f)

    y = df.pop('Labels')
    X = df

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = KNeighborsClassifier(n_neighbors=n_neighbors)
    clf.fit(X_train, y_train)
    with open(model.path, 'wb') as f:
        pickle.dump(clf, f)


@dsl.pipeline(name='iris-training-pipeline')
def my_pipeline(
    standard_scaler: bool,
    min_max_scaler: bool,
    neighbors: List[int],
):
    create_dataset_task = create_dataset()

    normalize_dataset_task = normalize_dataset(
        input_iris_dataset=create_dataset_task.outputs['iris_dataset'],
        standard_scaler=standard_scaler,
        min_max_scaler=min_max_scaler)

    with dsl.ParallelFor(neighbors) as n_neighbors:
        train_model(
            normalized_iris_dataset=normalize_dataset_task
            .outputs['normalized_iris_dataset'],
            n_neighbors=n_neighbors)
Flyte:
from dataclasses import dataclass
from typing import List

import pandas as pd
from dataclasses_json import dataclass_json
from flytekit import map_task, task, workflow
from flytekit.types.structured import StructuredDataset
from sklearn.base import ClassifierMixin
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler, StandardScaler

COL_NAMES = ["Sepal_Length", "Sepal_Width", "Petal_Length", "Petal_Width", "Labels"]


@dataclass_json
@dataclass
class TrainInputs:
   normalized_iris_dataset: StructuredDataset
   n_neighbors: int


@task
def create_dataset() -> pd.DataFrame:
   csv_url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
   df = pd.read_csv(csv_url, names=COL_NAMES)
   return df


@task
def normalize_dataset(
   input_iris_dataset: pd.DataFrame, standard_scaler: bool, min_max_scaler: bool
) -> pd.DataFrame:
   if standard_scaler is min_max_scaler:
       raise ValueError(
           "Exactly one of standard_scaler or min_max_scaler must be True."
       )

   labels = input_iris_dataset.pop("Labels")

   if standard_scaler:
       scaler = StandardScaler()
   if min_max_scaler:
       scaler = MinMaxScaler()

   df = pd.DataFrame(
       scaler.fit_transform(input_iris_dataset),
       columns=set(COL_NAMES) - set(["Labels"]),
   )
   df["Labels"] = labels
   return df


@task
def train_model(input: TrainInputs) -> ClassifierMixin:
   df = input.normalized_iris_dataset.open(pd.DataFrame).all()
   y = df.pop("Labels")
   X = df

   X_train, _, y_train, _ = train_test_split(X, y, random_state=0)

   clf = KNeighborsClassifier(n_neighbors=input.n_neighbors)
   clf.fit(X_train, y_train)

   return clf


@task
def prepare_map_inputs(
   list_neighbors: List[int], normalized_iris_dataset: StructuredDataset
) -> List[TrainInputs]:
   return [
       TrainInputs(normalized_iris_dataset, neighbor) for neighbor in list_neighbors
   ]


@workflow
def my_pipeline(standard_scaler: bool, min_max_scaler: bool, neighbors: List[int]):
   create_dataset_task = create_dataset()
   normalize_dataset_task = normalize_dataset(
       input_iris_dataset=create_dataset_task,
       standard_scaler=standard_scaler,
       min_max_scaler=min_max_scaler,
   )
   map_task(train_model)(
       input=prepare_map_inputs(
           list_neighbors=neighbors, normalized_iris_dataset=normalize_dataset_task
       )
   )
Trigger (CLI)
Kubeflow Pipelines v2:
kfp dsl compile --py path/to/pipeline.py --output path/to/output.yaml

kfp run create --experiment-name my-experiment --package-file path/to/output.yaml
Flyte:

pyflyte run --remote --image ghcr.io/flyteorg/flytecookbook:core-latest test.py my_pipeline --standard_scaler --neighbors '[3,6,9]'
Trigger (Python)
Kubeflow Pipelines v2:
endpoint = '<KFP_UI_URL>'
kfp_client = client.Client(host=endpoint)
run = kfp_client.create_run_from_pipeline_func(
    my_pipeline,
    arguments={
        'min_max_scaler': True,
        'standard_scaler': False,
        'neighbors': [3, 6, 9]
    },
)
url = f'{endpoint}/#/runs/details/{run.run_id}'
print(url)
Flyte:
from flytekit.configuration import Config, ImageConfig
from flytekit.remote import FlyteRemote

from <your-module> import my_pipeline

remote = FlyteRemote(
   config=Config.auto(),
   default_project="flytesnacks",
   default_domain="development",
)

registered_workflow = remote.register_script(
   my_pipeline,
   source_path="../../", # depends on where __init__.py file is present
   module_name="<your-module>",
   image_config=ImageConfig.from_images("ghcr.io/flyteorg/flytecookbook:core-latest"),
)

execution = remote.execute(
   registered_workflow,
   inputs={"standard_scaler": True, "min_max_scaler": False, "neighbors": [3, 6, 9]},
)
print(f"Execution successfully started: {execution.id.name}")
Kubeflow Pipelines v2:
@dsl.component(
   base_image='python:3.7',
   target_image='gcr.io/my-project/my-component:v1',
   packages_to_install=['tensorflow'],
)
def train_model(
   dataset: Input[Dataset],
   model: Output[Model],
   num_epochs: int,
):
   ...
Flyte:
@task(
   container_image="ghcr.io/my-project/my-component:v1"
)
def train_model(
   dataset: pd.DataFrame,
   model: FlyteFile,
   num_epochs: int
):
   ...

Download the Stash case study to find out how Stash cut pipeline compute costs by 67% with Flyte™.