Integrations

Flyte is designed to be highly extensible and can be customized in multiple ways.

Want to contribute an integration example? Check out the contribution guide.

Connectors

Connectors are long-running, stateless services that receive execution requests via gRPC and initiate jobs with appropriate external or internal services. Each connector service is a Kubernetes deployment that receives gRPC requests from FlytePropeller when users trigger a particular type of task. (For example, the BigQuery connector handles BigQuery tasks.) The connector service then initiates a job with the appropriate service.

Connector Description
AWS SageMaker Inference connector Deploy models and create and trigger inference endpoints on AWS SageMaker.
Airflow connector Run Airflow jobs in your workflows with the Airflow connector.
BigQuery connector Run BigQuery jobs in your workflows with the BigQuery connector.
ChatGPT connector Run ChatGPT jobs in your workflows with the ChatGPT connector.
Databricks connector Run Databricks jobs in your workflows with the Databricks connector.
Memory Machine Cloud connector Execute tasks using the MemVerge Memory Machine Cloud connector.
OpenAI Batch connector Submit requests for asynchronous batch processing on OpenAI.
PERIAN Job Platform connector Execute tasks on PERIAN Job Platform.
Sensor connector Run sensor jobs in your workflows with the sensor connector.
Slurm connector Run Slurm jobs in your workflows with the Slurm connector.
Snowflake connector Run Snowflake jobs in your workflows with the Snowflake connector.
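
From a user's perspective, a connector-backed task is declared like any other Flytekit task. Here is a minimal sketch using the BigQuery connector; it assumes flytekitplugins-bigquery is installed, and the project ID, table, and query are placeholders:

from flytekit import kwtypes
from flytekitplugins.bigquery import BigQueryConfig, BigQueryTask

# Declaring the task runs nothing yet. At execution time, FlytePropeller
# sends a gRPC request to the BigQuery connector service, which submits
# the query to BigQuery on the user's behalf.
bigquery_task = BigQueryTask(
    name="sql.bigquery.demo",
    inputs=kwtypes(version=int),
    query_template="SELECT * FROM `my-project.dataset.table` WHERE version = @version;",  # placeholder table
    task_config=BigQueryConfig(ProjectID="my-project"),  # placeholder GCP project
)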

Flytekit plugins

Flytekit plugins are implemented purely in Python, can be unit tested locally, and extend Flytekit's functionality. For comparison, these plugins can be thought of as analogous to Airflow operators.

Plugin Description
Comet comet-ml: Comet’s machine learning platform.
DBT Run and test your dbt pipelines in Flyte.
Dolt Version your SQL database with dolt.
DuckDB Run analytical queries using DuckDB.
Great Expectations Validate data with great_expectations.
Memray memray: Memory profiling with memray.
MLFlow mlflow: the open standard for model tracking.
Modin Scale pandas workflows with modin.
Neptune neptune: Neptune is the MLOps stack component for experiment tracking.
NIM Serve optimized model containers with NIM.
Ollama Serve fine-tuned LLMs with Ollama in a Flyte workflow.
ONNX Convert ML models to ONNX models seamlessly.
Pandera Validate pandas dataframes with pandera.
Papermill Execute Jupyter Notebooks with papermill.
SQL Execute SQL queries as tasks.
Weights and Biases wandb: Machine learning platform to build better models faster.
WhyLogs whylogs: the open standard for data logging.

Using Flytekit plugins

Data is automatically marshalled and unmarshalled in and out of the plugin. Users should mostly implement the flytekit.core.base_task.PythonTask API defined in Flytekit.
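
As a rough sketch of what implementing that API looks like, the custom task below subclasses PythonTask directly. The task type name, interface, and echo logic are invented for illustration; a real plugin would do substantially more in execute:

from typing import Any

from flytekit.core.base_task import PythonTask
from flytekit.core.interface import Interface


class EchoTask(PythonTask):
    """Hypothetical plugin task that returns a fixed message."""

    def __init__(self, name: str, message: str, **kwargs):
        super().__init__(
            task_type="echo",  # invented task type identifier
            name=name,
            task_config=None,
            # Declare the native Python interface; Flytekit marshals these
            # values to and from Flyte literals automatically.
            interface=Interface(inputs={}, outputs={"out": str}),
            **kwargs,
        )
        self._message = message

    def execute(self, **kwargs) -> Any:
        # The plugin's runtime behavior lives here.
        return self._message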

Flytekit plugins are lazily loaded and can be released independently like libraries. The naming convention is flytekitplugins-*, where * indicates the package to be integrated into Flytekit. For example, flytekitplugins-papermill enables users to author Flytekit tasks using Papermill.
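
For instance, with flytekitplugins-papermill installed, a notebook can be wrapped as a regular Flyte task; the notebook path and input/output names below are placeholders:

from flytekit import kwtypes
from flytekitplugins.papermill import NotebookTask

# Runs analysis.ipynb via Papermill, injecting `n` as a notebook parameter
# and reading `result` back out of the executed notebook.
nb_task = NotebookTask(
    name="notebook_demo",
    notebook_path="./analysis.ipynb",  # placeholder notebook
    inputs=kwtypes(n=int),
    outputs=kwtypes(result=float),
)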

You can find the plugins maintained by the core Flyte team here.

Native backend plugins

Native backend plugins can be executed without any external service dependencies because the compute is orchestrated by Flyte itself, within its provisioned Kubernetes clusters.

Plugin Description
Kubeflow PyTorch Run distributed PyTorch training jobs using Kubeflow.
Kubeflow TensorFlow Run distributed TensorFlow training jobs using Kubeflow.
Kubernetes cluster Dask jobs Run Dask jobs on a Kubernetes cluster.
Kubernetes cluster Spark jobs Run Spark jobs on a Kubernetes cluster.
MPI Operator Run distributed deep learning training jobs using Horovod and MPI.
Ray Run Ray jobs on a Kubernetes cluster.
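
As an illustration of a native backend plugin, the Kubernetes Spark plugin turns a task's pod into a Spark driver that spawns executors on the same cluster. This sketch assumes flytekitplugins-spark is installed; the Spark settings are arbitrary examples:

import flytekit
from flytekit import task
from flytekitplugins.spark import Spark

@task(
    task_config=Spark(
        # Arbitrary example settings, passed straight through to Spark.
        spark_conf={
            "spark.driver.memory": "1g",
            "spark.executor.instances": "2",
        }
    )
)
def approx_count() -> int:
    # The plugin provisions a Spark session on the Flyte-managed cluster;
    # no external Spark service is required.
    sess = flytekit.current_context().spark_session
    return sess.sparkContext.parallelize(range(1000)).count()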

External service backend plugins

As the term suggests, these plugins rely on external services to handle the workload defined in the Flyte task that uses the plugin.

Plugin Description
AWS Athena Execute queries using AWS Athena.
AWS Batch Run tasks and workflows on the AWS Batch service.
Flyte Interactive Execute tasks using Flyte Interactive to debug.
Hive Run Hive jobs in your workflows.

Enabling backend plugins

To enable a backend plugin, you must add the ID of the plugin to the enabled-plugins list, which is found under the tasks > task-plugins section of FlytePropeller's configuration. The plugin configuration structure is defined here. An example of the config follows:

tasks:
  task-plugins:
    enabled-plugins:
      - container
      - sidecar
      - k8s-array
    default-for-task-types:
      container: container
      sidecar: sidecar
      container_array: k8s-array

Finding the ID of the backend plugin

To find the ID of a backend plugin, look at the plugin's source code. For example, in the case of Spark, the plugin ID is defined here as spark.
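
Continuing the example above, enabling the Spark plugin and routing Spark tasks to it would look roughly like this (the other entries are unchanged):

tasks:
  task-plugins:
    enabled-plugins:
      - container
      - sidecar
      - k8s-array
      - spark
    default-for-task-types:
      container: container
      sidecar: sidecar
      container_array: k8s-array
      spark: spark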

SDKs for writing tasks and workflows

The Flyte community would love to help you build new SDKs. Currently, the available SDKs are:

SDK Description
flytekit The Python SDK for Flyte.
flytekit-java The Java/Scala SDK for Flyte.

Flyte operators

Flyte can be integrated with other orchestrators, letting you leverage Flyte's constructs natively within those tools.

Operator Description
Airflow Trigger Flyte executions from Airflow.
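
A minimal sketch of such a trigger, assuming the airflow-provider-flyte package is installed and an Airflow connection named flyte_conn points at FlyteAdmin; the project, domain, launch plan, and inputs are placeholders:

from datetime import datetime

from airflow import DAG
from flyte_provider.operators.flyte import FlyteOperator

with DAG(dag_id="flyte_trigger", start_date=datetime(2024, 1, 1), schedule=None) as dag:
    # Kicks off a registered Flyte launch plan via FlyteAdmin.
    trigger = FlyteOperator(
        task_id="trigger_flyte_execution",
        flyte_conn_id="flyte_conn",  # Airflow connection to FlyteAdmin
        project="flytesnacks",  # placeholder project
        domain="development",  # placeholder domain
        launchplan_name="my.workflows.example_wf",  # placeholder launch plan
        inputs={"name": "hello"},  # placeholder inputs
    )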