Simplify the orchestration of data and ML workflows with Union’s well-designed architecture, extensible plugin system, and robust features that boost your team’s productivity and adapt to your changing needs and evolving workflows.
Many of these features are available in both the open source project, Flyte™, and the managed solution, Union. However, standing up and managing Flyte™ can be complex and often requires dedicated infrastructure specialists to maintain the cluster and associated resources. Union is a managed solution that runs in your cloud environment and provides additional observability, built-in security, integrated monitoring, and authorization.
Resource Management & Security
This category includes features that facilitate effective resource management, organization, and collaboration, allowing for seamless teamwork and efficient use of available resources.
Union is designed with a focus on data ownership and control, as it is deployed into your AWS or GCP cloud account. Union uses an operator that manages your infrastructure in the data plane. Your code, data sets, and secrets reside in the data plane, and Union does not have direct access to them. This architecture ensures that customers have full control over their data and resources, maintaining privacy, security, and compliance with organizational policies.
RBAC is a crucial feature for managing security and access within your organization. It allows you to assign permissions and control access to resources based on users' roles, ensuring that each team member has the appropriate level of access to perform their tasks effectively while maintaining data privacy and security.
Union’s multi-tenancy feature allows multiple users to share the same platform while maintaining their own unique data and configurations. This centralized infrastructure enables effective resource management and organization, while also facilitating seamless team collaboration within your organization.
Set a specified cadence for your workflows by scheduling them to run at regular intervals. This ensures that your workflows execute automatically at the desired frequency, minimizing manual intervention and optimizing efficiency.
Monitoring & Visualization
Features in this category enable comprehensive monitoring and visualization of your workflows, allowing you to stay informed about their state and performance.
Data lineage provides an essential means to trace the origin of errors within your workflows. By monitoring the data's journey and transformations across the entire lifecycle of your workflows, it becomes easier to identify the source of any issues. This efficient approach to debugging and troubleshooting saves time and resources, enabling rapid resolution of problems as they occur.
Visualize your data, monitor your models, and view training history through plots at every step of your workflow.
Experience enhanced task-level monitoring with Union, empowering entire teams with valuable insights to optimize their workflows. By reducing task execution time and providing system feedback comparable to a local environment, Union enables a more efficient and streamlined experience. Benefit from detailed, individualized data rather than just aggregated information, allowing for more precise analysis and informed decision-making across your organization.
Visualization makes data easier to comprehend. Union provides first-class support for rendering data plots.
Performance & Accuracy
Features in this category help improve the performance of your workflows by leveraging GPU processing, enabling parallelism, and optimizing resource allocation.
Strongly typed interfaces
Ensure the integrity of your data throughout your workflow by establishing data guardrails. This will prevent any data errors from slipping through the cracks, while also allowing your workflow to remain informed of how the data evolves at each step.
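The idea of a data guardrail at a task boundary can be sketched in plain Python. This is a minimal illustration, not Union's or Flyte's implementation: the `typed_task` decorator and the `scale` task are hypothetical names, and the check only handles plain classes such as `int`, `float`, and `str`.

```python
import functools
from typing import Any, Callable, get_type_hints


def typed_task(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical decorator: check keyword inputs and the return value
    against the function's type annotations (plain classes only)."""
    hints = get_type_hints(fn)

    @functools.wraps(fn)
    def wrapper(**kwargs: Any) -> Any:
        # Reject any input whose runtime type does not match its annotation.
        for name, value in kwargs.items():
            expected = hints[name]
            if not isinstance(value, expected):
                raise TypeError(
                    f"{fn.__name__}: input '{name}' expected "
                    f"{expected.__name__}, got {type(value).__name__}"
                )
        result = fn(**kwargs)
        # Apply the same check to the output before passing it downstream.
        if not isinstance(result, hints["return"]):
            raise TypeError(
                f"{fn.__name__}: output expected {hints['return'].__name__}, "
                f"got {type(result).__name__}"
            )
        return result

    return wrapper


@typed_task
def scale(x: float, factor: float) -> float:
    return x * factor
```

Calling `scale(x=2.0, factor=1.5)` passes both checks, while `scale(x="2", factor=1.5)` raises a `TypeError` before the task body runs, so the bad value never reaches later steps.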
Schedule your tasks to run on GPUs. Leverage the power of GPU processing, providing faster execution times and enhanced performance for ML and data-intensive workloads.
Tasks that do not depend on each other's outputs run in parallel by default, which optimizes resource consumption and improves performance, so you don't have to do anything special to enable parallelism.
Signaling allows manual actions to influence the course of a workflow. A human can intercept a running workflow and either redirect it or approve its tasks. This is helpful for labeling, supervised learning, and data curation.
These features focus on optimizing the efficiency of your workflows by providing tools to minimize resource wastage, reduce execution times, and streamline the debugging process.
Task boundaries provide natural checkpoints for your workflow, but in certain scenarios, such as training a model, they can be expensive. Training can be both time-consuming and resource-intensive, making it critical to ensure that progress is regularly saved. Intra-task checkpoints provide a solution by allowing you to checkpoint progress within a task execution, minimizing resource waste and optimizing performance.
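The mechanism behind an intra-task checkpoint can be sketched with the standard library alone. This is an illustrative sketch, not Union's checkpoint API: the `train` function, the JSON file layout, and the `fail_at` parameter (used only to simulate an interruption) are assumptions made for the example.

```python
import json
from pathlib import Path
from typing import Optional


def train(total_epochs: int, ckpt_path: Path, fail_at: Optional[int] = None) -> int:
    """Run a training loop, saving progress after every epoch.

    Returns the number of epochs executed in this call."""
    # Resume from the last saved epoch if a checkpoint exists.
    start = 0
    if ckpt_path.exists():
        start = json.loads(ckpt_path.read_text())["epoch"]

    epochs_run = 0
    for epoch in range(start, total_epochs):
        if fail_at is not None and epoch == fail_at:
            raise RuntimeError("simulated interruption")
        # ... one epoch of real training work would go here ...
        epochs_run += 1
        # Persist progress so a retry does not repeat finished epochs.
        ckpt_path.write_text(json.dumps({"epoch": epoch + 1}))
    return epochs_run
```

If a five-epoch run is interrupted at epoch 3, the checkpoint records three finished epochs, and a second call with the same `ckpt_path` executes only the remaining two.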
Recover from failures
Debugging can be a costly and time-consuming process, especially when it involves rerunning previously successful tasks. Optimize your workflow's efficiency by utilizing the recoverability feature, which allows you to selectively rerun only the failed tasks and conserve both time and resources.
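The recovery idea can be sketched as a runner that remembers the outputs of tasks that already succeeded. This is a simplified model for illustration, not how Union implements recovery: `run_workflow` and the `completed` dictionary are hypothetical, and tasks are modeled as zero-argument callables run in sequence.

```python
from typing import Any, Callable, Dict, List, Tuple


def run_workflow(
    tasks: List[Tuple[str, Callable[[], Any]]],
    completed: Dict[str, Any],
) -> List[str]:
    """Run tasks in order, skipping any whose output is already recorded.

    Returns the names of the tasks executed in this call."""
    executed: List[str] = []
    for name, fn in tasks:
        if name in completed:
            continue  # succeeded in an earlier attempt; reuse its output
        # If fn raises, the exception propagates and the outputs recorded
        # so far stay in `completed` for the next attempt.
        completed[name] = fn()
        executed.append(name)
    return executed
```

Passing the same `completed` dictionary to a second call after a failure reruns only the failed task and the tasks after it.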
Rerun a single task
Efficiently debug issues within your workflow by rerunning it at the most granular level, without altering the previous state of any data or ML workflows. This allows you to quickly pinpoint and address issues.
Optimize your workflow's execution time by caching task outputs. When the task signature remains unchanged, the cache skips the need to rerun any long-running executions, preventing unnecessary resource wastage and significantly speeding up your workflow's execution.
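Caching keyed on a task signature can be sketched as follows. This is a minimal in-memory model, not Union's cache: the `cached_task` decorator, the `version` argument, and the choice of task name plus version plus inputs as the signature are assumptions for illustration, and inputs must be JSON-serializable.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List

_cache: Dict[str, Any] = {}
train_calls: List[int] = []  # records every real execution of `train`


def cached_task(version: str) -> Callable:
    """Hypothetical decorator: reuse an output when the task name,
    version, and inputs are all unchanged."""

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        def wrapper(**kwargs: Any) -> Any:
            signature = json.dumps(
                {"name": fn.__name__, "version": version, "inputs": kwargs},
                sort_keys=True,
            )
            key = hashlib.sha256(signature.encode()).hexdigest()
            if key not in _cache:
                _cache[key] = fn(**kwargs)  # cache miss: run the task
            return _cache[key]

        return wrapper

    return decorator


@cached_task(version="1")
def train(epochs: int) -> float:
    train_calls.append(epochs)
    return epochs * 0.5
```

Calling `train(epochs=4)` twice runs the function body once; changing the input or the version string produces a new key and a fresh execution.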
Immutable executions are critical for ensuring reproducibility, as they prevent any changes to the state of execution. This provides the flexibility to completely restructure a data or ML workflow between versions without fear of any negative impact on production. With immutable executions, you can confidently iterate and experiment while maintaining the integrity of your data and workflow.
Spot or preemptible instances
Take advantage of spot instances with ease by scheduling your workflows to run on them and significantly reduce your costs. Effortlessly optimize your workflow's efficiency while minimizing expenses.
Keep runaway tasks under control by setting timeouts. A timeout specifies the maximum amount of time a task may run; a task that exceeds it is stopped instead of running indefinitely, so every task either completes or fails within the specified timeframe.
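The behavior of a timeout can be sketched with the standard library by running the work in a child process and stopping it when the limit is reached. This is an illustration of the concept only, not Union's mechanism: `run_with_timeout` is a hypothetical helper, and the task is modeled as a snippet of Python source.

```python
import subprocess
import sys


def run_with_timeout(code: str, timeout_s: float) -> str:
    """Run a Python snippet in a child process with a time limit."""
    try:
        # subprocess.run kills the child process if the timeout expires,
        # then raises TimeoutExpired.
        subprocess.run([sys.executable, "-c", code], timeout=timeout_s, check=True)
        return "succeeded"
    except subprocess.TimeoutExpired:
        return "timed out"
```

A quick snippet such as `run_with_timeout("pass", 30)` finishes normally, while a snippet that sleeps longer than its limit is terminated and reported as timed out.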
Dynamic resource allocation
Dynamic resource allocation is a key feature in optimizing workflow efficiency. With Union, resources required for a task can be adjusted on the fly based on user-provided inputs or real-time calculations. This adaptability ensures that your tasks have the necessary resources to run efficiently, enhancing overall performance and resource utilization.
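One way to picture this is a function that derives a resource request from the size of the input. The sketch below is purely illustrative: `estimate_resources`, the per-row size, the overhead factor, and the CPU rule are all made-up assumptions, not values or APIs used by Union.

```python
import math
from typing import Dict


def estimate_resources(
    num_rows: int, bytes_per_row: int = 1024, overhead: float = 2.0
) -> Dict[str, str]:
    """Derive a CPU and memory request from the input size.

    All constants here are hypothetical and chosen for illustration."""
    needed_bytes = num_rows * bytes_per_row * overhead
    # Round up to whole GiB, with a floor of 1 GiB.
    mem_gib = max(1, math.ceil(needed_bytes / 2**30))
    # One CPU per 4 GiB of memory, clamped to the range 1 to 8.
    cpu = min(8, max(1, mem_gib // 4))
    return {"cpu": str(cpu), "mem": f"{mem_gib}Gi"}
```

With these assumed constants, a 1,000-row input yields the minimum request, while a 10,000,000-row input yields a larger one, so the same task definition can be sized per execution.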
Stay up-to-date on your workflow's state by configuring notifications through popular platforms such as Slack, PagerDuty, or email. Receive valuable real-time updates and alerts to quickly address any issues that may arise and maintain control over your workflow's performance.
This category emphasizes flexibility, supporting various programming languages, allowing multiple users to work on the same platform, and isolating dependencies for seamless integration.
Compose your workflows in the language that best suits your team's expertise, with support for both SDK and raw containers. This flexibility allows you to write workflows in any programming language you prefer.
Data and ML practitioners require the ability to experiment and iterate without disrupting their workflow. With versioning, they can work in isolation and reproduce their results, as well as roll back to a previous version of their workflow at any time. This provides the necessary flexibility to experiment and iterate with confidence.
Dependency isolation via containers
Different tasks within a workflow may have varying resource requirements and library dependencies. Without careful management, this can cause conflicts that can negatively impact your workflow's performance. Union lets you maintain separate sets of dependencies for your tasks, ensuring that no conflicts arise and that your workflow runs smoothly and efficiently.
Union’s multi-cloud support lets users work across multiple cloud providers (currently AWS and GCP, with more to come), as well as on-premises infrastructure on request. This flexibility allows organizations to choose the cloud solution that best suits their specific requirements and preferences, ensuring a streamlined, adaptable, and efficient workflow experience regardless of the underlying infrastructure.