# Frontier AI

Tutorials for frontier-model pretraining, automated experimentation, and large-scale AI workloads.

### [Distributed LLM pretraining](https://www.union.ai/docs/v2/union/tutorials/frontier-ai/distributed-pretraining/page.md)

Pretrain large language models at scale using PyTorch Lightning and FSDP on H200 GPUs, with streaming data and real-time metrics.

## Subpages

- [Distributed LLM pretraining](https://www.union.ai/docs/v2/union/tutorials/frontier-ai/distributed-pretraining/page.md)
  - Overview
  - Implementation
  - Setting up the environment
  - Declaring resource requirements
  - Model configurations
  - Building the GPT model
  - The Lightning training module
  - Checkpointing for fault tolerance
  - Real-time metrics with Flyte Reports
  - Streaming data at scale
  - Distributed training with FSDP
  - Tying it together
  - Running the pipeline
  - Going further

---
**Source**: https://github.com/unionai/unionai-docs/blob/main/content/tutorials/frontier-ai/_index.md
**HTML**: https://www.union.ai/docs/v2/union/tutorials/frontier-ai/
