AutoSec researcher agent

Code available here.

This tutorial demonstrates an autonomous security-research agent on Flyte. The pipeline fans out across bundled C source files (each with a planted memory-corruption bug), runs static analysis, uses a flyte.ai.agents.Agent to hypothesize vulnerabilities, builds proof-of-concept payloads, and validates exploits inside an on-device unionai-sandbox user-namespace session.

Flyte provides:

Parallel fan-out across every target file with asyncio.gather
Self-healing tasks: LLM timeouts, malformed JSON, and OOM during static analysis trigger retries with bounded resources
Sandbox isolation: PoC compilation and execution never run on the orchestration node
Live HTML reports with per-target detail tabs in the Flyte UI
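
The self-healing item above can be illustrated with a minimal sketch. It uses only the Python standard library and is not the retry logic from main.py, which is not shown on this page. The names call_with_retries and make_flaky_llm are hypothetical, as are the attempt limit and timeout values.

```python
import asyncio
import json


async def call_with_retries(make_call, max_attempts=3, timeout_s=1.0):
    """Retry an async call on timeout or malformed JSON (hypothetical helper)."""
    last_error = None
    for _ in range(max_attempts):
        try:
            # Bound each attempt so a hung call cannot stall the pipeline.
            raw = await asyncio.wait_for(make_call(), timeout=timeout_s)
            # Malformed JSON raises JSONDecodeError and triggers another attempt.
            return json.loads(raw)
        except (asyncio.TimeoutError, json.JSONDecodeError) as exc:
            last_error = exc
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error


def make_flaky_llm(responses):
    """Build a fake LLM call that returns the given canned responses in order."""
    remaining = iter(responses)

    async def fake_llm():
        return next(remaining)

    return fake_llm


# The first response is malformed, the second parses cleanly.
result = asyncio.run(
    call_with_retries(make_flaky_llm(["not json", '{"bug": "stack overflow"}']))
)
print(result)
```

Capping the number of attempts keeps a persistently failing call from retrying forever; once the cap is reached, the last error is surfaced to the caller.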

This example analyzes deliberately vulnerable C code and runs generated exploit payloads in a sandbox. Use it only in controlled environments.

Define the task environment

The agent needs an Anthropic API key and a container image with gcc for sandbox compilation.

main.py

main_img = flyte.Image.from_uv_script(__file__, name="autosec-research-agent", pre=True).with_apt_packages("gcc")

env = flyte.TaskEnvironment(
    name="autosec-research-agent",
    image=main_img,
    resources=flyte.Resources(cpu=1, memory="1Gi"),
    include=[str(TARGETS_DIR)],
    secrets=[
        flyte.Secret(key="internal-anthropic-api-key", as_env_var="ANTHROPIC_API_KEY"),
    ],
)

The Python dependencies are declared at the top of the file as inline script metadata, the format that uv reads when running a script:

# /// script
# requires-python = ">=3.12"
# dependencies = [
#    "flyte>=2.4.0",
#    "unionai-sandbox",
#    "litellm",
# ]
# ///

Run the security pipeline

Each target flows through four stages: static scan, LLM hypothesis, PoC construction, and sandbox validation. The run_autosec_agent driver task analyzes all bundled targets in parallel and streams a findings report.
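
To make the four stages concrete, here is a minimal sketch with plain-Python stand-ins. It does not call an LLM, compile anything, or use a sandbox. Every function name ending in _stub is hypothetical and is not one of the tutorial's real tasks; only the overall shape (four sequential stages per target, all targets gathered in parallel) follows the description above.

```python
import asyncio


async def static_scan_stub(source: str) -> list[str]:
    # Stand-in for static analysis: flag calls that often lead to memory corruption.
    return [fn for fn in ("strcpy", "gets", "sprintf") if fn + "(" in source]


async def hypothesize_stub(name: str, hits: list[str]) -> dict:
    # Stand-in for the LLM hypothesis step.
    return {"target": name, "suspect_calls": hits}


async def build_poc_stub(hypothesis: dict) -> bytes:
    # Stand-in for PoC construction: an oversized input if anything looked suspect.
    return b"A" * 256 if hypothesis["suspect_calls"] else b""


async def validate_stub(payload: bytes) -> dict:
    # Stand-in for sandbox validation: nothing is executed here.
    return {"triggered": bool(payload)}


async def analyze_stub(name: str, source: str) -> dict:
    # The four stages run in sequence for a single target.
    hits = await static_scan_stub(source)
    hypothesis = await hypothesize_stub(name, hits)
    payload = await build_poc_stub(hypothesis)
    verdict = await validate_stub(payload)
    return {"target": name, "hypothesis": hypothesis, "verdict": verdict}


async def run_all(targets: dict[str, str]) -> list[dict]:
    # Fan out: one pipeline per target, awaited together; results keep input order.
    return list(await asyncio.gather(*(analyze_stub(n, s) for n, s in targets.items())))


targets = {
    "copy.c": "void f(char *s) { char b[8]; strcpy(b, s); }",
    "safe.c": "int add(int a, int b) { return a + b; }",
}
findings = asyncio.run(run_all(targets))
triggered = sum(1 for f in findings if f["verdict"].get("triggered"))
print(triggered, "of", len(findings), "targets triggered")
```

Because asyncio.gather returns results in the order the awaitables were passed, the findings list lines up with the iteration order of the targets dictionary.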

main.py

@env.task(report=True)
async def run_autosec_agent() -> dict:
    targets = _load_targets()
    if not targets:
        raise FileNotFoundError(f"no targets found under {TARGETS_DIR}")

    findings = list(await asyncio.gather(*(analyze_target(name, src) for name, src in targets.items())))

    await flyte.report.replace.aio(_render_report_html(findings))
    flyte.report.get_tab("targets").replace(_render_targets_tab_html(findings, targets))
    await flyte.report.flush.aio()

    await random_error()

    return {
        "targets_analyzed": len(findings),
        "triggered": sum(1 for f in findings if f["verdict"].get("triggered")),
        "findings": findings,
    }

Run the agent

Create secrets

Get an Anthropic API key from the Anthropic console and register it as a Flyte secret:

flyte create secret internal-anthropic-api-key <YOUR_ANTHROPIC_API_KEY>

See Secrets for scoping and file-based secrets.

Run remotely

From the example directory:

cd v2/tutorials/autosec_research_agent
uv run --script main.py

Follow the printed run URL to watch each target progress through the pipeline and open the report panel for the findings table and per-target detail tabs.

To see the self-healing behavior in action, set one of the optional environment variables AUTOSEC_FORCE_LLM_TIMEOUT, AUTOSEC_FORCE_BAD_TOOL_CALL, or AUTOSEC_FORCE_OOM, or set AUTOSEC_FORCE_ALL=1.
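
How main.py reads these variables is not shown on this page. The sketch below is one plausible way to interpret them, under two assumptions: that the values "1", "true", and "yes" count as enabled, and that AUTOSEC_FORCE_ALL turns on all three failure modes (inferred from its name). The function forced_failures is hypothetical.

```python
import os

# Variable names taken from the tutorial text.
FLAGS = (
    "AUTOSEC_FORCE_LLM_TIMEOUT",
    "AUTOSEC_FORCE_BAD_TOOL_CALL",
    "AUTOSEC_FORCE_OOM",
)


def forced_failures(environ=None) -> set[str]:
    """Return the set of failure modes requested via environment variables."""
    env = os.environ if environ is None else environ

    def is_on(name: str) -> bool:
        # Assumption: these truthy spellings enable a flag; anything else is off.
        return env.get(name, "").strip().lower() in {"1", "true", "yes"}

    # Assumption: the catch-all variable enables every individual flag.
    if is_on("AUTOSEC_FORCE_ALL"):
        return set(FLAGS)
    return {name for name in FLAGS if is_on(name)}


print(sorted(forced_failures({"AUTOSEC_FORCE_OOM": "1"})))
```

Accepting an explicit mapping instead of always reading os.environ makes the helper easy to exercise without changing the process environment.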