Jupyter Notebook Tasks
Once you have a Union account, install union:
pip install union
Export the following environment variable to build and push images to your own container registry:
# replace with your registry name
export IMAGE_SPEC_REGISTRY="<your-container-registry>"
Then run the following commands to run the workflow:
$ git clone https://github.com/unionai/unionai-examples
$ cd unionai-examples
$ union run --remote <path/to/file.py> <workflow_name> <params>
The source code for this example can be found here.
import math
import pathlib
from flytekit import kwtypes, task, workflow
from flytekitplugins.papermill import NotebookTask
How to specify inputs and outputs
- After you are satisfied with the notebook, ensure that the first cell contains only the input variables for the notebook. Then add the tag parameters to the first cell.
- In the cell that produces the outputs, typically the last cell of the notebook (although it does not need to be the last cell), add the tag outputs.
- In a Python file, create a new task at the module level. An example task is shown below:
nb = NotebookTask(
    name="simple-nb",
    notebook_path=str(pathlib.Path(__file__).parent.absolute() / "nb_simple.ipynb"),
    render_deck=True,
    enable_deck=True,
    inputs=kwtypes(v=float),
    outputs=kwtypes(square=float),
)
- Note the notebook_path. This is the absolute path to the actual notebook.
- Note the inputs and outputs. The variable names match the variable names in the Jupyter notebook.
- You can see the notebook on Flyte deck if render_deck is set to true.
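As a rough illustration of the tagging steps above: cell tags are stored in each cell's metadata inside the .ipynb file. The sketch below builds a minimal notebook structure using only the standard library. The cell contents, the helper name make_cell, and the use of record_outputs from flytekitplugins.papermill in the outputs cell are assumptions about what a notebook such as nb_simple.ipynb might contain, not a copy of the actual file.

```python
import json


def make_cell(source: str, tags: list[str]) -> dict:
    """Build one nbformat-4 code cell; tags are stored under metadata.tags."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {"tags": tags},
        "outputs": [],
        "source": source.splitlines(keepends=True),
    }


# Hypothetical notebook contents (assumed, not taken from the example repository).
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        # First cell: only the input variables, tagged "parameters".
        make_cell("v = 3.14\n", ["parameters"]),
        # Ordinary untagged cell holding the computation.
        make_cell("square = v * v\n", []),
        # Cell tagged "outputs"; record_outputs is assumed to be the plugin
        # helper that reports the named values back to the task.
        make_cell(
            "from flytekitplugins.papermill import record_outputs\n"
            "record_outputs(square=square)\n",
            ["outputs"],
        ),
    ],
}

# Serialize the structure the way it would appear in an .ipynb file.
notebook_json = json.dumps(notebook, indent=1)

# Map each tag to the index of the cell that carries it.
tagged = {
    tag: index
    for index, cell in enumerate(notebook["cells"])
    for tag in cell["metadata"]["tags"]
}
print(tagged)
```

In practice you would normally add these tags through the Jupyter interface rather than by editing the JSON. The sketch only shows where the tags end up and why the variable names v and square must match the names declared in inputs and outputs.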
Other tasks
You can declare other tasks and use them together with notebook tasks. The example below declares a task that accepts the squared value from the notebook and returns its square root:
@task
def square_root_task(f: float) -> float:
    return math.sqrt(f)
Now treat the notebook task as a regular task:
@workflow
def nb_to_python_wf(f: float = 3.1415926535) -> float:
    out = nb(v=f)
    return square_root_task(f=out.square)
You can also execute the workflow locally:
if __name__ == "__main__":
    print(nb_to_python_wf(f=3.14))
Why Are There 3 Outputs?
When you execute the workflow, the notebook task produces 3 outputs instead of the expected one, because it generates 2 implicit outputs in addition to the declared one.
The implicit outputs are the captured executed notebook and a rendered HTML version of the executed notebook. In this case they are called nb-simple-out.ipynb and nb-simple-out.html, respectively.
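To make the three outputs concrete, here is a plain-Python stand-in for the notebook task that does not use flytekit at all. It assumes the notebook computes square = v * v. The helper fake_nb, the class NotebookOutputs, and the field names out_nb and out_rendered_nb are illustrative assumptions; the file names are the ones mentioned above.

```python
import math
from typing import NamedTuple


class NotebookOutputs(NamedTuple):
    square: float         # the declared output
    out_nb: str           # implicit: executed notebook (field name assumed)
    out_rendered_nb: str  # implicit: rendered HTML (field name assumed)


def fake_nb(v: float) -> NotebookOutputs:
    """Stand-in for the notebook task; assumes the notebook computes v * v."""
    return NotebookOutputs(
        square=v * v,
        out_nb="nb-simple-out.ipynb",
        out_rendered_nb="nb-simple-out.html",
    )


def square_root(f: float) -> float:
    """Plain-Python equivalent of square_root_task."""
    return math.sqrt(f)


out = fake_nb(v=3.14)
# Only the declared output is forwarded to the next step.
result = square_root(out.square)
print(len(out), result)
```

Only the declared square value flows into the next task, which is why the workflow still returns a single float even though the notebook task itself reports three outputs.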