# Owl Job Scheduler
Owl is a job scheduler that lets users submit jobs to a Kubernetes cluster from anywhere.
- Allow external users to run parameterized jobs in a Kubernetes cluster.
- Submit parameterized pipelines and custom jobs to the cluster from anywhere.
- Develop reusable pipelines that can be run at scale with custom parameters.
Authenticate to the Owl Server and submit jobs that run in a remote Kubernetes cluster.
```bash
# Authenticate
owl auth login

# Submit pipeline
owl job submit pipeline.yml
```
Execute pipelines at scale with user-supplied parameters, allocating custom server resources. Parallel and distributed computation is powered by Dask.
The basic idea is that a user can run pipelines (or data analysis recipes) on different data or with different parameters without writing any code.
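For instance, re-running the example pipeline below on ten times more data is a one-line edit to the job file rather than a code change (a sketch; the full file is shown further down):

```yaml
# Only the parameter changes between runs; the pipeline code stays the same
datalen: 1000  # previously 100
```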
Out-of-the-box pipelines include an example pipeline for demonstration purposes, a shell pipeline that executes a command or a script, and a papermill pipeline that runs a parameterized Jupyter notebook.
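For intuition, the papermill pipeline is conceptually a thin wrapper around papermill's notebook execution, roughly like this (the notebook names and parameter are placeholders, not Owl's actual internals):

```python
import papermill as pm

# Execute a copy of a notebook with parameters injected into its
# "parameters" cell, saving the executed copy with all cell outputs.
pm.execute_notebook(
    "analysis.ipynb",         # input notebook (placeholder)
    "analysis-output.ipynb",  # executed copy (placeholder)
    parameters={"datalen": 100},
)
```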
More complex pipelines for image analysis and data processing will be available in the showcase section.
A job is described by a small YAML file that names the pipeline, sets its parameters, and requests cluster resources:

```yaml
version: 1
# Name of the pipeline
name: example
# Pipeline arguments
datalen: 100
# Resources requested
resources:
  threads: 10
  workers: 2
  memory: 10
```
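The resources block controls the Dask workers backing the run. As a rough sketch of what such a request amounts to (assuming memory is in GB and threads are per worker, and using dask.distributed directly, which may differ from Owl's actual Kubernetes provisioning):

```python
from dask.distributed import Client, LocalCluster

# The same resource request expressed directly against Dask
cluster = LocalCluster(n_workers=2, threads_per_worker=10, memory_limit="10GB")
client = Client(cluster)
```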
Develop reusable pipelines that can be run by all users with custom parameters.
Owl pipelines are Python packages that can be installed using pip.
```python
from pathlib import Path

from dask import delayed
from owl_dev import pipeline
from owl_dev.logging import logger

# Minimal task functions, assumed here so the example runs end to end
def inc(x):
    return x + 1

def double(x):
    return 2 * x

def add(a, b):
    return a + b

@pipeline
def main(*, datalen: int, output: Path = None):
    logger.info("Computing...")
    results = []  # collected delayed tasks (distinct from the output path argument)
    for x in range(datalen):
        a = delayed(inc)(x)  # build the task graph lazily; nothing runs yet
        b = delayed(double)(x)
        c = delayed(add)(a, b)
        results.append(c)
    total = delayed(sum)(results)
    return total.compute()  # execute the graph on the Dask cluster
```
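Since a pipeline is an ordinary Python package, publishing it is a matter of installing it where the Owl server can discover it (the package name below is hypothetical):

```bash
pip install owl-pipeline-example  # hypothetical package name
```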