This guide assumes you have either a Kubernetes cluster or a local Kubernetes development environment, e.g. Minikube.
It also assumes that kubectl is on your path and properly configured.
Follow this guide on how to set up a local Kubernetes cluster using docker-desktop.
The easiest way to get started is to use our Helm Charts to deploy YuniKorn on an existing Kubernetes cluster. Helm 3 or later is recommended.
The Helm chart first creates a configmap that stores the YuniKorn configuration, and then deploys the YuniKorn scheduler and web UI containers in a pod as a deployment in the default namespace. If you want to deploy YuniKorn to another namespace, you can do the following:
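A minimal sketch of a namespace install, assuming Helm 3. The chart repository URL, the chart name yunikorn/yunikorn and the namespace name yunikorn are assumptions that do not come from this guide, so check them against the release you are deploying.

```shell
# Assumed values: adjust to the release you are deploying.
YK_NAMESPACE="yunikorn"                                   # hypothetical namespace name
YK_REPO_URL="https://apache.github.io/yunikorn-release"   # assumed chart repository

# Register the chart repository, refresh the index, create the namespace,
# then install the chart into that namespace.
install_yunikorn() {
  helm repo add yunikorn "$YK_REPO_URL" &&
    helm repo update &&
    kubectl create namespace "$YK_NAMESPACE" &&
    helm install yunikorn yunikorn/yunikorn --namespace "$YK_NAMESPACE"
}

# Only run when both tools are available.
if command -v helm >/dev/null 2>&1 && command -v kubectl >/dev/null 2>&1; then
  install_yunikorn || echo "install failed; check cluster access and the repository URL"
else
  echo "helm or kubectl not found; nothing installed"
fi
```

Once the install finishes, kubectl get pods --namespace yunikorn should list the scheduler pod.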
If you don't want to use helm charts, you can find our step-by-step tutorial here.
Run workloads with YuniKorn Scheduler
Unlike the default Kubernetes scheduler, YuniKorn has a notion of an application in order to better support batch workloads.
There are a few ways to run batch workloads with the YuniKorn scheduler:
- Add the labels applicationId and queue in the pod's spec.
- Pods that have the same applicationId will be considered tasks of one application.
Here is an example of the entry to add:
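A minimal sketch of that entry, written out as a metadata fragment that can be pasted into a pod spec. The values my-app-0001 and root.sandbox are placeholders chosen for illustration; the queue must exist in your YuniKorn configuration.

```shell
# Write the label stanza to a scratch file.
# Both values are placeholders, not names defined by this guide.
workdir="$(mktemp -d)"
cat > "$workdir/labels-fragment.yaml" <<'EOF'
metadata:
  labels:
    applicationId: "my-app-0001"   # pods sharing this value form one application
    queue: "root.sandbox"          # assumed queue; must exist in the configuration
EOF

# Show the fragment.
cat "$workdir/labels-fragment.yaml"
```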
All examples provided in the next section have the labels already set. The queue name must be a known queue name from the configuration.
Unknown queue names will cause the pod to be rejected by the YuniKorn scheduler.
Running simple sample applications
All sample deployments can be found under
The list of all examples is in the README.
Not all examples are given here. Further details can be found in that README.
A single pod based on a standard nginx image:
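The repository file is not reproduced here. As a stand-in, the sketch below writes a hypothetical single-pod manifest and submits it. The pod name, the label values and the scheduler name yunikorn are assumptions made for illustration.

```shell
workdir="$(mktemp -d)"
manifest="$workdir/nginx-pod.yaml"

# Hypothetical manifest: one nginx pod carrying the YuniKorn labels.
cat > "$manifest" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: nginx-yunikorn              # hypothetical pod name
  labels:
    app: nginx
    applicationId: "nginx-0001"     # placeholder application ID
    queue: "root.sandbox"           # assumed queue name
spec:
  schedulerName: yunikorn           # assumed scheduler name
  containers:
    - name: nginx
      image: nginx:latest
      ports:
        - containerPort: 80
EOF

# Submit the pod when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  kubectl create -f "$manifest" || echo "kubectl create failed; is the cluster reachable?"
else
  echo "kubectl not found; manifest written to $manifest"
fi
```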
A simple sleep job example:
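One way to sketch a sleep workload is a Kubernetes Job whose pods all carry the same applicationId, so that they are treated as tasks of one application. This is an illustration under assumptions (job name, label values, image and the scheduler name yunikorn), not the file shipped with the examples.

```shell
workdir="$(mktemp -d)"
manifest="$workdir/sleep-job.yaml"

# Hypothetical Job: three pods that each sleep for 30 seconds.
cat > "$manifest" <<'EOF'
apiVersion: batch/v1
kind: Job
metadata:
  name: sleep-job                   # hypothetical job name
spec:
  parallelism: 3
  completions: 3
  template:
    metadata:
      labels:
        applicationId: "sleep-0001" # shared by all three pods
        queue: "root.sandbox"       # assumed queue name
    spec:
      schedulerName: yunikorn       # assumed scheduler name
      restartPolicy: Never
      containers:
        - name: sleep
          image: alpine:latest
          command: ["sleep", "30"]
EOF

# Submit the job when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  kubectl create -f "$manifest" || echo "kubectl create failed; is the cluster reachable?"
else
  echo "kubectl not found; manifest written to $manifest"
fi
```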
Running a spark application
Kubernetes support for Apache Spark is not part of all releases. You must have a current release of Apache Spark with Kubernetes support built in.
The examples/spark directory contains pod template files for the Apache Spark driver and executor; they can be used if you want to run Spark on K8s using this scheduler.
- Get the latest Spark from GitHub (only the latest code supports specifying a pod template): https://github.com/apache/spark
- Build Spark with Kubernetes support:
- Run spark-submit
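The build and submit steps above can be sketched as two shell functions. The Spark checkout location, the API server address, the container image and the application jar are all hypothetical, and the driver template name examples/spark/driver.yaml is an assumption made by analogy with the executor template. The Maven profile and the spark.kubernetes.*.podTemplateFile settings are the ones documented by Apache Spark for Kubernetes.

```shell
# Assumed locations; none of these values come from this guide.
SPARK_HOME="${SPARK_HOME:-$HOME/spark}"                    # checkout of https://github.com/apache/spark
K8S_MASTER="${K8S_MASTER:-k8s://https://localhost:6443}"   # hypothetical API server address
SPARK_IMAGE="${SPARK_IMAGE:-example/spark:latest}"         # hypothetical image built from the checkout

# Build Spark with the Kubernetes profile enabled.
build_spark() {
  (cd "$SPARK_HOME" && ./build/mvn -Pkubernetes -DskipTests clean package)
}

# Submit the SparkPi example in cluster mode, pointing the driver and the
# executors at the pod templates. $1 is the application jar.
submit_spark_pi() {
  "$SPARK_HOME/bin/spark-submit" \
    --master "$K8S_MASTER" \
    --deploy-mode cluster \
    --name spark-pi \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.executor.instances=1 \
    --conf spark.kubernetes.container.image="$SPARK_IMAGE" \
    --conf spark.kubernetes.driver.podTemplateFile=examples/spark/driver.yaml \
    --conf spark.kubernetes.executor.podTemplateFile=examples/spark/executor.yaml \
    "$1"
}

# Usage (not executed here; the jar path is hypothetical):
#   build_spark
#   submit_spark_pi local:///opt/spark/examples/jars/spark-examples.jar
```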
Spark uses its own version of the application ID tag called spark-app-id. This tag is required for the pods to be recognised as Spark pods.
- examples/spark/executor.yaml
When you run Spark on Kubernetes with pod templates, spark-app-id is considered the applicationId. A script to run the Spark application and the yaml files are in the spark section of the README.
Running a simple Tensorflow job
There is an example of a Tensorflow job. You must install tf-operator first, which you can do by applying all the yaml files from the two locations below:
- CRD: https://github.com/kubeflow/manifests/tree/master/tf-training/tf-job-crds/base
- Deployment: https://github.com/kubeflow/manifests/tree/master/tf-training/tf-job-operator/base
Alternatively, you can install Kubeflow, which installs tf-operator for you: https://www.kubeflow.org/docs/started/getting-started/
A simple Tensorflow job example:
You need to build the image that is used in the example yaml.
The file for this example can be found in the README Tensorflow job section.
The scheduler supports affinity and anti affinity scheduling on Kubernetes using predicates:
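A minimal sketch of an anti affinity deployment, assuming two replicas that must not share a node. The deployment name, the label values and the scheduler name yunikorn are hypothetical; the affinity stanza itself is standard Kubernetes pod anti affinity.

```shell
workdir="$(mktemp -d)"
manifest="$workdir/anti-affinity.yaml"

# Hypothetical Deployment: two replicas that repel each other per node.
cat > "$manifest" <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: anti-affinity-demo                  # hypothetical name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: anti-affinity-demo
  template:
    metadata:
      labels:
        app: anti-affinity-demo
        applicationId: "anti-affinity-0001" # placeholder application ID
        queue: "root.sandbox"               # assumed queue name
    spec:
      schedulerName: yunikorn               # assumed scheduler name
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values: ["anti-affinity-demo"]
              topologyKey: kubernetes.io/hostname   # at most one such pod per node
      containers:
        - name: sleep
          image: alpine:latest
          command: ["sleep", "3600"]
EOF

# Submit the deployment when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  kubectl create -f "$manifest" || echo "kubectl create failed; is the cluster reachable?"
else
  echo "kubectl not found; manifest written to $manifest"
fi
```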
This deployment ensures that 2 pods cannot be co-located on the same node. If this yaml is deployed on a 1 node cluster, expect 1 pod to be started while the other pod stays in a pending state. More examples of affinity and anti affinity scheduling are in the predicates section of the README.
There are three examples with volumes available. The NFS example does not work on docker desktop and requires Minikube. The EBS volume requires a Kubernetes cluster running on AWS (EKS). Further instructions for the volume examples are in the Volumes section of the README.
CAUTION: All examples will generate an unending stream of data in a file called dates.txt on the mounted volume. This could cause a disk to fill up, so execution time should be limited.
- create the local volume and volume claim
- create the pod that uses the volume
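The two steps above can be sketched as follows. This is an illustration under assumptions rather than the repository example: it uses a hostPath-backed persistent volume as a stand-in for the local volume, all names are hypothetical, and the writer loop is bounded to 60 lines so that dates.txt cannot fill the disk.

```shell
workdir="$(mktemp -d)"

# Step 1: the volume and the claim that binds to it (hypothetical names).
cat > "$workdir/local-volume.yaml" <<'EOF'
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-demo
spec:
  capacity:
    storage: 1Gi
  accessModes: ["ReadWriteOnce"]
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-demo          # only needs to match the claim
  hostPath:
    path: /tmp/yunikorn-local-demo      # assumed directory on the node
    type: DirectoryOrCreate
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-pvc-demo
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: local-demo
  resources:
    requests:
      storage: 1Gi
EOF

# Step 2: the pod that mounts the claim and writes a bounded dates.txt.
cat > "$workdir/local-pod.yaml" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: local-volume-demo
  labels:
    applicationId: "local-volume-0001"  # placeholder application ID
    queue: "root.sandbox"               # assumed queue name
spec:
  schedulerName: yunikorn               # assumed scheduler name
  restartPolicy: Never
  containers:
    - name: writer
      image: alpine:latest
      command: ["sh", "-c", "for i in $(seq 1 60); do date >> /data/dates.txt; sleep 1; done"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: local-pvc-demo
EOF

# Create the volume and claim first, then the pod.
if command -v kubectl >/dev/null 2>&1; then
  kubectl create -f "$workdir/local-volume.yaml" &&
    kubectl create -f "$workdir/local-pod.yaml" ||
    echo "kubectl create failed; is the cluster reachable?"
else
  echo "kubectl not found; manifests written to $workdir"
fi
```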
- create the NFS server
- get the IP address for the NFS server and update the pod yaml by replacing the existing example IP with the one returned:
- create the pod that uses the volume
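The IP replacement step above can be sketched with a small helper. The Service name nfs-server, the file name and both IP addresses are assumptions made for illustration; the stand-in yaml only contains the volume stanza that needs to change.

```shell
workdir="$(mktemp -d)"
pod_yaml="$workdir/nfs-pod.yaml"

# Stand-in for the pod yaml: only the nfs volume stanza matters here.
cat > "$pod_yaml" <<'EOF'
volumes:
  - name: nfs
    nfs:
      server: 10.0.0.1      # example IP to be replaced
      path: "/"
EOF

# Replace the value of the "server:" key. $1 = yaml file, $2 = new IP address.
set_nfs_server() {
  sed "s/^\([[:space:]]*server:\).*/\1 $2/" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}

# Ask the cluster for the IP, assuming a Service named nfs-server exists.
nfs_ip=""
if command -v kubectl >/dev/null 2>&1; then
  nfs_ip="$(kubectl get service nfs-server -o jsonpath='{.spec.clusterIP}' 2>/dev/null || true)"
fi
# Fall back to a placeholder so the sketch still runs without a cluster.
[ -n "$nfs_ip" ] || nfs_ip="10.96.0.42"

set_nfs_server "$pod_yaml" "$nfs_ip"
cat "$pod_yaml"
```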
For the EBS examples, the volume used by the first two examples must be created before you can run them. The VolumeId must be updated in the yaml files for this to work.
To create a volume you can use the command line or web UI:
VolumeId is part of the returned information of the create command.
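On the command line, the create step can be sketched with the AWS CLI. The availability zone, size and volume type below are hypothetical and should match the node that will run the pod; the function prints only the VolumeId field of the response.

```shell
# Assumed values; pick the zone of the node that will run the pod.
EBS_ZONE="${EBS_ZONE:-us-west-2a}"     # hypothetical availability zone
EBS_SIZE_GIB="${EBS_SIZE_GIB:-10}"     # hypothetical size in GiB

# Create the volume and print only its VolumeId.
create_ebs_volume() {
  aws ec2 create-volume \
    --availability-zone "$EBS_ZONE" \
    --size "$EBS_SIZE_GIB" \
    --volume-type gp2 \
    --query VolumeId \
    --output text
}

# Usage (not executed here, since it creates a billable resource):
#   volume_id="$(create_ebs_volume)"
#   echo "put $volume_id into the example yaml files"
```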
- create the pod that uses a direct volume reference:
- create the persistent volume (pv) and a pod that uses a persistent volume claim (pvc) to claim the existing volume:
- create a storage class to allow dynamic provisioning and create the pod that uses this mechanism:
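A minimal sketch of the dynamic provisioning variant, assuming the in-tree kubernetes.io/aws-ebs provisioner; clusters that use the EBS CSI driver need a different provisioner name. All object names, the label values and the scheduler name yunikorn are hypothetical, and the writer loop is bounded so that dates.txt stays small.

```shell
workdir="$(mktemp -d)"
manifest="$workdir/ebs-dynamic.yaml"

cat > "$manifest" <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-dynamic-demo                # hypothetical storage class name
provisioner: kubernetes.io/aws-ebs      # assumed in-tree provisioner
parameters:
  type: gp2
reclaimPolicy: Delete                   # the volume is removed with its claim
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ebs-dynamic-claim
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ebs-dynamic-demo    # triggers dynamic provisioning
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: ebs-dynamic-demo
  labels:
    applicationId: "ebs-dynamic-0001"   # placeholder application ID
    queue: "root.sandbox"               # assumed queue name
spec:
  schedulerName: yunikorn               # assumed scheduler name
  restartPolicy: Never
  containers:
    - name: writer
      image: alpine:latest
      command: ["sh", "-c", "for i in $(seq 1 60); do date >> /data/dates.txt; sleep 1; done"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: ebs-dynamic-claim
EOF

# Submit all three objects when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  kubectl create -f "$manifest" || echo "kubectl create failed; is the cluster reachable?"
else
  echo "kubectl not found; manifest written to $manifest"
fi
```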
Dynamic provisioning has a number of prerequisites; see Dynamic Volume Provisioning in the Kubernetes docs. The dynamically created volume will be automatically destroyed as soon as the pod is stopped.