Scheduler
Queues
Displays general information about the queues, such as name, status, capacities and properties. The queue hierarchy is preserved in the response JSON.
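Because the hierarchy is nested inside the response, a client typically walks it recursively. The sketch below is one way to do that, assuming each queue object carries its name under "queuename" and its children under "queues"; both key names are assumptions, passed as parameters so they can be adjusted, and the sample data is hand-written rather than a captured response.

```python
from typing import Iterator


def iter_queue_paths(queue: dict, parent: str = "",
                     name_key: str = "queuename",
                     children_key: str = "queues") -> Iterator[str]:
    """Yield the dotted path of every queue in a nested queue object.

    name_key and children_key are assumed field names, not confirmed here.
    """
    name = queue[name_key]
    path = f"{parent}.{name}" if parent else name
    yield path
    # A leaf queue may carry null or an empty list for its children.
    for child in queue.get(children_key) or []:
        yield from iter_queue_paths(child, path, name_key, children_key)


# Hypothetical hierarchy with a single leaf queue below root.
sample_hierarchy = {
    "queuename": "root",
    "queues": [{"queuename": "default", "queues": None}],
}
queue_paths = list(iter_queue_paths(sample_hierarchy))
```

With the sample above, `queue_paths` contains the root queue followed by its leaf, each as a fully qualified name.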
URL : /ws/v1/queues
Method : GET
Auth required : NO
Success response
Code : 200 OK
Content examples
For the default queue hierarchy (only the root.default leaf queue exists), a response similar to the following is sent back to the client:
Applications
Displays general information about the applications like used resources, queue name, submission time and allocations.
URL : /ws/v1/apps
Method : GET
Query Params :
- queue=<fully qualified queue name>
The fully qualified queue name, used to filter the applications that run within the given queue. For example, "/ws/v1/apps?queue=root.default" returns the applications running in the "root.default" queue.
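As a minimal sketch of the queue filter, the helper below builds the request URL with and without the "queue" parameter. The base address "http://localhost:9080" and the function names are assumptions made for illustration; `fetch_apps` is defined but not called, since it needs a running scheduler.

```python
import json
import urllib.request
from typing import Optional
from urllib.parse import urlencode

# Assumed address of the scheduler's REST service; adjust for your deployment.
BASE_URL = "http://localhost:9080"


def apps_url(base_url: str, queue: Optional[str] = None) -> str:
    """Build the /ws/v1/apps URL, optionally filtered by queue name."""
    url = f"{base_url.rstrip('/')}/ws/v1/apps"
    if queue:
        url += "?" + urlencode({"queue": queue})
    return url


def fetch_apps(base_url: str, queue: Optional[str] = None):
    """GET the applications and decode the JSON body (requires a live server)."""
    with urllib.request.urlopen(apps_url(base_url, queue)) as resp:
        return json.load(resp)


all_apps_url = apps_url(BASE_URL)
filtered_url = apps_url(BASE_URL, "root.default")
```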
Auth required : NO
Success response
Code : 200 OK
Content examples
In the example below there are three allocations belonging to two applications.
Nodes
Displays general information about the nodes managed by YuniKorn. Node details include host and rack name, capacity, resources and allocations.
URL : /ws/v1/nodes
Method : GET
Auth required : NO
Success response
Code : 200 OK
Content examples
Here you can see an example response from a 2-node cluster having 3 allocations.
Nodes utilization
Shows how nodes are distributed with regard to utilization.
URL : /ws/v1/nodes/utilization
Method : GET
Auth required : NO
Success response
Code : 200 OK
Content examples
Goroutines info
Dumps the stack traces of the currently running goroutines.
URL : /ws/v1/stack
Method : GET
Auth required : NO
Success response
Code : 200 OK
Content examples
Metrics
Endpoint to retrieve metrics from the Prometheus server. The metrics are dumped with help messages and type information.
URL : /ws/v1/metrics
Method : GET
Auth required : NO
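Since the metrics are dumped with help messages and type information, the body follows the Prometheus text exposition style, where "# HELP" and "# TYPE" comment lines precede the samples. The sketch below separates that metadata from the sample values. It is a simplified parser that assumes sample lines carry no trailing timestamp, and the metric name in the sample text is hypothetical.

```python
def parse_metrics(text: str):
    """Split text-format metrics into (metadata, samples).

    Simplification: every sample line is assumed to end with its value,
    with no optional timestamp after it.
    """
    meta = {}
    samples = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            parts = line.split(None, 3)
            # Only "# HELP <name> <text>" and "# TYPE <name> <type>" are kept.
            if len(parts) >= 3 and parts[1] in ("HELP", "TYPE"):
                detail = parts[3] if len(parts) > 3 else ""
                meta.setdefault(parts[2], {})[parts[1].lower()] = detail
            continue
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return meta, samples


# Hypothetical metric, written by hand for this sketch.
SAMPLE_METRICS = """
# HELP demo_total_applications Hypothetical counter used only for this sketch.
# TYPE demo_total_applications counter
demo_total_applications{state="running"} 3
"""
meta, samples = parse_metrics(SAMPLE_METRICS)
```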
Success response
Code : 200 OK
Content examples
Configuration validation
URL : /ws/v1/validate-conf
Method : POST
Auth required : NO
Success response
Regardless of whether the configuration is allowed or not, if the server was able to process the request, it yields a 200 HTTP status code.
Code : 200 OK
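Because the status code is 200 in both cases, a client has to inspect the response body to learn the outcome. The sketch below builds the POST request and interprets a JSON body. The field names "allowed" and "reason", the base address and the function names are assumptions for illustration; check them against the actual response before relying on them.

```python
import json
import urllib.request


def build_validate_request(base_url: str, yaml_text: str) -> urllib.request.Request:
    """Prepare (but do not send) a POST carrying the YAML configuration."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/ws/v1/validate-conf",
        data=yaml_text.encode("utf-8"),
        method="POST",
    )


def interpret_validation(body: str):
    """Return (allowed, reason) from the response body.

    Assumes a JSON object with 'allowed' and 'reason' fields.
    """
    result = json.loads(body)
    return bool(result.get("allowed")), result.get("reason", "")


# Assumed base address; the YAML content here is a placeholder.
validate_request = build_validate_request("http://localhost:9080", "partitions: []\n")
# Hand-written body standing in for a rejected configuration.
outcome = interpret_validation('{"allowed": false, "reason": "example reason"}')
```

Sending the request with `urllib.request.urlopen(validate_request)` and passing the decoded body to `interpret_validation` completes the round trip.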
Allowed configuration
Sending the following simple configuration results in an accepted response:
Response
Disallowed configuration
The following configuration is not allowed because of the "wrong_text" field in the YAML file.
Response
Configuration
Endpoint to retrieve the current scheduler configuration.
URL : /ws/v1/config
Method : GET
Auth required : NO
Success response
Code : 200 OK
Content example
Configuration update
Endpoint to override scheduler configuration.
URL : /ws/v1/config
Method : PUT
Auth required : NO
Success response
Code : 200 OK
Content example
Note: Updates must use the currently running configuration as the base. The base configuration is the configuration version that was retrieved earlier via a GET request and then updated by the user. The update request must contain the checksum of the base configuration. If the checksum provided in the update request differs from the checksum of the currently running configuration, the update will be rejected.
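The read-modify-write flow described in the note can be sketched as follows. How the checksum is carried inside the request body is not specified in this section, so the sketch treats the edited configuration as opaque text that already includes the base checksum. The base address and the function names are assumptions, and `send_update` is defined but not called because it needs a running scheduler.

```python
import urllib.error
import urllib.request


def build_update_request(base_url: str, config_with_checksum: str) -> urllib.request.Request:
    """Prepare a PUT with the edited configuration.

    The text is expected to have been obtained via GET /ws/v1/config, edited,
    and to still contain the checksum of that base configuration.
    """
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/ws/v1/config",
        data=config_with_checksum.encode("utf-8"),
        method="PUT",
    )


def send_update(request: urllib.request.Request):
    """Send the update and return (status, body), including rejections."""
    try:
        with urllib.request.urlopen(request) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        # A rejected update (for example 409 Conflict) lands here;
        # the body holds the error message from the server.
        return err.code, err.read().decode("utf-8")


# Placeholder body; a real one comes from a previous GET plus local edits.
update_request = build_update_request("http://localhost:9080", "partitions: []\n")
```

If the update is rejected because the checksum is stale, retrieving the configuration again and reapplying the edit on top of it is the natural recovery path.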
Failure response
The configuration update can fail due to different reasons such as:
- invalid configuration,
- incorrect base checksum.
In each case the transaction will be rejected, and the proper error message will be returned as a response.
Code : 409 Conflict
Message example : root queue must not have resource limits set
Content example
Application history
Endpoint to retrieve historical data about the number of total applications by timestamp.
URL : /ws/v1/history/apps
Method : GET
Auth required : NO
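A client usually converts the history records into readable points before plotting them. The sketch below assumes each record holds a nanosecond Unix timestamp under "timestamp" and a count, encoded as a string, under "totalApplications". The field names, the unit and the sample record are all assumptions, so the keys are parameters.

```python
from datetime import datetime, timezone


def to_points(records, time_key="timestamp", count_key="totalApplications"):
    """Convert history records into (UTC datetime, count) pairs.

    Assumes nanosecond timestamps and counts that parse as integers.
    """
    points = []
    for record in records:
        seconds = int(record[time_key]) // 1_000_000_000
        when = datetime.fromtimestamp(seconds, tz=timezone.utc)
        points.append((when, int(record[count_key])))
    return points


# Hand-written record for illustration; not captured from a server.
sample_history = [{"timestamp": 1600000000000000000, "totalApplications": "2"}]
points = to_points(sample_history)
```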
Success response
Code : 200 OK
Content examples
Container history
Endpoint to retrieve historical data about the number of total containers by timestamp.
URL : /ws/v1/history/containers
Method : GET
Auth required : NO
Success response
Code : 200 OK
Content examples