Scheduler
Queues
Displays general information about the queues, such as name, status, capacities and properties. The queue hierarchy is preserved in the response JSON.
URL : /ws/v1/queues
Method : GET
Auth required : NO
Success response
Code : 200 OK
Content examples
For the default queue hierarchy (only the root.default leaf queue exists), a response similar to the following is sent back to the client:
{
"partitionName": "[mycluster]default",
"capacity": {
"capacity": "map[ephemeral-storage:75850798569 hugepages-1Gi:0 hugepages-2Mi:0 memory:80000 pods:110 vcore:60000]",
"usedcapacity": "0"
},
"nodes": null,
"queues": {
"queuename": "root",
"status": "Active",
"capacities": {
"capacity": "[]",
"maxcapacity": "[ephemeral-storage:75850798569 hugepages-1Gi:0 hugepages-2Mi:0 memory:80000 pods:110 vcore:60000]",
"usedcapacity": "[memory:8000 vcore:8000]",
"absusedcapacity": "[memory:54 vcore:80]"
},
"queues": [
{
"queuename": "default",
"status": "Active",
"capacities": {
"capacity": "[]",
"maxcapacity": "[]",
"usedcapacity": "[memory:8000 vcore:8000]",
"absusedcapacity": "[]"
},
"queues": null,
"properties": {}
}
],
"properties": {}
}
}
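Because child queues are nested under the "queues" key, a client usually has to walk the response recursively. The following Python sketch flattens such a response into fully qualified queue names. The function name `flatten_queues` is hypothetical, and the sketch assumes that the top-level "queues" value is the root queue object while nested "queues" values are lists or null, as in the example response.

```python
import json


def flatten_queues(queue, parent=""):
    """Yield (fully qualified name, status) for a queue and all of its children."""
    name = f"{parent}.{queue['queuename']}" if parent else queue["queuename"]
    yield name, queue["status"]
    # Leaf queues report "queues": null, which json.loads turns into None.
    for child in queue.get("queues") or []:
        yield from flatten_queues(child, name)


# Trimmed copy of the example response, for illustration only.
response = json.loads("""
{
  "partitionName": "[mycluster]default",
  "queues": {
    "queuename": "root",
    "status": "Active",
    "queues": [
      {"queuename": "default", "status": "Active", "queues": null}
    ]
  }
}
""")

flat = list(flatten_queues(response["queues"]))
for name, status in flat:
    print(name, status)
```

The dotted names produced this way have the same form as the fully qualified queue names used elsewhere in this API, for example "root.default".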
Applications
Displays general information about the applications, such as used resources, queue name, submission time and allocations.
URL : /ws/v1/apps
Method : GET
Query Params :
- queue=<fully qualified queue name>
The fully qualified queue name used to filter the applications running in the given queue. For example, "/ws/v1/apps?queue=root.default" returns the applications running in the "root.default" queue.
Auth required : NO
Success response
Code : 200 OK
Content examples
In the example below there are three allocations belonging to two applications.
[
{
"applicationID": "application-0002",
"usedResource": "[memory:4000 vcore:4000]",
"partition": "[mycluster]default",
"queueName": "root.default",
"submissionTime": 1595939756253216000,
"allocations": [
{
"allocationKey": "deb12221-6b56-4fe9-87db-ebfadce9aa20",
"allocationTags": null,
"uuid": "9af35d44-2d6f-40d1-b51d-758859e6b8a8",
"resource": "[memory:4000 vcore:4000]",
"priority": "<nil>",
"queueName": "root.default",
"nodeId": "node-0001",
"applicationId": "application-0002",
"partition": "default"
}
],
"applicationState": "Running"
},
{
"applicationID": "application-0001",
"usedResource": "[memory:4000 vcore:4000]",
"partition": "[mycluster]default",
"queueName": "root.default",
"submissionTime": 1595939756253460000,
"allocations": [
{
"allocationKey": "54e5d77b-f4c3-4607-8038-03c9499dd99d",
"allocationTags": null,
"uuid": "08033f9a-4699-403c-9204-6333856b41bd",
"resource": "[memory:2000 vcore:2000]",
"priority": "<nil>",
"queueName": "root.default",
"nodeId": "node-0001",
"applicationId": "application-0001",
"partition": "default"
},
{
"allocationKey": "af3bd2f3-31c5-42dd-8f3f-c2298ebdec81",
"allocationTags": null,
"uuid": "96beeb45-5ed2-4c19-9a83-2ac807637b3b",
"resource": "[memory:2000 vcore:2000]",
"priority": "<nil>",
"queueName": "root.default",
"nodeId": "node-0002",
"applicationId": "application-0001",
"partition": "default"
}
],
"applicationState": "Running"
}
]
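As a rough illustration of consuming this endpoint, the Python sketch below builds the request URL with the optional queue filter and counts allocations per application in a decoded response. The base URL (host and port), the helper names `apps_url` and `allocations_per_app`, and the trimmed sample data are assumptions made for the example, not part of the API.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:9080"  # assumption: adjust to your deployment


def apps_url(queue=None):
    """Build the /ws/v1/apps URL, optionally filtered by a fully qualified queue name."""
    url = f"{BASE_URL}/ws/v1/apps"
    return f"{url}?{urlencode({'queue': queue})}" if queue else url


def allocations_per_app(apps):
    """Map each applicationID to the number of allocations it holds."""
    return {app["applicationID"]: len(app.get("allocations") or []) for app in apps}


# Trimmed copy of the example response: only the fields used here are kept.
apps = [
    {"applicationID": "application-0002",
     "allocations": [{"nodeId": "node-0001"}]},
    {"applicationID": "application-0001",
     "allocations": [{"nodeId": "node-0001"}, {"nodeId": "node-0002"}]},
]

url = apps_url("root.default")
counts = allocations_per_app(apps)
print(url)
print(counts, "total:", sum(counts.values()))
```

For the sample data the total is three allocations across two applications, matching the example response.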
Nodes
Displays general information about the nodes managed by YuniKorn. Node details include host and rack name, capacity, resources and allocations.
URL : /ws/v1/nodes
Method : GET
Auth required : NO
Success response
Code : 200 OK
Content examples
Here you can see an example response from a 2-node cluster having 3 allocations.
[
{
"partitionName": "[mycluster]default",
"nodesInfo": [
{
"nodeID": "node-0001",
"hostName": "",
"rackName": "",
"capacity": "[ephemeral-storage:75850798569 hugepages-1Gi:0 hugepages-2Mi:0 memory:14577 pods:110 vcore:10000]",
"allocated": "[memory:6000 vcore:6000]",
"occupied": "[memory:154 vcore:750]",
"available": "[ephemeral-storage:75850798569 hugepages-1Gi:0 hugepages-2Mi:0 memory:6423 pods:110 vcore:1250]",
"allocations": [
{
"allocationKey": "54e5d77b-f4c3-4607-8038-03c9499dd99d",
"allocationTags": null,
"uuid": "08033f9a-4699-403c-9204-6333856b41bd",
"resource": "[memory:2000 vcore:2000]",
"priority": "<nil>",
"queueName": "root.default",
"nodeId": "node-0001",
"applicationId": "application-0001",
"partition": "default"
},
{
"allocationKey": "deb12221-6b56-4fe9-87db-ebfadce9aa20",
"allocationTags": null,
"uuid": "9af35d44-2d6f-40d1-b51d-758859e6b8a8",
"resource": "[memory:4000 vcore:4000]",
"priority": "<nil>",
"queueName": "root.default",
"nodeId": "node-0001",
"applicationId": "application-0002",
"partition": "default"
}
],
"schedulable": true
},
{
"nodeID": "node-0002",
"hostName": "",
"rackName": "",
"capacity": "[ephemeral-storage:75850798569 hugepages-1Gi:0 hugepages-2Mi:0 memory:14577 pods:110 vcore:10000]",
"allocated": "[memory:2000 vcore:2000]",
"occupied": "[memory:154 vcore:750]",
"available": "[ephemeral-storage:75850798569 hugepages-1Gi:0 hugepages-2Mi:0 memory:6423 pods:110 vcore:1250]",
"allocations": [
{
"allocationKey": "af3bd2f3-31c5-42dd-8f3f-c2298ebdec81",
"allocationTags": null,
"uuid": "96beeb45-5ed2-4c19-9a83-2ac807637b3b",
"resource": "[memory:2000 vcore:2000]",
"priority": "<nil>",
"queueName": "root.default",
"nodeId": "node-0002",
"applicationId": "application-0001",
"partition": "default"
}
],
"schedulable": true
}
]
}
]
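Resource quantities in this response are strings such as "[memory:6000 vcore:6000]" rather than JSON objects, so a client has to parse them before doing arithmetic. The Python sketch below shows one way to do that and to derive a per-node utilization figure. The helper names `parse_resource` and `utilization` are hypothetical, and the parsing rule (space-separated name:integer pairs inside brackets) is inferred from the example output rather than from a formal specification.

```python
import re


def parse_resource(text):
    """Turn '[memory:4000 vcore:4000]' into {'memory': 4000, 'vcore': 4000}."""
    return {name: int(qty) for name, qty in re.findall(r"([\w.-]+):(\d+)", text)}


def utilization(node):
    """Return allocated/capacity as a percentage for each resource with non-zero capacity."""
    capacity = parse_resource(node["capacity"])
    allocated = parse_resource(node["allocated"])
    return {
        name: round(100 * allocated.get(name, 0) / cap, 1)
        for name, cap in capacity.items()
        if cap > 0
    }


# Values copied from node-0001 in the example response.
node = {
    "nodeID": "node-0001",
    "capacity": "[ephemeral-storage:75850798569 hugepages-1Gi:0 hugepages-2Mi:0 "
                "memory:14577 pods:110 vcore:10000]",
    "allocated": "[memory:6000 vcore:6000]",
}

usage = utilization(node)
print(node["nodeID"], usage)
```

Resources with zero capacity (the hugepages entries here) are skipped to avoid dividing by zero.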
Goroutines info
Dumps the stack traces of the currently running goroutines.
URL : /ws/v1/stack
Method : GET
Auth required : NO
Success response
Code : 200 OK
Content examples
goroutine 356 [running]:
github.com/apache/incubator-yunikorn-core/pkg/webservice.getStackInfo.func1(0x30a0060, 0xc003e900e0, 0x2)
    /yunikorn/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20200717041747-f3e1c760c714/pkg/webservice/handlers.go:41 +0xab
github.com/apache/incubator-yunikorn-core/pkg/webservice.getStackInfo(0x30a0060, 0xc003e900e0, 0xc00029ba00)
    /yunikorn/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20200717041747-f3e1c760c714/pkg/webservice/handlers.go:48 +0x71
net/http.HandlerFunc.ServeHTTP(0x2df0e10, 0x30a0060, 0xc003e900e0, 0xc00029ba00)
    /usr/local/go/src/net/http/server.go:1995 +0x52
github.com/apache/incubator-yunikorn-core/pkg/webservice.Logger.func1(0x30a0060, 0xc003e900e0, 0xc00029ba00)
    /yunikorn/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20200717041747-f3e1c760c714/pkg/webservice/webservice.go:65 +0xd4
net/http.HandlerFunc.ServeHTTP(0xc00003a570, 0x30a0060, 0xc003e900e0, 0xc00029ba00)
    /usr/local/go/src/net/http/server.go:1995 +0x52
github.com/gorilla/mux.(*Router).ServeHTTP(0xc00029cb40, 0x30a0060, 0xc003e900e0, 0xc0063fee00)
    /yunikorn/go/pkg/mod/github.com/gorilla/mux@v1.7.3/mux.go:212 +0x140
net/http.serverHandler.ServeHTTP(0xc0000df520, 0x30a0060, 0xc003e900e0, 0xc0063fee00)
    /usr/local/go/src/net/http/server.go:2774 +0xcf
net/http.(*conn).serve(0xc0000eab40, 0x30a61a0, 0xc003b74000)
    /usr/local/go/src/net/http/server.go:1878 +0x812
created by net/http.(*Server).Serve
    /usr/local/go/src/net/http/server.go:2884 +0x4c5

goroutine 1 [chan receive, 26 minutes]:
main.main()
    /yunikorn/pkg/shim/main.go:52 +0x67a

goroutine 19 [syscall, 26 minutes]:
os/signal.signal_recv(0x1096f91)
    /usr/local/go/src/runtime/sigqueue.go:139 +0x9f
os/signal.loop()
    /usr/local/go/src/os/signal/signal_unix.go:23 +0x30
created by os/signal.init.0
    /usr/local/go/src/os/signal/signal_unix.go:29 +0x4f
...
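A full dump can contain many goroutines, so a quick summary by state is often more useful than reading it line by line. The Python sketch below counts goroutines per state from the standard Go dump format, where each goroutine starts with a header line of the form "goroutine N [state, optional wait time]:". The function name `goroutine_states` is hypothetical and the sample is an abbreviated dump used only for illustration.

```python
import re
from collections import Counter

# Header line of each goroutine: "goroutine <id> [<state>" followed by "]" or ", <wait>]".
HEADER = re.compile(r"^goroutine (\d+) \[([^\],\n]+)", re.MULTILINE)


def goroutine_states(dump):
    """Count goroutines by state (running, syscall, chan receive, ...)."""
    return Counter(state for _gid, state in HEADER.findall(dump))


# Abbreviated dump: headers plus one frame each, file locations omitted.
sample = """goroutine 356 [running]:
net/http.(*conn).serve(0xc0000eab40, 0x30a61a0, 0xc003b74000)

goroutine 1 [chan receive, 26 minutes]:
main.main()

goroutine 19 [syscall, 26 minutes]:
os/signal.loop()
"""

states = goroutine_states(sample)
for state, count in sorted(states.items()):
    print(f"{state}: {count}")
```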
Metrics
Endpoint to retrieve metrics from the Prometheus server. The metrics are dumped with help messages and type information.
URL : /ws/v1/metrics
Method : GET
Auth required : NO
Success response
Code : 200 OK
Content examples
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 2.567e-05
go_gc_duration_seconds{quantile="0.25"} 3.5727e-05
go_gc_duration_seconds{quantile="0.5"} 4.5144e-05
go_gc_duration_seconds{quantile="0.75"} 6.0024e-05
go_gc_duration_seconds{quantile="1"} 0.00022528
go_gc_duration_seconds_sum 0.021561648
go_gc_duration_seconds_count 436
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 82
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.12.17"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 9.6866248e+07
...
# HELP yunikorn_scheduler_vcore_nodes_usage Nodes resource usage, by resource name.
# TYPE yunikorn_scheduler_vcore_nodes_usage gauge
yunikorn_scheduler_vcore_nodes_usage{range="(10%, 20%]"} 0
yunikorn_scheduler_vcore_nodes_usage{range="(20%,30%]"} 0
yunikorn_scheduler_vcore_nodes_usage{range="(30%,40%]"} 0
yunikorn_scheduler_vcore_nodes_usage{range="(40%,50%]"} 0
yunikorn_scheduler_vcore_nodes_usage{range="(50%,60%]"} 0
yunikorn_scheduler_vcore_nodes_usage{range="(60%,70%]"} 0
yunikorn_scheduler_vcore_nodes_usage{range="(70%,80%]"} 1
yunikorn_scheduler_vcore_nodes_usage{range="(80%,90%]"} 0
yunikorn_scheduler_vcore_nodes_usage{range="(90%,100%]"} 0
yunikorn_scheduler_vcore_nodes_usage{range="[0,10%]"} 0
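The output is plain text with one sample per line, so it can be inspected without a Prometheus client. The Python sketch below is a deliberately simplified parser: it skips comment lines, splits each sample on its last space (label values such as "(10%, 20%]" may contain spaces), and ignores the optional trailing timestamp that the format allows. The function name `parse_metrics` is hypothetical; for production use a maintained Prometheus parsing library is likely a better choice.

```python
import re

# Matches name="value" label pairs, allowing backslash-escaped characters in the value.
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse metric lines into (name, labels, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.rsplit(" ", 1)  # the value is the last space-separated field
        if len(parts) != 2:
            continue
        series, raw_value = parts
        try:
            value = float(raw_value)
        except ValueError:
            continue  # not a sample line, e.g. an elision marker
        name, _, label_text = series.partition("{")
        samples.append((name, dict(LABEL.findall(label_text)), value))
    return samples


sample = """# TYPE go_goroutines gauge
go_goroutines 82
# TYPE yunikorn_scheduler_vcore_nodes_usage gauge
yunikorn_scheduler_vcore_nodes_usage{range="(10%, 20%]"} 0
yunikorn_scheduler_vcore_nodes_usage{range="(70%,80%]"} 1
yunikorn_scheduler_vcore_nodes_usage{range="[0,10%]"} 0
"""

samples = parse_metrics(sample)
busy = [labels["range"] for name, labels, value in samples
        if name == "yunikorn_scheduler_vcore_nodes_usage" and value > 0]
print(busy)
```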
Configuration validation
URL : /ws/v1/validate-conf
Method : POST
Auth required : NO
Success response
If the server was able to process the request, it returns a 200 HTTP status code regardless of whether the configuration is allowed.
Code : 200 OK
Allowed configuration
Sending the following simple configuration yields an accepted response:
partitions:
  - name: default
    queues:
      - name: root
        queues:
          - name: test
Response
{
"allowed": true,
"reason": ""
}
Disallowed configuration
The following configuration is not allowed because of the "wrong_text" entry added to the YAML file.
partitions:
  - name: default
    queues:
      - name: root
        queues:
          - name: test
  - wrong_text
Response
{
"allowed": false,
"reason": "yaml: unmarshal errors:\n line 7: cannot unmarshal !!str `wrong_text` into configs.PartitionConfig"
}
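Since the status code is 200 for both outcomes, a client has to inspect the "allowed" field of the response body rather than rely on the HTTP status. The Python sketch below illustrates this with the standard library only. It assumes that the YAML configuration is sent as the raw POST body and that the service is reachable at the base URL shown; both are assumptions, and the helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9080"  # assumption: adjust to your deployment

CONFIG = """\
partitions:
  - name: default
    queues:
      - name: root
        queues:
          - name: test
"""


def build_validation_request(config_yaml):
    """Create (but do not send) the POST request carrying the YAML as its body."""
    return urllib.request.Request(
        f"{BASE_URL}/ws/v1/validate-conf",
        data=config_yaml.encode("utf-8"),  # assumption: raw YAML body
        method="POST",
    )


def check_result(body):
    """Return (allowed, reason) from a validation response body."""
    result = json.loads(body)
    return result["allowed"], result["reason"]


def validate(config_yaml):
    """Send the configuration and return (allowed, reason); needs a running server."""
    with urllib.request.urlopen(build_validation_request(config_yaml)) as resp:
        return check_result(resp.read())


request = build_validation_request(CONFIG)
print(request.get_method(), request.full_url)
print(check_result('{"allowed": true, "reason": ""}'))
```

Calling `validate(CONFIG)` against a live scheduler would perform the actual round trip; it is left out of the runnable part so the example works without a server.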
Application history
Endpoint to retrieve historical data about the number of total applications by timestamp.
URL : /ws/v1/history/apps
Method : GET
Auth required : NO
Success response
Code : 200 OK
Content examples
[
{
"timestamp": 1595939966153460000,
"totalApplications": "1"
},
{
"timestamp": 1595940026152892000,
"totalApplications": "1"
},
{
"timestamp": 1595940086153799000,
"totalApplications": "2"
},
{
"timestamp": 1595940146154497000,
"totalApplications": "2"
},
{
"timestamp": 1595940206155187000,
"totalApplications": "2"
}
]
Container history
Endpoint to retrieve historical data about the number of total containers by timestamp.
URL : /ws/v1/history/containers
Method : GET
Auth required : NO
Success response
Code : 200 OK
Content examples
[
{
"timestamp": 1595939966153460000,
"totalContainers": "1"
},
{
"timestamp": 1595940026152892000,
"totalContainers": "1"
},
{
"timestamp": 1595940086153799000,
"totalContainers": "3"
},
{
"timestamp": 1595940146154497000,
"totalContainers": "3"
},
{
"timestamp": 1595940206155187000,
"totalContainers": "3"
}
]
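Both history endpoints return records of the same shape: a numeric timestamp and a count encoded as a string. The Python sketch below converts such records into UTC datetimes and integers. It assumes the timestamps are nanoseconds since the Unix epoch, which is inferred from the magnitude of the example values; the function name `parse_history` is hypothetical.

```python
from datetime import datetime, timezone


def parse_history(records, key):
    """Convert history records into (UTC datetime, int count) pairs."""
    points = []
    for rec in records:
        # Assumption: timestamps are nanoseconds since the Unix epoch.
        seconds = rec["timestamp"] // 1_000_000_000
        when = datetime.fromtimestamp(seconds, tz=timezone.utc)
        points.append((when, int(rec[key])))  # counts arrive as strings
    return points


# Two records copied from the container history example.
containers = [
    {"timestamp": 1595939966153460000, "totalContainers": "1"},
    {"timestamp": 1595940086153799000, "totalContainers": "3"},
]

points = parse_history(containers, "totalContainers")
for when, count in points:
    print(when.isoformat(), count)
```

The same function works for the application history by passing "totalApplications" as the key.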