Special Use

Our cluster combines various hardware resources from multiple universities and other organizations.

All Taints

The new taint system uses the keys described below. To run on a tainted node, add a matching toleration to your pod. Users may only tolerate values that cluster admins have authorized them for.

Taint Key | Description
--- | ---
nautilus.io/reservation | For user-facing reservations. Users can only tolerate a value here if they are part of an approved group. If unsure, toleration is not allowed.
nautilus.io/hardware | For special hardware nodes that users should not land on by default but are permitted to tolerate if needed.
nautilus.io/system | For system services or any infrastructure nodes that users should never schedule onto. Toleration is not permitted.
nautilus.io/issue | For nodes with temporary issues that should not accept user workloads. This also consolidates the old gitlab-issue taints. Numeric values indicate GitLab issues, strings indicate other issues. Users cannot tolerate these.
nautilus.io/storage | For nodes dedicated to storage services like Ceph and LINSTOR. Users should not schedule general workloads here. Toleration is not permitted unless explicitly authorized for co-location.
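As a minimal sketch of what a toleration looks like in a pod spec, the fragment below tolerates a nautilus.io/hardware taint. The value large-gpu is taken from the renaming table on this page and the NoSchedule effect matches the taints listed there; substitute the value you are actually permitted to tolerate.

```yaml
spec:
  tolerations:
  # Tolerate the taint nautilus.io/hardware=large-gpu:NoSchedule.
  # "large-gpu" is only an example value; use one you are authorized for.
  - key: nautilus.io/hardware
    operator: Equal
    value: large-gpu
    effect: NoSchedule
```

A toleration only allows the pod onto the tainted node; it does not force the pod to land there.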

Examples of taint name changes from old to new system
Old Taint | New Taint
--- | ---
nautilus.io/5g=true | nautilus.io/system=5g:NoSchedule
nautilus.io/ceph=true | nautilus.io/system=storage:NoSchedule
nautilus.io/ceph-external=true | nautilus.io/system=ceph-external:NoSchedule
nautilus.io/jump=true | nautilus.io/system=jump:NoSchedule
nautilus.io/nrp-llm=true | nautilus.io/system=nrp-llm:NoSchedule
nautilus.io/perfsonar=true | nautilus.io/system=perfsonar:NoSchedule
nautilus.io/sense=true | nautilus.io/system=sense:NoSchedule
nautilus.io/linstor-server=true | nautilus.io/system=storage:NoSchedule
nautilus.io/bluefield2=true | nautilus.io/reservation=bluefield2:NoSchedule
nautilus.io/csusb=true | nautilus.io/reservation=csusb:NoSchedule
nautilus.io/csu-tide=true | nautilus.io/reservation=csu-tide:NoSchedule
nautilus.io/fpga-tutorial=true | nautilus.io/reservation=fpga-tutorial:NoSchedule
nautilus.io/genai-lab=true | nautilus.io/reservation=genai-lab:NoSchedule
nautilus.io/mizzou=true | nautilus.io/reservation=mizzou:NoSchedule
msu-cache=true | nautilus.io/reservation=msu-cache:NoSchedule
nautilus.io/prism-center=true | nautilus.io/reservation=prism-center:NoSchedule
nautilus.io/qaic=undefined | nautilus.io/reservation=qaic:NoSchedule
nautilus.io/reservation=cogrob | nautilus.io/reservation=cogrob:NoSchedule
nautilus.io/reservation=wifire | nautilus.io/reservation=wifire:NoSchedule
nautilus.io/sdsc-llm=true | nautilus.io/reservation=sdsc-llm:NoSchedule
nautilus.io/stashcache=true | nautilus.io/reservation=osdf:NoSchedule
nautilus.io/suncave=true | nautilus.io/reservation=suncave:NoSchedule
nautilus.io/suncave-head=true | nautilus.io/reservation=suncave-head:NoSchedule
um-cache=true | nautilus.io/reservation=um-cache:NoSchedule
nautilus.io/arm64=true | nautilus.io/hardware=arm64:NoSchedule
nautilus.io/large-gpu=true | nautilus.io/hardware=large-gpu:NoSchedule
nautilus.io/disk-swap=true | nautilus.io/issue=disk-swap:NoSchedule
nautilus.io/gitlab-issue=1234 | nautilus.io/issue=1234:NoSchedule
nautilus.io/slow-network=true | nautilus.io/issue=slow-network:NoSchedule
nautilus.io/testing=true | nautilus.io/issue=testing:NoSchedule
nautilus.io/upgrading=true | nautilus.io/issue=upgrading:NoSchedule
node.kubernetes.io/unreachable=undefined | node.kubernetes.io/unreachable=undefined:NoSchedule
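To illustrate what the renaming means for an existing pod, here is a sketch of migrating a toleration for the arm64 row of the table. The form of the old toleration is an assumption (only the taint names come from the table); the new toleration follows the standard Kubernetes toleration fields.

```yaml
spec:
  tolerations:
  # Old system (assumed form): tolerated nautilus.io/arm64=true
  # - key: nautilus.io/arm64
  #   operator: Exists
  #   effect: NoSchedule
  #
  # New system: the key names the category, the value names the specific taint.
  - key: nautilus.io/hardware
    operator: Equal
    value: arm64
    effect: NoSchedule
```

The same pattern applies to the other rows: the old per-purpose key becomes the value under one of the new category keys.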

Observable notebook with taints summary

Reservations

Our cluster contains several sets of nodes dedicated to certain groups.

Users can target ONLY THE GROUP NODES by using nodeAffinity, for example:

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: nautilus.io/reservation
            operator: In
            values:
            - group1

For large jobs, this helps avoid consuming all shared cluster resources. Optionally, a higher priority can be used (contact the admins before using one).

In addition, groups may request exclusive access to entire nodes if their workloads justify it. Such nodes can be reserved by setting the following taint and corresponding toleration:

  • Taint on reserved nodes:
    nautilus.io/reservation=group:NoSchedule
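The corresponding toleration is not spelled out above. A sketch of what it would likely look like in a pod spec follows, using the standard Kubernetes toleration fields; "group" is the same placeholder used in the taint and stands for your group's actual reservation value. Pairing it with nodeAffinity on the same key assumes the reserved nodes also carry a matching nautilus.io/reservation label, as in the earlier affinity example.

```yaml
spec:
  tolerations:
  # Matches the taint nautilus.io/reservation=group:NoSchedule on reserved nodes.
  # "group" is a placeholder for the reservation value your group was assigned.
  - key: nautilus.io/reservation
    operator: Equal
    value: group
    effect: NoSchedule
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          # Assumes reserved nodes are labeled nautilus.io/reservation=group.
          - key: nautilus.io/reservation
            operator: In
            values:
            - group
```

The toleration lets the pod onto the reserved nodes, while the affinity keeps it off every other node; without the affinity the pod may still be scheduled on shared nodes.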

Please fill out the node reservation form if your group has a use case that would benefit from whole-node reservations.

Other taints

Some nodes in the cluster don’t have access to the public Internet and can only reach the educational network. They can still pull images from Docker Hub through a proxy.

If your workload does not use public Internet resources, you can tolerate the nautilus.io/science-dmz taint to get access to these additional nodes.
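A minimal sketch of such a toleration is shown below. The page does not state the taint's value or effect, so this example uses operator Exists and omits the effect, which in Kubernetes matches any value and any effect for the given key.

```yaml
spec:
  tolerations:
  # Tolerate the nautilus.io/science-dmz taint regardless of its value.
  # No "effect" is given, so this matches every effect set for this key.
  - key: nautilus.io/science-dmz
    operator: Exists
```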

This work was supported in part by National Science Foundation (NSF) awards CNS-1730158, ACI-1540112, ACI-1541349, OAC-1826967, OAC-2112167, CNS-2100237, CNS-2120019.