The PRP/NRP/GRP (Pacific, National, and Global Research Platforms) now represents a partnership among more than 50 institutions, most of which host Flash I/O Network Appliances (FIONAs). FIONAs are rack-mounted PCs that serve as advanced Science DMZ Data Transfer Nodes, optimized for data transfer and sharing at 10-100 Gbps. Many carry two to eight GPU add-in boards.
Of these FIONAs, 165 with 10-100 Gbps connections have been joined into a “hyper-converged cluster” called Nautilus, which uses Google’s open-source Kubernetes to orchestrate software containers across the distributed system. Kubernetes is now a widely adopted way to manage containerized software: more than two-thirds of Fortune 500 companies have adopted it, and it is available from all the major commercial cloud providers. Nautilus currently has 550 GPUs, 6,000 CPU cores, and more than 2 PB of disk storage, all distributed among campus Science DMZs.
John Graham talks about how you can participate in and explore Nautilus, how to add your own node to Nautilus to get the benefits of informal Potluck Supercomputing (PLSC), and how to build a Kubernetes cluster that will federate with Nautilus, making Nautilus a persistent experiment in community-based infrastructure. He also explains how RocketChat has helped create a community of users who need to share big data and compute on it.
Dima Mishin then discusses Nautilus’ use of Ansible to automate sysadmin tasks, its Ceph storage pools, and its advanced measurement and monitoring, as well as Nautilus’ coming federation with the new NSF SDSC Expanse supercomputer and its data-centric architecture. He finishes with the team's new work with InMon and Reservoir Labs.