Container images on CVMFS
CVMFS is a distributed filesystem that allows you to mount software repositories and datasets on your device. NRP hosts an application repository, /cvmfs/nrp-software.opensciencegrid.org, to give users a reproducible, ready-to-use environment for sharing and using data through CVMFS. In this tutorial, we’ll go through the steps of distributing container images to the CVMFS repo hosted by NRP and using them from there.
Container images distributed to CVMFS are available globally. For example, the CVMFS data repository is also available on the OSG OSPool, not only in the NRP environment, and other CVMFS repositories can be accessed from the NRP environment.
Learning Objectives
- How to create your own data images.
- How to distribute images into CVMFS.
- How to use the images distributed in CVMFS.
Prerequisites
In order to complete this tutorial, you should have gone through the Quickstart, and finished these tutorials:
You will also need to know how to create a pull request from a fork on GitHub.
Creating data container images
Here we focus on how to create a customized data container image. We assume that you have set up your environment according to the Docker Images tutorial.
1. Put installation instructions in a Dockerfile
A Dockerfile is a plain text file with keywords and commands that can be used to create a new container image. Here is an example that builds an image based on python:3 with the jupyter package and a data file called data.txt.
FROM python:3
RUN pip install jupyter
ADD data.txt /data.txt
- FROM indicates which container image we’re starting with. We use python:3 as the base image.
- RUN indicates installation commands we want to run while building the image. Here we use pip to install the jupyter package.
- ADD indicates the local or remote files or directories to be included in the image. Here we include a local file, data.txt, in the image.
2. Build the image
Run the following command from the directory that contains the Dockerfile and data.txt files to build the image:
docker build . -t my-data-container:latest
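To confirm that the data file made it into the image, one option is to start a throwaway container and print the file. The sketch below wraps that check in a small shell function. The function name check_data_file is made up for illustration, and the default tag assumes the image was built with the command above.

```shell
# Hypothetical helper (not part of Docker or NRP tooling):
# print /data.txt from a container started from the given image.
# Defaults to the tag used in the build command above.
check_data_file() {
  image="${1:-my-data-container:latest}"
  # --rm removes the container once the command exits.
  docker run --rm "$image" cat /data.txt
}
```

Calling check_data_file with no arguments should print the contents of data.txt if the build succeeded; pass another image name as the first argument to check a different tag.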
3. Push the image to a container registry
In Step 2, we built a local image. If you have an account on Docker Hub, you can run the following commands to tag the image and push it there:
docker image tag my-data-container:latest DOCKER_USERNAME/my-data-container:latest
docker push DOCKER_USERNAME/my-data-container:latest
You may also use the GitLab container registry provided by NRP. For details, please refer to Building in GitLab.
Distributing container images on CVMFS
Image distribution on CVMFS works with unpacked layers or image root file systems. We host a CVMFS repo to synchronize and distribute container images. Any image publicly available in Docker and other registries can be included for automatic syncing into the CVMFS repository. The result is an unpacked image under /cvmfs/nrp-software.opensciencegrid.org.
To get your images included, please either create a git pull request against images.txt in the cvmfs-oci repository, or contact NRP admins and we can help you.
The images.txt file is a list of container images, one image per line, in the format [registry_host/][namespace/]repository_name[:tag]. For example:
gitlab-registry.nrp-nautilus.io/nrp/scientific-images/python:latest
When this line is added to images.txt and the pull request is merged, the image will be unpacked into CVMFS at /cvmfs/gitlab-registry.nrp-nautilus.io/nrp/scientific-images/python:latest.
If new versions of images have been pushed to the registry, they will be detected and updated in CVMFS accordingly.
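When preparing the pull request, the new entry is just one more line in images.txt. The following is a minimal sketch, assuming you are working in a local clone of your fork of the cvmfs-oci repository. The helper name add_image is hypothetical; it only appends the image when the exact line is not already present.

```shell
# Hypothetical helper: append an image reference to images.txt
# unless the exact same line is already listed.
add_image() {
  image="$1"
  file="${2:-images.txt}"
  # -x matches the whole line, -F treats the image name as a literal string.
  if grep -qxF -- "$image" "$file" 2>/dev/null; then
    echo "already listed: $image"
    return 0
  fi
  # Make sure the last existing line ends with a newline before appending.
  if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
    echo >> "$file"
  fi
  echo "$image" >> "$file"
  echo "added: $image"
}
```

For example, add_image gitlab-registry.nrp-nautilus.io/nrp/scientific-images/python:latest adds the line from the example above; the change can then be committed and proposed as a pull request.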
Accessing unpacked container images
Creating a PVC volume with the spec storageClassName: cvmfs
To access images distributed in CVMFS, you need to attach the CVMFS volume which can mount all repos. First, create the PVC (taken from https://github.com/cvmfs-contrib/cvmfs-csi/tree/master/example ):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: cvmfs
spec:
  accessModes:
    - ReadOnlyMany
  resources:
    requests:
      # Volume size value has no effect and is ignored
      # by the driver, but must be non-zero.
      storage: 1
  storageClassName: cvmfs
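The claim can be created with kubectl. The sketch below assumes the manifest above was saved as cvmfs-pvc.yaml; that file name and the function name create_cvmfs_pvc are chosen here for illustration, and the namespace is whatever namespace you normally work in.

```shell
# Hypothetical helper: create the PVC in the given namespace and show its status.
# Usage: create_cvmfs_pvc <namespace> [manifest-file]
create_cvmfs_pvc() {
  namespace="$1"
  manifest="${2:-cvmfs-pvc.yaml}"
  # Only query the claim if the apply step succeeded.
  kubectl apply -n "$namespace" -f "$manifest" &&
    kubectl get pvc cvmfs -n "$namespace"
}
```

The second command lists the claim named cvmfs so you can check that it exists before referencing it from a pod.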
Using the cvmfs PVC in pods
Option 1, mount the entire CVMFS
Create a pod with the following yaml file:
apiVersion: v1
kind: Pod
metadata:
  name: cvmfs-all-repos
spec:
  containers:
    - name: idle
      image: busybox
      imagePullPolicy: IfNotPresent
      command: [ "/bin/sh", "-c", "trap : TERM INT; (while true; do sleep 1000; done) & wait" ]
      volumeMounts:
        - name: my-cvmfs
          mountPath: /my-cvmfs
          # CVMFS automount volumes must be mounted with HostToContainer mount propagation.
          mountPropagation: HostToContainer
  volumes:
    - name: my-cvmfs
      persistentVolumeClaim:
        claimName: cvmfs
In this example, the nrp-software.opensciencegrid.org repo is accessible at /my-cvmfs/nrp-software.opensciencegrid.org in the pod. Note that a repo is not mounted until its mount point is accessed.
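One way to check the mount is to list the repository directory from inside the running pod; listing the directory is itself an access to the mount point. This is a sketch under the assumptions of the example above (pod name cvmfs-all-repos, mount path /my-cvmfs); the function name list_nrp_repo is hypothetical.

```shell
# Hypothetical helper: list the NRP repository through the example pod.
# Usage: list_nrp_repo <namespace> [pod-name]
list_nrp_repo() {
  namespace="$1"
  pod="${2:-cvmfs-all-repos}"
  # Everything after "--" is the command executed inside the container.
  kubectl exec -n "$namespace" "$pod" -- ls /my-cvmfs/nrp-software.opensciencegrid.org
}
```

If the volume is attached correctly, the command should print the top-level entries of the repository rather than an error.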
Option 2, mount the nrp-software.opensciencegrid.org repo specifically
If you need to mount the nrp-software.opensciencegrid.org repo specifically, add the subPath key to the pod’s volumeMounts section:
volumeMounts:
  - name: my-cvmfs
    # It is possible to mount a single CVMFS repository by specifying subPath.
    subPath: nrp-software.opensciencegrid.org
    mountPath: /my-nrp-software-cvmfs
    mountPropagation: HostToContainer
In this example, the repo is accessible at /my-nrp-software-cvmfs, but other repos in CVMFS are not available in the pod.
Mount the cvmfs PVC in customized JupyterHub deployment
If you have a customized JupyterHub deployment, you can make CVMFS available in user-spawned instances. Suppose a cvmfs PVC has been created in the namespace where JupyterHub is deployed, and you want to mount the CVMFS repo nrp-software.opensciencegrid.org at /nrp-software in every user’s pod. You can insert the following example into the JupyterHub values file (e.g. config.yaml):
singleuser:
  storage:
    extraVolumes:
      - name: nrp-software
        persistentVolumeClaim:
          claimName: cvmfs
    extraVolumeMounts:
      - name: nrp-software
        mountPath: /nrp-software
        subPath: nrp-software.opensciencegrid.org
        mountPropagation: HostToContainer
Then update the Helm release with the following command:
helm upgrade --cleanup-on-fail --install jhub jupyterhub/jupyterhub --namespace <namespace> --version=<version> --values config.yaml
Once JupyterHub is deployed, the nrp-software.opensciencegrid.org CVMFS repo is mounted at /nrp-software whenever a user spawns a new pod.
