Pre-provisioned CSI Volume Support in Mesos Containerizer
Mesos 1.11.0 adds pre-provisioned CSI volume support to the MesosContainerizer (a.k.a. the universal containerizer) by introducing the new volume/csi isolator.
This document describes the motivation for the volume/csi isolator, the configuration steps for enabling it, and the required framework changes.
Motivation
Container Storage Interface (CSI) is a specification that defines a common set of APIs for all interactions between the storage vendors and the container orchestration platforms. Building CSI support allows Mesos to make use of the quickly-growing CSI ecosystem.
Mesos has supported CSI since the 1.5.0 release, but that solution has a limitation: it requires CSI plugins to implement the ListVolumes and GetCapacity APIs so that external storage can be modeled as Mesos raw disk resources and then offered to frameworks. However, many third-party CSI plugins do not implement those two APIs.
Mesos 1.11.0 provides a more generic way to support third-party CSI plugins, so that Mesos can work with a broader external storage ecosystem and benefit from continued development of community CSI plugins.
How does it work?
The volume/csi isolator interacts with CSI plugins via the plugin's gRPC endpoint.
When a new task with CSI volumes is launched, the volume/csi isolator calls the CSI plugin to publish the specified CSI volumes onto the agent host and then mounts them onto the task container. When the task terminates, the volume/csi isolator calls the CSI plugin to unpublish the specified CSI volumes.
Currently the volume/csi isolator only calls the CSI plugin's node service, not its controller service. This means:
- Only pre-provisioned CSI volumes are supported, not dynamic provisioning of CSI volumes. Operators need to create the CSI volumes explicitly and provide the volume info (e.g. volume ID, context, etc.) to frameworks so that frameworks can use the volumes in their tasks.
- CSI volumes that require the controller service to publish to a node (ControllerPublishVolume) before the node service publishes on the node (NodePublishVolume) are not supported.
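The publish/unpublish ordering described above can be pictured with a toy model. This is only an illustrative sketch, not Mesos code: FakeNodePlugin and published_volume are hypothetical names, the plugin is an in-memory stand-in rather than a gRPC client, and the target path is made up for the example.

```python
from contextlib import contextmanager


class FakeNodePlugin:
    """Hypothetical in-memory stand-in for a CSI plugin's node service."""

    def __init__(self):
        self.calls = []

    def node_publish_volume(self, volume_id, target_path):
        self.calls.append(("NodePublishVolume", volume_id, target_path))

    def node_unpublish_volume(self, volume_id, target_path):
        self.calls.append(("NodeUnpublishVolume", volume_id, target_path))


@contextmanager
def published_volume(plugin, volume_id, target_path):
    """Publish a volume for the lifetime of a task, then unpublish it."""
    plugin.node_publish_volume(volume_id, target_path)
    try:
        yield target_path
    finally:
        # Runs even if the task body raises, mirroring cleanup on termination.
        plugin.node_unpublish_volume(volume_id, target_path)


plugin = FakeNodePlugin()
with published_volume(plugin, "foo", "/mnt/unmanaged-plugin/foo") as path:
    # Stand-in for the task running with the volume mounted.
    plugin.calls.append(("TaskRuns", "foo", path))

call_order = [call[0] for call in plugin.calls]
print(call_order)
```

Note that no controller-service call appears anywhere in the sequence, which is why volumes that need ControllerPublishVolume first cannot be used.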
Configuration
To use the volume/csi isolator, there are certain actions required by operators and framework developers. In this section we list the steps required by the operator to configure the volume/csi isolator and the steps required by framework developers to specify CSI volumes in their tasks.
Pre-conditions
- Explicitly create the CSI volumes that Mesos tasks will access. Some CSI plugins (e.g. NFS) do not implement the CreateVolume API; in that case operators do not need to create the volume explicitly.
Configuring the CSI Volume Isolator
To configure the volume/csi isolator, the operator needs to set the --isolation and --csi_plugin_config_dir flags at agent startup as follows:
sudo mesos-agent \
--master=<master-IP:master-port> \
--work_dir=/var/lib/mesos \
--isolation=filesystem/linux,volume/csi \
--csi_plugin_config_dir=<directory that contains CSI plugin configuration files>
The volume/csi isolator must be specified in the --isolation flag at agent startup; the volume/csi isolator has a dependency on the filesystem/linux isolator.
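An operator might want to sanity-check the --isolation value before starting the agent. The following is a minimal sketch of such a check; check_isolation_flag is a hypothetical helper, not part of Mesos, and it only encodes the dependency stated above.

```python
def check_isolation_flag(isolation):
    """Return a list of problems found in an --isolation flag value."""
    isolators = [name.strip() for name in isolation.split(",") if name.strip()]
    problems = []
    if "volume/csi" not in isolators:
        problems.append("volume/csi is not enabled")
    elif "filesystem/linux" not in isolators:
        # volume/csi depends on the filesystem/linux isolator.
        problems.append("volume/csi requires filesystem/linux")
    return problems


print(check_isolation_flag("filesystem/linux,volume/csi"))
print(check_isolation_flag("volume/csi"))
```

An empty list means the flag value satisfies the dependency; the agent itself remains the authority on whether the value is valid.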
The operator needs to put the CSI plugin configuration files under the directory specified via the agent flag --csi_plugin_config_dir. Each file in this directory should contain a JSON object representing a CSIPluginInfo object, which can be either a managed CSI plugin (i.e. the plugin launched by Mesos as a standalone container) or an unmanaged CSI plugin (i.e. the plugin launched outside of Mesos).
message CSIPluginInfo {
required string type = 1;
optional string name = 2 [default = "default"];
// A list of container configurations to run managed CSI plugin.
repeated CSIPluginContainerInfo containers = 3;
// The service endpoints of the unmanaged CSI plugin.
repeated CSIPluginEndpoint endpoints = 4;
optional string target_path_root = 5;
optional bool target_path_exists = 6;
}
message CSIPluginContainerInfo {
enum Service {
UNKNOWN = 0;
CONTROLLER_SERVICE = 1;
NODE_SERVICE = 2;
}
repeated Service services = 1;
optional CommandInfo command = 2;
repeated Resource resources = 3;
optional ContainerInfo container = 4;
}
message CSIPluginEndpoint {
required CSIPluginContainerInfo.Service csi_service = 1;
required string endpoint = 2;
}
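Before dropping a file into the configuration directory, it can help to check it against the fields marked required in the messages above. The sketch below does only that; validate_plugin_info is a hypothetical helper, and Mesos performs its own validation, which may be stricter.

```python
# Enum names taken from CSIPluginContainerInfo.Service above.
SERVICES = {"UNKNOWN", "CONTROLLER_SERVICE", "NODE_SERVICE"}


def validate_plugin_info(info):
    """Return a list of errors for a dict shaped like CSIPluginInfo."""
    errors = []
    if not isinstance(info.get("type"), str) or not info["type"]:
        errors.append("'type' is required")
    for i, endpoint in enumerate(info.get("endpoints", [])):
        if endpoint.get("csi_service") not in SERVICES:
            errors.append(f"endpoints[{i}]: 'csi_service' is missing or unknown")
        if not endpoint.get("endpoint"):
            errors.append(f"endpoints[{i}]: 'endpoint' is required")
    for i, container in enumerate(info.get("containers", [])):
        for service in container.get("services", []):
            if service not in SERVICES:
                errors.append(f"containers[{i}]: unknown service {service!r}")
    return errors


good = {
    "type": "org.apache.mesos.csi.unmanaged-plugin",
    "endpoints": [
        {"csi_service": "NODE_SERVICE",
         "endpoint": "/var/lib/unmanaged-plugin/csi.sock"}
    ],
}
bad = {"endpoints": [{"csi_service": "NODE"}]}

print(validate_plugin_info(good))
print(validate_plugin_info(bad))
```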
Example of managed CSI plugin:
{
"type": "org.apache.mesos.csi.managed-plugin",
"containers": [
{
"services": [
"NODE_SERVICE"
],
"command": {
"value": "<path-to-managed-plugin> --endpoint=$CSI_ENDPOINT"
},
"resources": [
{"name": "cpus", "type": "SCALAR", "scalar": {"value": 0.1}},
{"name": "mem", "type": "SCALAR", "scalar": {"value": 1024}}
]
}
]
}
Example of unmanaged CSI plugin:
{
"type": "org.apache.mesos.csi.unmanaged-plugin",
"endpoints": [
{
"csi_service": "NODE_SERVICE",
"endpoint": "/var/lib/unmanaged-plugin/csi.sock"
}
],
"target_path_root": "/mnt/unmanaged-plugin"
}
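A configuration file like the unmanaged example above could be generated by a small script. This is a sketch under assumptions: write_unmanaged_plugin_config and the file name unmanaged-plugin.json are hypothetical, and a temporary directory stands in for the real --csi_plugin_config_dir.

```python
import json
import os
import tempfile


def write_unmanaged_plugin_config(config_dir, filename, plugin_type,
                                  socket_path, target_path_root):
    """Write a CSIPluginInfo JSON file for an unmanaged plugin; return its path."""
    config = {
        "type": plugin_type,
        "endpoints": [
            {"csi_service": "NODE_SERVICE", "endpoint": socket_path}
        ],
        "target_path_root": target_path_root,
    }
    os.makedirs(config_dir, exist_ok=True)
    path = os.path.join(config_dir, filename)
    with open(path, "w") as f:
        json.dump(config, f, indent=2)
    return path


# A temporary directory stands in for the agent's plugin config directory.
config_dir = tempfile.mkdtemp()
config_path = write_unmanaged_plugin_config(
    config_dir,
    "unmanaged-plugin.json",  # hypothetical file name
    "org.apache.mesos.csi.unmanaged-plugin",
    "/var/lib/unmanaged-plugin/csi.sock",
    "/mnt/unmanaged-plugin",
)

with open(config_path) as f:
    written = json.load(f)
print(json.dumps(written, indent=2))
```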
Enabling frameworks to use CSI volumes
Volume Protobuf
The Volume protobuf message has been updated to support CSI volumes.
message Volume {
...
required Mode mode = 3;
required string container_path = 1;
message Source {
enum Type {
UNKNOWN = 0;
...
CSI_VOLUME = 5;
}
message CSIVolume {
required string plugin_name = 1;
message VolumeCapability {
message BlockVolume {
}
message MountVolume {
optional string fs_type = 1;
repeated string mount_flags = 2;
}
message AccessMode {
enum Mode {
UNKNOWN = 0;
SINGLE_NODE_WRITER = 1;
SINGLE_NODE_READER_ONLY = 2;
MULTI_NODE_READER_ONLY = 3;
MULTI_NODE_SINGLE_WRITER = 4;
MULTI_NODE_MULTI_WRITER = 5;
}
required Mode mode = 1;
}
oneof access_type {
BlockVolume block = 1;
MountVolume mount = 2;
}
required AccessMode access_mode = 3;
}
// Specifies the parameters used to stage/publish a pre-provisioned volume
// on an agent host.
message StaticProvisioning {
required string volume_id = 1;
required VolumeCapability volume_capability = 2;
optional bool readonly = 3;
map<string, Secret> node_stage_secrets = 4;
map<string, Secret> node_publish_secrets = 5;
map<string, string> volume_context = 6;
}
optional StaticProvisioning static_provisioning = 2;
}
optional Type type = 1;
...
optional CSIVolume csi_volume = 6;
}
optional Source source = 5;
}
When requesting a CSI volume for a container, the framework developer needs to set Volume for the container, which includes the mode, container_path and source fields.
The source field specifies where the volume comes from. Framework developers need to set the type field to CSI_VOLUME and specify the csi_volume field.
The csi_volume field describes the CSI volume. Framework developers need to set the plugin_name field to the type field of one of the CSI plugin configuration files in the directory specified via the agent flag --csi_plugin_config_dir, and specify the static_provisioning field according to the information of the pre-provisioned volume. The fields in static_provisioning map directly onto the fields in the CSI calls NodeStageVolume and NodePublishVolume; see the CSI spec for detailed descriptions of those fields.
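For frameworks that assemble task definitions as JSON, the fields described above could be filled in by a helper along these lines. It is a minimal sketch: make_csi_volume is a hypothetical function, it only covers mount volumes, and it leaves out secrets, readonly, fs_type and mount_flags.

```python
# Access mode names from Volume.Source.CSIVolume.VolumeCapability.AccessMode.
ACCESS_MODES = {
    "SINGLE_NODE_WRITER",
    "SINGLE_NODE_READER_ONLY",
    "MULTI_NODE_READER_ONLY",
    "MULTI_NODE_SINGLE_WRITER",
    "MULTI_NODE_MULTI_WRITER",
}


def make_csi_volume(container_path, plugin_name, volume_id, access_mode,
                    mode="RW", volume_context=None):
    """Build a dict shaped like the Volume message for a CSI mount volume."""
    if access_mode not in ACCESS_MODES:
        raise ValueError(f"unknown access mode: {access_mode}")
    static_provisioning = {
        "volume_id": volume_id,
        "volume_capability": {
            "mount": {},
            "access_mode": {"mode": access_mode},
        },
    }
    if volume_context:
        static_provisioning["volume_context"] = dict(volume_context)
    return {
        "container_path": container_path,
        "mode": mode,
        "source": {
            "type": "CSI_VOLUME",
            "csi_volume": {
                "plugin_name": plugin_name,
                "static_provisioning": static_provisioning,
            },
        },
    }


volume = make_csi_volume(
    "volume",
    "nfs.csi.k8s.io",
    "foo",
    "MULTI_NODE_MULTI_WRITER",
    volume_context={"server": "192.168.1.100", "share": "/mnt/data"},
)
print(volume["source"]["type"])
```

The resulting dict can be placed in the volumes list of a task's container definition.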
How to specify container_path:
- If you are launching a task without a container image and container_path is an absolute path, you need to make sure the absolute path exists on your host root file system, because the container shares the host root file system; otherwise, the task will fail.
- In all other cases (a task without a container image and with a relative container_path, or a task with a container image and an absolute or relative container_path), the volume/csi isolator will create the container_path as the mount point.
The following table summarizes the above rules for container_path:
| | Task with rootfs | Task without rootfs |
|---|---|---|
| Absolute container_path | No need to exist | Must exist |
| Relative container_path | No need to exist | No need to exist |
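The table reduces to a single condition, which a framework could check before launching a task. The helper below is a hypothetical sketch, not a Mesos API; it assumes POSIX-style paths.

```python
import posixpath


def container_path_must_exist(has_rootfs, container_path):
    """Return True if container_path must already exist on the host."""
    # Only an absolute path in a task without a container image (no rootfs)
    # has to exist beforehand; in every other case the isolator creates it.
    return (not has_rootfs) and posixpath.isabs(container_path)


for has_rootfs in (True, False):
    for path in ("/data/volume", "volume"):
        print(has_rootfs, path, container_path_must_exist(has_rootfs, path))
```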
Example
Launch a task with a CSI volume managed by the NFS CSI plugin:
TaskInfo {
...
"command" : {
"value": "echo test > volume/file"
},
"container" : {
"type": "MESOS",
"volumes" : [
{
"container_path" : "volume",
"mode" : "RW",
"source": {
"type": "CSI_VOLUME",
"csi_volume": {
"plugin_name": "nfs.csi.k8s.io",
"static_provisioning": {
"volume_id": "foo",
"volume_capability": {
"mount": {},
"access_mode": {
"mode": "MULTI_NODE_MULTI_WRITER"
}
},
"volume_context": {
"server": "192.168.1.100",
"share": "/mnt/data"
}
}
}
}
}
]
}
}
NOTE: To make the above example work, an NFS server (192.168.1.100) needs to be set up to export the directory /mnt/data.