If you're new to Mesos
See the getting started page for more information about downloading, building, and deploying Mesos.
If you'd like to get involved or you're looking for support
See our community page for more details.
Roles
Many modern host-level operating systems (e.g. Linux, BSDs, etc) support multiple users. Similarly, Mesos is a multi-user cluster management system, with the expectation of a single Mesos cluster managing an organization’s resources and servicing the organization’s users.
As such, Mesos has to address a number of requirements related to resource management:
- Fair sharing of the resources amongst users
- Providing resource guarantees to users (e.g. quota, priorities, isolation)
- Providing accurate resource accounting
- How many resources are allocated / utilized / etc?
- Per-user accounting
In Mesos, we refer to these “users” as roles. More precisely, a role within Mesos refers to a resource consumer within the cluster. This resource consumer could represent a user within an organization, but it could also represent a team, a group, a service, a framework, etc.
Schedulers subscribe to one or more roles in order to receive resources and schedule work on behalf of the resource consumer(s) they are servicing.
Some examples of resource allocation guarantees that Mesos provides:
- Guaranteeing that a role is allocated a specified amount of resources (via quota).
- Ensuring that some (or all) of the resources on a particular agent are allocated to a particular role (via reservations).
- Ensuring that resources are fairly shared between roles (via DRF).
- Expressing that some roles should receive a higher relative share of the cluster (via weights).
Roles and access control
There are two ways to control which roles a framework is allowed to subscribe to. First, ACLs can be used to specify which framework principals can subscribe to which roles. For more information, see the authorization documentation.
Second, a role whitelist can be configured by passing the --roles
flag to
the Mesos master at startup. This flag specifies a comma-separated list of role
names. If the whitelist is specified, only roles that appear in the whitelist
can be used. To change the whitelist, the Mesos master must be restarted. Note
that in a high-availability deployment of Mesos, you should take care to ensure
that all Mesos masters are configured with the same whitelist.
In Mesos 0.26 and earlier, you should typically configure both ACLs and the whitelist, because in these versions of Mesos, any role that does not appear in the whitelist cannot be used.
In Mesos 0.27, this behavior has changed: if --roles
is not specified, the
whitelist permits any role name to be used. Hence, in Mesos 0.27, the
recommended practice is to only use ACLs to define which roles can be used; the
--roles
command-line flag is deprecated.
Associating frameworks with roles
A framework specifies which roles it would like to subscribe to when it
subscribes with the master. This is done via the roles
field in
FrameworkInfo
. A framework can also change which roles it is
subscribed to by reregistering with an updated FrameworkInfo
.
As a user, you can typically specify which role(s) a framework will
subscribe to when you start the framework. How to do this depends on the
user interface of the framework you’re using. For example, a single user
scheduler might take a --mesos_role
command-line flag and a multi-user
scheduler might take a --mesos-roles
command-line flag or sync with
the organization’s LDAP system to automatically adjust which roles it is
subscribed to as the organization’s structure changes.
Subscribing to multiple roles
As noted above, a framework can subscribe to multiple roles
simultaneously. Frameworks that want to do this must opt-in to the
MULTI_ROLE
capability.
When a framework is offered resources, those resources are associated
with exactly one of the roles it has subscribed to; the framework can
determine which role an offer is for by consulting the
allocation_info.role
field in the Offer
or the
allocation_info.role
field in each offered Resource
(in the current
implementation, all the resources in a single Offer
will be allocated
to the same role).
Multiple frameworks in the same role
Multiple frameworks can be subscribed to the same role. This can be useful: for example, one framework can create a persistent volume and write data to it. Once the task that writes data to the persistent volume has finished, the volume will be offered to other frameworks subscribed to the same role; this might give a second (“consumer”) framework the opportunity to launch a task that reads the data produced by the first (“producer”) framework.
However, configuring multiple frameworks to use the same role should be done with caution, because all the frameworks will have access to any resources that have been reserved for that role. For example, if a framework stores sensitive information on a persistent volume, that volume might be offered to a different framework subscribed to the same role. Similarly, if one framework creates a persistent volume, another framework subscribed to the same role might “steal” the volume and use it to launch a task of its own. In general, multiple frameworks sharing the same role should be prepared to collaborate with one another to ensure that role-specific resources are used appropriately.
Associating resources with roles
A resource is assigned to a role using a reservation. Resources can either be reserved statically (when the agent that hosts the resource is started) or dynamically: frameworks and operators can specify that a certain resource should subsequently be reserved for use by a given role. For more information, see the reservation documentation.
Default role
The role named *
is special. Unreserved resources are currently represented
as having the special *
role (the idea being that *
matches any role). By
default, all the resources at an agent node are unreserved (this can be changed
via the --default_role
command-line flag when starting the agent).
In addition, when a framework registers without providing a
FrameworkInfo.role
, it is assigned to the *
role. In Mesos 1.3, frameworks
should use the FrameworkInfo.roles
field, which does not assign a default of
*
, but frameworks can still specify *
explicitly if desired. Frameworks
and operators cannot make reservations to the *
role.
Invalid role names
A role name must be a valid directory name, so it cannot:
- Be an empty string
- Be
.
or..
- Start with
-
- Contain any slash, backspace, or whitespace character
Roles and resource allocation
By default, the Mesos master uses weighted Dominant Resource Fairness (wDRF) to allocate resources. In particular, this implementation of wDRF first identifies which role is furthest below its fair share of the role’s dominant resource. Each of the frameworks subscribed to that role are then offered additional resources in turn.
The resource allocation process can be customized by assigning
weights to roles: a role with a weight of 2 will be allocated
twice the fair share of a role with a weight of 1. By default, every role has a
weight of 1. Weights can be configured using the
/weights operator endpoint, or else using the
deprecated --weights
command-line flag when starting the Mesos master.
Roles and quota
In order to guarantee that a role is allocated a specific amount of resources, quota can be specified via the /quota endpoint.
The resource allocator will first attempt to satisfy the quota requirements, before fairly sharing the remaining resources. For more information, see the quota documentation.
Role vs. Principal
A principal identifies an entity that interacts with Mesos; principals are similar to user names. For example, frameworks supply a principal when they register with the Mesos master, and operators provide a principal when using the operator HTTP endpoints. An entity may be required to authenticate with its principal in order to prove its identity, and the principal may be used to authorize actions performed by an entity, such as resource reservation and persistent volume creation/destruction.
Roles, on the other hand, are used exclusively for resource allocation, as covered above.