Cloud Guide (2.4)
This page describes a particular configuration of Nimbus that allows the cloud-client to operate out of the box. If you've never configured Nimbus before, you should be able to follow this page conceptually but it is not meant to be a replacement for the administrator guide which will still need to be consulted.
This page is for deployers of the cloud configuration to learn about it and configure the workspace service for it. This is not necessary for cloud users to read and understand. If you are a cloud user just looking to understand how to launch and manage VMs on an existing cloud, start at the clouds page.
Table of Contents
- User Experience
- Assumptions and Defaults
- Necessary Configurations
- Properties and Options
- Cloud admin program
- Configuration Appendix
The service must be set up in resource pool mode, controlling any number of VMM nodes. You may use the workspace pilot to integrate with a local resource scheduler. An image repository must also be set up; it hosts workspace image files for each client. When a client runs a workspace, the image to use is transferred from the repository to the VMM that will be running it.
For the sake of discussion we will assume that the workspace service and file repository setup are on different nodes. This does not necessarily need to be the case but it is the recommended configuration because of the heavy I/O traffic the repository can experience.
The server addresses must be directly reachable from the Internet or otherwise configured to deal with being NAT'd. The Globus container (where the workspace service runs) and GridFTP can both be set up for NAT or other port forwarding situations.
The diagram above depicts the basic setup.
- A special workspace client called the "cloud-client" invokes operations on the service and GridFTP server. A number of defaults are assumed which makes this work out of the box (these defaults will be discussed later).
- Files are transferred from the cloud-client to a client-specific directory on the repository node (manual or other types of GridFTP based transfers are also possible if the user is comfortable with using grid tools directly).
- The service invokes commands on the VMMs to trigger file transfers to/from the repository node, VM lifecycle events, and destruction/clean up.
- If the workspace state changes, the cloud-client will reflect this to the screen (and log files) and depending on the change might also take action in response.
Working backwards from the user's cloud-client experience is a good way to understand how the service needs to be set up.
Here is an abbreviated depiction of a simple user interaction with a cloud, to give you an idea if you've never used it. This does not depict an image transfer to the repository node but that is similarly brief.
A grid credential is needed; there is an embedded grid-proxy-init program if you need to create one.
You can list what's in your repository directory:
    $ ./bin/cloud-client.sh --list
    [Image] 'base-cluster-01.gz'    Read only
            Modified: Jul 06 @ 17:34   Size: 578818017 bytes (~552 MB)
    [Image] 'globus-002'            Read only
            Modified: Jun 12 @ 18:55   Size: 3758097408 bytes (~3584 MB)
    [Image] 'hello-cloud'           Read only
            Modified: May 30 @ 14:16   Size: 524288000 bytes (~500 MB)
    [Image] 'hello-cluster'         Read only
            Modified: Jun 30 @ 20:18   Size: 524288000 bytes (~500 MB)
And pick one to run (ignore the 'cluster' images for now):
    $ ./bin/cloud-client.sh --run --name hello-cloud --hours 1
    SSH public keyfile contained tilde:
      - '~/.ssh/id_rsa.pub' --> '/home/guest/.ssh/id_rsa.pub'
    Launching workspace.
    Using workspace factory endpoint:
        https://cloudurl.edu:8443/wsrf/services/WorkspaceFactoryService
    Creating workspace "vm-023"... done.
           IP address: 188.8.131.52
             Hostname: ahostname.cloudurl.edu
           Start time: Fri Feb 29 09:36:39 CST 2008
        Shutdown time: Fri Feb 29 10:36:39 CST 2008
     Termination time: Fri Feb 29 10:46:39 CST 2008
    Waiting for updates.
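The three times in the launch output are related by simple arithmetic, which the following minimal Python sketch reproduces. It assumes the shutdown time is the start time plus the requested --hours, and that termination follows ten minutes later; that ten-minute gap is only inferred from this one sample output, and the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta


def lifetime(start, hours, grace_minutes=10):
    """Return (shutdown, termination) times for a requested running time."""
    shutdown = start + timedelta(hours=hours)
    # ASSUMPTION: the 10 minute gap is read off the sample output above;
    # it is not a documented setting.
    termination = shutdown + timedelta(minutes=grace_minutes)
    return shutdown, termination


start = datetime(2008, 2, 29, 9, 36, 39)
shutdown, termination = lifetime(start, hours=1)
print(shutdown.time(), termination.time())  # 10:36:39 10:46:39
```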
Some time elapses as the image file is copied to the VMM node. Then a running notification is printed:
    State changed: Running
    Running: 'vm-023'
The client picked up your default public SSH key and sent it to be installed on the fly into the VM's authorized_keys policy for the root account. So after launching you can use the printed hostname to log in as root:
    $ ssh root@ahostname.cloudurl.edu
You can see an example of a cluster cloud-client deployment on the one-click clusters page.
So how does this happen?
Assumptions and Defaults
A number of things go into making the cloud client work out of the box, but it is in large part accomplished by giving the user a downloadable package with a number of default configurations.
These defaults limit functionality options in some cases, but that is the idea: eliminate decisions that need to be made and set working defaults. There are avenues left open for experienced users to do more (for example, by overriding the defaults or even switching over to the regular workspace client).
In the previous section, the first thing that probably stands out is that there are no contact addresses being entered on the command line.
The service and repository URLs are derived from a properties file that is included in the top-level "conf" directory of the cloud-client package. An example file is this cloud.properties file which is currently distributed for the Nimbus cloud.
Note: How properties files and command-line overrides work is covered in detail in a later section; it is all designed to be flexible under the covers. If you don't want to follow the conventions laid out in this "assumptions" section, you will need to understand the later section to know how to change things and produce a good client package or properties file(s) for your users. Continue reading this section first, though, to get the basic ideas.
There are three main groups of assumptions and defaults. The first is the contact and identity information of the workspace service and GridFTP server (see the sample configuration above, where these are specified). The other two groups make up the rest of this "Assumptions" section:
Deriving per-user repository directories
For GridFTP based commands (like --list, --delete, and --transfer) the server to contact is based on the contact in the cloud properties file. The X509 identity to verify is in the cloud properties file. If that property was missing, identity checks would be based on hostname.
Remember that we are not going to discuss the various ways of getting options in this "Assumptions" section.
When you transfer a local file, the target of the transfer is the same filename in your personal repository directory. When you refer to the name of a workspace to run, this name must correspond to a filename in your personal repository directory.
We know where the repository comes from but how is that directory derived?
There are two other components to derive the directory used: the configured base directory property and the hash of the caller's X509 Distinguished Name.
- The configured base directory property. The default configuration for the base directory on the repository node is "/cloud".
- A hash of the caller's X509 Distinguished Name is used as the subdirectory of the base directory. The algorithm for this is based on MD5. It produces a string of eight characters, for example "31ceb17f". The credential being used for the call is inspected to get the user's DN.
The directories for each user are created by the administrator. Any (unlikely) hash collisions would be detected at this point. You can see the hash of any "Globus style" DN with the --hash-print option of the cloud client. For example:
    DN: /DC=org/DC=agrid/OU=people/CN=John Q. Public
    HASH: a9bad55
So with a hypothetical repository hostname "repository.cloudurl.edu", "/cloud" base directory and DN hash of "a9bad55", the derived GridFTP URL of the user's "my-workspace" file will be gsiftp://repository.cloudurl.edu:2811//cloud/a9bad55/my-workspace
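The derivation can be sketched in a few lines of Python. This is only an illustration of how the pieces are joined, not the cloud-client's own code: the function name is hypothetical, the DN hash is taken as an input (obtain it with --hash-print), and port 2811 is taken from the example URL above.

```python
def derive_repository_url(host, base_dir, dn_hash, name, port=2811):
    """Build the GridFTP URL of a file in a user's repository directory."""
    # Normalize so that "/cloud" and "/cloud/" give the same result.
    base = "/" + base_dir.strip("/")
    # The double slash after the port matches the example URL in the text.
    return f"gsiftp://{host}:{port}/{base}/{dn_hash}/{name}"


print(derive_repository_url(
    "repository.cloudurl.edu", "/cloud", "a9bad55", "my-workspace"))
# gsiftp://repository.cloudurl.edu:2811//cloud/a9bad55/my-workspace
```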
Note that there is a cloud-client option to input any name or local file path and see what the derived URL is. See the --extrahelp description of the --print-file-URL option.
As of TP2.2, you can auto-create the user directories using the cloud-admin program.
The second set of assumptions to cover is how a given image file is going to actually work. There are many options that you can specify in regular workspace requests. For example, the memory size, the number of network interfaces to construct, the pool name(s) to lease network addresses from, and the partition name the VM is expecting for the base partition.
Some fixed assumptions are made:
- There can be only one network interface
- The network interface is expecting its address via DHCP
- There can be only one partition file, for the root partition, configured with an ext2/ext3 filesystem. Other filesystems may not work correctly (this has to do with the cloud's default kernel as well as its ability to edit the image's files before boot).
The rest of the launch request is filled in by default configurations:
- Request 3584 MB of memory
- Request networking address from a pool named public
- Mount the partition to sda1
The previous section summed up the defaults and main assumptions. Opting to follow these conventions in your cloud leads to these configuration conclusions:
Install the workspace service in resource pool mode.
Configure a network for addresses to lease from and call it "public".
Create a cloud.properties file for your cloud with the values in this example file changed to reflect the correct URLs and identities.
If you need to adjust the default memory request, add a line of text like so to the cloud.properties file you will distribute: vws.memory.request=2560
Create a /cloud directory on the repository node.
For each user, take the hash of their DN (using --hash-print) and create a directory for them under the /cloud base directory.
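The per-user directory step might look like the following minimal Python sketch. It assumes the hash has already been obtained with --hash-print; the function name is hypothetical, and the sketch does not set ownership or permissions, which you would still need to handle for your GridFTP account scheme. Because the directory is created without tolerating an existing one, a hash collision (or a user added twice) surfaces as an error at this point.

```python
import os
import tempfile


def create_user_directory(base_dir, dn_hash):
    """Create base_dir/dn_hash and fail if it already exists."""
    path = os.path.join(base_dir, dn_hash)
    # os.mkdir raises FileExistsError when the directory is already there,
    # which is how a collision or duplicate add would show up.
    os.mkdir(path)
    return path


# Demonstration in a temporary directory standing in for /cloud.
with tempfile.TemporaryDirectory() as base:
    created = create_user_directory(base, "a9bad55")
    print(os.path.basename(created))
    try:
        create_user_directory(base, "a9bad55")
    except FileExistsError:
        print("collision detected")
```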
Properties and Options
This section goes into more detail about the property file and commandline configurations. This is especially important to understand if you want to diverge from the defaults above.
All commands go through cloud-client.sh which in turn invokes the actual cloud client program. The cloud client is written in Java and installed at lib/globus/lib/workspace_client.jar.
Before calling this program, the script sets up some things:
- ../conf/cloud.properties is set as the user properties file
- ../lib/globus becomes the new GLOBUS_LOCATION (overriding anything previously set)
- ../lib/certs is set as a directory to add to the trusted X509 certificate directories for identity validations (the client verifies it is talking to the right servers). Adding the CA cert(s) of the workspace service and GridFTP host certificates to this directory ensures that the user will not run into CA (trusted certificates) problems.
The cloud client program respects settings from three different places, listed here in the order of precedence:
Commandline arguments - If the client uses one of the optional flags listed in ./bin/cloud-client.sh --extrahelp, these values are used. Many things can be overridden this way, including the service contacts.
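User properties file - Settings in the user properties file (by default conf/cloud.properties) are used next. Note that you can include different properties files and have your users switch between clouds using ./bin/cloud-client.sh --conf ./conf/some-file.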
If no --conf argument is supplied, the default file cloud.properties needs to exist. If you need to change this in your client distribution for cosmetic reasons, you can do so by editing the one relevant line at the top of ./bin/cloud-client.sh
Embedded properties file - A properties file lives inside the workspace client jar (which is installed into lib/globus/lib/workspace_client.jar). This controls all the remaining configurations.
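The order of precedence can be illustrated with a short Python sketch. It is not the client's implementation: the parser is a simplification that handles only plain key=value lines and '#' comments, and the variable names are hypothetical. The property names and values come from the examples on this page.

```python
from collections import ChainMap


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and '#' comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


embedded = parse_properties("""
# Default memory request
vws.memory.request=3584
vws.poll.interval=2000
""")
user_file = parse_properties("vws.memory.request=2560")
command_line = {}  # values from optional flags would go here

# ChainMap searches its maps left to right, so earlier maps win.
settings = ChainMap(command_line, user_file, embedded)
print(settings["vws.memory.request"])  # 2560
print(settings["vws.poll.interval"])   # 2000
```

The user file overrides the embedded memory default, while the poll interval falls through to the embedded value because nothing else sets it.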
There are (intentionally) no fallback settings for the properties found in that sample cloud.properties file:
- ssh.pubkey (Path to SSH public key to log in with)
- vws.factory (Host+port of Virtual Workspace Service)
- vws.factory.identity (Virtual Workspace Service X509 identity)
- vws.repository (Host+port of image repository)
- vws.repository.identity (Image repository X509 identity)
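Since these properties have no embedded default, it can be useful to check a cloud.properties file before distributing it. The following Python sketch is a hypothetical packaging check, not something the cloud client provides; the host and port values are illustrative. Note that an earlier section says a missing identity property leads to hostname-based identity checks, so an unset identity is a warning for the packager rather than a guaranteed client failure.

```python
NO_EMBEDDED_DEFAULT = (
    "ssh.pubkey",
    "vws.factory",
    "vws.factory.identity",
    "vws.repository",
    "vws.repository.identity",
)


def unset_properties(props):
    """Return the names from NO_EMBEDDED_DEFAULT that are absent or empty."""
    return [key for key in NO_EMBEDDED_DEFAULT if not props.get(key)]


# Illustrative values only; the identities are deliberately left out.
example = {
    "ssh.pubkey": "~/.ssh/id_rsa.pub",
    "vws.factory": "cloudurl.edu:8443",
    "vws.repository": "repository.cloudurl.edu:2811",
}
print(unset_properties(example))
# ['vws.factory.identity', 'vws.repository.identity']
```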
See the configuration appendix for other, more esoteric defaults that can be tampered with.
To enable one-click clusters, you need to enable the context broker (see this section of the admin guide).
The plugins page discusses the "groupauthz" plugin which provides for many generally useful policies to be enforced, but one in particular is necessary for the cloud configuration to operate properly. The identity-hash based image subdirectories option ensures that propagation source paths and unpropagation target paths are specific to the caller using the hashing algorithm discussed above.
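To make the idea of caller-specific paths concrete, here is a minimal Python sketch of such a check. It is an illustration only and not the groupauthz plugin's code; the function name is hypothetical and the hash is assumed to have been computed already.

```python
import posixpath


def is_path_in_user_directory(path, base_dir, dn_hash):
    """True if path resolves to a location inside base_dir/dn_hash."""
    user_dir = posixpath.normpath(
        posixpath.join("/" + base_dir.strip("/"), dn_hash))
    # Strip leading slashes first: normpath keeps a leading "//" as is.
    # normpath then collapses ".." segments and repeated slashes.
    resolved = posixpath.normpath("/" + path.lstrip("/"))
    return resolved.startswith(user_dir + "/")


print(is_path_in_user_directory(
    "//cloud/a9bad55/my-workspace", "/cloud", "a9bad55"))   # True
print(is_path_in_user_directory(
    "/cloud/a9bad55/../31ceb17f/x", "/cloud", "a9bad55"))   # False
```

Normalizing before comparing matters: the second path starts with the caller's directory as a string but resolves into another user's directory.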
The workspace-control user account is empowered to run all workspaces, so each request must be authorized before the "enactment" command is sent out to workspace-control. That work is done on behalf of the client but, importantly, not as the client.
For the repository node you currently need GridFTP to handle remote transfers. Each cloud user's DN must be in the GridFTP grid-mapfile (an access control list that also maps each DN to a specific unix account). To prevent users from maliciously overwriting each other's files when talking to GridFTP directly, each cloud user must currently be mapped to a unique unix account which is part of a unique unix group on the repository node.
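One way to audit the unique-account requirement is sketched below in Python. It assumes the common grid-mapfile layout of a quoted DN followed by a single local account name per line (comma-separated account lists are not handled), and the account names cloud001 and cloud002 are hypothetical.

```python
import shlex
from collections import defaultdict


def shared_accounts(gridmap_text):
    """Return {account: [DNs]} for accounts mapped from more than one DN."""
    by_account = defaultdict(list)
    for line in gridmap_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Each entry is a quoted DN followed by a local account name.
        dn, account = shlex.split(line)[:2]
        by_account[account].append(dn)
    return {acct: dns for acct, dns in by_account.items() if len(dns) > 1}


sample = '''
"/DC=org/DC=agrid/OU=people/CN=John Q. Public" cloud001
"/DC=org/DC=agrid/OU=people/CN=Jane Doe" cloud002
"/DC=org/DC=agrid/OU=people/CN=Another User" cloud001
'''
print(shared_accounts(sample))
```

An empty result means every DN has its own account; here cloud001 is reported because two DNs map to it.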
See this thread for notes about GridFTP permission schemes.
If the base directory on the repository node is "/cloud", you will need to create a directory under it for each DN, named after the DN's hash. It is recommended that you use the cloud-admin program for this (see next section).
Cloud admin program
As of TP2.2, there is a program installed here: $GLOBUS_LOCATION/share/nimbus-autoconfig/cloud-admin.sh
This program can add new users for you with one command, including creating the directories with the right hash names. During its first "add-dn" invocation, you can set up many default choices including what "sample images" get soft linked to the new directory, etc.
Here are the current options:
    --add-dn "/CN=Some DN"      Adds new DN (interactive)
    --del-dn "/CN=Some DN"      Deletes a DN (interactive)
    --find-dn "/CN=Some DN"     Checks for a DN
    --hash-dn "/CN=Some DN"     Outputs cloud hash for a DN
    --find-hash 1234abcd        Looks in policies for a DN with this hash
    --all-dns                   Prints all active DNs
    --enable-groupauthz         Enables the groupauthz plugin
    --disable-groupauthz        Disables the groupauthz plugin
These are the embedded properties that are shipped with the cloud client. They can also be set in the cloud properties files to override the defaults:
    # Default ms between polls
    vws.poll.interval=2000
    # Default client behavior is to poll, not use asynchronous notifications
    vws.usenotifications=false
    # Default memory request
    vws.memory.request=3584
    # Image repository base directory
    vws.repository.basedir=/cloud/
    # CA hash of target cloud
    vws.cahash=6045a439
    # propagation setup for cloud
    vws.propagation.scheme=scp
    vws.propagation.keepport=false
    # GridFTP transfer timeout, 0 is infinite
    vws.gridftp.timeout=0
    # Metadata defaults
    vws.metadata.association=public
    vws.metadata.mountAs=sda1
    vws.metadata.nicName=eth0
    vws.metadata.cpuType=x86
    vws.metadata.vmmType=Xen
    vws.metadata.vmmVersion=3
    # Filename defaults for history directory
    vws.metadata.fileName=metadata.xml
    vws.depreq.fileName=deprequest.xml