Cloud Client Appendix
Notes on image compression
Up until now you have probably been dealing with raw image files, but there is a way to have the service handle compression and decompression on the VMM nodes. On the repository, you store compressed files.
By storing your image files compressed, you can save on the time it takes for image files to be moved to the VMM nodes. This is especially useful when using cluster mode because many transfers of the same file are often taking place in that case. If you make eight transfers of the compressed version which is then decompressed in parallel on the VMM nodes, you are going to see a lot of overall speedup.
If you want to deploy pre-existing compressed images, you must refer to them exactly by their filenames. For example, you may have "hello-cloud.gz" in your "--list" output so you could run this right now:
Upon reaching the VMM nodes, this will be uncompressed and the enclosed "hello-cloud" file will be used as the VM image.
If you are starting with a non-".gz" file "myimage" in the repository, you launch as normal:
... then save it back to the repository with a new name. If your new name includes the ".gz" suffix, it will be compressed first and then saved back to the repository.
From then on you can run like so:
With uncompressed files, as you learned here, the "--save" command used without the "--newname" flag will overwrite the source image in your personal repository directory (because you are not giving it a "new name" but want it saved).
In a similar vein, if you run a compressed image (using for example "--name something.gz") and then run the "--save" command without the "--newname" flag, it will recompress it by default, overwriting the previous "something.gz" image in your personal repository directory.
Using Science Clouds with virtual private networks
Some clouds will only connect your VM to a virtual private network. This may be dictated by site security policies or simply because public IP addresses are not available for VMs hosted on a cloud. The workspace cloud client works with virtual private network implementations but in order to use it in this way you will need to first join a virtual private network of the site you are working with. To do so, follow the steps below:
Download and install the OpenVPN client. (We recommend version 2.1.) For MacOS users, if you don't have tun/tap device drivers, they can be obtained from here (and seem to work despite the dire warnings).
Check the configuration notes for the cloud you want to use: they will point you to the OpenVPN client configuration file to use. Download this file and save it as something like "my/config/location" file.
Modify the configuration file to point to the correct locations for (1) the CA certificate that signed your credential, (2) the user certificate you use to log in, and (3) the private key associated with this certificate (search for "ca", "cert", and "key" or "SSL/TLS parms" comment). Please note: we currently assume that you will use the same certificate for VPN access and cloud access -- this means that you only have to mail one DN to the cloud administrator (if for some reason this does not work for you mail the cloud administrator).
Go to the directory where OpenVPN is installed (or do the usual path setup) and run the OpenVPN client to connect to the private network on the cloud like so:$ ./openvpn --config /my/config/location
Test your VPN installation by pinging an address given to you by the administrator of your cloud.
Transferring your own images to your personal directory
The cloud client makes it easy to transfer images to your personal repository directory. But first, it needs to conform in the following manner:
The IP address is obtained via DHCP broadcast at boot.
The image is an ext2/ext3 root partition file and it mounts to /dev/sda1
NOTE:On the University of Chicago installation, "/dev/xvda1" is currently required in the /etc/fstab file instead of "/dev/sda1".
The image runs SSHd after boot so that you can login and adapt the image to your needs. Your SSH key will be written to /root/.ssh/authorized_keys on the VM before it is booted (see client configuration instructions for how to pick which key on your local machine gets installed).
To then transfer a local image use the --transfer option of the client. Images are named by their filenames. For example, to transfer an image on your own computer called "helloworld" located in the "/tmp" directory, run the following command:
If you want to transfer the image to the cloud and run using one command, you don't have to specify the --name argument explicitly when using --run. The image name will instead be deduced from the --sourcefile argument. Eg., the command below transfers and runs the helloworld image on your local computer:
(The order of commandline flags does not matter, the client will do the sensible thing.)
Using the Cloud Client with Amazon EC2
The Context Broker and Context Agent support contextualizing across Nimbus clouds as well as Amazon EC2. However the cloud-client itself does not entirely support EC2 yet. Support for this will be provided in a future release. In the mean time, we have provided a facility for initializing a context and allowing you to manually launch EC2 instances. We even generate a script of EC2 launch commands that can usually be run without modification.
To contextualize with EC2, you must use the cloud client #11 or later. It can be retrieved from our archive.
You must also have access to a broker service. You can run your own copy of the broker if you like, but at this time it is recommended to use the service already running on the University of Chicago Nimbus Cloud. From within the cloud client directory, use these commands to initialize and verify your access:
Writing a cluster definition for EC2 is largely the same as doing so for a Nimbus cloud. The only major difference is that you must specify an AMI name as the image of each workspace. An example cluster definition is available here which references a public AMI we maintain.
The cloud client will contact the broker and initialize a context for your cluster. Since it cannot directly launch your EC2 instances, the client will save information about your cluster into temporary files as well as produce a sample EC2 script for you to run. This script will be written to the location specified with the --ec2script option.
Note that EC2 has no concept of time-limited instances. These instances will run until you explicitly terminate them or they shut themselves down.
This command will block, waiting to hear status information from the context broker. In another terminal, you need to run the generated /tmp/launch-ec2-vms.sh script.
Wait for the "cloud-client.sh" command to finish in the first terminal. It finishes when all the nodes have reported to the broker that the context agent process is complete and all the services are ready to go.
Beyond the Cloud Client
The cloud client wraps the functionality of a more primitive client called simply "workspace" which can be found in the cloud client distribution itself: $CLOUD_CLIENT_DIRECTORY/lib/workspace.sh
The cloud client:
- Sets reasonable defaults for the workspace client, allowing users to not think about most things they don't need to. The cloud client reduces the "complexity surface" for anyone using a Nimbus cloud.
- Generates full configuration strings from shortcuts. For example, an image name provided to the cloud client is actually converted into a path that represents the real location of your image on the repository node. Another example is that hours requested are folded into the overall resource allocation request.
- Coordinates calls to the context broker for any contextualization needs for both single instance launches and virtual cluster launches. Coordinates the "user-data" that needs to be sent to the IaaS service (either Nimbus or EC2) in launch requests.
- Can transfer files to/from the repository node using its embedded GridFTP client.
- Packages almost all needed dependencies and other useful utility programs.
Because the cloud-client was authored as a "driver" of the lib/workspace.sh program instead of a full rewrite, as you "peel away" the cloud client layer in order to do something complicated, you might have to miss out on some of those things.
If you inspect the debug logs kept in every launch's history directory, you can find a set of files and arguments to use as an example of lib/workspace.sh
In the run-debug-log.txt file for a single-VM launch, find the text "Created workspace description"
A "metadata.xml" file was constructed on the fly, the lib/workspace.sh program is expecting this as an input. Another crucial input is the "deprequest.xml" file.
Just after that, find a list of arguments. Once you have the two XML files we discussed, you can send all of these arguments through lib/workspace.sh yourself. Note that the "--file" argument concerns a file that it will create after a successful launch.
Running "lib/workspace.sh -h" will give you access to an extensive help system. Note that each action command action (e.g. "--deploy") has its own "-h" output.
Let's do something as an example, let's create a customization task. The cloud client will by default add a customization task on the fly that causes a file you have (~/.ssh/id_rsa.pub) to end up as a file on the VM (/root/.ssh/authorized_keys)
Customization tasks are included as an "optional parameter" file, here is an example:
<OptionalParameters> <filewrite> <content>write me to the file</content> <pathOnVM>/tmp/customizedfile</pathOnVM> </filewrite> </OptionalParameters> </pre>
Run a lib/workspace.sh based deployment with an extra "-o optional.xml" and you will be launching with this new customization task. The example file will appear in the VM even though it did not exist in the repository version of the image. Make sure the directory of the target file exists on the VM or it will trigger an error at deployment time.
To Infinity and Beyond
See the raw interfaces guide to learn all about the WSRF interfaces.