
Nimbus 2.4 Admin Installation

The document assumes the services node you are working on is Linux or some UNIX variant (such as OS X). The VMM nodes are required to be Linux. Most testing is currently carried out on Linux.

There are two system accounts involved:

  • privileged account - Pick a privileged account to run the remote client-facing service. A privileged account on the VMM nodes is necessary as well. This guide assumes the privileged account on the service node and the VMM nodes has the same name, but that is not strictly necessary.

    In this guide (especially in the command samples) we will refer to an account of this type named nimbus with terminal prompts like "nimbus $"

    Note: this is not a privileged account in the sense that root is a privileged account. This is a regular account that you will give special powers to during the setup process. It is an account that will be privileged because it will have control over the VMMs.

  • superuser account - The root account is necessary to install dependencies on the VMM nodes (Xen/KVM, ebtables, etc.) and also to install the Nimbus agent that lives on the VMM nodes (workspace control).

    In this guide (especially in the command samples) we will refer to an account of this type named root with terminal prompts like "root #"

The installation guide is broken up into these steps:

  • Part I: Install/verify dependencies
  • Part II: Install/verify the Nimbus service package
  • Part III: Install/verify workspace-control
  • Part IV: End to end test


Part I: Install/verify dependencies

I.A. Service node (#)

The Nimbus service node installation is now largely self-contained. Starting with version 2.4, Nimbus no longer needs to be deployed into an existing Globus container. The container is automatically embedded into the Nimbus installation.

The service node installer has the following dependencies:

  • Sun Java 1.5 or later. The java command must be on your path
  • Apache Ant
  • Python 2.4 or later
  • If you plan to use the Nimbus web application (disabled by default), you must also have these Python modules installed: pyOpenSSL and python-sqlite2.
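
A quick way to confirm the first three are present and on your path (output formats vary by system; java should report 1.5 or later and python 2.4 or later):

nimbus $ java -version
nimbus $ ant -version
nimbus $ python -V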

I.B. Xen/KVM and libvirt (#)

The workspace service manages Xen or KVM VMs. It interfaces with both of these systems using a library called libvirt.

The workspace service itself does not need to run on a VMM node. You must pick VMM nodes on which to install Xen or KVM; the workspace service will manage these. Initially, pick just one node to install and test with. When all goes well, add more into the pool of VMMs the workspace service can send work to.

You need to install Xen/KVM and get a basic VM running with bridged networking.

Nimbus has been most heavily tested with the Xen 3.1 series although it is known to be working with 3.2 (this may require the "tap:aio" configuration in workspace-control's "xen.conf" configuration file). There is no information yet about Nimbus on the new/upcoming Xen 3.3 series. Nimbus has also been tested with QEMU/KVM 0.11.0.

The VMM nodes require Python 2.4 and libvirt 0.6.3. Python 2.3 should work but it is untested and unsupported.

You can actually put off the whole VMM step until later if you just want to evaluate the Nimbus clients and services: right out of the box Nimbus makes "fake" calls to workspace control (which is the agent that actually manages things on the VMM nodes). This allows the service side to be tested independently of the VMM. You "join them up" by turning off the "fake" switch when you're ready to do end-to-end testing.

* Acquire and install VMM software: (#)

Your Linux distribution probably has good support for libvirt, Xen and KVM (Debian/Ubuntu, RedHat, Gentoo, SUSE, etc.), in which case it is best to defer to distribution-specific notes about how to get Xen or KVM up and running (with both libvirt and bridged networking).

If your Linux distribution doesn't support these packages well, you can always try installing from source or binary, starting at libvirt.org (and either xen.org or linux-kvm.org).

Note: an important step is to make sure that your libvirt installation supports the libvirt "python bindings." This is how the Nimbus software actually interacts with libvirt. In some cases, the bindings (which come with a normal libvirt installation) are not installed. On some systems you need to trigger this by installing the "python-dev" package first. To check that you have the right install, Nimbus includes a script in the workspace-control download:

root # ./sbin/test-dependencies.sh

If this script reports you have a problem, look into your distribution's libvirt support and don't hesitate to contact the workspace-user mailing list with problems.
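
If you want a quick manual check that the bindings are importable (the test-dependencies.sh script above remains the authoritative check), the following should print the libvirt version number when run with the Python interpreter that workspace-control will use:

root # python -c "import libvirt; print libvirt.getVersion()"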

* Test Xen: (#)

You can test Xen with almost any Xen VM image. In this guide we will use a small and simple VM image as an example. Look here for some ideas about where to find Xen VM images. You can also create one from scratch, of course, if you have the time.

In particular, we will be using a ttylinux-based image. ttylinux is a very small, yet functional, Linux distribution requiring only around 4 MB. Visit the ttylinux home page for a list of some of its many nice features.

You can download a tarball containing the image here. Unpacking it produces a directory that should contain the following files:

  • ttylinux-xen.img: The partition image
  • libvirt-ttylinux.xml: A sample libvirt configuration

Take into account that the provided ttylinux image is not the exact image you can download from the ttylinux home page. It is preconfigured to obtain a network address through DHCP, and it depends on a different root device than a regular ttylinux image (to make the image more Xen-friendly).

Test the image using the provided configuration file. First, make sure you replace the values of the kernel and disk parameters inside the file with values appropriate for your system. In particular, you should point kernel to your Xen guest kernel and disk to the location of the ttylinux-xen.img file.
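
For reference, a Xen domain definition of this kind generally looks something like the sketch below. This is only an illustration of where the kernel and disk values live; the provided libvirt-ttylinux.xml is the file you should actually edit, and the paths and bridge name here are placeholders:

<domain type='xen'>
  <name>ttylinux</name>
  <memory>65536</memory>
  <os>
    <type>linux</type>
    <kernel>/boot/vmlinuz-2.6-xen</kernel>          <!-- placeholder: your Xen guest kernel -->
    <cmdline>root=/dev/xvda ro</cmdline>            <!-- the provided file sets the correct root device -->
  </os>
  <devices>
    <disk type='file' device='disk'>
      <source file='/path/to/ttylinux-xen.img'/>    <!-- placeholder: the partition image location -->
      <target dev='xvda'/>
    </disk>
    <interface type='bridge'>
      <source bridge='xenbr0'/>                     <!-- placeholder: your bridge name -->
    </interface>
    <graphics type='vnc' port='5900'/>
  </devices>
</domain>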

To test the image, run the following as root for now (see below):

root # virsh -c "xen:///" create libvirt-ttylinux.xml

Then connect to the VM's console using VNC:

root # vncviewer localhost::5900

You should see ttylinux boot messages, followed by a login prompt. You can log in using user 'root' and password 'root'. You can use ifconfig to check that the networking interface (eth0) is correctly set up. If you do not have a DHCP server, you can also use ifconfig to configure eth0 with a static IP address. Once the network is correctly set up, you should be able to ping out to some other machine besides dom0 and from that other machine be able to ping this address. Note that you should do all this just for the purposes of verifying that the image works correctly.

Keep the image configured to obtain a network address automatically via DHCP (even if your network doesn't have a DHCP server), as the Workspace Service will use DHCP to dynamically set up networking in it.

Note: most Xen documentation you will find online about testing and troubleshooting the install is valid in this situation because you are not doing anything specifically related to Nimbus here.

Note: make sure you read the section below about libvirt connection strings.

Note: the best instructions for libvirt permissions setup are in the 2.5 instructions.

* Test KVM: (#)

We do not have a ready-made image for testing KVM yet. Use an image you have already gotten running via "virsh -c qemu:///system".

Note: make sure you read the section below about libvirt connection strings.

* Libvirt connection: (#)

The workspace-control tool will be operating from a privileged UNIX account that is NOT root. The best way to allow this account to interface with libvirt is to use UNIX domain sockets and group permissions. See libvirtd.conf (usually located in the /etc directory) for details.
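
As one illustration (the option names below are standard libvirtd.conf settings, but the file location, often /etc/libvirt/libvirtd.conf, and the group name will depend on your system), settings along these lines give members of a chosen UNIX group read/write access to the libvirt socket without requiring root:

unix_sock_group = "nimbus"
unix_sock_rw_perms = "0770"
auth_unix_rw = "none"

Add the privileged account to that group and restart libvirtd for the change to take effect.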

The connection strings for KVM and Xen will usually be similar to "qemu:///system" and "///var/run/xend/xend-socket" respectively for this method.

Ultimately, do what is best for your deployment situation; the details can be found on the libvirt site. Whatever connection URI you choose to use, it must be configured in workspace-control's "libvirt.conf" file.

If you are using Xen, see the special note above.

I.C. Misc. libraries for the VMM nodes (#)

There are a few libraries needed on the VMM nodes before workspace control will work.

The ISC DHCP server (or a DHCP server with a compatible conf file) and ebtables are required to be running on each hypervisor node.

Any recent version of each package should be compatible; the scripts distributed with workspace control that automate the configurations were tested with ISC DHCP 3.0.3 and the ebtables 2.0.6 userspace tools.

For more information on why this software is now necessary and how it will not interfere with a site's pre-existing DHCP server, see the network configuration details section in the reference guide.

Since these two pieces of software are relatively common, they may already be present on your hypervisor nodes via the package management system. Check your distribution tools for packages called dhcp (ISC DHCP server) and ebtables. You can also check for the existence of /sbin/ebtables or /usr/sbin/ebtables (for ebtables) and any of the following files for the DHCP server:

  • /etc/dhcp/dhcpd.conf

  • /etc/dhcp3/dhcpd.conf

  • /etc/init.d/dhcpd

  • /etc/init.d/dhcp3-server

If these software packages are not installed, all major Linux distributions include them and you should be able to easily install them with your package management system. For example, "rpm -ihv dhcp-*.rpm", "apt-get install dhcp", "emerge dhcp", etc. And similarly for ebtables.

ebtables requires kernel support in dom0; the default Xen kernel includes this support. If your dom0 kernel does not include this support for some reason, the options to enable are under Networking :: Networking options :: Network packet filtering :: Bridge Netfilter Configuration :: Ethernet Bridge Tables

workspace-control also requires sudo and Python 2.3+ on all VMM nodes under the control of the Workspace Service.
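
One way to spot-check these prerequisites on a VMM node (exact paths and init script names vary by distribution):

root # ls /sbin/ebtables /usr/sbin/ebtables 2>/dev/null
root # ls /etc/dhcp/dhcpd.conf /etc/dhcp3/dhcpd.conf /etc/init.d/dhcpd /etc/init.d/dhcp3-server 2>/dev/null
root # sudo -V
root # python -V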

I.D. SSH (#)

Finally, the privileged account on the service node needs to be able to SSH freely to the VMM nodes (without needing a password). And vice versa, the privileged account on the VMM nodes needs to be able to freely SSH back to the service nodes to deliver notifications.

You will be given the opportunity to get this right later during the service auto-configuration steps, when the relevant security setups will be tested interactively.
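
If you have not set up key-based SSH before, the usual recipe looks something like this (shown from the service node; "vmmhost" is a placeholder for one of your VMM node hostnames, and the same steps apply in the reverse direction from the VMM node back to the service node):

nimbus $ ssh-keygen -t rsa
nimbus $ ssh-copy-id -i ~/.ssh/id_rsa.pub nimbus@vmmhost
nimbus $ ssh nimbus@vmmhost /bin/true

The last command should return without prompting for a password. If ssh-copy-id is not available on your system, append the contents of ~/.ssh/id_rsa.pub to ~/.ssh/authorized_keys on the target account by hand.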


Part II: Install/verify the Nimbus service package (#)

II.A. Download and install (#)

* Retrieve and unpack:

Grab the main Nimbus archive from the downloads page and unpack the archive:

nimbus $ wget http://www.nimbusproject.org/downloads/nimbus-2.4.tar.gz
nimbus $ tar xzf nimbus-2.4.tar.gz

* Build and install: (#)

Starting with Nimbus version 2.4, you do not need to run the Ant install scripts directly. You can run a single command to set up the Nimbus installation environment, build and deploy the software, and perform basic configuration. First change into the unpacked archive directory.

nimbus $ cd nimbus-2.4

Then run the install program, specifying a destination path where you would like Nimbus to be installed. If this path exists, it must be an empty directory that you can read and write. If the path does not exist, you must have write access to the parent directory (so the installer can create a new directory).

nimbus $ ./bin/install /destination/path

This command will initialize the Nimbus home directory at the specified path. It will install a service container under the "services/" directory. It will then build and install Nimbus from source and deploy it to the container. Finally, it will run the "bin/nimbus-configure" program to help you set up an operational install. Follow the instructions provided by this program.

Hereafter in this guide, the Nimbus destination path you specified will be referred to as $NIMBUS_HOME.
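
For convenience, you can set this as an environment variable so that later commands can be pasted as-is (this is purely a convenience for following the guide):

nimbus $ export NIMBUS_HOME=/destination/path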

Several Nimbus components are built and installed by default:

  • The Java based RM API and workspace service - VM/VMM manager
  • Default clients
  • WSRF frontend - a remote protocol implementation compatible with the default clients
  • EC2 SOAP frontend - a remote protocol implementation compatible with EC2 SOAP clients
  • EC2 Query frontend - a remote protocol implementation compatible with EC2 Query clients (such as boto or typica). This frontend is separate from the standard service container. It listens on its own port.
  • Nimbus Context Broker - a service which is used to assemble virtual clusters on-the-fly.

For more information about what these things are, see the FAQ.

You will not be able to run against the EC2 frontends until you have set up the cloud configuration, because they rely on users having a personal repository directory (out of scope of this document).

After the install program finishes, you should have a nearly complete Nimbus installation in the $NIMBUS_HOME directory. Take a look, it contains several important files and directories.

  • bin/ - contains programs for managing and configuring Nimbus. "bin/nimbus-configure" was run at the end of the install and can be rerun to adjust the same configuration options. "bin/nimbusctl" is used to start and stop Nimbus services.
  • services/ - GT container into which Nimbus has been deployed. Most of the configuration files live under here, in "etc/nimbus/".
  • var/ - has a simple Certificate Authority which is used by the Context Broker and which, in the default install, can also be used for host and user certificates. This directory also contains log files for the running service.
  • web/ - the Nimbus web application. By default this is not enabled.
  • nimbus-setup.conf - This file is written by the "bin/nimbus-configure" program. It contains configuration values. If you wish to change one of these values, you can usually edit this file and rerun "bin/nimbus-configure".

At this point in the installation process, Nimbus is installed and configured in "fake" mode, in which the services run and accept requests, but no actual VMM nodes are involved. The following network ports are used:

  • Service container - port 8443 - edit this in "$NIMBUS_HOME/sbin/run-services.sh"
  • EC2 Query frontend - port 8444 - edit this in "$NIMBUS_HOME/services/etc/nimbus/query/query.conf"

II.B. Necessary configurations (#)

There are a few configurations that cannot have any defaults.

We provide an auto-configuration program to gently take you through these configurations. During the process, several tests will be made to verify your setup.

This is installed by default, run the following command to get started:

nimbus $ $NIMBUS_HOME/services/share/nimbus-autoconfig/autoconfig.sh

Otherwise you can refer to this section of the reference guide to see the old instructions. Those instructions may shed some light on certain configurations as you move past the testing stage and want to know more about what is happening (but honestly, the best place to look for configuration insight is in the .conf file inline comments).

* Authorization: (#)

In the default install, Nimbus uses two files to manage authorization of users on the remote interfaces: users with X.509 credentials are checked against a grid-mapfile and users of the EC2 Query frontend are checked against the query users.txt list (and mapped to an X.509 Distinguished Name).

The grid-mapfile is located at "$NIMBUS_HOME/services/etc/nimbus/nimbus-grid-mapfile" by default, but you can have Nimbus use another file by editing the "gridmap" entry of "$NIMBUS_HOME/nimbus-setup.conf" and rerunning the nimbus-configure program.

A grid-mapfile is basically an access control list. It says which remote identities can access the container or a specific service. Note however that you need to ensure that any identity specified in this file is from a "trusted" Certificate Authority. CAs are trusted by placing their certificate in the trusted-certs directory (by default: "$NIMBUS_HOME/var/ca/trusted-certs/").

Users of the EC2 Query frontend do not provide X.509 credentials, but instead use symmetric authentication tokens. These tokens are matched against entries in the query users file. By default, this file is located at "$NIMBUS_HOME/services/etc/nimbus/query/users.txt".

In the cloud configuration you will see that there is a handy way to make the grid-mapfile (and the users.txt file) just a basic entry barrier. The real authorization decision can be made by more fine grained policies on a per user basis (for example, limiting certain users to certain amounts of total VM time, etc.).

II.C. Test call (#)

If you used the auto-configuration program, you are not ready for a live test yet. You will come back to this section after configuring workspace-control. Jump there now.

If you did not follow the auto-configuration program, currently Nimbus is set to "fake" mode. This allows you to get the service and VMM nodes working independently before you "join them up" for the live end to end test.

* Set up user certificate (#)

Note: There is a known bug that prevents users of Python versions below 2.5 from executing this step correctly. If you have a Python version lower than 2.5, please replace the $NIMBUS_HOME/sbin/nimbus-new-cert.py script with the fixed version available here. We apologize for this issue; the fix will be included in the next release.

Before starting, you should create an X.509 user credential to access the workspace service if you haven't done so yet. We provide a tool to create the user certificate and key from the embedded Nimbus Certificate Authority. Run the following command to start the tool, and follow its instructions:

nimbus $ $NIMBUS_HOME/bin/nimbus-new-cert

If everything went well, the new certificate and key should be created in $HOME/.globus, and the DN of the certificate should be displayed in the program output:

Success! The DN of the new certificate is: "/O=Auto/OU=NimbusCA/CN=Bob"

To enable access to the Workspace Service for a credential, its DN should be added to the nimbus-grid-mapfile, located at "$NIMBUS_HOME/services/etc/nimbus/nimbus-grid-mapfile". Add a line like this, using the DN from the generated certificate:

"/O=Auto/OU=NimbusCA/CN=Bob" test_account

* Start Nimbus services (#)

You can now start the Nimbus services using the "$NIMBUS_HOME/bin/nimbusctl" command:

nimbus $ $NIMBUS_HOME/bin/nimbusctl start

It may take several seconds for everything to initialize. You should check the output in the logfile to ensure that the services started up okay:

nimbus $ tail -f $NIMBUS_HOME/var/services.log

* Client: (#)

Set up needed environment variables by sourcing the generated 'environment.sh' file, then run this program:

nimbus $ . $NIMBUS_HOME/sbin/environment.sh
nimbus $ cd $GLOBUS_LOCATION
nimbus $ ./bin/workspace
Problem: You must supply an action.
See help (-h).

OK, let's check out help, then.

nimbus $ ./bin/workspace -h

See sample output here.

A lot of options. This is the scriptable reference client. The cloud client is more user friendly. The cloud configuration which supports the cloud client is the recommended setup to provide users access to, because the cloud client offers a low entry barrier to people who just want to start getting work done.

For this guide we are going to run two of the actions listed in help, --deploy and --destroy. You can experiment with other ones, each action has its own help section.


* Deploy test: (#)

Grab this test script.

nimbus $ wget http://www.nimbusproject.org/docs/2.4/admin/test-create.sh

This script references a sample deployment file located at: "$NIMBUS_HOME/services/share/nimbus-clients/sample-workspace.xml". You may need to edit this file to specify an image that exists in your VMM's "/opt/nimbus/var/workspace-control/images/" directory.

Run it:

nimbus $ sh test-create.sh

Sample successful output is here.

As you can see, an IP address was allocated to the VM, a schedule given, and then some state changes reported.

If you open another terminal and run the destroy command there, you can see the state change after the destruction, too. Or you could type CTRL-C to exit this command and run destroy in the same terminal.

nimbus $ ./bin/workspace -e test.epr --destroy

... and we get something like:

Destroying workspace 2 @ "https://10.20.0.1:8443/wsrf/services/WorkspaceService"... destroyed.

If you look at this file, you will now see some usage recorded:

nimbus $ cat $NIMBUS_HOME/services/var/nimbus/accounting-events.txt

The "CREATED" line is a record of the deployment launch. A reservation for time is made.

The "REMOVED" line is a record of the destruction. A recording of the actual time used is made. These actual usage recordings stay long term in an internal accounting database and (along with any current reservations) can be used to make authorization decisions on a per-user basis.


Part III: Install/verify workspace-control (#)

III.A. Download and install (#)

Download the "Control Agents" tar file from the download page, and untar it. This archive contains both workspace-control and the workspace pilot. For this configuration we are using workspace-control, so copy it to the destination directory.

This guide assumes you are using "/opt/nimbus" as the target directory of the install. You need root privileges to complete the installation.

* Create privileged user: (#)

First, you need to choose (or create) a user that will be used to run the backend script (for this guide, let's assume that your user is called nimbus). Since we do not allow the backend script to be run as root, this user will rely on sudo to run Xen commands and other privileged commands.

Next, as root you should change permissions like so:

root # cd /opt/nimbus
root # chown -R root bin etc lib libexec src
root # chown -R nimbus var
root # find . -type d -exec chmod 775 {} \;
root # find . -type f -exec chmod 664 {} \;
root # find bin sbin libexec -iname "*sh" -exec chmod 755 {} \;

III.B. Necessary configurations (#)

* Configure sudo:

Using the visudo command, add the sudo policies printed out by the installer to the /etc/sudoers file. These policies should look something like this.

nimbus ALL=(root) NOPASSWD: /opt/nimbus/libexec/workspace-control/mount-alter.sh
nimbus ALL=(root) NOPASSWD: /opt/nimbus/libexec/workspace-control/dhcp-config.sh
nimbus ALL=(root) NOPASSWD: /opt/nimbus/libexec/workspace-control/xen-ebtables-config.sh

These policies reflect the user that will be running workspace control (nimbus) and the correct full paths to the libexec tools. See "/opt/nimbus/etc/workspace-control/sudo.conf" for more information.

Also note that there are separate ebtables scripts for Xen and KVM.

The Xen ebtables script is configured by default. If you are using KVM, you must configure the "kvm-ebtables-config.sh" script in two places. First in the sudo rules so that it can be invoked (see workspace-control's "sudo.conf" file for details). Second, in workspace-control's "networks.conf" file.

Note: currently the KVM ebtables script can only support spoofing protection when there is one KVM virtual machine running at a time on each VMM node (this is the most common deployment configuration for sites supporting science). Nimbus' Xen support allows many guest VMs to be running while also ensuring there is no MAC and IP address spoofing.

You may need to comment out any "requiretty" setting in the sudoers policy:

#Defaults    requiretty

The commands run via sudo are not using a terminal and so if you have "requiretty" enabled, this can cause a failure.

* Configure DHCP: (#)

DHCP is used here as a delivery mechanism only; these DHCP servers do NOT pick the addresses to use on their own. Their policy files are dynamically altered by workspace-control as needed. Policy additions include the MAC addresses, which are used to make sure the requester receives the intended DHCP lease.

Configuring the DHCP server consists of copying the example DHCP file "dhcp.conf.example" (included in the "share/workspace-control" directory) to "/etc/dhcp/dhcpd.conf" and editing it to include the proper subnet lines (see the contents of the example file). The subnet lines are necessary to get the DHCP server to listen on the node's network interface. So, make sure that you add a subnet line that matches the subnet of the node's network interface. No lease configurations, available ranges, etc. should be added: these are added dynamically to the file after the token at the bottom.
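
For example, if the VMM node's interface sits on the 192.168.0.0/24 subnet (an illustrative value; substitute your own network), the matching declaration is simply an empty block:

subnet 192.168.0.0 netmask 255.255.255.0 {
}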

In most cases it is unnecessary, but if you have a non-standard DHCP configuration you may need to look at the "dhcp-config.sh" script in the protected workspace bin directory and look at the "adjust as necessary" section. The assumptions made are as follows:

  • DHCP policy file to adjust: "/etc/dhcp/dhcpd.conf"
  • Stop DHCP server: "/etc/init.d/dhcpd stop"
  • Start DHCP server: "/etc/init.d/dhcpd start"
  • The standard unix utility "dirname" is assumed to be installed. This is used to find the workspace-control utilities "dhcp-conf-alter.py" and "ebtables-config.sh", we assume they are in the same directory as "dhcp-config.sh" itself. Paths to these can alternatively be hardcoded to fit your preferred configuration.

The "aux/foreign-subnet" script (in the workspace control source directory) may be needed for DHCP support. It allows VMMs to deliver IP information over DHCP to workspaces even if the VMM itself does not have a presence on the target IP's subnet. This is an advanced configuration, you should read through the script's leading comments and make sure to clear up any questions before using. It is particularly useful for hosting workspaces with public IPs where the VMMs themselves do not have public IPs. This is because it does not require a unique interface alias for each VMM (public IPs are often scarce resources).

* Configure kernel(s): (#)

Copy any kernels you wish to use to the /opt/nimbus/var/workspace-control/kernels directory, and list them in the authz_kernels option in the kernels.conf configuration file. This lets clients choose from these kernels in the metadata, but the kernels must already exist on the hypervisor node and must be in the guestkernels list.
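
For example (the kernel filename here is a placeholder; use your actual Xen guest kernel), copying a kernel into place might look like this, after which you would add its filename to the authz_kernels option in kernels.conf:

root # cp /boot/vmlinuz-2.6-xen /opt/nimbus/var/workspace-control/kernels/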

* Configure network(s): (#)

The workspace-control networks.conf configuration file contains notes about specifying the bridge name to use.

III.C. Testing (#)

For testing with a real VM using the test-create.sh script (see the end-to-end test section below), add your test VM image from the Xen section.

The script is expecting a file named "ttylinux-xen" in the /opt/nimbus/var/workspace-control/images directory because of the file://ttylinux-xen line in the xml definition file the script points to ($NIMBUS_HOME/services/share/nimbus-clients/sample-workspace.xml).
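
Assuming you kept the ttylinux-xen.img file from the Xen test earlier, putting it in place under the expected name might look like this (make sure the nimbus account can read it):

root # cp ttylinux-xen.img /opt/nimbus/var/workspace-control/images/ttylinux-xen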


Part IV: End to end test (#)

Now revisit the service Test call section.

If you did not use the auto-configuration program:

You are now ready to turn "fake" mode off and try a real VM launch. Back on the Nimbus services node:

nimbus $ nano -w $NIMBUS_HOME/services/etc/nimbus/workspace-service/other/common.conf

... and change the "fake.mode" setting to "false"

fake.mode=false

Now revisit the service Test call section.





Do not hesitate to contact the workspace-user mailing list with problems.

We plan to streamline some of the steps and also significantly add to the troubleshooting and reference sections.