Warning: This document describes an old release. Check here for the current version.

Nimbus Infrastructure Changelog

For cloud client changes, see here.




2.10 - Summary
  • qcow2 images are now supported on KVM hypervisors. This feature is disabled by default and must be enabled by administrators. See the documentation for more information.

  • Copy-on-write support based on the qcow2 format has been added to Nimbus and is available on KVM hypervisors. When used in coordination with the image cache, this feature can decrease the time needed to start virtual machines. This functionality is disabled by default and must be enabled by administrators. See the documentation for more information.

  • It is now possible to set the networks associated with an instance type in the same way CPU and memory are set.

  • It is now possible to select a kernel using the EC2 interface.

  • nimbus-admin --shutdown --all continues working even when it encounters Corrupted instances.

  • nimbus-admin --shutdown now has a --force flag to activate the hard shutdown if the soft shutdown fails.

2.10 - IaaS Services

2.10 - Cumulus

2.10 - Control Agents

2.10 - Additional Notes
  • List of all commits between Nimbus 2.9 final and Nimbus 2.10 RC2.




2.9 - Summary
  • A suite of tools has been developed to allow a Nimbus administrator to easily see the current state of all running VMs, kill VMs, and to free up potentially orphaned resources.

  • Availability zones support has been added. This feature extends the existing Nimbus VMM pools feature to allow users to explicitly select which VMM pool they would like their VM to be run on. This functionality is exposed to users as Availability Zones in the Amazon EC2 APIs.

  • Support was added for multiple CPUs in EC2 instance types.

  • Administrators can enable a feature that will allow users to discover on what physical machine their VM is running.

  • LANTorrent propagation has been made more robust. Complex bugs related to cancellation have been fixed.

  • A bug in the ec2 query token API that occurs when looking up the details of specific image IDs has been fixed.

2.9 - IaaS Services

2.9 - Additional Notes
  • List of all commits between Nimbus 2.8 final and Nimbus 2.9 final.




2.8 - Summary
  • This release contains many important bug fixes as well as some new features.

  • Propagation by means of a file system copy: this can greatly decrease the boot time of VMs on systems where a fast shared file system exists.

  • VM image caching: this will greatly increase the boot performance for clouds with a base image that is launched often (a common use case). This works with any propagation method.

  • libvirt template support added. A cloud administrator can now completely control the options sent to libvirt when starting a virtual machine by editing the template file.

  • ImportKeyPair is implemented in the EC2 protocols (details below).

2.8 - IaaS Services

2.8 - LANTorrent

2.8 - Control Agents
  • libvirt template support added. A cloud administrator can now completely control the options sent to libvirt when starting a virtual machine by editing the template file /opt/nimbus/etc/workspace-control/libvirt_template.xml on the VMM nodes.

  • A cache of propagated images can now be kept on each VMM. Before an image is propagated the cache is checked for an image with a matching checksum. If found that image is used and no propagation is needed which can save a significant amount of time. This will work with any propagation mechanism.

  • The cp (copy) propagation driver has been introduced in this release. This prepares an image for use by a VMM by copying it directly out of the Cumulus data store and into a temporary location from which it will be booted. For users with shared fast file systems this can bring great performance benefits.

  • Bug Fixes
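
The image cache lookup described above (check for a matching checksum before propagating) can be sketched roughly as follows. This is a toy in-memory illustration, not the workspace-control implementation: the class name, the method names, and the choice of SHA-256 as the digest are all assumptions made for the example.

```python
import hashlib


class ImageCache:
    """Toy stand-in for a per-VMM cache of propagated images (hypothetical)."""

    def __init__(self):
        self._by_checksum = {}
        self.propagations = 0  # how many times a real transfer was needed

    def get_image(self, checksum, propagate):
        """Return image bytes, calling propagate() only on a cache miss."""
        cached = self._by_checksum.get(checksum)
        if cached is not None:
            return cached  # cache hit: no propagation needed
        data = propagate()
        self.propagations += 1
        # Only keep data whose checksum matches what was requested.
        if hashlib.sha256(data).hexdigest() == checksum:
            self._by_checksum[checksum] = data
        return data


image = b"fake image contents"
checksum = hashlib.sha256(image).hexdigest()
cache = ImageCache()
first = cache.get_image(checksum, lambda: image)   # miss: propagates
second = cache.get_image(checksum, lambda: image)  # hit: served from cache
```

Because the lookup is keyed only by checksum, the same cache works regardless of which propagation mechanism supplies the `propagate` callable.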

2.8 - Additional Notes
  • List of all commits between Nimbus 2.7 final and Nimbus 2.8 final.




2.7 - Summary
  • Support for backfill and spot VM instances was introduced. Backfill instances are configured by the administrator and automatically start on idle resources. When user requests are received, backfill instances are preempted and terminated.

    Spot instances are similar to backfill, but are initiated by the user. Users may "bid" on VM slots and compete for available resources. Like backfill, spot instances can be preempted and terminated at any time. Preemption occurs when either a real (non-spot) request or a spot request with a higher bid is received and no other resources are available.

  • The EC2 Query interface has substantially improved compatibility with EC2 clients. The generated XML is now largely identical.

  • Idempotent instance creation is now supported, via the EC2 interfaces.

  • There are also numerous bug fixes and minor enhancements.
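
The spot preemption rule described above can be sketched as a small decision function. This is an illustration only: the function name and data layout are hypothetical, and the choice of the lowest bidder as the instance to preempt is an assumption, since the text does not say which instance is selected.

```python
def choose_preemption(running_spot_bids, free_slots, request_bid=None):
    """Decide which running spot instance (if any) to preempt.

    running_spot_bids: dict mapping instance id -> bid of running spot VMs.
    free_slots: number of idle VM slots.
    request_bid: bid of an incoming spot request, or None for a regular
                 (non-spot) request.
    Returns the id of the instance to preempt, or None.
    """
    if free_slots > 0 or not running_spot_bids:
        return None  # resources are available, or nothing can be preempted
    # Assumption: the lowest bidder is the first candidate for preemption.
    victim = min(running_spot_bids, key=running_spot_bids.get)
    if request_bid is None:
        return victim  # a regular request outranks any spot instance
    if request_bid > running_spot_bids[victim]:
        return victim  # a higher spot bid displaces the lowest bidder
    return None


running = {"vm-a": 0.10, "vm-b": 0.25}
regular = choose_preemption(running, free_slots=0)
higher_bid = choose_preemption(running, free_slots=0, request_bid=0.20)
lower_bid = choose_preemption(running, free_slots=0, request_bid=0.05)
has_room = choose_preemption(running, free_slots=1, request_bid=0.50)
```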

2.7 - IaaS Services

2.7 - LANTorrent

2.7 - Control Agents

2.7 - Additional Notes
  • List of all commits between Nimbus 2.6 final and Nimbus 2.7 final.




2.6 - Summary
  • This is the first release of LANTorrent, a fast multicast file distribution protocol designed to saturate all the links in a switch. This works best for situations with a local area network, large files, and many cooperative peers that need the same file -- i.e., it is geared towards IaaS image propagation (but could work in other scenarios).

  • Dynamic VMM configuration management is now possible with the new nimbus-nodes program. This allows you to adjust the resource pool while the service is still running, adding and removing resources on the fly.

  • The context broker has a new client-side HTTP/REST interface in addition to WSRF. Users authenticate with the same tokens they use for Cumulus and the Elastic Query API. This opens up the context broker for several new client integrations including ones using alternate languages.

  • Cumulus now supports the S3 COPY operation.

  • A new upgrade tool is introduced, install-from. This assists with updating previous Nimbus installations (2.5 and higher). It currently requires that the old Nimbus services are all stopped and no VMs are deployed.

  • nimbus-import-users is a new program that allows multiple cloud installations to coordinate user information with each other.

  • nimbus-public-image is a new program that allows administrators to register VM images in the local Cumulus repository that will be usable for all users of the system (but of course only stored once on disk).

  • As usual, some bug fixes and minor enhancements.

2.6 - Installer
  • install-from program

    A new upgrade tool is introduced. It assists with updating previous Nimbus installations (2.5 and higher). It currently requires that the old Nimbus services are all stopped and no VMs are deployed.

    You use this program instead of the regular install instructions. Most of the installation process is exactly as normal, but install-from will base most of the initial configurations on your old installation.

    It is not yet entirely Magic™ so you will need to follow the instructions in the upgrade guide to make it happen.

2.6 - IaaS Services
  • nimbus-nodes program

    Dynamic VMM configuration management is now possible with the new nimbus-nodes program. This allows you to adjust the resource pool while the service is still running, adding and removing resources on the fly.

    For information and instructions, see this section of the administrator guide.

  • nimbus-import-users

    This is a new program that allows multiple cloud installations to coordinate user information with each other.

    You can "dump" user information into a text file and this allows you to import it elsewhere. This allows administrators (or larger groups) to coordinate users across clusters/installations.

    It is compatible with the nimbus-list-users tool. For example, you can run: ssh nimbus@othercloud nimbus-list-users % | nimbus-import-users

  • nimbus-public-image

    This is a new program that allows administrators to register VM images in the local Cumulus repository that will be usable for all users of the system.

    These are read-only images that all Nimbus users will see as an option in their "--list" output for running.

    To save these images as new, derivative templates, the user would need to run "--save --new-name" and create a new object.

  • Passthrough configuration

    When an image file URL scheme is not the normal "cumulus://", previously the service would always pass this along to the VMM to interpret on its own.

    Now you can explicitly set what propagation to use for "cumulus://" and what passthrough schemes are allowed (if any).

    Enhancement 7092 shows where to find the configuration.
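
The scheme check described above might look roughly like the sketch below. It is not the service's code: the function name, the returned labels, the example URLs, and the "scp" label for the Cumulus propagation method are all hypothetical, and the real settings live in the configuration referenced by Enhancement 7092.

```python
from urllib.parse import urlparse


def select_propagation(image_url, cumulus_method, allowed_passthrough):
    """Return a label describing how an image URL would be handled."""
    scheme = urlparse(image_url).scheme
    if scheme == "cumulus":
        # Cumulus URLs use whatever propagation the administrator configured.
        return cumulus_method
    if scheme in allowed_passthrough:
        # Explicitly allowed schemes are handed to the VMM to interpret.
        return "passthrough:" + scheme
    raise ValueError("image URL scheme not authorized: %r" % scheme)


allowed = {"http"}
via_cumulus = select_propagation(
    "cumulus://repo.example.org/images/base.gz", "scp", allowed)
via_http = select_propagation(
    "http://images.example.org/base.gz", "scp", allowed)
```

With an empty `allowed_passthrough` set, only "cumulus://" URLs are accepted.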

  • Bug Fixes

2.6 - Cumulus
  • COPY

    The COPY operation allows you to duplicate an object stored in the system without having to do any data transfer to your client machine.

    It was tested with s3cmd and boto. Future cloud-client versions could use this more natively for image duplication/rename.

  • Redirection

    In order to make Cumulus a scalable service, we added a feature which takes advantage of the temporary redirection error in the Amazon S3 protocol. A Cumulus administrator can create a text file full of cloned Cumulus server contact strings. A maximum number of allowed connected clients is associated with each replicated Cumulus server. If, after a client connects and authenticates, that number is exceeded, then a 301 Temporary Redirect error is returned to the client instructing it to try a different Cumulus server.

  • Postgres DB for Authentication

    Minor changes were made to the service to allow an admin to configure Cumulus so that it will use postgres instead of the default SQLite DB.

  • Bug Fixes
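
The redirection behavior described above can be modeled with a short sketch. It only illustrates the idea of per-server client limits; the class name, the contact string format, and the rule for choosing the redirect target (first other server in file order) are assumptions, not Cumulus internals.

```python
class ReplicaSet:
    """Toy model of replicated Cumulus servers with client limits."""

    def __init__(self, limits):
        # limits: server contact string -> maximum connected clients,
        # in the order the servers are listed.
        self.limits = dict(limits)
        self.connected = {server: 0 for server in self.limits}

    def connect(self, server):
        """Return ("ok", server) or ("redirect", other_server)."""
        if self.connected[server] < self.limits[server]:
            self.connected[server] += 1
            return ("ok", server)
        # Limit reached: tell the client to try a different server.
        # Assumption: redirect to the first other server listed.
        for other in self.limits:
            if other != server:
                return ("redirect", other)
        raise RuntimeError("no alternative server to redirect to")


replicas = ReplicaSet({"cumulus1.example.org:8888": 1,
                       "cumulus2.example.org:8888": 2})
first = replicas.connect("cumulus1.example.org:8888")
second = replicas.connect("cumulus1.example.org:8888")
```

A client that receives a redirect would then repeat its request against the returned server.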

2.6 - LANTorrent
  • Introducing LANTorrent!

    This is the first release of LANTorrent, a fast multicast file distribution protocol designed to saturate all the links in a switch.

    This works best for situations with a local area network, large files, and many cooperative peers that need the same file -- i.e., it is geared towards IaaS image propagation (but could work in other scenarios).

    It is disabled by default and takes extra steps to activate. See this section of the administrator's guide for instructions as well as detailed explanations of how it works.

2.6 - Context Broker
  • HTTP/REST support

    The context broker has a new client-side HTTP/REST interface in addition to WSRF. Users authenticate with the same tokens they use for Cumulus and the Elastic Query API. This opens up the context broker for several new client integrations including ones using alternate languages.

    For example there is a prototype client api being built that integrates with the REST broker.

2.6 - Control Agents

2.6 - Cloud Client
  • Use cloud-client 16 or higher

    As of this release, the latest cloud client is number 17. Cloud client 16 will also work, but 15 or before will not due to the introduction of Cumulus in Nimbus 2.5.

2.6 - Additional Notes
  • List of all commits between Nimbus 2.5 final and 2.6 final.

  • Problems addressed in RC2 vs. RC1 (All commits)

    • Issue with image md5sums not getting recorded in certain situations.

    • Ran into situations where two seconds was not enough time to get a database connection from a pool of connections. The wait is now unbounded, since the timeouts were not being handled correctly.

    • The "bad CPU architecture" remote error message was incorrectly stated.

    • The build and test suite runs on more platforms now by using the /tmp directory for special files.

    • Service support for the new admin client "nimbus-nodes" did not handle a missing configuration gracefully.

    • Cumulus includes a new scalability test.

    • The installer did not handle the lack of the "uuidgen" command well. It now uses a library instead of relying on that command being present.

    • The service was giving a bad error message for image URL schemes (besides 'file' or 'cumulus') that were not explicitly authorized.




2.5 - Summary
  • We are happy to announce the first version of Cumulus!

    Cumulus is a storage cloud implementation compatible with the Amazon Web Services S3 REST API (with small exceptions); it works seamlessly with S3 clients such as s3cmd, jets3t, and boto. In addition, it offers extra functionality such as disk quota enforcement.

    Cumulus replaces the current GridFTP-based upload and download of VM images: it integrates tightly in a Nimbus installation as the VM image repository solution. It can also be installed on its own to manage a storage cloud: a Nimbus IaaS installation needs Cumulus but not vice versa.

    Read more below in the Cumulus section of the changelog.

  • Zero To Cloud installation process

    Besides being the first Cumulus release, the other major event of 2.5 is the first release of the "Zero To Cloud" installation process.

    Many of the new programs and enhancements in this release work together with the goal of providing a seamless installation process. Read more in the changelog below about each new feature, highlights include:

    • A new user management system: users can be added and managed quickly. Their credentials are created on the fly (but there is still an alternative path for coexistence with your own credential system).

    • All of the new user management tools include a machine parsable mode that makes them easy to incorporate into your own scripts.

    • Tight user integration with Cumulus: user management includes user setup with the image repository as well as the IaaS services.

    • Tight integration between the user management tools and the web application. This allows administrators to add a user and instantly receive the secure URL that the new user will visit to pick up his credentials and cloud.properties file.

    • No unix account separation or root account needed for Cumulus -- installation is easier because the IaaS central services and Cumulus can live in the same, non-root unix account.

    • The IaaS services are able to consult information in a database about files the remote user wants to launch. When the "outside" namespace of the file is Cumulus based, there is a translation to an "inside" location and mechanism. This technique is now used to encapsulate propagation mechanisms, allowing for an easy way to introduce new and faster methods.

    All of this combined to allow us to, among other things, provide a new installation process for Nimbus. You can now tend to the central service node installation separately and very quickly. Further, you install in a "fake" mode that doesn't actually invoke hypervisor machines yet. Doing all of this allows you to know that the central services are configured and working (including security), leaving VMM related matters to be tackled separately.

  • Integration with a central, site DHCPd

    Instead of hosting a private DHCPd for each VMM, Nimbus now supports (by default) editing a lease file for a central DHCPd on the LAN. This makes it easier to install and integrate with existing infrastructure. The old method is supported as an advanced configuration. Details below in the IaaS Services and workspace-control sections.

  • New scheduling options

    When using the default scheduler, the VMM selection process is now more configurable:

    There is a round-robin configuration that looks for available nodes with the highest percentage of free RAM, and a greedy configuration that looks for available nodes with the lowest percentage of free RAM. Details below.

  • New options for pilot based scheduling

    The pilot module now sends more information to the local resource manager (such as Torque). This is for accounting and potentially authorization purposes: you can now more easily track remote users and more information about the launch via the resource manager itself.

  • New alternative propagation methods

    There are two new contributed propagation drivers for getting VM images to the hypervisor nodes: HTTP and HDFS. These are for use outside of the cloud client currently, and you need to configure Nimbus to allow for non-Cumulus-based image locations.

    These are for use on sites where the remote user knows more about image locations than the Cumulus name for it. The HTTP driver is also good for having a central image repository (you can specify trusted hosts to pull from).

  • Better performing blankspace: physical partition leases

    In order to support very large temporary space partitions for VMs, and fast access to them, Nimbus now allows you to configure a list of physical partitions to lease out to incoming Nimbus VMs on each node (formatted for each use).

  • Support for user-requested number of cores

    Nimbus now allows remote users to specify the number of cores the VM(s) should have. The authorization policies now support this as well, allowing administrators to specify the maximum allowed for each user/group.

  • New service node dependency requirements

    See the Cumulus section for specifics.

  • Bug fixes and smaller enhancements

    See each section below for specifics.

  • For developers

    Highlights for software developers include a new internal testing framework for the RM API, tight integration with IntelliJ IDEA if you use that IDE, a propagate-only mode (no VMM required), a way to quickly overwrite just jar files in an installation, and better tarball/release management.

2.5 - Installer

2.5 - IaaS Services
  • A new user management system

    The tight integration with Cumulus and a new authorization database have resulted in a way to create a cohesive "user system" for Nimbus clouds that allows the administrator to quickly add, delete, and edit users. In $NIMBUS_HOME/bin, you will find several new programs, including:

    • nimbus-new-user
    • nimbus-edit-user
    • nimbus-list-users
    • nimbus-remove-user

    Each of these commands works well from the terminal but also includes options to make it easy to do a wide array of tasks from scripts.

    In the case of the EC2 Query protocol, the credentials are no longer checked against a text file full of keys: they now integrate with these tools, it is all database driven.

    In order to continue supporting X509 certificates, a map file is still populated with certificate names ("DN"s), but these new commandline utilities should be the thing editing those files in order to keep them in sync. You should use these tools for all user management now.

  • nimbus-new-user program

    The nimbus-new-user program is a particular highlight: it replaces the "cloud-admin.sh" program.

    Right out of the box you can use this to generate new users. It will produce an X509 certificate/key, query ID/key (for use with Cumulus and EC2 Query interfaces), and the cloud.properties file that the user should use. It allows you to specify the set of authorization/credits rules to apply. It also integrates with the Nimbus web application if you choose: it will produce a special single-use URL that you can email to new users to pick up the newly created credentials and the user-specific cloud.properties file.

    There is sample help output and sample usage output in the comments of Enhancement 7035 - new create-user process.

  • nimbus-reset-state program

    The nimbus-reset-state program allows administrators to reset long term accounting data, running state data that tracks the cluster, and Cumulus users and files in bulk.

    It is a destructive program; please read the help output carefully. For safety, it includes an "are you sure?" prompt if you don't pass it a "force" argument.

    Addresses Bug 7021.

  • nimbus-version program

    The nimbus-version program allows you to examine/remember the details of your installation. A metadata file is inserted by the release builder which records things like the exact version, build date, and git commit identifier that the release was built from.

  • Web application enhancement

    The web application contains an update to allow it to integrate with the nimbus-new-user program as well as distribute the personalized cloud.properties file that nimbus-new-user can produce.

    Once you configure the web application (it is not enabled by default), nimbus-new-user can automatically set up a pickup URL for new users. This is a short-lived, obscure URL that you can share with new users to pick up their new credentials (this can be for either X509 or query tokens or both).

    Because the nimbus-new-user tool can be put into a machine parsable mode, the URL can be incorporated programmatically into a script you write to send a welcome email.

  • Support for user-requested number of cores.

    This can be limited by the administrator using the group authorization system.

    See Enhancement 6999 - Integrate multi-core support and add authorization handling for it

    And the cloud.properties file you distribute to cloud users can now contain the suggested number as of cloud client #16 (Enhancement 7062).

    Contributed by Patrick Armstrong, University of Victoria

  • A new propagation namespace system

    The service will treat image URIs as an "external" namespace that is authorized via the new authorization database. They are translated into an internal representation that allows propagation to actually occur (and means that propagation mechanisms can now be entirely pluggable).

    Some propagation drivers may bypass this namespace if you allow, such as the new HTTP and HDFS mechanisms (see the workspace-control section below).

  • New scheduling options

    When using the default scheduler, the VMM selection can now happen in one of two ways, driven by configuration:

    1. A round-robin configuration in resource-locator-ACTIVE.xml (this is the default mode). This looks for matching nodes (enough space to run, appropriate network support, etc.) with the highest percentage of free RAM. If there are many equally free nodes it will pick randomly from those. As should be clear, this favors entirely empty nodes first.

    2. A greedy configuration in resource-locator-ACTIVE.xml. This looks for matching nodes (enough space to run, appropriate network support, etc.) with the lowest percentage of free RAM. If there are many equally unfree nodes it will pick randomly from those.

    See: Enhancement 7012 - Better round-robin scheduling of multiple VMs per node
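
The two selection modes above can be sketched as follows. This is a simplified illustration, not the scheduler's code: the function name and node fields are hypothetical, and only the memory check is modeled (the real matching also considers network support and other criteria).

```python
import random


def pick_vmm(nodes, needed_mb, greedy=False, rng=random):
    """Pick a VMM name by percentage of free RAM, or None if nothing fits.

    nodes: list of dicts with "name", "total_mb" and "free_mb" keys.
    greedy=False: prefer the highest percentage of free RAM (round-robin mode).
    greedy=True:  prefer the lowest percentage of free RAM (greedy mode).
    """
    candidates = [n for n in nodes if n["free_mb"] >= needed_mb]
    if not candidates:
        return None

    def pct_free(node):
        return node["free_mb"] / node["total_mb"]

    chooser = min if greedy else max
    best = chooser(pct_free(n) for n in candidates)
    # Ties are broken randomly among equally free (or unfree) nodes.
    tied = [n for n in candidates if pct_free(n) == best]
    return rng.choice(tied)["name"]


nodes = [
    {"name": "vmm1", "total_mb": 8192, "free_mb": 8192},
    {"name": "vmm2", "total_mb": 8192, "free_mb": 2048},
    {"name": "vmm3", "total_mb": 8192, "free_mb": 512},
]
spread = pick_vmm(nodes, needed_mb=1024)               # empty node first
packed = pick_vmm(nodes, needed_mb=1024, greedy=True)  # fullest node that fits
```

In this example vmm3 is never chosen for a 1024 MB request because it does not have enough free memory to qualify as a matching node.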

  • New options for pilot based scheduling

    The pilot module now sends more information to the local resource manager (such as Torque). This is for accounting and potentially authorization purposes: you can now more easily track remote users and more information about the launch via the resource manager itself.

    Extra launch information: the pilot module sends the memory request for the VMs, administrators can use this information for accounting purposes.

    Extra account information: the pilot module adds "-A /remote/user/dn" to the submission which allows administrators to use this information for accounting purposes. The "-A" flag is part of the PBS standard and it is up to the implementation/configuration to ignore the information or do something with it such as accounting.

    Contributed by Patrick Armstrong, University of Victoria

  • Integration with site DHCPd

    Instead of hosting a private DHCPd for each VMM, Nimbus now supports (by default) editing a lease file for a central DHCPd on the LAN. This makes it easier for many administrators to integrate with existing infrastructure (and in most cases speeds up the installation).

    You will configure the DHCP server to respond to specific MAC addresses with specific IP addresses. Each time you change your network pool in Nimbus, you must update the DHCPd at the same time with the new information.

    The old method of having a DHCPd run on each VMM is still supported as an advanced configuration.

    See: Enhancement 7066 - Allow VMs to use site DHCP server
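
Keeping the central DHCPd in step with the Nimbus network pool amounts to maintaining one MAC-to-IP mapping per address. The sketch below renders such mappings as host declarations, assuming an ISC dhcpd style configuration; that choice, the function name, and the sample hostnames and addresses are all assumptions for illustration, and the supported procedure is the one in the Zero To Cloud Guide.

```python
def dhcpd_host_entries(pool):
    """Render (hostname, mac, ip) tuples as ISC dhcpd host declarations."""
    blocks = []
    for hostname, mac, ip in pool:
        blocks.append(
            "host %s {\n"
            "  hardware ethernet %s;\n"
            "  fixed-address %s;\n"
            "}" % (hostname, mac, ip)
        )
    return "\n".join(blocks)


# Hypothetical network pool entries.
pool = [
    ("pub01", "A2:AA:BB:00:00:01", "10.0.0.11"),
    ("pub02", "A2:AA:BB:00:00:02", "10.0.0.12"),
]
text = dhcpd_host_entries(pool)
```

Regenerating the declarations from the same source as the Nimbus pool is one way to avoid the two drifting apart.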

  • The EC2 interfaces now return IP address as well as hostname. This was a feature added to the EC2 protocol API version 2009-07-15. This addresses Enhancement 7073.

  • For developers

    There is a new internal testing framework for the RM API and there is also now tight integration with IntelliJ IDEA if you have access to that.

    See: Enhancement 7045 - Make Nimbus services installation portable

    And: Enhancement 7046 - Add test suite infrastructure

    The testing framework also includes enhancements to use Spring @DirtiesContext contributed by Paulo Motta, Google Summer of Code.

  • Better event logging

    The service now logs more information to the logs and to the "accounting-events.txt" file.

    See: Enhancement 7043 - add more information to event log files (might break anything parsing them)

    Collaboration with Patrick Armstrong, University of Victoria

  • Bug Fixes

2.5 - Cumulus
  • We are happy to announce the first version of Cumulus!

    Cumulus is a storage cloud implementation compatible with the Amazon Web Services S3 REST API (with small exceptions); it works seamlessly with S3 clients such as s3cmd, jets3t, and boto. In addition, it offers extra functionality such as disk quota enforcement.

    Cumulus replaces the current GridFTP-based upload and download of VM images: it integrates tightly in a Nimbus installation as the VM image repository solution. It can also be installed on its own to manage a storage cloud: a Nimbus IaaS installation needs Cumulus but not vice versa.

  • Cumulus requirements

    Cumulus requires Python 2.5+ (but not 3.x) with SQLite support [1] and gcc [2] on the central service node.

    [1] - SQLite should be included by default in any Python 2.5 installation but we've run into some distributions removing that.

    [2] - gcc is not strictly required if you install packages to your Python site-packages: pyOpenSSL and Twisted 8.2+. Otherwise the installer will do some Python->C bridge code compilation to get dependencies enabled. The service installer does not run as root (and nothing on the service node needs root anymore), so in the case where you use gcc like this, those dependencies will be installed inside a virtual Python environment just for Cumulus code.

  • Support for quotas

    Cumulus allows administrators to set disk space limits on a per user basis. By default users are created with unlimited space. See the quota documentation for more information.

  • GridFTP is not used with this release

    Cumulus is used instead. The impact this will have on current cloud users is discussed in the email introducing Cumulus.

  • Cumulus will fall under the same best-effort support policy as other Nimbus components. As always, using the mailing lists is advised since you will get an opportunity for help from the community (this does happen and it is great to see).

  • Learn more

    To learn about Cumulus in depth, see the Cumulus FAQ entries and reference documentation.

2.5 - Context Broker

2.5 - Control Agents
  • DHCPd changes

    By default, workspace-control is now configured to use an off-node DHCPd server. That server should be configured according to instructions using the Zero To Cloud Guide.

    Enhancement 7066 - Allow VMs to use site DHCP server

    In order to enable the old method of having a DHCPd run on each VMM, you need to carry out the advanced "localdhcp" configurations described in workspace-control's "etc/workspace-control/networks.conf" file.

  • New alternative propagation methods

    The contributed HTTP and HDFS propagation drivers currently integrate without the propagation namespace awareness: normally in the cloud configuration, remote users will specify a "cumulus://" URL which is translated to an internal propagation location and mechanism.

    They are for use on sites where the remote user knows more about image locations than the Cumulus name for it.

    These alternative propagation methods need to be manually enabled, consult workspace-control's "etc/workspace-control/propagation.conf" file.

  • HTTP propagation driver

    The HTTP driver is good for having a central image repository on the internet for image files. Using this, you will not need to get changes to images back to each cloud's repository.

    Enable via workspace-control's "etc/workspace-control/propagation.conf" file. You can also specify trusted hosts to pull from; see the central Nimbus node's "services/etc/nimbus/workspace-service/global-policies.conf" file.

    Contributed by Patrick Armstrong, University of Victoria

  • HDFS propagation driver

    The HDFS propagation driver is good for sites experimenting with fast propagation techniques. It requires a local HDFS installation as well as an installation of the client tools on each VMM. See workspace-control's "etc/workspace-control/propagation.conf" file.

    Contributed by Matt Vliet, Google Summer of Code, University of Victoria

  • Better performing blankspace: physical partition leases

    In order to support very large temporary space partitions for VMs, and fast access to them, Nimbus now allows you to configure a list of physical partitions to lease out to incoming Nimbus VMs on each node.

    This is a non-default and preliminary feature.

    The partition will be presented to a configured "/dev" device inside every Nimbus VM that is started on the VMM node. If there is a conflict with a mountpoint that the user's request contains, the administrator has a choice of rejecting the request or not providing the blankspace.

    Enhancement 7065 - better performing blankspace: physical partition leases

  • For Developers

    • Propagate-only mode

      New propagation-only mode for workspace-control that helps with development and testing of fast/smart propagation techniques. Does not require libvirt or any sudo privileges to work with the central Nimbus service and fully exercise propagation.

      See the embedded notes in the new 'src/propagate-only.sh' script in workspace-control.

    • A workspace-control helper script bin/fakesudo makes it easier to work in situations where workspace-control is running entirely as root (not a normal situation).

  • Bug Fixes

2.5 - Cloud Client
  • Nimbus 2.5 requires the newest cloud client

    The current cloud client as of this release is cloud-client-016. The 2.5 service release will not work with previous cloud clients.

  • Backwards compatible

    Cloud client 16 introduces Cumulus support using the S3 library jets3t. In order to maintain backwards compatibility, it is still compatible with older clouds that use GridFTP.

    The new, default repository behavior is triggered when "vws.repository.type=cumulus" is present in the cloud.properties file. This is a value that any Nimbus 2.5+ cloud should distribute in their cloud.properties file (start with the Zero To Cloud Guide to learn how to give out the right cloud.properties file).

  • Added support for using unencrypted keys directly instead of needing to run proxy-init. See the README for details.

    The properties 'nimbus.cert' and 'nimbus.key' are consulted first, then a normal grid proxy search is made, then ~/.nimbus is consulted, then ~/.globus.

    This means one less user step in many of the common ways people use the cloud client. The new (optional) 'nimbus.cert' and 'nimbus.key' properties also make toggling between clouds easier.

  • Other enhancements can be viewed in the cloud client changelog.
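
The credential search order described above (properties, then grid proxy, then ~/.nimbus, then ~/.globus) can be sketched as follows. This is not the cloud client's code: the function name is hypothetical, the proxy path is passed in rather than discovered, and the file names usercert.pem and userkey.pem inside ~/.nimbus are an assumption borrowed from the usual ~/.globus layout.

```python
import os


def find_credential(props, home, proxy_path=None, exists=os.path.exists):
    """Return (source, paths) for the first credential found, or None."""
    # 1. Explicit properties from cloud.properties.
    cert, key = props.get("nimbus.cert"), props.get("nimbus.key")
    if cert and key and exists(cert) and exists(key):
        return ("properties", (cert, key))
    # 2. A grid proxy, if one is present.
    if proxy_path and exists(proxy_path):
        return ("proxy", (proxy_path,))
    # 3. and 4. Unencrypted key pairs under ~/.nimbus, then ~/.globus.
    for dirname in (".nimbus", ".globus"):
        cert = os.path.join(home, dirname, "usercert.pem")
        key = os.path.join(home, dirname, "userkey.pem")
        if exists(cert) and exists(key):
            return (dirname, (cert, key))
    return None


home = "/home/alice"
present = {
    os.path.join(home, ".globus", "usercert.pem"),
    os.path.join(home, ".globus", "userkey.pem"),
}
fallback = find_credential({}, home, exists=present.__contains__)
explicit = find_credential(
    {"nimbus.cert": "/clouds/a/cert.pem", "nimbus.key": "/clouds/a/key.pem"},
    home,
    exists=lambda path: True,
)
```

Since the properties are checked first, switching clouds can be as simple as pointing a different cloud.properties file at a different key pair.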

2.5 - Additional Notes


2.4 - Services

2.4 - Control Agents

2.4 - Monitoring
  • The Nimbus Monitoring & Discovery system has received a substantial overhaul since its first incarnation. The system still utilizes custom Nagios plugins to report worker node and head node resources. However, the Globus 4.x MDS utility is no longer used as the data registry. Instead, the system publishes XML to files to be served up by a web server. A new utility, Cloud Aggregator, has been developed separately to query these XML sources.

    See this extended summary as well as the full monitoring changelog.

2.4 - Cloud client
  • The current cloud client as of this release is cloud-client-014. This service release should also work with cloud clients 011 through 013.

    For cloud client changes, see here.




2.3 - Summary
  • Support for the EC2 Query API.

  • Introduction of administrative web portal interface. Supports securely distributing user credentials.

  • Refactored workspace-control and integrated with libvirt. Includes initial support for the KVM hypervisor.

  • Assorted bug fixes and minor enhancements.

2.3 - Services
  • Support for the EC2 Query API. Tested with the Python boto client but should work with others. The service does not run in the standard Globus container; it spawns a separate Jetty process. While installed by default, it requires configuration before it can be used.

  • EC2 SOAP API support has been upgraded to version 2009-08-15. This means ec2-api-tools clients must be upgraded to this version. Early work has been done to support multiple versions concurrently, but this functionality is not yet available.

  • The new Nimbus web portal is based on Django and is a standalone component with (in this version) no ties to the other Nimbus services. This component's sole current function is to facilitate securely providing users with their X509 and query credentials. It will be expanded in future releases to include more functionality for both users and administrators.

  • The Context Broker has been refactored and merged into the main Nimbus source tree. It is installed by default but is not enabled because it needs configuration.

  • The Nimbus Derby configuration now supports network access, though it is disabled by default. At install, passwords are generated and stored in var/derby.properties. The Nimbus service still uses the embedded interface. See Bug 6516 for details.

  • Assorted bugfixes:

2.3 - Control Agents
  • The workspace-control component has been significantly refactored. It has been moved from backend/ to control/ in the source tree. Direct command-line invocation of Xen operations has been replaced with calls to the excellent libvirt library. This opens the door to easier integration with several other hypervisors, starting with KVM.

  • Initial KVM support is provided.

  • Assorted bugfixes:

2.3 - Cloud client
  • The current cloud client as of this release is cloud-client-014. This service release should also work with cloud clients 011 through 013.

    For cloud client changes, see here.




2.2 - Summary
  • Introduction of the metadata server which mimics the EC2 HTTP query based metadata server.

  • Introduction of a standalone context broker, see the downloads page. This runs by itself so that you can use just the context broker to contextualize virtual clusters on EC2. No Nimbus cluster is necessary.

  • Bug fixes, see below for specifics.

2.2 - Services
  • Added a metadata server which responds to VMs' HTTP queries, using the same path names as the EC2 metadata server. The URL for this is obtained by looking at /var/nimbus-metadata-server-url on the VM, which is an optional VM customization that can be made. See "etc/nimbus/workspace-service/metadata.conf" for the details.

    It responds based on source IP address so there is an assumption that the immediately local network is non-spoofable.

    The metadata server is disabled by default.

  • Introduction of a standalone context broker, see the downloads page. This runs by itself so that you can use just the context broker to contextualize virtual clusters on EC2. No Nimbus cluster is necessary.

  • Added user-data support to EC2 remote interfaces.

  • Added user-data support to the WSRF operations, but namespaces did not change. This maintains client forward compatibility. If the user data element is missing, that is not an issue for the service.

  • Added getGlobalAll to the RM API, see enhancement request 6556

  • Added MetadataServer module and user-data to VM to the RM API.

  • Fixed these EC2 interface bugs: wrong instance ID is returned and describe instances fails with parameter.

  • Fixed misc bugs 6546 and 6545 (pilot plugin initialization failure).
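
  From inside a VM, the metadata server lookup described above can be sketched as follows. Only the /var/nimbus-metadata-server-url path comes from these notes; the helper names are hypothetical, and the example resource "latest/user-data" is the standard EC2 metadata path, assumed here to be served because the notes say EC2 path names are reused.

```python
from pathlib import Path
from urllib.parse import urljoin
from urllib.request import urlopen

# Location named in the 2.2 notes; present only if the VM was customized.
URL_FILE = "/var/nimbus-metadata-server-url"


def metadata_url(resource, url_file=URL_FILE):
    """Build the full URL for an EC2-style metadata resource."""
    base = Path(url_file).read_text().strip()
    if not base.endswith("/"):
        base += "/"
    return urljoin(base, resource.lstrip("/"))


def fetch_metadata(resource, url_file=URL_FILE, timeout=5):
    """Fetch one resource; the server answers based on the source IP."""
    with urlopen(metadata_url(resource, url_file), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

  For example, fetch_metadata("latest/user-data") would return the user data sent with the launch request, provided the metadata server has been enabled by the administrator.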

2.2 - Cloud client
  • Current cloud client as of this release is cloud-client-011. This supports contextualization using the new standalone context broker.

  • A lone invocation of "--status" (which prints all your currently running instances) will now print the associated cloud handle of each workspace.

  • Java 1.5 (Java 5) is now a requirement

  • The TP2.2 service side is backwards compatible with the "old style" contextualization but this cloud client only supports the new one. You can only use this against Nimbus TP2.1 installations if you are not using contextualization.

  • Support for contextualizing easily with EC2 resources. See the output of "--extrahelp" for the new "--ec2script" option. A sample EC2 cluster.xml file is at "samples/ec2basecluster.xml".

    This will take care of the context broker interactions for you and give you a suggested set of EC2 commands to run (including files for metadata) for the virtual cluster to contextualize while running on EC2.

  • Fixed a bug in the "lib/this-globus-environment.sh" script: the X509_CERT_DIR variable was being set incorrectly.

2.2 - Context agent
  • A new version of the context agent is necessary to contextualize a virtual cluster with Nimbus TP2.2's metadata server and the new context broker.




2.1 - Summary
  • Introduction of an auto-configuration program which guides you through many of the initial configuration steps and runs several validity tests.

  • Introduction of the Nimbus AutoContainer program which allows you to set up a Globus Java web services environment from scratch (including security) in less than a minute.

  • Introduction of the cloud-admin program which allows you to very easily manage new users in a cloud configuration.

  • No protocol changes to WSRF based messaging. Previous clients such as cloud-client-010 are compatible.

  • Protocol update to match the current Amazon EC2 deployment, see below for details.

  • New workspace-control configuration options to support more kinds of deployments, see below for details.

  • New service requirement: Java JDK5+ (aka Java 1.5+)

  • Updated documentation. Added an extensibility guide and upgrade guide.

  • Bug fixes, see below for specifics.

2.1 - Services
  • Introduction of an auto-configuration program which will guide you through many of the initial configuration steps and run several validity tests.

    See this section of the administrator quickstart for more information.

  • Introduction of the Nimbus AutoContainer program which will allow you to set up a Globus Java web services environment from scratch (including security) in less than a minute.

    It requires a separate download. See this section of the administrator quickstart for more information.

  • Introduction of the "cloud-admin" program which will allow you to very easily manage new users in a cloud configuration.

    It is installed at the same time as the auto-configuration program, as $GLOBUS_LOCATION/share/nimbus-autoconfig/cloud-admin.sh. See this section of the cloud guide for more information.

  • Protocol update to match the current Amazon EC2 deployment:

    Nimbus TP2.1 supports the 2008-05-05 WSDL (used by this EC2 client) as opposed to Nimbus TP2.0 which supported the 2008-02-01 WSDL (used by this EC2 client).

  • New service requirement: Java JDK5+ (aka Java 1.5+)

  • Resolved bug 6390: "notifications script is not sh compliant"

    The notification scripts now directly use the intended "bash" shell.

  • Resolved bug 6474: "destruction callbacks were not registered"

    An internal problem was fixed which made the logs wrong as well as causing problems for the client at destroy time. In particular, a VM would be destroyed but the remote client would not hear the last notification of the event causing it to hang.

  • Resolved bug 6397: "reservation ID mapping verification wrong for single-VM reservations"

    The EC2 reservation emulation is now working correctly with single VMs.

  • Resolved bug 6475: "repository + scp propagation"

    The EC2 messaging system now works with setups that use SCP propagation. There is a new relevant configuration in the elastic.conf file.

  • Resolved miscellaneous/cosmetic bugs 6393, 6394, 6396, 6398, and 6416.

2.1 - Reference clients
  • Cloud and reference clients did not change. Current cloud client as of this release is cloud-client-010.

  • You will need to update any EC2 client you use with Nimbus:

    Nimbus TP2.1 supports the 2008-05-05 WSDL (used by this EC2 client) as opposed to Nimbus TP2.0 which supported the 2008-02-01 WSDL (used by this EC2 client).

2.1 - Control agents
  • Added a new option to create VMs with "tap:aio" instead of using the "file" method (these are Xen terms for methods of mounting the disks). The "tap:aio" method is often used in Xen 3.2 setups and can now be used via workspace-control. See the new worksp.conf.sample.

  • Resolved enhancement request 6326: "use matching initrd with kernel"

    This allows you to configure workspace-control to take the kernel filename it is launching a VM with and search for a matching initrd based on suffix rules you set up. This allows you to easily use many of the Xen guest kernels that are created with popular Linux distributions.
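
  The kernel-to-initrd matching described above can be sketched like this. The real suffix rules live in the workspace-control configuration and are not reproduced here; the prefixes, the ".img" suffix, and the function name are hypothetical stand-ins chosen to resemble common Linux distribution naming.

```python
import os


def find_matching_initrd(kernel_path,
                         kernel_prefix="vmlinuz-",
                         initrd_prefix="initrd-",
                         initrd_suffixes=(".img", "")):
    """Return the path of an initrd matching the kernel, or None.

    The part of the kernel filename after kernel_prefix is treated as
    the version string, and each candidate initrd name is built from it.
    All default rule values are illustrative assumptions.
    """
    directory, name = os.path.split(kernel_path)
    if not name.startswith(kernel_prefix):
        return None
    version = name[len(kernel_prefix):]
    for suffix in initrd_suffixes:
        candidate = os.path.join(directory, initrd_prefix + version + suffix)
        if os.path.isfile(candidate):
            return candidate
    return None
```

  With these example rules, a kernel named vmlinuz-2.6.18-xen would be paired with initrd-2.6.18-xen.img in the same directory if that file exists.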




2.0 - Summary
  • Introduction of the FAQ which explains many things you may already know, but it also includes new descriptions of the component system now more clearly articulated in the Nimbus TP2.0 release.

  • Introduction of the Java RM API which is a bridge between protocols and resource management implementations. The resource managers can remain protocol/framework/security agnostic (they can be "pure Java") and various protocol implementations can be implemented independently (and even simultaneously). Runtime orchestration of implementation choices is directed by industry standard Spring dependency injection.

  • Introduction of an alternative remote protocol implementation based on Amazon EC2's WSDL interface description. It is only a partial implementation (see below). It can be used simultaneously alongside the WSRF based protocols.

  • More friendly configuration mechanism for administrators including area-specific ".conf" files instead of any XML and the addition of some helper scripts.

  • No protocol changes (only an additional remote protocol). Previous clients such as cloud-client-009 are compatible.

2.0 - Services
  • Introduction of the Java RM API which is a bridge between protocols and resource management work. The resource managers below can remain protocol/framework agnostic (they can be "pure Java") and various protocol implementations can be implemented independently. Runtime orchestration of implementation choices is directed by Spring dependency injection.

  • Introduction of an alternative remote protocol implementation based on Amazon EC2's WSDL interface description (namespace http://ec2.amazonaws.com/doc/2008-02-01/)

    It can be used simultaneously alongside the previous remote interfaces. If the EC2 protocol layer does not recognize instance identifiers being reported by the underlying resource manager (for example when gathering "describe-instances" results), it will create new, unique instance and reservation IDs on the fly for them.

    It is only a partial protocol implementation. The operations behind these EC2 commandline clients are currently provided:

    • ec2-describe-images - See what images in your personal cloud directory you can run.

    • ec2-run-instances - Run images that are in your personal cloud directory.

    • ec2-describe-instances - Report on currently running instances.

    • ec2-terminate-instances - Destroy currently running instances.

    • ec2-reboot-instances - Reboot currently running instances.

    • ec2-add-keypair [*] - Add personal SSH public key that can be installed for root SSH logins

    • ec2-delete-keypair - Delete keypair mapping.

    [*] - One of two add-keypair implementations can be chosen by the administrator.

    • One is the normal implementation where the server-side generates a private and public key (using jsch) and delivers the private key to you.

    • The other (configured by default) is a break from the regular semantics. It allows the keypair "name" you send in the request to be the name AND the public key value. This means there is never a private key server-side and also that you can use keys you already have on your system.

  • More friendly configuration mechanism for administrators including area-specific ".conf" files (instead of XML) and the addition of some helper scripts.

    If you are familiar with previous Nimbus versions (VWS), these ".conf" files hold anything found in the old "jndi-config.xml" file, which you don't need to look at anymore. The files hold name=value pairs with surrounding comments. They are organized by area: accounting.conf, global-policies.conf, logging.conf, pilot.conf, network.conf, ssh.conf, vmm.conf.

  • Service configurations are now in "etc/nimbus/workspace-service" and "etc/nimbus/elastic". Advanced configurations (which you should not normally need to alter) are now in "etc/nimbus/workspace-service/other" and "etc/nimbus/elastic/other".

  • New persistence management wrapper scripts are in "share/nimbus" and the persistence directory has moved to "var/nimbus"

  • Support for site-to-site file management (staging) was removed.

  • Developers: Significant directory reworkings (and subsequent build file changes) to organize modules more coherently, allowing for easier module independence.

    Build system now clearly separates anything to do with the target deployment (only one target deployment at the moment, GT4.0.x).

  • New Java dependencies:

    • Spring - just the core dependency injection library. The RM API depends on Spring import statements but no other module has any direct coupling to it.
    • cglib - used "invisibly" alongside Spring to provide some limited code generation when convenient.
    • ehcache - used for in-memory object caching.
    • jug - used for UUID generation instead of needing an axis dependency.
    • jsch - used for SSH keypair generation if necessary (see [*] in the EC2 section).
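
  The on-the-fly ID behaviour of the EC2 protocol layer described above (new, unique instance and reservation IDs for identifiers it does not recognize) can be illustrated with a small mapping table. This is a sketch in Python for brevity, not the service's Java implementation; the class name, the "i-"/"r-" ID shapes, and the use of random hex digits are assumptions.

```python
import uuid


class ElasticIdMapper:
    """Map resource-manager identifiers to EC2-style IDs, creating them lazily."""

    def __init__(self):
        self._instance_ids = {}     # manager VM id -> "i-xxxxxxxx"
        self._reservation_ids = {}  # manager group id -> "r-xxxxxxxx"

    @staticmethod
    def _new_id(prefix, taken):
        # Keep drawing until the ID is not already in use.
        while True:
            candidate = f"{prefix}-{uuid.uuid4().hex[:8]}"
            if candidate not in taken:
                return candidate

    def instance_id(self, manager_id):
        """Return the known ID, or mint one for an unrecognized VM."""
        if manager_id not in self._instance_ids:
            taken = set(self._instance_ids.values())
            self._instance_ids[manager_id] = self._new_id("i", taken)
        return self._instance_ids[manager_id]

    def reservation_id(self, group_id):
        """Return the known ID, or mint one for an unrecognized group."""
        if group_id not in self._reservation_ids:
            taken = set(self._reservation_ids.values())
            self._reservation_ids[group_id] = self._new_id("r", taken)
        return self._reservation_ids[group_id]
```

  Repeated "describe-instances" style lookups for the same underlying VM return the same ID, which is the property a client relies on.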
2.0 - Reference clients
  • The clients have stayed the same (on purpose, to limit the amount of change) except for some library package name changes.

  • When using a cloud running the EC2 front end implementation, you can download this EC2 client from Amazon or try a number of different clients that are out there.

2.0 - Control agents
  • Workspace-control has stayed the same (on purpose, to limit the amount of change).

2.0 - Workspace pilot system
  • No changes except that the server side configuration location has moved from the "jndi-config.xml" file to "pilot.conf"




1.3.3.1 - Summary
  • Introduction of support for contextualization with virtual clusters. See the clouds page and the new one-click clusters page to see the various new features in action.

  • New ensemble service report operation allows efficient queries about a large number of workspaces.

  • Support for storing images at the repository in gzip format and retrieving them from the repository in gzip format. This can save a lot of time in cluster situations.

  • Support for pegging the number of vcpus clients receive.

  • Various client enhancements including internal organization, cleaner output, and new commandline options. Embedded security tools (like grid-proxy-init) work more out of the box now.

  • No configuration migrations are necessary for moving to this version from TP1.3.2. Some configuration additions will be necessary if you'd like to take advantage of features.

  • There was a WSDL update: additions, changes and new namespaces. The base namespace for workspace schemas is now http://www.globus.org/2008/06/workspace/

  • Some bug fixes.

1.3.3.1 - Services
  • Integration with context broker.

  • New ensemble service report operation allows efficient queries about a large number of workspaces. Can retrieve status and error messages about entire ensemble at once.

  • Fixed scheduler backout to correctly handle situation where ensemble wasn't launched yet but ensemble-destroy was invoked.

  • Fixed bug where IP address updates were not passing through cache layer to DB correctly causing a possible inconsistency if container restarted in certain circumstances. NOTE: this bugfix was not present in TP1.3.3 but is present in TP1.3.3.1.

  • Various internal changes (see CVS log)

  • No configuration changes are necessary for moving to this version from TP1.3.2. But to enable the context broker, you need to configure paths to a credential for it in the jndi-config file and make sure the WSDD file lists the context broker as in the source file "deploy-server.wsdd" (which becomes server-config.wsdd)

1.3.3.1 - Reference clients
  • Added cloud-client cluster and contextualization support. Includes new "--cluster" flag (see cloud-client CHANGES.txt for full changes there).

    See the clouds page and the new clusters page.

  • The regular commandline client has new flags for ensemble and context broker support. See "-h" output.

1.3.3.1 - Control Agents
  • Support for gzip via filename-sense. See cloud notes on image compression/decompression. This can save a lot of time in cluster launch situations since the gzip/gunzip happens on the VMMs simultaneously, cutting transfer times (where there is contention) considerably.

  • Local-locked the control of dhcpd start and stop: now works for situations where multiple workspaces are deployed on a VMM simultaneously (such as one VM per core and launching as part of a cluster). The DHCP adjustment was being exercised simultaneously, revealing the race.

  • There is no need to change the workspace-control configuration file from a TP1.3.2 compatible one. There is a new configuration if you want to use it, though. The "[behavior] --> num_cpu_per_vm" configuration allows you to peg the number of vcpus that are assigned to every workspace.

    You can choose to not upgrade workspace-control at all if you don't want the features listed here.
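
  The "filename-sense" gzip handling mentioned above amounts to deciding by file extension whether an image needs decompressing after transfer. A minimal sketch, assuming a ".gz" suffix is the trigger and using a hypothetical helper name (this is not the workspace-control code):

```python
import gzip
import shutil


def unpack_if_gzipped(path):
    """Decompress path next to itself if it ends in '.gz'.

    Returns the path of the usable image: the decompressed file for
    gzipped input, or the original path otherwise.
    """
    if not path.endswith(".gz"):
        return path
    target = path[:-len(".gz")]
    # Stream the data so large disk images are never held in memory.
    with gzip.open(path, "rb") as src, open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

  Because each VMM runs this step on its own copy, decompression for a whole cluster proceeds in parallel while only the smaller compressed files cross the network.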




1.3.2 - Summary
  • Introduction of the cloud configuration and cloud client for user friendly client access to the workspace service.

  • Introduction of the "groupauthz" authorization plugin for typical configurations including the cloud setup.

  • Clients may now send customization tasks with a request: files on the image will be replaced with the content sent. The cloud client, for example, is set up by default to send a customization request that sets up the workspace's "/root/.ssh/authorized_keys" file.

  • Clients can request an alternate unpropagation target to save a template VM into a new personal copy. This new URL may be requested both at creation time and on the fly in an unpropagate request.

  • Centralization of MAC address allocations to the central workspace service. This allows all backend configuration files to be identical. Older/advanced configurations are still possible but not recommended unless necessary.

  • Hard disk images are now supported (client may bring a matching kernel along).

  • Various client enhancements including internal organization, cleaner output, and new commandline options.

  • A few bug fixes.

  • There was a WSDL update: additions, changes and new namespaces. The base namespace for workspace schemas is now http://www.globus.org/2008/03/workspace/

1.3.2 - Services
  • See the Cloud Guide for an overview of a new set of configurations/conventions that allow for clients to get up and running in minutes even from laptops on NATs. Currently this comes at the cost of obscuring some features like group deployments and multiple NICs.

  • Centralized MAC address allocations to the workspace service. This allows all backend configuration files to be identical. Older/advanced configurations are still possible but not recommended unless necessary.

    There is a new configuration in the jndi-config.xml file that allows the administrator to define the valid prefix for MAC address selection. See WorkspaceFactoryService -> NetworkAdapter -> macPrefix

    Once an IP is assigned a MAC address (during service initialization) it remains with that IP as long as it is configured as part of the network pools. This ensures that local network devices can cache MAC/IP bindings without needing to be manually cleared (no need for unsolicited ARP reply to guarantee connectivity).

  • Introduction of the "groupauthz" plugin. This comes directly with the workspace service (no separate plugin installation is necessary) but it is not enabled by default. This authorization plugin supports different policies for different group members which you organize by inserting identities into different group files.

    The plugin can enforce the following policies. The request data to check is determined on a per-request, per-client basis. The limits are defined on a per group basis (every caller identity must be a part of a group).

    • Maximum currently reserved minutes at one point in time. If the caller has two other workspaces with 10 hours scheduled for each, the value being checked against this policy would be 20 hours plus whatever time the current request is.
    • Maximum elapsed and currently reserved minutes at one point in time. If the caller has one other workspace with 10 hours scheduled and 80 hours of recorded past usage, the value being checked against this policy would be 90 hours plus whatever time the current request is. This is the all-time maximum usage cap.
    • Maximum number of running workspaces at one point in time.
    • Maximum number of workspaces per request (the largest group request possible).
    • The image node that must be specified.
    • The image node base directory that must be specified.
    • Support for identity-hash based image subdirectories (see the cloud setup documentation to understand this convention).

    Each policy can be set to disabled/infinite for specific groups if you desire.

  • Arbitrary file customization tasks may be sent with the workspace creation request. The image is mounted on the VMM and the contents of the task are placed into the specified file.

    This requires mount-alter.sh support on the backend which expects the mount -o loop construct to work without specific filesystem selection. i.e., this will not support workspaces with filesystems that the VMM kernels do not support.

    This requires three new jndi-config.xml configurations:

    • WorkspaceService -> home -> localTempDirectory
    • WorkspaceService -> home -> scpPath
    • WorkspaceService -> home -> backendTempDirectory
  • Inclusion of alternate unpropagation URL. This allows the client to specify the target URL for where the workspace is unpropagated. It can be specified as part of the creation request or overridden after deployment. If the default shutdown mechanism was to destroy the workspace, this can still be used (with shutdown-save) to cause unpropagation to the given URL.

  • Authorization enhancement to support late-specified alternate unpropagation URL. An operation to check the contents of a post-deployment alternate propagation URL request was added to the authorization callout interface.

    This can be used to filter out invalid requests. For example, the groupauthz plugin discussed above will use the same logic here for image repository policy checking that it does at create time. Previously, the authorization callout had only one operation which was called at creation time only.

  • Fault information can now be stored as part of the Corrupted state (for both RP queries and asynchronous state notifications). This will help the remote client debug issues that can arise after a successful factory creation, such as "the file you specified to propagate does not exist at the image repository" etc.

  • Various internal changes (see CVS log)

  • See the end of the administrator guide for notes on configuration migration to this version from older workspace releases.
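
  The two time-based groupauthz policies described above reduce to simple sums. The sketch below reproduces the arithmetic from the examples in the policy list; the function names are hypothetical, None is used here to stand for a disabled/infinite limit, and treating a request that lands exactly on the limit as allowed is an assumption.

```python
def _within(total_minutes, limit_minutes):
    # None models a policy that is disabled/infinite for the group.
    return limit_minutes is None or total_minutes <= limit_minutes


def within_reserved_limit(scheduled_minutes, requested_minutes, limit_minutes):
    """Check the 'maximum currently reserved minutes' policy."""
    total = sum(scheduled_minutes) + requested_minutes
    return _within(total, limit_minutes)


def within_all_time_limit(elapsed_minutes, scheduled_minutes,
                          requested_minutes, limit_minutes):
    """Check the 'maximum elapsed and currently reserved minutes' policy."""
    total = elapsed_minutes + sum(scheduled_minutes) + requested_minutes
    return _within(total, limit_minutes)
```

  With two workspaces scheduled for 10 hours each and a new one-hour request, the first check compares 600 + 600 + 60 = 1260 minutes against the group's limit. With 80 hours of past usage and one 10-hour workspace, the second check compares 4800 + 600 + 60 = 5460 minutes.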

1.3.2 - Reference clients
  • Introduction of cloud-client system. This consists of a wrapper program run from a specific directory setup that contains an embedded globus client installation among other things.

    For more information on the client and setting up a configuration to support it, see the Cloud Guide. To see some examples of end-user commands, see the clouds page.

  • The main client's help system was reorganized. For help on options that are specific to an action, use "--help --<name of action>". See the main "--help" output to get started.

  • The main client has a new "--exit-state" option that causes modes with subscriptions (in either poll or async mode) to wait for the specified state before exiting with success. If the workspace moves to a terminal state (Corrupted etc.) then this is considered an error. This is aimed at making scripts that wrap the client more effective.

  • The main client has a new "--save-target" option whose argument is an override to any previous unpropagation URL. You can use this before or after deployment has succeeded (although it could fail because of authorization issues). See the client's "-h --shutdown-save" output for more information.

  • Arbitrary customization tasks are possible by defining them in an optional parameters file. But the main client now also includes a shortcut for the very common task of inserting your SSH public key as the desired contents of the /root/.ssh/authorized_keys file on the VM. See the client's "-h --deploy" output for more information on this new "--sshfile" option.

  • Support for post-deployment error printing (faults can now be included as part of Corrupted notifications).

  • Status client allows for a bulk query ("in one remote operation, show me a short update of all workspaces I manage at this service").

  • Introduction of a base client API which abstracts operations out from the webservices implementation and provides common subscription tools, utility methods, etc. (the main workspace client was internally reorganized to use this API: if you are a client developer you could examine this code for a lot of concrete usage samples).
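
  For script authors, the "--exit-state" behaviour described above can be pictured as a loop over observed state changes. This is an illustration of the semantics only, not client code: the function name is hypothetical, and the set of terminal states contains just "Corrupted" because that is the only one these notes name.

```python
# Only "Corrupted" is named in the notes; other terminal states may exist.
TERMINAL_STATES = frozenset({"Corrupted"})


def wait_for_state(observed_states, target):
    """Consume state updates until the target or a terminal state appears.

    observed_states is any iterable of state names, for example values
    produced by polling or by asynchronous notifications.
    Returns True on success and False on error.
    """
    for state in observed_states:
        if state == target:
            return True
        if state in TERMINAL_STATES:
            return False
    # The stream ended without ever reaching the requested state.
    return False
```

  A wrapper script would map True to exit code 0 and False to a non-zero exit code.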

1.3.2 - Control Agents
  • (re-)inclusion of mount-alter for file customization tasks. Using this requires an additional sudo rule.

  • Fix for a bug where certain NIC bridging problems with a workspace that had more than one NIC would not trip a backout.

  • Fix for a bug where the lack of a gateway specification would cause a problem when inserting a workspace's DHCP policy. Lack of a default gateway is legal (and sometimes necessary).

  • When DHCP configuration file cannot be found, a more helpful error is printed.

  • Files on VMM were not being deleted in one unpropagate situation where they should have been.

  • The VM name prefix sent to the VMM has been shortened from "workspace" to "wrksp". String length limits for NIC names were being reached too early ("wrksp" should accommodate workspace IDs in the millions).

  • We are including a "foreign-subnet" script that allows VMMs to deliver IP information over DHCP to workspaces even if the VMM itself does not have a presence on the target IP's subnet. This is an advanced configuration: you should read through the script's leading comments and make sure to clear up any questions before using it.

    This is particularly useful for hosting workspaces with public IPs where the VMMs themselves do not have public IPs. This is because it does not require a unique interface alias for each VMM (public IPs are often scarce resources).

  • Added support for booting hard disk images (pygrub). Resolves enhancement request #5423. Client must specify mountpoint like "hda" instead of "hda1" for this to trigger.

  • See the end of the administrator guide for notes on configuration migration to this version from older workspace releases.
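
  The hard disk image trigger described above depends on whether the requested mountpoint names a whole disk ("hda") or a partition ("hda1"). One way to express that distinction is shown below; the actual test used by workspace-control is not documented here, so the rule (lowercase letters with no trailing partition number) and the function name are assumptions.

```python
import re


def is_whole_disk(mountpoint):
    """Return True for names like 'hda' and False for names like 'hda1'."""
    return re.fullmatch(r"[a-z]+", mountpoint) is not None
```

  Under this rule a request for "hda" would be booted as a hard disk image via pygrub, while "hda1" would continue to be treated as a partition image.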

1.3.2 - Workspace pilot program
  • In some situations the sleep() system call that the pilot makes during an unexpected backout situation was returning too early. This syscall has been replaced by an alternate implementation that will not fail in those situations.
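
  A common way to guard against a sleep that returns early is to loop against a deadline rather than trust a single call. The sketch below shows that general technique; it is not the pilot's actual replacement, and the function name and step size are assumptions.

```python
import time


def sleep_until_elapsed(seconds, step=1.0):
    """Block until at least `seconds` have passed on the monotonic clock.

    Each individual sleep may return early; the loop re-checks the
    deadline and sleeps again for whatever time remains.
    """
    deadline = time.monotonic() + seconds
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return
        time.sleep(min(step, remaining))
```

  Using the monotonic clock means adjustments to the system time cannot shorten or lengthen the wait.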




1.3.1 - Summary
  • Added support for workspace pilot resource management. The pilot is a program the service will submit to a local site resource manager in order to obtain time on the VMM nodes. When not allocated to the workspace service, these nodes will be used for jobs as normal (the jobs run in normal system accounts in Xen domain 0 with no guest VMs running). See below.

  • Added functionality to ensure multiple workspaces (including groups of workspaces) are co-scheduled. See below.

  • Various client enhancements including ensemble service support, cleaner output, and new commandline options.

  • Various bug fixes.

  • There was a WSDL update: additions, changes and new namespaces.

1.3.1 - Services
  • Added support for workspace pilot resource management. The pilot is a program the service will submit to a local site resource manager in order to obtain time on the VMM nodes. When not allocated to the workspace service, these nodes will be used for jobs as normal (the jobs run in normal system accounts in Xen domain 0 with no guest VMs running).

    Several extra safeguards have been added to make sure the node is returned from VM hosting mode at the proper time, including support for:

    • the workspace service being down or malfunctioning
    • LRM preemption (including deliberate LRM job cancellation)
    • node reboot/shutdown

    Also included is a one-command "kill 9" facility for administrators as a "worst case scenario" contingency.

    Using the pilot is optional. By default the service does not operate with it, the service instead directly manages the nodes it is configured to manage.

  • Added functionality to ensure multiple workspaces (including groups of workspaces) are co-scheduled. This includes the introduction of the Workspace Ensemble Service. This functionality allows complex virtual clusters to have all their component workspaces scheduled to run at once if that is necessary. This works with both the default and pilot-based resource managers.

  • All remote interfaces (WSDLs/schemas) have been updated with at least new namespaces. You can examine them directly online at the WSDL and XSD files page (or read the descriptions on the Interfaces section). The main difference is an extension to the factory create/deploy operation and the addition of the ensemble service.

  • SSH based workspace-control invocations may now be configured with an alternate private key.

  • SSH based workspace-control invocations now use options to ensure easier identification of misconfigurations (no password entry hang is possible now).

  • If using the pilot mechanisms, a new configuration section in the service configuration file needs to be uncommented for pilot specific configurations (see the configuration comments there).

  • If using the pilot mechanisms, a client may no longer submit a flag to the factory that requests the workspace be unpropagated after the running time has elapsed. Instead, unpropagation must be triggered manually by a client before this deadline is reached.

  • If using the pilot mechanisms, a shared secret must be configured in etc/workspace_service/pilot/users.properties for HTTP digest access authentication based notifications from the pilot. Use the included shared-secret-suggestion.py script. (Alternatively, SSH may be used for notifications, but it is slower.)

  • New dependencies (these are distributed with the service):

    • backport-util-concurrent
    • jetty - only necessary if using the pilot with the faster, default HTTP digest access authentication based notifications.

  • Some platform and JVM combinations have buffer size issues that caused some workspace-control invocations to fail. This problem has been addressed.

  • DHCP based network delivery to the VMs now requires unique hostnames for each allocatable address (even if they do not resolve to an IP). This addresses Bug #5738.
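
  The unique-hostname requirement for DHCP delivery noted above can be checked before deployment. The following is a minimal sketch only: the (hostname, IP) pair layout, the sample addresses, and the function name duplicate_hostnames are assumptions made for illustration and do not reflect the service's actual network pool file format or code.

```python
from collections import Counter


def duplicate_hostnames(entries):
    """Return hostnames used by more than one pool entry (case-insensitive)."""
    counts = Counter(hostname.lower() for hostname, _ip in entries)
    return sorted(name for name, seen in counts.items() if seen > 1)


# Hypothetical pool entries as (hostname, ip) pairs; not the real file format.
pool = [
    ("vm01.example.org", "192.168.0.2"),
    ("vm02.example.org", "192.168.0.3"),
    ("vm01.example.org", "192.168.0.4"),  # reuses a hostname
]

# Any hostname listed here would need to be made unique.
print(duplicate_hostnames(pool))
```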

1.3.1 - Reference clients
  • A new client workspace-ensemble allows you to destroy all workspaces in a running ensemble as well as trigger the workspaces in the ensemble to be co-scheduled and (afterwards) allowed to launch. This trigger is also available in the last workspace deployment of the ensemble, if desirable (this will save a web services operation).

  • Enhancement Bug #5795 is addressed, which allows an early unpropagate request to be sent. The new workspace action is "--shutdown-save" and requires a single or group workspace EPR.

  • The workspace program includes a new flag "--trash-at-shutdown" which allows callers to include a request that the service simply discards the VM after use (instead of unpropagating it). This is typical behavior for virtual cluster compute nodes, for example. The functionality itself is not new in this release, just this flag. It allows you to include the flag when using commandline based resource requests as well as override a given resource request file with a trash-at-shutdown flag.

  • The workspace program has improved output, especially in the cases where you are launching groups and ensembles.

1.3.1 - Control Agents
  • Note: a previously used TP1.2.3 or TP1.3 configuration file for workspace-control will still work because of the nature of these changes. See this migration section of the administrator's guide for details.

  • A bug with failed propagations has been addressed: Bug #5681.

  • Will now support older ISC DHCP versions (v2 servers). See Bug #5470.

  • The default paths for ebtables and the dhcpd.conf file are now the more common occurrences:

    • /sbin/ebtables is now /usr/sbin/ebtables
    • /etc/dhcp/dhcpd.conf is now /etc/dhcpd.conf

1.3.1 - Workspace pilot program
  • This is a new tarball on the download page and is only necessary when using pilot based resource management.




1.3 - Summary
  • The WSDLs were updated, with changes and new namespaces.

  • Functionality to start multiple workspaces in one request was added, including introduction of the Workspace Group Service.

  • Optional accounting functionality was added, including introduction of the Workspace Status Service.

  • Configuration enhancements to make service administration easier.

  • Various client enhancements including group and status service support, reorganized help output, and new commandline options.

  • Various bug fixes.

1.3 - Services
  • All remote interfaces, WSDLs/schemas, have been updated and also have new namespaces. You can examine them directly online at the WSDL and XSD files page (or read the descriptions on the Interfaces section).

  • The Workspace Factory Service was extended to support starting a homogeneous group of workspaces in one deployment request. A global maximum group size can be specified natively (without needing to use an authorization callout).

  • The Workspace Group Service was added to manage groups after deployment. See the group overview on the main interfaces page.

  • Hooks for accounting modules were added. These plugins allow you to track clients' used or reserved running time. There are separate reader and writer interfaces for flexibility. A default database backed implementation is provided and enabled by default. By default this implementation includes a periodic write to log files on the system (one for current reservations, another for major events). See Bug 5443.

  • The Workspace Status Service was added. It allows a Grid client to consult the usage statistics that the service has tracked about it. See Bug 5444.

  • Some configurations have been added, renamed, or moved in the JNDI configuration file. See this migration section of the administrator's guide for details.

  • Resource selection now favors VMMs not in use. The previous selection process accepted the first VMM with enough memory which could result in a situation where e.g. two workspaces are running on one VMM but no workspaces are running on another.

  • Resource pool configurations can now be adjusted without resetting the database. See this migration section of the administrator's guide for details.

  • Networking address pool configurations can now be adjusted without resetting the database. See this migration section of the administrator's guide for details.

  • Resolved Bug 5441: Add functionality for late network binding to client and service.

  • Resolved Bug 5442: Move persistence information to its own subdirectory. All information is now stored under $GLOBUS_LOCATION/var/workspace_service/ instead of various subdirectories of $GLOBUS_LOCATION/var itself.

  • Host certificate transfer functionality was removed. The association configuration and WSDL have changed accordingly.

  • Resolved Bug 5415: WorkspacePersistenceDB not updated after workspace --shutdown

  • Resolved Bug 5345: resource not destroyed correctly when time expires and shutdown method is "trash"

  • Asynchronous notifications from workspace-control (propagation events) are handled more reliably.

  • The toplevel build file includes many new convenience targets, including more control over what is deployed/undeployed and more control over the different kinds of persistence information.

  • The build files now do not proceed if your JDK is an earlier version than 1.4.
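
  The resource selection change noted above (favoring VMMs not in use over the first VMM with enough memory) can be illustrated with a small sketch. This is not the service's scheduler code: the dictionary fields (name, free_mb, running) and both function names are hypothetical, chosen only to contrast the two policies.

```python
def pick_first_fit(vmms, needed_mb):
    """Previous policy as described: take the first VMM with enough memory."""
    for vmm in vmms:
        if vmm["free_mb"] >= needed_mb:
            return vmm["name"]
    return None


def pick_prefer_idle(vmms, needed_mb):
    """Newer policy as described: favor a VMM that is not in use."""
    candidates = [v for v in vmms if v["free_mb"] >= needed_mb]
    if not candidates:
        return None
    idle = [v for v in candidates if v["running"] == 0]
    # Fall back to the first fitting VMM when every candidate is busy.
    return (idle or candidates)[0]["name"]


# Hypothetical pool: vmm-a already runs one workspace, vmm-b is idle.
vmms = [
    {"name": "vmm-a", "free_mb": 1024, "running": 1},
    {"name": "vmm-b", "free_mb": 2048, "running": 0},
]

print(pick_first_fit(vmms, 512))    # first fit stacks onto the busy node
print(pick_prefer_idle(vmms, 512))  # prefer-idle spreads to the unused node
```

  With first fit, a second workspace lands on vmm-a while vmm-b stays empty, which is the situation the changelog entry describes.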

1.3 - Reference clients
  • The help system was reorganized. Run the client with "-h" to see the definitive list and explanation of features old and new.

  • The client can subscribe and listen to many workspaces at a time after deploying a group. As this can be quite verbose for large groups, there are two new options to control subscription output verbosity. See the "-h" text.

  • There is a numnodes argument that will control how many workspaces will be requested during the create operation. If there is a NodeNumber element in a given deployment request file, this argument will override that. For more about group support, see the Interfaces section.

  • The client can now run management commands using both regular and group workspace EPRs (it detects which kind it is dealing with).

  • Resolved Bug 5441: Add functionality for late network binding to client and service. In the default case where subscriptions are desired, the client will notice if networking is missing and requery for it when the workspace(s) move to the Running state.

  • Resolved Bug 5445: various reference client improvements.

  • There is a new workspace-status client for querying accounting information. See Bug 5444.

  • The sample XML (metadata, resource request, etc) files included with the client have been updated and more samples have been added.

  • The client build now checks that the sample XML (metadata, resource request, etc) files validate against their respective schemas. If your ant installation does not include the xmlvalidate task, these checks are skipped.

1.3 - Control Agents
  • Note: a previously used TP1.2.3 configuration file for workspace-control will still work because of the nature of these changes. See this migration section of the administrator's guide for details.

  • Resolved Bug 5360: destroy log shows dhcp/ebtables backout problem

  • install.py handles user groups better and has an improved --onlyverify mode

  • Removed unnecessary configurations from sample worksp.conf file.

  • ebtables-config.sh rule backout handles an additional corner case

1.3 - Internal (developers only)
  • JNDI class discovery is done differently. This may affect you if you have alternate implementations of any module or plugin interface. A new workspace Initializable interface can be used. See the org.globus.workspace.Locator class.

  • Message intake and initial validation support is now implemented as a plugin. See the org.globus.workspace.service.binding.BindingAdapter interface.

  • The default scheduler's "node picking" support is now implemented as a plugin. See the org.globus.workspace.scheduler.defaults.SlotManagement interface.

  • AllocateAndConfigure (association) support is now implemented as a plugin. See the org.globus.workspace.network.AssociationAdapter interface.

  • New optional AccountingEventAdapter and AccountingReaderAdapter plugins were added. See the org.globus.workspace.accounting package.

  • The optional creation-time authorization callout interface was altered to include group requests as well as the caller's accrued used and reserved running minutes (if an accounting reader is running).




TP1.2.3

  • Significant documentation updates including the addition of a guided User Quickstart and the Workspace Marketplace.

  • Added the ability to specify multiple partitions for one VM. There is a restriction in this version that only one partition file may be used with the propagation mechanisms, the other partitions must be cached or on a shared filesystem. (Bug 5216)

  • Added the ability to create blank partitions on the fly if the client specifies to do so by sending a storage request (the MB of blank space needed) in the resource requirements.

    Currently the filesystem created on the blank partition is hardcoded (the default is ext2). In the future this may be specifiable by the client. (Bug 5215)

  • Added an HTTP transfer adapter for pre- and post-deployment staging. Included is the ability to provide checksums that will be checked after the transfer as well as decompression functionality. For more details, see the Optional parameters documentation. (Bug 5219)

  • Added the ability to choose hypervisors in the resource pool based on what networking associations they support. For example, a request may arrive for a workspace to have NICs on two separate networks: the pool node selection algorithm will use the requirement to support both of these networks in its search. (Bug 5214)

  • The workspace types schema, workspace_types.xsd, has a new namespace: the "2006/08" part of it is now "2007/03".

  • Resolved Bug 5211: networking allocations were not backed out (returned to pool) under all error conditions during initial request processing.

  • Resolved Bug 5212: queries on the Workspace Factory resource properties gave incorrect association information after a container restart.

  • Resolved Bug 5213: the Advisory IP acquisition method was being incorrectly validated.

  • Resolved Bug 5217: the workspace-control program was not backing out DHCP policy additions under all error conditions.
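
  The network-aware hypervisor selection noted above (Bug 5214) amounts to keeping only the pool nodes that support every network a request asks for. A minimal sketch under stated assumptions follows: the node names, the network names, and the function nodes_supporting are hypothetical and are not taken from the service's implementation.

```python
def nodes_supporting(nodes, requested_networks):
    """Return the pool nodes whose supported networks cover the whole request."""
    wanted = set(requested_networks)
    return [name for name, supported in nodes.items() if wanted <= set(supported)]


# Hypothetical pool: node name -> network associations it supports.
nodes = {
    "vmm-a": ["public"],
    "vmm-b": ["public", "private"],
    "vmm-c": ["private"],
}

# A workspace with NICs on two separate networks can only go to a node
# that supports both of them.
print(nodes_supporting(nodes, ["public", "private"]))
```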



TP1.2.2

  • Added support for DHCP delivery of networking information. See the administrator guide DHCP overview and configuration section which also includes a link to a design document.

  • Added unit tests under "workspace-service/service/java/test/".

  • Streamlined the logistics section of metadata, see the logistics section of the interfaces guide for more information.

  • Small bugfixes in StateTransition.

  • Internal refactoring to better accommodate unit tests.



TP1.2.1

  • Resolved Bug 4792 (propagation via globus-url-copy adds extra file URL scheme)

  • Resolved Bug 4793 (xenlocal arg parsing error)

  • Resolved Bug 4879 (issue with database jars that were already installed)

  • Resolved Bug 4880 (extra semicolons being sent in network information)

  • Fixed client build invocation (WS stubs weren't deployed by default)

  • Minor internal refactoring



TP1.2

  • Added support for a resource pool model that allows one grid service to manage a large group of VMMs, sending incoming workspace deployment requests to appropriate VMM nodes for instantiation.

  • To support the resource pool model, managed file propagation support was added to move files associated with workspaces to and from the resource pool nodes and storage nodes. The current choices are GridFTP and SCP.

  • An optional RFT staging plugin is available to allow a deployment request to include a stage in and/or stage out directive. This is to manage client file movement in the grid context as opposed to the managed, inter-site propagation functionality.

  • To support host based authorization (which includes a reverse IP check in its algorithm), IP pool entries may now optionally include matching certificate and key pairs that are moved on to the VM when it is allocated a particular networking address.

  • New functionality is supported: create-paused and reboot. There is also a choice of default shutdown method for when the maximum running time has been reached: normal or trashed.

  • Logging choices for both the grid service and workspace-control program have been significantly enhanced.

  • The VMM workspace-control program has a new installer that will install the executable and create all of its necessary work directories and will review all directory and file permissions for safety (and correct problems if instructed to).

  • The VMM workspace-control program now employs a sudo callout to do its privileged work.

  • The VMM workspace-control program has been enhanced to isolate user files from each other and is set up with a safe environment for image altering. A new /opt/workspace hierarchy is the default installation option but it allows for flexible choices.

  • The grid service portion has been significantly improved internally for asynchronous event handling, scalability and the ability to replace more of its subsystems with alternate or improved implementations.



TP1.1.1

  • Fix for service loading order problem on some JVMs (caused a database not found error). Bug 4602

  • Some invocations to the backend were missing the sudo prefix used for Xen3 support.

  • Fixed support for Xen3 networking (Bug 3994).

  • Better error reporting for sudo misconfigurations (Bug 4601).

  • Fix for backend interface problem when the Allocate networking method was used for multiple NICs.

  • Xen3 is now the default sample configuration for service and workspace_control.



TP1.1

  • Support for a new, "Allocate" networking method that allows the workspace service administrator to specify pools of IP addresses (and DNS information) which are then assigned to virtual machines on deployment.

  • The resource properties have been extended to publish deployment information about a workspace, such as its IP address.

  • Workspace metadata validation has been extended to support requirement checking for specific architecture, hypervisor version, and CPU. The workspace factory advertises the supported qualities as a resource property; the requirement section of workspace metadata is checked against the supported set.

  • Authorization handling has been significantly extended. The workspace service can now accept and process VOMS credentials and SAML attributes (GridShib). Further, an authorization callout has been added to the service for fine-grained policies. This callout can be configured to use either a simple attribute list lookup or a Python script allowing for arbitrary authorization logic.

  • Support for Xen3 has been added.

  • The workspace client has been extended to accommodate new functionality. In addition, the client interface has been extended to enable subscribing for notifications and specifying the resource allocation information on the command line.

  • Installation has been improved -- the client now requires only a minimal installation (as opposed to the full service installation).
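
  The "Allocate" networking method noted above assigns addresses from an administrator-defined pool at deployment time. The following sketch shows the general idea only: the AddressPool class, its method names, and the sample addresses are hypothetical and do not describe the service's actual data structures.

```python
class AddressPool:
    """Toy pool that hands out addresses on deployment and takes them back."""

    def __init__(self, addresses):
        self._free = list(addresses)
        self._in_use = {}

    def allocate(self, vm_id):
        """Assign the next free address to a VM."""
        if not self._free:
            raise RuntimeError("no addresses left in pool")
        address = self._free.pop(0)
        self._in_use[vm_id] = address
        return address

    def release(self, vm_id):
        """Return a VM's address to the pool (e.g. after destruction)."""
        self._free.append(self._in_use.pop(vm_id))


# Hypothetical pool of two addresses configured by the administrator.
pool = AddressPool(["10.0.0.10", "10.0.0.11"])
first = pool.allocate("vm-1")
second = pool.allocate("vm-2")
pool.release("vm-1")           # the address becomes available again
third = pool.allocate("vm-3")  # reuses the released address
print(first, second, third)
```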