Warning: This document describes an old release. Check here for the current version.

Nimbus 2.9 Troubleshooting

Any questions can be posted to the workspace-user mailing list and will likely be answered promptly by a member of the community. For instructions on how to subscribe and post messages to this list, see the contact page.

Sections:


Known bugs


Client related

  • Problem: Trying to use an EC2 client but getting an error like:

    SecurityException: [SEC]Operation name could not be determined

    Solution: Either the elastic service is not installed at the target URL or you are using the wrong client tools. Amazon updates EC2 without warning, so their default tools are sometimes out of sync with the tools needed to work with a particular Nimbus elastic service.

    The current release is known to work with this specific version of the EC2 client tools (compatible with EC2 WSDL 2009-08-15).



  • Problem: Trying to use an EC2 client but getting an error like:

    CertificateParsingException: invalid DER-encoded certificate data

    Solution: These clients typically only handle un-annotated public certificate text blobs. Make a new file with just these lines from the bottom of the problem certificate pem file:

    -----BEGIN CERTIFICATE-----
    MIICKjCCAZOgAwIBA [...]
    -----END CERTIFICATE-----

    And make sure to point the tools at the new file:

    nimbus $ export EC2_CERT=/path/to/new/file/usercert-justcert.pem


  • Problem: Trying to use an EC2 client but getting an error like:

    IOException: DER length more than 4 bytes

    Solution: These clients typically only handle unencrypted private keys, and you are perhaps using an encrypted one. Make an unencrypted version like so:

    nimbus $ openssl rsa -in userkey.pem -out bare-userkey.pem
    nimbus $ chmod 400 bare-userkey.pem
    nimbus $ export EC2_PRIVATE_KEY=/path/to/new/file/bare-userkey.pem


  • Problem: Trying to use an EC2 client but getting an error like:

    "unable to find valid certification path to requested target"

    Solution: This means the EC2 client does not trust the https endpoint. This is likely because the endpoint is not using an SSL certificate signed by the standard Java JRE's trusted CAs.

    An option here is to use this program (source included) to add the certificate. It will contact the endpoint, retrieve the advertised certificate, and then give you an option to create a new "keystore" with the old trusted certificates and the new one you just added.

    To enable the newly created keystore, you need to copy it to the JRE's security directory, for example:

    root # mv /opt/sun-jdk-1.6.0.07/jre/lib/security/cacerts /opt/sun-jdk-1.6.0.07/jre/lib/security/cacerts.backup
    root # cp jssecacerts /opt/sun-jdk-1.6.0.07/jre/lib/security/cacerts

    As noted here, using http and not https to host the service is not a good idea.



  • Problem: Trying to use an EC2 client for the first time, as the site administrator testing the installation, but getting an error like:

    General: An error was discovered processing the <wsse:Security> header.
         (WSSecurityEngine: Invalid timestamp {0}); nested exception is:
            java.text.ParseException: Unparseable date: "2008-08-07T16:23:22.885Z"

    Solution: Make sure you have altered the container configuration correctly to handle EC2 clients, as explained here. A container restart is required after the configuration change.

    There is a sample container server-config.wsdd configuration to compare against here.



  • Problem: Trying to use an EC2 client for the first time, as the site administrator testing the installation, but getting an error like:

    MustUnderstand: Header
                {Security}http://docs.oasis-open.org/wss/2004/01/oasis-200401-wss-wssecurity-secext-1.0.xsd
                was not undertsood by the service.

    Solution: Make sure you have altered the container configuration correctly to handle EC2 clients, as explained here. A container restart is required after the configuration change.

    In particular, you may have disabled "X509EncryptHandler" when it is "X509SignHandler" that should be disabled.

    There is a sample container server-config.wsdd configuration to compare against here.
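
The two credential fixes above (the "invalid DER-encoded certificate data" and "DER length more than 4 bytes" entries) can be checked with a couple of small shell helpers. This is only a sketch: the function names extract_cert and key_is_encrypted are made up for illustration and are not part of the EC2 tools or Nimbus, and the encryption check assumes a PEM key whose text contains the word "ENCRYPTED" when it is passphrase-protected.

```shell
# Sketch only: both function names are hypothetical helpers.

# Print only the PEM certificate block(s) from a possibly annotated file,
# dropping any subject/issuer text that precedes the BEGIN line.
extract_cert() {
    sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p' "$1"
}

# Succeed if the PEM private key looks passphrase-protected.
# Assumption: traditional encrypted keys carry a "Proc-Type: 4,ENCRYPTED"
# header and PKCS#8 ones use "BEGIN ENCRYPTED PRIVATE KEY", so matching
# the word ENCRYPTED covers both.
key_is_encrypted() {
    grep -q 'ENCRYPTED' "$1"
}
```

With these defined in the current shell, usage might look like:

    nimbus $ extract_cert usercert.pem > usercert-justcert.pem
    nimbus $ key_is_encrypted userkey.pem && echo "encrypted key: run openssl rsa first"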


Service related

  • Problem: The VMs do not obtain addresses via DHCP.

    Solution: Make sure the configuration of dom0's interface name(s) is valid. This is the "dhcpvif" part of the association configuration in the worksp.conf file. See the backend networking configuration section for more details and the right setting to use.



  • Problem: Sometimes, from the start of the workspace's deployment, one of the VM's NICs is unreachable. Specifically, ARP does not resolve the IP address to a MAC address.

    Solution: Make sure the MAC address prefix is valid. See the backend networking configuration section for more details and the right setting to use.

    The service now checks the validity of the MAC prefix, so you should never see this problem.



  • Problem: The container doesn't start anymore and you are getting a long JNDI related exception. You see "InvocationTargetException" and "NameAlreadyBoundException" and probably "Name home is already bound in this Context".

    Solution: This will happen if you make a backup of the "etc/nimbus" directory inside the etc directory, for example by running "cp -a nimbus nimbus_backups". The container treats both as service directories and tries to consume both sets of JNDI files. The configurations are therefore consumed more than once, which is an error because only one of each can be "bound in this context" at a time. Move the backup copy out of the etc directory so that only the original remains.



  • Problem: The workspace-control "sbin/test-dependencies.sh" fails to work on Debian.

    Debian has an old libvirt version, so installing a newer one is often a matter of building from source with "configure --with-python; make; make install".

    Even then, the "sbin/test-dependencies.sh" script may still fail because the "import libvirt" statement does not work.

    Solution: Add the location of libvirt.so.0 to the LD_LIBRARY_PATH environment variable.



  • Problem: The Cumulus build fails with an error about finding Python.h

    You went the GCC route (see the dependencies page) and are seeing an error like:

    src/crypto/crypto.c:12:20: error: Python.h

    Solution: Install the Python development package (python-dev in Debian).



  • Problem: The Cumulus build fails with an error about compiling against crypto headers

    You went the GCC route (see the dependencies page) and are seeing an error like:

    src/crypto/x509name.h:19: error: expected ')' before '*' token

    Solution: Install the OpenSSL development package (libssl-dev in Debian).



  • Problem: The service starts with an error about "no such parent file found Repo 1"

    You installed Nimbus (or have been playing with the nimbus-reset-state program) and see the following error:

    Error creating bean with name 'nimbus-elastic.image.repository' defined 
    in file [/tmp/x/services/etc/nimbus/elastic/other/main.xml]: Invocation 
    of init method failed; nested exception is 
    org.nimbus.authz.AuthzDBException: no such parent file found Repo 1

    This typically means that the installer did not run cumulus-create-repo-admin to configure the Cumulus/IaaS connection, or that you have recently reset Cumulus state (e.g. using nimbus-reset-state).

    Solution: Run the following:

    $NIMBUS_HOME/ve/bin/cumulus-create-repo-admin nimbusadmin@localhost Repo
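
Two of the service-side entries above (the JNDI "NameAlreadyBoundException" entry and the Debian libvirt entry) describe checks that are easy to script. The helpers below are a sketch under stated assumptions, not part of Nimbus: the function names are hypothetical, the JNDI file name jndi-config.xml is assumed (adjust it if your container uses another name), and the libvirt search root is whatever prefix you installed into.

```shell
# Sketch only: both function names are hypothetical helpers.

# List every directory under a container "etc" tree that holds a JNDI
# configuration file. A stray backup copy shows up as an extra entry.
# Assumption: the JNDI files are named jndi-config.xml.
list_jndi_dirs() {
    find "$1" -type f -name 'jndi-config.xml' -exec dirname {} \; | sort
}

# Prepend the directory that contains libvirt.so.0 to LD_LIBRARY_PATH.
# $1 is the directory tree to search; returns 1 if the library is not found.
add_libvirt_to_path() {
    lib=$(find "$1" -name 'libvirt.so.0' 2>/dev/null | head -n 1)
    [ -n "$lib" ] || return 1
    dir=$(dirname "$lib")
    # Only add a ":" separator when LD_LIBRARY_PATH already has a value.
    LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export LD_LIBRARY_PATH
}
```

For example, with /path/to/container/etc and /usr/local standing in for your own locations:

    nimbus $ list_jndi_dirs /path/to/container/etc
    nimbus $ add_libvirt_to_path /usr/local && python -c 'import libvirt'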

KVM VMM Setup

  • Problem: When testing the sample KVM image, virsh create throws the error:

    # virsh -c 'qemu:///system' create /tmp/z2c.xml
    error: Failed to create domain from /tmp/z2c.xml
    error: internal error Only 1 ide controller is supported

    Solution: Make sure that you specified the mountpoint hda and not sda1 or hda1.
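
To see which device names actually reached libvirt, the domain XML passed to virsh can be inspected. The helper below is a sketch, assuming that the mountpoint ends up in a `<target dev=...>` attribute of the domain XML and that each such element sits on its own line; list_target_devs is a hypothetical name, not a Nimbus or libvirt command.

```shell
# Sketch only: list_target_devs is a hypothetical helper.
# Print the dev attribute of every <target .../> element in a libvirt
# domain XML file, one per line. Network interfaces may also carry a
# <target dev=...> element, so expect those names in the output too.
list_target_devs() {
    sed -n "s/.*<target dev=['\"]\([^'\"]*\)['\"].*/\1/p" "$1"
}
```

Any disk entry other than hda in the output points at the mountpoint setting:

    # list_target_devs /tmp/z2c.xml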