Setting Up an External OpenStack Testing System – Part 2

In this third article in the series, we discuss adding one or more Jenkins slave nodes to the external OpenStack testing platform that you (hopefully) set up in the second article in the series. The Jenkins slave nodes we create today will run Devstack and execute a set of Tempest integration tests against that Devstack environment.

Add a Credentials Record on the Jenkins Master

Before we can add a new slave node record on the Jenkins master, we need to create a set of credentials for the master to use when communicating with the slave nodes. Head over to the Jenkins web UI, which by default will be located at http://$MASTER_IP:8080/. Once there, follow these steps:

  1. Click the Credentials link on the left side panel
  2. Click the link for the Global domain:
    [screenshot: credentials-list]
  3. Click the Add credentials link
  4. Select SSH username with private key from the dropdown labeled “Kind”
  5. Enter “jenkins” in the Username textbox
  6. Select the “From a file on Jenkins master” radio button and enter /var/lib/jenkins/.ssh/id_rsa in the File textbox:
    [screenshot: add-credentials]
  7. Click the OK button
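Before moving on, you may want to double-check that the private key referenced in step 6 actually exists on the master; it should have been created when you set up the master in the previous article:

sudo ls -l /var/lib/jenkins/.ssh/id_rsa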

Construct a Jenkins Slave Node

We will now install Puppet and the software necessary for running Devstack and Jenkins slave agents on a node.

Slave Node Requirements

On the host or virtual machine that you have selected for your Jenkins slave node, ensure that, like the Jenkins master node, it has the following:

  • These basic packages installed (see the example commands after this list):
    • wget
    • openssl
    • ssl-cert
    • ca-certificates
  • The SSH keys you use with GitHub in ~/.ssh/. It also helps to bring over your ~/.ssh/known_hosts and ~/.ssh/config files.
  • At least 40G of available disk space
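On Ubuntu, for example, preparing the node might look like the following (a minimal sketch; the package list is the one above, and df is just one way to confirm the free space):

sudo apt-get update
sudo apt-get install -y wget openssl ssl-cert ca-certificates
df -h /    # confirm at least 40G of available disk space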

IMPORTANT NOTE: If you were considering using LXC containers for your Jenkins slave nodes (as I originally struggled to do), don’t. Use a KVM or other non-shared-kernel virtual machine for the Devstack-running Jenkins slaves. Bugs like the inability to run open-iscsi in an LXC container make it impossible to run Devstack inside an LXC container.

Download Your Config Data Repository

In the second article in this series, we went over the need for a data repository and, if you followed along in that article, you created a Git repository and stored an SSH key pair in that repository for Jenkins to use. Let’s get that data repository onto the slave node:

git clone $YOUR_DATA_REPO data
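If you followed along in that article, the repository should already contain the configuration trees referenced later in this article; a quick listing confirms it:

ls data/etc
# should include the jenkins_jobs and zuul configuration directories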

Install the Jenkins Slave Software and Pre-cache OpenStack/Devstack Git Repos

And now, we install Puppet and have Puppet set up the slave software:

wget https://raw.github.com/jaypipes/os-ext-testing/master/puppet/install_slave.sh
bash install_slave.sh

Puppet will run for some time, installing the Jenkins slave agent software and necessary dependencies for running Devstack. Then you will see output like this:

Running: ['git', 'clone', 'https://git.openstack.org/openstack-dev/cookiecutter', 'openstack-dev/cookiecutter']
Cloning into 'openstack-dev/cookiecutter'...
...

This output indicates that Puppet has finished and that a set of Nodepool scripts is running to cache upstream OpenStack Git repositories on the node and prepare Devstack. Part of preparing Devstack involves downloading images that Devstack uses for testing. Note that this step takes a long time! Go have a beer or other beverage and work on something else for a couple of hours.
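If you would rather keep an eye on the caching progress than simply wait, one option is to watch the Git cache directory grow (assuming the Nodepool scripts cache repositories under /opt/git, as they did in my setup):

watch -n 60 du -sh /opt/git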

Adding a Slave Node on the Jenkins Master

In order to “register” our slave node with the Jenkins master, we need to create a new node record on the master. First, go to the Jenkins web UI, and then follow these steps:

  1. Click the Manage Jenkins link on the left
  2. Scroll down and click the Manage Nodes link
  3. Click the New Node link on the left:
    [screenshot: manage-nodes]
  4. Enter “devstack_slave1” in the Node name textbox
  5. Select the Dumb Slave radio button:
    [screenshot: add-node]
  6. Click the OK button
  7. Enter 2 in the Executors textbox
  8. Enter “/home/jenkins/workspaces” in the Remote FS root textbox
  9. Enter “devstack_slave” in the Labels textbox
  10. Enter the IP Address of your slave host or VM in the Host textbox
  11. Select jenkins from the Credentials dropdown:
    [screenshot: new-node-screen]
  12. Click the Save button
  13. Click the Log link on the left. The log should show the master connecting to the slave, and at the end of the log you should see: “Slave successfully connected and online”:
    [screenshot: slave-log]
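If the log shows a connection failure instead, try making the same SSH connection the master makes by hand, from a shell on the master (using the jenkins user and key from the credentials record we created earlier; $SLAVE_IP stands in for the address you entered in the Host textbox):

sudo -u jenkins ssh jenkins@$SLAVE_IP hostname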

Test the dsvm-tempest-full Jenkins job

Now we are ready to have our Jenkins slave execute the long-running Jenkins job that uses Devstack to install an OpenStack environment on the Jenkins slave node, and run a set of Tempest tests against that environment. We want to test that the master can successfully run this long-running job before we set the job to be triggered by the upstream Gerrit event stream.

Go to the Jenkins web UI, click on the dsvm-tempest-full link in the jobs listing, and then click the Build Now link. You will notice an executor start up and a link to a newly-running job will appear in the Build History box on the left:

[screenshot: Build History panel in Jenkins]

Click on the link to the new job, then click Console Output in the left panel. You should see the job executing, with Bash output showing up on the right:

[screenshot: Manually running the dsvm-tempest-full Jenkins job]

Troubleshooting

If you see errors pop up, you will need to address those issues. In my testing, issues generally fell into the following areas:

  • Firewall/networking issues: Make sure that the Jenkins master node can properly communicate over SSH port 22 to the slave nodes. If you are using virtual machines to run the master or slave nodes, make sure you don’t have any iptables rules that are preventing traffic from master to slave (a quick connectivity check is shown at the end of this section).
  • Missing files like “No file found: /opt/nodepool-scripts/…”: Make sure that the install_slave.sh Bash script completed successfully. This script takes a long time to execute, as it pulls down a bunch of images for Devstack caching.
  • LXC: See above about why you cannot currently use LXC containers for Jenkins slaves that run Devstack
  • Zuul processes borked: In order to have jobs triggered from upstream, both the zuul-server and zuul-merger processes need to be running, connected to Gearman, and firing job events properly. First, make sure the right processes are running:
    # First, make sure there are **2** zuul-server processes and
    # **1** zuul-merger process when you run this:
    ps aux | grep zuul
    # If there aren't, do this:
    sudo rm -rf /var/run/zuul/*
    sudo service zuul start
    sudo service zuul-merger start
    

    Next, make sure that the Gearman service has registered queues for all the Jenkins jobs. You can do this using telnet (4730 is the default port for Gearman):

    ubuntu@master:~$ telnet 127.0.0.1 4730
    Trying 127.0.0.1...
    Connected to 127.0.0.1.
    Escape character is '^]'.
    status
    build:noop-check-communication:master	0	0	2
    build:dsvm-tempest-full	0	0	1
    build:dsvm-tempest-full:devstack_slave	0	0	1
    merger:merge	0	0	1
    zuul:enqueue	0	0	1
    merger:update	0	0	1
    zuul:promote	0	0	1
    set_description:master	0	0	1
    build:noop-check-communication	0	0	2
    stop:master	0	0	1
    .
    ^]
    
    telnet> quit 
    Connection closed.
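
    Finally, for the firewall/networking issue described above, a minimal connectivity check from the master looks like this ($SLAVE_IP stands in for your slave’s address; 4730 is the Gearman port, as noted above):

    nc -zv $SLAVE_IP 22
    nc -zv 127.0.0.1 4730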
    

Enabling the dsvm-tempest-full Job in the Zuul Pipelines

Once you’ve successfully run the dsvm-tempest-full job manually, you should enable this job in the appropriate Zuul pipelines. To do so, on the Jenkins master node, edit the etc/zuul/layout.yaml file in your data repository (don’t forget to git commit your changes after you’ve made them, and push them to your data repository’s canonical location).

If you used the example layout.yaml from my data repository and you’ve been following along this tutorial series, the projects section of your file will look like this:

projects:
  - name: openstack-dev/sandbox
    check:
      # Remove this after successfully verifying communication with upstream
      # and seeing a posted successful review.
      - noop-check-communication
      # Uncomment this job when you have a jenkins slave running and want to
      # test a full Tempest run within devstack.
      #- dsvm-tempest-full
    gate:
      # Remove this after successfully verifying communication with upstream
      # and seeing a posted successful review.
      - noop-check-communication
      # Uncomment this job when you have a jenkins slave running and want to
      # test a full Tempest run within devstack.
      #- dsvm-tempest-full

To enable the dsvm-tempest-full Jenkins job to run in the check pipeline when a new patchset (or a recheck comment) arrives for the openstack-dev/sandbox project, simply uncomment the line:

      #- dsvm-tempest-full

And then reload Zuul and Zuul-merger:

sudo service zuul reload
sudo service zuul-merger reload
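If you want to confirm that the reload picked up your layout change, the Zuul debug log is a good place to look (the path below is the default used by the upstream puppet modules; yours may differ):

sudo tail -n 50 /var/log/zuul/zuul.log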

From now on, new patches and recheck comments on the openstack-dev/sandbox project will fire the dsvm-tempest-full Jenkins job on your devstack slave node. :) If your test run was successful, you will see something like this in your Jenkins console for the job run:

[screenshot: \o/ Steve Holt!]

And you will note that the patch that triggered your Jenkins job shows a successful comment and a +1 Verified vote:

[screenshot: A comment showing external job successful runs]

What Next?

From here, the changes you make to your Jenkins Job configuration files are up to you. The first place to look for ideas is the devstack-vm-gate.sh script. Look near the bottom of that script for a number of environment variables that you can set in order to tinker with what the script will execute.
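For example, the sample job exports a handful of these variables and then copies and runs the devstack-gate wrapper script. A trimmed sketch of that pattern (the values shown are the sample job’s; adjust to taste):

export ZUUL_PROJECT=openstack-dev/sandbox
export ZUUL_BRANCH=master
export DEVSTACK_GATE_TIMEOUT=180
export DEVSTACK_GATE_TEMPEST=1
cp devstack-gate/devstack-vm-gate-wrap.sh ./safe-devstack-vm-gate-wrap.sh
./safe-devstack-vm-gate-wrap.sh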

If you are a Cinder storage vendor looking to test your hardware and associated Cinder driver against OpenStack, you will want to either modify the example dsvm-tempest-full job or create a copy of that example job definition and customize it to your needs. You will want to make sure that Cinder is configured to use your storage driver in the cinder.conf file. You may want to create a script that copies most of what the devstack-vm-gate.sh script does, calls the devstack iniset function to configure your storage driver, and then runs devstack and Tempest.
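A minimal sketch of that idea (the driver class, config section, and option names below are purely illustrative, not those of any real driver):

# after devstack has generated cinder.conf, point it at your driver
source /opt/stack/new/devstack/functions    # provides iniset
CINDER_CONF=/etc/cinder/cinder.conf
iniset $CINDER_CONF DEFAULT volume_driver cinder.volume.drivers.acme.AcmeISCSIDriver
iniset $CINDER_CONF DEFAULT san_ip 10.0.0.50
# then restart the cinder-volume service so the new settings take effect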

Publishing Console and Devstack Logs

Finally, you will want to publish the log files collected by both Jenkins and the Devstack run to some external site. Folks at Arista have used dropbox.com to do this. I’ll leave setting this up as an exercise for the reader. Hint: you will want to set the PUBLISH_HOST variable in your data repository’s vars.sh to a host that you have SCP rights to, and uncomment the publishers section in the example dsvm-tempest-full job:

#    publishers:
#      - devstack-logs  # In macros.yaml from os-ext-testing
#      - console-log  # In macros.yaml from os-ext-testing
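For reference, the PUBLISH_HOST setting amounts to a single line in your data repository’s vars.sh, something like this (the hostname is a placeholder for a host you can SCP to):

PUBLISH_HOST=logs.example.com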

Final Thoughts

I hope this three-part article series has been helpful in understanding the upstream OpenStack continuous integration platform, and instructional in helping you set up your own external testing platform using Jenkins, Zuul, Jenkins Job Builder, and Devstack-Gate. Please do let me know if you run into issues. I will post updates to the Troubleshooting section above as I hear from you, and (hopefully) help you resolve any problems.

  • Trinath Somanchi

    Hi-

    When I run the full tempest test on sandbox, as specified, I get the below error

    /opt/stack/new/devstack-gate/devstack-vm-gate.sh: line 206: [: : integer expression expected
    18:51:33 Running devstack
    18:51:33 stdbuf: failed to run command `./stack.sh': No such file or directory

    18:51:37 Build step 'Execute shell' marked build as failure
    18:51:37 Finished: FAILURE

    I found that there is no devstack installed in /opt/stack/new/devstack/*

    Do I need to manually install devstack there ??

    Did I miss some thing while configuration of the slave node ??

    • http://joinfu.com/ Jay Pipes

      Hi Trinath!

      Darragh O’Reilly caught this issue and pushed up a change that you should apply to your devstack JJB job:

      https://github.com/jaypipes/os-ext-testing-data/blob/master/etc/jenkins_jobs/config/examples.yaml#L25-L30

      Please let me know if that fixes things for you. Remember, once you’ve applied the above to your examples.yaml in your data repository, reload the Jenkins jobs, like so:

      sudo jenkins-jobs --flush-cache update /etc/jenkins_jobs/config

      Best,
      -jay

      • Trinath Somanchi

        Thanks for the update Jay!.

        Is this the fix for all types of openstack (nova|neutron|cinder|heat) testing or specific to Sandbox. ??

  • Trinath Somanchi

    Hi Jay-

    The devstack installation is not done good. For every run even from the starting of the slave node installation, I get ‘keystone and nova based errors” ..

    Can you guide me on how to troubleshoot the same.??

  • Trinath Somanchi

    step:

    git clone $YOUR_DATA_REPO data

    is not required since, install-slave.sh handles the same.

  • Pattabi

    Hi Jay,
    Thanks to your detailed blog post. I was able to set up the Jenkins Master and Slave successfully.
    However, I do see the tempest test cases fail when i use the sample dsvm-tempest-full jenkins job. I believe this has nothing to with the slave node set up. Is my understanding right ?
    Thanks.
    Pattabi

  • Trinath Somanchi

    Hi Jay-

    In my current setup I’m using 2 physical machines [1] master node [2] slave node.

    I have few doubts here,

    [-1-] How I know whether a patch set is applied on the code (say cinder /nova/neutron) ? [-2-] Do I need to run devstack/clean.sh for each patchset testing so as to clean the complete devstack env ? (I think Yes)
    [-3-] The Env vars present in jenkins job configuration as not shown in $> echo $DEVSTACK_GATE_TEMPEST cmds in slave node. How can we check the env vars are correctly loaded in slave node from the jenkins job?
    [-4-] When I run tempest testing I get this error “No repository found in /opt/stack/new/tempest. Create one by running “testr init”.” Is there anything I’m missing while configuring the tempest using devstack… ??

  • Peter Wang

    Hi Jay
    I am exciting about this guide for 3-party CI System.

    I already set up the master and slave and successfully run the dsvm-tempest-full job in Jenkins locally

    i have following problem:
    1. the Job will start up devstack in slave but when JOB end, it will not unstack.sh the devstack, so after several run of Job, there are some many duplicate cinder-volume and other openstack process. could you please add this cleanup/unstack?

    2. it’s ok if the job did not unstack the devstack, after i mannual run unstack.sh under /opt/stack/new/devstack, these process did not end. do you know how to clean up the devstack?

    Thanks
    Peter

  • Peter Wang

    hi Jay, i met “Invalid OpenStack Identity credentials.”
    when job is running devstack-gate-vm.sh

    do you have any suggestion?

    Thanks

  • Isaac

    Hi Jay,

    Jenkins job fails with the following error:

    14:22:40 Started by user anonymous
    14:22:40 [EnvInject] – Loading node environment variables.
    14:22:40 Building remotely on cinder-slave (devstack_slave) in workspace /home/jenkins/workspaces/workspace/dsvm-tempest-full
    14:22:40 [dsvm-tempest-full] $ /bin/bash -xe /tmp/hudson5851200885453762148.sh
    14:22:40 + [[ ! -e devstack-gate ]]
    14:22:40 + git clone git://git.openstack.org/openstack-infra/devstack-gate
    14:23:45 Cloning into 'devstack-gate'...
    14:23:48 [dsvm-tempest-full] $ /bin/bash -xe /tmp/hudson4920296495944828678.sh
    14:23:48 + '[' -z ']'
    14:23:48 + export ZUUL_PROJECT=openstack-dev/sandbox
    14:23:48 + ZUUL_PROJECT=openstack-dev/sandbox
    14:23:48 + '[' -z ']'
    14:23:48 + export ZUUL_BRANCH=master
    14:23:48 + ZUUL_BRANCH=master
    14:23:48 + export PYTHONUNBUFFERED=true
    14:23:48 + PYTHONUNBUFFERED=true
    14:23:48 + export DEVSTACK_GATE_TIMEOUT=180
    14:23:48 + DEVSTACK_GATE_TIMEOUT=180
    14:23:48 + export DEVSTACK_GATE_TEMPEST=1
    14:23:48 + DEVSTACK_GATE_TEMPEST=1
    14:23:48 + export RE_EXEC=true
    14:23:48 + RE_EXEC=true
    14:23:48 + cp devstack-gate/devstack-vm-gate-wrap.sh ./safe-devstack-vm-gate-wrap.sh
    14:23:48 + ./safe-devstack-vm-gate-wrap.sh
    14:23:49 * Stopping NTP server ntpd
    14:23:49 ...done.
    14:23:58 13 Aug 14:24:09 ntpdate[3148]: no server suitable for synchronization found
    14:23:58 * Starting NTP server ntpd
    14:23:58 ...done.
    14:23:59 Triggered by: https://review.openstack.org/ patchset
    14:23:59 Pipeline:
    14:23:59 Available disk space on this host:
    14:23:59 Filesystem Size Used Avail Use% Mounted on
    14:23:59 /dev/sda1 19G 7.4G 11G 42% /
    14:23:59 udev 2.0G 4.0K 2.0G 1% /dev
    14:23:59 tmpfs 396M 732K 395M 1% /run
    14:23:59 none 5.0M 0 5.0M 0% /run/lock
    14:23:59 none 2.0G 152K 2.0G 1% /run/shm
    14:23:59 Setting up the host
    14:23:59 ... this takes a few seconds (logs at logs/devstack-gate-setup-host.txt.gz)
    14:37:35 Setting up the workspace
    14:37:35 ... this takes 3 - 5 minutes (logs at logs/devstack-gate-setup-workspace-new.txt.gz)
    16:27:21 Running gate_hook
    16:27:21 Job timeout set to: 52 minutes
    16:27:21 /opt/stack/new/devstack-gate/devstack-vm-gate.sh: line 287: cd: /opt/stack/new/devstack: No such file or directory
    16:27:21 ERROR: the main setup script run by this job failed - exit code: 1
    16:27:21 please look at the relevant log files to determine the root cause
    16:27:21 Cleaning up host
    16:27:21 ... this takes 3 - 4 minutes (logs at logs/devstack-gate-cleanup-host.txt.gz)
    16:27:24 Build step 'Execute shell' marked build as failure
    16:27:25 Finished: FAILURE

    It seems that something is wrong with the configuration.
    I did found devstack directory in /opt/git…… but not in /opt/stack…..
    Can you please help?

    Isaac

  • Sunil

    The following error triggers the error toward the end:
    ….
    Error: /Stage[main]/Os_ext_testing::Base/File[/opt/nodepool-scripts]: Could not evaluate: Could not retrieve information from environment production source(s) puppet:///modules/openstack_project/nodepool/scripts
    Error: /Stage[main]/Os_ext_testing::Base/File[/etc/apt/preferences.d/00-puppet.pref]: Could not evaluate: Could not retrieve information from environment production source(s) puppet:///modules/openstack_project/00-puppet.pref
    ….
    Info: Creating state file /var/lib/puppet/state/state.yaml
    Notice: Finished catalog run in 583.70 seconds
    python: can't open file '/opt/nodepool-scripts/cache_git_repos.py': [Errno 2] No such file or directory

    The folder /opt/nodepool-scripts does not exist. I noticed the same error while running install_master.sh as well.

    Any ideas? What exactly happened to puppet:///modules/openstack_project/nodepool/scripts, why did it go missing?

    Thanks
    Sunil

  • Sunil

    Looks like I got this working by checking out a revision of the config repo before the split happened…:)

    My first attempt was to fake the original config tree by combining the current two trees into one. But that did not work, and hence I decided to go with a version which is one revision before the split happened.

    You may run into other issues but I am not sure if they were specific to my setup or not. There was a mysql error, issue with cache files being in the wrong user’s folder, issue with setting rlimits for number of open files. The test completed successfully after this, although it skipped some tests, including the neutron ones. Not sure what happened with neutron service. I will investigate that next.

    Jay: I think you might wanna put this info into your troubleshooting section. Thanks for this great series of blog posts. It helped me immensely!

    Here is the patch:

    --- install_slave.sh.orig	2014-10-09 12:07:08.845785675 -0700
    +++ install_slave.sh	2014-10-09 12:08:43.192410199 -0700
    @@ -3,6 +3,7 @@
     # Sets up a slave Jenkins server intended to run devstack-based Jenkins jobs

     set -e
    +set -xv

     THIS_DIR=`pwd`

    @@ -18,6 +19,7 @@
         sudo bash -xe install_puppet.sh
         sudo git clone https://review.openstack.org/p/openstack-infra/config.git /root/config
    +    sudo /bin/bash -c "cd /root/config; git checkout 30e638cc59c472419ecca4fc0509602eeca08924"
         sudo /bin/bash /root/config/install_modules.sh
     fi

    --- /opt/nodepool-scripts/cache_git_repos.py.orig	2014-10-09 11:31:35.267106571 -0700
    +++ /opt/nodepool-scripts/cache_git_repos.py	2014-10-09 11:33:51.049090992 -0700
    @@ -24,7 +24,7 @@

     from common import run_local

    -URL = ('https://git.openstack.org/cgit/openstack-infra/config/plain/'
    +URL = ('/root/config/'
            'modules/openstack_project/files/review.projects.yaml')
     PROJECT_RE = re.compile('^-?\s+project:\s+(.*)$')

    @@ -68,7 +68,7 @@

     def main():
         # TODO(jeblair): use gerrit rest api when available
    -    data = urllib2.urlopen(URL).read()
    +    data = open(URL).read()
         for line in data.split('\n'):
             # We're regex-parsing YAML so that we don't have to depend on the
             # YAML module which is not in the stdlib.

    • http://joinfu.com/ Jay Pipes

      Thanks, Sunil! I will certainly update this post in the coming days to address the issues with the new project-config repo.

      Best,
      -jay

  • Sunil

    Quick question: if I run the job manually twice in a row, the second run fails. I think the first run is leaving behind some cruft which interferes with the second run.

    Running unstack.sh in between does not help.

    Is anybody else seeing this?

    How do people (or openstack infra) deal with this on their CI installations? I am thinking of creating an encompassing job which runs on master and restores a clean snapshot of the slave before firing the actual CI job. But this is a sad workaround for something which should be handled at infra level.

    • http://joinfu.com/ Jay Pipes

      Sunil, I recommend using a pool of single-use slave nodes for running integration tests. Otherwise, like you noticed, the Tempest tests leave a bunch of cruft around that needs to be cleaned up (and you never quite know if you’ve cleaned it all up :)

      Alternately, you might want to run Akihiro Motoki’s cleaner scripts from this repo here:

      https://github.com/amotoki/devstack-tools/blob/master/devstack-cleaner.sh

      Best,
      -jay

  • Bharat Kumar Kobagana

    Hi, Can any one please help me how to configure Cinder use my storage back-end (cinder.conf).


    Thanks in advance….