Looking for a Few Good Engineers

Do you know Python? Do you get a thrill breaking other people’s code? Do you have experience with Chef, Puppet, Cobbler, Orchestra, or Jenkins? Have you ever deployed or worked on highly distributed systems? Do you understand virtualization technologies like KVM, Xen, ESX or Hyper-V?

If you answered “Yes!” to any of the questions above and are interested in working in a distributed, high-energy engineering team on solving complex problems with cloud infrastructure software, I want to hear from you. Experience with OpenStack is a huge plus.

I’m looking for QA software engineers, software deployment and/or automation engineers and software developers who can hit the ground running and make a big impact from Day One. Feel free to email me at REVERSE('moc.liamg@sepipyaj').

Diagnose and fix PEP8 issues during code review

I figured I’d write a quick post about how to deal with “pep8 issues” that come up during code reviews on OpenStack core projects. These issues come up often for new contributors, and they can be a source of frustration until the contributor understands how to diagnose and fix them.

PEP8 is the Python Enhancement Proposal (PEP) that recommends a code style. All core (and peripheral Python) OpenStack projects validate that new code pushed to the source tree is “pep8-compliant”. When a new patchset is pushed from code review to Jenkins for the set of automated pre-merge tests, the pep8 command-line tool is run against the new source tree to ensure it meets PEP8 code style standards.

If this PEP8 Jenkins job fails, the code submitter will see a notification that the job failed, and the contributor must fix up any pep8 issues and push those fixes up for review again. Typically, this notification looks something like this:

Change subject: Added Keypair extension (os-keypairs) client and tests LP#900139
......................................................................

Patch Set 2: I would prefer that you didn't submit this

Build Unstable

https://jenkins.openstack.org/job/gate-tempest-pep8/38/ : UNSTABLE
https://jenkins.openstack.org/job/gate-tempest-merge/78/ : SUCCESS

--
To view, visit https://review.openstack.org/3179
To unsubscribe, visit https://review.openstack.org/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I34c7e9aa6a1796b8d4c3ac9b3b69438796752866
Gerrit-PatchSet: 2
Gerrit-Project: openstack/tempest
Gerrit-Branch: master
Gerrit-Owner: kavan-patil
Gerrit-Reviewer: Brian Waldon 
Gerrit-Reviewer: Jay Pipes 
Gerrit-Reviewer: Jenkins
Gerrit-Reviewer: kavan-patil

There are a couple of ways you can diagnose what style points your code violated. Probably the easiest and fastest is to just follow the link in the notification email to the Jenkins job. Clicking the link above takes me to the Jenkins job page.

Clicking on the graph on that page takes me to a details screen showing the source files and lines of code that violated pep8 rules.

I can then go to line 86 of tempest/openstack.py and investigate the code style violation.

Alternately, I could run the pep8 CLI tool on my local branch, which will tell me the pep8 violations and what to fix, as shown here:

jpipes@uberbox:~/repos/tempest$ pep8 --repeat tempest
tempest/openstack.py:86:73: W292 no newline at end of file

There we are… the tempest/openstack.py file doesn’t end with a newline. An easy fixup. :)
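If you want to script this kind of check, here is a minimal Python sketch. It assumes pep8 reports each violation in the `path:line:column: code message` form shown above. The names `Violation`, `parse_pep8_line` and `ensure_final_newline` are hypothetical helpers for illustration, not part of the pep8 tool.

```python
import re
from collections import namedtuple

# Hypothetical record type for one line of pep8 output.
Violation = namedtuple("Violation", "path line column code message")

# Matches the "path:line:column: CODE message" shape shown above.
_PEP8_LINE = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+):(?P<column>\d+): "
    r"(?P<code>[EW]\d+) (?P<message>.*)$"
)


def parse_pep8_line(text):
    """Return a Violation for one line of pep8 output, or None if it does not match."""
    match = _PEP8_LINE.match(text.strip())
    if match is None:
        return None
    return Violation(
        path=match.group("path"),
        line=int(match.group("line")),
        column=int(match.group("column")),
        code=match.group("code"),
        message=match.group("message"),
    )


def ensure_final_newline(path):
    """Append a newline to the file at `path` if it lacks one. Returns True if changed."""
    with open(path, "rb") as handle:
        data = handle.read()
    # Empty files and files that already end in a newline are left alone.
    if not data or data.endswith(b"\n"):
        return False
    with open(path, "ab") as handle:
        handle.write(b"\n")
    return True


violation = parse_pep8_line("tempest/openstack.py:86:73: W292 no newline at end of file")
print(violation.code, violation.path, violation.line)
```

Calling `ensure_final_newline(violation.path)` from the root of the source tree would then apply the fix for a W292 report like this one.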

The Science (or Art?) of Commit Messages

There are some things in the world of development that you appreciate much more when you do a lot of code reviews. One of those things is commit messages.

At first glance, commit messages seem to be a small, relatively innocuous thing for a developer. When you commit code, you type in some description about your code changes and then, typically, push your code somewhere for review by someone.

Regardless of whether the code you pushed is going to an open source project, an internal proprietary code repository, or just some code exchanged between friends working on a joint project, that simple little commit message tells the person reading your code a whole lot about you. It speaks volumes about the way you feel about the code you submit and the quality of the review you expect for your code.

As an example, suppose I was working on some code that fixed a bug. I got my code ready for initial review and I did the following:

$> git commit -a -m "Fixes some stuff"

And then I push my commit somewhere using git push.

Inevitably, what happens is that another developer will get some email or notification that I have pushed code up to some repository. It is likely that this notification will look something like this:

Change subject: Fixes some stuff
......................................................................

Fixes some stuff

Change-Id: I79bbac32b5c99742b5cb283c6e55e6204bf92adc
---
M path/to/some/changed/file
1 file changed, 1 insertion(+), 1 deletion(-)

And in the notification will be some link to a place to go do a code review.

Now, what do you think is the first thought that goes through the reviewer’s mind? My guess would be: Really? Fixes what stuff? By not including any context about what the patch is attempting to solve, you leave the reviewer with a bad taste in their mouth. And a bad taste in the reviewer’s mouth generally means one thing: a reluctance to review your patch.

OK, so what could we do to make the commit message better, to provide the reviewer with more initial context about your patch? Well, the first thing that comes to mind is to reference a specific bug that you are fixing with this patch.

Alright, so we amend our commit message to include a bug identifier:

$> git commit --amend -m "Fixes Bug 123456"

We then push our amended commit. The reviewer now gets a new notification that you’ve amended a previous patch, and this time the notification includes the bug identifier. What do you think is the next thought a typical reviewer might have? My guess is this: What, does this developer think that I’ve memorized all the bug IDs for all open bugs? How should I know what Bug 123456 is about? And here comes that bad taste in the mouth again.

OK, so this time, we will forgo the use of the time-saving -m switch to git commit and actually type a proper, multi-line commit message in our editor of choice that describes the bug that our patch fixes, including a brief description of how we fixed the bug:

$> git commit --amend  # This will open up your editor...

Now we’d enter a good commit message … something like this would work:

Fixes Bug 123456 - ImportError raised improperly in Storage Driver

Due to a circular dependency, an ImportError was improperly
being thrown when the storage driver was set to XYZ. Rearranged
code to remove circular dependency.

The commit message now will give the reviewer everything they need in the notification to understand what the patch is for and how you solved a bug, without needing to go to their bug tracker to figure out what the bug was about.
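The three commit messages above can be compared mechanically. The following is only a rough sketch of such a check, not a tool any project uses: `check_commit_message` is a hypothetical helper, and the 72-character subject limit is an assumed convention you should adjust to your project.

```python
import re

# Assumed limit for the summary line; adjust to your project's convention.
MAX_SUBJECT_LENGTH = 72


def check_commit_message(message):
    """Return a list of problems found in a commit message (empty if none)."""
    lines = message.strip().splitlines()
    if not lines:
        return ["commit message is empty"]

    problems = []
    subject = lines[0]
    if len(subject) > MAX_SUBJECT_LENGTH:
        problems.append("subject line is longer than %d characters" % MAX_SUBJECT_LENGTH)
    # A bare bug reference tells the reviewer nothing about the change itself.
    if re.match(r"^(Fixes\s+)?Bug\s+#?\d+\.?$", subject, re.IGNORECASE):
        problems.append("subject only names a bug; describe the problem too")
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line should be blank")
    # Everything after the blank separator line counts as the body.
    body = [line for line in lines[2:] if line.strip()]
    if not body:
        problems.append("no body explaining what changed and why")
    return problems


good_message = """Fixes Bug 123456 - ImportError raised improperly in Storage Driver

Due to a circular dependency, an ImportError was improperly
being thrown when the storage driver was set to XYZ. Rearranged
code to remove circular dependency.
"""

for candidate in ("Fixes some stuff", "Fixes Bug 123456", good_message):
    print(repr(candidate.splitlines()[0]), check_commit_message(candidate))
```

Only the last message comes back with an empty list of problems. A check like this catches missing structure, but it cannot tell whether the description is actually useful to the reviewer.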

A detailed commit message shows you care about the time that reviewers spend on your patch and that you value the code you are submitting.

Presentation: OpenStack QA – Walkthrough of Processes, Tools and Code

Last night I gave a short webinar to some folks about the basics of contributing to the Tempest project, which is the OpenStack integration test suite. It was the first time I’d used Google Docs to create and give a presentation and I must say I was really impressed with the ease-of-use of Google Docs Presentation. Well done, Google.

Anyway, I’ve uploaded a PDF of the presentation to this website and provided a link to the Google Docs presentation along with a brief overview of the topics covered in the slides below. As always, I love to get feedback on slides. Feel free to leave a comment here, email me or find me on IRC. Enjoy!

Google Presentation (HTML)
PDF slides


Topics included in the slides:

  • OpenStack Contribution Process
  • Running Devstack Locally
  • Running Tempest against an Environment
  • Walkthrough of the Tempest Source Code
  • Progressively improving a test case
  • Common Scenarios in Code Review and Submission

OpenStack Dev Tip — Easily Pull a Review Branch

Just a quick tip for developers working on OpenStack projects who work on multiple development machines or want to pull a colleague’s code from the Gerrit review system and test it locally.

If you have followed the instructions about setting up a development environment successfully, you will have installed the git-review tool that Jim Blair and Monty Taylor maintain. The git-review tool has a nice little feature that enables you to easily pull any branch that anyone has pushed up to code review:

$> git review -d $REVIEW_NUM

The $REVIEW_NUM variable should be replaced with the identifier of the review branch in Gerrit.

For example, I developed some code on my laptop that I now want to pull to my beefier work machine. The original branch is failing a few tests in Jenkins and I want to diagnose what’s going on. The review branch is here: https://review.openstack.org/#change,1656. The review number (ID) is 1656.

To grab that branch into my local environment and check it out, I do:

jpipes@uberbox:~/repos/glance$ git review -d 1656
Downloading refs/changes/56/1656/2 from gerrit into review/jay_pipes/bug/850377

Doing a git status, you’ll note that I am now in the local branch called review/jay_pipes/bug/850377:

jpipes@uberbox:~/repos/glance$ git status
# On branch review/jay_pipes/bug/850377
# Your branch and 'gerrit/master' have diverged,
# and have 1 and 2 different commit(s) each, respectively.
#
nothing to commit (working directory clean)

I can now run tests, diagnose the issue(s), fix code up and do a:

$> git commit -a --amend
$> git review

And my changes will be pushed up to the original review in Gerrit for others to look at.
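If you are curious where the refs/changes/56/1656/2 name in the download output comes from, here is a small sketch. It assumes Gerrit's usual convention of grouping changes under the last two digits of the change number, followed by the change number and the patchset number. The function name `gerrit_change_ref` is hypothetical and not part of git-review.

```python
def gerrit_change_ref(change_number, patchset):
    """Build the refs/changes/... name for a Gerrit change and patchset."""
    # Changes are grouped by the last two digits of the change number,
    # zero-padded (so change 7 would land under "07").
    shard = "%02d" % (change_number % 100)
    return "refs/changes/%s/%d/%d" % (shard, change_number, patchset)


print(gerrit_change_ref(1656, 2))  # refs/changes/56/1656/2
```

git-review works this out for you, so the sketch is only useful for understanding the output or for fetching a ref by hand.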

Essex Design Summit — QA Sessions to Note

There are quite a few folks interested in QA coming to the OpenStack
Essex Design Summit next week. I wanted to give you all a heads-up on
the sessions that may be of interest to you.

Here they are:

Monday, Oct 3rd:

09:30-10:25 – Essex Release Cycle

Thierry Carrez, our illustrious release manager, will do a post-mortem
on the Diablo release cycle and discuss potential changes for the
Essex release cycle. I know almost all QAers have expressed desires to
have maintenance branches managed by the QA team and I’ve heard
suggestions about various QA-centric freeze points. Those interested
in advocating for these things should plan to attend this session.

14:00-14:45 – Stable Release Updates

Dave Walker from Canonical plans to outline some possibilities for how
to maintain and update stable releases of OpenStack projects.

15:00-15:45 – Separating API from Implementation of the API

Total self-promotion of a session I’ve proposed… I think anyone
interested in stabilizing the OpenStack APIs and having OpenStack APIs
become the open standards for the cloud computing industry should
attend.

16:30-17:15 – OpenStack Compute API 2.0

Glen Campbell will be leading a discussion about how to improve the
Compute (Nova) API for a 2.0 API series. I think it’s important that a
number of folks on the QA team attend this session and get an idea of
the things that we will be looking at in the future regarding the
Compute API. Personally, I’m definitely planning on attending this
one.

17:30 – 17:55 – NetStack Continuous Integration Planning

Personally, I will not be at this session as I have another session to
lead. However, I think it is important that a number of people from
the QA team attend this session, listen to the needs of the NetStack
contributors, voice our support for their projects, explain what the
goals of our team are, and enable some cross-team collaborative
efforts around CI and QA.

Tuesday, Oct 4th:

09:30 – 09:55 – Documentation Strategies for OpenStack

Anne Gentle will be leading a discussion about documentation of
OpenStack projects. One of the deliverables of the OpenStack QA team
is clearly to identify areas where specifications don’t match
behaviour, so I think it’s pretty critical that the Doc Team and the
QA team be on the same page when it comes to how to coordinate
communication of documentation discrepancies.

09:30 – 09:55 – VM Disk Management in Nova

At the same time as the documentation session, Paul Voccio is leading
a discussion about VM disk management in Nova. Those QAers focusing on
disk/volume management may want to attend this session to ensure the
QA team has a good grasp of changes coming in this arena.

10:00 – 10:25 – OpenStack Common

Brian Lamar will be leading a discussion on getting serious about the
potential of an openstack-common Python library of common code shared
amongst many OpenStack projects. Hey, it’s a heck of a lot easier to
QA code that’s in one location than the same code, written with slight
differences, spread across many projects… seems like a no-brainer
for the QA team to attend and support this idea. :)

11:00 – 11:25 – Monitoring in Swift

John Dickinson will be leading a session to discuss what things should
be monitored across a Swift cluster, and what tools are available for
monitoring. I think this discussion will be valuable for those of us
interested in long-running production integration tests where Swift is
one of the components of a full OpenStack test cluster.

12:00 – 12:25 – Integration Test Suites and Gating Trunk

A no-brainer… in this session we will talk about the various
integration test suites for Nova/Glance/Keystone and discuss the
effort already underway to combine them. In addition, we will talk
about what policies to recommend for OpenStack projects regarding what
level of passing integration tests should hold up a gated trunk.

15:30 – 15:55 – Making VM State Handling More Robust

Phil Day is leading a discussion about ways in which the handling of
VM state transitions can be inconsistent and confusing. Since the QA
team is responsible for documenting just such inconsistencies and
building test cases for such inconsistent behaviour, I think this
session would be good to hang around in and listen/take notes.

16:30 – 17:25 – OpenStack Faithful Implementation Test Suites (FITS)

Josh McKenty will be talking about certain proposals regarding a FITS
for OpenStack APIs. Should be an interesting session :)

Wednesday, Oct 5th:

09:30 – 10:25 – XenServer/KVM Feature Parity Plan

This session should be good for those QAers interested in identifying
areas where feature parity between hypervisors is lacking, and
discussing ways in which the QA team can document these disparities
and produce tests for identifying future disparity among hypervisors.

11:00 – 11:45 – Glance Throughput Improvements

This session is being led by Tim Reddins, from HP, who (along with his
team) has done some analysis on ways to improve Glance’s throughput.
QAers interested in stress, capacity, and parallelism testing should
definitely attend!

11:30 – 11:55 – Nova Upgrades

Ray Hookway will talk about ways that Nova’s update process can be
made more robust. I imagine that the talk’s recommendations will be
generally applicable to many OpenStack projects, not just Nova. I also
think that some members of the QA team should attend — we should be
able to create functional tests for upgrade processes for all
OpenStack projects…

14:30 – 14:55 – Git/Gerrit Best Practices

Monty Taylor is leading this session on Gerrit/Git best practices. I
recommend everyone go, if only to see the fireworks.

15:30 – 15:55 – Quality Assurance in OpenStack

Uhm, duh, you should all be at this one. :) We’ll discuss how to
divide the voluminous amount of work among our members, talk about
which projects (and components within certain projects) are
high-priority items, the ways we should communicate and track
progress, etc.

17:00 – 17:25 – Internal Service Communication

Brian Waldon is leading a session on internal service communication
that should be quite interesting. The integration testing coverage of
major internal service components of Nova is currently light, and is
one of those areas I think should be carefully picked over by our QA
team.

OK, those are the recommendations from me, but of course, feel free to
attend whatever sessions are of most interest to you. I’m very much
looking forward to meeting all of you (we’re up to 28 members as of
this writing).

Cheers, and see you tomorrow!
-jay

Developing Nova on Linux – Getting Started

In the past few weeks, I’ve gotten involved in the newly-debuted OpenStack project. Right now, my focus is on the Compute sub-project of the stack, called Nova. The initial pieces I am focusing on are the unit tests and end-to-end systems testing of the compute stack.

I struggled over the last couple of days to solve a bug that turned out to be not a bug at all, but an issue with the Python development environment I use. I figured I’d write a blog article for those Python developers who are looking to contribute to the Nova project and may also be struggling to get up and going.

If you’re contributing to an open source project like Nova, you’ll want to be able to work on multiple branches of the source code at the same time — for instance, if you’re working on fixing a few bugs simultaneously.

There are quite a few dependencies for Nova, and, because of the way Python searches for packages, it’s imperative that you use a tool such as virtualenv to isolate your multiple branches into their own development environments. Otherwise, as I learned today, the location of your site-packages and what has previously been installed on your development machine can wreak havoc on you. :)

NOTE: For this article, I assume the reader is on Debian/Ubuntu Linux, since that is what I use as my development machine. If you’re on a different flavour of Linux, feel free to adapt the instructions here to suit your particular package manager.

Installing the Tools for Installing the Tools

Before we get into our virtual development environments, you’ll first want to ensure you’ve got a few packages installed, including bzr, libssl-dev, swig and virtualenv. The following should do the trick:

sudo apt-get install -y swig libssl-dev bzr python-virtualenv

A Setup for Source Control and Virtual Environments

In order to get properly set up to contribute to the Nova project, you’ll want to set up a local repository to keep branches of source code that you work on. Although bzr is not required as your revision control system, I use bzr myself and will use it in this article. Adapt as needed if you use git-bzr or similar.

I like to have the following directory structure for working on Python projects:

~/repos/$projectname/ <-- shared repository for branches of your project
~/repos/$projectname/trunk <-- local trunk branch
~/repos/$projectname/$branch <-- a branch you work in
~/virtenvs/$projectname/ <-- Development environments for your project
~/virtenvs/$projectname/$branch <-- development environment for a branch you work in
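If you script your setup, the layout above maps onto a few path joins. This is just a sketch: `branch_paths` is a hypothetical helper, and the `home` parameter is an assumption added so the base directory can be overridden.

```python
import os


def branch_paths(project, branch, home="~"):
    """Return the repository, branch and virtualenv directories for a branch."""
    home = os.path.expanduser(home)
    return {
        # Shared repository for all branches of the project.
        "repo": os.path.join(home, "repos", project),
        # Working branch inside the shared repository.
        "branch": os.path.join(home, "repos", project, branch),
        # Isolated development environment for that branch.
        "virtualenv": os.path.join(home, "virtenvs", project, branch),
    }


for name, path in sorted(branch_paths("nova", "bugXXXXX").items()):
    print(name, path)
```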

Assuming you want to contribute to the Nova project and you want to work on fixing a bug #XXXXX, then the following would get you started:

bzr init-repo ~/repos/nova
cd ~/repos/nova
bzr branch lp:nova trunk
bzr branch trunk bugXXXXX
mkdir -p ~/virtenvs/nova

At this point, we'll go ahead and create a virtual development environment for bugXXXXX:

cd ~/virtenvs/nova
virtualenv --no-site-packages bugXXXXX
cd bugXXXXX
source bin/activate

At this point, you'll notice your prompt change, indicating that you are now in a virtual development environment. The --no-site-packages flag ensures that your locally-installed Python packages aren't included in Python's module search path when inside your virtual environment.
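If you would rather not rely on the prompt alone, you can ask the interpreter itself. A minimal sketch, assuming the usual markers: older virtualenv releases set `sys.real_prefix`, while newer virtualenv releases and the standard library venv module leave `sys.base_prefix` pointing at the original Python. The function name `in_virtual_environment` is hypothetical.

```python
import sys


def in_virtual_environment():
    """Return True if the running interpreter appears to belong to a virtual environment."""
    # Older virtualenv releases record the original prefix in sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Otherwise compare against sys.base_prefix, falling back to sys.prefix
    # on interpreters that do not define it.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print("virtual environment:", in_virtual_environment())
print("module search path:")
for entry in sys.path:
    print("   ", entry)
```

Running this inside and outside the activated environment should show different search paths, which is exactly the isolation we are after.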

Next step is to install into this virtual development environment all the packages and dependencies we'll need. This should do the trick:

easy_install twisted tornado boto M2Crypto IPy carrot mox redis
easy_install http://python-gflags.googlecode.com/files/python_gflags-1.3-py2.5.egg

Alright, next we simply link to our bzr branch location from inside the virtual environment and run the Nova test suite:

ln -s ~/repos/nova/bugXXXXX bugXXXXX
cd bugXXXXX
python run_tests.py

If all went smoothly, you'll see all test cases passing. :)

Having issues getting up and running? Find us on Freenode IRC #openstack.

See ya,

Jay

Macro Support in new Drizzle Client Console?

Hi all!

I’ve been reading through the requested features for the new client on the wiki here:

I think all the stuff on that link is excellent so far. I’d also like to request a feature that I think will be a really cool timesaver for DBAs and developers using Drizzle.

Macro Support

Remember, “way back when” you used Microsoft Excel and were able to start recording your actions, then when you stopped recording, Excel would store a “macro” of your actions that you could subsequently replay?

I think this would be incredibly useful for folks who do repetitive work in the console.

Sure, I know, I know…the first thing folks will say is “but HEY, you guys removed stored procedures!” Yeah, yeah… but the feature I’m proposing here is different from stored procedures in the following ways:

  1. It’s entirely client-side. There is no server-side storage/cache, processing, parsing, or anything.
  2. It’s not limited to the small subset of SQL that stored procedures (at least in MySQL) currently support. Anything the new client can do would be able to go into a macro.
  3. Since the client is in Python, the macros are themselves re-writable in a scripting language. This gives the recorded macros incredible flexibility.
  4. No fussing with SQL stored procedure permissions at runtime (you know, the silly INVOKER/DEFINER crap)
  5. Ability to interact with result sets in the macro. Just try doing that easily in a SQL stored procedure. Using CURSORs is incredibly clunky and ugly. Applying a Python function or closure/lambda to each row of a result set is elegant and easy.
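To give a feel for the last point, here is a small sketch of applying a lambda to each row of a result set. The new Drizzle client's API isn't shown here, so the standard library's sqlite3 module is used purely as a stand-in. The table contents and the helper name `apply_to_rows` are made up for illustration.

```python
import sqlite3

# sqlite3 is only a stand-in here; this is not the Drizzle client API.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE sales (manager TEXT, amount REAL)")
connection.executemany(
    "INSERT INTO sales (manager, amount) VALUES (?, ?)",
    [
        ("myboss@company.com", 100.0),
        ("myboss@company.com", 250.5),
        ("other@company.com", 40.0),
    ],
)


def apply_to_rows(cursor, func):
    """Call func on every row produced by cursor and collect the results."""
    return [func(row) for row in cursor]


cursor = connection.execute(
    "SELECT manager, amount FROM sales WHERE manager = ? ORDER BY amount",
    ("myboss@company.com",),
)
# Convert each amount to whole cents with a one-line lambda.
amounts_in_cents = apply_to_rows(cursor, lambda row: int(round(row[1] * 100)))
print(amounts_in_cents)
connection.close()
```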

Imagine the following rough example interface…

drizzle> RECORD MACRO "sales_report_with_email" (to_email);
Macro recording started.

drizzle> mode python;
In python mode.

python> import datetime
python> today= datetime.datetime.now().isoformat()
python> filename= "%s-%s-%s" % ("sales", to_email, today)
python> Ctrl-D

drizzle> SELECT * FROM sales
         WHERE manager = @to_email; > csv(@filename);
drizzle> mode python;
In python mode.

python> report_txt= open(filename, "r+b").read()
python> import smtplib
python> mailserver = smtplib.SMTP('localhost')
python> mailserver.sendmail('theboss@company.com', to_email, report_txt)
python> mailserver.quit()
python> print "Mail sent to %s\n" % to_email
python> Ctrl-D

drizzle> STOP MACRO;
Macro "sales_report_with_email" saved.

drizzle> macro("sales_report_with_email", "myboss@company.com");
Mail sent to myboss@company.com

Pretty powerful, eh?

If you follow the flow above, you will notice the only real trick to solve is passing the macro’s arguments into the console’s variable array, and from the console’s variable array into the Python interpreter’s variable scope. But this is a fairly simple problem to solve…
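Here is a rough sketch of how that variable hand-off might work. Everything in it is hypothetical: the `Macro` class, its methods, and the plain dict standing in for the console's variable array are my own illustration, not anything that exists in the Drizzle client. It uses Python's built-in `exec` with an explicit namespace, which is reasonable here only because the macro source is code the user typed into their own console.

```python
class Macro(object):
    """Hypothetical client-side macro: recorded Python source plus parameter names."""

    def __init__(self, name, parameters):
        self.name = name
        self.parameters = list(parameters)
        self.lines = []

    def record(self, line):
        """Store one line of Python source typed while recording."""
        self.lines.append(line)

    def run(self, console_variables, *arguments):
        """Replay the macro against the console's variables and return the namespace."""
        if len(arguments) != len(self.parameters):
            raise TypeError(
                "%s expects %d argument(s)" % (self.name, len(self.parameters))
            )
        # Step 1: macro arguments go into the console's variable array.
        console_variables.update(zip(self.parameters, arguments))
        # Step 2: the console's variables seed the interpreter's scope.
        namespace = dict(console_variables)
        exec("\n".join(self.lines), namespace)
        # Step 3: copy new names back so later console statements can
        # refer to them (for example @filename in the session above).
        for key, value in namespace.items():
            if not key.startswith("__"):
                console_variables[key] = value
        return namespace


macro = Macro("sales_report_with_email", ["to_email"])
macro.record('filename = "%s-%s" % ("sales", to_email)')

variables = {}
macro.run(variables, "myboss@company.com")
print(variables["filename"])  # sales-myboss@company.com
```

The design choice worth noting is that the macro never touches the console's variables directly while it runs. It works on a copy, and only the resulting names are merged back afterwards.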

Thoughts? Suggestions? If you’ve got comments, please feel free to share here, or on the Drizzle Discussion mailing list, or even update the wiki pages posted above. Thanks! :)