Answering the existential question in OpenStack

There have been a slew of threads on the OpenStack Developer mailing list in the last few months that touch on overall governance and community issues. Thierry Carrez’s “the future of the integrated release” thread sparked discussions around how we keep up with the ever-increasing number of projects in the OpenStack ecosystem while maintaining quality and stability. Discussions around where to place the Rally program, suggestions of creating a Neutron Incubator program, and the debate around graduating Zaqar from incubation all bring up fundamental questions about how the OpenStack community is going to cope with the incredible growth it is experiencing.

Underneath these discussions, however, is an even more pivotal question that remains to be answered. Chris Dent put it well in his recent response on the Zaqar thread:

Which leads inevitably to the existential questions of What is OpenStack in the business of? and How do we stay sane while being in that business?

I have a few thoughts on these questions. A definitive opinion, if you will. Please permit me to wax philosophical for a bit. To be fair, my thoughts on this topic have changed pretty dramatically over the last few years, and as my role in the OpenStack world has evolved from developer to operator/deployer to developer again, from being on the Project Policy Board (remember that?), to being on the Technical Committee, off the TC, and then on it again.

What is OpenStack in the business of?

I believe OpenStack is in the business of providing an extensible cloud platform built on open standards and collaboration.

In other words, we should be a Big Tent. Our policies and our governing structure should all work to support people that collaborate with each other to enhance the experience of users of this extensible cloud platform.

What would being a Big Tent mean for the OpenStack community?

To me, being a Big Tent means welcoming projects and communities that wish to:

“speak” the OpenStack APIs
extend OpenStack’s reach, breadth, and interplay with other cloud systems
enhance the OpenStack user experience

Note that I am deliberately using the generic term “user” in the last bullet point above. Users include application developers writing against one or more public OpenStack APIs, developers of the OpenStack projects, deployers of OpenStack, operators of OpenStack, and anyone else who interfaces with OpenStack in any way.

What would being under the OpenStack tent mean for a project?

What would “being under the tent” mean for a project and its community? I think there’s a pretty simple answer to this as well.

A project included in the OpenStack Big Tent would mean that:

The project is recognized as being a piece of code that enhances the experience of OpenStack users
The project’s source code lives in the openstack/ code namespace

Clearly, I’m talking here only about projects that wish to be included in the OpenStack tent here. I’m not suggesting that the OpenStack community go out and start annexing parts of other communities and bringing their source code into the OpenStack tent willy-nilly!

…and what would being under the tent not mean for a project?

Likewise, being under the OpenStack Big Tent should not mean any of the following:

That the project is the only way to provide a specific function to OpenStack users
That the project is the best way to provide a specific function to OpenStack users
That the project receives some allotted portion of shared technical or management resources

Note that the first two bullet points above go towards my opinion that the OpenStack community should embrace competition both within its ecosystem as well as embrace competition in the external community by identifying ways to increase interoperability. Specifically, I don’t have a problem with having projects under the OpenStack tent that share some common functionality or purpose.

What requirements must projects meet?

Finally, there is the question of what requirements a project that wishes to be under the OpenStack tent must meet? I think minimal is better here, as it lowers the barriers to entry for interested parties. I think the below items keep the bar high enough to ensure that applicants aren’t just trying to self-promote without contributing to the common good of OpenStack.

Source code is licensed under the Apache License or a compatible FLOSS license and code is submitted according to a Developer Certificate of Origin
There should be liaisons for release management and cross-project coordination
There should be documentation that clearly describes the function of the project, its intended audience, any public API it exposes, and how contribution to the project is handled

And that’s it. There would be no further requirements for a project to exist under the OpenStack tent.

Predicting a few questions

I’m going to go out on a limb and try to predict some questions that might be thrown my way about my answer to the existential question of OpenStack. Not only do I believe this is a good exercise for anyone who plans to defend their thoughts in the court of public opinion, but I think that having concrete answers to these questions will help some folks recognize how this change in overall OpenStack direction would affect specific decisions and policies that have come to light in recent months.

What about the gate? How will the gate possibly keep up with all these projects?

The gate is the test platform that guards the branches of the OpenStack projects that are in the Integrated OpenStack release against bugs that result from merging flawed code in a dependent project. This test platform consists of hundreds of virtual machines that run a set of integration tests against OpenStack environments created using the Devstack tool.

The gate is also composed of a group of talented humans who each have demonstrated heroic characteristics in the past four years during periods in which the gate or its underlying components get “wedged” and needs to be “unstuck.”

These talented engineers, known collectively as the OpenStack Infra team, are a resource that is shared among projects in the OpenStack community. Currently, in the OpenStack community means the set of code that lives in the openstack/ namespace AND the stackforge namespace. However, while the OpenStack Infra team is shared among all projects in the OpenStack community, the gate platform is NOT shared by all projects in the community. Instead, the gate platform (the integrated queue) is only available to the projects in the OpenStack community that are in the incubated and integrated project statuses.

Now, it is 100% true that given the current structure of our gate test platform, for each additional project that is included in the current OpenStack Integrated Release, there is an exponential impact on the number of test machines and the amount of test code that will be run in the gate platform. I will be the first to admit that.

However, I believe the gate test platform does not need to impose this limitation on our development and test processes. I believe the gate, in its current form, produces the kind of frustration that it does because of the following reasons:

We don’t use fail-fast policies in the gate
We don’t use hierarchical test construction properly in the gate
We assume bi-directional test dependence when all dependencies are uni-directional
We run tests against code that a patch does not touch
We only have one gate
Expectations of developers trying to merge code into a tree are improperly set

By rethinking what the gate test platform is doing, and especially rethinking the policies that the gate enforces, I think we can test more projects, more effectively. I’m not going to go into minute detail for each of the above items, as I will do that in a followup post. The point here is to step back a bit and see if the policies we’ve given ourselves might just be the cause of our woes, and not the solution we think they might be.

Would there continue to be an incubation process or status?

Short answer: No.

The status of incubation within OpenStack has become the end goal for many projects over the last 3 years. The reason? Like it or not, the reason is because being incubated means the project gets to live in the openstack/ code namespace. And being in the openstack/ code namespace brings an air of official-ness to a project. It’s not rational, and it’s not sensible, but the reality is that many companies will simply not dedicate resources (money and people, which cost money) to the development of a project that does not live in the openstack/ code namespace. If our goal, as a community, is to increase the adoption of projects that enhance the user experience of OpenStack, then the incubation status, and the barrier to inclusion that comes with it, is not helping that goal.

The original aims of the incubation process and status were to push projects that wished to be called “OpenStack” to adopt a consistent release cadence, development workflow and code review system. It was assumed that by aligning these things with other projects under the OpenStack tent, that these projects would become better integrated with each other, by nature of being under the same communal constraints and having to coordinate their development through a shared release manager. These are laudable goals, but in practice, incubation has become more of a status symbol and political campaign process than something that leads to increased integration with other OpenStack projects.

Do we still want to encourage projects that live under the OpenStack tent to work with each other where appropriate? Absolutely yes. However, I don’t think the existing incubation process, which features a graduation review (or three, or four) in front of the Technical Committee, is something that is useful to OpenStack users.

Application developers want to work with solid OpenStack APIs and easy-to-use command line and library interfaces. The incubation process doesn’t address these things specifically.

Deployers of an OpenStack cloud platform want to deploy various OpenStack projects using packages or source repositories (and sometimes both). The process of incubation doesn’t address the stability or cohesiveness of operating system packages for the OpenStack project. This is something that would be better handled with a working group comprised of folks from the OpenStack and operating system distributions, not by a graduation review before the Technical Committee.

Operators of OpenStack cloud platforms want the ability to easily operate, maintain, and upgrade the OpenStack projects that they choose to utilize. Much of the operator tooling and utilities live outside of the openstack/ code namespace, either in the stackforge/ code namespace or in random repositories on GitHub. By not having OpenStack-specific operator tooling in the openstack/ code namespace, we de-legitimize these tools and make the business decision to use them harder. The incubation process, along with the Technical Committee “come before the court” bottleneck, doesn’t enable these worthy tools and utilities to be used as effectively as they could be, which ultimately means less value for our operator users.

Won’t the quality of OpenStack suffer if we let too many projects in?

No, I don’t believe it will. Or at least, I don’t believe that the quality of OpenStack as a whole is a function of how many projects we let live in the openstack/ code namespace. The quality of OpenStack projects should be judged separately, not as a whole. It’s neither fair to older, mature projects to group them with newcomers, noris it reasonable to hold new projects to the standards of stability and quality that 5+ year old software has evolved to.

I think instead of having an OpenStack Integrated Release, we should instead have tags that describe the stability and quality of a particular OpenStack project, along with much more granular integration information.

In other words, as a deployer, I couldn’t care less whether Ceilometer is “in the integrated release.”

What I do care about, however, is whether the package of Neutron I’m installing emits notifications in a format that the package of Ceilometer I’m installing understands and tracks.

As a deployer, I don’t care at all that Swift is “in the integrated release.”

What I do care about is whether my installed Glance package is able to store images in and stream images from a Swift deployment I installed last May.

As a cloud user, I don’t care at all that Nova is “in the integrated release.”

What I do care about is whether the version of Nova installed at my cloud provider will work with the version of python-novaclient I happened to install on my laptop.

So, instead of going through the arduous and less-than-useful process of graduation and incubation, I would prefer we spend our limited resources working on very specific documentation that clearly tells OpenStack users about the tested integration points between our OpenStack client, server and library/dependency projects.

Would MarconiZaqar be in the integrated OpenStack release under this scheme?

No. I don’t believe we need an integrated release. Yes, Zaqar would be under the OpenStack tent, but no, there would be no integrated release.

Would Stacktach, Monasca and Ceilometer both live in the `openstack/` code namespace?

If that is something the Stacktack and Monasca communities would like, and once they meets the requirements for inclusion under the OpenStack tent, then yes. And I think that’s perfectly fine. Having Stacktach under the same tent does not mean Ceilometer isn’t a viable option for OpenStack users. Nor does it mean that Stacktach is the best solution in the telemetry space for OpenStack users.

Our decisions should be based on what is best for our OpenStack users. And I think giving our users choices in the telemetry space is what is best for our users. What works for one user may not be the best choice for another user. But, if we focus on the documentation of our projects in the OpenStack tent (see the requirements for projects to be under the tent, above), we can let these competing visions co-exist peacefully while having our user’s best interests in our hearts.

Would Stacktach and Ceilometer both be called “OpenStack Telemetry” then?

No. I don’t believe the concept of Programs is useful. I’d like to get rid of them altogether and replace the current programs taxonomy with a looser tag-based system. Monty Taylor, in this blog post, has some good ideas on that particular topic, including using a specific tag for “production readiness”.

Would the Technical Committee still decide which projects are in OpenStack?

No. I believe the Technical Committee should no longer be this weird court before which prospective projects must plead their case for inclusion into OpenStack. The requirements for a project to be included in the OpenStack tent should be objective and finite, and anyone on the new project-config-core team should be able to vet applicants based on the simple list of requirements.

How would this affect whether some commercial entity can call its product OpenStack?

This would not affect trademark, copyright, or branding policies in any way. The DefCore effort, of which I have pretty much steered as clear of as possible, is free to concoct whatever legal and foundation policies it sees fit with regards to what combinations of code and implementation can be called “powered by OpenStack”, “runs on OpenStack”, or whatever else business folks think gives them a competitive edge. I’ll follow up on the business/trademark side of OpenStack in a separate post, but I want to make it clear that my proposed broadening of the OpenStack tent has no relation to the business side of OpenStack. This is strictly a community approach.

What would happen to the `stackforge/` code namespace?

Nothing. Projects that wished to continue to be in the OpenStack ecosystem but do not meet the requirements for being under the OpenStack tent could still live in the stackforge/ code namespace, same as the they do today.

What benefits does this approach bring?

The benefits of this approach are as follows:

Clarity: due to the simple and objective requirements for inclusion into the OpenStack tent, there will be a clear way to make decisions on what should “be OpenStack”. No more ongoing questions about whether a project is “infrastructure-y enough” or aggravation about projects’ missions or functionality overlapping in certain areas.
Specificity: no more vague reference to an “integrated release” that frankly doesn’t convey the information that OpenStack users are really after. Use quality and integration tags to specify which projects integrate with which versions of other projects, and focus on quality, accurate documentation about integration points instead of the incubated and integrated project status symbols and graduation process.
Remove the issue of finite shared resources from the decision-making process: Being “under the OpenStack tent” does not mean that projects under the tent have any special access to or allocation of shared resources such as the OpenStack infrastructure team. Decisions about whether a project should be “in OpenStack” therefore need not be made based on resource constraints or conditions at a specific point in time. This means fewer decisions that look arbitrary to onlookers.
Convey an inclusive attitude: Clearer, simpler, more objective rules for inclusion into the OpenStack tent means that the OpenStack community will present an inclusive attitude. This is the opposite of our community’s reputation today, which is one where we are viewed as having meritocracy at a micro-level (within the contributor communities in a project), but a hegemony at a macro-level, with a Cabal of insiders judging whether programs meet subjective determinations of what is “cloudy enough” or “OpenStack enough”. The more inclusive we are as a community, the more we will attract the application developers to the OpenStack community, and its on the backs of application developers that OpenStack will live or die in the long run.

Thanks for reading. I look forward to your comments.