084. How Many On the Team, Exactly?
Hardcore Software by Steven Sinofsky (Audio Edition) - A podcast by Steven Sinofsky
 
Much of what Hardcore Software has been about was what we were building (and why). This chapter is about how. Specifically, I wanted to delve into the management structure and what we worked through to restore efficacy and build a new kind of Windows team. Over the next few posts, we will journey through understanding the cultural challenges the team faced, figuring out a plan to lay the foundation to address those, and then putting that plan into action. This first post gets to the core of understanding what precisely the team is building by figuring out how many people work on what projects. That should be simple, no?

Back to 083. Living the Odd-Even Curse [Ch. XII]

When you move into a new job there are a lot of things that need to come first, too many. You want to touch base with the most critical individuals, but don’t want to minimize the importance of those less so. You want to focus on the high priority areas of work, but it is times like this that the lower priority work hardly needs to be reminded of that fact. You’re dying to ask a lot of questions, but people are dying to tell you things. Then there’s the political reality that the many things pushed to you or that arrive in your inbox are often those least needing your attention, but most likely to notice a lack of attention.

I had all this to think about while both being reminded of Windows Vista every day and needing to let the team in place finish the project without interference, inadvertent or otherwise.

It was extremely weird to commit myself to learning about Windows and the Windows team. I had, after all, essentially grown up around it, just not in it. I knew the Windows product. I knew the Windows people. I just had no idea how the people made the product. I knew the organization at a super granular level from Windows 3.x and Windows 95 working on toolbars and app compat and the shell from C++ and Office.
I knew things at a strategic and executive level.

I had a high-altitude view of the organization, and I knew a lot of individuals, but between a few feet off the ground and 50,000 feet above the ground I had a lot to learn.

Little was as it seemed, however, when it came to the details.

There’s a well-known military principle on knowing the difference between lessons and lessons learned, between reading about something and having learned that same thing through the experience of changing how one operates. Any management book will tell you to know the budget and resources on a team and that’s a good lesson. In the Microsoft culture filled with cookie licking, shiny objects, and side projects, my most important lesson learned is to actively track how many developers there are and what code they are writing. Every time I was uncertain of what a team was actually building or if a project was real, understanding the number of developers assigned to which code was the most valuable information to have and the most critical to keeping a project on track. I learned that with NetDocs, the Tablet PC, and so many 1990s internet projects long since passed. Everything other than actual working developers is just talk.

Therefore, the first thing I chose to do was to get a handle on the composition of the team. With the help of Kristen Roby Dimlow (KristenD) from HR, one thing became clear: I was in a new world. KristenD was previously our finance partner in Office, coincidentally, and brought with her a refreshing analytical view of the structural challenges I (now we) faced. Kristen began immediately trying to collect the data on who was doing what.

In Office, headcount, resource allocation, and org structure were readily visible and, for the most part, easy to figure out by looking at the company’s online system headtrax. Windows was a different apparatus.
While there was a headcount number, what people were working on, for whom they were working, and even their actual physical location were all less clear, or fuzzy.

In keeping with Windows tradition of reorgs that “split the baby,” or product organizations that were structured such that accountability and ownership were muddled, the job I was given was not as much the “Windows job” as most would at first perceive. This wasn’t a surprise at all to me; I knew what I was getting into. This was the Windows Client team previously described. COSD remained separate as it was already.

To KevinJo and SteveB, accountability was clear, even if the organization structure and people were not. This was typical Microsoft accountability in the 2000s. I was their Windows “guy” and decidedly on the hook to figure out what comes next. Kevin had a huge amount to figure out. He was clear just how much he was hoping I could wrap my brain around with respect to “what comes after Vista?”

Along with the Windows client, there was Internet Explorer and the user facing side of Live services. The split of everything down the middle was alarming when it came to accountability, but just how alarming required more investigation.

The Live services represented a lot of headcount but the revenue numbers did not seem so big to me at the time. By Microsoft Online Services standards there was significant revenue associated with Hotmail and Messenger. Hotmail sold display advertising, perhaps $300-400M worth. The ads were intrusive “right rail” ads that took up the right side of the screen. The running joke was that the most popular ad was for a toe fungus medicine. The team was working hard to try to sell Hotmail Extra Storage, upgrading to 10MB (later 50MB) for $19.95 per year. Google’s Gmail had no ads and essentially free unlimited storage. The unlimited storage was not a gimmick; Google invented a novel “infinite” and reliable storage mechanism enabling the capability.
MSN Messenger was selling ads inside the client application (less than Hotmail), also intrusive, though the move to mobile phones and the growing competition from Skype were both problems. In other words, the few hundred million dollars in revenue was not remotely sustainable and at the same time the products were struggling.

Almost an after-thought in all the discussions about me taking this job was the fact that I would also manage the new homegrown Search product, which was recently branded as Live Search. It would not be Bing for another two years. The team was growing rapidly (up to almost 100 engineers) but was still very new and clearly a very distant competitor to Google (with over 10,000 employees). The first beta test of Live Search started just weeks before I joined the team.

Christopher Payne (ChrisPa) was chartered with leading the team. He was the vice president and team founder and had returned to Microsoft to lead many of the MSN Services efforts. In a moment of boldness for a team under a great deal of fiscal pressure, in 2003 he proposed to BillG a massive effort (expensive investment) to build out a search product that would compete with Google. This included maps, instant answers, books, and more. At the time, as crazy as it sounded, Search across all of MSN was a hodge-podge of business development deals and outsourcing, all meant to compete with Google. In his prior time at Microsoft, Christopher (he preferred his full name in spite of the email name) was a product leader on the first versions of the Access database product in Office and some of the early MSN properties (he later went on to run eBay North America and is currently COO of DoorDash).

Over two years or so, the Search team built an extremely credible effort, first releasing at the end of 2004.
The team was the first fully organic one at Microsoft, tasked to build scaled cloud services, employ artificial intelligence and machine learning, and create the kind of tools Google had developed to automate and manage tens of thousands of servers in multiple data centers. Many of these pioneering efforts were critical to Microsoft’s cloud data centers over the next decade.

While ChrisPa knew what he needed to do from a product and technology perspective, there were only two things holding the team back. The first was resources. The team needed to spend a lot of money on capital expenses for servers and data centers, as well as hiring more people. Second, the team needed to be given the time to build much more of the product and technology base before being pushed on revenue. They were far behind Google, and the complexities of overlaying a new advertising business did not seem prudent at the time. Google was doing about six billion dollars of revenue directly on search that year, doubling year over year. It was already a juggernaut. In 2003 at the exec offsite, Payne said it would “take at least 18 months and $150 million dollars to even enter the race with Google, and that it was critical we own our own search infrastructure.” The first time Christopher and I met (as part of Search and Windows) he told me he needed an additional $1 billion just in capital equipment (data centers) next fiscal year and revenue was not yet a priority.

As I would learn, there was only so much patience above me.

Combined we called all of these “Windows and Windows Live” and my official title was Senior Vice President, Windows and Windows Live, or WWL. (I was already Senior Vice President of Office, another fact several people pointed to as evidence I was not up to the job.) COSD continued to report to Kevin, though figuring out how to manage and organize it was all part of our ongoing efforts.
That meant the broad view was that there was WWL and COSD, just as before there was Windows Client and COSD. To some this was comforting. Others were waiting for the other shoe to drop.

To put some numbers on all of this: There were approximately 3,500 full-time R&D employees in over 30 cities around the world for WWL, with about 1,000 software design engineers. In Office, we worked using ratios that would translate to 1,000 software testers and 500 program managers, compared to WWL, which had 750 testers and 600 program managers. In Office we had only a handful of managers overseeing multiple job functions (Office 2007 had about 10), but this organization had more than 40.

COSD was a bit larger with about 4,700 people (in most every country Microsoft did business), but more than 1,500 were part of a major push to move all bug fixes and servicing of old releases of Windows to India. This was a radical out of sight, out of mind move designed for cost-effectiveness, and something we did not do in Office. COSD also had about 1,000 software engineers, but over 680 program managers (and not much user interface!) and about 1,000 software testers (about what one would expect). COSD had another 40 to 50 multidisciplinary managers.

The number of vendors and contractors and open positions in WWL plus COSD product development approached 10,000 people. Yikes. The number of open positions was astonishing, thousands upon thousands. Not only could they never be filled, but the question was also how would they have helped to ship Vista? That couldn’t be more different than what we did in Office.

Perhaps the most surprising data point was that almost one third of the team was managers and there were easily seven, and often up to nine, levels in the management hierarchy. Office was about 20 percent managers and rarely more than five levels of hierarchy domestically. Another measure of complexity in the system was the number of cost centers.
In Microsoft lingo, a cost center was a locus of financial controls, budgets, and headcount monitoring. In practice, it was a numeric field in SAP. According to finance the mere existence of a cost center was about $100,000 a year in operational overhead. In actual practice, every cost center was a headache as it was another place someone could come up with unique budgets, costs, and headcount, and when everything was considered for one product release, cost centers became overhead and bureaucracy. Windows had around 300 cost centers. By comparison, Office had about 30, and most of those were needed because people were paid in local currency and a cost center could have only one currency.

Mini-Microsoft was looking more and more accurate. I was beginning to understand why I thought Mini was so off base when I compared what they said about Windows to Office.

I completed an inventory of the products and projects that were underway, and resources assigned. Doing so felt a bit like an excavation. There wasn’t a single place where the allocation was tracked. Finance knew how many dollars were budgeted by cost center, and cost centers were created essentially to streamline accounting or sometimes to park open headcount. The projects underway were mostly tracked by multi-disciplinary managers (MDM, or PUM for product unit managers). The mapping of projects to products or a roadmap of product releases didn’t exist. Finance had one view of open headcount which had little correlation to the view HR had for recruiting. It was quite chaotic. When asked, managers had a solid idea of what they were doing but that certainty did not roll up in either a strategic or fiscal sense. Compounding this were what I came to call “headcount gymnastics.” In order for one group to rely on a contribution from another, groups engaged in headcount bartering where heads were offered or loaned from one group to another as a way of creating accountability or a reliable contract for work.
Absent headcount gymnastics, partnerships or collaboration between teams would be subject to the whims of PUMs. I suppose.

I knew about these gymnastics because more times than I could recall, Office was asked to support something new in Windows and as part of the ask they would provide headcount to get it done. It should be readily apparent why this is just not going to work, and when you think about it even for a bit you realize just how absurd such a system is. It basically says that headcount is the tool for changing the priorities of a group. If you don’t want to do something then you don’t want to, and the idea that if you had more headcount then that thing you were asked to do is the thing you’d choose to do next is absurd. That’s on the face of it. There’s the second order problem that headcount is not the same as a human being, a developer. It means the receiving group, the one that signed up to do something it didn’t want to do naturally, has now committed to do that very thing but has no person to get the work done. If one continues to play this out, then you ask all sorts of other questions about what schedule the work would get done on and what would happen if the work required changes to parts of a system that were not open to accepting changes (for a variety of reasons) at that time. I could keep going but it should be clear operationally why this is awful. Yet, this is how almost everything worked.

Let me indulge in a brief view as to just how broken headcount was and how key this was to the whole mess I was now facing. There are some basics of all software projects, among them that there is always more that the team wants to get done at any given time than is currently planned, and that adding more people once a project starts not only fails to help get more done, but likely will result in less efficacy. There’s a simple corollary to these rules, which is that most every project will end up scaling back work as it progresses to finish on time.
Said another way, projects don’t get more done by the end than they said they would get done at the start. These basics go back to The Mythical Man-Month, one of the books issued to every new Microsoft developer going back to the earliest days.

Therefore, the basic way we worked in Office (also for as long as I could remember) was that projects were planned to use the number of people currently in place at the start of the project or milestone (sub-unit of a full release). If you don’t have a human who can start the work, then whatever work was under consideration doesn’t get put on a project plan. Groups that were growing had open heads but did not commit to work based on filling those heads.

This makes it very easy to know what a project is actually going to accomplish because everything without a human assigned to it simply won’t get done. It will only get done the next time the team regroups, builds a new plan, and starts. In the case of Office this took place every milestone (projects had 2 or 3 milestones) and in the large every release.

A big part of how we ran in Office then was to free everyone from ever thinking about headcount, ever. There was really no process to request headcount. We started a project with a known number of people. Every team could hire people to replace attrition. And then every new project cycle we assessed where we wanted to spend resources and increased, decreased, shut down, or created new efforts. Lather, rinse, repeat. We grew the Office organization from 350 to 2,500 over a decade using this deliberate approach, and never had thousands of open heads.

Whenever we wanted to do something entirely new the first step was to create the team by reallocating from our existing teams, in a significant enough way we could execute the whole project just as described above. This is how we created OneNote, SharePoint, InfoPath, and even the original Office Product Unit.
By starting new efforts this way, we benefitted from having experienced people volunteering for the new work who were committed to seeing it through and we never went through a period of one manager telling us they are still hiring people to do the work.

Some reading this description would be critical and point out how this lacks agility. They might suggest that this does not allow for flexibility or entrepreneurial thinking. What if a really great idea comes up or a competitor does something requiring a response, people would ask? Easy, change the plan, allocate people to that new thing, and scale back or cancel something else. What if something is much more difficult than originally conceived and there’s no way to get it done without more people? Easy, the team really messed up and either we immediately reallocate from elsewhere on the team or we kill the feature.

Why are all these so “easy” then? Because anything that relies on hiring, onboarding, training, and getting up to speed with people that don’t currently exist has zero chance of getting the work done in conjunction with the rest of the product plan. If the business wants credit for the feature then it is going to want it to finish with a release on some schedule, be incorporated into marketing, and launch. Otherwise, it probably won’t exist for customers anyway.

Were there complaints or grumbling? Of course, and primarily along the lines of “we could do so much more,” which was hardly specific to any single team. There’s a certain psychology that takes hold while building out a product plan once execution starts. There are people who always think about “just this one more thing” or “if only we could also do this too.” They fall into the trap of believing that it is always one thing that makes all the difference. That one extra thing. But it is never like that. And on the outside chance it is, then it is far more likely that the whole of the plan was not that great in the first place.
That one last thing is never considered in the context of the entire plan, rather it is just in that one moment. That’s the whole flaw with planning by headcount rather than holistic plans based on people that exist, ready to do work.

Ultimately, the key for how we worked in Office was to remove headcount from ongoing discussions. There was never a headcount request or approval process. Everyone was expected to simply work with what they had, and did. The deal from management (at every level) was that we lived with the tradeoffs teams made along the way. Into that process of tradeoffs, we baked in a culture of commitment to partnerships across the organization so we avoided one group prioritizing locally at the expense of other groups depending on previously committed work.

Windows (and Windows Live) had almost the mirror image of this approach. Nearly every team ran with open heads that sometimes approached half their existing team size. It was not just that the team was always hiring (we were always hiring in Office too) but the team was also in a constant state of having no idea what would get done and when. This lack of clarity extended to cross-team collaboration where headcount gymnastics were still not enough to make good on commitments.

It was even a bit more insidious than I just described. As I began talking to teams about what they were planning on releasing, it was almost as though at every step I was running into a manager explaining that they had open heads. I would ask then if the feature was in the plan or not, and they would always say yes. I would follow up, asking when it would be done. The answer was that it depended on when they could hire someone.
Yet if someone left the team (in general, Microsoft teams at this time were attriting at 6-8% per year, more so during Longhorn as per the articles in the last section) then the next hires were simply replacements for those who had left.

None of this reality slowed teams down from working in a constant state of signing up to do more, requesting and being granted more headcount, and furthering the gap between what was sitting on slide decks as the plan from a team and what code was going to be written and delivered (and when). Meetings with executives (aka me) were viewed through a lens of expansive slide decks and accompanying headcount requests.

The culture of headcount, as I called it, led to a world where people were seemingly rewarded for thinking up big ideas and making the case for more headcount to implement those ideas. It seems entrepreneurial: making a case for an idea and getting resources to build it. Everyone can make a case for resources to get something done, but the question quickly becomes what will actually get done. The process of circling back to those original proposals and checking in didn’t really exist, other than meetings where projects went from expressing goals to expressing “non-goals” or what was no longer in scope. My inbox was filled with these decks offering to get me up to speed, or maybe to approve more headcount.

The flipside of the culture of headcount is just how much bloat it causes. People do get hired on to these teams and the teams eventually grow, though never as much as the open headcount (also more headcount keeps getting added as the team expands the charter to do more, at least on paper). The problem is that as soon as people show up they are invariably added to the efforts that have already fallen behind and not scaled back. This is a big part of how the original Outlook and NetDocs projects got to be so large, both of which reached a point where in order to ship headcount was frozen and plans shifted quite a bit.
In Systems, this explained the growth of the Cairo project, which was ultimately cancelled.

The fiscal tracking systems in place only exacerbated the challenges this process created. The finance team trying to budget expenses gave up accounting by heads and simply tried to use actual dollars being spent on payroll and then literally guessed how many dollars might be spent the next year. In other words, rather than asking executives how many people were on the team, finance maintained a dollar-based Excel model of expenses that had little correlation to all those cost centers and headcount slots. When I would ask managers about their headcount, they would point me to finance who would then tell me a dollar figure for the team’s expenses.

I did not intend to discuss headcount so deeply, but as I was listening to people tell me what was top of mind it literally drove me bonkers. All I wanted to do was make a list of what was planned and who was working on it, but all I could get back were big plans and open headcount. As it would turn out, this was one of the most visible signs that things could be improved and since I knew what to do it gave me a bit of hope when I needed it the most.

This might seem like the talk of a headcount tracking maniac. I am not. In fact, I spent almost no time on this topic until I moved to Windows. As the next sections will describe, we had a massive amount of remedial work to do on headcount management. I won’t skip to the end, but a bit of foreshadowing is that we will ultimately get more done, ship on time, and with vastly more clarity by spending hundreds of millions of dollars less (in direct costs) and completely removing the whole concept of budgets and headcount gymnastics from the team.
It was a huge headache, and had we failed to deliver good products the effort would have been cited as a cause of the failure, but it positioned us enormously well for the financial crisis that would seemingly appear out of nowhere halfway through our first product cycle as a team.

The easy access to the headtrax system gave everyone a ready benchmark for how other groups were perceived as growing faster and bigger. In times of rapid growth, it was easy to find people who thought “Micro-dollars” were to be freely “invested.” Not in the back of my mind, but front and center, were my mentor Jeff Harbers’ words about spending and treating Microsoft money as the shareholders’.

That’s the rant as to why the key lesson learned for me was that if you want to know what an organization is doing, then just count where the developers are and what they are working on. It really is that simple. Every financial control follows from the number of people actually working on the team. People love to say that building comes from small teams and of course there is truth to that. Building at scale, however, requires sizeable teams. The way to make a big team seem small-ish is to keep the teams focused on building and making the tradeoffs inherent in building and not on budget and headcount gymnastics.

In many ways, in a large company with many talented people and key product people in key roles, the unique and critical role of executive management is to decide and manage headcount so no one else ever even thinks about it, and to drive the reallocations to get new things done or to add headcount to be filled without the expectation or requirement to deliver in the current project. The only way to do that is by knowing what the headcount is actually building all the time.

Returning to the inventory of projects in the WWL organization, I counted 74 projects, each with about 13 developers on average for a total of 947 developers.
There were only about 780 testers, which was far short of what Windows software generally required. Some of this shortcoming could be explained by Search, which was using more developer operations owing to the modern web architecture, but even Windows, which I would argue needed more testers, appeared short-staffed. There were 440 program managers, which was shy of the 2 developers for 1 PM ratio I might have expected. There were, however, over 40 people managing the small teams of a dozen developers and most of those managers were serving as the lead program manager as well. I realize I am already falling into the trap of using Office as a baseline, but absent that there was no baseline, no plan or strategy, from which to work.

The largest project teams, over 25 developers, in this whole organization were (in order): Search, Print server/drivers, audio/video platform, audio/video codecs, modern interface (pen, ink, etc.), and media rights management. While there is never a perfect correlation between number of engineers and strategy, it was abundantly clear either the resource allocation was off or at the very least was not aligned with strategy. Looking to Windows for some examples, there were only 13 developers on DirectX, the core graphics engine for Vista, and there were only 25 developers on the rendering engine for Internet Explorer and they were primarily fixing security and compatibility bugs until the recent emergency plan to produce IE 7.0 for Vista. That came about because there was a whole new, non-HTML browser as part of Avalon, which was no longer in the Vista plan. Avalon, which would later be known as WPF (Windows Presentation Foundation) and was a cornerstone of the Longhorn plan, had a total of 46 developers.
While the specifics of what code was where might not have been totally clear (and certainly isn’t today as I write this) the team was staffed inconsistently with respect to what seemed important.

Windows Live was organized as a series of what seemed to be small projects relative to the overall scale. On the one hand, it might be easy to look at the allocation and think about each one as a cool startup inside a big company competing with a startup from Silicon Valley in a similar space. With that view, the teams were staffed well. Except Microsoft was not able to release things in a small way and grow them like a startup. Everything needed to work worldwide, include adequate accessibility, work across browsers (not just Firefox), and scale to all the users seeing the service on Microsoft.com, one of the busiest sites on the internet. Microsoft’s online services were spread across 30 or so projects each with fewer than 10 developers, at least for the front ends (the backends were still in a separate organization).

The difference in org structure and composition relative to Office had already begun to clarify some of the questions I was receiving.

While those differences were stark to me, I quickly realized the obvious. Comparing what I was seeing in WWL to Office was not merely a non-starter; it was insulting to my new team. No one in Windows wanted to hear anything relative to Office. Windows was not just different. It was vastly more complex as I was repeatedly told. It was also more difficult. For more than a decade I was used to being reminded directly (or more subtly) that Windows was technically deeper than Office, but now I was hearing that Windows was also a more complex management challenge. I wasn’t convinced but I was in active learning mode.

I had no other baseline. I knew Office. I knew development tools. I worked across the company for BillG.
I’d studied tons of other companies. As much as I knew I was biased in my thinking of Office, I also knew…it was just software. It could not be that different, I thought. I did not really believe Windows was either more technically challenging or more difficult to manage, but I had to resist the temptation to debate those points. There were bigger issues. As much as I was focused on addressing Windows challenges, I realized the pain and anguish the Vista product cycle had brought to the broad employee base.

To many product group employees, the stock price slump reflected the product execution, and Vista was taking the brunt of that blame. The challenges were much broader and deeper, however, and it would take time for employees and other executives to gain an appreciation for the difficulties the company was facing in products. There’s a tendency to view morale and employee issues (or broadly culture) as distinct from company execution and performance, but at least at this moment one thing was clear. The negative employee experiences were happening at the same time as customers were experiencing product issues and strategically the company seemed to be falling behind. It did not seem to me that one could fix the culture unless we built better products, executed more effectively, and transformed the business to be more competitive.

A favorite internal conversation for me was on a Tren Griffin (TGriffin) email discussion group, called LITEBULB. TGriffin was a former technology investor, Seattle-area native, and an early friend of the Gates family. He was one of the strongest strategic thinkers at Microsoft. Long a student (and author of books) of investing, Tren frequently posted news stories or questions about competitive markets, Microsoft’s approaches, or other industry dynamics, and generated a rich discussion among a core group of contributors and a larger set of observers.
Often the best discussions about Mini’s posts or other press articles about Microsoft could be found on the LITEBULB distribution list, or his external blog 25iq.

After a couple of weeks of listening across these many forums, I started to gain a full picture of what was going on. It was deeply emotional for me: a mixture of opportunity, as I said in many team meetings, “to work on the other greatest business in the history of business” (a not-so-subtle reference to Office I could not resist), and deep angst, which I also shared in many meetings. “So much of what I’m hearing are things I’ve seen, heard, even experienced over the past 15 years but from afar . . . and now these are my problems, and by that, I mean our problems to solve together,” I would say.

I had moments of sheer terror. For a while, I tended to avoid people outside of the Windows team, especially my dear friends in Office, because they all wanted to know, “Are things really that bad?” I simply could not afford to be candid. Even going to yoga class or out to dinner resulted in sightings I wanted to avoid. Seattle was a one-company town back then.

Fifteen years earlier when Mike Maples (MikeMap) shared his description of two gardens, the Windows and Office gardens, I understood it intellectually from the experiences I had. Now I was experiencing the difference emotionally. Even to this day, I struggle to articulate just how different the cultures were, while both still achieved spectacular success. Somehow this came about all under one roof in a remarkable case of divergent evolution.

While I definitely experienced lonely moments leading Office, I was never as lonely as I was in the first six months of working on Windows.

I had to write to think. But I was not ready to write in public.

On to 085. The Memo (Part 1)
