Managing a global fleet of Windows desktops, laptops, and servers for Google’s internal teams can be challenging, with a constant stream of new tools, high expectations, and stringent organizational needs for secure, code-based, scalable administration. Add in a globally distributed business and extended work-from-home requirements, and you have a recipe for potential trouble.
Today we’d like to walk you through some of the tools that the Windows Operations (WinOps) team uses at Google, and why we made (and open-sourced) them. Our team is constantly working to improve the way we manage our client fleet of laptops and desktops, and we’ve spent the past several years building open source, infrastructure-as-code tools to do just that.
Now that we’re all working from home, these choices have enabled us to keep operating at scale remotely. Let’s dig into a few common Windows administrative challenges and how our open tools can help.
Challenges with scale
When you manage Windows in a large, globally distributed enterprise environment, problems of scalability are front and center. Many popular administrative tools are GUI-based, which makes them easy to learn but difficult to scale and integrate. An administrator is often limited to the functionality built into the product by its vendor. Core management suites often lack qualities that we would consider necessary in a reliable production environment, including the ability to:
- Peer review edits and roll changes back and forward on demand
- Implement platform testing, with support for automation pipelines
- Integrate seamlessly with tooling that also manages our other major platforms
Because they depend on explicit network-level access, many of these products also rely heavily on a well-defined corporate network, with clear distinctions between inside and outside.
At Google, we’ve been rethinking the way we manage Windows to address these limitations. We have built several tools that have helped us scale our environment globally and enabled us to consistently support Google employees, even when major unexpected events happen.
Open source products are increasingly a key to our success. With the right knowledge and investment, open source tools can be extended and tailored to our environment in ways other applications simply can’t. Our designs also focus heavily on configuration as code, rather than user interfaces: code-based infrastructure provides optimal integration with other internal systems, and enables us to manage our fleet in ways that are audited, peer reviewed, and thoroughly tested. Finally, the principles of the BeyondCorp model dictate that our management layer operates from anywhere in the world, rather than only inside the company’s private network.
Let’s dig into some of these tools, organized by what they help us get done.
Prepping Windows devices
Glazier, a tool for imaging, marked our team’s first foray into open source. This Python-based tool is at the core of our Windows device preparation process. It focuses on text-based configuration, which we can manage using a version control system. Because the configuration is treated much like code, we can write automated tests for our configuration files and trivially roll our deployments back and forward. File distribution is based around HTTPS, making it globally scalable and easy to proxy. Glazier supports modular actions (such as installing host certificates or gathering installation metrics), making it simple to extend with new capabilities over time as our environment changes.
Secure, modular imaging with Glazier helps prepare devices
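To make the idea of testing text-based configuration concrete, here is a minimal sketch in Python. The JSON layout, the action names, and the `validate_config` helper are all invented for this illustration; they are not Glazier’s actual schema or API, only an example of the kind of check that can run in an automation pipeline before a configuration change is rolled out.

```python
import json

# Hypothetical configuration document, invented for this sketch.
# It is NOT Glazier's real format.
SAMPLE_CONFIG = """
[
  {"action": "install_host_certificate", "args": {"store": "machine"}},
  {"action": "collect_install_metrics", "args": {}}
]
"""

# Hypothetical set of actions the imaging tool is assumed to understand.
KNOWN_ACTIONS = {"install_host_certificate", "collect_install_metrics"}


def validate_config(text):
    """Return a list of human-readable problems found in a config document."""
    try:
        steps = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err}"]
    if not isinstance(steps, list):
        return ["top-level value must be a list of steps"]

    problems = []
    for index, step in enumerate(steps):
        if not isinstance(step, dict):
            problems.append(f"step {index}: must be an object")
            continue
        action = step.get("action")
        if action not in KNOWN_ACTIONS:
            problems.append(f"step {index}: unknown action {action!r}")
        if not isinstance(step.get("args", {}), dict):
            problems.append(f"step {index}: args must be an object")
    return problems
```

A check like this can gate every proposed change: a reviewer sees both the configuration diff and the test result, and a failing change never reaches a device.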
Traditional imaging tends to rely heavily on network trust and presence inside a secure perimeter. Systems like PXE, Active Directory, Group Policy, and System Center Configuration Manager require you to either set up a device on a trusted network segment or expose sensitive infrastructure to the open internet. The Fresnel project addressed these limitations by making it possible to deliver boot media securely to our employees, anywhere in the world. We then integrated it with Glazier, enabling our imaging process to obtain the critical files required to bootstrap an image from any network. The result was an imaging process that could be started and completed securely from anywhere, on any network, which aligns with our broader BeyondCorp security model.
Fresnel enables imaging from any network in the world
The remote imaging and provisioning process included several other network trust dependencies that we had to resolve. Puppet provides the basis of our configuration management stack, while software delivery now leverages GooGet, an open source repository platform for Windows. GooGet’s open package format lends itself well to automation, while its simple, APT-like distribution mechanism is able to scale our package deployments globally. For both Puppet and GooGet, the underlying use of HTTPS provides security and accessibility from any network. We also use OSQuery to collect distributed host state and inventory.
GooGet helps us automate package distribution and deployment
Our infrastructure still has dependencies on classic Active Directory (AD), and the domain join process was a particular challenge for hosts that don’t bootstrap from a trusted network. This led to the Splice project, which uses the Windows offline domain join API and Google Cloud services to enable domain joining from any network. Splice enables us to apply flexible business logic to the traditionally inflexible domain join process. With the ability to implement custom authentication and authorization models, host inventory checks, and naming rules not typically available in AD environments, this project has given us the flexibility to extend our domain well beyond the classic network perimeter.
Splice helps us join new devices onto our Active Directory domain from anywhere
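The kind of business logic described above (authentication, inventory checks, and naming rules applied before a join is allowed) can be sketched in a few lines of Python. Everything here is hypothetical: the `JoinRequest` type, the naming pattern, the serial numbers, and `evaluate_join` are invented for illustration and do not reflect Splice’s actual code or interfaces.

```python
import re
from dataclasses import dataclass

# Hypothetical naming rule: three lowercase letters, a dash, six alphanumerics.
HOSTNAME_RULE = re.compile(r"[a-z]{3}-[a-z0-9]{6}")

# Hypothetical inventory of device serial numbers that may join the domain.
KNOWN_SERIALS = frozenset({"SN123456", "SN654321"})


@dataclass(frozen=True)
class JoinRequest:
    hostname: str
    serial: str
    authenticated: bool  # assumed to be set by an upstream authentication step


def evaluate_join(request, inventory=KNOWN_SERIALS):
    """Return (allowed, reason) for a domain join request."""
    if not request.authenticated:
        return False, "requester is not authenticated"
    if request.serial not in inventory:
        return False, "device is not in inventory"
    if not HOSTNAME_RULE.fullmatch(request.hostname):
        return False, "hostname violates naming rule"
    return True, "ok"
```

Only a request that passes every check would go on to the actual offline domain join step, which this sketch deliberately leaves out.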
Maintaining our fleet
Deployment is just the beginning of the device lifecycle; we also need to be able to manage our active fleet and keep it secure.
The Windows internal update mechanism is generally sufficient to keep the operating system patched, but we also wanted to be able to exercise some control over updates hitting our fleet. Specifically, we need the ability to rapidly deploy a critical update, or to postpone installing a problematic one. Enter Cabbie, a Windows service that builds upon Windows APIs to provide an additional management layer for patching. Cabbie gives us centralized control over the update agent on each machine in our fleet using our existing configuration management stack.
Centralized patch management using configuration management
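As a rough illustration of what central control over patching can mean in practice, the sketch below decides what to do with a single update based on a centrally managed policy. The policy layout, the placeholder update identifiers, and `update_decision` are assumptions made for this example; they are not Cabbie’s real configuration or behavior.

```python
from datetime import date, timedelta

# Hypothetical policy, as it might be delivered by configuration management.
EXAMPLE_POLICY = {
    "expedited": {"KB0000001"},   # install immediately (critical fix)
    "blocked": {"KB0000002"},     # hold back (known to be problematic)
    "deferral_days": 7,           # default wait after release
}


def update_decision(update_id, released, today, policy):
    """Return 'install', 'postpone', or 'wait' for a single update."""
    if update_id in policy["blocked"]:
        return "postpone"
    if update_id in policy["expedited"]:
        return "install"
    ready_on = released + timedelta(days=policy["deferral_days"])
    return "install" if today >= ready_on else "wait"
```

In this sketch a blocked entry takes precedence over an expedited one, so an update listed in both by mistake is held back rather than pushed out.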
We also have Windows servers to manage, and these hosts present unique challenges, distinct from those we face with our client fleet. One such challenge is how to schedule routine maintenance in a way that is easily configurable, automated, and able to integrate with our various agents like Cabbie. This led to Aukera, a simple yet flexible service for defining recurring maintenance windows, establishing periods where a device can safely perform one or more automated actions that might otherwise be disruptive.
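A recurring maintenance window can be modeled very simply. The sketch below, with a made-up `MaintenanceWindow` type and `open_windows` helper, shows how an agent might ask whether a disruptive action is currently allowed. It is an assumption-laden illustration of the concept, not Aukera’s actual data model or API.

```python
from dataclasses import dataclass
from datetime import datetime, time


@dataclass(frozen=True)
class MaintenanceWindow:
    """A weekly recurring window (hypothetical model for this sketch)."""
    label: str
    weekday: int   # 0 = Monday ... 6 = Sunday, as in datetime.weekday()
    start: time    # inclusive
    end: time      # exclusive; must fall on the same day, after start

    def is_open(self, moment: datetime) -> bool:
        return (moment.weekday() == self.weekday
                and self.start <= moment.time() < self.end)


def open_windows(windows, moment):
    """Return the labels of every window that is open at the given moment."""
    return [w.label for w in windows if w.is_open(moment)]


# Example: allow patching on Sundays between 02:00 and 04:00 local time.
SUNDAY_PATCHING = MaintenanceWindow("patching", 6, time(2, 0), time(4, 0))
```

An agent would call `open_windows` before a reboot or patch install and proceed only if its label is in the result. Windows that cross midnight or depend on time zones are left out to keep the sketch short.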
Building for the future
Our team was fortunate to have started many of these projects well before the spring of 2020, when many of us had to abruptly leave our offices behind. This was due, in part, to embracing the idea of building a Windows fleet for the future: one where every network is part of our company network. Whether our users are working at a business office, from home, or on a virtual machine in a Cloud data center, our tools must be flexible, scalable, reliable, and manageable to meet their needs.
Most of the challenges we’ve discussed here are not unique to Google. Companies of all shapes and sizes can benefit from increasing security, scalability, and flexibility in their networks. Our goal in opening up these projects, and sharing the principles behind them, is to help our peers in the Windows community build stronger solutions for their own businesses.
To learn more about our wider fleet management strategy and operations, read our “Fleet Management at Scale” white paper.