Fannie Mae: Scaling the DevOps Enterprise

April 02, 2019 By Derek Weeks


Note: Join us live and online for the 2019 Nexus User Conference on June 12. Registration is free.

When you think about scaling DevOps into the enterprise, Fannie Mae is near the top of the size chart. They have over $100 billion in annual revenue and 7,200 employees. While they primarily have one DevOps model, they have 468 applications and 1,200 software assets.

Combine all of that with their unique role as a government-sponsored, public entity, which gives them the benefits of both sides but also heavyweight governance and processes, and they make for an interesting case study to learn from.

For those who don’t know, Fannie Mae provides liquidity in the U.S. mortgage market by bundling mortgage loans into securities and selling them to investors around the world. This means they also carry the added burden of financial security regulations and practices.

The point is that scaling DevOps at Fannie Mae was daunting, but doable. Barry Snyder, their former DevOps Product Manager, talked about it at the 2018 Nexus User Conference to inspire others. He is a frequent speaker on DevOps and is always helpful.

The drive to DevOps at Fannie Mae was sparked by an obvious need to automate and provide transparency. Following the U.S. mortgage crisis a decade ago, which Fannie Mae was at the center of, the organization was shifting to a revenue model based on fees rather than investments. It was also projecting a 30% increase in the housing market, facing a change in competition, and operating in an accelerating housing finance ecosystem. In response, Fannie Mae leadership instituted change initiatives across three broad areas:

  1. Shift in culture

  2. Enterprise simplification

  3. Change how we build software

To change how they build software, Fannie Mae decided to:

  1. Automate (DevOps)

  2. Improve the software supply chain

  3. Monitor and measure

This all started in 2012, when they had lots of manual governance, deployed once every 18 months, and were in full Waterfall. Barry mentioned that his team had a person whose only job was to submit tickets, monitor them, and get them to resolution. This stifled innovation.

What does the transition look like? They went from:

  • 9-18 months to release to production to projects releasing every month

  • 5-6 inspection points with 32 stakeholders with 100% inspection to 1 Working Group, 1 CAB/inspect, 5% inspections

  • 814 recorded libraries in production and 26% with critical vulnerabilities to 19,000+ recorded libraries in production with 5% critical vulnerabilities

Nexus played a role in improving their library management. As noted, after adopting Nexus they were able to identify 19,000+ libraries, and the number is growing. Now only 5% have critical vulnerabilities, and most of those have waivers because either there are no other options, other critical functionality is tied to a critical function in another library, or they aren’t using the functionality that exposes the vulnerability.


Nexus Repository is key to managing the software supply chain, which in turn is key to the overall pipeline. Fannie Mae’s supply chain, like most, has commercial components from the outside, open source software (OSS) from the outside, and code they build. Barry notes that all the code they build internally is treated as components, no different from any OSS component.

Barry reveals that after scanning, they discovered OSS components were of higher quality overall. So, they sought to clean up their component quality enterprise-wide. First, they modernized from a very manual effort to tooling that automates more of the work. Now, they are breaking builds when necessary and shifting the availability of the scans to the left, so that developers can access them early and often. This helps them identify, understand, and fix vulnerabilities before they get too far along.
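To make the idea of breaking builds on scan results concrete, here is a minimal sketch in Python. It is not Fannie Mae's pipeline and it does not call any Nexus API. The `Finding` record, the severity labels, and the `blocking_findings` helper are all hypothetical names invented for illustration, and the sketch assumes scan results have already been exported into a simple list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One scan result for a component (hypothetical shape, not a Nexus type)."""
    component: str        # e.g. "group:artifact:version"
    severity: str         # e.g. "critical", "high", "medium", "low"
    waived: bool = False  # True when the risk has been reviewed and accepted


def blocking_findings(findings, blocking_severities=frozenset({"critical"})):
    """Return the findings that should fail the build.

    A finding blocks the build when its severity is in the blocking set
    and no waiver has been granted for it.
    """
    return [
        f for f in findings
        if f.severity in blocking_severities and not f.waived
    ]


# Made-up example data: one unwaived critical, one waived critical, one medium.
findings = [
    Finding("example:web-lib:1.0", "critical"),
    Finding("example:legacy-lib:2.3", "critical", waived=True),
    Finding("example:util-lib:4.1", "medium"),
]

blockers = blocking_findings(findings)
build_should_fail = bool(blockers)
```

In a CI job, a script along these lines would exit with a non-zero status when `build_should_fail` is true, which is what stops the build. Running the same check on a developer's machine is one way to get the "early and often" feedback described above.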

Once software is in production, vulnerabilities can be found at any time, so they automated notifications with Nexus: as soon as a vulnerability is found, the product owner and tech owner are notified. As one example, Barry mentioned that they found a web app vulnerability and used Nexus to find all apps using the component in a matter of minutes or hours. This process previously took days, weeks, or even months.
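The speed-up in that example comes from keeping an inventory of which applications use which components. The sketch below shows the general idea as a reverse index in Python. It is an illustration under assumed data, not the way Nexus stores or queries this information. The application names, component coordinates, owner addresses, and both helper functions are hypothetical.

```python
from collections import defaultdict


def build_component_index(app_manifests):
    """Map each component to the set of applications that declare it."""
    index = defaultdict(set)
    for app, components in app_manifests.items():
        for component in components:
            index[component].add(app)
    return index


def owners_to_notify(index, owners, component):
    """Return {app: (product_owner, tech_owner)} for apps using the component."""
    return {app: owners[app] for app in sorted(index.get(component, ()))}


# Made-up inventory: which components each application declares.
app_manifests = {
    "loan-portal": ["example:web-lib:1.0", "example:util-lib:4.1"],
    "pricing-api": ["example:util-lib:4.1"],
    "doc-service": ["example:web-lib:1.0"],
}

# Made-up contacts: (product owner, tech owner) per application.
owners = {
    "loan-portal": ("po-loans@example.com", "tech-loans@example.com"),
    "pricing-api": ("po-pricing@example.com", "tech-pricing@example.com"),
    "doc-service": ("po-docs@example.com", "tech-docs@example.com"),
}

index = build_component_index(app_manifests)
affected = owners_to_notify(index, owners, "example:web-lib:1.0")
```

With the index built once, finding every affected application is a single dictionary lookup rather than a manual survey of each team, which is the kind of difference that turns weeks into minutes.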

Looking back, Barry realizes there are things they could have done better. He laid out three things he would tell his younger self now:

  1. Approach DevOps as a product. If you are scaling it across the enterprise, treat developers as the customer: find their pain points and address them.

  2. End-to-End Pipeline. Take time to think through your pipeline. Even now, they find themselves having to restitch how they tied together all 31 applications.

  3. Beware of early lifts. They didn’t have Jenkins, but they had another tool, so they used it. They still have to use this tool for many applications because they tried to get an early lift from it, even though it is much more time-consuming and manual than Jenkins. They also found that Jenkins jobs don’t scale: one of their deployments has 7,000 Jenkins jobs to manage. So, they now have new tooling that will cut these down to 10% of what they manage today.

Looking to scale DevOps? Or even thinking about DevOps for a small organization? There are many similarities and lessons learned from Barry and all that has transpired at Fannie Mae. Catch his whole presentation, for free, here.

If you are interested in learning more about all of the Nexus products, check out the platform here. And keep an eye out for more session recaps from the 2018 Nexus User Conference - we'll be sharing them every week leading up to this year's conference on June 12.


Written by Derek Weeks

Derek serves as vice president and DevOps advocate at Sonatype and is the co-founder of All Day DevOps -- an online community of 65,000 IT professionals.