[00:01:28] Guy Podjarny: Hello, everyone. Thanks for tuning back in to The Secure Developer. Today, we’re going to explore the different levels of security or sort of perspectives of security in different companies with Sacha Faust who has a really interesting journey for security roles in different companies.
Sacha, thanks for coming onto the podcast.
[00:01:43] Sacha Faust: Yeah. Thanks for inviting me.
[00:01:45] Guy Podjarny: Sacha, before we dig in, tell us a little bit about what is it you do and how did you get into security. Just a little bit of the synopsis of the journey.
[00:01:53] Sacha Faust: I currently lead security intelligence in Amazon consumer payment. What that means is I don’t do fraud. I’m focusing more on generating a fast-learning flywheel in security by managing red team capabilities, threat hunting, threat intelligence, and trying to be a helping hand into the IR functions as well. So that’s been my focus this year at Amazon. How did I get into security? I got into the security field pretty early in my career. I would say around the mid-‘90s doing kind of early on pen testing. I got into security because I thought it was a good problem space. I used to play a lot of video games and then realized that computers have much broader sets of games that you can play, so a lot in the chatting with the networking back at the time. Then somebody just recruited me and says, “Hey, we’re trying to build a consulting business to do pen testing against hospitals and financial systems and so on. Why don’t you come in?” So that’s how I got started.
I worked through a broad set of companies early on in consulting. I got a little bit of polishing and cutting my hair and cleaning up at Pricewaterhouse for several years, which I think was a good experience for me. Then I decided to move into a couple of startups like SPI Dynamics, which made WebInspect. It got acquired by HP. I did quite some interesting work there and then spent some time, I would say a decade, about 11 years at Microsoft. I worked on the early Microsoft online services like SharePoint Online and so on and kind of revisiting how do you take flagship products that are on-premise and adapt them to cloud. What’s the multi-tenancy concept there?
I gradually moved into a developer role. I shifted my career into pure development and helped build some of the things that are now called Office 365 and Azure Active Directory to learn distributed systems. Just to kind of polish some skills and then went back into security and helped lead red team thinking in Azure and then shifted to Lyft. More of a breadth role to head security for Lyft. Then now, back to I would say a breadth role in consumer payments at Amazon. So I kind of tiptoe all around the industry.
[00:04:03] Guy Podjarny: Yeah. There’s definitely checking all the boxes there. You’ve got a security vendor in SPI Dynamics. You’ve been in large sort of security-sensitive setups like Azure and today in Amazon. You’ve been in a sort of fast-growing hypergrowth company in Lyft, as well as the developer experience kind of in there. Let’s unpack some of those. So maybe what we can do is we can focus on steps in the journey. Maybe we’ll talk about Lyft where you headed security. What did it mean to do security or to head security at Lyft? What’s the perspective you need to successfully do so?
[00:04:35] Sacha Faust: The thought process and sort of the mission statement we had at Lyft and Lyft was a great way to experiment from the beginning, hire very senior people and the grass is green, right? So you don’t have established process and things like that. The mission statement for security was how do we inform Lyft to make informed and automated security decisions, right? So handling security there was let’s try to avoid pitfalls that we’ve seen in the past around sort of a dictatorship approach of like you must do this and more into an empowerment approach so that we enable the decision-makers to make their own decisions and dive into like true accountability. As a service owner, you are accountable for the A-to-Z of your service. We are a center of excellence to help you.
It was, one, an experiment to sort of dive into meaningful kind of relationship with stakeholders, and the second part was like how do we scale this thing because we’re moving at an absolutely insane pace sort of in the very, very fast growth, right? Heading security was interesting, moving a lot more into stakeholder management. Do not dictate. We do not sign off. We recommend. We guide you. At the end of the day, the reality is even if security orgs are signing off on things, service owners will either overrule you or make their own decision as they, in my view, should do. Managing that was pretty interesting and then navigating through fast scale, really focus on hiring builders, and have a very thorough hiring process. What you need are smart people. Smart doesn’t necessarily mean intellectual horsepower, right? It’s like interpersonal relationships. How are they diving deep? Or do you have a strong sense of ownership and so on and really having to build them out? That’s generally sort of the experiment of Lyft’s security in the early days.
[00:06:26] Guy Podjarny: Do you feel like that’s an advantage or a disadvantage? When you think about this significant ownership, empowered teams that can overrule you or run? Can you contrast that maybe with surroundings where it sounds like that was a bit of a change maybe from the reality you came in in Microsoft? Is that a positive or a negative?
[00:06:45] Sacha Faust: I think that was a good positive. You can look at it two ways. Either it’s overwhelming or empowering, right? Overwhelming is like, “Hey, there’s no infrastructure. There’s nothing done, right?” Or not at the bar that, for example, a well-established company with a big bull’s-eye on its back like Azure is, or even Microsoft.
The second thing is empowering. You can actually solve problems in a very different way without having any historical or territorial battles or existing sense of ownership there. You’re left with kind of a clean slate, so you can do things in a more optimal way. From a culture perspective, you get to define the culture, which is what I was focusing on a little bit earlier around the mission statement. How do we make decisions, right?
Now, onboarding people from other companies like large established companies and so on, it’s a little bit of an adjustment. You are not telling people what to do. You’re guiding them and then just a tweak in communication and a tweak in sort of the metrics that you look at and so on. Let me give you a concrete example, right: security imposing a fix SLA. At the end of the day, who are you to dictate how they operate, right? When we were looking at the security SLA, and I looked at the compliance aspect and at the engineering SLA, the engineering SLAs were generally on par with compliance. So decision-makers, like why don’t we actually gather the security debt? Provide meaningful information back to leadership in terms of, these are the security debts that are out of your SLA, right, and surface this up.
Then here are the ones that were probably probed even further and try to move away from hypothetical threat to more real threat with some exercise, and here’s the ones that we’ve surfaced. That’s the type of mindset where we are sort of empowering them to make good decisions, and then they come back to us as, “How can we get better?” How do you grow your org as well, it becomes more kind of a stakeholder management at this point, “hey they’re informing us. We need some help.” Okay, we’re willing to help you, but it will cost X. Then simply what you’re realizing, gradually you get funded in a smooth sort of ride here. That was the experiment I would say at Lyft but turn out as a fast-paced company. There’s pros and cons, but generally I think it worked well and we had a good culture there.
[00:08:58] Guy Podjarny: You’re coming into this sort of Lyft surrounding, and we’ve got – I’m imagining a decent number of areas that require investment. How did you choose what to take on first?
[00:09:08] Sacha Faust: That’s an interesting challenge and I think a continuing exploration on my end in terms of the prioritization aspect. At the end of the day, the customers are going to tell you what they feel important, right? You may agree or disagree with it. It doesn’t matter. You are a servant to them and you should provide value to them. So how we pick and choose, it was also kind of adapting to us in terms of what’s really important for them, and I found the engineering phases defined by some folks. I forgot his name, some person at Facebook around the explore, expand, and extract phase of engineering. It made sense to me and when I was looking at the investments that we had, this is not just, for example, focusing on Lyft. This applies to broad sets of companies that I’ve worked at, is what’s really important at each phase, right?
If you have a security engineer, if you don’t mentor your security engineers in that way, for example, if you’re building autonomous vehicles that you’re trying to bring to market and the security engineer is only focusing on, “here’s the scenario of people pointing specialized lasers through your Lidar,” are you really kind of giving them value to get to market? Now, if you’re not asking those types of level of questions in the payment infrastructure, are you really giving value back? Probably not, right? In both cases, like you have to make informed decisions there.
Around the prioritization is really go ask the customers, and customers or your engineering teams may have other customers as well. But your true customers, like the people consuming the product, what is important to them? What are some of the metrics? Is fraud a problem? Can we educate them to improving the security of their own device? Can we actually automate that so that you build kind of a more solid security aspect? Around scale, are they making good decisions internally, right? So you can do education and so on, but how do you validate that really solidify? At the end of the day, it’s all about like very quick decisions.
Attackers are playing a numbers game, so they blast you with a bunch of things, and very often what they are trying to do is generate sort of forcing an individual to make a decision and looking for a favorable outcome. Then they played a number, right? In terms of prioritization also is, yeah, you can try to prioritize all sorts of things. But if you prioritize a culture, things will move fast and give them incentive, not just financial. I love one of the podcasts that said, “Yeah. You can throw money at it, but that’s maybe not the right reward. Recognition is probably a lot more powerful.”
We spearheaded some of these things at Lyft around are they making good decisions. How do we validate this? An example of that is spear phishing, for example. A lot of people only look at who clicked on it. Now, how many people actually reported it, right? So you had that metrics that, “Oh, okay. There’s a decision there that we need to improve. We have security mechanism.” I’ve seen a lot of security tools focusing on the perception and reality. Perception is here’s the coverage they report. The reality is they don’t look at the usage. If this is not being utilized, what’s the value, right?
So be customer-focused and look at scaling and scaling in a way that drives a culture.
I think people always is first there. That’s kind of an abstract answer to your question. There is depth to it, but generally that’s what we try to do in a very, very fast-paced environment.
Now, in larger, more mature spaces, it’s completely different. You have like requirements and a bunch of, especially in the fintech. You have your strong compliance that you got to meet, and that’s business enablers.
[00:12:37] Guy Podjarny: Yeah, for sure. But it’s a very kind of good perspective I feel playing to the advantages of a fast-growing startup, which is you have these empowered teams. You have the opportunity to build things into the fabric, whether it’s cultural or technology. So it’s kind of helping a bunch of these problems go away, and you need to prioritize the speed and agility because needs change and you have to move along with them in a faster pace maybe than they might be in a more established company.
[00:13:01] Sacha Faust: The main thing for me or if I sort of can summarize when I looked at this is how do we drive accountability in the right way? Clear the room in terms of I see a lot of teams, for example, defer accountability back to security. Oh, security signed off on it. Well, the reality is you gave 1% of the information there. We’re not making an informed decision, right, so I am not the approver of this. I’m a guide and showing the security engineer responsibility matrix and say, “Where do you feel you fit into that,” was kind of a powerful message from a coaching perspective. It’s like, “Well, the reality is you fit here and not there.” You’re not an approver, for example. So that’s how we try to navigate sort of the fast-paced. I would say that lessons learned are definitely applicable in larger companies, how you tackle it, and level of patience is a little bit different. You have to look long term versus like short-term changes.
[00:13:54] Guy Podjarny: Well, let’s dig into that. It’s a fascinating period there at Lyft, but you also had worked at Azure and you’re now with Amazon. How is that different? How do you sort of consider the security needs or what does it mean to do security well in those types of setups?
[00:14:09] Sacha Faust: There are different targets. We’re looking at that cloud infrastructure. Let’s say a financial infrastructure much more mature, much, much larger teams, established process. It’s very difficult to change a process. You have to have a good idea why you’re changing it. Do we have some level of inertia or lack of proactiveness in certain areas? How do you navigate making that change in a fast way but also a diplomatic way? The infrastructure and the process are already well established. It’s about fine-tuning things versus creating things, for the most part. The baseline is there. Now,
how do you tackle ‘unknown unknown’, for example, is an area that you’re going to spend a lot more time on in established environments. Versus a less mature is like we’re just trying to get up to speed and just getting the baseline done, right?
Tackling ‘unknown unknown’ is an interesting one. I think generally assume breach. Be humble. Realize that whatever you have in place might fail, and apply a fault modeling mindset, which is like, “Do you know that this is failing, and can you recover?” So I would say larger environments are much more focused on automation and large-scale systems. If we think about kind of the Lyft environment versus – I mean, just to give you kind of ballpark employee numbers. You’re looking at 10,000, 15,000 employees versus over a million. It’s just a very different scale from the human aspect but also from an engineering aspect, like how many things you have.
The larger environment is a lot more depth-focused. You are hired to really do special things and focus on that, and everything you have to do needs to scale. So the level of experiments is a lot more challenging I would say. I would say that those are the main differences, and the corporate chassis is more complex I would say.
[00:15:54] Guy Podjarny: Yeah. It might be more stakeholders and kind of more history to navigate when you’re trying to drive some change.
[00:15:59] Sacha Faust: Absolutely, absolutely.
[00:16:01] Guy Podjarny: We talked a little bit about the sort of breadth versus depth and maybe, I don’t know, just Amazon-specific or it could be your general experience. When you think about a security problem, when you think about going deep, what does that mean to go deep when you tackle a security risk?
[00:16:16] Sacha Faust: Specific to security risk, for example, the why of a bug, right? Let’s call it – You found an issue, for example, and you call it a risk or you call it a bug. There’s defensive opinions on this, depending on who you talk to. If you talk to an engineer, it’s bug. If you talk to risk management, it’s risk. They’re both saying the same thing, but some very often they don’t agree, so that’s a different topic. Let’s talk about like you found let’s say – I think a good source of bug, for example, is Bug Bounty, which is an interesting one because that’s provable.
Going deep on a bug, one is like what is the nature of this bug and what sort of information can you get out of it? Then there’s also questions in terms of what is the true fix of a bug. So let me dive a little bit more deep around this. Going back to the thought process about providing information to decision-makers is you have a bug, but let’s avoid the doomsday thinking by default, right? I think a lot of security engineers say, “Oh, doomsday.” They can align the moon and the sun and all the planets to come up with that outcome, right? Or you can use the CVE type risk rating, and it gives you kind of a baseline, but that’s not contextual. So you have to add some context to it, to the business.
Quick questions are like, “Is this known externally? Do we have any evidence of this being exercised against us? Are there automated exploits for this type of bug?” Going back to the Bug Bounty is you can already prove that it’s externally found which is what I think I found super valuable there, because it cuts that conversation. It’s not internally found. They found it externally, so right away it lifts the bar because you’ve proven that somebody else can figure it out with limited knowledge of your infrastructure, right?
Now, is this bug being exercised against us? I see a lot of people shy away from answering that question because it involves a more thorough sort of root cause analysis of a bug, which very few teams are actually doing. Like how did we get here? I think stimulating that investigation or building the muscle to do this at scale is an interesting aspect. It’s something that I think you should look at. Second is like why are we here today. Going deep into at least five or more layers of why is what I’ve seen missing, that thoroughness, so asking the good questions of why. Why does this bug exist? The answer cannot be, “Well, we screwed up from an engineering perspective.” “Okay, why did we screw up on an engineering aspect?” Is there quality control in place? Yes. We had the expectation that these classes of bugs should be found or identified early on in a pipeline. Yes, okay.
Then we have a failure there as well. So very often, a single bug expands if you do a proper root cause analysis, and they are expensive to do, so maybe you do a sample once in a while and so on, depending on the velocity of your team. But going deep at that level, sometimes it’s a lot more important to go fix the quality bar faster than go fix the bugs because if you go fix the quality bar, what you will realize in experiments I’ve done in the past with solid static analysis like [inaudible 00:19:13], for example, that allows us to model bugs like patterns at scale is there’s probably 10 or 20 instances of them. Go fix those, right? They may be in more critical systems, like apply it to the tier model of the service, right? So go deep there. That’s what I mean about going deep is really kind of squeeze out as much learning opportunities of that sort of “bug failure” and turning it into an impact that you measure.
Now, let me dive very quickly on the definition of fix. A fix can be a code change. But when do you close this bug, right? Is it that you close it when the fix is coded? Do you close it when the fix has been deployed? When is that? When you start looking at the metrics there in terms of what is the velocity of your fixes and you’re painting an accurate picture, what you might find is that you think that the issue has been resolved, but it’s only been resolved in code, never been deployed. Or it’s been deployed partially in one of the service slices or it might be only in testing, right? When you start measuring that, you might find that there are significant gaps in terms of the reality versus perception. Perception is fixed. Reality is not.
Sometimes, there is a logical reason. If you’re looking at high velocity teams like Lyft and kind of the newer engineering, fresh from start, you tend to have a faster velocity, and there is some chaos approach of resiliency to terminate everything every 14 days, and so it may give you that agility. If you look at larger systems, kind of a cloud provider, fixing a bug at the sort of kernel of the infrastructure of the cloud may take several months for a very good reason. You have to do workflow migrations and so on, right?
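[Editor’s illustration, not part of the interview] One way to surface the gap between a perceived fix and a real one is to record separate dates for “merged in code” and “deployed everywhere” and compare them. The sketch below assumes a hypothetical `BugFix` record; the field names, bug IDs, and dates are invented and do not reflect any tracker used at Lyft, Azure, or Amazon.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class BugFix:
    """Lifecycle dates for one security bug (hypothetical schema)."""
    bug_id: str
    found: date
    merged: Optional[date] = None          # fix landed in code
    fully_deployed: Optional[date] = None  # fix live in every service slice


def perception_gap(fixes):
    """IDs of bugs that look fixed (merged) but are not fully deployed."""
    return [
        f.bug_id
        for f in fixes
        if f.merged is not None and f.fully_deployed is None
    ]


def days_exposed(fix, today):
    """Days from discovery until full deployment, or until today if still open."""
    end = fix.fully_deployed or today
    return (end - fix.found).days


# Made-up sample data, purely for illustration.
fixes = [
    BugFix("BUG-1", date(2024, 1, 1), date(2024, 1, 5), date(2024, 1, 8)),
    BugFix("BUG-2", date(2024, 1, 1), date(2024, 1, 3)),  # merged, never deployed
    BugFix("BUG-3", date(2024, 1, 10)),                   # no fix yet
]

today = date(2024, 2, 1)
print("perceived fixed, not deployed:", perception_gap(fixes))
for f in fixes:
    print(f.bug_id, "exposed for", days_exposed(f, today), "days")
```

Closing a bug on `fully_deployed` rather than `merged` is a design choice; a team with slow rollouts might track both and report the difference.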
So deep dive into what the real fix is and look at, for example, the real fix which is kind of the code fix may take six months to deploy, and you’ve made a decision of the severity based on like who is potentially – Do we have evidence that this is being exercised against you? Do you have threat intelligence that this is being sort of exercised in the wild? As a security owner, my thinking is that we should inform decision-makers based on those criteria if their criteria are changing and potentially prioritize more the detect and response aspect. The protect, which is basically just the fix itself, may take a lot longer. Once you start diving into those types of metrics, you might have some surprises there in your environment.
[00:21:40] Guy Podjarny: I love a deep perspective really on so many dimensions here, right? You mentioned an understanding of the depth of the risk, understanding if it’s severe, if it’s exploitable, if it’s being exploited. You talked about going deep on the origin of it. Why did we get here? What was the missing security control or engineering control that kept us from finding or preventing this bug in the first place? Then you talked about the rollout of the fix. I love that approach.
At Snyk, I often get asked what’s the best place to put security controls to flag vulnerabilities. There’s no one place that is a superset of the others. If you’re closer to the code, you’re closest to the fix. You can find those. If you’re in the pipeline, it’s probably one of the better preventative controls you can put in there. You can break. You can disallow something to get through. If you’re monitoring what’s in production, you’re closest to the facts, because as you just pointed out, it might take a while to roll things out. There might be a line that isn’t quite straight from source code to deployment. Somebody might, even if they shouldn’t have, touch the production system, and you sort of go and find it after. Each of these things presents a complexity. I love how you’re saying if you go deep, eventually you probably have a more robust system. But it’s an investment and it’s an effort.
[00:22:50] Sacha Faust: Absolutely. How do you do this for all of them? Maybe you take a sample. It depends on – A fast-paced company will have less time, but they also move a lot faster I would say in the fix. It was interesting for me to come from a high enterprise engineering focus with quality bars in Azure, and Azure is the only financially backed SLA cloud provider, right? Once you start digging into that, how do they get there is being interesting, and then you go through a fast-paced environment where they deploy at will several times a day, and it’s really approach of fail fast, learn fast, which doesn’t align very well with enterprise engineering, right?
So it was a little bit kind of a shocker for me when I moved into this, but then I kind of enjoyed it and learned quite a bit about going deep on the root cause, so that’s the one thing I would say I appreciate. I used to call it a family sort of dinner conversation which is our weekly root cause of failures at the company. As a family, you fight but it’s to improve the house, right? I absolutely enjoy the thoroughness that they went through, and it took the time it took. That’s how you fail and learn fast. In a larger environment, the volume of those signals is much greater. So I think you have to figure out kind of a way to sample and prioritize but not avoid doing it.
[00:24:05] Guy Podjarny: I think – We can probably kind of go on and we can go deep on this specific topic, and I think it’ll be still good. But I want to dig into one other aspect of your background. We talked about the different companies that you are in and the different security perspective maybe or some anecdotal pieces of the difference in thinking about security initiative. But you also had a stretch of time in development, and I was curious, especially given the audience of this podcast, which kind of straddles this world of security and development, how impactful you think that development experience has been to your ability to do the security role to understand these things to work? Does it matter? Does it not matter?
[00:24:42] Sacha Faust: That has worked for me, right? So that’s all I can say, and it has worked for me different ways. I think diversity of experience I always encourage people to do that. Many years ago, as a security engineer, I felt that I didn’t own fixing things. I was just giving guidance, kind of got a nod from engineers, and the body language sometimes were like, “Yeah, whatever.” I didn’t feel sort of am I making a meaningful impact in this conversation or am I seeing as like kind of a roadblock that needs to be bypassed and some engineers were very good at that.
I also wanted to learn how do we solve these large-scale problems, distributed problems? It was a personal choice, and I love Microsoft giving me this opportunity of lateral movements in my career, an opportunity that may have been a lot more difficult to kind of coming in at the front door, right? It’s moving in distributed systems and working early days in the Azure Active Directory piece which is like very, very complex. That and then seeing the security view myself from the other side of the fence and not telling people my background, right? They saw me as a pure engineer and so on.
Looking at how I was approached, looking at the questions, looking at the you must do approach or we know better than you approach, right? I was like, “Yeah, I guess these are good recommendations but have you even thought about the fact that this is sort of a stateless system?” Implement cross-site request forgery protection many, many years ago on a stateless system in a scalable way, where a system can be distributed across a large scale problem. It’s pretty interesting, right? So it definitely helped me kind of build up this empathy of like, “Oh, okay. This is actually what the work looks like on the other side,” and kind of understanding also the areas of friction I had before without really understanding or even realizing. A, this is just a one-liner code change. Okay, this one-liner, well, we need to write unit tests for that. The money is actually in unit tests very often, so we have to write unit tests for that. Oh, okay. We don’t have a mock layer for this. That one line of code, if you do it, especially at the enterprise level where fail fast, learn fast is not necessarily what the goal of the business is, you should not be failing period in enterprise because there’s major consequences to that, and you lose business and so on.
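[Editor’s illustration, not part of the interview] The difficulty of cross-site request forgery protection on a stateless system is that no server remembers which token it issued. One common pattern, sketched here, is an HMAC-signed token that any server holding a shared key can verify. This is an assumption about a general technique, not a description of what was built at Microsoft; the key, function names, and token layout are hypothetical.

```python
import hashlib
import hmac
import time

# Hypothetical shared secret; in practice it would come from a secret store
# available to every server instance.
SECRET_KEY = b"replace-with-a-real-secret"


def _sign(session_id, issued_at):
    """HMAC-SHA256 over the session identifier and issue time."""
    message = f"{session_id}:{issued_at}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_token(session_id, now=None):
    """Create a token of the form '<issued_at>:<signature>'. No server state is stored."""
    issued_at = int(time.time() if now is None else now)
    return f"{issued_at}:{_sign(session_id, issued_at)}"


def verify_token(session_id, token, max_age=3600, now=None):
    """Check the signature and age of a token on any server that holds the key."""
    current = int(time.time() if now is None else now)
    try:
        issued_str, signature = token.split(":", 1)
        issued_at = int(issued_str)
    except ValueError:
        return False  # malformed token
    if not 0 <= current - issued_at <= max_age:
        return False  # expired or issued in the future
    expected = _sign(session_id, issued_at)
    # Constant-time comparison to avoid leaking signature bytes via timing.
    return hmac.compare_digest(signature.encode(), expected.encode())


token = issue_token("session-123")
print(verify_token("session-123", token))    # same session: accepted
print(verify_token("other-session", token))  # different session: rejected
```

Because verification needs only the key and the request itself, it works the same on every node, which is what makes the approach fit a distributed, stateless service.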
So building up that empathy of like, “Oh, it’s a lot more complex,” and also working with very talented people on the engineering side really kind of upped my game from an engineering and says, “That’s how you kind of design in that scale.” Going back
when I shifted back to security, I became a lot more diplomatic. I assumed I did not know how to fix anything. I offered help versus telling people what to do,
and I carried over my personal experiments of trying to tackle security from the beginning. I carried this at Lyft and that’s why certain decisions were made when I was there. I think they’re generally carried since but I would say it definitely helped.
It also helped for my own team because if you’re looking at doing red teaming, for example, at the scale of cloud providers, the real problem is scale. So I carried over the distributed systems experience into building or at least leading teams that are heavy engineering-flavored that can actually scale to the need of the customer, right? If you’re doing automated testing and scaling at cloud provider scale, what we ended up building is kind of a system that allows for a security engineer and the specialist to kind of model their brain and build it into an automation system, right? That carried over I would say on both sides, both from an empathy perspective and also actually scaling teams very, very quickly.
[00:28:26] Guy Podjarny: I love that and I think it was clearly empathy or maybe that perspective when you were talking about Lyft and talking about hiring builders, talking about having the servant mentality and building those out.
[00:28:37] Sacha Faust: Regardless of sort of my personal experience moving into dev, the general guidance I tell people, both from a career and also how to build teams, diversity is king, so go wander elsewhere. Ask your management. From a manager, go send your people. Embed them into engineering projects if that’s the route that they want to take. Having diversity of experience building your team and people but also work experience has been, for me, extremely valuable and I always look at building teams that have that diverse set and tapping to it which Bug Bounty is a crowdsourcing by the tremendous value and diversity. We learn a lot from those. So that would be my general guidance, regardless of my personal route.
[00:29:19] Guy Podjarny: That’s great advice. I was actually just about to sort of say as our conversation here tells a goodness in it, but I think we’re kind of running a little bit out of time. But I was just about to sort of ask you that as we close off, I like to ask every guest or if you have kind of one bit of advice to give a team that is looking to level up their security foo, above and beyond. You just offered sort of two great tidbits around sort of diversity and that Bug Bounty. But still, if I squeeze one more out of you, if you have one bit of advice that you haven’t mentioned yet for a team to level up security powers, what would that be?
[00:29:50] Sacha Faust: Aim toward self-enablement, right? Work your way out of the task that you’re working on, right? Look really at an empowerment approach versus a dictator or telling people what to do. Enable them to get there first. Taking a quote from a previous manager, Eisar [Lipkovitz] at Lyft, there is a certain degree of benevolent dictatorship that needs to happen. Pick your battles, but generally work towards an empowerment culture versus maybe where dictatorship is inaccurate but kind of telling people what to do. Help them out.
[00:30:19] Guy Podjarny: I think that’s great advice and I think you’ve even given all sorts of examples about why that is valuable as we’re journeying through it. Sacha, this has been great. Thanks a lot for sharing sort of multiple perspectives, tons to think about as we come out. Thanks for coming onto the show and sharing.
[00:30:34] Sacha Faust: My pleasure. This was actually pretty fun for me as well.
[00:30:37] Guy Podjarny: Thanks, everybody, for tuning in, and I hope you join us for the next one.
[END OF INTERVIEW]