About this episode:
LIRAN TAL: “Aside from that kind of notes, when you install new packages on npm, you usually grab the latest unless there’s a specific reason to choose a different major version. That is, of course, when you first introduce it to the project. I think from there on, all of the bot automation that happens in the ecosystem, whether it’s through Snyk or others, is something that is helpful to keep you on the latest version as needed.”
[0:00:32] ANNOUNCER: Hi. You’re listening to The Secure Developer. It’s part of the DevSecCon community, a platform for developers, operators, and security people to share their views and practices on DevSecOps, dev and sec collaboration, cloud security, and more. Check out devseccon.com to join the community and find other great resources.
[0:00:53] SIMON MAPLE: Hello, everyone and welcome to another episode of The Secure Developer. Today, we’re going to be talking a little bit about supply chain security, but in the context of different ecosystems in different languages, and the different considerations we need to make based on the ecosystems we’re using. Joining me today are a couple of Snykers, each an expert in their own domain and ecosystem. Joining me are Liran Tal and Roy Ram from the DevRel and the product teams, respectively. Why don’t we jump into some intros first of all? Liran, first of all, welcome back to The Secure Developer podcast. Must be your, what? Third, fourth time on the podcast?
[0:01:33] LIRAN TAL: Yes, thanks for having me. It’s always a pleasure having these conversations. Am I going to expect some trolling from you?
[0:01:39] SIMON MAPLE: Well, when we talk about something as emotional as language choice and ecosystem choice, I think it’s where real life is in our respective ecosystems. I think it is one that will likely have a level of trolling in this episode, but we’ll try and keep it to a minimum. But Liran, for those who are new to the podcast, tell us a little bit about yourself and a little bit about your background as well.
[0:02:35] SIMON MAPLE: Awesome. Amazing. Welcome again. Joining Liran here today is Roy. Now, Roy, it’s your first time on the podcast, right? Long-time listener, first-time caller. Well, welcome to the podcast. Tell us a little bit about yourself, what you do at Snyk, and also your background in tech as well.
[0:02:48] ROY RAM: Sure. First of all, thank you for having me. Yes, it’s the first time, a first-timer, so take it easy on me with the trolling. I’m Roy, I’m part of the product team for Snyk Code, which is Snyk’s SAST solution. My main focus areas at Snyk Code are language support, framework support, and understanding exactly what are the things that bring value to our users, both the security teams and the developers themselves. I get to speak to a lot of security teams and a lot of developer teams, really trying to pinpoint what are the things that will give them the most value.
In terms of my background, I come from cybersecurity, a good couple of years. Snyk is actually my first DevSec company, but I do definitely come from the cybersecurity field. If it’s product management, if it’s running a SOC team, if it’s a bunch of different hats in the cybersecurity world.
[0:03:38] SIMON MAPLE: My name is Simon Maple, and I’m going to be the host and moderator for this session. I’ll be definitely trying to keep a cap on the trolling. But my background, 20 or so years in the Java domain, really. Been a Java champion for a number of years as well. Actually, probably cut my teeth on C as my first language, probably close to 30 years ago. I did a lot of C and C++ at university. But really, when I left uni, and I joined IBM, actually, Java was the core ecosystem that they’d invested in and they were moving into. I’ve really done a lot of my more commercial development in Java, and that’s really been the last 20 years now of my life.
Yes, let’s jump in and talk a little bit about – why don’t we start with the differences really? We’ve already done a large amount of discussion back in February and March around supply chain security. Of course, there are also some of the latest releases in and around our support of C/C++.
Actually, we thought it would actually be a great opportunity to kind of like merge the two and say, well, for people who are thinking about securing their supply chain, and in particular, looking at their bill of materials, looking at the third-party dependencies that they’re pulling in, what are the differences people need to really be cognizant of in terms of the ecosystems, the languages that they’re using. Is there anything we need to be more aware of, any approach changes that we need to make in our approach to being more secure?
I think understanding that is, first off, a really important aspect: this is a less conservative community. The language ecosystem is less conservative, it’s fast-moving. The maintenance changes very quickly. Aspects like what kind of packages you can find on the registry are very different. How long they are maintained is also very different. A lot of the options around choosing dependencies also differ based on that, because then you have to look at different properties like “What’s the maintainer profile? When was the last committed change to your dependency?” and so on.
I think that kind of profile of the developers and maintainers to a specific ecosystem, whether they’re less conservative, more conservative is very important in understanding also how to approach managing dependencies and the bill of materials for that ecosystem.
[0:06:36] SIMON MAPLE: Yes, thank you. Same question to you, Roy.
[0:06:39] ROY RAM: Yes, definitely. I’ll try to take more of the C and C++ side of things. Definitely, when thinking about just the amount of time that C++ has been around, the amount of changes it’s undergone in the 40 years that it’s been out. Just for the sake of – I wrote down just a couple of dates and stuff, just because it blew my mind. C++ came out in ’85. Michael Jordan was named Rookie of the Year that year. C came out in 1972, it was the first time women were able to run in the Boston Marathon. That’s how long we’re talking back. To that extent, it means that essentially, you can do the same thing within this code base 10 different ways.
It means that you have so many different package managers, and open source, and open-source communities adding so much to it on a daily basis because it’s still being used for, in many cases, the most important code stack in an application. Generally, it’s very old, but it’s also potentially the most important code stack. I think the importance here is understanding exactly what you’re introducing into the code, the fact that you very regularly embed open-source libraries into your code.
I think it’s important to always be very aware of what you’re introducing, potentially even more than some other languages, and really trying to understand what it is you’re introducing, how much can you rely on it. And really try to understand where it’s coming from and make sure that it’s secure.
[0:08:06] SIMON MAPLE: Yes, yes, that’s amazing to think how old it really is, 50 or so years.
[0:08:13] LIRAN TAL: That’s great because I suddenly feel just so much younger, so thank you for putting that in perspective. Great.
[0:08:19] SIMON MAPLE: Well, many will really see Java as the C++ evolution really. It’s kind of like what C++ maybe should have been. Maybe we should call it C+++. But I think a lot of the things that Java adds on top, I guess, it tries to take a little bit more away from the developer in the sense of things like your memory management and things like that. It tries to help developers stay out of trouble. I remember when I was a developer, getting lost through pointers, and addresses, and all kinds of things like that, which is really, really hidden largely in the Java space.
Let’s dig a little bit deeper and talk, first of all, about how each of the languages uses third-party libraries. I think that is one of the key things to ask when really understanding our inventory, our bill of materials, and the ease of building our application.
In Java, of course, we tend to use Maven, I would say, mostly. Gradle as well is a very, very common tool to build and fetch dependencies. There are, of course, a number of others, perhaps Ivy if we go back a little bit, and some others as well. But Maven and Gradle are the main two that are used.
When we build using those, our pom.xml or Gradle build files are quite declarative in the way we pull in the dependencies that are used. We tend to pin versions: when we use a dependency, we’ll grab that dependency at a certain version. There are pros and cons to that, of course. It can get a little bit complex, depending on what you use. If you’re using Maven, perhaps you’re using parent POMs and this module-style system. If you’re using Gradle, you could be using a DSL. It could actually be quite hard for systems to identify what you’re actually pulling down purely by looking at the artefacts alone without building.
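As a rough illustration of why a build file alone may not tell the whole story, the sketch below reads a small POM and classifies each declared dependency as pinned, inherited (no version in this file, so it would have to come from a parent POM or similar), or property-based. The POM contents, the `classify_dependencies` name, and the three labels are all assumptions made up for this example; a real Maven build resolves far more than this.

```python
import xml.etree.ElementTree as ET

# Maven POM files declare this XML namespace on the <project> element.
NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical POM, used only for illustration.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <parent>
    <groupId>com.example</groupId>
    <artifactId>parent</artifactId>
    <version>1.0.0</version>
  </parent>
  <artifactId>app</artifactId>
  <dependencies>
    <dependency>
      <groupId>org.example</groupId>
      <artifactId>pinned-lib</artifactId>
      <version>2.3.1</version>
    </dependency>
    <dependency>
      <groupId>org.example</groupId>
      <artifactId>inherited-lib</artifactId>
    </dependency>
    <dependency>
      <groupId>org.example</groupId>
      <artifactId>property-lib</artifactId>
      <version>${lib.version}</version>
    </dependency>
  </dependencies>
</project>
"""


def classify_dependencies(pom_text):
    """Map each directly declared artifactId to how its version is specified."""
    root = ET.fromstring(pom_text)
    result = {}
    for dep in root.findall("m:dependencies/m:dependency", NS):
        artifact = dep.findtext("m:artifactId", namespaces=NS)
        version = dep.findtext("m:version", namespaces=NS)
        if version is None:
            # No version here: it must be resolved elsewhere (e.g. a parent POM).
            result[artifact] = "inherited"
        elif version.startswith("${"):
            # Version is a property reference that this file alone may not define.
            result[artifact] = "property"
        else:
            result[artifact] = "pinned"
    return result


if __name__ == "__main__":
    for name, kind in classify_dependencies(SAMPLE_POM).items():
        print(f"{name}: {kind}")
```

Only the first of the three dependencies can be read off the file directly; the other two need the parent POM or the property definitions, which is roughly the difficulty described above.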
[0:12:48] SIMON MAPLE: It is very interesting, actually. Because I know if you just look at the sheer numbers of packages that have been registered, I don’t know what npm was at. I remember when we did a blog a while back was about the one millionth. It’s probably way higher these days, probably like two, something like two or three million.
[0:13:01] LIRAN TAL: Yes, I think it was like 2.3, 2.4 about.
When the one millionth package was reached, I saw someone on Twitter say, “Oh, yes, I did the one millionth.” Then I thought, “How is it possible, because it doesn’t give you a number back and the website update is way too slow?” He replied back saying, “I wrote a script that, when it was about 40 or 50 below, automatically uploaded 60 packages.”
[0:13:34] LIRAN TAL: I remember that.
[0:13:35] SIMON MAPLE: It’s interesting because it gives the ability of being that fast-moving organisation. It is very, very quick. It is very, very easy. It really reduces the barrier to entry. But that does pose other problems, right? Things like typosquatting, things like malicious packages, like you say. Would you say is a bigger issue on npm than almost any other ecosystem, or is this –
[0:13:35] LIRAN TAL: That’s a good question. I think we’ve seen campaigns where we had those malicious packages and typosquatting packages, thousands of them, across both npm and Python over at PyPI. Probably Ruby as well. I would say npm isn’t the only one to specifically call out, but if you compare that against the more conservative Java and C++, then definitely, there’s like a huge difference.
[0:14:58] SIMON MAPLE: That’s really interesting. I think, I would say, definitely, Maven repositories are much more governed centrally. You almost – [inaudible 0:15:07] need approvals and things like that to actually be able to push things up onto like a Maven Central or something like that, as well as update, and things like that. It’s definitely slower, but you definitely see far, far fewer, I would say, issues in and around, whether it’s typosquatting, whether it’s even malicious code up into those spaces. There’s a definite difference in approach across the registry point of view. C/C++, Roy, how would you call out differences there?
[0:15:33] ROY RAM: Yes, again. I think I touched on it a bit. It’s a bit of a different ecosystem as well, where it might be a little more governed to an extent. It might be a little tougher to do. But at the same time, I’m sure smart people are able to find the loopholes to get those malicious libraries into it.
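To make the typosquatting idea concrete, here is a minimal sketch that flags a package name sitting within a small edit distance of a well-known name. The `POPULAR` list, the function names, and the distance threshold are assumptions chosen for illustration; this is not how any particular registry or scanner does it.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


# Hypothetical allow-list of names the team actually intends to depend on.
POPULAR = ["express", "lodash", "react", "requests"]


def possible_typosquats(name, popular=POPULAR, max_distance=1):
    """Return well-known names that `name` is suspiciously close to."""
    return [p for p in popular
            if p != name and edit_distance(name, p) <= max_distance]


if __name__ == "__main__":
    for candidate in ["expres", "lodash", "reqeusts"]:
        print(candidate, possible_typosquats(candidate, max_distance=2))
```

A check like this would only ever be a first filter: a name being close to a popular one is a hint to look closer, not proof of anything malicious.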
Again, like I said, I think just the fact I touched on it a bit earlier, I think just the fact that you, again, essentially embed that code into your code, it could also mean that some things might go through the cracks. If you don’t really look at your code, and one of the things you’re introducing, it’s definitely, potentially in some cases harder to actually notice, or things might go unnoticed as well.
I think there are a couple of interesting conversations I had with some of the security teams where, again, going back to the fact that it’s very old. We were talking to a company where they have 30-year-old code with 20-year-old vulnerabilities. There’s a real struggle there and kind of a tug-of-war between the security team and the developers. Where the security team wants them to fix everything, and the developers are saying it’s been here for 20 years, nobody’s actually done anything with it, why do we even need to fix it? It’s a real struggle, which might be a bit different than other languages where it’s a bit more looked at, at a higher frequency. In some cases, these vulnerabilities are taken a bit more seriously. But at the same time, it does explain the need to not only understand and be able to find them, but also be able to prioritise them, and really focus on the things that are important for all sides, and to make sure that they’re being fixed and addressed.
That is something that I did come across in C++ more than in other languages, where you really have those old code bases with very old vulnerabilities, where it’s a real struggle to get the developers to really want to fix it, and understand the importance of it.
[0:17:25] SIMON MAPLE: I think interestingly, when you look at something like C and C++, it’s obviously older than open source, right? When we think about the usage of libraries, third-party libraries, within your C and C++ applications, you’re going to get a large variety of usage from them. Very often, people will just pull them in, very often people may use a package manager. Tell us about your experiences in whether people are using package managers or whether people are just pulling in unmanaged libraries from third parties.
[0:17:55] ROY RAM: It really depends on the developers and what they’re used to. I think some of the more old-school ones would probably go towards embedding them, embedding the open-source libraries. Or you probably see that again, a lot more in the older code. C++ is continuously going into iteration and being improved on. I’d say, the more modern C++, you’d see package managers more often than embedded. That’s what I’ve been noticing, so far. Again, I don’t know, maybe Liran or Simon, you’ve noticed something else, but that’s definitely a trend that I’m seeing. Definitely more in the older code rarely see package managers. The newer code is the more frequent you would see them.
That proved to be like – not to be opinionated here, but it’s very dangerous in a sense as well. There’s no package management, you literally just – you have no way to track those unless you go through, like, a static analysis of the file, and try to locate all of those. Who says you use HTTPS versus HTTP? I don’t know. Maybe there’s a typo or you just copied the wrong link and you use that instead. Go, I think, has a similar convention in what its package management uses, which is kind of also like doing that. Then what happens when vendors like different supply chain security aspects, like what happens when the maintainer just goes, changes the repository, or something like that?
[0:22:02] SIMON MAPLE: A couple of other things that might be worth talking about are the differences in the dependencies. Maybe even just talking about the sheer number of dependencies that are being introduced, maybe we will refer to the heaviest object in the universe being the node_modules directory. We’ll talk a little bit about pinning and unpinning as well, typing perhaps, and then where languages are used, by which companies, and what the application is more likely to look like. Are there going to be more enterprise applications, bigger applications or not?
[0:22:31] ROY RAM: Is it also worth touching on – if we’re talking about like the SBOM, so just the fact that the security world is also moving to a more transparent world, where things like SBOMs are going to be mandated in the US, it’s probably going to be mandated in other places, and you really need to think about what you implement into your application. Because once there is full transparency, anybody can see what libraries you have, you need to make sure that whatever you’re using is secure because –
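The kind of static pass mentioned here, locating every URL-style import and checking whether it uses HTTPS, could look roughly like the sketch below. It assumes source files where the import specifier is a quoted URL; the regular expression, the `find_url_imports` name, and the sample source are illustrative assumptions and will miss anything more dynamic.

```python
import re

# Matches a quoted http(s) URL directly after `from` or `import`.
URL_IMPORT = re.compile(r"""(?:from|import)\s*["'](https?://[^"']+)["']""")


def find_url_imports(source):
    """Return (line number, url, uses_https) for each URL import found."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in URL_IMPORT.finditer(line):
            url = match.group(1)
            findings.append((lineno, url, url.startswith("https://")))
    return findings


# Hypothetical source file contents, for illustration only.
SAMPLE = '''import { serve } from "https://example.com/std/http/server.ts";
import helper from "http://example.com/helper.js";
const x = 1;
'''

if __name__ == "__main__":
    for lineno, url, secure in find_url_imports(SAMPLE):
        flag = "ok" if secure else "INSECURE (plain http)"
        print(f"line {lineno}: {url} -> {flag}")
```

Even with a scan like this, nothing here verifies what the URL serves tomorrow, which is the maintainer-changes-the-repository concern raised above.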
Does this make it harder if there are vulnerabilities as a result, trying to identify how they’re coming in, number of paths they’re coming in? Also, how to remediate? Is that fair or is that –
[0:23:54] LIRAN TAL: It’s a fair assessment, I think. That’s definitely true with the more packages that you have. I think these days, maybe it’s not that common to see the one-liners, but it doesn’t really mean anything else. You’d have still shorter packages, like very best for one thing. I think that kind of dependency tree bubbles up really, really high. It’s kind of hard to imagine how many different dependencies you have at each hierarchy level.
One piece of academic research that looked at npm and PyPI a few years back was comparing the amount of nested dependencies you would have to install. The average npm package install already gives you four levels of a dependency chain. Definitely, that is something to be concerned about, and to see how you are able to manage that. You also mentioned like – they also impact the vulnerability triaging, right? What happens when your last dependency in the chain, if you imagine a tree of nested dependencies, or one before last has a vulnerability? There is essentially a bunch of maintainers up the stream now that need to choose how they remediate that. Like, do they replace that vulnerable dependency with something else? Are they able to update to a newer version of that dependency if it exists? What happens if it exists in a new major version that kind of maybe changes the version requirements scheme?
From a C++ point of view, I guess, depending on whether you’re using Conan or unmanaged, it becomes actually probably quite easy from an unmanaged point of view to be able to perform upgrades and things like that.
[0:26:48] ROY RAM: Again, it’s similar but different, to an extent. It is definitely something that in some cases, might be harder to trace back when it comes to the embedded space of really, really understanding the flow of how things are being introduced and where they are being introduced. It really requires you to understand the code very well and understand where the code is coming from, and what it’s doing. There is a lot more depth than in other cases. I’d sum it up by, it’s very similar, but different just in the visibility of the dependency graph, what you’re able to really easily see in the versions and things like that.
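The triage question raised here, namely through which chains a vulnerable nested dependency arrives, can be sketched as a simple path search over a dependency graph. The graph, the package names, and the `paths_to` helper below are hypothetical and exist only to illustrate the idea.

```python
# Hypothetical dependency graph: package -> direct dependencies.
GRAPH = {
    "my-app": ["web-framework", "logger"],
    "web-framework": ["body-parser", "router"],
    "body-parser": ["tiny-util"],
    "router": ["tiny-util"],
    "logger": [],
    "tiny-util": [],
}


def paths_to(graph, root, target, _path=None):
    """Return every dependency chain from `root` down to `target`."""
    path = (_path or []) + [root]
    if root == target:
        return [path]
    found = []
    for dep in graph.get(root, []):
        if dep in path:  # guard against dependency cycles
            continue
        found.extend(paths_to(graph, dep, target, path))
    return found


if __name__ == "__main__":
    # Suppose "tiny-util" is the package with the reported vulnerability.
    for chain in paths_to(GRAPH, "my-app", "tiny-util"):
        print(" > ".join(chain))
```

Each printed chain passes through a different intermediate maintainer, and each of those maintainers would have to release an update before the fix reaches the application without overrides.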
I think there’s a difference here in risk in terms of which you opt to go for. If you’re pinning, you, first of all, know what you’re building, and that won’t necessarily change. But also, in years to come – I spoke with the Spring folks about this as well. Very often, Spring Initializr is a great way of actually starting on the latest version of all your dependencies. The problem isn’t with the people who are building that day. The problem is the people who built two years ago, who actually haven’t changed their pom.xml, and they’re still on those older versions. They’re actually pulling in not just security issues, but plenty of other functional issues and other things like that.
I guess JavaScript certainly benefits from building with the latest version, and getting your bug fixes and your security fixes more frequently. Any downside to that, Liran?
[0:28:50] LIRAN TAL: I know of friends where part of their CI is essentially that they update to the latest and see that nothing breaks. I think that’s less common in Java, or C, C++. But I have friends who literally just do that to stay up to date, unrelated to anything, just like update to the latest, run your tests. If nothing breaks, they pin the latest.
Downsides might be malicious actors on supply chain security stuff, right? Like they’re able to take over a package, or a maintainer, or whatever. Then they, of course, issue new dependency versions. We’ve seen several cases of that: faker on npm, colors, node-ipc, all of those are examples of maintainers, in different respects, just releasing offensive code. That kind of broke the dependency. If you were always on the bleeding edge, like unrelated, you would be susceptible to that, whether that’s, by the way, a functional issue or a security vulnerability.
I’m not part of that camp of like “always update to the latest no matter what”. I think you definitely need to take some buffer, understand what you’re updating to and upgrading to as well as managing your dependency hygiene in a safe way. Which means, knowing which dependency maintainers to take it from and so on.
Aside from that kind of notes, when you install new packages on npm, you usually grab the latest unless there’s a specific reason to choose a different major version. That is, of course, when you first introduce it to the project. I think from there on, all of like bot automation that happens in the ecosystem, whether it’s through Snyk or others is something that is helpful to keep you on the latest versions as needed.
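One way to picture the "buffer" Liran describes is a cooldown rule: only adopt a release once it has been public for some number of days. The sketch below is a minimal illustration under that assumption; the release data, the 14-day default, and the `newest_after_cooldown` name are invented here and are not a feature of any specific tool.

```python
from datetime import date, timedelta

# Hypothetical release history: (version, release date).
RELEASES = [
    ("1.4.0", date(2023, 1, 10)),
    ("1.4.1", date(2023, 3, 2)),
    ("1.5.0", date(2023, 3, 28)),
]


def newest_after_cooldown(releases, today, cooldown_days=14):
    """Pick the most recently released version that is at least
    `cooldown_days` old, or None if nothing qualifies yet."""
    cutoff = today - timedelta(days=cooldown_days)
    eligible = [(released, version)
                for version, released in releases
                if released <= cutoff]
    if not eligible:
        return None
    # Newest by release date; this sketch does not compare version numbers.
    return max(eligible)[1]


if __name__ == "__main__":
    print(newest_after_cooldown(RELEASES, today=date(2023, 4, 1)))
    print(newest_after_cooldown(RELEASES, today=date(2023, 4, 20)))
```

The trade-off is the one discussed in the conversation: a cooldown gives the community time to notice a bad release, but it also delays genuine security fixes, so an urgent fix would likely need a way to bypass it.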
[0:30:28] SIMON MAPLE: It makes creating SBOMs and things much harder because it always depends on when you build, is to determine what you’re using at any one time. In fact, Roy, you were mentioning earlier, actually, the importance of SBOMs these days, right? Particularly in terms of the impact it’s having on our industry, and certainly the regulations that’s being added into our industry. Tell us more about that.
[0:30:49] ROY RAM: Yes, for sure. Definitely we’re seeing trends. I don’t think it’s new in the cybersecurity world. It might be a bit newer here. The fact of just being transparent about what you’re using, what your application is introducing. Just as an example in the US, they’re talking about mandating SBOMs for your application. So providing an open list of what your application is built out of, for the importance of transparency, and understanding exactly, in cases of supply chain, what am I introducing into my application?
Historically speaking from cybersecurity, but other fields, generally, that tends to spread very quickly. I came from the brand protection world in cybersecurity. We saw that very rapidly, every single country was actually mandating the need for a brand protection vendor. I can definitely see that happening in this world as well, where the mandating of exactly what is in your code, what you’re introducing, is going to be fairly widespread just for the sake of transparency. We don’t need to call out specific instances, but you have a lot of instances where third-party supply chain attacks, at the end of the day, caused a lot of harm.
Then, you really want to understand exactly what am I introducing, know about it beforehand. I’m sure that there’s going to be a lot of red tape, especially with the government, and companies, and other types. What’s been around compliance? What are you allowed to introduce? What are you not allowed to introduce? So it’s certainly something that developers need to take into account.
[0:32:23] SIMON MAPLE: Yes. I think even going back to something like Log4j, of course. The first thing that we need when we have a zero-day or something is an SBOM, some artefact that is able to tell us accurately: is this in my domain? Where is it in my domain and what do I need to do to fix this? I think that transparency across companies as well will become very important because we’re not just creators of code, but we all consume other people’s off-the-shelf software as well. It’s important for us to know when we host software or when we use services, what else is being impacted by the zero-days and things like that as well.
While we talk about Log4j, I can’t believe no one’s brought this up but me, actually. I was ready for a troll on Log4j. But when we talk about something like Log4j, one of the things that I was actually really impressed with was the speed at which the community – and maybe we were a touch lucky, just in how good the maintainers of Log4j were – the speed at which they did roll out fixes, the openness that they did that through. That really did help the users of Log4j, the vast majority of those being through transitive dependencies, actually pull in fixed versions of the code bases.
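The "open list of what your application is built out of" that Roy describes can be sketched by turning a lockfile into a component list. The example below assumes an npm lockfile in the newer layout with a top-level `packages` map and emits a CycloneDX-style document; the lockfile contents and package names are hypothetical, the output is not validated against any schema, and scoped package names (which need extra encoding in a package URL) are deliberately left out.

```python
import json

# Hypothetical lockfile contents, shaped like an npm lockfile "packages" map.
SAMPLE_LOCK = {
    "name": "my-app",
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "my-app", "version": "1.0.0"},
        "node_modules/tiny-util": {"version": "1.3.0"},
        "node_modules/web-framework": {"version": "4.18.2"},
        "node_modules/web-framework/node_modules/tiny-util": {"version": "1.1.0"},
    },
}


def lockfile_to_components(lock):
    """List every installed (name, version) pair as an SBOM-style component."""
    seen = set()
    for path, meta in lock.get("packages", {}).items():
        if not path:  # the empty key is the root project itself
            continue
        # The package name is whatever follows the last "node_modules/".
        name = path.split("node_modules/")[-1]
        seen.add((name, meta["version"]))
    return [
        {"type": "library", "name": n, "version": v, "purl": f"pkg:npm/{n}@{v}"}
        for n, v in sorted(seen)
    ]


def build_sbom(lock):
    """Wrap the components in a minimal CycloneDX-style envelope."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.4",
        "version": 1,
        "components": lockfile_to_components(lock),
    }


if __name__ == "__main__":
    print(json.dumps(build_sbom(SAMPLE_LOCK), indent=2))
```

Note that the same package shows up twice at different versions because it is nested under another dependency, which is one reason an SBOM built from the resolved install can differ from what the manifest alone suggests.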
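The "is this in my domain, and where?" question Simon raises is essentially a lookup against an SBOM. A minimal sketch, assuming an SBOM with a flat `components` list and purely numeric dotted versions: the component names, versions, and the "fixed in" threshold below are hypothetical and do not describe any real advisory.

```python
def parse_version(text):
    """Turn '2.14.1' into (2, 14, 1); assumes numeric dotted versions only."""
    return tuple(int(part) for part in text.split("."))


def affected_components(sbom, name, fixed_in):
    """Return components called `name` whose version is below `fixed_in`."""
    fixed = parse_version(fixed_in)
    return [
        c for c in sbom["components"]
        if c["name"] == name and parse_version(c["version"]) < fixed
    ]


# Hypothetical SBOM fragment, for illustration only.
SAMPLE_SBOM = {
    "components": [
        {"name": "example-logger", "version": "2.14.1"},
        {"name": "example-logger", "version": "2.17.1"},
        {"name": "other-lib", "version": "1.0.0"},
    ]
}

if __name__ == "__main__":
    for component in affected_components(SAMPLE_SBOM, "example-logger", "2.15.0"):
        print(f"needs upgrade: {component['name']} {component['version']}")
```

Real version strings often carry qualifiers such as beta or release-candidate suffixes, so a production check would need an ecosystem-aware version comparison rather than this simple tuple comparison.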
[0:33:59] LIRAN TAL: That’s fairly a good amount in terms of packages, who have been deemed abandoned if that category is fitted into a release that was done less than 12 months.
[0:34:10] SIMON MAPLE: Yes. I mean, I guess one potentially important thing to think about, when we as organisations, and we as developer organisations as well, use libraries, is a real consideration of the community strength behind that. Also, the maintenance that is happening to that library in case something happens.
[0:34:53] LIRAN TAL: I would say two things. First is, just manage your dependency hygiene in a more responsible way. That is, know how to choose dependencies, use things like the Snyk Advisor that helps you understand the package health, and weave that into your internal processes of how your developers choose dependencies.
Then the other one is, there are many existing package manager security controls, so to say. Such as, when you install new dependencies, do not allow them to run arbitrary commands and stuff like that. Definitely, apply all of those security controls that exist within the package managers themselves to the extent possible. I think those are very good first few steps to just be at the next level in terms of your readiness for a supply chain security issue.
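For the control Liran mentions about not letting new dependencies run arbitrary commands, npm offers an `ignore-scripts` setting (for example `npm install --ignore-scripts`). As a complementary, hedged sketch, the code below walks an installed `node_modules` tree and reports which packages declare install-time lifecycle scripts, so a team can see what would be skipped. The function names are made up for this example.

```python
import json
from pathlib import Path

# npm lifecycle scripts that run automatically during installation.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_hooks(manifest):
    """Return the install-time scripts declared in a parsed package.json."""
    scripts = manifest.get("scripts") if isinstance(manifest, dict) else None
    if not isinstance(scripts, dict):
        return {}
    return {hook: scripts[hook] for hook in INSTALL_HOOKS if hook in scripts}


def packages_with_install_scripts(node_modules):
    """Map package name -> install-time scripts for everything under a directory."""
    findings = {}
    for manifest_path in sorted(Path(node_modules).rglob("package.json")):
        try:
            manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed manifest: skip in this sketch
        hooks = install_hooks(manifest)
        if hooks:
            name = manifest.get("name", str(manifest_path.parent))
            findings[name] = hooks
    return findings


if __name__ == "__main__":
    for name, hooks in packages_with_install_scripts("node_modules").items():
        print(name, hooks)
```

Some packages legitimately need these scripts, for example to compile native code, so a report like this is better treated as a review list than as a block list.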
[0:35:42] SIMON MAPLE: Great advice. I think a lot of that actually is really great advice equally for other ecosystems as well. Roy, what would you add?
[0:35:48] ROY RAM: Generally speaking, I think everything within your code is important. The more we move towards a more modern approach, where you can utilise more third-party vendors, third-party code, things like that, I think it really indicates the importance of looking at your code holistically. You can’t really focus on the IP code, only on dependencies, only different teams. Everybody needs to look at the entire code holistically, and be able to understand the full context of what you’re calling, where you’re calling, what does it affect, what does it not affect.
There’s a shift already, in my opinion, that’s going on. I think we need to continue that shift. Really looking at the whole code and the whole application holistically, and understanding that at the end of the day, everything is moving to code. Everything is code, everything is application, everything can have a vulnerability in it, everything can be misconfigured. I think continuing down that path, and really understanding that you need to look at it holistically, is the best advice I can give you.
[0:36:41] SIMON MAPLE: Great advice. Great advice. Yes. I guess the thing I would add probably is, when we think about whether we’re looking at individual ecosystems or greater ecosystems, just be very intentional about what you’re doing. If you’re pulling things, and know that you’re pulling things in, recognise why you’re pulling them in and the health, like you say, Liran, about what you’re pulling in. I think that’s most important. By intentional, I kind of like mean, make sure you have good open-source strategies in the sense of recognizing how your build is likely – works in the sense of, are you pulling in latest, what is the result of maybe if you’re on Java, and not pulling in latest? And how often do you need to make sure that you’re looking at whether you should be upgrading whether through tooling or otherwise? I think being very intentional and getting that visibility into what you’re using I think is core.
We hope that’s been useful for folks who are interested in, I guess, some of the differences in and around the security of various ecosystems. Thanks very much for listening, and we hope to see you and hear from you again on another The Secure Developer podcast. Thanks very much.
[END OF EPISODE]
[0:37:44] ANNOUNCER: Thanks for listening to The Secure Developer. That’s all we have time for today. To find additional episodes and full transcriptions, visit thesecuredeveloper.com. If you’d like to be a guest on the show or get involved in the community, find us on Twitter at @DevSecCon. Don’t forget to leave us a review on iTunes if you enjoyed today’s episode. Bye for now.
About Liran Tal
A GitHub Star, recognized for activism in open source communities and advancing web and Node.js security. Member of the Node.js Foundation ecosystem security working group, project lead and contributor to the OWASP Foundation, and Developer Advocate at Snyk.
About Simon Maple
Simon Maple is the Field CTO at Snyk, a Java Champion since 2014, JavaOne Rockstar speaker in 2014 and 2017, Duke’s Choice award winner, Virtual JUG founder and organiser, and London Java Community co-leader. He is an experienced speaker, having presented at JavaOne, DevoxxBE, UK, & FR, DevSecCon, SnykCon, JavaZone, Jfokus, JavaLand, JMaghreb and many more including many JUG tours. His passion is around user groups and communities. When not traveling, Simon enjoys spending quality time with his family, cooking and eating great food.
About Roy Ram
Roy is a senior product manager at Snyk. He is an experienced Product Manager with a demonstrated history of working in the information technology and services industry. Skilled in End-to-End Product Management, Cyber Security, Management, Agile Methodology, and Product Owning.
About The Secure Developer
In early 2016 the team at Snyk founded the Secure Developer Podcast to arm developers and AppSec teams with better ways to upgrade their security posture. Four years on, and the podcast continues to share a wealth of information. Our aim is to grow this resource into a thriving ecosystem of knowledge.
Hosted by Guy Podjarny
Guy is Snyk’s Founder and President, focusing on using open source and staying secure. Guy was previously CTO at Akamai following their acquisition of his startup, Blaze.io, and worked on the first web app firewall & security code analyzer. Guy is a frequent conference speaker & the author of O’Reilly “Securing Open Source Libraries”, “Responsive & Fast” and “High Performance Images”.