Transcript
Brodskyi: My name is Mykhailo Brodskyi. As Principal Software Architect, I focus on platform security and cloud migration. I’m going to walk you through the top four security risk categories in the software supply chain and show you how to mitigate them effectively. I will share real case studies from our projects, highlighting strategies that protect systems from vulnerabilities and ensure the security and resilience of your platform.
Here is how I’m going to do it. First, we’ll start with the challenges that we have in the FinTech industry. Then, we will deep dive into the risk categories, and I will show practical examples of how to mitigate them. I prepared some case studies from our real projects. Then, I will show a real hands-on demo.
Looking Into the Future and Reflecting on the Past
Do any of you know what significant event related to security takes place here in Munich every year? It’s not Oktoberfest. Any ideas, every year in winter? The Munich Security Conference. It has been a global stage for discussing international security issues. Here, we are not talking about geopolitical issues. We are talking about something equally important: software security. Just as the Munich Security Conference shapes global security policies, our goal is to shape the security of our software supply chain.
Uncover FinTech Landscape Challenges
Let’s dive into the FinTech landscape. The FinTech ecosystem is driven by key business domains, such as customer onboarding and payment processing. Each of these domains ensures the smooth operation of financial services, and each houses numerous business applications. Take customer onboarding, for instance: there is an application for know your customer (KYC) checks. For payment processing, we have applications responsible for alternative payment method (APM) processing, credit card transaction processing, and more. Each of these domains operates under a framework of standards, laws, and principles.
For example, payment processing is subject to PSD2, while fraud detection is governed by AMLD. Why do we have all these principles, laws, and standards, all these regulations, in FinTech and in other areas as well? The answer is simple: risk. All these regulations are designed to mitigate some risk: financial risk, reputation risk. These laws exist to help organizations mitigate such risks. As you can see, the FinTech landscape is very complex due to these regulations and the integrations with other applications. It’s clear that we need a really robust approach to securing all the applications inside our landscape.
Explore Software Supply Chain
Let’s dive into the software supply chain. I would like to begin with something that we are all familiar with: the physical goods supply chain. The journey begins with an upstream supplier that delivers raw materials to a manufacturer, and then the customer receives the final product. Similarly, in software development we rely on suppliers such as third-party libraries, dependencies, and tools. Everything flows into the development process. If any of these components is compromised, our final product is at risk. In the software supply chain, the development organization is similar to the manufacturer. Inside, we have the different stages of the process.
The process starts with development, goes through integration, and ends with deployment. Each of these stages relies, again, on third-party libraries, tools, and dependencies. It’s clear that we need an approach to secure all these dependencies. That’s why compliance and security is not a static layer in our software supply chain: it’s dynamic, and it has to be integrated into each step of the process. Based on this overview and understanding of the software supply chain, we can create different risk categories. The first category relates to third-party libraries and tools. The second relates to our internal development. Then we have the risks related to delivery and deployment, and finally governance and security testing.
Address Mitigation Strategies for Third-Party Risks
Now I’m going to talk in detail about all these categories. Let’s start with the first one, our software development chain with all these components. The first thing we need to understand and ask when we work with third-party libraries is that they need to be certified. In this case, we can make sure that the final product developed on top of these libraries is also going to be protected. Then we can integrate software composition analysis. This approach will help us mitigate the issues and risks related to third-party libraries and tools. Software composition analysis involves multiple steps.
The first component is dependency analysis, which analyzes all our dependencies. Then there is vulnerability detection: the tool integrates with a vulnerability database, which makes it possible to monitor and understand whether there are any issues in our pipeline. Then there is also a module responsible for license compliance. In our organization, we usually have a private artifact repository, and we have a version control system. Our journey starts with fetching these dependencies and building the project. The software composition analysis tool helps us analyze all the dependencies we have there. The next step is the build pipeline. We can integrate a job in this pipeline that monitors all these dependencies and notifies us if there are any issues.
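A minimal sketch of such a pipeline gate, in plain Java, is shown below. The coordinates and the blocklist are hypothetical placeholders; a real SCA tool works from a continuously updated vulnerability database rather than a hard-coded map.

```java
import java.util.List;
import java.util.Map;

// Minimal sketch of an SCA-style pipeline gate: compare declared
// dependencies against a blocklist of known-vulnerable versions.
// In a real setup the blocklist would come from a vulnerability
// database feed, not a hard-coded map.
public class DependencyGate {

    // Hypothetical known-vulnerable coordinates ("group:artifact" -> version).
    private static final Map<String, String> KNOWN_VULNERABLE = Map.of(
            "org.apache.logging.log4j:log4j-core", "2.14.1");

    public static void main(String[] args) {
        // Hypothetical dependency list; in CI this would be parsed
        // from the build manifest or a generated SBOM.
        List<String[]> dependencies = List.of(
                new String[]{"com.fasterxml.jackson.core:jackson-databind", "2.17.1"},
                new String[]{"org.apache.logging.log4j:log4j-core", "2.14.1"});

        boolean failed = false;
        for (String[] dep : dependencies) {
            String vulnerable = KNOWN_VULNERABLE.get(dep[0]);
            if (vulnerable != null && vulnerable.equals(dep[1])) {
                System.err.println("Vulnerable dependency: " + dep[0] + ":" + dep[1]);
                failed = true;
            }
        }
        // A non-zero exit code fails the pipeline stage and triggers notification.
        System.exit(failed ? 1 : 0);
    }
}
```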
Now we can go even further and build even more layers of security around third-party libraries. Let’s imagine that we would like to start working on some new feature, and we need a new third-party library that is not available in our repository. We have the development team, the cyber team, and our supplier; in this case, the supplier is whoever delivers the third-party libraries and tools. A developer selects the component that needs to be integrated into our private artifact repository, picking it from a public artifact repository. It is first added to an intermediate repository, where we trigger the vulnerability scanning I mentioned earlier, along with license scanning.
Only after we perform the vulnerability scan and license compliance check, and the library is ready for further checks, can we promote it into the next, secured repository. This repository is continuously scanned by a monitoring tool, which tries to identify new vulnerabilities and re-checks the licenses we have. Once this validation gives us a good sign, we can include the library in our development repository. This zero-trust dependency management really helps us minimize the risks related to dependencies and tools. Finally, at the end, we can execute verification: all security verification, SAST and DAST, and then perform a penetration test.
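As a hedged sketch of this quarantine-then-promote flow (all repository names and check functions here are hypothetical stand-ins for the scanner and artifact repository APIs):

```java
import java.util.List;

// Sketch of zero-trust dependency promotion: a new library moves from an
// intermediate (quarantine) repository to the secured repository only after
// every check passes. All checks and repositories here are hypothetical.
public class DependencyPromotion {

    record Artifact(String coordinates, String version) {}

    // Placeholder checks; in practice these call the SCA tool and the
    // license scanner integrated with the intermediate repository.
    static boolean vulnerabilityScanPassed(Artifact a) { return true; }
    static boolean licenseCompliant(Artifact a) { return true; }

    static void promote(Artifact a, String targetRepository) {
        // Placeholder for the artifact repository's promotion API call.
        System.out.println("Promoted " + a.coordinates() + ":" + a.version()
                + " to " + targetRepository);
    }

    public static void main(String[] args) {
        List<Artifact> quarantined = List.of(
                new Artifact("org.example:payments-client", "1.4.0"));

        for (Artifact a : quarantined) {
            if (vulnerabilityScanPassed(a) && licenseCompliant(a)) {
                promote(a, "secured-repo"); // next layer; re-scanned continuously
            } else {
                System.err.println("Rejected " + a.coordinates() + ", stays in quarantine");
            }
        }
    }
}
```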
Let’s summarize what we need to do to mitigate third-party dependency risks. We need to ask about licenses, and they need to be compliant. We should use only a private artifact repository, and build several layers of repositories if needed, depending on your business domain. Then, integrate continuous verification into your pipeline.
Secure Internal Development
Let’s go to internal development. Here there is a best practice in case you would like to improve the security of your development: try to integrate existing principles and standards. For example, in the FinTech industry there is a common set of rules and standards, PCI DSS. All payment processing domains should follow this standard. Let’s talk about it. First, the definition: it’s a set of standards that explains how we need to implement and build our network, among other things. This standard is super important for FinTech. There are six groups of requirements. One group is focused on network segmentation. Another relates to how you build access control to your system and your environment. Others relate to how you monitor your environments and your applications. There are also stages in the process if you would like to apply this standard in your organization.
The first step is discovery: we need to scope and analyze your infrastructure and landscape. What does that mean? You identify all components in the chain, and you analyze which types of data you store there. Based on this information, you can then apply different segmentation strategies. That’s number one: scoping and organization analysis. Then comes categorization. PCI DSS defines different categories of systems, depending on which data you store there. First is the CDE, the cardholder data environment: the environment where you process or store transactional data, everything related to transactions and cardholder data.
Then, connected-to: a separate system that doesn’t store any cardholder-related or customer data, but is connected to the cardholder data environment that processes or stores such data. Then you have security-impacting systems. A good example is configuration management, where you store configuration for a particular microservice or a particular customer. Finally, out-of-scope systems: out-of-scope means the system is not subject to PCI DSS. Such a system doesn’t contain any cardholder data and can be completely isolated from the main environment. The next step is to implement all this segmentation and these controls. Then we need to implement validation and maintain the segmentation. For example, in our industry, two times per year we need to complete PCI DSS assessment. Every time, we need to update the documentation and show that we have a monitoring system in place. That’s why it’s very important.
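To make the categorization concrete, here is a simplified sketch of the four scope categories as a small classifier; the attributes and rules are illustrative assumptions, not the standard’s full scoping logic.

```java
// Simplified sketch of PCI DSS scope categorization. Real scoping follows
// the standard's detailed rules; the attributes below are illustrative.
public class PciScope {

    enum Category { CDE, CONNECTED_TO, SECURITY_IMPACTING, OUT_OF_SCOPE }

    record SystemProfile(String name,
                         boolean storesOrProcessesCardholderData,
                         boolean connectsToCde,
                         boolean affectsSecurityOfCde) {}

    static Category categorize(SystemProfile s) {
        if (s.storesOrProcessesCardholderData()) return Category.CDE;
        if (s.connectsToCde()) return Category.CONNECTED_TO;
        if (s.affectsSecurityOfCde()) return Category.SECURITY_IMPACTING;
        return Category.OUT_OF_SCOPE; // isolated, no cardholder data
    }

    public static void main(String[] args) {
        SystemProfile payments = new SystemProfile("payment-processing", true, true, true);
        SystemProfile config   = new SystemProfile("config-management", false, false, true);
        SystemProfile currency = new SystemProfile("currency-conversion", false, false, false);

        System.out.println(payments.name() + " -> " + categorize(payments)); // CDE
        System.out.println(config.name()   + " -> " + categorize(config));   // SECURITY_IMPACTING
        System.out.println(currency.name() + " -> " + categorize(currency)); // OUT_OF_SCOPE
    }
}
```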
Examine Real-World Case Study
I would like to show a real example. It’s a very interesting story about a project we have already started. Our company’s main business is processing transactions. All the systems we currently have were hosted in a private data center. We initiated a really complex project to migrate all our 100-plus application modules from the physical data center to the cloud. During this migration journey, we had to review all our current segmentation approaches and all our communication strategies. I will show a small slice of the architecture where we tried to apply all these principles and bring architectural improvements during the cloud migration. Let’s look at the holistic architecture. In payment processing, there are different layers of architecture. First, we have the input channel, where we obtain the transaction and send it to our payment processing gateway. There are different input channels: mobile devices, integrations with external websites, or integrations with external systems, with airlines, for example.
In this case, we have an environment where we need to consume these transactions. The input channel component receives the transaction from a physical terminal. Then, if you use a different currency, traveling, for example, from Europe to the U.S. or to other countries, you can be asked in which currency you would like to pay. For this currency conversion, there is a separate component, even a separate service, that is responsible and integrated with the payment industry: the currency conversion service. This component decides which option is better and how we are going to perform the exchange. Then we process the transaction.
In this case, the payment processing service is connected to one of the card schemes. Once we started to analyze the current architecture, what we had previously in our data center, the landscape was super complex. Sometimes there was a shared database approach, with 10 applications connected to one database. Of course, in the cloud it’s difficult to troubleshoot issues and implement new features in such a setup. That’s why we started thinking: let’s try to separate these components. Let’s remember which categories we have: the CDE, the cardholder data environment, connected-to systems, security-impacting systems, and out-of-scope systems.
Obviously, the transaction is received, first of all, by the input channel, then processed by the payment service, then sent to the card schemes. That means these two services go into the CDE bucket. Then we can separate the currency conversion service and move it independently to another zone; in this case, let’s assume we can include this service in a non-cardholder-data environment. What else do we need in order to build this separation and, first of all, move this service to the out-of-scope category? We need to implement access control, authentication, and encryption. The payment service cannot simply talk directly to the currency conversion service; we need to authenticate the service and implement an authorization mechanism. We also need to put it in a separate zone.
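As an illustrative sketch of that authenticated, encrypted service-to-service call (the endpoint and token source are hypothetical placeholders, not our actual implementation):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch: the payment service calls the currency conversion service over
// TLS with a bearer token, instead of an unauthenticated direct call.
// The URL and token source are hypothetical.
public class CurrencyConversionClient {

    public static void main(String[] args) throws Exception {
        // In practice the token comes from an identity provider
        // (e.g. a client-credentials flow); read from the environment
        // here for illustration only.
        String accessToken = System.getenv("CONVERSION_SERVICE_TOKEN");

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://currency-conversion.internal/convert?from=EUR&to=USD&amount=100"))
                .header("Authorization", "Bearer " + accessToken) // authentication
                .GET()
                .build();

        // HTTPS provides encryption in transit; the service validates the
        // token and applies authorization rules before answering.
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```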
I understand that there are many people here from different industries, so I tried to think about how you can use this information and apply it, let’s say, next week. Think from this categorization point of view, the separate categories we have in the FinTech industry under PCI DSS, and build the same categorization and segmentation levels in your own system. Let’s say we are talking about healthcare. We can build a separate environment for applications that store and process personal information. You can then store information related to health insurance or the health status of a person in that separate environment, and even in a separate application. Then, construction. I remember back in my experience in the construction domain, we had a microservice architecture, and all the services were deployed in one single zone.
Of course, from a communication and security point of view, that’s a really bad approach. In the construction domain, it’s better, again, to put all customer-related information in one service and one database, and separate it into a completely different environment. The same goes for real estate: put customer-related information in one database, even in another environment, and put all information about objects and real estate properties in a separate environment, because you also need to protect this information; it can be super important to competitors. In the energy sector, information such as telemetry and information about plants and manufacturers can also be separated into a completely different environment and zone. Those are cross-industry applications: how we can take this inspiration and apply it to other industries.
Approach number two also belongs to secure development practices: threat modeling. This approach has been applied successfully in my current company and in a previous one that worked on network security protection. What does it mean? There are three questions we need to answer. First, we need to understand what we are going to build, which potential issues we can have, and how we are going to mitigate them. The idea is that if you have a design or architectural process in your organization, you can integrate threat modeling at an early stage of development. That’s exactly what we did in my current company. It means that at this early stage, together with your development team, you can think through the potential vulnerabilities you have and mitigate them already in the draft version of the design and architecture.
It has helped us multiple times, because unaddressed threats mean reputation risk, development risk, and even additional costs for fixes later on. As for the key components, there is some similarity with PCI DSS: where PCI DSS has scoping, here we have asset identification. We analyze all the components we have in our system. Then we review the current threats and build mitigations for them. There are clear benefits. First, we can improve time to market, because we won’t spend additional time on testing, verification, and then fixing these issues. We can improve our application security. We can also use best practices and frameworks that already exist in this industry; there are many approaches. We applied the STRIDE approach to threat modeling several times.
I’m going to show you a DFD, a data flow diagram, which is created during the threat modeling process. With this diagram, you can identify your external boundaries, your internal systems, and the processes and storage. Then we map the issues we can have, identify the communication flows from one service to another, and try to build additional security layers. For example: what will authentication and authorization look like? Do we have any encryption there? Which type of information do we store in this database? Everything is then documented and reviewed together with our cyber experts and our architects, to make sure we are not introducing any additional risk. This approach can be automated with different tools.
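As a hedged sketch of how DFD elements can be mapped to candidate threats, here is a simplified STRIDE-per-element mapping; it is an illustration, not the output of any particular tool:

```java
import java.util.List;
import java.util.Map;

// Sketch: represent DFD elements and map each element type to the STRIDE
// threats typically considered for it. The mapping is a simplified
// illustration of STRIDE-per-element, not a complete methodology.
public class ThreatModelSketch {

    enum ElementType { EXTERNAL_ENTITY, PROCESS, DATA_STORE, DATA_FLOW }

    record Element(String name, ElementType type) {}

    // STRIDE: Spoofing, Tampering, Repudiation, Information disclosure,
    // Denial of service, Elevation of privilege.
    private static final Map<ElementType, List<String>> STRIDE = Map.of(
            ElementType.EXTERNAL_ENTITY, List.of("Spoofing", "Repudiation"),
            ElementType.PROCESS, List.of("Spoofing", "Tampering", "Repudiation",
                    "Information disclosure", "Denial of service", "Elevation of privilege"),
            ElementType.DATA_STORE, List.of("Tampering", "Repudiation",
                    "Information disclosure", "Denial of service"),
            ElementType.DATA_FLOW, List.of("Tampering",
                    "Information disclosure", "Denial of service"));

    public static void main(String[] args) {
        // Hypothetical DFD for the payment example in this talk.
        List<Element> dfd = List.of(
                new Element("input-channel", ElementType.EXTERNAL_ENTITY),
                new Element("payment-service", ElementType.PROCESS),
                new Element("transactions-db", ElementType.DATA_STORE),
                new Element("payment->conversion", ElementType.DATA_FLOW));

        // For each element, list the threats to review with the cyber team.
        for (Element e : dfd) {
            System.out.println(e.name() + ": " + STRIDE.get(e.type()));
        }
    }
}
```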
There is even automation from Microsoft, and it’s possible to use an AI approach to analyze and build a list of the risks you can potentially have. Once we applied this approach, we were able to identify some potential vulnerabilities that had not been identified during penetration testing. That was really a red flag for us, and we immediately worked to resolve these issues, which were related to service-to-service communication and the data we send there. These two big issues were identified precisely because of this process.
Let’s summarize how we can mitigate internal development risks. First of all, one more time: apply existing security standards. If you’re in healthcare, try to apply your industry’s standards, as I just explained. Also: security review, really good code review, and threat modeling.
Mitigate Software Delivery and Governance Risks
Let’s move to the software delivery and deployment, and governance and security testing categories. I would like to show you how we mitigate the delivery risks we face during deployment. Let’s go back one more time to our development organization process, with its different stages. The first issue can occur at the version control system stage: we can accidentally expose a credential or secret. There have been so many examples of secrets found in public repositories on GitLab and GitHub. It can be a really big issue for all the systems we have.
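One complementary guardrail, not named in the talk but commonly used alongside the secret management described next, is a simple pre-commit scan for credential-like strings. A minimal sketch, with illustrative patterns only:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Sketch of a pre-commit secret scan: flag credential-like strings before
// they reach the version control system. Patterns are illustrative only;
// real scanners ship far more complete rule sets.
public class SecretScan {

    private static final List<Pattern> PATTERNS = List.of(
            Pattern.compile("AKIA[0-9A-Z]{16}"), // AWS access key id shape
            Pattern.compile("(?i)(password|secret|api[_-]?key)\\s*[:=]\\s*\\S+"));

    public static void main(String[] args) throws Exception {
        try (Stream<Path> files = Files.walk(Path.of(args.length > 0 ? args[0] : "."))) {
            files.filter(Files::isRegularFile).forEach(SecretScan::scan);
        }
    }

    private static void scan(Path file) {
        try {
            for (String line : Files.readAllLines(file)) {
                for (Pattern p : PATTERNS) {
                    if (p.matcher(line).find()) {
                        System.err.println("Possible secret in " + file + ": " + line.trim());
                    }
                }
            }
        } catch (Exception ignored) {
            // Binary or unreadable file; skipped in this sketch.
        }
    }
}
```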
We can mitigate this issue with secret management. Let’s say we are building some software together right now, building a new feature. There are many secrets management tools available for our platform. Some are platform agnostic, so we don’t care which cloud provider we use; some are cloud-provider or container native; and some are DevOps focused. In a separate category, I added identity management systems. They are not secrets tools as such, but they are the first layer of how we protect our access. In our cloud migration, we are deploying everything we have in the data center to Azure.
In this case, let’s select Azure Key Vault as the secret manager. Then we can move to the build stage. Here there is a risk that our build infrastructure can be compromised. We can use additional security controls, which all version control systems, such as GitHub or GitLab, provide. We can also implement SAST and DAST, static and dynamic application security testing. For static testing, we have SonarQube. For dynamic testing, we have Acunetix and Qualys. Let’s say that for security controls we select SonarQube and Acunetix; that’s what we use in our current company. Then, the package stage and insecure artifacts. For insecure artifacts, the mitigation is the zero-trust approach I explained previously, together with software composition analysis, which can also be integrated here. Another approach is code signing. There are different tools for this: Cosign, Notary, pipeline code signing in Azure. We are going to select Cosign.
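A minimal sketch of fetching a secret from Azure Key Vault with the Java SDK (assuming the azure-identity and azure-security-keyvault-secrets libraries; the vault URL and secret name are placeholders, and authentication is assumed to come from the environment via DefaultAzureCredential):

```java
import com.azure.identity.DefaultAzureCredentialBuilder;
import com.azure.security.keyvault.secrets.SecretClient;
import com.azure.security.keyvault.secrets.SecretClientBuilder;
import com.azure.security.keyvault.secrets.models.KeyVaultSecret;

// Sketch: read a secret from Azure Key Vault at runtime instead of keeping
// it in the repository. Vault URL and secret name are hypothetical.
public class KeyVaultExample {
    public static void main(String[] args) {
        SecretClient client = new SecretClientBuilder()
                .vaultUrl("https://my-vault.vault.azure.net") // placeholder vault
                .credential(new DefaultAzureCredentialBuilder().build())
                .buildClient();

        // The pipeline or service identity must have "get" permission
        // on secrets in this vault.
        KeyVaultSecret secret = client.getSecret("payment-db-password");
        System.out.println("Fetched secret: " + secret.getName());
        // Never log secret.getValue() in real code.
    }
}
```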
Then let’s move to the testing and deployment stages, where the risk is insufficient security testing. I have seen multiple times that we don’t pay enough attention here: there are no real security test cases available to complete the final verification of your software. That’s why it’s a really good approach to first integrate SAST and DAST and, depending on your industry, also a penetration test for your organization. This approach was applied even earlier in previous companies, related to construction and to network and security verification testing. All these issues can be mitigated with security controls and secret management. One more point here: have you already integrated a secret management tool into your pipeline? There is also verification. It’s very important that the keys you keep in this tool have an expiration date. Otherwise, it won’t be compliant if you use any tool that is integrated with your environment and can monitor it.
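As a hedged sketch of that expiration check against Azure Key Vault (same SDK assumptions as above; the identity running it needs the list permission on secrets):

```java
import com.azure.identity.DefaultAzureCredentialBuilder;
import com.azure.security.keyvault.secrets.SecretClient;
import com.azure.security.keyvault.secrets.SecretClientBuilder;
import com.azure.security.keyvault.secrets.models.SecretProperties;

// Sketch: audit a Key Vault and flag secrets that have no expiration date,
// so the compliance gap is caught before an external monitor reports it.
public class SecretExpiryAudit {
    public static void main(String[] args) {
        SecretClient client = new SecretClientBuilder()
                .vaultUrl("https://my-vault.vault.azure.net") // placeholder vault
                .credential(new DefaultAzureCredentialBuilder().build())
                .buildClient();

        for (SecretProperties props : client.listPropertiesOfSecrets()) {
            if (props.getExpiresOn() == null) {
                System.err.println("Non-compliant, no expiry set: " + props.getName());
            }
        }
    }
}
```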
Hands-On Demo
Now I would like to go to the demo that I prepared. Specifically, I’m going to focus on mitigation for third-party libraries, and show you how we can generate a software bill of materials (SBOM) and use this artifact in our verification and analysis. Here, I have a simple project in GitHub: a really simple microservice with some dependencies. In the pipeline, we have two different stages: build and generate the software bill of materials, and then a stage that integrates with Snyk. First, we generate the software bill of materials; then we use this artifact for further scanning and verification. That’s the software composition analysis part. There is also a dashboard for this tool. Right now, I don’t have any critical or high vulnerabilities. I also integrated Nexus Repository, currently running on my EC2 instance. Here we have the different types of repositories that I created. The first is the Maven Central repository.
There is a GitHub repository here, with a pipeline integrated. There are multiple stages. First, we have the Snyk scan integration. I’m going to trigger this build right now. Then, I have the integration with the dashboard; again, there are no high or critical vulnerabilities. There are multiple repositories. This repository holds the dependencies I have in the current project. Then I have a separate repository where I publish the artifact that I build. Here you can see the separation of these two repositories, and the EC2 security group configuration. Now I’m going to change the pom configuration. Right now, everything is green. Here, I’m going to introduce a known vulnerable dependency, a Log4j dependency, and see how this tool behaves and how it shows up in the dashboard. I’m going to uncomment this dependency and trigger a build. The build started, and it completed. Now you can see that new issues were introduced.
Based on this artifact that was created, the tool is integrated and continuously analyzes my software bill of materials. Next, I’m going to remove this dependency and generate the file one more time. In the end, it’s a big artifact: a big XML or JSON file with all the dependencies you have in the application. This file, as you can now see, is already integrated into the pipeline. You can even build some business logic on top of the current pipeline: you can establish continuous monitoring, share this file to trigger a compliance check, and use the outcome for your regulation and compliance process. I remember, back in one of my past projects, the compliance team asked the development team: can you please create an Excel file and put all the dependencies in it? We were really surprised; that’s really manual work. It’s better to implement and integrate a software bill of materials, and then have a stage in the pipeline that the security and risk team can analyze and approve. At the end, this issue is mitigated and resolved, and the dashboard is green.
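As a sketch of such business logic on top of the SBOM (assuming a CycloneDX-style JSON file named bom.json and the Jackson library; the blocklist is a hypothetical placeholder for a query against the SCA tool or a vulnerability database):

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.File;
import java.util.Set;

// Sketch: parse a CycloneDX-style SBOM (bom.json) and fail the pipeline
// stage if a blocked component appears. The blocklist is illustrative;
// a real gate would query a vulnerability database or the SCA tool's API.
public class SbomComplianceCheck {

    private static final Set<String> BLOCKED = Set.of("log4j-core@2.14.1");

    public static void main(String[] args) throws Exception {
        JsonNode bom = new ObjectMapper().readTree(new File("bom.json"));

        boolean failed = false;
        for (JsonNode component : bom.path("components")) {
            String id = component.path("name").asText() + "@"
                    + component.path("version").asText();
            if (BLOCKED.contains(id)) {
                System.err.println("Blocked component found: " + id);
                failed = true;
            }
        }
        // A non-zero exit fails the compliance stage for the risk team to review.
        System.exit(failed ? 1 : 0);
    }
}
```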
Questions and Answers
Nikolai: You have a Snyk scan step, but what if a vulnerability in a dependency is found after the build finished and it was already deployed? Do you continuously rescan all your dependencies, and then notify and rebuild all the services which depend on this dependency?
Brodskyi: A question about the integration, about how we automate, and how we’re going to notify and protect our next deployment step.
Nikolai: Not the next one, but if it’s already in production, and the next day we find some zero-day vulnerability in a dependency which we already deployed.
Brodskyi: In this case, you need to establish patch management and make sure that your organization is able to run a process where you can mitigate and deliver the fix as soon as possible. It comes down to your patch process. When this happens, our organization has an SLA: we need to react within that period of time. If that doesn’t happen, it becomes a problem, a reputation risk for our company. That’s the patch process we have: an SLA for how fast we react, and for what the mitigation is going to be.
Nikolai: But to know that you have this vulnerability, how do you go about that?
Brodskyi: To know it, because of PCI DSS, we need to have really strong monitoring. We have a monitoring system that notifies each team immediately, all the development teams. This monitoring tool is integrated with all the other notification channels we have: Teams, for example, emails, and so on. First, the operational support team receives the alarm; then the development teams receive the notification.
Nikolai: More practically: for example, I have a container inside my private registry. I know that AWS Inspector, for example, continuously scans containers if you keep them in their registry. As soon as it finds a vulnerability in your container, you can configure a notification pipeline that sends you a message. Then you can rebuild your artifact with a fresh dependency and deploy it again. How do you do that? What tools do you use?
Brodskyi: For container scanning, we use an Azure tool that we integrated there. For the applications that are not in the cloud yet, we use Acunetix, Qualys, SonarQube, and, of course, a penetration test when we are going to release a very business-critical update.
Shashi: The DORA regulations are coming into force next year and have to be adopted by, I think, all European companies regardless of industry. These pipelines which you showed us, will they have to be adapted and become faster, because the SLAs might be much shorter under this regulation? If yes, is there already something going on with these architectures which you have just shown in this talk?
For example, in our company, we use Black Duck for software composition analysis. Because we have C++ based libraries, and some of them take really long to build, we build them locally on our infrastructure. Let’s say a CVE is found, like the previous question, a zero-day CVE: how would we use what you showed us just now to be compliant with the DORA regulations, so that a new patch is immediately created and delivered to the end customer?
Brodskyi: There is a new regulation coming to the FinTech industry, DORA, and the question is also how the pipeline is going to be adapted.
Regarding the DORA regulation, it’s particularly about resilience: how your system and platform are going to be resilient, how you deploy, and also platform security. Regarding deployment, for sure, right now we are working to improve it. Because of the cloud migration, we integrated all these DevOps principles in order to speed up. The latest example: completing verification of our big APM (alternative payment method) processing application used to take several hours; during this cloud migration we optimized the processing strategy and the feedback loop in the pipeline, and with GitLab plus Argo CD we are able to speed up our deployment. In our company, the DORA regulation runs in parallel, because we are making these improvements not because of this regulation, but because of our big cloud migration journey and to improve speed to market.
Regarding vulnerabilities in production: we have a very comprehensive monitoring tool, and our support team watches it at all times. We also have notifications. Once they receive one, we react immediately. The relevant scrum team, depending on the application or microservice, focuses on that particular vulnerability. Then the fix is delivered: we use all these tools in our pipeline to verify it, and then it is deployed through our patch process as a hotfix deployment.