The use case for AI in your security pipeline

This is the first blogpost in a series on how we leverage modern NLP technology to enrich tooling data for many different use cases.

Introduction

Security tools are great. They save us time in finding potential issues and often give a pretty decent idea of how secure an application or infrastructure is. However, security findings from scanners or manual sources such as pentests rarely carry all the information the builders or defenders need in order to adequately fix the issue. Vulnerability scanners lack the context of the vulnerability, and the energy of tool authors is often better invested in improving the quality and variety of the findings than in reporting how to remediate them. But defenders need this information in order to correctly fix the issue, educate the creators of those issues on how to avoid them, and clearly and concisely communicate which policies, procedures and potentially regulations are being breached because that specific issue exists.

Most often, this information is either added manually in the triage process or communicated, at a considerable cost in time, in a prioritisation conversation between the security team and the development team. How often have you had to explain to a development team: “Please prioritise issue X-123 because it’s a SQLi and not only are we in danger of getting hacked but also we could fail PCI due to its existence”? Even then, the development team does not know how you want them to fix the issue. They can search for it, and more often than not they end up with a Stack Overflow solution from years ago or with incorrect advice. How often have you seen fixes that address the specific instance of the issue instead of the root cause? That is not for lack of effort from the development team but for lack of understanding of the general problem, or lack of training resources on how to fix it.

As security engineers we try to solve these issues by curating knowledge bases, maintaining training material and sending links to internal resources to development teams again and again and again. But this is inefficient and boring, and our time and expertise are better used on more interesting tasks. Wouldn’t it be great if this could be automated? At Smithy this is exactly what we did! We created a feature, available initially only on Smithy-SaaS, which can enrich any security tool finding with arbitrary information based on user preference. This information can include:

  • The severity of the vulnerability
  • The potential impact of the vulnerability
  • Remediation advice
  • References to relevant documentation
  • Training resources
  • etc

This information is automatically added to security findings, so that the development team has all the information they need to fix the issue correctly and efficiently.

Approach

One use-case that several GPT models are good at is finding connections between concepts. These connections are not always optimal, since models lack human context, but with human intervention we can improve them over time. Once we have identified these connections, we need to store them in a way that allows us to link the pieces of information to each other. One of the projects that Smithy is passionate about and actively sponsors is opencre.org. OpenCRE is a semantic web of security information, and its data model allows us to easily find transitive links between any two pieces of information. Additionally, since opencre.org mappings are human-reviewed, we gain quality information quickly and easily.

By merging the concepts above, we end up with a strong, fast and accurate correlation engine. This engine allows us to enrich any output from any tool with relevant information from any other source we know about.
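
To make the idea of transitive links concrete, here is a minimal sketch of a breadth-first search over a small graph of linked security concepts. It is not how OpenCRE or Smithy implement the correlation; the node names, the `CRE:injection-protection` label and the `LINKS` list are hand-written assumptions for illustration only.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical, hand-written links. Real mappings would come from a
# human-reviewed source; these node names are made up for the example.
LINKS: List[Tuple[str, str]] = [
    ("CWE-89", "CRE:injection-protection"),
    ("CRE:injection-protection", "OWASP Cheat Sheet: SQL Injection Prevention"),
    ("CRE:injection-protection", "Example policy: secure coding requirement"),
]


def build_graph(links: List[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from pairs of linked concepts."""
    graph: Dict[str, Set[str]] = {}
    for a, b in links:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def find_path(graph: Dict[str, Set[str]], start: str, goal: str) -> Optional[List[str]]:
    """Return a shortest chain of links from start to goal, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # Sort neighbours so the traversal order is deterministic.
        for neighbour in sorted(graph[node]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


graph = build_graph(LINKS)
path = find_path(graph, "CWE-89", "OWASP Cheat Sheet: SQL Injection Prevention")
```

In this toy graph the CWE and the cheat sheet are never linked directly; the search connects them through the shared intermediate node, which is the kind of transitive link described above.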

Use cases

We are using this approach to create several Enrichers. In this blogpost we showcase the need for a knowledge base enricher.

Pipeline with the knowledge base enricher

Most security tools rightly focus on delivering accurate and useful results, for example:

//bandit.json
{
  "code": "10 ' WHERE \"username\"=\"admin\" OR 1=%s --'\n11 User.objects.annotate(val=RawSQL(raw, [0]))\n",
  "col_offset": 26,
  "end_col_offset": 42,
  "filename": "/tmp/bandit/examples/django_sql_injection_raw.py",
  "issue_confidence": "MEDIUM",
  "issue_cwe": {
    "id": 89,
    "link": "https://cwe.mitre.org/data/definitions/89.html"
  },
  "issue_severity": "MEDIUM",
  "issue_text": "Use of RawSQL potential SQL attack vector.",
  "line_number": 11,
  "line_range": [
    11
  ],
  "more_info": "https://bandit.readthedocs.io/en/1.7.5/plugins/b611_django_rawsql_used.html",
  "test_id": "B611",
  "test_name": "django_rawsql_used"
},
//kicks.json
{
        "query_name": "AD Admin Not Configured For SQL Server",
        "query_id": "b176e927-bbe2-44a6-a9c3-041417137e5f",
        "query_url": "https://docs.ansible.com/ansible/latest/collections/azure/azcollection/azure_rm_sqlserver_module.html#parameter-ad_user",
        "severity": "HIGH",
        "platform": "Ansible",
        "cloud_provider": "AZURE",
        "category": "Insecure Configurations",
        "description": "The Active Directory Administrator is not configured for a SQL server",
        "description_id": "afa96f09",
        "files": [
                {
                        "file_name": "../../code/assets/queries/ansible/azure/sql_server_predictable_admin_account_name/test/positive.yaml",
                        "similarity_id": "819786035c5ea3943e303532e747fabac9f8beaddddfd6ecb863382481c50c0c",
                        "line": 10,
                        "resource_type": "azure_rm_sqlserver",
                        "resource_name": "Create (or update) SQL Server2",
                        "issue_type": "MissingAttribute",
                        "search_key": "name={{Create (or update) SQL Server2}}.{{azure_rm_sqlserver}}",
                        "search_line": 0,
                        "search_value": "",
                        "expected_value": "azure_rm_sqlserver.ad_user should be defined",
                        "actual_value": "azure_rm_sqlserver.ad_user is undefined"
                },
                {
                        "file_name": "../../code/assets/queries/ansible/azure/ad_admin_not_configured_for_sql_server/test/positive.yaml",
                        "similarity_id": "e2acb4c0c4afe11e1908953bb768bab35e0ca53d4d20bc6af02f1c5350cbe587",
                        "line": 3,
                        "resource_type": "azure_rm_sqlserver",
                        "resource_name": "Create (or update) SQL Server",
                        "issue_type": "MissingAttribute",
                        "search_key": "name={{Create (or update) SQL Server}}.{{azure_rm_sqlserver}}",
                        "search_line": 0,
                        "search_value": "",
                        "expected_value": "azure_rm_sqlserver.ad_user should be defined",
                        "actual_value": "azure_rm_sqlserver.ad_user is undefined"
                }
        ]
},
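
The two samples above use different field names for the same ideas (rule, severity, location), so any enrichment step first needs a common shape to work on. The following is a minimal sketch of such a normalisation, assuming a small made-up schema; the function names, the output keys and the `"iac-scanner"` label are hypothetical and are not Smithy's actual data model. Only input fields that appear in the samples above are read.

```python
from typing import List


def from_bandit(result: dict) -> dict:
    """Map one bandit result onto a small, hypothetical common schema."""
    return {
        "tool": "bandit",
        "rule": result["test_id"],
        "title": result["issue_text"],
        "severity": result["issue_severity"],
        # bandit reports a CWE id, which is a convenient key for enrichment.
        "cwe": result.get("issue_cwe", {}).get("id"),
        "location": f'{result["filename"]}:{result["line_number"]}',
    }


def from_iac_query(query: dict) -> List[dict]:
    """Expand one query from the second sample into one finding per file."""
    return [
        {
            "tool": "iac-scanner",  # placeholder label for the second tool
            "rule": query["query_id"],
            "title": query["query_name"],
            "severity": query["severity"],
            "cwe": None,  # the second sample carries no CWE field
            "location": f'{entry["file_name"]}:{entry["line"]}',
        }
        for entry in query.get("files", [])
    ]


# Trimmed copies of the samples above (file path shortened for readability).
bandit_result = {
    "filename": "/tmp/bandit/examples/django_sql_injection_raw.py",
    "issue_cwe": {"id": 89},
    "issue_severity": "MEDIUM",
    "issue_text": "Use of RawSQL potential SQL attack vector.",
    "line_number": 11,
    "test_id": "B611",
}
iac_query = {
    "query_name": "AD Admin Not Configured For SQL Server",
    "query_id": "b176e927-bbe2-44a6-a9c3-041417137e5f",
    "severity": "HIGH",
    "files": [{"file_name": "positive.yaml", "line": 3}],
}

findings = [from_bandit(bandit_result)] + from_iac_query(iac_query)
```

Note that the second finding ends up without a CWE, so it cannot be joined to other sources by identifier alone. Text-based correlation of the kind described in the Approach section is what would fill that gap.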

Even vendor solutions cannot possibly offer extensive documentation on every issue and its remediation.

With Dracon’s Knowledge Base enricher, this is trivial. First we need a pipeline:

Pipeline without the knowledge base enricher

Then we add our Knowledge Base enricher and define which knowledge base we want to use. Smithy-SaaS supports any knowledge base your organisation currently uses. In our example we use OWASP CheatSheets.

Pipeline with the knowledge base enricher set to Cheatsheets

Executing this pipeline gives us the following information:

Results

Using this functionality, all development teams can now automatically be shown where the most relevant information related to a finding is, including code samples on how to fix an issue.
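
As a rough illustration of what choosing a knowledge base could mean in code, the sketch below attaches references to a finding from whichever lookup table is passed in. It is an assumption-laden toy, not the enricher's real implementation: the `add_references` function, the CWE-keyed tables and the `wiki.example.com` address are all hypothetical, and the single OWASP entry was mapped by hand for this example.

```python
from typing import Dict, List, Optional

# A knowledge base is modelled here as CWE id -> list of reference URLs.
KnowledgeBase = Dict[int, List[str]]

# Hand-written mapping for illustration; only one entry is included.
OWASP_CHEATSHEETS: KnowledgeBase = {
    89: [
        "https://cheatsheetseries.owasp.org/cheatsheets/"
        "SQL_Injection_Prevention_Cheat_Sheet.html"
    ],
}

# Hypothetical internal knowledge base; the URL is a placeholder.
INTERNAL_WIKI: KnowledgeBase = {
    89: ["https://wiki.example.com/secure-coding/sql-injection"],
}


def add_references(finding: dict, knowledge_base: KnowledgeBase) -> dict:
    """Return a copy of the finding with matching references attached."""
    cwe: Optional[int] = finding.get("cwe")
    references = knowledge_base.get(cwe, []) if cwe is not None else []
    return {**finding, "references": list(references)}


finding = {"title": "Use of RawSQL potential SQL attack vector.", "cwe": 89}
with_cheatsheets = add_references(finding, OWASP_CHEATSHEETS)
with_wiki = add_references(finding, INTERNAL_WIKI)
```

Passing the table as a parameter keeps the enrichment step the same while the source of advice changes, which mirrors the idea of pointing the enricher at whichever knowledge base an organisation already uses.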

If you are interested in Smithy, let’s schedule a demo here.