LLM and Security

  • 31 August 2023
  • 1 reply

I know that GenAI is a very popular topic in the security industry right now. But I am curious: how does everyone see this technology becoming useful in the security (or, more specifically, the cloud security) world? To me, it’s about helping security teams do more with less: using the technology to automate responses to the tasks that take up engineers’ time. Or it could be the first level of triage for an alert, one that can respond and knows the specific action to take. Other thoughts?




Using Lacework/Operationalizing



1 reply

LLMs in security are a really interesting problem to solve.

What not to do:  

Use the Internet as a data source to insert remediation recommendations based on ChatGPT/OpenAI.

This is what other security companies have done (NOT Lacework), to everybody’s detriment. We know from Red Canary MDR and other threat-hunting organizations that hackers are currently using OpenAI/ChatGPT to scan for reusable gaps in security software. Throughout the history of cybersecurity, whenever a new innovation arrives, attackers have had the first advantage in re-exploiting old vulnerabilities in creative ways. Well-financed state actors have the resources to exploit old zero-day exploits in new ways, using combinations of bad advice and the accurate prediction that enterprises (individual software engineers) will get lazy and copy and paste code from Stack Overflow and other repos without considering the wider implications in new contexts. Because Large Language Models like ChatGPT mine threat-stack and development exchanges (all the known bads), these LLMs have already been contaminated with what NOT to do (or redo) and are beginning to re-recommend “terrible code” examples for new use cases.

There should be a bug bounty to find that first example where a security company’s remediation results, fueled by LLMs (and the Internet), are not only instantly dated or exploitable but perhaps neutral or even flatly wrong, given our understanding of technology that has evolved since 2021. Companies that take the LLM path to security recommendations will be injecting bad practices into their foundational recommendations for remediation and will have difficulty going back to clean up. Threat actors know this and are watching for these mistakes.

On a positive note, here is what can and should be done with LLMs. The industry needs to police itself, using LLM, OpenAI, and ChatGPT checkers to evaluate whether security results or code are being re-used that shouldn’t be. Even if it is only one subroutine or method, the industry needs to incorporate SCA or DAST checker tools to better arm our security allies to be part of the solution and not victims.
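To make the "checker" idea a bit more concrete, here is a minimal sketch of one way such a re-use check could work: fingerprint a snippet after stripping comments and whitespace, then compare it against a denylist of known-bad snippets. Everything here is an assumption for illustration only (the names fingerprint, KNOWN_BAD, and is_reused_bad_code are hypothetical, and the one-entry denylist is made up); a real SCA or SAST tool does far more than exact-match hashing.

```python
import hashlib
import re


def fingerprint(snippet: str) -> str:
    """Hash a snippet after removing '#' comments and all whitespace.

    Naive on purpose: a '#' inside a string literal would also be cut,
    which is acceptable for a sketch but not for a real scanner.
    """
    no_comments = re.sub(r"#.*", "", snippet)
    normalized = re.sub(r"\s+", "", no_comments)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical example of a known-bad pattern: SQL built by string concatenation.
BAD_EXAMPLE = 'cursor.execute("SELECT * FROM users WHERE name = \'" + name + "\'")'

# Hypothetical denylist; in practice this would be fed from a curated source.
KNOWN_BAD = {fingerprint(BAD_EXAMPLE)}


def is_reused_bad_code(snippet: str) -> bool:
    """Return True if the snippet matches a known-bad fingerprint."""
    return fingerprint(snippet) in KNOWN_BAD


candidate = '''
# copied from a forum answer
cursor.execute("SELECT * FROM users WHERE name = '"   +   name + "'")
'''
print(is_reused_bad_code(candidate))
```

Exact-match fingerprints only catch copy-and-paste re-use with cosmetic changes; renamed variables or restructured code would slip through, which is why this is a starting point and not a substitute for proper scanning.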

Some rules we, as an industry, could follow:

  1. Don’t use generative-AI engines recursively on previously built code (which is likely insecure), as the company can fall victim to repeated attack patterns, e.g. SQL injection and DDoS memory-type attacks. If feeding models, don’t train LLMs to suppress outlier signals and minimize anomaly detection; hackers are looking for those signals to exploit.
  2. AGI models need guardrails. Realize that artificial generative intelligence will initially get exploited more often for evil than for good, because right now the only guardrails we have are people, and people make mistakes. So don’t let AGI train AGI without guardrails, or else mistakes will propagate and cross-contaminate other LLMs. Lacking remediation policies to shut down or block spikes and growth of AGI will at worst cost the company cloud-spend money or, even worse, bring us closer to the Singularity (not good). “Fail them (the models) on spiked usage” and don’t let the LLMs escape enclosed environments to publish without vetting. It’s like when a novel invasive species exploits a fertile new ecosystem: the wild ecosystem becomes a monoculture of that LLM’s results, and the diversity and creativity (the biodiversity) that make the Internet useful will fail. If AGIs spread to the open Internet beyond the bounds of company or government control, we risk diminishing the value of the World Wide Web.
  3. Plagiarism, IP law, and seriously stupid results undermine ownership. Just as we all want to SAST-scan public repos for bad security practices, we also need to protect companies from exposing webhooks that can contaminate or taint their proprietary development environments (this goes beyond ISO 2700x).
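As a rough illustration of the “fail them on spiked usage” guardrail in rule 2, here is a minimal sketch of a circuit breaker that halts a model once its usage jumps well above a rolling baseline. The class name UsageSpikeGuard, the window size, and the spike factor are all hypothetical choices for this example, not anything a particular product implements.

```python
from collections import deque


class UsageSpikeGuard:
    """Trip (and stay tripped) when usage exceeds spike_factor x the rolling average."""

    def __init__(self, window: int = 5, spike_factor: float = 3.0):
        self.history = deque(maxlen=window)  # recent per-interval request counts
        self.spike_factor = spike_factor
        self.tripped = False

    def record(self, requests_this_interval: int) -> bool:
        """Record one interval's usage; return True if the model may keep running."""
        if self.tripped:
            return False  # stays failed until a human resets it
        # Only judge spikes once a full window of baseline data exists.
        if len(self.history) == self.history.maxlen:
            baseline = sum(self.history) / len(self.history)
            if requests_this_interval > baseline * self.spike_factor:
                self.tripped = True
                return False
        self.history.append(requests_this_interval)
        return True


guard = UsageSpikeGuard(window=3, spike_factor=3.0)
for count in [10, 12, 11, 100, 10]:
    print(count, guard.record(count))
```

The guard deliberately does not reset itself: once it trips, a person has to look at what happened before the model runs again, which matches the point that people are currently the only guardrails we have.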

I love the possibility of leveraging the next generation of AI to feed LLMs to create better guardrails and synthesize known exploits/vulnerabilities to better stamp out security problems. Crowdsourcing and security exchanges like CISA and Sp4rkCon are beginning to codify best practices across institutional boundaries and talk about the explosion of LLMs in security. There are working groups on LLMs, and publications are just beginning to shed light on some threats LLMs can pose (paywalled), but check out the references section in this article (not paywalled); there are some positive aspects cited too. I know Lacework customers are using LLMs for fun and profit already, so let’s hear from y’all on the positives.