JANUARY 30, 2025

DIANNA: Deep Instinct's Artificial Neural Network Assistant - Powered By Amazon Bedrock

Explore how Deep Instinct’s GenAI-powered malware analysis tool, DIANNA, the DSX Companion, leverages Amazon Bedrock to revolutionize cybersecurity explainability by providing rapid, in-depth analysis of known and unknown threats in minutes, enhancing the capabilities of SOC teams and addressing key challenges in the evolving threat landscape.

This post is co-written with Tzahi Mizrahi and Tal Panchek from AWS. A similar version was originally published on the AWS Machine Learning Blog.

Deep Instinct is a cybersecurity company that offers a state-of-the-art, zero-day data security (ZDDS) solution, “Data Security X” (DSX™). DSX safeguards data repositories across cloud, NAS, applications, and endpoints, delivering the industry's highest efficacy (>99%), lowest false positive rate (<0.1%), and explainability into never-before-seen threats.

Utilizing deep neural networks (DNNs), Deep Instinct prevents ransomware and known and unknown threats with unmatched accuracy and speed. This advanced form of AI significantly raises the cybersecurity bar and is ideally suited for large enterprises and critical infrastructure sectors such as finance, healthcare, and technology with a mandate to protect their data, in motion or at rest.

In this blog post, we explore how Deep Instinct’s GenAI-powered malware analysis tool, DIANNA, the DSX Companion, leverages Amazon Bedrock to revolutionize cybersecurity explainability by providing rapid, in-depth analysis of known and unknown threats in minutes, enhancing the capabilities of SOC teams and addressing key challenges in the evolving threat landscape.

The SecOps Challenge

There are two main challenges cybersecurity teams are facing: bad actors are outpacing security response, largely based on the the industries use of ML-based tools that “detect and respond” instead of preventing threats. And the flood of low-fidelity false positives that flood the SOC without context.

Fueled by the rise of DarkAI, bad actors are overwhelming SOC teams with a constant stream of sophisticated, novel malware and ransomware that are circumventing legacy security tools while triggering a massive amount of false positives that require investigation. This hampers proactive threat hunting and exacerbates team burnout. Most importantly, the surge in “alert storms” increases the risk of missing critical alerts. SOC teams need a solution that provides the explainability necessary to perform quick risk assessments regarding the nature of the attacks to make informed decisions.

Malware analysis is an increasingly critical and complex field. The challenge when analyzing zero-day attacks lies in the limited information explaining why a file was blocked and classified as malicious. Threat analysts are spending considerable time assessing whether an attack was truly malicious or just a false positive.

Let’s explore some of the key challenges that make malware analysis so demanding.

  1. Is it even malware?: Modern malware has become incredibly sophisticated in its ability to disguise itself. It often mimics legitimate software and files, making it challenging for analysts to distinguish between benign and malicious code. Some malware can even disable security tools or evade scanners, further obfuscating detection.
  2. Preventing zero-day threats: The rise of zero-day threats, which have no known signatures, adds another layer of difficulty. Identifying unknown malware is crucial, as failure can lead to severe security breaches and potentially incapacitate organizations.
  3. Information overload: Today’s powerful malware analysis tools can be a double-edged sword. While they offer solid explainability, they can overwhelm analysts with data, forcing them to sift through a digital haystack to find crucial indicators of malicious activity, making it easy to overlook critical compromises.
  4. Connecting the dots: Malware often consists of multiple components interacting in complex ways. Not only do analysts need to identify the individual components, but they also need to understand how they interact. This process is like assembling a jigsaw puzzle to form a complete picture of the malware's capabilities and intentions, with constantly changing pieces and an unclear final picture.
  5. Keeping up with bad actors: The world of cybercrime is constantly evolving, with attackers relentlessly developing new techniques and exploiting new vulnerabilities, leaving organizations struggling to keep up. The time between the discovery of a vulnerability and its exploitation in the wild is narrowing, putting pressure on analysts to work faster and more efficiently. This rapid evolution requires malware analysts to constantly update their skill set and tools.
  6. Racing against the clock: In malware analysis, time is crucial. Malicious software can spread rapidly across networks, causing significant damage in minutes, often before the organization realizes an attack has occurred. Analysts face the pressure of conducting thorough examinations while also providing timely insights to prevent or mitigate attacks.
The Solution

There is a critical need for malware analysis tools that provide precise, real-time, in-depth analysis for both known and unknown threats, easing SecOps efforts. Deep Instinct recognized this need and developed the Deep Instinct Artificial Neural Network Assistant (DIANNA), the DSX Companion.

DIANNA is a groundbreaking malware analysis tool powered by generative AI (GenAI), utilizing Amazon Bedrock as its LLM infrastructure to tackle real-world issues. It offers on-demand features that provide flexible and scalable AI capabilities tailored to the unique needs of each client. By concentrating our GenAI models on specific artifacts, we can deliver focused, comprehensive responses to effectively address this market need.

DIANNA’s Unique Approach

Unlike traditional methods that rely solely on retroactive analysis of existing data, DIANNA harnesses GenAI to gain the collective knowledge of countless cybersecurity experts, sources, blogs, papers, threat intelligence reputation engines, and chats. This extensive knowledgebase is embedded within the LLM, allowing DIANNA to delve deep into unknown files and uncover intricate connections that would otherwise go undetected.

At the heart of this process are DIANNA’s advanced translation engines, which transform complex binary code into natural language that LLMs can understand and analyze. This unique approach bridges the gap between raw code and human-readable insights, enabling DIANNA to provide clear, contextual explanations of a file's intent, malicious aspects, and potential system impact while addressing information overload by distilling vast data into concise, actionable intelligence.

This translation capability is key to connecting the dots between different components of complex malware. It allows DIANNA to identify relationships and interactions between various parts of the code, and offer a holistic view of the threat landscape. By piecing together these components, DIANNA can construct a comprehensive picture of the malware's capabilities and intentions, even when faced with sophisticated threats. DIANNA doesn't stop at simple code analysis—it goes deeper. The ability to turn information into natural language provides insights for the teams who are investigating why files have been flagged as malicious, streamlining what is often a lengthy process. These insights allow SOC teams to prioritize threats and focus on those that matter most.

Enhancing DIANNA with Amazon Bedrock

DIANNA’s integration with Amazon Bedrock allows us to harness the power of state-of-the-art language models while maintaining the agility to adapt to evolving client requirements and security considerations. DIANNA benefits from Amazon Bedrock’s robust features, including seamless scaling, enterprise-grade security, and the ability to fine-tune models for specific use cases.

Accelerating Development with Amazon Bedrock: The fast-paced evolution of the threat landscape necessitates equally responsive cybersecurity solutions. DIANNA’s collaboration with Amazon Bedrock has played a crucial role in optimizing our development process and speeding up the delivery of innovative capabilities. By leveraging Bedrock’s extensive collection of pre-trained foundation models, we were able to swiftly create a proof-of-concept (POC) that demonstrated the potential of GenAI in malware analysis. This initial success bolstered our confidence in the technology and enabled us to concentrate on further enhancing DIANNA’s features.

A Platform for Innovation and Comparison: Amazon Bedrock has also served as a valuable platform for conducting LLM-related research and comparisons. The service's versatility has enabled us to experiment with different foundation models, exploring their strengths and weaknesses in various tasks, leading to significant advancements in DIANNA’s ability to understand and explain complex malware behaviors.

Alongside its core functionalities, Amazon Bedrock provides a range of ready-to-use features for customizing the solution. One such feature is model fine-tuning, which allows users to train foundation models on proprietary data to enhance their performance in specific domains. For example, organizations can fine-tune an LLM-based malware analysis tool to recognize industry-specific jargon or detect threats associated with particular vulnerabilities.

Another valuable feature is the utilization of Retrieval Augmented Generation (RAG) enabling access to, and incorporation of, relevant information from external sources, such as knowledge bases or threat intelligence feeds. This enhances the model’s ability to provide contextually accurate and informative responses, improving the overall effectiveness of malware analysis.

Seamless Integration, Scalability, and Customization: Integrating Amazon Bedrock into DIANNA’s architecture was a straightforward process. Bedrock’s user-friendly API and well-documented interfaces facilitated seamless integration with our existing infrastructure. Furthermore, the service’s on-demand nature allows us to scale our AI capabilities up or down based on customer demand. This flexibility ensures that DIANNA can handle fluctuating workloads without compromising performance.

Prioritizing Data Security and Compliance: Data security and compliance are paramount in the cybersecurity domain. Amazon Bedrock’s enterprise-grade security features provide us with the confidence to handle sensitive customer data. The service’s adherence to industry-leading security standards, coupled with AWS’s extensive experience in data protection, ensures that DIANNA meets the highest regulatory requirements such as GDPR. By leveraging Amazon Bedrock, we can offer our customers a solution that not only protects their assets, but also demonstrates our commitment to data privacy and security.

By combining Deep Instinct’s proprietary prevention algorithms with the advanced language processing capabilities of Amazon Bedrock, DIANNA offers a unique solution that not only identifies and analyzes threats with high accuracy but also communicates its findings in clear, actionable language. This synergy between Deep Instinct’s expertise in cybersecurity and Amazon’s leading AI infrastructure positions DIANNA at the forefront of AI-driven malware analysis and threat prevention.

The following diagram illustrates DIANNA’s architecture.

Figure 1_Evaluating DIANNA's Malware Analysis.png

Evaluating DIANNA’s Malware Analysis

A key component of DIANNA’s success is testing its malware analysis capabilities and evaluating the results to ensure they remain at peak levels. In this test, the input is a malware sample, and the output is a comprehensive, in-depth report on the behaviors and intents of the file. However, generating ground-truth (GT) data is particularly challenging. The behaviors and intents of malicious files are not readily available in standard datasets and require expert malware analysts for accurate reporting. Thus, we needed a custom evaluation approach.

We focused our evaluation on two core dimensions:

  • Technical features evaluation: This dimension focuses on objective, measurable capabilities. We used programmable metrics to assess how well DIANNA handled key technical aspects, such as extracting Indicators of Compromise (IOCs), detecting critical keywords, and processing the length and structure of threat reports. These metrics allowed us to quantitatively assess the model’s basic analysis capabilities.
  • In-depth semantic evaluation: Since DIANNA is expected to generate complex, human-readable reports on malware behavior, we relied on domain experts (malware analysts) to assess the quality of the analysis. The reports were evaluated on:
    • Depth of information: Whether DIANNA provided a detailed understanding of the malware’s behavior and techniques.
    • Accuracy: How well the analysis aligned with the malware’s true behaviors.
    • Clarity and structure: Whether the report was organized in a clear and comprehensible manner for security teams.

Since human evaluation is labor-intensive, fine-tuning the key components (the model itself, the prompts, and the translation engines) required iterative feedback loops. Small adjustments in any component led to significant variations in the output, requiring repeated validations by human experts. The fine-tuning incorporated the following tasks and elements:

  • Gathering a malware dataset: To cover the breadth of malware techniques, families, and threat types, we collected a large dataset of malware samples, each with technical metadata.
  • Splitting the dataset: The data was split into subsets for training, validation, and evaluation. Validation data was continually used to test how well DIANNA adapted after each key component update.
  • Human expert evaluation: Each time we fine-tuned DIANNA’s model, prompts, and translation mechanisms, human malware analysts reviewed a portion of the validation data. This ensured that any improvements or degradations in the quality of the reports were identified early. As DIANNA’s outputs are highly sensitive to even minor changes, each update required a full reevaluation by human experts to verify whether the response quality was improved or degraded.
  • Final evaluation on a broader dataset: After sufficient tuning based on the validation data, we applied DIANNA to a large evaluation set. Here, we gathered comprehensive statistics on its performance to confirm improvements in report quality, correctness, and overall technical coverage.
Automation of Evaluation

To make this process more scalable and efficient, we introduced an automatic evaluation phase. We trained a language model specifically designed to critique DIANNA’s outputs, automating to some degree the assessment of how well DIANNA was generating reports. This critique model acted as an internal “oracle,” allowing for continuous, rapid feedback on incremental changes during fine-tuning. This enabled us to make small adjustments across DIANNA’s three core components (model, prompts, and translation engines) while receiving real-time evaluations of the impact of those changes.

The automated critique model enhanced our ability to test and refine DIANNA without having to rely solely on the time-consuming manual feedback loop from human experts. It provided a consistent, reliable measure of performance and allowed us to quickly identify which model adjustments led to meaningful improvements in DIANNA’s analysis.

Advanced Integration and Proactive Analysis

DIANNA is integrated with DSX to explain the zero-day threats that are detected and prevented. DSX’s proactive approach has extremely high accuracy and a low false positive rate, which, alongside explainability from DIANNA, helps security teams quickly identify unknown threats and allocate resources more effectively.

Additionally, DIANNA’s output streamlines investigations, minimizes cross-tool efforts, and automates repetitive tasks, making the decision-making process clearer and faster. Ultimately, organizations are able to strengthen their security posture and significantly reduce the mean time to triage.

Key Features and Benefits of DIANNA:
  • Performs on-the-fly file scans, allowing for immediate assessment without any prior setup or delays.
  • Generates comprehensive malware analysis reports for a variety of file types in just seconds, ensuring that users receive timely information about potential threats.
  • Streamlines the entire file analysis process, making it more efficient and user-friendly, reducing the time and effort required for thorough evaluations.
  • Supports a wide range of common file formats, including Office documents, Windows Executable files, script files, and Windows Shortcut files (.lnk), ensuring compatibility with various types of data.
  • Offers in-depth contextual analysis, malicious file triage, and actionable insights, greatly enhancing the efficiency of investigations into potentially harmful files.
  • Empowers SOC teams to make well-informed decisions without relying on manual malware analysis by providing clear and concise insights into the behavior of malicious files.
  • Eliminates the need to upload files to external sandboxes or VirusTotal, enhancing security and privacy while facilitating quicker analysis.
Explainability and Insights for Better Decision-making for SOC Teams

DIANNA sets itself apart from traditional AI tools that use LLM engines by providing insights into why unknown attacks are deemed malicious. Explaining a zero-day attack typically involves a lengthy process that can take hours or even days, often resulting in an incomplete understanding. While this approach has its merits, it mainly offers retroactive analysis with limited context. In contrast, DIANNA goes further than simple code analysis; it understands the intent and potential actions behind the code and delivers clear explanations of its malicious nature and potential impacts on systems. This allows SOC teams to focus on alerts and threats that truly matter.

Example Scenario of DIANNA in Action

In this section, we explore some DIANNA use cases and examples.

For example DIANNA can perform standalone malware analysis on malicious files.

The following screenshot is an example of a Windows executable file analysis.

Figure 2_Standalone Malware Analysis on Windows Executable File.png

The following screenshot is an example of an Office file analysis.

Figure_3_Standalone_Malware_Analysis_on_Office_File_.png

You can also quickly triage incidents with enriched data on file analysis provided by DIANNA. The following screenshot is an example using Windows shortcut files (LNK) analysis.

Figure 4_Standalone Malware Analysis on Windows Shortcut File.png

The following screenshot is an example with a script file (JavaScript) analysis.

Figure 5_Standalone Malware Analysis on JavaScript File.png

The following figure presents a before and after comparison of the analysis process.

Figure 6_Before and After Comparison of Analysis Process.png

A key advantage of DIANNA is its ability to provide explainability by connecting the dots and summarizing the intentions of malicious files in a detailed narrative. This is especially valuable for zero-day and unknown threats, where investigations start from scratch, without any clues.

Potential Advancements in AI-driven Cybersecurity

AI capabilities are enhancing daily operations, but adversaries are also using AI to create sophisticated attacks and advanced persistent threats (APTs). This leaves organizations, and particularly their SOC and cybersecurity teams, combatting more complex threats.

While detection controls are useful, they often require significant resources and can be ineffective on their own. In contrast, using AI engines for prevention controls, like a high-efficacy deep learning (DL) engine, can lower the total cost of ownership (TCO) and help SOC analysts streamline their tasks.

Conclusion

DSX predicts and prevents known, unknown, and zero-day threats in <20 milliseconds—750 times faster than the fastest ransomware encryption. This makes it essential for any security stack, offering comprehensive protection in hybrid environments.

“Time is money” holds true in the world of cybersecurity. DIANNA enhances the incident response process for SOC teams, allowing them to efficiently tackle and investigate zero-day attacks with minimal time investment. This, in turn, reduces the resources and expenses that CISOs need to allocate, enabling them to invest in more valuable initiatives.

The rise of AI-based threats is becoming more pronounced. As a result, defenders must outpace increasingly sophisticated attackers by moving beyond traditional AI tools that leverage ML and embrace advanced AI, especially deep learning. The entire cybersecurity ecosystem must consider this shift to effectively combat the growing prevalence of AI-driven cyberattacks and the future of cyber threats.

To try DSX with DIANNA, visit Deep Instinct in the AWS Marketplace.