How Your Regular Microsoft Office File Is Open to Manipulation
Microsoft Office is one of the most common attack vectors used by malicious actors to spread malware. Because most computers have Microsoft Office installed and Office files are routinely shared between organizations and individuals, you are highly likely to encounter malware of this sort. In this blog, we review the prevalence of malware embedded in Microsoft Office documents (OOXML), the common attack vectors used, and proactive steps you can take to defend your organization against such attacks.
OOXML’s Vulnerability to Manipulation
Office Open XML (OOXML) is a ZIP-based file format used by Microsoft for Office documents; its most commonly used extensions are ‘.docx’, ‘.xlsx’ and ‘.pptx’. It is the successor of the OLE file format, which stores content in compound files rather than XML files and is represented by the extensions ‘.doc’, ‘.xls’ and ‘.ppt’. OLE and OOXML files are largely interchangeable: a file of one format can generally be saved as the other while keeping its functionality.
Below is an example of the internal structure of an OOXML file and an OLE file. The left image shows the OOXML file, which has a hierarchical structure made up of several directories, each containing XML files. In addition to XML files, these directories can contain OLE objects and any other file type, such as PE files or images.
The right image shows the structure of an OLE file, which contains compound objects, each with a specific role in the creation of the Office document. For example, the main compound object in a Microsoft Word file, “WordDocument”, is responsible for storing the text.
Structures of an OOXML file (Left) and of an OLE file (Right)
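Because an OOXML file is an ordinary ZIP archive, its internal structure can be inspected with standard tooling. The following is a minimal sketch using Python's built-in zipfile module; the helper names `list_parts` and `build_sample_docx` are hypothetical, and the sample archive is a stripped-down stand-in rather than a valid Word document.

```python
import io
import zipfile


def list_parts(source):
    """Return the sorted member names of an OOXML package.

    `source` may be a file path or a file-like object.
    """
    with zipfile.ZipFile(source) as package:
        return sorted(package.namelist())


def build_sample_docx():
    """Build a tiny docx-like archive in memory (illustrative only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("_rels/.rels", "<Relationships/>")
        package.writestr("word/document.xml", "<document/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name in list_parts(build_sample_docx()):
        print(name)
```

Running `list_parts` against a real document (for example, a path to a .docx file) should show the same kind of directory hierarchy described above, typically with many more parts.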
Threat Landscape Trends
Microsoft Office files are ubiquitous and are shared between individuals and organizations on a daily basis. For this reason, many malware authors take advantage of the widespread use of the OOXML and OLE file formats to distribute their malware.
The graph below shows the relative number of newly discovered OLE- and OOXML-based malware files in each quarter since January 2015. The data, which was collected from Deep Instinct’s D-Cloud, is presented in arbitrary units, normalized so that the number of malicious OOXML files in the first quarter equals one. As you can see, there is a slow but continuous increase in both OOXML and OLE malware. While OLE malware is clearly more common in the threat landscape, the margin between OLE- and OOXML-based malware is slowly decreasing. In addition, there are sudden spikes, which can result from phishing campaigns tied to global events such as the COVID-19 outbreak, country-specific periods such as the US tax return season, or the entrance of new malware into the arena. For example, in the third quarter of 2016, the Cerber ransomware started offering its services in exchange for 40% of each ransom paid, which resulted in over 150,000 users falling victim to it in July alone.
Newly discovered malicious OOXML and OLE files per quarter represented in arbitrary units.
Common Attack Vectors
There are several common methods used by malicious actors to attack victims using the OOXML file format. Some of the methods discussed below, such as abusing RELS, are unique to OOXML files, while others, such as VBA macros, are also very common in OLE files.
- Taking Advantage of OLE Object Embedding
The OOXML file format allows the embedding of OLE objects within an OOXML file. OLE objects are created with programs supporting Microsoft's Object Linking and Embedding (OLE) technology, such as Microsoft Word. Like any software that parses complex data, the components that handle OLE objects have vulnerabilities that attackers can exploit. For example, by exploiting CVE-2017-11882, a memory corruption vulnerability in the Microsoft Equation Editor component, malicious actors can use an OLE object embedded in an OOXML file to execute arbitrary code on a victim's computer without any user interaction other than opening the Office file. The infamous Loki malware family used this method to harvest user credentials from web browsers, steal crypto wallets, collect data stored in sticky notes, and more.
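As a rough illustration of how embedded OLE objects can be located, the sketch below scans every member of an OOXML archive for the 8-byte signature that opens an OLE compound file. The function name `find_embedded_ole` is hypothetical, and this is only a triage heuristic: it does not parse the object or tell you whether it is malicious, and it will also flag legitimate parts that are stored as compound files, such as a VBA project.

```python
import zipfile

# Signature found at the start of every OLE compound file.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def find_embedded_ole(source):
    """Return the names of archive members that begin with the OLE signature.

    `source` may be a file path or a file-like object.
    """
    hits = []
    with zipfile.ZipFile(source) as package:
        for name in package.namelist():
            # Only the first few bytes are needed for the signature check.
            with package.open(name) as member:
                if member.read(len(OLE_MAGIC)) == OLE_MAGIC:
                    hits.append(name)
    return hits
```

Any hit is a candidate for deeper inspection with a dedicated OLE parser.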
- DDE Attacks
Dynamic Data Exchange (DDE) is a protocol used to transfer data between Microsoft Office applications. With DDE, which can also be used in the older OLE format, these applications can share data on a one-time basis or continuously. For example, one can insert a table into a Microsoft Word document that pulls its data from an Excel spreadsheet and updates whenever the spreadsheet changes. Although this protocol was created with good intentions, its functionality can be used for less than innocent purposes. In essence, DDE allows command execution through a single custom field, which means an attacker can easily use it to run various commands. Usually, an attacker will use this functionality to download and run additional malware, but it can also be used to exfiltrate sensitive information, open a remote shell through which the threat actor sends commands, and more. Although DDE is an old protocol (first introduced in 1987) that was superseded by OLE many years ago, it is still supported by Microsoft and therefore still poses a security threat. To mitigate this issue, up-to-date Microsoft Office applications will usually display two security warnings when a document containing a DDE field is opened. However, an attacker who sends such a file to potential victims can trick them into ignoring the warnings and allowing the DDE content to run. For example, the Necurs botnet used spear-phishing emails that told victims that the documents they had requested were attached, leading them to believe the documents were trustworthy when they were actually used to deliver the infamous Locky ransomware.
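To make the "single custom field" idea concrete, here is a minimal detection sketch for Word documents. It assumes the DDE instruction lives in the main document part (`word/document.xml`), either in `w:instrText` runs or in the `w:instr` attribute of a `w:fldSimple` element; the function name `find_dde_fields` is hypothetical. Treat it as a heuristic: fields can also appear in headers, footers, and other parts, and determined attackers obfuscate field text in ways this simple pattern will miss.

```python
import re
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
DDE_PATTERN = re.compile(r"\bDDE(AUTO)?\b", re.IGNORECASE)


def find_dde_fields(source, part="word/document.xml"):
    """Return field instructions in `part` that look like DDE or DDEAUTO fields."""
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read(part))

    # Complex fields: the instruction may be split across several runs,
    # so the pieces are joined before matching.
    complex_instr = "".join(
        node.text or "" for node in root.iter(f"{{{W_NS}}}instrText")
    )
    # Simple fields: the whole instruction sits in one attribute.
    simple_instr = [
        node.get(f"{{{W_NS}}}instr", "") for node in root.iter(f"{{{W_NS}}}fldSimple")
    ]

    candidates = [complex_instr] + simple_instr
    return [text.strip() for text in candidates if DDE_PATTERN.search(text)]
```

Joining the `w:instrText` runs is a deliberate choice, since splitting the keyword across runs is an easy way to evade a naive string search. When parsing untrusted documents, a hardened XML parser is a safer choice than the standard library one used here.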
- VBA Macros
Macros are sets of commands used to automate procedures in Microsoft Office applications such as Excel and Word. Modern-day macros, which, like DDE, are not exclusive to OOXML, are written in Visual Basic for Applications (VBA), a fully functional scripting language with which numerous actions can be performed. Unfortunately, many attackers take this lemon and use it to make an awfully bitter lemonade. With VBA, a threat actor can do anything from downloading and running an executable to opening a reverse shell. Luckily, Microsoft is aware of this and prevents newer versions of Microsoft Office applications from running VBA macros automatically. But, as discussed above, attackers find manipulative ways to make their victims ignore the warnings presented by Microsoft and permit the macros to run. One known malware family that uses this technique to deliver first-stage droppers is Emotet, which obfuscates the VBA script and uses Base64 encoding to make its actions less likely to be detected by security vendors. Like many other malware families, this banking trojan uses VBA to run PowerShell code that downloads its payload from a remote address, saves it to disk, and runs it. In August 2019, Deep Instinct discovered an interesting VBA malware sample in a production environment. The malware, an Ursnif dropper, was delivered as an Excel file posing as an invoice from the corporate giant DHL. It used encoded and obfuscated PowerShell code to deliver its payload.
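A quick first check for macros in an OOXML file is to look for the part that stores the VBA project. The sketch below assumes the project is stored under its conventional name, `vbaProject.bin` (for example `word/vbaProject.bin` or `xl/vbaProject.bin`); the function name `find_vba_parts` is hypothetical. Since a part can in principle be given a different name, an empty result is not proof that a file is macro-free.

```python
import zipfile


def find_vba_parts(source):
    """Return archive members whose name matches the conventional VBA project part.

    `source` may be a file path or a file-like object.
    """
    with zipfile.ZipFile(source) as package:
        return sorted(
            name
            for name in package.namelist()
            if name.lower().endswith("vbaproject.bin")
        )
```

This only establishes that macros are present. Extracting and reviewing the macro source itself requires a VBA-aware tool such as olevba from the oletools project.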
- Abusing RELS
OOXML’s relationship files (RELS) describe how the parts of an OOXML file are connected to each other to form a document. A RELS file contains ‘targets’, which are usually paths to parts of the document, along with a description of each relationship. However, these targets can also specify files stored in remote locations, which will be fetched when the document is opened. Naturally, attackers use this feature to inflict damage by pointing to web-based targets and downloading malicious payloads. For example, one can add a remote HTML Application (HTA) ‘target’ and, using the vulnerability CVE-2017-0199, make it run when the OOXML document is opened so that it can fulfill its malicious purpose. Many threat actors have implemented this method over the years; for instance, payloads of the spyware families PonyStealer and FormBook were delivered in this manner. But, as with DDE and VBA macros, an alert is displayed when the RELS file tries to load external content.
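Remote targets are easy to enumerate, because a relationship that points outside the package is marked with `TargetMode="External"`. The following sketch walks every `.rels` part in the archive and lists such relationships; the function name `find_external_targets` is hypothetical. Note that ordinary hyperlinks are also external relationships, so the relationship type matters when judging a result.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by relationship parts in the Open Packaging Conventions.
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def find_external_targets(source):
    """Return (rels_part, relationship_type, target) for every external relationship."""
    findings = []
    with zipfile.ZipFile(source) as package:
        for name in package.namelist():
            if not name.endswith(".rels"):
                continue
            root = ET.fromstring(package.read(name))
            for rel in root.iter(f"{{{REL_NS}}}Relationship"):
                if rel.get("TargetMode") == "External":
                    findings.append((name, rel.get("Type"), rel.get("Target")))
    return findings
```

An external target whose type is something other than a hyperlink, such as an OLE object, frame, or attached template, usually deserves a closer look.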
- Use of Additional File-Types
Since an OOXML file is a ZIP archive, attackers can insert any file into it and have it executed by modifying the RELS files and [Content_Types].xml. Attackers use this loophole to deliver and run malware. For example, a PE file can be inserted into an OOXML file and executed by VBA macros contained in the same file.
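One way to spot a smuggled executable is to test each archive member against the PE file layout instead of trusting its name or declared content type. The sketch below checks for the `MZ` header and then follows the offset stored at position 0x3C to the `PE\0\0` signature; the names `looks_like_pe` and `find_pe_members` are hypothetical. It reads each member fully into memory, which is acceptable for a demonstration but should be bounded when handling untrusted archives.

```python
import struct
import zipfile


def looks_like_pe(data):
    """Return True if `data` starts with a DOS header that points to a PE signature."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # The 4-byte little-endian value at 0x3C holds the offset of the PE header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


def find_pe_members(source):
    """Return the names of archive members that look like PE files."""
    with zipfile.ZipFile(source) as package:
        return [
            name for name in package.namelist() if looks_like_pe(package.read(name))
        ]
```

A PE file inside an Office document is rarely legitimate, so any hit here is a strong signal. An executable can also be wrapped inside an embedded OLE object, which this check alone will not unpack.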
How to Defend Against OOXML Malware
Besides being vigilant when it comes to OOXML files, organizations need an advanced threat prevention solution capable of identifying this type of threat autonomously. The chosen solution should be able to handle these files both statically and dynamically, preventing the threat before it gets a chance to run. And in the case where the malware somehow slips through, the solution should be able to stop any malicious activity before it can inflict any damage. Additionally, the selected product should be able to prevent the execution of malicious VBA macros.
Deep Instinct’s product checks all the boxes: it prevents the execution of malicious VBA macros, uses a deep learning model specifically designed to eliminate this kind of threat before the client even knows they have been compromised, and employs a dynamic mechanism that looks for malicious activity in real time.
Even though they may seem innocent, OOXML files can wreak quite a bit of havoc. Malware authors have found ways to abuse them in the past and will most likely find more in the future. This reality means users should be cautious when opening Microsoft Office files, and organizations should adopt the best security technology available to account for the times when employees are not as vigilant as one would hope.