1 Introduction

Malware has shown an intriguing evolution during the last decade. Recent reports have shown that malicious programs are often conveyed by embedding payloads in infection vectors, i.e., file formats such as multimedia and documents [1, 2]. Victims often underestimate the capabilities of such formats, which can embed scripting code written in languages that allow attackers to conceal payloads easily. Between 2010 and 2020, the two most used formats for embedding attacks were PDF and SWF due to the numerous vulnerabilities that targeted Adobe Reader and Flash [3, 4]. However, PDF vulnerabilities have been progressively patched, and Adobe discontinued Adobe Flash at the end of 2020. Hence, attackers reverted to the techniques of the '90s, when macro viruses, like Melissa (Footnote 1) and Concept (Footnote 2), became one of the most prevalent infection mechanisms to exploit vulnerabilities effectively and convey malicious programs, Microsoft Word being one of the main targets. In 2018, security companies reported a 1000% increase (in one year) in malicious PowerShell payloads [1], with more than 30,000 released in the first quarter of 2019 (against fewer than 5000 in the same quarter of 2018) [5]. These reports also showed that such payloads are concealed using macros embedded in Microsoft Office (Word and Excel) files. Macros are written in Visual Basic for Applications (VBA), which can be heavily obfuscated and features APIs that allow even direct interactions with the OS.

Analyzing macros is possible by employing publicly available tools such as OleVBA [6, 7]. However, the advanced obfuscation and anti-analysis techniques used by malicious samples make these tools unusable in most cases, thus making research works primarily focus on developing more advanced static analysis techniques [8,9,10]. Nevertheless, such approaches show clear limitations, as static analysis can only partially address the complexity of obfuscated malware, especially in the presence of dynamic code loading. Although dynamic analysis through sandboxing may seem an excellent strategy for detecting advanced attacks, it also exhibits various critical issues. For example, sandboxes typically focus on the effects that malicious payloads extracted by macros have on the target system but do not provide enough information on how and why macros work (or fail, in some cases!). With an in-depth analysis of how macros work, we could gain much information on the employed attack and obfuscation strategies and how specific families behave. Moreover, publicly available online and offline sandboxes [11,12,13] are slow and impractical for analyzing large loads of malicious macros.

To overcome these issues, we propose Oblivion, an open-source, modular, fast, static and dynamic framework for the instrumentation and analysis of macros contained in Office files. This is done to de-obfuscate the operations carried out by the macro without needing to directly reconstruct a clear version of the code. Oblivion leverages the characteristics of VBA to instrument macros to trace every variable value and method call included in the file and to retrieve and de-obfuscate the employed PowerShell code. Besides, Oblivion can reveal attacks alternative to PowerShell by detecting suspicious actions (e.g., accessing Outlook to send malicious emails or dropping additional malicious macros) and automatically interacting with windows that may be prompted during execution to hinder automatic analysis. This novel functionality, not yet present in the state of the art, helped overcome the hindrance of MessageBox-like windows, used by malware authors to stop dynamic analyzers in their tracks. Additionally, the tool constitutes a novelty in the sense that, to the best of our knowledge, no other tool provides the same level of detail for VBA malware in a scenario where advanced obfuscation techniques are involved.

The architecture of Oblivion has been designed to run fast and effective analyses of large loads of files. In particular, we performed an analysis of more than 40,000 malicious Office files belonging to different families and featuring macros of various types. The attained results show that Oblivion could analyze most of them by extracting and de-obfuscating thousands of PowerShell scripts. Moreover, we used the capabilities of Oblivion to describe a comprehensive list of attack families that reflect the different behaviors of macros. Finally, we measured Oblivion's performance by showing an average analysis time of less than one minute. Our major goal is to develop a tool the scientific community can use effectively for further research and analyses. For this reason, we make Oblivion open-source (Footnote 3). We also release all the reports generated during our experiments (Footnote 4). These reports contain the de-obfuscated macro operations and the related obfuscated and de-obfuscated PowerShell code that Oblivion could extract from the macros.

The rest of the paper is organized as follows: Sect. 2 provides an overview of the organization of Office files and of the VBA language used to write macros, focusing on macro-based malware and its obfuscation; Sect. 3 describes the related work in the field, highlighting the advances of the proposed approach over the state of the art; Sect. 4 describes the architecture and the functionalities of Oblivion; Sect. 5 provides the experimental results attained by Oblivion on a dataset of malware samples; Sect. 6 presents and discusses the limitations of our approach; Sect. 7 provides the closing remarks for the paper and sketches future research directions.

2 Microsoft Office files

The Microsoft Office suite is among the most popular document-processing software bundles. The whole suite revolves around three main products employed to elaborate documents (Microsoft Word), spreadsheets (Microsoft Excel), and presentations (Microsoft PowerPoint). The files parsed by such products can be represented in two formats, between which users can easily switch: OLE (Object Linking and Embedding - Compound Document Format) and OOXML (Office Open XML) [14]. The first format, identified by file extensions such as .doc, .xls, and .ppt, was the de facto standard in Microsoft Office 97-2003. The second format, identified by file extensions such as .docx, .xlsx, and .pptx, was introduced in Office 2007, and it is the default standard in recent versions (currently, Office 2019 and 365). The following briefly describes the primary differences between the OLE and the OOXML formats. Then, we explain how macros are typically employed in Office files, along with their characteristics.

2.1 File formats

The Object Linking and Embedding Compound Document Format (from now on referred to as OLE) is a hierarchical collection of storage and stream objects that can be seen, from a file system perspective, as directories and files [15]. The general idea is to organize the document in components that can be easily updated or added without altering the rest of the file.

In the case of .doc files, the primary stream is represented by the File Information Block (FIB), which contains the references to the other streams inside the file. Such streams include, among others, tables, data with no predefined structures, and macro codes [16]. Excel documents typically contain one or more workbook streams, data structures that can contain additional substreams. Substreams contain additional information about the elements commonly used inside the workbook, such as sheets, charts, and macros [17].

The OOXML format has been codified in the international standards ISO/IEC 29500 and ECMA-376 [18]. An OOXML file is a zipped archive containing the elements that were previously embedded in the storage/object structure of the OLE format. Since the file is represented as a compressed archive, it is more straightforward to understand and point out its components. Many elements in the OOXML format are stored as separate files. This characteristic enhances modularity compared to the previous implementations and improves the robustness of the file against data corruption. In this representation, detecting macros embedded inside the file is even easier. Note that, unlike in the OLE format, the structural OOXML representations of .docx and .xlsx files are very similar.
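To make the difference between the two containers concrete, the following is a minimal Python sketch (not part of Oblivion) that distinguishes OLE from ZIP-based files and lists the archive members that look like a VBA project. It assumes that the macro project is stored in a part named vbaProject.bin, which is typical but not guaranteed; the function names are illustrative.

```python
import io
import zipfile
from typing import List

# Signature found in the first 8 bytes of OLE compound files.
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def container_type(data: bytes) -> str:
    """Classify raw file bytes as 'ole', 'ooxml' (ZIP-based), or 'unknown'.

    Simplification: any valid ZIP archive is reported as 'ooxml'; a stricter
    check would also look for OOXML-specific parts inside the archive.
    """
    if data.startswith(OLE_MAGIC):
        return "ole"
    if zipfile.is_zipfile(io.BytesIO(data)):
        return "ooxml"
    return "unknown"


def ooxml_macro_parts(data: bytes) -> List[str]:
    """Return the archive members that look like a VBA project.

    Assumption: the project part is named 'vbaProject.bin'.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return [name for name in archive.namelist()
                if name.lower().endswith("vbaproject.bin")]
```

For OLE files no such shortcut exists: the stream hierarchy has to be parsed, which is what tools such as OleVBA do.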

2.2 VBA macros

Macros are programs written in Visual Basic for Applications (VBA), an implementation of Visual Basic for Office. Macros are contained in binary files (typically named vbaProject.bin). They are integrated into the file structure according to the employed format, OLE (Footnote 5) or OOXML (Footnote 6). Macros are sequences of events that are automatically executed to avoid the repetition of manual actions inside an Office document. For clarity, we borrow from TrumpExcel (Footnote 7) an example of a simple Excel macro used to save all worksheets in separate PDFs when the file is closed, reported in Listing 1.

Listing 1 Excel macro that saves all worksheets in separate PDFs when the file is closed

By default, OOXML files (.docx, .xlsx, .pptx) cannot be used to store macros. Only the corresponding macro-enabled formats (e.g., .docm, .xlsm, .pptm) can contain VBA macros. Conversely, OLE files are organized in streams that can be visualized via oledir, as depicted in Listing 2. The execution of VBA code is inherently linked to the opened Office file (i.e., it is impossible to execute a stand-alone VBA program). VBA macros can be represented in three major file formats, according to the design choices made by the user [19]:

  • Class Modules (.cls). These macros contain classes, and the embedded variables are instance-based, meaning they can be accessed only through objects related to the class.

  • Macro Modules (.bas). These macros only contain global variables, meaning only one instance is saved and employed in the rest of the macro code. Changing variables inside .bas macros means that their updated values will be employed by the other procedures that use them.

  • Form modules (.frm). These macros typically focus on creating graphical interfaces for the users to insert data that can be used in the document.

Listing 2 Streams of an OLE file as visualized via oledir

Typically, at least one standard .cls macro (usually referred to as ThisDocument or ThisWorkbook) is present in each macro-based file. These standard macros cannot be deleted from the VBA project. Listing 3 shows an example of a macro employed in VBA applications [20].

The code takes an integer c as user input (with the InputBox command). It then multiplies by c each element of a list rng of numbers that the user previously selected (Selection). Routines in VBA are typically introduced with the Sub keyword, while variables are declared with Dim. Users typically employ such small functions as useful aids to perform complex operations on data.

Listing 3 VBA macro that multiplies each selected number by a user-provided integer

2.3 VBA malware

Besides allowing users to simplify their work with Microsoft Office, VBA provides a set of advanced functionalities to control the operating system, spawn external processes, and interact with shells or networks. These characteristics make VBA a well-suited vector to execute malware, as attackers can trigger functions to, e.g., load payloads in memory, download files, and execute external scripts (by employing PowerShell, a powerful scripting language used in Windows environments). In this way, attackers do not even need to exploit vulnerabilities of applications, as the functions that they can directly invoke potentially allow them to install additional payloads on the victims' systems. This operation is designed to evade static analysis: the malicious content is not immediately present inside the code. Instead, it is retrieved via legitimate internet connectivity or file reading functions.

Most malicious macros hide and generate (typically, at runtime) PowerShell code (Footnote 8). Once the scripting code is ready, it gets executed through a shell spawned using VBA APIs such as WScript.Shell. The execution is often finalized by dropping and executing additional payloads. In other cases, macros can directly load different payloads, which typically require extensive routines. Listing 4 shows a typical example of macros executed by malware.

Listing 4 Typical example of a macro executed by malware

Examining this macro makes it possible to infer some typical traits of macro-based attacks. First, the majority of them employ automatic functions, i.e., functions that execute when the users either open or close the Office files. Notably, these functions have common names which are automatically recognized by the macro processor (e.g., AutoOpen, Document_Open, Workbook_Open). The second characteristic is the PowerShell command, which, in this case, executes another PowerShell script (all.ps1) located in the C drive of the victim. Another interesting point is that a part of the command, specifically the powershell word, has been obfuscated with a simple string concatenation technique.
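The first trait can be checked statically with very little code. The following Python sketch (an illustration, not the logic used by OleVBA or Oblivion) scans VBA source for routines whose names match a short, non-exhaustive list of automatically executed entry points; the names are written with the underscores that VBA uses for event handlers. The sample macro is made up for the example and only mimics the traits described above.

```python
import re

# Illustrative, non-exhaustive list of routines that Word and Excel run
# automatically when a file is opened or closed.
AUTO_EXEC_NAMES = (
    "AutoOpen", "AutoClose", "Document_Open", "Document_Close",
    "Auto_Open", "Auto_Close", "Workbook_Open", "Workbook_BeforeClose",
)

# Matches "Sub Name", optionally preceded by Public/Private, at line start.
SUB_PATTERN = re.compile(
    r"^\s*(?:(?:Public|Private)\s+)?Sub\s+(\w+)",
    re.IGNORECASE | re.MULTILINE,
)


def auto_exec_routines(vba_source: str) -> list:
    """Return the names of the auto-executed routines defined in the source."""
    wanted = {name.lower() for name in AUTO_EXEC_NAMES}
    return [match.group(1) for match in SUB_PATTERN.finditer(vba_source)
            if match.group(1).lower() in wanted]


# Made-up sample: an auto-executed routine that builds a shell command by
# string concatenation, plus an ordinary helper routine.
SAMPLE = r'''
Private Sub Document_Open()
    Dim cmd As String
    cmd = "pow" & "ersh" & "ell -File C:\all.ps1"
    Shell cmd, vbHide
End Sub

Sub Helper()
End Sub
'''
```

Calling auto_exec_routines(SAMPLE) reports only Document_Open, since Helper is not an automatic entry point.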

2.4 Macro obfuscation

Obfuscation is extensively used in macros, and it often represents an insurmountable hurdle for static analysis. Listing 5 represents a small example of this technique.

Listing 5 Example of an obfuscated macro

The code is hard for humans or automatic static analyzers to examine. However, it is still possible to retrieve some information by analyzing some small readable parts of the code. For example, the word iNvoKE suggests the presence of an encoded PowerShell command. Likewise, the presence in the same line of code of tps://hawk and .exe suggests that there may be an encoded URL from which an executable file is downloaded. The Shell function at the end of the macro indicates that a shell is spawned for executing PowerShell. Nevertheless, in many cases, the static analysis of the code is practically impossible.

For comparison, in Listing 6 we report an equivalent VBA macro that performs the same action as the previous example (we use the custom function "DownloadFile" for clarity). From here, it is immediately visible that this is a downloader that retrieves an EXE file from a compromised server and subsequently runs it.

Listing 6 Readable macro equivalent to the obfuscated example of Listing 5

According to a recent taxonomy [9], we can identify four major obfuscation techniques employed by obfuscated macros:

  • Random Obfuscation. The function and variable names in macros are replaced with random sequences of characters.

  • Split Obfuscation. Strings inside macros are split and chained with the join operators & and +. The number and length of the splits are arbitrary.

  • Encode Obfuscation. Data inside macros are encoded using algorithms such as Base64 or Shift. More specifically, there are three ways to obfuscate macros with encoding: (i) by using built-in functions such as Replace, which replaces characters with other sequences of characters; (ii) by employing character encoding with the use of functions such as Asc, Hex or Chr; (iii) by using custom algorithms that resort to xor, Base64, or Shift.

  • Logic Obfuscation. This technique is employed by declaring variables or functions that are never reached by the execution of the code.

The techniques described above can be combined to make the analysis even more complicated if performed only statically. Thus, it becomes crucial to employ approaches that can de-obfuscate macros regardless of the complexity of the obfuscation techniques.
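As an illustration of why the simplest techniques are tractable statically while their combinations are not, the following Python sketch folds an expression made only of string literals and Chr calls joined by & or + (Split Obfuscation plus basic character encoding). It is a toy under stated assumptions, not the approach of [9] nor of Oblivion: the function name is hypothetical, and any expression involving variables or other functions is deliberately rejected, which is exactly the case where tracing values at runtime becomes necessary.

```python
import re

# One token: a string literal, a Chr()/ChrW()/Chr$() call with a numeric
# argument, or a join operator. Leading whitespace is skipped.
TOKEN = re.compile(r'''
    \s*(?:
        "((?:[^"]|"")*)"                 # string literal ("" escapes a quote)
      | Chr[W$]?\(\s*(\d{1,5})\s*\)      # Chr(65), ChrW(65), Chr$(65)
      | ([&+])                           # join operator
    )''', re.VERBOSE | re.IGNORECASE)


def fold_constant_string(expr: str):
    """Fold literals and Chr() calls joined by & or + into a single string.

    Returns None when the expression contains anything else (for example a
    variable), since its value is then unknown without executing the macro.
    """
    expr = expr.strip()
    parts, pos, expect_operand = [], 0, True
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if not match:
            return None
        literal, code, operator = match.groups()
        if (operator is not None) == expect_operand:
            return None  # operator where an operand was expected, or vice versa
        if literal is not None:
            parts.append(literal.replace('""', '"'))
        elif code is not None:
            parts.append(chr(int(code)))
        expect_operand = not expect_operand
        pos = match.end()
    return None if expect_operand else "".join(parts)
```

For instance, the expression "pow" & "ers" + "hell" folds to powershell, whereas "a" & x yields None because x is only known at runtime.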

3 Related work

Office Malware Detection. Previous scientific work on Office malware focused on analyzing and detecting malicious Office files by employing static or dynamic analysis of the original macro code. Schreck et al. [21] used dynamic analysis to inspect Office files by executing them in multiple sandboxes (up to Office 2007). They observed the system call traces generated during the execution and the Assembly instructions employed by payloads.

Smutz and Stavrou [22] proposed an approach to disarm the exploits in Office files by randomizing their structural contents. In particular, the authors randomized the file data structures to make the malicious contents no longer accessible while preserving the remaining functionality of the documents. The approach was applied to .doc and .docx files. Ruaro et al. [23] focused on another nuance of the Office macro problem and proposed SYMBEXCEL. This program resolves obfuscation in Excel XL4 malicious macros via symbolic execution.

Concerning machine learning-based approaches, ALDOCX [8] uses active learning to perform static analysis and detect malicious .docx files. In comparison to Oblivion, this system does not analyze the code that is truly executed by the files. Instead, it resorts to hierarchical structural paths obtained from the XML structure of the files. Therefore, this approach can only be used on XML-based Office documents, thus ruling out other formats such as .doc and .xls.

Kim et al. [9] proposed a machine-learning method to analyze obfuscated macros. More specifically, the proposed strategy aims to extract a comprehensive set of static features from the analyzed code, such as the number of characters, the average length of words, and the Shannon entropy.
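The three features named above are simple to compute. The following Python sketch shows one possible way to do so; it is not the feature extractor of [9], and the definition of a word (a run of identifier characters) is an assumption made for the example.

```python
import math
import re
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Shannon entropy, in bits per character, of the character distribution."""
    if not text:
        return 0.0
    total = len(text)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(text).values())


def static_features(vba_source: str) -> dict:
    """Compute three simple static features of a macro's source code."""
    # Assumption: a "word" is a run of identifier characters.
    words = re.findall(r"[A-Za-z_][A-Za-z0-9_]*", vba_source)
    average = sum(len(word) for word in words) / len(words) if words else 0.0
    return {
        "num_chars": len(vba_source),
        "avg_word_length": average,
        "entropy": shannon_entropy(vba_source),
    }
```

Randomly generated identifiers and encoded strings tend to raise the entropy and the average word length, which is why such features can help a classifier separate obfuscated from plain macros.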

Lu et al. [10] proposed detecting malicious Office macros by performing static analysis of the files from four perspectives: functional words, OLE file object formats, structural paths, and specification errors. The authors employed machine learning on features extracted from these characteristics to perform the detection of OOXML files.

Mimura and Ohminami [24, 25] proposed techniques to detect obfuscated macros by using Latent Semantic Indexing (LSI) and Natural Language Processing (NLP) to extract words from the source code of macros. The extracted words are then encoded as features used to train a machine-learning model.

Koutsokostas et al. [26] computed, both statically and dynamically, a binary vector representation of an imbalanced Office malware set and used it to feed classifiers that estimated the maliciousness of documents. Notably, their method detects the use of DDE (Dynamic Data Exchange) and LOLBins (Living Off the Land Binaries).

Yan et al. [27] analyzed the visual data, such as text and images, inside the samples and used the definition of deceptive content (e.g. samples that visually mimic official Microsoft Documentation) to classify a sample as malicious or not.

PowerShell Analysis. Previous scientific work also focused on analyzing PowerShell scripts generated by macro code. Specifically, the first methods analyzed obfuscated scripts by employing machine learning and techniques such as Abstract Syntax Trees [28, 29]. Other strategies employed Deep Learning in combination with Abstract Syntax Trees and Natural Language Processing [30,31,32]. Alahmadi et al. [33] used Deep Learning in conjunction with auto-encoders. Ugarte et al. [34] presented PowerDrive, an automatic, open-source de-obfuscator for PowerShell that simplifies the analysis of these attacks and that has been used as a part of the post-processing module in Oblivion. Finally, Li et al. [35] proposed an alternative de-obfuscation approach for obfuscated PowerShell code based on semantic sub-tree analysis.

Tools for Macro Analysis. Various publicly available tools can be used to extract information from Office files. OleVBA is among the best static tools to analyze Office files [6], and Oblivion uses it to aid static analysis. It works on both OLE and OOXML files, and extracts information about suspicious VBA keywords that can be used to perpetrate attacks. Notably, OleVBA cannot be employed alone to perform full malware analysis, as it suffers from the limitations of static analysis (it is especially vulnerable to obfuscation). In 2016, ESET released a dynamic approach to analyze Word files called VHook [7], which Oblivion has extended. The file is instrumented by injecting specific control instructions in the macro code, thus extracting the input parameters of System functions (such as Shell). However, this approach is limited to Word files and lacks many of the characteristics introduced with Oblivion (see Sect. 4).

Macros can also be treated as Visual Basic scripts. In this respect, Usui et al. [36, 37] proposed to trace API calls in scripting languages. Their work aims to be universally suitable for a plethora of scripting languages, including Visual Basic. While not as broadly applicable, Oblivion can retrieve richer information, such as variable contents and interaction windows.

Finally, another popular tool is OfficeMalScanner [38], which performs static analysis of macros embedded in Office documents, similarly to OleVBA. The tool also looks for possible encryption keys that may be used to protect the analyzed documents.

4 The Oblivion framework

Oblivion is a framework that combines static and dynamic analysis to provide a complete overview of macro-based Office files. The overall architecture of the system, depicted in Fig. 1, has been tailored to analyze complex macro-embedding malware (but it can be employed on any Office file). The system receives a folder containing the target files and outputs a detailed analysis report for each file. The overall architecture of the system is composed of multiple modules, described as follows:

Fig. 1 General architecture of Oblivion

  1. Pre-Processing. The system performs a preliminary static analysis of the target files. This step has multiple goals: (i) ensuring that the analyzed files contain macros; (ii) ensuring that the macros are syntactically correct; (iii) finding the presence of possible obfuscation; (iv) ensuring that the macros can be correctly executed. If the system can analyze the embedded macros, they are sent to the instrumentation module.

  2. Instrumentation. Oblivion injects special control and logging instructions into each macro extracted during the pre-processing phase, to track each variable and method call. The output of this module is a modified Office file that can execute the instrumented macro.

  3. Execution. Oblivion executes the instrumented macros in a virtualized environment. This module examines the macro's execution by tracing the values of the employed variables and logging all method invocations. The extracted information is saved and sent to the post-processing module.

  4. Post-Processing. Oblivion parses the output sent by the execution module to produce a final report containing, among other things, the extracted PowerShell code (obfuscated and de-obfuscated, if any), the contacted URLs, the evolution of each macro variable, and more.

In the following, we provide a detailed description of the functionality of each module.

4.1 Pre-processing

This module aims to simplify the analysis of multiple files as much as possible by excluding those that embed syntactically wrong or empty macros. Additionally, the system analyzes possible obfuscation patterns related to the macros' maliciousness. The operations carried out by this module can be summarized in two steps:

  1. The pre-processor searches for macros embedded in the formats described in Sect. 2 (in particular, .cls and .bas). This search is automatically carried out using the popular static analysis tool OleVBA [6]. OleVBA also retrieves additional information about possible suspicious calls and actions the extracted macros perform and adds the results to the final report (i.e., when all the other analysis phases are complete).

  2. The pre-processor analyzes the macros extracted by OleVBA and returns four possible labels for each macro:

    • Corrupted. The macro contents are corrupted and not visible. This output means the macro cannot be executed (i.e., no malicious actions will occur).

    • Password protected. The embedded macro is password-protected from visualization and access. Hence, the macro cannot be analyzed without the correct password.

    • Interaction-based Macros. The macro requires specific interactions with the user to be properly executed. In particular, the macro typically employs VBA APIs such as MsgBox and ShowWindow to ask users for additional interactions.

    • Standard macros (.cls and .bas). Macros labeled as standard are statically valid, not password-protected, and do not require users' interactions to be executed. Typically, these macros are in the .cls or .bas formats. Office files can often contain more than one macro in the two formats. We refer to this case as .bas+.cls.

Oblivion analyzes the macros deemed as working, i.e., those labeled as standard or interaction-based. Then, it retrieves the obfuscation patterns described in Sect. 2.4 by analyzing macros through the following heuristics: (i) the presence of specific APIs; (ii) the randomness in variable names; (iii) existing encoding-related functions; (iv) anomalous distributions of special characters, such as & and +.

4.2 Instrumentation

We first extract the macro code from the original file using OleVBA. This module then instruments the extracted macros with special logging instructions (a phase called macro modification) and re-injects them either into the original Office file or into a clean file of the same type (a phase called injection). For the following experiments, we used the first option to preserve possible additional malicious content not directly included in the VBA code (e.g., PowerShell code hidden in a Text Block). In the following, we provide additional details about the two phases.

Macro Modification. This phase aims to control and trace the evolution of the variables and method calls employed by macros. This technique is beneficial for macros that hide scripting codes by scrambling them into multiple variables, which are dynamically re-assembled at runtime. Observing each variable’s evolution is crucial to maximizing the probability of extracting the full scripting code. This strategy also allows the detection of other attack strategies besides PowerShell (e.g., using Outlook to send malicious emails).

Monitoring VBA instructions is a notoriously complex challenge because of the rich syntax employed by Visual Basic, the wide variety of employed samples, and the numerous obfuscation techniques. To tackle this challenge, we completely re-designed and expanded VHook [7], a popular macro instrumentation tool. The idea behind this tool takes inspiration from Windows API Hooking techniques. Namely, common VBA methods and functions, like Mid, are replaced with self-logging versions of themselves. The primary goal is to discover information such as internal VBA functions within malicious files (e.g., Shell) and external function declarations (e.g., URLDownloadToFileA). However, this approach can fail on malware employing heavy obfuscation. Moreover, VHook performed no code (or variable) analysis and no PowerShell extraction.

We significantly expanded the instrumentation approach proposed in VHook by implementing complete variable tracking and method monitoring for both OLE and OOXML Office files. In particular, for each executed instruction that is related to a variable assignment or a method execution, we inject logging instructions that belong to a special logging VBA class. The methods embedded in the class belong to two categories: (i) general logging methods that print the contents of accessed variables; (ii) overridden VBA methods (e.g., CreateObject, GetObject, Mid) that allow, along with the execution of the original methods, to log their parameters.
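The idea of injecting a logging instruction after each variable assignment can be sketched in a few lines of Python. This is a deliberately simplified illustration, not Oblivion's instrumenter: the logging call TraceLog.LogVar is a hypothetical name, and only plain `name = expression` lines are handled.

```python
import re

# Plain value assignment at the start of a line: "name = ..." or "Let name = ...".
# Lines starting with keywords (If, For, Set, Dim, ...) do not match because
# the first identifier must be followed directly by "=".
ASSIGNMENT = re.compile(r"^(\s*)(?:Let\s+)?([A-Za-z_]\w*)\s*=\s*\S", re.IGNORECASE)

# Hypothetical logging class and method; the real names may differ.
LOGGER_CALL = 'TraceLog.LogVar "{name}", {name}'


def instrument(vba_source: str) -> str:
    """Insert a logging call after every plain variable assignment.

    Known limitations of this sketch: object assignments (Set), property and
    array-element assignments, and continued lines (ending in " _") are
    skipped, and assignments to a function's own name (its return value)
    are not distinguished from ordinary variables.
    """
    output = []
    for line in vba_source.splitlines():
        output.append(line)
        match = ASSIGNMENT.match(line)
        if match and not line.rstrip().endswith(" _"):
            indent, name = match.group(1), match.group(2)
            output.append(indent + LOGGER_CALL.format(name=name))
    return "\n".join(output)
```

Even this small example hints at why reliable instrumentation is hard: every syntactic corner that the pattern skips is a place where a value could go untracked or where a naive injection could break the macro.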

To perform reliable instrumentation that would not introduce crashes during the execution of the instrumented macro, we introduced proper management of the following technical aspects of the language (which were absent in VHook):

  • Data Structures. Complete handling and tracking of data structures such as arrays and lists.

  • Special Statements. Special statements like If, With, For, and While can be expressed on multiple lines and/or in-line. Oblivion can extract and track variables in both multi-line and in-line complex statements.

  • In-Line Instructions. Effective management of multiple in-line instructions separated by a colon (:).

  • Exceptions. Correct handling of exception-throwing functions.

  • In-line Comments. Proper management of comments, especially when in-line with other instructions. In VBA, comments are introduced by a single quote ('). When these comments are in-line with proper instructions, they can compromise the overall analysis.
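Two of the aspects listed above, colon-separated instructions and in-line comments, can be illustrated with a small Python sketch. It is only an approximation under stated assumptions, not Oblivion's parser: the function name is hypothetical, Rem comments and line labels (e.g., a label followed by a colon) are not handled, and a colon that is part of the named-argument operator := is left untouched.

```python
def split_statements(line: str) -> list:
    """Split one VBA source line into statements and drop a trailing comment.

    Colons and single quotes inside string literals are preserved.
    """
    statements, current, in_string = [], [], False
    for index, char in enumerate(line):
        if char == '"':
            # An escaped quote ("") toggles the flag twice, so the net state
            # after it is unchanged.
            in_string = not in_string
            current.append(char)
        elif not in_string and char == "'":
            break  # the rest of the line is a comment
        elif not in_string and char == ":" and line[index + 1:index + 2] != "=":
            statements.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    statements.append("".join(current).strip())
    return [statement for statement in statements if statement]
```

For example, the line `a = "x:y" : b = 1 ' note: ignore` is split into the two statements `a = "x:y"` and `b = 1`, while the comment is discarded.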

To demonstrate the capabilities of Oblivion, we added in Appendix A an example of an obfuscated macro that our system has fully analyzed.

Macro Injection. In this phase, the system injects the modified macros either into a clean file or into a copy of the original file devoid of its macros. The user can decide the type of injection: injecting into clean files speeds up the analysis process, as the execution and load times are not influenced by external elements (such as heavy Excel worksheets). However, this may create problems in analyzing files whose macro execution depends on elements contained in the original file (e.g., the value of a specific cell in an Excel file, or the textual content of a Shape element). During our experimental phase, we opted for the original document mode, slightly compromising performance in favour of completeness.

Once the macros have been correctly injected into the file, the analysis proceeds to the execution module.

4.3 Execution

In this phase, the file with instrumented macros is executed in a virtualized environment. As pointed out, Oblivion has been optimized to work with Sandboxie, a free-to-download sandbox [39]. We chose Sandboxie because of its popularity and the straightforward-to-use APIs that allow automatic sandbox cleaning (Footnote 9).

Moreover, Oblivion simulates user interactions with simple dialogue windows (a common example is reported in Fig. 2) by, e.g., clicking on buttons or inserting data into input bars. To this end, it employs the PyWinAuto library and handles windows generated via MsgBox, InputBox, or other custom functions. This functionality is handy for those samples that employ windows to prevent automatic analysis by sandboxes that cannot simulate users’ actions.
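A minimal sketch of how such an interaction could be automated with PyWinAuto is given below. It is an assumption-laden illustration rather than Oblivion's implementation: it only targets standard message-box dialogs (window class #32770), the list of preferred button labels is invented for the example, and input boxes or custom forms would require additional handling. The PyWinAuto import is deferred so that the pure decision logic can be exercised on any platform.

```python
# Illustrative priority order of button labels to click.
PREFERRED = ("ok", "yes", "continue", "next")


def choose_button(labels):
    """Return the index of the button to click, or None if there is none.

    The first preferred label found wins; otherwise the first button is used.
    Ampersands (keyboard accelerators such as "&Yes") are ignored.
    """
    normalized = [label.replace("&", "").strip().lower() for label in labels]
    for wanted in PREFERRED:
        if wanted in normalized:
            return normalized.index(wanted)
    return 0 if labels else None


def dismiss_dialogs():
    """Click through the standard dialogs currently on screen (Windows only)."""
    from pywinauto import Desktop  # imported lazily: requires Windows

    clicked = 0
    for window in Desktop(backend="win32").windows():
        if window.class_name() != "#32770":  # class of standard dialogs
            continue
        buttons = [child for child in window.children()
                   if child.class_name() == "Button"]
        index = choose_button([button.window_text() for button in buttons])
        if index is not None:
            buttons[index].click()
            clicked += 1
    return clicked
```

In practice, a function like dismiss_dialogs would be polled periodically while the instrumented file is running, so that a blocking window does not stall the analysis.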

Fig. 2 An example of a window spawned by malware requiring user interaction

The execution starts by opening the instrumented file, which often loads and executes routines with default names, such as Document_Open or Workbook_Open. In VBA, these functions are executed as soon as the file opens. The log of the execution is then written to an output file.

Oblivion also supports the analysis of macros that are loaded when files are closed (by using default routines such as Document_Close). Normally, this would happen when users click on buttons to close windows. Oblivion addresses this by using the win32com APIs to automatically mirror the behavior of the VBA Close function.

4.4 Post-processing

This module receives, as inputs, the information obtained during the pre-processing, instrumentation, and execution phases. Then, it produces a final report containing the extracted information about the analyzed macros in a comprehensive and organized way. The report is organized into three sections, described in the following:

  • Call-graph generation and variable tracking. Oblivion parses the execution flow of the macro to reconstruct the methods that have been truly called during the execution, thus ruling out routines with dead code. The report contains, for each executed method, the call-graph paths that lead to the method itself.

    Besides, Oblivion profiles each variable encountered during the execution of the macro. In particular, the report contains each variable’s sequence of values as the macro’s execution unfolds. In this way, it becomes easy to understand which variables contain information related to malicious actions.

    For example, consider the line:

    figure g

    This line was contained in a sample of our set (Footnote 10), together with 32 analogous lines whose purpose was to build a string. However, the macro also contains more than 1800 useless lines used for Logic Obfuscation. Variable tracing allows us to effortlessly trace the evolution of this string and retrieve the payload in the report:

    figure h
  • Attack de-obfuscation. Oblivion examines the values of the variables to reconstruct PowerShell codes (or other commands executed from shells), which are often dynamically obtained through multiple variable assignments. Hence, the system searches for variables that contain keywords related to shell commands, such as powershell.exe and cmd. If such values are found, Oblivion may further de-obfuscate the obtained script by employing PowerDrive [34], an open-source tool for automatic de-obfuscation of PowerShell codes. This tool will, for example, attempt to resolve common obfuscation methods used in PS scripts such as Base64 encoding, script scrambling and Hex encoding. PowerDrive will also perform a syntactical examination of the script to assess its correctness.

  • Attack profile. Oblivion examines standard API functions (e.g., WScript.Shell) often employed by macros for malicious purposes so that a human analyst can extract a possible macro profile, which can be conceived as a comprehensive synthesis of the actions performed by the analyzed sample. For example, a popular profile we found is a set of actions that macros use to self-replicate and influence the next execution of Office. Other common actions include loading bytes in memory to construct and execute a malicious payload without saving it or downloading and running additional payloads from the net.

    Oblivion also dumps any references to environmental variables (e.g., APPDATA) that malware can use as paths to drop additional payloads. It also reconstructs and extracts the URLs directly contacted from the macros or the PowerShell code. This extraction can occur during the static pre-processing phase or the macro’s execution.
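
The variable tracking, shell-keyword search, and URL extraction described in the list above can be sketched as follows. This is a minimal illustration and not Oblivion's actual implementation: the `name=value` log format and the function names `track_variables`, `find_shell_candidates`, and `extract_urls` are assumptions introduced here.

```python
import re
from collections import defaultdict

# Keywords mentioned in the text as hints of shell commands.
SHELL_KEYWORDS = ("powershell", "cmd")
URL_PATTERN = re.compile(r"https?://[^\s'\"]+", re.IGNORECASE)


def track_variables(log_lines):
    """Return {variable: [values in order of assignment]}.

    Assumes a hypothetical log with one "name=value" record per assignment,
    written in execution order.
    """
    history = defaultdict(list)
    for line in log_lines:
        name, sep, value = line.partition("=")
        if sep:  # skip lines that are not assignments
            history[name.strip()].append(value.strip())
    return dict(history)


def find_shell_candidates(history):
    """Return variables whose final value mentions a shell keyword."""
    return {
        name: values[-1]
        for name, values in history.items()
        if any(k in values[-1].lower() for k in SHELL_KEYWORDS)
    }


def extract_urls(history):
    """Return the sorted set of URLs seen in any recorded value."""
    found = set()
    for values in history.values():
        for value in values:
            found.update(URL_PATTERN.findall(value))
    return sorted(found)
```

Keeping the full sequence of values, rather than only the last one, is what makes it possible to follow a payload that is assembled piece by piece across many assignments.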

As a final note, the functionalities of Oblivion can be further expanded in the future thanks to its modular, open-source nature. The tool is designed to be used privately and, at least in this phase, does not rely on a dedicated server to which users can submit files for remote analysis.

5 Experimental evaluation

In this Section, we provide a detailed insight into the results obtained by running Oblivion on a large number of malicious files. Every module belonging to Oblivion was written in Python 3 to optimize its interaction with existing tools. The experiments were executed on an Intel Xeon workstation with 96 GB of RAM and 24 processors running Debian Linux, which hosted a virtual machine where we installed Microsoft Windows 11, Office 365 Professional (with macro execution enabled), and Sandboxie. The use of a virtual machine posed an additional resource constraint, as the dedicated resources consisted of 16 GB of virtual RAM and 8 virtual processors.

We start this section by describing the dataset employed for the analysis and providing the results obtained during the pre-processing phase. Then, we describe the results after the instrumentation and execution phases by showing the main characteristics of PowerShell- and non-PowerShell-based macro code. Finally, we provide an insight into the characteristics of the analyzed malware and the computational performance attained by Oblivion during the analysis.

5.1 Dataset and pre-processing

5.1.1 Dataset

In our experimental evaluation, we employed a dataset composed of 42,991 malicious files, belonging to Word and Excel formats (.doc, .xls, .xlsm, .docm).Footnote 11 We obtained our dataset in 2018 from the VirusTotal [40] service. While this dataset might be considered outdated from a threat detection perspective, we argue that (i) Oblivion is not a detection system per se: it does not dictate if a sample is malicious via, for example, a classifier, and it instead streamlines the macro execution so that it becomes far simpler for security analysts to take that decision themselves; and that (ii) to the best of our knowledge, VBA-based attacks have not significantly been affected by concept drift; therefore the corpus of information represented by this dataset is still valid nowadays. We constructed this set by selecting those files that featured macros and whose score in VirusTotal was higher than 3. This threshold was empirically chosen, as detection rates equal to 1 or 2 may often refer to false positives. In total, we obtained 27,512 Word and 15,479 Excel files, and this proportion reflects the higher number of Word files employed in malicious contexts.

Notably, there is no guarantee that the gathered files are effectively working. Most engines belonging to VirusTotal perform static analysis of the samples without ensuring that they are syntactically correct or analyzable. Hence, performing a thorough pre-processing analysis was crucial to select genuinely working samples.

5.1.2 Pre-processing

The pre-processing phase was executed with OleVBA ver. 0.54.2, and we report its results in Table 1, according to the taxonomy proposed in Sect. 4.2. Most analyzed files were statically correct and required no user interaction (or password).

Table 1 Results obtained from the static pre-processing of the dataset. Executable files are syntactically correct and can be executed. On the contrary, empty and corrupted files will surely be discarded

However, thousands of files also required some interaction from the user. This aspect reflects a common trend in Office malware, where users are often tricked into clicking on a confirmation window. Syntactically correct files are marked as executable, regardless of the presence or absence of interactions. Conversely, files deemed empty do not contain macros and feature non-macro-based techniques not analyzed in this paper. We also observed several corrupted files, an unsurprising fact since attackers often submit non-working samples to VirusTotal to test possible code- or byte-level modifications made to macros. Corrupted macros cannot be executed. Notably, executable files do not necessarily complete their malicious actions, especially when they depend on external contexts. For example, samples that rely on Microsoft Outlook to send malicious emails would not work if the software is not installed, even if the macro is syntactically correct.

Overall, we obtained 30,750 files that our system could analyze. The files were then sent to the instrumentation and execution modules for further analysis.

5.2 Instrumentation and execution

After the instrumentation and execution phases, we can structure the analyzed files into two major categories: (i) Success, when the execution of the file was successful, (ii) Failure otherwise.

Files whose execution was successful can be categorized further according to the presence (3357 files) or absence (16,880 files) of embedded PowerShell scripting codes. The execution of the files is defined full when the embedded macros are completely executed (13,095 files). Conversely, we define partial those files whose instrumented macros could not complete their execution (7142 files). Notably, Oblivion can retrieve some meaningful (albeit partial) information about employed variables and methods even from partially executed macros.

Files whose execution was not successful can be categorized as follows: (i) Semantic Errors (SE) (7962 broken samples), where the instrumented files could not be executed due to logical errors in the original macro; and (ii) Oblivion Errors (OE) (2251 instrumentation errors), where the instrumented files could not be executed due to errors related to the instrumentation process. The results are summarized in Table 2.

Table 2 Number of files belonging to the general categories detected by Oblivion after the post-processing phase

The larger number of attacks without PowerShell should not be surprising. While PowerShell is a very effective attack vector, many other strategies (described in the following) exist to achieve a successful attack. Results on failed attacks show that the execution of malicious macros is far from trivial, and many attacks can fail even when they appear to be syntactically correct. In the following, we list the most common semantic errors that we encountered during execution:

  • All macros are empty: these samples have macros in them, but there is no VBA code except the method signatures.

  • Method or data member not found: these macros make a reference to a never-defined constant.

  • Sub or Function not defined: these macros call a never-defined method.

  • The code must be updated for 64-bit systems: these macros use constructs that can only be executed in 32-bit versions of VBA, such as improperly managed Declare statements.Footnote 12

The remaining errors are attributable to Oblivion’s code manipulation, which will be discussed further in Sect. 6. We point out that this set accounts for only 8.29% of the analyzable samples.

Fig. 3
figure 3

Number of files belonging to the main categories of PowerShell attacks

5.3 Post-processing

In the following, we provide additional insight into the characteristics of those files whose execution was full or partial. In particular, the post-processing module performed additional analyses on PowerShell and non-PowerShell files, whose results are described in the following (see the taxonomy proposed by [34]):

  • PowerShell attacks. Oblivion analyzed the 3357 de-obfuscated codes by extracting the following categories (depicted in Fig. 3):

    • Download. The PowerShell code retrieves additional files from the network, such as .dll libraries or additional macros.

    • Execution. The PowerShell code attempts to execute another process, an operation often performed by using Windows system APIs such as VirtualAlloc.

    • Others. The PowerShell code performs malicious actions other than download or execution, such as opening and closing existing processes.

    Results show that the most used attack strategy is the execution of remotely retrieved payloads.

  • Non-PowerShell attacks. Oblivion found many files that did not employ PowerShell to perform their attacks, resorting to five major alternative techniques listed in the following.

    • Run Executable (RE). These attacks perform operations that create malicious executables (or retrieve them from the net), save them on the disk, and then execute them directly.

    • File Manipulation (FM). This category involves creating, opening and editing additional non-executable files (such as new Word or Excel files).

    • Office Infection (OFI). These attacks aim to infect the Office macro processor by forcing it to overwrite every loaded macro with malicious variants. In this way, the injected macros will always interfere with operations performed by the user.

    • Outlook Infection (OTI). This category concerns the infection of Outlook profiles and the abuse of mail addresses to create SPAM campaigns.

    • File Download (FD). These attacks concern downloading non-executable files (e.g., additional documents).

    The categories described above are often identified by the usage of specific system routines, which can be combined to create attacks that feature multiple characteristics. Table 3 shows the distribution of these categories among the analyzed samples and the related system routines, with the most popular attack category being Run Executable (RE). This result is reasonable, as obtaining the truly malicious payload at execution time may help to avoid detection. This technique is antithetical to OFI, which privileges undetectability over power: infecting the target macros is much stealthier and harder for victims to detect than other operations (e.g., opening executable services).
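
To illustrate how categories can be identified from system routines, the following sketch matches macro source against a small set of indicator patterns. The patterns are hypothetical examples chosen for illustration (the routines actually used are those listed in Table 3 and may differ), and `classify_macro` is a name introduced here.

```python
import re

# Hypothetical indicator patterns per category; NOT the list from Table 3.
CATEGORY_PATTERNS = {
    "RE": [r"\bShell\b", r"\.Run\b", r"ShellExecute"],
    "FM": [r"\bOpen\b.+\bFor\s+(Output|Binary|Append)\b", r"\.SaveAs\b"],
    "OFI": [r"NormalTemplate", r"OrganizerCopy"],
    "OTI": [r"Outlook\.Application"],
    "FD": [r"URLDownloadToFile", r"MSXML2\.XMLHTTP"],
}


def classify_macro(vba_source):
    """Return the sorted list of categories whose patterns match the source.

    A macro may fall into several categories at once, since routines can be
    combined within the same attack.
    """
    return sorted(
        category
        for category, patterns in CATEGORY_PATTERNS.items()
        if any(re.search(p, vba_source, re.IGNORECASE) for p in patterns)
    )
```

Pattern matching of this kind is deliberately crude: it only flags the presence of a routine name and says nothing about whether the call is reached at run time, which is why it is best applied to the routines recorded during execution rather than to the raw source.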

Table 3 Number of files belonging to the main categories of attacks that do not involve PowerShell, along with the most typical lines of code for each family

We also found that the execution of the process generated one or more windows in 5005 cases, constituting circa 16.28% of the processable set. Oblivion saves a screenshot of the spawned windows, interacts with all available Button and TextBox objects, and records them textually in the report. In practice, we found no valuable information inside these windows; nevertheless, the plug-in proved useful to move the execution along and bypass the interrupts.

5.4 Malware statistics

In the following, we provide additional insight into the analyzed data after the post-processing phase. Our analysis evidenced ten malware families that are especially used in this dataset (also reported in Table 4):

  • Metacol is a Melissa-inspired mass-mailing sample that hijacks Outlook and sends infected documents to available e-mail addresses.

  • Thus first infects the Global Template and all currently open Office documents. The payload only activates on December 13th and deletes all files in C:\.

  • Laroux is a historic piece of malware that replicates itself in all Excel workbooks opened.

  • Donoff downloads malicious executables or libraries and saves them in user folders.

  • Valyria contains a script that, at the same time, downloads additional payloads and tries to send user information to a compromised server.

  • Marker exfiltrates execution logs via FTP. This sample can be recognized by its mentions of “Shankar’s Birthday” in various instances.

  • Alcaul is a Metacol variant.

  • Locky encrypts all user data, appends .locky as an extension, and generates a ransom note on the Desktop.

  • Madeba opens a connection with the attacker and receives commands.

  • Mailcab drops an infected workbook named K4.xls inside the Microsoft Excel Startup folder.

Table 4 Most common malware families in our dataset

Interestingly, the most employed families span from old attacks constantly reused over the years to recent, destructive ones like ransomware. Many of these samples established network connections to carry out their malicious actions. We report in Table 5 the most commonly contacted domains, along with the HTTP response codeFootnote 13 to estimate if that domain is still active. Despite the analyzed samples being some years old, we noticed that two domains are still reachable and are deemed malicious by SURBL.Footnote 14 Notably, the fact that two malicious samples contact the same domain does not necessarily mean they belong to the same attack family.

Table 5 Most contacted domains by the samples in our dataset

Table 6 shows the top 10 unique macros we can identify in our set. We extracted the macro using the Oblivion Macro Instrumentation module, then computed the SHA-256 hash value for the resulting object to see if there were repetitions. With this, we evidence a trend of code reuse in (apparently) unrelated samples. However, it is important to mention that two samples with the same VBA code may execute different attacks. For instance, consider two documents containing the piece of code seen in Listing 7; the infection methodology is the same, but the payload is contained outside the VBA Project; hence, it may differ between the two samples.

figure i
Table 6 Enumeration of the most popular macros contained in our dataset samples
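
The repetition count behind Table 6 can be sketched as follows, assuming the macro source of each sample has already been extracted as a string. The function names `macro_digest` and `most_common_macros` are introduced here for illustration.

```python
import hashlib
from collections import Counter


def macro_digest(vba_source):
    """Return the SHA-256 hex digest of a macro's source text."""
    return hashlib.sha256(vba_source.encode("utf-8")).hexdigest()


def most_common_macros(macros, top=10):
    """Return the `top` most frequent (digest, count) pairs.

    `macros` is an iterable with one extracted macro source string per sample.
    """
    return Counter(macro_digest(m) for m in macros).most_common(top)
```

Exact hashing only groups byte-identical macros: two samples that differ by a single renamed variable or an extra blank line produce different digests, so the counts obtained this way are a lower bound on actual code reuse.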

5.5 Performance analysis

In this Section, we provide an insight into the performance attained with Oblivion in terms of the time employed to execute macros. More specifically, we tested the execution performance of Oblivion on samples that were fully executed. We did not include in our analysis the performance related to partial executions or errors, as the shorter execution of such macros would have biased the overall results. The execution times concern the sum of the instrumentation, execution, and post-processing phases (i.e., until the creation of the file report).

Fig. 4
figure 4

A representation of the overall time Oblivion took to analyze the dataset. The section in blue \((60.74\%)\) contains samples that were analyzed in less than 30 s, the one in orange \((23.36\%)\) those analyzed in 30 to 60 s, and the one in red \((13.90\%)\) those that took more than 60 s. The highest execution time was 122.53 s

The attained results are depicted in Fig. 4, showing that Oblivion could analyze most samples in less than 30 s each. Specifically, the average analysis time per single sample is \(41.11 \pm 32.47\) s, which drops to \(28.70 \pm 9.02\) s if we discard the outliers in the red region. Considering the typical analysis times of sandboxes in the wild, we believe that this result shows that Oblivion can be employed to analyze large groups of files, providing quick and reliable results.
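
For readers who want to reproduce this kind of summary on their own timings, a minimal sketch follows. It assumes a plain list of per-sample analysis times in seconds; the function names are introduced here, and the use of the population standard deviation is an assumption, since the text does not state which estimator was used.

```python
import statistics


def summarize(times, cutoff=None):
    """Return (mean, standard deviation) of the analysis times.

    If `cutoff` is given, samples slower than `cutoff` seconds are discarded
    first, mirroring the removal of the outliers in the red region.
    """
    kept = [t for t in times if cutoff is None or t <= cutoff]
    return statistics.mean(kept), statistics.pstdev(kept)


def bucket_shares(times):
    """Return the percentage of samples under 30 s, from 30 to 60 s, and over 60 s."""
    fast = sum(t < 30 for t in times)
    slow = sum(t > 60 for t in times)
    medium = len(times) - fast - slow
    return tuple(100 * n / len(times) for n in (fast, medium, slow))
```

Calling `summarize(times)` and `summarize(times, cutoff=60)` side by side shows how strongly a few slow samples inflate both the mean and the spread.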

5.6 Comparison with other works

5.6.1 VHook

As mentioned earlier, Oblivion bases its code on VHook. The most notable expansion we implement compared to this tool is Variable Tracing, which allows us to observe the final content and the relative evolution of each variable in the macro. To demonstrate the utility of this capability, we take the case of the “Sample A”,Footnote 15 which we analyze separately with both VHook and Oblivion. The results of these two analyses are reported in Appendix B. The sample in question is classified as belonging to the sagent family by AVClass2 [41]. As reported by Kaspersky,Footnote 16 malware of this family consists of Microsoft Office documents that contain a malicious VBA script for downloading other malware secretly. This behavior is not inferrable from the data dumped by VHook, which consists mostly of MID and Left calls. However, from the information detected by Oblivion, it is immediately noticeable that a Base64 string is constructed in the variable v06754B1BKV6. It is, therefore, sufficient to decode it to reveal a second stage, which we also report in Appendix B, whose capabilities are more explicit since this code is not as obfuscated.
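
The decoding step mentioned above is straightforward once the traced variable holds the complete Base64 string. The sketch below is an illustration under assumptions: `decode_stage` is a name introduced here, and the choice between UTF-16LE (the encoding expected by PowerShell's `-EncodedCommand`) and UTF-8 relies on a simple heuristic rather than on anything Oblivion is stated to do.

```python
import base64
import binascii


def decode_stage(b64_text):
    """Decode a Base64 string to text, or return None if it is not valid Base64."""
    try:
        raw = base64.b64decode(b64_text, validate=True)
    except (binascii.Error, ValueError):
        return None
    # Heuristic: ASCII text stored as UTF-16LE has a zero in every second byte.
    looks_utf16 = raw[1::2].count(0) > len(raw) // 4
    return raw.decode("utf-16-le" if looks_utf16 else "utf-8", errors="replace")
```

For instance, a string assembled from the fragments `"SGVs"` and `"bG8="` decodes to `Hello`; this is exactly the kind of value that is invisible when looking only at the individual string-manipulation calls.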

5.6.2 Online sandboxes

We have stated that the main advantages of Oblivion over dynamic analysis tools available in the wild are (i) the ability to handle elementary interactions without human support and (ii) reasonable timing. To demonstrate the first claim, let us consider the case of “Sample B”,Footnote 17 analyzed with Oblivion and ANY.RUN,Footnote 18 a popular online sandbox. We analyzed the sample once with Oblivion and twice with ANY.RUN, the difference between these two runs being that we manually clickFootnote 19 or notFootnote 20 on the MessageBox that spawns. As Fig. 5 shows, the actual malicious request to bagsrad.com:8099 was sent only after the button was clicked. This means that, if no manual input is provided, ANY.RUN does not detect the maliciousness of the file. Oblivion was instead able to fully execute the sample, perform the required interaction and generate a report in 8.45 s. This sample utilized the MsgBox API to block the execution of the malicious code and hinder dynamic analysis. As previously stated in Sect. 5.1.2, we find evidence of 6646 samples that contain traces of usage of this API (or of the equivalently used InputBox). As seen in this case study, the code following these calls remains unreachable unless the interaction is dealt with.
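
A static check for traces of these blocking calls can be sketched with a regular expression over the macro source. This is only an illustration of the idea: `blocking_calls` is a name introduced here, and the comment handling is intentionally crude.

```python
import re

# VBA is case-insensitive, so match MsgBox / InputBox regardless of case.
INTERACTION_CALL = re.compile(r"\b(MsgBox|InputBox)\b", re.IGNORECASE)


def blocking_calls(vba_source):
    """Return (line number, API name as written) for each interaction call."""
    hits = []
    for number, line in enumerate(vba_source.splitlines(), start=1):
        # Crude comment stripping: everything after the first apostrophe is
        # dropped, which also truncates string literals containing one.
        code = line.split("'", 1)[0]
        for match in INTERACTION_CALL.finditer(code):
            hits.append((number, match.group(1)))
    return hits
```

A non-empty result only indicates that the sample may wait for user input; whether the call is actually reached is something that only execution can confirm.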

Fig. 5
figure 5

From top to bottom: ANY.RUN analysis where the interaction is not carried out, ANY.RUN analysis where it is, and section of the Oblivion report

As for the second claim, we refer to ANY.RUN and Crowdstrike’s Falcon Sandbox.Footnote 21 The Hybrid Analysis websiteFootnote 22 and Fig. 6 show that the average processing time for the latter is around 7 to 8 min, depending on queue length. Conversely, the default time for an ANY.RUN analysis not involving usage of paid REST APIs is 60 s. Additionally, access to these APIs is the only non-convoluted way to use ANY.RUN for larger corpora of files. If we consider Oblivion’s times discussed in Sect. 5.5, we can see that Oblivion provides a result in less time in the majority of cases. However, this comparison must take into account that the performance of online sandboxes may vary with the availability of the system. Additionally, we did not include network latency in our analysis time. While Office files are on average fairly small in size, an under-performing connection may increase the overall waiting time.

Fig. 6
figure 6

Falcon Sandbox average waiting time warning

5.6.3 Emulation

We also consider emulation, that is, the set of techniques that aim to imitate the behavior of another program or device. Specifically, we consider ViperMonkey [42], presented by Philippe Lagadec at Black Hat Europe in 2019 and subsequently on Github.Footnote 23 The program converts VBA code into Python and then evaluates it to extract IoCs. While this program certainly has its advantages, mainly that (i) it does not require an Office installation and (ii) it can be run on different operating systems, Oblivion still presents characteristics that ViperMonkey does not. First, we tried to use “Sample B” as in Sect. 5.6.2 to infer ViperMonkey’s behavior when interactions are present. Unfortunately, since the code uses unconventional encoding for obfuscation purposes, ViperMonkey refuses to analyze it because it mistakes the file for a corrupted one. To inspect this behavior, we then introduce “Sample C”,Footnote 24 which we published on VirusTotal, where we replaced the variable names with simpler ones. String content has not been replaced because such an operation would break the functionality of the macro. Additionally, we added a MessageBox call to ensure that traces of the to-be-called URL are present even if the malicious domain is taken down. As seen in Fig. 7, ViperMonkey is actually able to deal with the MessageBox call, but ultimately fails to correctly inspect non-ASCII strings and therefore returns an incorrect result.

Fig. 7
figure 7

From top to bottom: ViperMonkey analysis and section of the Oblivion report

6 Discussion and limitations

As shown in the previous Sections, Oblivion is a complex system whose elements cooperate to address the variety of malicious macros in the wild. However, the system is imperfect, as it features some limitations that we aim to address (also with the community) in the next releases.

First, while Oblivion can address most of the user interactions in the wild, it does not consider those that are not directly linked to actions performed by the macro. In some cases, interaction windows are generated by the Office suite itself, according to unexpected events. Hence, it is generally difficult to control and predict the appearance of these windows.

The second limitation concerns samples containing passwords, which essentially lock access to the embedded macros. Some passwords can be easy to remove with brute-forcing or by directly patching the document (by replacing the DPB string in the vbaProject.bin with DPX [43]). However, this method does not always work, as it depends on the employed version of Office and the file type (for example, there are consistent differences between .xls and .xlsx files in managing passwords). For simplicity, we decided not to address password-protected files in the experimental evaluation of this work. However, we plan to integrate full password-cracking support in the next releases of Oblivion.
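
The patching approach mentioned above amounts to a same-length byte substitution. The following sketch operates on the raw bytes of a vbaProject.bin that has already been extracted from the document; `neutralize_vba_password` is a name introduced here, and, as noted in the text, the trick is not guaranteed to work for every Office version or file type.

```python
def neutralize_vba_password(vba_project: bytes) -> bytes:
    """Rename the first DPB key to DPX in raw vbaProject.bin bytes.

    The replacement has the same length as the original, so the offsets of
    all other structures in the file are preserved.
    """
    marker, patched = b"DPB=", b"DPX="
    if marker not in vba_project:
        raise ValueError("DPB key not found")
    return vba_project.replace(marker, patched, 1)
```

Working on a copy of the sample is advisable, since the patched file no longer matches the hash under which the original was collected.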

The third major limitation concerns the presence of Oblivion errors, as stated in Sect. 5.2. A more detailed analysis of the errors showed that they are mostly related to the excessive size of the instrumented macros (in terms of code lines). We plan to solve this problem in the next release by splitting the instrumented routine into subfunctions (that can also be located in different modules) that are progressively called.

Finally, it is worth noting that some instrumented macros failed their execution due to unexpected errors we could not correctly debug, such as invalid routine calls or sudden crashes of the virtualizer that prevented us from completing the analysis. We speculate that some of these problems may be solved by using a different virtualizer, and we plan to test Oblivion with other virtualizers besides Sandboxie.

7 Conclusions and future work

In this paper, we presented Oblivion, an open-source framework for analyzing and de-obfuscating macros embedded in Office files. We used Oblivion to perform a large-scale analysis of malicious macro-based Office files by pointing out several intriguing characteristics, such as the embedded PowerShell codes, attack categories alternative to PowerShell, and popular reachable domains. Finally, we showed that Oblivion is especially suitable for large-scale analyses due to its architecture and speed. We are releasing the complete source code of Oblivion, as well as all the experimental results obtained with our tool.

As mentioned in the paper, the architecture of Oblivion is modular and easily expandable, thus allowing other researchers and users to work on the system. Indeed, Oblivion is just the first step of various challenges that must be adequately addressed, such as the detection of non-macro-based Office malware. We hope that our work can foster research on these categories of attacks, which are still among the biggest malware threats in the wild.

Oblivion may also be further expanded to address Office-based attacks that do not resort to macros.