Robustly Parsing the PE Header

The first thing a disassembler needs is, obviously, a place to start. For Windows executables, this is usually found by parsing the executable's PE header. This well-known, well-documented data structure contains the information the program loader needs in order to execute the file: how the file's contents map into memory, which libraries must be loaded, and, most importantly for our purposes, the entry point of the program.
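As a minimal sketch of that lookup (not the parser from the post itself), the following Python reads the entry point from a PE image held in memory. The function name `find_entry_point` is made up for illustration; the field offsets follow the published PE/COFF layout. It returns a relative virtual address (RVA), not a file offset, and it performs only the bounds checks needed to avoid reading past the buffer.

```python
import struct

def find_entry_point(data: bytes) -> int:
    """Return AddressOfEntryPoint (an RVA) from a PE image.

    Hypothetical helper for illustration only; a robust parser would
    validate far more than this.
    """
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing MZ signature")

    # e_lfanew (offset 0x3C in the DOS header) is the file offset of
    # the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)

    # We need the signature (4 bytes), the COFF file header (20 bytes),
    # and the first 20 bytes of the optional header.
    if e_lfanew + 4 + 20 + 20 > len(data):
        raise ValueError("e_lfanew points outside the file")
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")

    optional_header = e_lfanew + 4 + 20
    (magic,) = struct.unpack_from("<H", data, optional_header)
    if magic not in (0x10B, 0x20B):  # PE32 or PE32+
        raise ValueError("unknown optional header magic")

    # AddressOfEntryPoint sits at offset 16 of the optional header in
    # both PE32 and PE32+.
    (entry_rva,) = struct.unpack_from("<I", data, optional_header + 16)
    return entry_rva
```

Translating the returned RVA into a file offset requires walking the section table, which this sketch leaves out.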

Mobile Botnets

Mobile platforms present botnet creators with new opportunities and new challenges. There is a significant need for academic papers that analyse, predict, or mitigate the production of mobile botnets. Even designing a new botnet as a warning and proof of concept can be beneficial to security researchers. This article describes the state of research in mobile botnets and suggests open problems for academics to solve.

Dynamic Analysis of Applications

There are several ways to analyze Android applications for suspicious behavior. These are typically categorized as static or dynamic analysis. Static analysis evaluates code without executing it, while dynamic analysis tests the behavior of code during execution. This article discusses current dynamic analysis techniques for Android applications and the open problems associated with them.

Hybridizing Assembly Retrieval

Most disassembly tools perform either a linear sweep retrieval or a recursive traversal retrieval. Linear sweep starts at the beginning of each executable section and disassembles from the first offset, continuing at the offset that follows the end of each retrieved instruction. Recursive traversal has a formal definition, but put simply, it performs a piece-wise linear sweep over a series of program blocks, or contiguous (non-branching) instruction segments. When a branch instruction is discovered, an attempt is made to determine its targets and, if any are found, each target is recursively disassembled. I mentioned in an earlier post that this tool can handle aliased instructions by intentional design. This ability affords us several benefits as a disassembler, one of which is the ability to perform both linear sweep and recursive traversal simultaneously.
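To make the two baseline strategies concrete, here is a small sketch over an invented four-opcode instruction set. Everything in it (the opcodes, the one-byte absolute branch operand, the function names) is an assumption made for illustration, and it does not reproduce the hybrid approach of the tool discussed in the post.

```python
# Toy instruction set (hypothetical): one opcode byte, and branches
# carry a one-byte absolute target.
NOP, JMP, JZ, RET = 0x00, 0x01, 0x02, 0x03

def decode(code, off):
    """Return (opcode or None, length, branch target or None)."""
    op = code[off]
    if op in (JMP, JZ):
        if off + 1 >= len(code):
            return None, 1, None  # truncated operand: treat as data
        return op, 2, code[off + 1]
    if op in (NOP, RET):
        return op, 1, None
    return None, 1, None  # unknown byte: treat as one byte of data

def linear_sweep(code):
    """Decode from offset 0, always continuing at the next offset."""
    offsets, off = [], 0
    while off < len(code):
        _, length, _ = decode(code, off)
        offsets.append(off)
        off += length
    return offsets

def recursive_traversal(code, entry=0):
    """Follow control flow from the entry point using a worklist."""
    seen, worklist = set(), [entry]
    while worklist:
        off = worklist.pop()
        while 0 <= off < len(code) and off not in seen:
            op, length, target = decode(code, off)
            if op is None:
                break  # not an instruction: abandon this path
            seen.add(off)
            if target is not None:
                worklist.append(target)  # disassemble the target later
            if op in (JMP, RET):
                break  # no fall-through after these
            off += length
    return sorted(seen)

# A jump over one junk byte that happens to look like a JMP opcode.
program = bytes([JMP, 3, 0x01, NOP, RET])
print(linear_sweep(program))         # [0, 2, 4]
print(recursive_traversal(program))  # [0, 3, 4]
```

In this example the linear sweep decodes the junk byte at offset 2 as a two-byte jump and so never sees the real instruction at offset 3, while the recursive traversal follows the jump and skips the junk. The reverse weakness also exists: recursive traversal misses code that is reachable only through branches whose targets it cannot determine.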

Semantic Representation of Assembly Architectures

Areas such as binary translation, program analysis, and compilers commonly need to represent low-level assembly instructions in a more abstract semantic form. This semantic representation is often referred to as an Intermediate Language or Intermediate Representation (IL/IR). An IR provides a standardized way to perform the same operations over many different architectures without having to write a separate implementation for each one. For example, an optimizing compiler can contain a single optimization procedure that operates on the IR, rather than many procedures that each have to handle the quirks of the architecture they target.
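The idea can be sketched with two invented architectures lifted into one tiny IR, followed by a single constant-folding pass that serves both. The instruction tuples, the `Const` and `Add` node types, and the function names are all hypothetical, and the pass ignores register width and overflow for brevity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Const:
    dst: str
    value: int

@dataclass(frozen=True)
class Add:
    dst: str
    lhs: str
    rhs: str

def lift_arch_a(insn):
    """Hypothetical two-operand arch: ('mov', dst, imm) or
    ('add', dst, src), where add means dst = dst + src."""
    op, dst, src = insn
    if op == "mov":
        return Const(dst, src)
    if op == "add":
        return Add(dst, dst, src)
    raise ValueError(f"unknown instruction: {insn!r}")

def lift_arch_b(insn):
    """Hypothetical three-operand arch: ('li', dst, imm) or
    ('add', dst, a, b)."""
    if insn[0] == "li":
        return Const(insn[1], insn[2])
    if insn[0] == "add":
        return Add(insn[1], insn[2], insn[3])
    raise ValueError(f"unknown instruction: {insn!r}")

def fold_constants(ir):
    """One pass, shared by every architecture that lifts to this IR."""
    known, out = {}, []
    for node in ir:
        if isinstance(node, Add) and node.lhs in known and node.rhs in known:
            node = Const(node.dst, known[node.lhs] + known[node.rhs])
        if isinstance(node, Const):
            known[node.dst] = node.value
        else:
            known.pop(node.dst, None)  # value is no longer known
        out.append(node)
    return out

prog_a = [("mov", "r0", 2), ("mov", "r1", 3), ("add", "r0", "r1")]
prog_b = [("li", "x1", 2), ("li", "x2", 3), ("add", "x3", "x1", "x2")]
print(fold_constants([lift_arch_a(i) for i in prog_a])[-1])
print(fold_constants([lift_arch_b(i) for i in prog_b])[-1])
```

Only the two lifters know anything about the source architectures; `fold_constants` is written once against the IR.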