[SANS ISC] CFBF Files Strings Analysis, (Mon, Jun 28th)

The Office file format that predates OOXML is a binary format based on the Compound File Binary Format (CFBF). I informally call this the ole file format.

It’s a binary file format and is uncompressed (disregarding application-specific exceptions, like VBA source code).

That lends itself to strings analysis, as I’ve written about in previous diary entries.

There is a potential problem when you run the strings command on a .doc file, for example. The CFBF file format is similar to a file system format: it is made up of sectors and has File Allocation Tables. This means that the data contained in streams is written into sectors, and the sectors of a single stream don’t have to be stored sequentially in the file.

If you are looking for URLs, for example, you could run the strings command on a .doc file and grep for the string http.
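As a rough illustration of what that strings-and-grep pipeline does, here is a minimal Python sketch. It only approximates the strings command (runs of printable ASCII, minimum length 4), and the names `extract_strings` and `grep_strings` are made up for this example.

```python
import re

def extract_strings(data, min_length=4):
    """Return runs of printable ASCII bytes of at least min_length,
    roughly what the strings command reports by default."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return re.findall(pattern, data)

def grep_strings(path, needle=b"http"):
    """Read the raw file and keep only the strings that contain needle."""
    with open(path, "rb") as f:
        data = f.read()
    return [s for s in extract_strings(data) if needle in s]
```

Calling `grep_strings("sample.doc")` (a hypothetical file name) would return every raw string in the file that contains http. Note that this works on the file as it is laid out on disk, not on the content of the individual streams.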

It can happen that a URL straddles the boundary of 2 sectors: its first part is at the end of sector N, and its last part is at the start of sector N+1. If both sectors are written sequentially into the CFBF file, there is no problem. But if they are not contiguous, the strings command cannot extract the complete URL, as it is split into 2 strings that are separated by other data.
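The effect can be reproduced with a small simulation. This is only a sketch: it does not build a real CFBF file, the 512-byte sector size is an assumption for illustration, and the URL is hypothetical.

```python
import re

SECTOR_SIZE = 512  # assumed sector size, for illustration only
URL_PATTERN = re.compile(rb"http[\x21-\x7e]*")

# A two-sector stream whose URL starts 10 bytes before the sector boundary.
url = b"http://example.com/payload"  # hypothetical URL
stream = (b"\x00" * (SECTOR_SIZE - 10) + url).ljust(2 * SECTOR_SIZE, b"\x00")
sector_n, sector_n1 = stream[:SECTOR_SIZE], stream[SECTOR_SIZE:]

# A sector that belongs to some other stream.
other = b"\x01" * SECTOR_SIZE

contiguous = sector_n + sector_n1          # sectors stored back to back
fragmented = sector_n + other + sector_n1  # other data sits in between

contiguous_hits = URL_PATTERN.findall(contiguous)
fragmented_hits = URL_PATTERN.findall(fragmented)
print(contiguous_hits)  # the complete URL
print(fragmented_hits)  # only the part that fits in sector N
```

In the fragmented layout, the search finds only `http://exa`. The rest of the URL is still in the file, but as a separate string that no longer contains http.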

I have a couple of solutions to this problem.

The first one, which I’ll cover in this diary entry, is quite simple: oledump.py, my tool to analyze CFBF files, has an option to extract all strings from a stream: -S.

Take this Word document, a .doc file, where I have typed a URL on the first page:

Stream 6 contains the content that I typed:

This is for a single stream. Use “-s a” to extract strings from all streams:

With this last command, you can extract all strings from all streams. You will not extract strings that are located outside of streams, like the names of the streams, for example.
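The same idea, extracting strings per stream instead of from the raw file, can be sketched in Python. This is not how oledump.py is implemented: it is a minimal sketch that assumes the third-party olefile library is installed, and the names `strings_from_streams` and `read_streams` are made up for this example. It only looks for printable ASCII strings.

```python
import re

def strings_from_streams(streams, min_length=4):
    """Map each stream name to the printable ASCII strings in its bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return {name: pattern.findall(data) for name, data in streams.items()}

def read_streams(path):
    """Read every stream of a CFBF file into memory, keyed by its path.
    The library follows the sector chains, so each stream comes back
    reassembled in the right order."""
    import olefile  # third-party dependency, imported lazily
    with olefile.OleFileIO(path) as ole:
        return {
            "/".join(entry): ole.openstream(entry).read()
            for entry in ole.listdir(streams=True, storages=False)
        }
```

For a hypothetical file, `strings_from_streams(read_streams("sample.doc"))` would give the strings of each stream. Because the search runs on reassembled stream content, a URL that straddles 2 non-contiguous sectors comes out in one piece.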

You don’t have control over oledump’s string extraction method. For example, you cannot specify a minimum string length (it is fixed at 4).

There are other methods that do give you that control, but that is for another diary entry.


Didier Stevens
Senior handler
Microsoft MVP

(c) SANS Internet Storm Center. https://isc.sans.edu Creative Commons Attribution-Noncommercial 3.0 United States License.

