Abstract:
Techniques for creating SFX archives and interpreters.

Created by Peter Kankowski
Last changed
Contributors: Ace
Filed under Win32 programming


Self-extracting executables

Some programs extract data from themselves; for example, an SFX archive reads the compressed data from its own exe file. The file consists of two parts, the executable and the archive, that are simply concatenated. You can append the archive to the exe with this command:

copy /b sfx.exe+archive.zip sfx_ready.exe

The program opens its own exe file, finds the zip file attached to it, and unpacks it. In a similar way, some programming languages append your program to an interpreter to make a stand-alone exe, which is easier to distribute (AutoIt and Rapid-Q are examples of this approach).

How can you do this?

Executable file structure

Figure: Exe file structure

The exe file stores information about its size in headers. So you can get the size of executable data from the headers and read the attached data from this position.

The chart above shows all the necessary details. First, you should reach the PE (Portable Executable) headers (the IMAGE_NT_HEADERS32 structure in winnt.h) by reading their offset from the e_lfanew field of the DOS header. The PE header contains SizeOfCode and SizeOfInitializedData fields, but Windows doesn't require their values to be correct, so a linker can write wrong numbers there. We need a more reliable source to calculate the exe data size.

And here it is: the section table. Each element of the table stores the file offset of the section data and its size; the number of elements is stored in the NumberOfSections field. There are two sections on the chart, ".data" and ".text"; real programs may have more. Some sections can have PointerToRawData set to zero, meaning that the loader should initialize them to empty memory pages.

Let's walk through the section table and find the section with the maximum PointerToRawData value. The size of the executable data is then equal to the PointerToRawData of this section plus its SizeOfRawData. In the sample file from the chart, the size is the offset of the ".data" section plus the size of the ".data" section.
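The same walk can be sketched without windows.h, which makes the byte layout explicit. This is only an illustration under stated assumptions: the function name pe_overlay_offset and the helpers rd16/rd32 are made up for this example, the header area is assumed to be already in memory, and the numeric offsets are those of the standard IMAGE_FILE_HEADER and IMAGE_SECTION_HEADER layouts. It locates the section table through SizeOfOptionalHeader, so it should also cope with 64-bit executables.

```c
#include <stddef.h>
#include <stdint.h>

/* Read little-endian integers from a byte buffer (PE files are little-endian). */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns the file offset where the appended data
   starts, or 0 if the buffer does not look like a PE file.
   `img` must hold at least the headers and the section table. */
uint32_t pe_overlay_offset(const uint8_t *img, size_t len)
{
    if (len < 0x40 || rd16(img) != 0x5A4D)             /* "MZ" */
        return 0;
    uint32_t pe = rd32(img + 0x3C);                     /* e_lfanew */
    if (pe > len || len - pe < 24 ||
        rd32(img + pe) != 0x00004550)                   /* "PE\0\0" */
        return 0;

    uint16_t nsections = rd16(img + pe + 6);            /* NumberOfSections */
    uint16_t optsize   = rd16(img + pe + 20);           /* SizeOfOptionalHeader */
    size_t table = (size_t)pe + 24 + optsize;           /* first section header */

    uint32_t maxptr = 0, end = 0;
    for (uint16_t i = 0; i < nsections; i++) {
        size_t sec = table + (size_t)i * 40;            /* each header is 40 bytes */
        if (sec + 40 > len)
            return 0;                                   /* table is truncated */
        uint32_t rawsize = rd32(img + sec + 16);        /* SizeOfRawData */
        uint32_t rawptr  = rd32(img + sec + 20);        /* PointerToRawData */
        if (rawptr > maxptr) {      /* sections with rawptr == 0 never win */
            maxptr = rawptr;
            end = rawptr + rawsize;
        }
    }
    return end;
}
```

For a file with two sections at offsets 0x400 and 0x600, each 0x200 bytes long, the function returns 0x800; a return value of 0 means the buffer does not look like an exe file.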

The program

Here is the program (error handling is mostly stripped to make the example shorter and clearer):

int ReadFromExeFile(void)
{
  BYTE buff[4096]; DWORD read; BYTE* data;

  // Open exe file
  GetModuleFileName(NULL, (CHAR*)buff, sizeof(buff));
  HANDLE hFile = CreateFile((CHAR*)buff, GENERIC_READ, FILE_SHARE_READ,
    NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
  if(INVALID_HANDLE_VALUE == hFile)
      return ERR_READFAILED;
  ReadFile(hFile, buff, sizeof(buff), &read, NULL);
  IMAGE_DOS_HEADER* dosheader = (IMAGE_DOS_HEADER*)buff;

  // Locate PE header
  IMAGE_NT_HEADERS32* header =
     (IMAGE_NT_HEADERS32*)(buff + dosheader->e_lfanew);
  if(dosheader->e_magic != IMAGE_DOS_SIGNATURE ||
    header->Signature != IMAGE_NT_SIGNATURE) {
    CloseHandle(hFile);
    return ERR_BADFORMAT;
  }

  // For each section
  IMAGE_SECTION_HEADER* sectiontable =
    (IMAGE_SECTION_HEADER*)((BYTE*)header + sizeof(IMAGE_NT_HEADERS32));
  DWORD maxpointer = 0, exesize = 0;
  for(int i = 0; i < header->FileHeader.NumberOfSections; i++) {
    if(sectiontable->PointerToRawData > maxpointer) {
      maxpointer = sectiontable->PointerToRawData;
      exesize = sectiontable->PointerToRawData +
        sectiontable->SizeOfRawData;
    }
    sectiontable++;
  }

  // Seek to the overlay
  DWORD filesize = GetFileSize(hFile, NULL);
  DWORD datasize = filesize - exesize; // size of the attached data
  SetFilePointer(hFile, exesize, NULL, FILE_BEGIN);
  data = (BYTE*)malloc(datasize + 1);
  ReadFile(hFile, data, datasize, &read, NULL);
  CloseHandle(hFile);

  // Process the data
  *(data + datasize) = '\0';
  MessageBox(0, (CHAR*)data, AppName, MB_ICONINFORMATION);
  free(data);
  return ERR_OK;
}

The sample program just reads the whole overlay and shows it in a message box. In a real program, you may consider reading long overlays in chunks of 32-64 KB or so. The code was tested with different compilers and section layouts.
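Reading in chunks can look like the following sketch. It uses portable stdio calls instead of ReadFile; the function name copy_overlay and the 64 KB chunk size are assumptions chosen for illustration, and offset is the exe size computed from the section table.

```c
#include <stdio.h>

enum { CHUNK = 64 * 1024 };  /* assumed chunk size, within the 32-64 KB range */

/* Hypothetical helper: copies everything from `offset` to the end of `in`
   into `out`. Returns the number of bytes copied, or -1 on error. */
long copy_overlay(FILE *in, long offset, FILE *out)
{
    /* static keeps the 64 KB buffer off the stack (not thread-safe) */
    static unsigned char buf[CHUNK];
    long total = 0;
    size_t n;

    if (fseek(in, offset, SEEK_SET) != 0)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n)
            return -1;
        total += (long)n;
    }
    return ferror(in) ? -1 : total;
}
```

Memory use stays constant regardless of the overlay size, at the cost of one extra loop.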

Download source code (7 Kb) with full error handling

Other possible approaches

  • Signature search. Put a signature between the extractor and the payload, open the executable file, and search for this signature. Rar self-extractor uses this method (see Archive::IsArchive method in UnRAR sources). Drawbacks:
    • If the extractor is large, the search may take a long time.
    • Because you search for the signature in your code, your executable will contain the signature, so a false positive is possible. You should somehow encode the signature instead of doing a direct comparison!
  • Fast signature search. Exe file sections are aligned on 512- or 4096-byte boundaries, so you don't need to scan the whole file; you just need to read it in 512-byte chunks and look for the signature at the start of each chunk. This method is used in the NSIS installer. It's faster, and false positives are unlikely.
  • Writing payload size at the end of the file. To extract the payload, you seek to the end of the file, read the payload size, then seek to the start of the payload (see this article). The same approach is used in Lyrics3 tag v2. Drawbacks:
    • This will not work if the payload structure is predefined, and no footer can be appended to it, e.g., if you are attaching a standard RAR file.
  • Embedding payload in resource. You can put your payload in exe file resources. Drawbacks:
    • Modifying resources is a complex task.
    • Such SFX files are not supported by WinZip/WinRAR.
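
As an illustration of the third approach, here is a minimal sketch of writing the payload size at the end of the file and finding the payload again. The function names append_payload and locate_payload and the 4-byte little-endian footer are assumptions made for this example, not a format used by any particular tool.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical writer: appends `len` payload bytes to `f`,
   followed by a 4-byte little-endian length footer. */
int append_payload(FILE *f, const void *payload, uint32_t len)
{
    unsigned char footer[4] = {
        (unsigned char)(len),       (unsigned char)(len >> 8),
        (unsigned char)(len >> 16), (unsigned char)(len >> 24)
    };
    if (fseek(f, 0, SEEK_END) != 0) return -1;
    if (fwrite(payload, 1, len, f) != len) return -1;
    if (fwrite(footer, 1, 4, f) != 4) return -1;
    return 0;
}

/* Hypothetical reader: on success stores the payload offset and length
   and returns 0; returns -1 if the footer is missing or inconsistent. */
int locate_payload(FILE *f, long *offset, uint32_t *len)
{
    unsigned char footer[4];
    if (fseek(f, -4, SEEK_END) != 0) return -1;
    long footer_pos = ftell(f);
    if (footer_pos < 0 || fread(footer, 1, 4, f) != 4) return -1;

    uint32_t n = (uint32_t)footer[0] | ((uint32_t)footer[1] << 8) |
                 ((uint32_t)footer[2] << 16) | ((uint32_t)footer[3] << 24);
    if (n > (unsigned long)footer_pos) return -1;  /* payload cannot be larger than the file */

    *offset = footer_pos - (long)n;
    *len = n;
    return 0;
}
```

The extractor never has to parse the exe headers or scan for a marker; it only needs two seeks and one 4-byte read.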

Additional reading

You can find more information about PE file format in the paper by B. Luevelsmeyer and in Iczelion's tutorial. See also this forum thread about self-extracting from memory, not from file (note that Wayside's code will not work if the last section has PointerToRawData==0).

Peter Kankowski

About the author

Peter is the developer of Aba Search and Replace, a tool for replacing text in multiple files. He likes to program in C with a bit of C++, also in x86 assembly language, Python, and PHP.


7 comments

AC,
Just a few ideas:
First, I would just put some marker bytes between the extractor and the payload. Since I control the extractor, I can be sure that the marker is not there verbatim.
Second, I would just use MapViewOfFile, search for the marker, and then save the payload without calling ReadFile even once.
Peter Kankowski,
This method is simple and reliable; Rar self-extractor uses it (see Archive::IsArchive method in UnRAR sources). The only drawback I see is that if the extractor is large, the search will take a long time.

The author of this article found an elegant solution: write the size of payload at the end of the file. To extract the payload, you seek to the end of the file, read the payload size, then seek to the start of the payload. BTW, the same approach is used in Lyrics3 tag v2.

Of course, this will not work if the payload structure is predefined, and no footer can be appended to it, e.g., if you are attaching a standard RAR file.
AC,
An installation sample that uses marker and .inf files to make the installation program very small

http://www.ddj.com/184416551?pgno=1

Very good demonstration of using MapViewOfFile with a "window" to the bigger file. For the copying purposes, the story can be even simpler

http://www.catch22.net/tuts/bigmem01.asp

As far as I know, the biggest advantage of mapping the file is that ReadFile must copy the content to the buffer you allocated (and the page is anyway previously loaded from the disk to the cache outside of your process space), and with the mapping, you access the page directly, the system only maps the page to your process space.

The whole James Brown's site looks interesting:
http://www.catch22.net/
Peter Kankowski,
Inspired by your comments, I dug into the sources of my favorite installer. It turns out that the NSIS authors found a fast solution using a marker. It's so simple that you will say: "Why didn't I think of it?!"

Just remember that data in an exe file is aligned by 512 or 4096 bytes. So you don't need to scan the whole exe for a marker, you just need to read 512-byte chunks and look for the marker at their start. A sketch (hFile is the opened exe file; MARKER stands for your own 4-byte marker value):

BYTE buff[512]; DWORD read;
while(ReadFile(hFile, buff, sizeof(buff), &read, NULL) && read >= sizeof(DWORD)) {
   if(*(DWORD*)buff == MARKER) {
       // Marker found! The payload starts at this chunk
       break;
   }
   // else read another 512-byte chunk in the loop
}


I believe it's the simplest method; it's also faster than other methods with marker.

About James Brown's article, I read it before, but IMHO MapViewOfFile is not better in any way than ReadFile. Optimization gurus from wasm.ru say the same thing. Namely, Leo found in his tests (the page is in Russian, sorry; here is a poor-quality automated translation) that the performance of ReadFile and MapViewOfFile is equal for files smaller than 2-3 MB. For larger files (70-100 MB), MapViewOfFile is 1.5-2 times slower. In conclusion, he recommends using ReadFile with 32-64 kilobyte chunks.

Thank you for a link to the "Creating small setups" article; I enjoyed reading it. So pleasant to find a programmer who still cares about the size of the setup file; many others don't think about it and end up creating bloated installers.
AC,
Yes, faster marker search is a very good idea. Thanks!

I took a look at the measurement article; it seems that Leo measured with FILE_FLAG_NO_BUFFERING, which sounds to me like bad benchmarking. In normal situations nobody wants "no buffering". Instead, he should have created several big files, checked that they are not fragmented, restarted, and then read a different file for each test. If he lets the file cache manager do its work, I'd still expect different results. As far as I was able to see from Filemon logs, VS 6 uses MapViewOfFile to access sources, and it's very fast with big sources. You can imagine it yourself: ReadFile must first read the file into the file cache and then memcpy it to your buffer. With MapViewOfFile there should be one memcpy less for each read, just the mapping of the pages into the process's virtual memory.
Tarmo Pikaro,

Btw, the zip file format allows placing the zip header at the end of the file, meaning you could have .exe + .zip in the same file. (Try to just append a zip to an exe: you will still be able to open it as a zip with 7z.)

But one remaining question is the digital signature: apparently it's placed at the end of the .exe as well.

I would like to tamper with the PE header to include the SFX data as dummy data, so the .exe can be signed afterwards.

Any ideas about this?

Peter Kankowski,

This has nothing to do with the zip header being located at the end of the archive. 7-Zip can open other SFX archive formats (RAR, 7z), where the headers are located at the beginning.

The SFX archive should be digitally signed after you have appended your ZIP file to the EXE.
