VMware Cloud Community
eimaj2nz
Contributor

Format of VGZ files?

I'm a developer on a content inspecting web gateway product. One of the features of our product is that it will recursively unpack downloaded files, in order to perform in-depth content analysis such as malware scanning and binary file type detection.

Our product is currently having problems with the ESXi 4.0.0 ISO image. More specifically, the files "cim.vgz" and "sys.vgz" cause errors when we try to unpack them. In most cases, this prevents users of our software from downloading the file. We'd prefer to avoid this, especially since the download is a legitimate VMware ISO image (checked via MD5).

As far as our code can tell, both of the problem files are standard GZip archives. We use 7-Zip to extract the contents of these files, and there are no problems in doing so. The extracted files then look like TAR archives, which we again run 7-Zip on in order to extract the contents. It is at this point that 7-Zip returns an error; our code picks this up and marks the file as "bad".
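
For readers following along, the two-stage check described above (peel off the GZip layer, then look at the first TAR-style header) can be sketched in Python. This is only an illustration, not the gateway product's code: the function names are made up for this example, and it assumes the decompressed payload fits in memory.

```python
import gzip
from typing import Optional

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream
TAR_BLOCK = 512           # size of a classic tar header block


def peel_gzip(data: bytes) -> Optional[bytes]:
    """Return the decompressed payload if data starts with the gzip magic, else None."""
    if not data.startswith(GZIP_MAGIC):
        return None
    return gzip.decompress(data)


def first_member_name(payload: bytes) -> Optional[str]:
    """Read the NUL-padded name field (bytes 0-99) of the first tar-style header."""
    if len(payload) < TAR_BLOCK:
        return None
    raw_name = payload[:100].split(b"\x00", 1)[0]
    return raw_name.decode("utf-8", errors="replace")
```

Something like peel_gzip(open("sys.vgz", "rb").read()) would give the .vtar payload; whether the name field of a .vtar header sits at the same offset as in a normal TAR header is an assumption here.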

We have used a number of tools in order to attempt to extract these files. 7-Zip simply says it can't open the TAR file. WinRAR reports corruption errors, but is sometimes able to report a few files. GNU TAR does best and is able to extract several files, but still reports errors.

Is there something special about these files that prevents them from being unpacked by normal tools? If so, is there any way to extract the contents of these files, in order for our software to inspect them? If not (for legal, technical or any other reason), is there something we can look for that differentiates these files from normal TAR+GZ archives?

Thanks in advance.

31 Replies
philwo
Contributor

Hi,

Have you figured anything out yet? I'm trying to extract the pci.ids and simple.map files from sys.vgz in order to override them in oem.tgz, but I can't find a way to unpack the .vtar file that I get after gunzipping the .vgz. I get the same errors you mention in your post.

Best regards,

Philipp

eimaj2nz
Contributor

I haven't found a lot yet.

So far, it looks like the files are a variant of the UStar TAR format. They appear to have the additional UStar-style data associated with them. However, that's about all I can tell so far.

I'll post back if I find out anything else.

Schorschi
Expert

Can't you use a built ESXi 4 server to track down the location of the map and PCI files?

eimaj2nz
Contributor

This will not help my problem. My company's product needs to be able to extract the files from the TAR archive for inspection. This happens at the time that ESXi is being downloaded, so it's no good installing the product to get at the files.

So far, the biggest difference in the file format is that the VMware TAR files use "visor\x20\x20\x20" in the "magic" field. For a POSIX UStar TAR file this would be "ustar\x00" followed by the two-character version "00".

I'm not sure quite what the other differences are yet, as I haven't had time to take a better look at the files.
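
The magic-field difference described above suggests a cheap way to tell these files apart from normal TAR archives before handing them to an extractor. A minimal Python sketch, assuming the field sits at the usual TAR offset (byte 257, 8 bytes including the version). POSIX ustar writes "ustar\0" plus version "00", and GNU tar writes "ustar" plus two spaces and a NUL. The exact bytes after "visor" are taken from the post and not independently verified, so only the prefix is matched. The function name and return labels are invented for this example.

```python
TAR_MAGIC_OFFSET = 257  # offset of the magic field inside a 512-byte tar header
TAR_MAGIC_LENGTH = 8    # 6-byte magic plus 2-byte version

POSIX_USTAR = b"ustar\x00" + b"00"  # magic "ustar\0", version "00"
GNU_TAR = b"ustar  \x00"            # GNU tar's combined magic/version bytes


def classify_tar_header(header: bytes) -> str:
    """Label a tar-style header block by the contents of its magic field."""
    end = TAR_MAGIC_OFFSET + TAR_MAGIC_LENGTH
    if len(header) < end:
        return "too-short"
    field = header[TAR_MAGIC_OFFSET:end]
    if field.startswith(b"visor"):
        # Assumption from the thread: VMware's vtar variant marks itself with "visor"
        return "vmware-vtar"
    if field == POSIX_USTAR:
        return "posix-ustar"
    if field == GNU_TAR:
        return "gnu-tar"
    return "unknown"
```

A gateway could use a result of "vmware-vtar" to skip generic TAR extraction for that payload instead of flagging the download as corrupt.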

admin
Immortal

vgz is a modified tgz format. It is designed to reduce memory consumption, both physical memory and memory reservation, for user apps.

You can use vmtar to convert between tgz <-> vgz. The usage:

Usage: vmtar {[-x vtar/vgz-file] -o destination} | -t < vtar/vgz-file

-Praveen

eimaj2nz
Contributor

Thanks for the information, Praveen. Unfortunately it does not help me.

I cannot find any information on the vmtar application. I cannot find it on the VMware site, in a general Google search, or in the Synaptic package manager in Ubuntu.

I'm also not sure why vmtar would be better with memory than normal tar. The tarball appears to have been compressed with normal GZip compression, and the data looks quite normal after decompression. It's only the tarball itself that's different, and there should be nothing in the raw untar process that uses large amounts of memory.

admin
Immortal

Normal tar format aligns files on a 512-byte boundary. vmtar aligns them on a 4K (page size) boundary. This lets running apps share pages directly with the vmtar file. Otherwise, we would need to allocate memory. For example, if you have 10 apps sharing a page, the memory needed is 1 page. But if we were to allocate, it would be 10 pages.
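
The arithmetic behind this can be worked through in a few lines of Python. This is only an illustration of the rounding and the page-count example above; the helper names are made up, and it models just the padding of file data, since the thread does not say how vmtar lays out the headers themselves.

```python
TAR_ALIGN = 512    # classic tar block size
PAGE_ALIGN = 4096  # 4K page size mentioned above


def padded_size(size: int, alignment: int) -> int:
    """Round size up to the next multiple of alignment."""
    return -(-size // alignment) * alignment


def pages_needed(apps: int, pages_per_app: int, shared: bool) -> int:
    """Pages consumed when apps either share the same pages or each get a private copy."""
    return pages_per_app if shared else apps * pages_per_app


if __name__ == "__main__":
    for size in (1, 512, 5000):
        print(size, padded_size(size, TAR_ALIGN), padded_size(size, PAGE_ALIGN))
    print(pages_needed(10, 1, shared=True), pages_needed(10, 1, shared=False))
```

A 5000-byte file occupies 5120 bytes of data space with 512-byte alignment and 8192 bytes with 4K alignment, so the page-aligned archive is larger on disk in exchange for being directly mappable.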

vmtar is internal only and the following is not officially supported:

You can get vmtar from /sbin/vmtar after you boot the machine. To log in to the machine, type "unsupported" on the console (Alt-F1), then log in as root with no password.

Hope this helps.

-Praveen

dpayette
Contributor

I am trying to expand the sys.vgz file, and receive the screen of death (no ramdisk, image corrupt) while ESXi is loading. All I want to do is add 2 pci-ids to the etc/vmware/simple.map within the sys.vgz.

This is my madness:

cp sys.vgz sys.gz
gunzip sys.gz
vmtar -x sys -o sys.tar
tar xvf sys.tar
(edit my files)
tar -cf sys.tar ./*
vmtar -c sys.tar -o sys
gzip -S .vgz sys

I place the new sys.vgz in my build directory, run mkisofs, then burn the ISO. I have tried various methods to untar the .vgz file with no luck.

Any suggestions?

David

admin
Immortal

Can you remove the ./ when running tar? That is, "tar -cf sys.tar ./*" should be "tar -cf sys.tar *". Also make sure that any files starting with "." are included in the tar file.

Please let me know if it still does not work.

Regards,

Praveen

stillfly
Contributor

I struggled with this for a while - here is where the breakdown is:

If you are trying to include hidden files, "tar -cf sys.tar *" will not work; it is the wildcard that is killing you here. Instead, tar the directory: "tar -cf sys.tar ./"

NCC1470
Contributor

Hey!

I am confused now:

stillfly writes that "tar cf sys.tar ./" works for him, but I found that "./" still leads to the "boot image is corrupted" error on boot.

My solution is "tar cf sys.tar `ls`", which gives the archive the right structure.

Like:

# tar tf sys.tar
bin/
bin/busybox
bin/ipkg

Cheers
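
The last few posts pull in two directions: member names should not carry a "./" prefix, but a bare "*" (and also `ls`) skips hidden files. One way to get both is to list the directory explicitly and set the member names by hand. The following is a Python sketch, not something tested against vmtar or an ESXi boot; the function name is made up, and choosing GNU format (to roughly match GNU tar's default) is an assumption, since the thread does not say which TAR flavour vmtar expects.

```python
import os
import tarfile
from typing import List


def create_flat_tar(src_dir: str, tar_path: str) -> List[str]:
    """Archive every top-level entry of src_dir, dotfiles included, without a './' prefix."""
    # os.listdir returns hidden entries too, unlike a shell '*' wildcard.
    # Keep tar_path outside src_dir so an old archive is not swept into the new one.
    names = sorted(os.listdir(src_dir))
    with tarfile.open(tar_path, "w", format=tarfile.GNU_FORMAT) as tf:
        for name in names:
            # arcname sets the stored path, e.g. 'bin/busybox' rather than './bin/busybox'
            tf.add(os.path.join(src_dir, name), arcname=name)
    with tarfile.open(tar_path) as tf:
        return tf.getnames()
```

The returned list plays the role of "tar tf sys.tar" and makes it easy to confirm that no member starts with "./" before converting the archive with vmtar.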

peterlyons
Contributor

Did you ever find a solution or workaround to extract files from .vgz archives without the vmtar program?

dominic7
Virtuoso

You can grab a copy of the vmtar binary from the ESXi 'tech support shell', scp root@youresxihost:/sbin/vmtar .

file /sbin/vmtar
/sbin/vmtar: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.9, stripped

It seems to work fine for me on Debian 5.0.5. I've successfully injected a driver into sys.vgz so I can install ESXi 4.1 (Update 1) on some HP BL460c G7s.

akreitman
Contributor

How about a Windows version of vmtar, or at least the sources so I can build it myself?

PS: Why on earth do they use a non-standard tar file format?

DSTAVERT
Immortal

I don't believe that vmtar is an open-source tool, but you could check the source available from http://downloads.vmware.com/d/info/datacenter_downloads/vmware_vsphere_hypervisor_esxi/4_0#open_sour...

-- David -- VMware Communities Moderator

akreitman
Contributor

When you copied vmtar from the maintenance console on ESXi, where did you copy it to? What's the command sequence to mount a USB drive under ESXi so I have a place to copy it to?

Thanks

DSTAVERT
Immortal

It isn't possible to attach a USB drive to ESXi. Copy the file to a datastore and then download it from the datastore browser.

-- David -- VMware Communities Moderator

akreitman
Contributor

Is there an FTP client, or something close to that?

DSTAVERT
Immortal

The datastore browser is part of the vSphere client.

-- David -- VMware Communities Moderator