VMware Cloud Community
bparinas
Contributor

UnixP2VVolumeCloneFailedEvent error in Converter 5.1

Hello Experts,

I'm having a problem converting a powered-on RHEL 5.x Linux machine with Converter 5.1. I'm not sure whether this is a known issue or a bug. Has anyone experienced the same issue? Any help is greatly appreciated.

Additional notes: /srv/data203591 -> 140GB used disk space (source system). The filesystem contains large database files, each 2GB or more in size.

Logs snippet:

-->                         fullMessage = "event.UnixP2VVolumeCloneFailedEvent.summary",

-->                         job = <unset>,

-->                         hostName = "135.247.217.121",

-->                         sourceMountPoint = "/srv/data203591",

-->                         reason = (converter.fault.CloneFault) {

-->                            dynamicType = <unset>,

-->                            faultCause = (vmodl.MethodFault) null,

-->                            description = "/usr/lib/vmware-converter/bin/ssh -z -F /usr/lib/vmware-converter/ssh.conf root@135.247.217.121 -p 22 " tar --one-file-system --sparse -C /srv/data203591 -cf - ." | /bin/tar_1.20 --numeric-owner --delay-directory-restore  -C /mnt/p2v-src-root/srv/data203591 -y -xf -

--> /bin/tar_1.20: Archive value 13585343488 is out of size_t range 0..4294967295

--> /bin/tar_1.20: Skipping to next header

--> /bin/tar_1.20: Error exit delayed from previous errors

-->  (return code 2)",

-->                            msg = "fault.CloneFault.summary",
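For reference, the archive value tar reports is far beyond what a 32-bit size_t can hold; a quick arithmetic check (not Converter-specific):

```shell
# 4294967295 is the 32-bit size_t maximum (~4 GiB);
# the value from the log, 13585343488, is about 12.7 GiB.
echo $(( 13585343488 > 4294967295 ))   # prints 1: the size is out of 32-bit range
```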

15 Replies
patanassov
VMware Employee

Hello and welcome to the forum

Converter does support large files (more than 4GB); currently it handles files up to 2TB. To be 100% sure, I have explicitly tested a conversion of a 32-bit RHEL 5.1 with a 5GB file in it.

This is the first time I see such a complaint. I can only speculate about two possible reasons:

  - a network error that has confused the 'untarring' (tar_1.20 is running on the destination VM)

  - the tar on the source machine doesn't support files larger than 4GB (I don't really believe this is possible)
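The second possibility can be checked directly on the source machine by tarring a sparse file larger than 4GB; a sketch, with arbitrary paths and size (dd is used instead of truncate since older coreutils lack it):

```shell
# Create a sparse file just over 5GB; it occupies almost no real disk space
mkdir -p /tmp/bigtest
dd if=/dev/zero of=/tmp/bigtest/bigfile bs=1 count=1 seek=5G 2>/dev/null

# A tar without large-file support should fail on this archive
tar --sparse -C /tmp/bigtest -cf /tmp/bigtest.tar bigfile && echo "large files OK"

rm -rf /tmp/bigtest /tmp/bigtest.tar
```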

Have you tried the same conversion again?

Regards,

Plamen

patanassov
VMware Employee

Something else - perhaps the file has been in use while converting. Can you try stopping the database application before retrying the conversion?

bparinas
Contributor

Hello Plamen,

Converter does support large files (more than 4GB); currently it handles files up to 2TB. To be 100% sure I have explicitly tested a conversion of  a 32bit RHEL 5.1 with a 5GB file in it.

---

I'm cloning a 64bit RHEL 5.8

Yes there are large files (2GB up) in this directory /srv/data203591

---

This is the first time I see such a complaint. I can only speculate about two possible reasons:

  - a network error that has confused the 'untaring'  (tar_1.20 is running on the destination VM)

  - the tar on the source machine doesn't support larger than 4GB files (I don't really believe this is possible)

---

The source has tar version 1.15 (I don't think there is an issue with 1.15).

The destination uses tar 1.20.

---

Have you tried the same conversion again?

---

Yes, I excluded the two big filesystems and the cloning completed without error. However, the SAN storage filesystems are missing, and only the default OS filesystems are mounted in the cloned image.

I checked the 3 disks and all of them are present in the cloned image. What puzzled me is that LVM (RHEL) is unable to detect the physical volumes, volume groups, and logical volumes. When I try to mount the filesystems, it says the special block devices associated with the mount points (e.g. /srv/data203591, /dev/<volume group>) do not exist.

---

I'm not sure if this is a bug or a known issue in Converter 5.1.

bparinas
Contributor

Something else - perhaps the file has been in use while converting. Can you try stopping the database application before retrying the conversion?


---

I confirm no database instances were running before the cloning.


---

patanassov
VMware Employee

Hmm, that's a different issue. Can you please attach the task log bundle (I want to see the helper log)?

bparinas
Contributor

The attached log is from the failed conversion without exclusions.

bparinas
Contributor

Attached are the logs from the successful cloning with the /srv/data203591 filesystem excluded.

Additional notes and testing:

Prior to cloning no changes were made on the source system.

During converter setup everything was set to default.

After the cloning, I was able to boot up the image, but only the OS filesystems are mounted. All the SAN storage filesystems are missing (e.g. physical volumes, volume groups, logical volumes, block devices, etc.). I tried scanning, but no luck.

# lvmdiskscan

  Ignoring too small pv_min_size 512KB, using default 2048KB.

  0 disks

  0 partitions

  0 LVM physical volume whole disks

  0 LVM physical volumes

# pvs

  Ignoring too small pv_min_size 512KB, using default 2048KB.

# vgs

  Ignoring too small pv_min_size 512KB, using default 2048KB.

  No volume groups found

# lvs

  Ignoring too small pv_min_size 512KB, using default 2048KB.

  No volume groups found

# pvscan

  Ignoring too small pv_min_size 512KB, using default 2048KB.

  No matching physical volumes found

# vgscan

  Ignoring too small pv_min_size 512KB, using default 2048KB.

  Reading all physical volumes.  This may take a while...

  No volume groups found

# lvscan

  Ignoring too small pv_min_size 512KB, using default 2048KB.

  No volume groups found

patanassov
VMware Employee

Hello again

After examining the logs, everything seems fine. Volume groups and logical volumes get created (except for some excluded volumes), formatted, and the data gets cloned.

The only thing that looks suspicious is the "_netdev" option in fstab, which "prevent[s] the system from attempting to mount these filesystems until the network has been enabled on the system" (quote from the man page). Is it possible this is the issue? Can you try mounting again with the network up?

Our fstab patcher does not remove this option. Perhaps it should.
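If _netdev turns out to be the culprit, stripping it from the cloned VM's fstab is a one-liner; a sketch against a sample entry (the device names here are made up):

```shell
# Sample fstab entry carrying the _netdev option (hypothetical names)
printf '/dev/vgdata/data_u01 /srv/data ext4 noatime,_netdev 0 3\n' > /tmp/fstab.sample

# Drop _netdev whether it appears with a leading or a trailing comma
sed -E 's/(,_netdev|_netdev,)//g' /tmp/fstab.sample
# prints: /dev/vgdata/data_u01 /srv/data ext4 noatime 0 3
```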


Regards,

Plamen

bparinas
Contributor

Hello Plamen,

I removed the _netdev option in /etc/fstab of the cloned image and then remounted the filesystems, but no such luck.

# cat /etc/fstab

/dev/sda1       /                       ext3    defaults        1 1

/dev/sda6       /opt                    ext3    defaults        1 2

/dev/sda5       /var                    ext3    defaults        1 2

/dev/sda3       /srv/tool203583         ext3    defaults        1 2

/dev/sda7       /srv/home203585         ext3    defaults        1 2

tmpfs                   /dev/shm                tmpfs   size=16877879296      0 0

devpts                  /dev/pts                devpts  gid=5,mode=620  0 0

sysfs                   /sys                    sysfs   defaults        0 0

proc                    /proc                   proc    defaults        0 0

/dev/sda2       swap                    swap    defaults        0 0

#

#

#

/dev/vgdata/data203584_u01      /srv/data203584 ext4    noatime 0 3

/dev/vgdata/data203590_u02      /srv/data203590 ext4    noatime 0 4

#/dev/mapper/vgdata-data203591_u03      /srv/data203591 ext4    noatime 0 3 -> fs excluded during cloning

#/dev/mapper/vgdata-data203592_u04      /srv/data203592 ext4    noatime 0 3 -> fs excluded during cloning

/dev/vgdata/data203593_u05      /srv/data203593 ext4    noatime 0 4

/dev/vgdata/data203628_u06      /srv/data203628 ext4    noatime 0 3

/dev/vgdata/data203629_u07      /srv/data203629 ext4    noatime 0 3

/dev/vgdata1/data203977_u08             /srv/data203977 ext4    noatime 0 3

# mount -a

mount: special device /dev/vgdata/data203584_u01 does not exist

mount: special device /dev/vgdata/data203590_u02 does not exist

mount: special device /dev/vgdata/data203593_u05 does not exist

mount: special device /dev/vgdata/data203628_u06 does not exist

mount: special device /dev/vgdata/data203629_u07 does not exist

mount: special device /dev/vgdata1/data203977_u08 does not exist

I have an idea: what if I create a new clone, but this time remove the _netdev mount option from /etc/fstab on the source system before cloning? Do you think that would work?

Many Thanks!

patanassov
VMware Employee

Hello bparinas,

I don't think patching the source fstab would help if patching the destination's doesn't.

I examined the unmounted devices in the log again, and I'd like to show you what is happening (taking an arbitrary one, e.g. /dev/vgdata/data203628_u06):

- first we partition the destination disks

- then we create the physical volume and the volume group:

2014-01-31T08:41:44.504Z [F2060B70 verbose 'task-1'] LD_LIBRARY_PATH=/usr/lib/vmware-converter

2014-01-31T08:41:44.504Z [F2060B70 verbose 'task-1'] PATH=/sbin:/usr/sbin:/bin:/usr/bin

2014-01-31T08:41:44.504Z [F2060B70 verbose 'task-1'] Invoking /sbin/lvm with the following arguments:

2014-01-31T08:41:44.505Z [F2060B70 verbose 'task-1'] args[0]: pvcreate

2014-01-31T08:41:44.505Z [F2060B70 verbose 'task-1'] args[1]: --metadatatype

2014-01-31T08:41:44.505Z [F2060B70 verbose 'task-1'] args[2]: 2

2014-01-31T08:41:44.505Z [F2060B70 verbose 'task-1'] args[3]: /dev/sdc1

2014-01-31T08:41:45.217Z [F2060B70 verbose 'task-1'] Command return code: 0; result string:   Writing physical volume data to disk "/dev/sdc1"

-->   Physical volume "/dev/sdc1" successfully created

-->

2014-01-31T08:41:45.217Z [F2060B70 verbose 'task-1'] LD_LIBRARY_PATH=/usr/lib/vmware-converter

2014-01-31T08:41:45.218Z [F2060B70 verbose 'task-1'] PATH=/sbin:/usr/sbin:/bin:/usr/bin

2014-01-31T08:41:45.218Z [F2060B70 verbose 'task-1'] Invoking /sbin/lvm with the following arguments:

2014-01-31T08:41:45.218Z [F2060B70 verbose 'task-1'] args[0]: vgcreate

2014-01-31T08:41:45.218Z [F2060B70 verbose 'task-1'] args[1]: --metadatatype

2014-01-31T08:41:45.219Z [F2060B70 verbose 'task-1'] args[2]: 2

2014-01-31T08:41:45.219Z [F2060B70 verbose 'task-1'] args[3]: --physicalextentsize

2014-01-31T08:41:45.219Z [F2060B70 verbose 'task-1'] args[4]: 4096K

2014-01-31T08:41:45.219Z [F2060B70 verbose 'task-1'] args[5]: vgdata

2014-01-31T08:41:45.219Z [F2060B70 verbose 'task-1'] args[6]: /dev/sdc1

2014-01-31T08:41:45.417Z [F2060B70 verbose 'task-1'] Command return code: 0; result string:   Volume group "vgdata" successfully created

- create the logical volume:

2014-01-31T08:41:45.618Z [F2060B70 verbose 'task-1'] LD_LIBRARY_PATH=/usr/lib/vmware-converter

2014-01-31T08:41:45.618Z [F2060B70 verbose 'task-1'] PATH=/sbin:/usr/sbin:/bin:/usr/bin

2014-01-31T08:41:45.619Z [F2060B70 verbose 'task-1'] Invoking /sbin/lvm with the following arguments:

2014-01-31T08:41:45.619Z [F2060B70 verbose 'task-1'] args[0]: lvcreate

2014-01-31T08:41:45.619Z [F2060B70 verbose 'task-1'] args[1]: --name

2014-01-31T08:41:45.619Z [F2060B70 verbose 'task-1'] args[2]: data203628_u06

2014-01-31T08:41:45.619Z [F2060B70 verbose 'task-1'] args[3]: --size

2014-01-31T08:41:45.620Z [F2060B70 verbose 'task-1'] args[4]: 104857600K

2014-01-31T08:41:45.620Z [F2060B70 verbose 'task-1'] args[5]: vgdata

2014-01-31T08:41:45.668Z [F2060B70 verbose 'task-1'] Command return code: 0; result string:   Logical volume "data203628_u06" created

- format it:

2014-01-31T08:42:40.830Z [F2060B70 verbose 'task-1'] LD_LIBRARY_PATH=/usr/lib/vmware-converter

2014-01-31T08:42:40.830Z [F2060B70 verbose 'task-1'] PATH=/sbin:/usr/sbin:/bin:/usr/bin

2014-01-31T08:42:40.830Z [F2060B70 verbose 'task-1'] Invoking /sbin/mkfs.ext4 with the following arguments:

2014-01-31T08:42:40.830Z [F2060B70 verbose 'task-1'] args[0]: -Ldata203628_u06

2014-01-31T08:42:40.830Z [F2060B70 verbose 'task-1'] args[1]: -O dir_index,filetype,has_journal,^journal_dev,resize_inode,sparse_super,extent,flex_bg,uninit_bg

2014-01-31T08:42:40.831Z [F2060B70 verbose 'task-1'] args[2]: -I 128

2014-01-31T08:42:40.831Z [F2060B70 verbose 'task-1'] args[3]: /dev/vgdata/data203628_u06

2014-01-31T08:42:57.957Z [F2060B70 verbose 'task-1'] Command return code: 0; result string: mke2fs 1.41.12 (17-May-2010)

- mount it to a temporary directory:

2014-01-31T10:31:17.627Z [F2060B70 verbose 'task-1'] LD_LIBRARY_PATH=/usr/lib/vmware-converter

2014-01-31T10:31:17.627Z [F2060B70 verbose 'task-1'] PATH=/sbin:/usr/sbin:/bin:/usr/bin

2014-01-31T10:31:17.627Z [F2060B70 verbose 'task-1'] Invoking /bin/mount with the following arguments:

2014-01-31T10:31:17.627Z [F2060B70 verbose 'task-1'] args[0]: -t

2014-01-31T10:31:17.627Z [F2060B70 verbose 'task-1'] args[1]: ext4

2014-01-31T10:31:17.628Z [F2060B70 verbose 'task-1'] args[2]: /dev/vgdata/data203628_u06

2014-01-31T10:31:17.628Z [F2060B70 verbose 'task-1'] args[3]: /mnt/p2v-src-root/srv/data203628

2014-01-31T10:31:17.677Z [F2060B70 verbose 'task-1'] Command return code: 0; result string:

- transfer the data:

2014-01-31T10:31:17.678Z [F2060B70 verbose 'task-1'] Invoking /usr/lib/vmware-converter/copyFileSystem.sh with the following arguments:

2014-01-31T10:31:17.678Z [F2060B70 verbose 'task-1'] args[0]: --sshClient

2014-01-31T10:31:17.678Z [F2060B70 verbose 'task-1'] args[1]: /usr/lib/vmware-converter/bin/ssh

2014-01-31T10:31:17.678Z [F2060B70 verbose 'task-1'] args[2]: --user

2014-01-31T10:31:17.678Z [F2060B70 verbose 'task-1'] args[3]: root

2014-01-31T10:31:17.678Z [F2060B70 verbose 'task-1'] args[4]: --host

2014-01-31T10:31:17.678Z [F2060B70 verbose 'task-1'] args[5]: 135.247.217.121

2014-01-31T10:31:17.678Z [F2060B70 verbose 'task-1'] args[6]: --port

2014-01-31T10:31:17.678Z [F2060B70 verbose 'task-1'] args[7]: 22

2014-01-31T10:31:17.678Z [F2060B70 verbose 'task-1'] args[8]: --sourceMountPoint

2014-01-31T10:31:17.678Z [F2060B70 verbose 'task-1'] args[9]: /srv/data203628

2014-01-31T10:31:17.679Z [F2060B70 verbose 'task-1'] args[10]: --targetMountPoint

2014-01-31T10:31:17.679Z [F2060B70 verbose 'task-1'] args[11]: /mnt/p2v-src-root/srv/data203628

2014-01-31T10:31:17.679Z [F2060B70 verbose 'task-1'] args[12]: --sshConfigFile

2014-01-31T10:31:17.679Z [F2060B70 verbose 'task-1'] args[13]: /usr/lib/vmware-converter/ssh.conf

2014-01-31T10:31:17.679Z [F2060B70 verbose 'task-1'] args[14]: --sourceTarOption

2014-01-31T10:31:17.679Z [F2060B70 verbose 'task-1'] args[15]: --sparse

And everything passes OK.

In fstab patching there is:

2014-01-31T11:48:28.893Z [F2060B70 info 'task-1'] patching device path of mount point /srv/data203628, from /dev/mapper/vgdata-data203628_u06 to /dev/vgdata/data203628_u06

Is it possible this machine has links of the form /dev/mapper/vgdata-<lvname> but doesn't have links like /dev/vgdata/<lvname>? Can you check that?

If so, please update this thread; I would like to file a bug.

Regards,

Plamen

bparinas
Contributor

Hello Plamen,

All links are present in the source system.

# ls -l /dev/mapper/vgdata*

brw-rw---- 1 root disk 253, 12 Apr 22  2013 /dev/mapper/vgdata1-data203977_u08

brw-rw---- 1 root disk 253,  4 Mar 12  2012 /dev/mapper/vgdata-data203584_u01

brw-rw---- 1 root disk 253,  7 Mar 12  2012 /dev/mapper/vgdata-data203590_u02

brw-rw---- 1 root disk 253,  9 Mar 12  2012 /dev/mapper/vgdata-data203591_u03

brw-rw---- 1 root disk 253, 10 Mar 12  2012 /dev/mapper/vgdata-data203592_u04

brw-rw---- 1 root disk 253,  6 Mar 12  2012 /dev/mapper/vgdata-data203593_u05

brw-rw---- 1 root disk 253,  8 Mar 12  2012 /dev/mapper/vgdata-data203628_u06

brw-rw---- 1 root disk 253,  5 Mar 12  2012 /dev/mapper/vgdata-data203629_u07

# ls -ld /dev/vgdata*

drwxr-xr-x 2 root root 180 Mar 12  2012 /dev/vgdata

drwxr-xr-x 2 root root  60 Apr 22  2013 /dev/vgdata1

# ls -lR /dev/vgdata*

/dev/vgdata:

total 0

lrwxrwxrwx 1 root root 33 Mar 12  2012 data203584_u01 -> /dev/mapper/vgdata-data203584_u01

lrwxrwxrwx 1 root root 33 Mar 12  2012 data203590_u02 -> /dev/mapper/vgdata-data203590_u02

lrwxrwxrwx 1 root root 33 Mar 12  2012 data203591_u03 -> /dev/mapper/vgdata-data203591_u03

lrwxrwxrwx 1 root root 33 Mar 12  2012 data203592_u04 -> /dev/mapper/vgdata-data203592_u04

lrwxrwxrwx 1 root root 33 Mar 12  2012 data203593_u05 -> /dev/mapper/vgdata-data203593_u05

lrwxrwxrwx 1 root root 33 Mar 12  2012 data203628_u06 -> /dev/mapper/vgdata-data203628_u06

lrwxrwxrwx 1 root root 33 Mar 12  2012 data203629_u07 -> /dev/mapper/vgdata-data203629_u07

/dev/vgdata1:

total 0

lrwxrwxrwx 1 root root 34 Apr 22  2013 data203977_u08 -> /dev/mapper/vgdata1-data203977_u08

bparinas
Contributor

Hello Plamen,

The missing-filesystems issue has been resolved. The problem was that the generic SCSI device filter in /etc/lvm/lvm.conf had to be corrected.

For example:

Before (rejects all /dev/sd* disks, which hides /dev/sdb and /dev/sdc):

filter = [ "r/disk/", "r/cciss/", "r/r.*/", "r/ram.*/", "r/sd.*/", "a/.*/" ]

After (rejects only /dev/sda*):

filter = [ "r/disk/", "r/cciss/", "r/r.*/", "r/ram.*/", "r/sda.*/", "a/.*/" ]

This activates the other virtual disks (/dev/sdb and /dev/sdc) that contain the missing filesystems. Then run pvscan, vgscan, and lvscan, and finally "vgchange -a y".

Uncommenting the relevant LVs in /etc/fstab and running "mount -a" did it.
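To illustrate why the filter change matters: LVM filter entries are regular expressions tried in order against each device path, and a reject pattern of "sd.*" matches every SCSI disk, while "sda.*" matches only the first one (simulated here with grep on bare device names; real LVM matches against the full /dev path):

```shell
# Which devices does each reject pattern match?
for dev in sda2 sdb1 sdc1; do
  echo "$dev" | grep -qE '^sd'  && echo "r/sd.*/  rejects $dev"
  echo "$dev" | grep -qE '^sda' && echo "r/sda.*/ rejects $dev"
done
# r/sd.*/ rejects all three devices, so LVM never sees the PVs on sdb/sdc;
# r/sda.*/ rejects only sda2, leaving sdb1 and sdc1 visible.
```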

But we still don't know what is causing this error. The above is just a workaround: I excluded the two big filesystems to make the cloning succeed, so we have to recreate the excluded filesystems and then copy each one's contents from the source to the VM image.

--> /bin/tar_1.20: Archive value 13585343488 is out of size_t range 0..4294967295

patanassov
VMware Employee

Thanks for sharing these details. I think this is too specific to include in Converter reconfig; perhaps a KB will do...

As for the tar issue - I am 100% certain the helper VM tar (1.20) supports files larger than 4GB (it has been tested with files larger than 2TB) and 99.99% certain the source machine tar (1.15) also supports them. From your messages I am not sure whether you have ruled out the network error possibility. What I mean is retrying the conversion with the problem volumes. Perhaps you could try converting just one of those volumes if you want to test it.

If that is not a network error, honestly, I am short of ideas. If you can copy these volumes separately as a workaround, perhaps this could just remain a mystery.

Regards

bparinas
Contributor

Hello Plamen,

I'm getting the same error message, this time for a small filesystem.

-->                         description = "/usr/lib/vmware-converter/bin/ssh -z -F /usr/lib/vmware-converter/ssh.conf root@135.247.217.120 -p 22 " tar --one-file-system --sparse -C /srv/data203589 -cf - ." | /bin/tar_1.20 --numeric-owner --delay-directory-restore  -C /mnt/p2v-src-root/srv/data203589 -y -xf -

--> /bin/tar_1.20: Archive value 14403036160 is out of size_t range 0..4294967295

--> /bin/tar_1.20: Archive value 9343001600 is out of size_t range 0..4294967295

--> /bin/tar_1.20: Archive value 9492751360 is out of size_t range 0..4294967295

--> /bin/tar_1.20: Error exit delayed from previous errors

-->  (return code 2)",

-->                         msg = "fault.CloneFault.summary",

Disk utilization on the source system is only 43G:

/dev/mapper/vgdata-data203589_u05

                       99G   43G   52G  46% /srv/data203589

Any idea?

patanassov
VMware Employee

Perhaps there is some incompatibility between the source and destination tar formats (just a shot in the dark, I am not familiar with tar internals)

Is it possible to try replacing the source machine's tar with the one from the helper? This KB describes how to log in to the helper VM: VMware KB: Enabling Logging in to Helper Virtual Machine During Conversion of Powered-On Linux Sourc... Just start a dummy conversion and log in. Look for /bin/tar_1.20, scp it out of the helper to the source machine, then back up and replace the original.

This is a clumsy workaround but there is nothing better I can think of.
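The backup-and-replace step boils down to the pattern below, shown with dummy files under /tmp for illustration (on the real systems, the files involved are /bin/tar on the source and /bin/tar_1.20 inside the helper VM):

```shell
# Stand-ins for the two binaries (dummy text files for illustration only)
mkdir -p /tmp/demo/bin
printf 'tar 1.15\n' > /tmp/demo/bin/tar       # the source machine's tar
printf 'tar 1.20\n' > /tmp/demo/tar_1.20      # the copy scp'd out of the helper

# Back up the original first, then replace it
cp /tmp/demo/bin/tar /tmp/demo/bin/tar.orig
cp /tmp/demo/tar_1.20 /tmp/demo/bin/tar
cat /tmp/demo/bin/tar                         # prints: tar 1.20
```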

HTH
