PROBLEM - gzdecode is returning truncated decoded string, file size of .gz file was 268 KB only
SOLUTION - use GZFILE() function to return the fully decoded string without any truncation
Hope this helps
(PHP 5 >= 5.4.0, PHP 7, PHP 8)
gzdecode — Decodes a gzip compressed string
$data
, int $max_length
= 0
) : string|false
This function returns a decoded version of the input
data
.
The decoded string, or false
if an error occurred.
PROBLEM - gzdecode is returning truncated decoded string, file size of .gz file was 268 KB only
SOLUTION - use GZFILE() function to return the fully decoded string without any truncation
Hope this helps
katzlbtjunk (2008-04-30), thanks for code. But remember, .gz file may be series of members (data sets). They should be uncompressed in loop.
Fixed code gzdecodeM() on http://gist.github.com/msegu/de556cdd32eb58430871b06145168024
Addicionally, you can
- get data of gz members in array
- choose members you need to extract
This can help to decompress the data from tar archive in certain cases, hope this helps somebody
<?php
$tarName = 'example.ini';
try {
/** For files with tar.gz extensions **/
$phar = new \PharData($path, 0, $tarName);
} catch ($e) {
/** Decode the file first **/
$decodedFile = file_put_contents($path, gzdecode(file_get_contents($path)));
/** create PharData object second **/
$phar = new \PharData($path, 0, $tarName);
}
?>
In my case to decode correctly a GZIPOutputStream POST from Android I used
gzinflate( substr($post,10,-8) );
and not
gzinflate( substr($HTTP_RAW_POST_DATA,10,-8) ) . PHP_EOL . PHP_EOL
I use the following code:
function &gzdecodedata(&$data) {
ob_start();
readgzfile('data://application/gzip;base64,' . base64_encode($data));
$d = ob_get_clean();
return $d;
}
if (!function_exists('gzdecode')) {
function gzdecode($data)
{
return gzinflate(substr($data,10,-8));
}
}
To decode / uncompress the received HTTP POST data in PHP code, request data coming from Java / Android application via HTTP POST GZIP / DEFLATE compressed format
1) Data sent from Java Android app to PHP using DeflaterOutputStream java class and received in PHP as shown below
echo gzinflate( substr($HTTP_RAW_POST_DATA,2,-4) ) . PHP_EOL . PHP_EOL;
2) Data sent from Java Android app to PHP using GZIPOutputStream java class and received in PHP code as shown below
echo gzinflate( substr($HTTP_RAW_POST_DATA,10,-8) ) . PHP_EOL . PHP_EOL;
From Java Android side (API level 10+), data being sent in DEFLATE compressed format
String body = "Lorem ipsum shizzle ma nizle";
URL url = new URL("http://www.url.com/postthisdata.php");
URLConnection conn = url.openConnection();
conn.setDoOutput(true);
conn.setRequestProperty("Content-encoding", "deflate");
conn.setRequestProperty("Content-type", "application/octet-stream");
DeflaterOutputStream dos = new DeflaterOutputStream(
conn.getOutputStream());
dos.write(body.getBytes());
dos.flush();
dos.close();
BufferedReader in = new BufferedReader(new InputStreamReader(
conn.getInputStream()));
String decodedString = "";
while ((decodedString = in.readLine()) != null) {
Log.e("dump",decodedString);
}
in.close();
On PHP side (v 5.3.1), code for decompressing this DEFLATE data will be
echo substr($HTTP_RAW_POST_DATA,2,-4);
From Java Android side (API level 10+), data being sent in GZIP compressed format
String body1 = "Lorem ipsum shizzle ma nizle";
URL url1 = new URL("http://www.url.com/postthisdata.php");
URLConnection conn1 = url1.openConnection();
conn1.setDoOutput(true);
conn1.setRequestProperty("Content-encoding", "gzip");
conn1.setRequestProperty("Content-type", "application/octet-stream");
GZIPOutputStream dos1 = new GZIPOutputStream(conn1.getOutputStream());
dos1.write(body1.getBytes());
dos1.flush();
dos1.close();
BufferedReader in1 = new BufferedReader(new InputStreamReader(
conn1.getInputStream()));
String decodedString1 = "";
while ((decodedString1 = in1.readLine()) != null) {
Log.e("dump",decodedString1);
}
in1.close();
On PHP side (v 5.3.1), code for decompressing this GZIP data will be
echo substr($HTTP_RAW_POST_DATA,10,-8);
Useful PHP code for printing out compressed data using all available formats.
$data = "Lorem ipsum shizzle ma nizle";
echo "\n\n\n";
for($i=-1;$i<=9;$i++)
echo chunk_split(strtoupper(bin2hex(gzcompress($data,$i))),2," ") . PHP_EOL . PHP_EOL;
echo "\n\n\n";
for($i=-1;$i<=9;$i++)
echo chunk_split(strtoupper(bin2hex(gzdeflate($data,$i))),2," ") . PHP_EOL . PHP_EOL;
echo "\n\n\n";
for($i=-1;$i<=9;$i++)
echo chunk_split(strtoupper(bin2hex(gzencode($data,$i,FORCE_GZIP))),2," ") . PHP_EOL . PHP_EOL;
echo "\n\n\n";
for($i=-1;$i<=9;$i++)
echo chunk_split(strtoupper(bin2hex(gzencode($data,$i,FORCE_DEFLATE))),2," ") . PHP_EOL . PHP_EOL;
echo "\n\n\n";
Hope this helps. Please ThumbsUp if this saved you a lot of effort and time.
I don't have any deep knowledge in compression algorithms and formats, so I have no idea if this works the same on all platforms. But by several experiments on my Linux box I found how to get gzdecoded data without temporary files. Here it is:
<?php
function gzdecode($data)
{
return gzinflate(substr($data,10,-8));
}
?>
That's it. Simply strip header and footer bytes and you get raw deflated data for inflation.
I have used this simple function to gzdecode gzipped files downloaded from the web.
<?php
function gzdecode($data){
$g=tempnam('/tmp','ff');
@file_put_contents($g,$data);
ob_start();
readgzfile($g);
$d=ob_get_clean();
return $d;
}
?>
3 more bugs found and fixed:
1. failed to work when the gz contained a filename - FIXED
2. failed to work on 64-bit architecture (checksum) - FIXED
3. failed to work when the gz contained a comment - cannot verify.
Returns some errors (not all!) and filename.
<?php function gzdecode($data,&$filename='',&$error='',$maxlength=null)
{
$len = strlen($data);
if ($len < 18 || strcmp(substr($data,0,2),"\x1f\x8b")) {
$error = "Not in GZIP format.";
return null; // Not GZIP format (See RFC 1952)
}
$method = ord(substr($data,2,1)); // Compression method
$flags = ord(substr($data,3,1)); // Flags
if ($flags & 31 != $flags) {
$error = "Reserved bits not allowed.";
return null;
}
// NOTE: $mtime may be negative (PHP integer limitations)
$mtime = unpack("V", substr($data,4,4));
$mtime = $mtime[1];
$xfl = substr($data,8,1);
$os = substr($data,8,1);
$headerlen = 10;
$extralen = 0;
$extra = "";
if ($flags & 4) {
// 2-byte length prefixed EXTRA data in header
if ($len - $headerlen - 2 < 8) {
return false; // invalid
}
$extralen = unpack("v",substr($data,8,2));
$extralen = $extralen[1];
if ($len - $headerlen - 2 - $extralen < 8) {
return false; // invalid
}
$extra = substr($data,10,$extralen);
$headerlen += 2 + $extralen;
}
$filenamelen = 0;
$filename = "";
if ($flags & 8) {
// C-style string
if ($len - $headerlen - 1 < 8) {
return false; // invalid
}
$filenamelen = strpos(substr($data,$headerlen),chr(0));
if ($filenamelen === false || $len - $headerlen - $filenamelen - 1 < 8) {
return false; // invalid
}
$filename = substr($data,$headerlen,$filenamelen);
$headerlen += $filenamelen + 1;
}
$commentlen = 0;
$comment = "";
if ($flags & 16) {
// C-style string COMMENT data in header
if ($len - $headerlen - 1 < 8) {
return false; // invalid
}
$commentlen = strpos(substr($data,$headerlen),chr(0));
if ($commentlen === false || $len - $headerlen - $commentlen - 1 < 8) {
return false; // Invalid header format
}
$comment = substr($data,$headerlen,$commentlen);
$headerlen += $commentlen + 1;
}
$headercrc = "";
if ($flags & 2) {
// 2-bytes (lowest order) of CRC32 on header present
if ($len - $headerlen - 2 < 8) {
return false; // invalid
}
$calccrc = crc32(substr($data,0,$headerlen)) & 0xffff;
$headercrc = unpack("v", substr($data,$headerlen,2));
$headercrc = $headercrc[1];
if ($headercrc != $calccrc) {
$error = "Header checksum failed.";
return false; // Bad header CRC
}
$headerlen += 2;
}
// GZIP FOOTER
$datacrc = unpack("V",substr($data,-8,4));
$datacrc = sprintf('%u',$datacrc[1] & 0xFFFFFFFF);
$isize = unpack("V",substr($data,-4));
$isize = $isize[1];
// decompression:
$bodylen = $len-$headerlen-8;
if ($bodylen < 1) {
// IMPLEMENTATION BUG!
return null;
}
$body = substr($data,$headerlen,$bodylen);
$data = "";
if ($bodylen > 0) {
switch ($method) {
case 8:
// Currently the only supported compression method:
$data = gzinflate($body,$maxlength);
break;
default:
$error = "Unknown compression method.";
return false;
}
} // zero-byte body content is allowed
// Verifiy CRC32
$crc = sprintf("%u",crc32($data));
$crcOK = $crc == $datacrc;
$lenOK = $isize == strlen($data);
if (!$lenOK || !$crcOK) {
$error = ( $lenOK ? '' : 'Length check FAILED. ') . ( $crcOK ? '' : 'Checksum FAILED.');
return false;
}
return $data;
}
?>
Aaron G. 07-Aug-2004 03:29 posted the function gzdecode()
to gzencode comments. I FIXED the BUG: if($flags & 1) to the correct if($flags & 2)
Unfortunately the function gzencode() does NOT append a CRC so I did not notice until I tried to upload a file compressed with gzip itself.
<?php
function gzdecode($data) {
$len = strlen($data);
if ($len < 18 || strcmp(substr($data,0,2),"\x1f\x8b")) {
return null; // Not GZIP format (See RFC 1952)
}
$method = ord(substr($data,2,1)); // Compression method
$flags = ord(substr($data,3,1)); // Flags
if ($flags & 31 != $flags) {
// Reserved bits are set -- NOT ALLOWED by RFC 1952
return null;
}
// NOTE: $mtime may be negative (PHP integer limitations)
$mtime = unpack("V", substr($data,4,4));
$mtime = $mtime[1];
$xfl = substr($data,8,1);
$os = substr($data,8,1);
$headerlen = 10;
$extralen = 0;
$extra = "";
if ($flags & 4) {
// 2-byte length prefixed EXTRA data in header
if ($len - $headerlen - 2 < 8) {
return false; // Invalid format
}
$extralen = unpack("v",substr($data,8,2));
$extralen = $extralen[1];
if ($len - $headerlen - 2 - $extralen < 8) {
return false; // Invalid format
}
$extra = substr($data,10,$extralen);
$headerlen += 2 + $extralen;
}
$filenamelen = 0;
$filename = "";
if ($flags & 8) {
// C-style string file NAME data in header
if ($len - $headerlen - 1 < 8) {
return false; // Invalid format
}
$filenamelen = strpos(substr($data,8+$extralen),chr(0));
if ($filenamelen === false || $len - $headerlen - $filenamelen - 1 < 8) {
return false; // Invalid format
}
$filename = substr($data,$headerlen,$filenamelen);
$headerlen += $filenamelen + 1;
}
$commentlen = 0;
$comment = "";
if ($flags & 16) {
// C-style string COMMENT data in header
if ($len - $headerlen - 1 < 8) {
return false; // Invalid format
}
$commentlen = strpos(substr($data,8+$extralen+$filenamelen),chr(0));
if ($commentlen === false || $len - $headerlen - $commentlen - 1 < 8) {
return false; // Invalid header format
}
$comment = substr($data,$headerlen,$commentlen);
$headerlen += $commentlen + 1;
}
$headercrc = "";
if ($flags & 2) {
// 2-bytes (lowest order) of CRC32 on header present
if ($len - $headerlen - 2 < 8) {
return false; // Invalid format
}
$calccrc = crc32(substr($data,0,$headerlen)) & 0xffff;
$headercrc = unpack("v", substr($data,$headerlen,2));
$headercrc = $headercrc[1];
if ($headercrc != $calccrc) {
return false; // Bad header CRC
}
$headerlen += 2;
}
// GZIP FOOTER - These be negative due to PHP's limitations
$datacrc = unpack("V",substr($data,-8,4));
$datacrc = $datacrc[1];
$isize = unpack("V",substr($data,-4));
$isize = $isize[1];
// Perform the decompression:
$bodylen = $len-$headerlen-8;
if ($bodylen < 1) {
// This should never happen - IMPLEMENTATION BUG!
return null;
}
$body = substr($data,$headerlen,$bodylen);
$data = "";
if ($bodylen > 0) {
switch ($method) {
case 8:
// Currently the only supported compression method:
$data = gzinflate($body);
break;
default:
// Unknown compression method
return false;
}
} else {
// I'm not sure if zero-byte body content is allowed.
// Allow it for now... Do nothing...
}
// Verifiy decompressed size and CRC32:
// NOTE: This may fail with large data sizes depending on how
// PHP's integer limitations affect strlen() since $isize
// may be negative for large sizes.
if ($isize != strlen($data) || crc32($data) != $datacrc) {
// Bad format! Length or CRC doesn't match!
return false;
}
return $data;
}
?>