Image Compression Application
10 Jun 2017This blog post is an extension to my previous post on K-means algorithm. We will use K-means clustering method to find \(k\) dominent colors of an image. Each pixel of an image is represented as a 3 dimensional vector (containing RGB value) where the previous blog post visualize k-means on 2D vector (points on 2D plane). Content:
1. Introduction
This blog is an extra section of the previous blog K-Means Clustering with Visualization tool that i wrote.
Let’s see a quick demo of the program, the main function of the program is:
2. Main functions
Input:
- Image (filename of image)
- K (integer, number of color we want after compressing)
Output: New image which contains K colors
3. Demo Video
4. Algorithm Recap
4.1. K-means on 3-dimensional space
Let me briefly explain the algorithm that I use in this program.
According to the previous blog K-Means Clustering with Visualization tool. K-means algorithm is able to put data into group/ or cluster. In the visualization tool, each data is a point on a plane, each point has 2 properties, namely x-axis location and y-axis location. In this image compression application, each data point is one pixel of the image, each data point (pixel) has 3 properties, namely, Red color, Green color and Blue color density (RGB color). Each red, green and blue density will have a value ranging from 0-255.
When data is points on plane, all data points are put in a 2-dimensional space. In the case of image compression application, each data is a point on a 3-dimensional space, our goal here is to cluster all these points. For example, if we have a 100x100 resolution image, we have 10000 pixels in total, they will be 10000 points on a 3-dimensional space.
4.2. Represent data in array format
The way I choose to represent a pixel in array format is below:
img[x][y][RGB]
img: the image we store, Img will be a 3-dimensional array.
[x]: x-location (if we have a 20x30 image, x will range from 0 to 19)
[y]: y-location (if we have a 20x30 image, y will range from 0 to 29)
[RGB]: Red, green and blue channel. RGB will have 3 value: 0, 1 and 2 corresponding to Red, Green and Blue
Example: img[1][2][0]: The Red channel value of pixel in the location 1 (x-axis) and 2 (y-axis).
5. Source Code and PDF report
Source Code:
type
// type used for application
RGB = array[0..2] of Single; // x-axis // store red, green and blue of an image
CollumnPixels = array of RGB; // y-axis, rowPixels
imgArray = array of CollumnPixels; // 3-dimensional array used to store red, green, blue color of each pixel in an image
IntArray = array of Integer; // label
SingleColorArray = array of Color; // 1 dimensional array used to store color
ColorArray = array of SingleColorArray; // 2 dimensional array used to store color
// typed used for Visualization
PointsRecord = Record
x : Single; // x-axit on panel
y : Single; // y-axit on panel
idx : Integer; // store label of each point, instead of create another array or enumeration to store it
end;
PointsArray = array of PointsRecord; // Store all the points on panel and the init points
// Setlength to 3-dimen array
procedure AddSize(var Img: ImgArray; row: Integer; col: Integer);
var
i: Integer;
begin
SetLength(Img, row);
for i:=0 to row-1 do
begin
SetLength(Img[i], col);
end;
end;
//return 3-dimensional array
function ReadImage(var filename: String): ImgArray;
var
userBmp: Bitmap;
row, col: Integer;
i,j: Integer;
c: Color;
begin
WriteLn('Reading in image... Please wait...');
userBmp := BitmapNamed(filename);
row := BitmapHeight(userBmp);
col := BitmapWidth(userBmp);
AddSize(result, row, col);
for i:=0 to row-1 do
begin
for j:=0 to col-1 do
begin
c := GetPixel(userBmp, j, i);
result[i][j][0] := RedOf(c);
result[i][j][1] := GreenOf(c);
result[i][j][2] := BlueOf(c);
end;
end;
end;
//return width*height x 3, put all pixel into one array
function Reshape(var img: ImgArray): CollumnPixels;
var
row, col: Integer;
i,j,rgb: Integer;
begin
row := Length(img);
col := Length(img[0]);
SetLength(result, row*col);
for i:=0 to col-1 do
begin
for j:=0 to row-1 do
begin
for rgb:=0 to 2 do
begin
result[i*row+j][rgb] := img[j][i][rgb];
end;
end;
end;
WriteLn('Reading in image successfully!');
end;
//return 3-dimensional array from two dimensional array
function InverseReshape(var imgSingle: CollumnPixels; row: Integer; col: Integer): ImgArray;
var
i, j, rgb: Integer;
begin
//SetLength to 3 dimensional matrix aka the result of the function according to row, col
SetLength(result, row);
for i:=0 to row-1 do
begin
Setlength(result[i], col);
end;
//Processing inverse reshape function
for i:=0 to col-1 do
begin
for j:=0 to row-1 do
begin
for rgb:=0 to 2 do
begin
result[j][i][rgb] := imgSingle[i*row+j][rgb];
end;
end;
end;
end;
//use random color later
function RandomInitColors(var colorNum: Integer): CollumnPixels;
var
c: Color;
i: Integer;
begin
SetLength(result, colorNum);
for i:=0 to colorNum-1 do
begin
c := RandomColor();
result[i][0] := RedOf(c);
result[i][1] := GreenOf(c);
result[i][2] := BlueOf(c);
end;
end;
//return label for each pixels (form 0 to clusterNumber-1)
function ReadLabels(var imgSingle: CollumnPixels; initColor: CollumnPixels): IntArray;
var
i,j,rgb: Integer;
imgRow, initRow: Integer;
distance, min: single;
begin
imgRow := Length(imgSingle);
initRow := Length(initColor);
SetLength(result, imgRow);
for i:=0 to imgRow-1 do
begin
// a random value that is big enough so that first distance value is always smaller than min
min := 3*256*256;
//loop through each pixels to calculate minimum distance
for j:=0 to initRow-1 do
begin
distance := 0;
// calculate distance in euculide space
for rgb:=0 to 2 do
begin
distance += sqr(imgSingle[i][rgb]-initColor[j][rgb]);
end;
if (distance < min) then
begin
min := distance;
result[i] := j;
end;
end;
end;
end;
//calculate the in each cluster then return a vector contain the sum of these cluster
function SumInCluster(var imgSingle: CollumnPixels; idx: IntArray; clusterNum: Integer): CollumnPixels;
var
i, rgb, lengthIdx: Integer;
begin
Setlength(result, clusterNum);
//assign initial value
for i:=0 to clusterNum-1 do
begin
for rgb:=0 to 2 do
begin
result[i][rgb] := 0;
end;
end;
//cal total sum
lengthIdx := Length(idx);
for i:=0 to lengthIdx-1 do
begin
for rgb:=0 to 2 do
begin
result[idx[i]][rgb] += imgSingle[i][rgb];
end;
end;
end;
//return num_k
function CountCluster(var imgSingle: CollumnPixels; idx: IntArray; clusterNum: Integer): IntArray;
var
i, rgb, lengthIdx: Integer;
begin
Setlength(result, clusterNum);
//assign initial value
for i:=0 to clusterNum-1 do
begin
for rgb:=0 to 2 do
begin
result[i] := 0;
end;
end;
//check iteration of cluster
lengthIdx := Length(idx);
for i:=0 to lengthIdx-1 do
begin
result[idx[i]] += 1;
end;
end;
//assign new value to cluster after calculating the mean
procedure ReassignCluster(var initialColors: CollumnPixels; sum_k: CollumnPixels; num_k: IntArray);
var
i, rgb: Integer;
begin
for i:=0 to Length(initialColors)-1 do
begin
for rgb:=0 to 2 do
begin
if num_k[i] = 0 then
initialColors[i][rgb] := 0
else
initialColors[i][rgb] := sum_k[i][rgb] / num_k[i];
end;
end;
end;
//return the imgSingle (all pixels in one row) after doing the algorithm
procedure AssignLabelToPixel(var imgSingle: CollumnPixels; idx: IntArray; finalCentroids: CollumnPixels);
var
i, rgb: Integer;
begin
for i:=0 to High(imgSingle) do
begin
for rgb :=0 to 2 do
begin
imgSingle[i][rgb] := finalCentroids[idx[i]][rgb];
end;
end;
end;
//return final centroids
function KMeansOnPixel(var img: ImgArray; clusterNum: Integer; iter: Integer; initialColors: CollumnPixels): ImgArray;
var
//initialColors: CollumnPixels; //initial random colors
imgSingle: CollumnPixels; //2-dimensional array after reshape 3 dimensional array
num_k: IntArray; // clusterNumx3
sum_k: array of RGB; // sum off all pixel in same cluster clusterNumx3
i: Integer;
idx: IntArray; // The label of each pixel (size = size(imgSingle)); value range from 0 to clusterNum-1
row, col: Integer;
begin
//convert to 2 dimensional array
imgSingle := Reshape(img);
row := Length(img);
col := Length(img[0]);
initialColors := RandomInitColors(clusterNum);
WriteLn('...Running k means....');
for i:=0 to iter-1 do
begin
//label of each picxel
idx := ReadLabels(imgSingle, initialColors);
//caculate means
sum_k := SumInCluster(imgSingle, idx, clusterNum);
num_k := CountCluster(imgSingle, idx, clusterNum);
ReassignCluster(initialColors, sum_k, num_k);
WriteLn('K-Means iteration '+ IntToStr(i+1) + '/' + IntToStr(iter) + '...');
end;
//reassign imgSingle
AssignLabelToPixel(imgSingle, idx, initialColors);
result := InverseReshape(imgSingle, row, col);
end;
//convert 3 dimensional RGB value to 2 dimensional type color
function RGBToColor(var img: ImgArray): ColorArray;
var
row, col: Integer;
i,j: Integer;
begin
row := Length(img);
col := Length(img[0]);
//SetLength for colorarray (row and col size of imgArray);
Setlength(result, row);
for i:=0 to row-1 do
begin
Setlength(result[i], col);
end;
//begin converting using RGBFloatColor (swinGame API)
for i:=0 to row-1 do
begin
for j:=0 to col-1 do
begin
result[i][j] := RGBFloatColor(img[i][j][0]/255, img[i][j][1]/255, img[i][j][2]/255);
end;
end;
end;
//pass by value function. Use this type to modify the bmp without changing the original one, in other way, we create a copy of the bitmap
function MatrixToBitmap(img: ImgArray): Bitmap;
var
i, j: Integer;
row, col: Integer;
colorArr: ColorArray;
begin
row := Length(img);
col := Length(img[0]);
result := CreateBitMap(col, row);
colorArr := RGBToColor(img);
for i:=0 to row-1 do
begin
for j:=0 to col-1 do
begin
PutPixel(result, colorArr[i][j], j, i);
end;
end;
end;
//main funcion of the application
function KMeans(var filename: String; var clusterNum: Integer; var iter: Integer): Bitmap;
var
img: ImgArray; //3 dimensional array of image
imgKMeans: ImgArray;
initialColors: CollumnPixels;
bmpOriginal: Bitmap;
begin
// Return 3-dimen array
img := ReadImage(filename);
bmpOriginal := BitmapNamed(filename);
initialColors := RandomInitColors(clusterNum);
imgKMeans := KMeansOnPixel(img, clusterNum, iter, initialColors);
result := MatrixToBitmap(imgKMeans);
end;
Full Source Code: https://github.com/DungLai/Image-Compression-Segmentation