Project Two: My Image Processor
Out: June 1st, 2023; Due: 11:59 pm, June 14th, 2023
I. Motivation
The goal of this project is to help you practice basic C++ programming, in particular
program arguments, I/O streams, array manipulation, and function pointers. It also
gives you an idea of how to organize and write more modular and reusable code.
As a side benefit, you will learn some basics of image processing and become more
familiar with Linux commands.
II. Introduction
In this project, you will implement a Linux command, called mip (for my image
processor), that offers some simple image processing functionalities. This command
reads an image from a file, applies an image transformation to it, and writes the resulting
image into another file. As an illustrative example, the following command flips the
image in input.ppm upside down and writes the results in the file named output.ppm:
./mip -i input.ppm -o output.ppm -t verticalFlip
[Figure: input.ppm (left) and output.ppm (right)]
The PPM file format is presented below.
If no input file is provided, the command reads from the standard input; if no output
file is provided, it writes to the standard output. With this behavior, the command can
be chained to apply several image transformations in sequence. For instance, the following
command flips the input image upside down and then rotates it:
./mip -i input.ppm -t verticalFlip | ./mip -o output.ppm -t rotate90
[Figure: input.ppm (left) and output.ppm (right)]
Recall that "|" is called a pipe. It redirects the output of the command on its left
to the input of the command on its right. Note that one call of the program applies
only one transformation.
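The fallback to the standard streams described above can be sketched as follows. This is only one possible arrangement, not a required design: the helper names pickInput and pickOutput are hypothetical, and the headers used here are an assumption that should be checked against the list of permitted headers.

```cpp
#include <cassert>
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper: returns the stream the image should be read from.
// `file` is owned by the caller so that it outlives the returned reference.
std::istream &pickInput(const std::string &name, std::ifstream &file,
                        std::istream &fallback) {
    assert(!file.is_open());            // expects a fresh, unopened stream
    if (name.empty()) return fallback;  // no -i option: use the fallback
    file.open(name, std::ios::binary);  // P6 content is binary
    return file;                        // in a failed state if open() failed
}

// Hypothetical helper: same idea for the output side.
std::ostream &pickOutput(const std::string &name, std::ofstream &file,
                         std::ostream &fallback) {
    assert(!file.is_open());
    if (name.empty()) return fallback;  // no -o option: use the fallback
    file.open(name, std::ios::binary);
    return file;
}
```

A caller would pass std::cin and std::cout as the fallbacks, then test the returned input stream (for example with `if (!in)`) to detect a file that could not be opened.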
In the next sections, we provide a short introduction to image processing, explain your
programming assignments, describe the expected behaviors of your mip command, and
finish with the usual explanation about submission and grading.
III. Image Processing
An image can be thought of as a 2D matrix. The dimension of the image (i.e., width
and height) corresponds to the size of the matrix. One component of this matrix
corresponds to one point of this image, which is called a pixel (picture element). A
colored image is usually represented in an RGB (red, green, blue) format. In that case,
a pixel corresponds to a vector of three elements representing the intensities of those
three colors.
In mathematical notation, an RGB image is an element $I$ of $\mathbb{R}^{w \times h \times 3}$,
where $w$ is the width of the image and $h$ is its height. We denote by $I_{i,j}$ its RGB
vector in $\mathbb{R}^3$ at pixel position $(i,j)$. We follow the usual programming convention
of indexing from 0. Therefore, $0 \le i < w$ and $0 \le j < h$.
[Figure: RGB color space]
In a computer, each component of the RGB vector is usually represented by a non-negative
integer bounded by $M$. In this project, we assume that each component fits in an
unsigned char and can therefore only take values between 0 and $M = 255$ (inclusive).
Part I. Operations on Images
Notation: $I \in \mathbb{R}^{w \times h \times 3}$ denotes the input image and $J$ denotes the
resulting image after an operation is applied.
Various operations can be performed on images, as you may know if you have already
used an image processing application (e.g., GIMP or Photoshop). In this project, we
will consider the following simple operations:
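• Vertical flip: After applying this operation to an image $I$, the resulting image
$J \in \mathbb{R}^{w \times h \times 3}$ is such that $J_{i,j} = I_{i,\,h-1-j}$ (recall that
indices start at 0, so the last row has index $h-1$).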
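• Rotation by 90° (clockwise): After applying this operation to an image $I$, the
resulting image $J \in \mathbb{R}^{h \times w \times 3}$ (note that the width and height of $J$
are swapped!) is such that $J_{i,j} = I_{j,\,h-1-i}$ for $0 \le i < h$ and $0 \le j < w$,
taking $j = 0$ as the top row.
[Figures: vertical flip; rotation by 90°]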
• Intensity inversion: After applying this operation to an image $I$, the resulting image
$J \in \mathbb{R}^{w \times h \times 3}$ is such that $J_{i,j} = M - I_{i,j}$, where the
subtraction is componentwise.
• Filtering: This operation computes the new value of each pixel by applying an
aggregating function to a local region centered on that pixel. A simple case is when
the aggregating function is the mean (simple average); the resulting image is then a
smoothed version of the input image. Other aggregating functions, such as max or
median, have different effects on the output image. Formally, the resulting image
$J \in \mathbb{R}^{w \times h \times 3}$ is given by
$J_{i,j} = \mathrm{agg}\{\, I_{i+k,\,j+l} \mid -s \le k \le s,\ -s \le l \le s \,\}$,
where $\mathrm{agg}$ is an aggregating function applied to the set of values in a local
region defined by a $(2s+1) \times (2s+1)$ square centered on the pixel at $(i,j)$. Note
that $\mathrm{agg}$ is applied componentwise over the RGB values. If some indices become
negative, the corresponding RGB vector is assumed to be the zero vector. Important:
You can assume that $s = 1$ in this project.
We consider three possible cases for $\mathrm{agg}$, which lead to three types of filtering:
o Max filtering: The resulting image $J \in \mathbb{R}^{w \times h \times 3}$ is given by
$J_{i,j} = \max\{\, I_{i+k,\,j+l} \mid -s \le k \le s,\ -s \le l \le s \,\}$. Note that the
$\max$ operation is componentwise over the RGB values.
o Mean filtering: The resulting image $J \in \mathbb{R}^{w \times h \times 3}$ is given by
$J_{i,j} = \mathrm{mean}\{\, I_{i+k,\,j+l} \mid -s \le k \le s,\ -s \le l \le s \,\}$, where the
$\mathrm{mean}$ operation is also componentwise over the RGB values.
o Median filtering: The resulting image $J \in \mathbb{R}^{w \times h \times 3}$ is given by
$J_{i,j} = \mathrm{median}\{\, I_{i+k,\,j+l} \mid -s \le k \le s,\ -s \le l \le s \,\}$, which is
also componentwise. Recall that the median of a list of values is the value that
separates that list in half, one half being smaller and one half being larger than
the median. With $s = 1$ the region holds nine values, so the median is the fifth
smallest.
[Figure: intensity inversion]
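To make the index arithmetic of the flip, rotation, and inversion concrete, here is a minimal sketch that computes, for a pixel of the result J, which pixel of I it is copied from. It assumes zero-based indices, with i counted from the left edge and j from the top edge of the image; the type Pos and the function names are hypothetical and only serve the illustration.

```cpp
#include <cassert>

// A pixel position: i runs along the width, j along the height.
struct Pos {
    int i;
    int j;
};

// Vertical flip: pixel (i, j) of J is copied from (i, h-1-j) of I,
// where h is the height of I.
Pos flipSource(Pos p, int h) {
    assert(p.j >= 0 && p.j < h);
    return {p.i, h - 1 - p.j};
}

// Clockwise rotation by 90 degrees: J is h wide and w tall, and pixel
// (i, j) of J is copied from (j, h-1-i) of I, where h is the height of I.
Pos rotateSource(Pos p, int h) {
    assert(p.i >= 0 && p.i < h);
    return {p.j, h - 1 - p.i};
}

// Intensity inversion of a single colour component with maximum value M.
int invert(int value, int M) {
    assert(value >= 0 && value <= M);
    return M - value;
}
```

For an input that is 3 pixels wide and 2 pixels high, the top-left pixel of the rotated image comes from the bottom-left pixel of the input, and the top-right pixel of the rotated image comes from the top-left pixel of the input, which is what a clockwise quarter turn should do.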
Part II. File Formats
Many image file formats exist, such as jpg, png, or gif. These file formats describe
how an image is stored in a file, usually after applying some compression algorithm to
reduce the overall file size. In this project, we use a format called portable pixmap
format (PPM), whose file extension is ppm. This image file format stores images
without any compression. Normally, most image viewers (e.g., ImageMagick, macOS
Preview) can deal with this type of image file.
[Figures: max filtering; mean filtering; median filtering]
A PPM file is composed of two parts: a header and its image content. For an RGB image
I as discussed above, the header is formatted as follows:
P6
w h
M
The header is directly followed by the image content:
RGBRGB……
RGBRGB……
where P6 is a code specifying that the image content is stored in binary, and $w$, $h$,
and $M$ are as defined above. As a remark, if P3 is used as the code, the image content
is stored in ASCII. After the header part, the file contains the list of the RGB values
written in binary. Among the starter files, you can find in ppm.cpp a simple example
that writes a PPM file. You can use that C++ code to better understand the PPM format.
Important: The PPM files that you will generate should be in the same format as in
this C++ code. Moreover, your program will only deal with PPM files written with the
code P6. The binary encoding can reduce the file sizes of the images you will deal with.
Moreover, you can assume that 𝑤 ≤ 800 , ℎ ≤ 800 , and 𝑀 = 255 . Those three
constants are denoted WMAX, HMAX, and M respectively in the code snippets below.
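As an illustration of the layout just described, the sketch below serialises a tiny image into P6 bytes and parses the header back. It is not the starter ppm.cpp: the function names toPpm and readHeader are hypothetical, the use of a single newline after each header line is an assumption (the starter code remains the reference for the exact whitespace), and the headers used should be checked against the permitted list.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>
#include <vector>

// Builds the bytes of a P6 file for a w-by-h image whose components are
// already laid out as R,G,B triples, one image row after another.
std::string toPpm(int w, int h, const std::vector<unsigned char> &rgb) {
    assert(w >= 0 && h >= 0);
    assert(rgb.size() == static_cast<std::size_t>(w) * h * 3);
    std::ostringstream os;
    os << "P6\n" << w << " " << h << "\n" << 255 << "\n";  // text header
    os.write(reinterpret_cast<const char *>(rgb.data()),
             static_cast<std::streamsize>(rgb.size()));     // raw content
    return os.str();
}

// Reads the code, w, h and M, and leaves `is` at the first content byte.
// Returns false if the header is malformed or the code is not P6.
bool readHeader(std::istream &is, int &w, int &h, int &maxVal) {
    std::string code;
    if (!(is >> code >> w >> h >> maxVal) || code != "P6") return false;
    is.get();  // consume the single whitespace byte that ends the header
    return true;
}
```

For a 2 × 1 image (one red pixel, one green pixel) the result is the 11 header bytes "P6\n2 1\n255\n" followed by 6 content bytes.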
IV. Programming Assignment
We describe in this section the different C++ functions you need to implement. Your
main function will read potential options passed to it and process the input image, if
any. We describe the behavior of the main function at the end. To help you understand
better, we ask you some questions (see Self-Quiz below) that you can answer for
yourself. You do not need to submit your answers to us.
We assume that a pixel takes values of the following type:
typedef struct {
    unsigned char red;
    unsigned char green;
    unsigned char blue;
} Rgb;
Since we have not learned dynamic memory management yet, you will create images
of the maximum sizes. Therefore, an image will be of the following type:
typedef struct {
    Rgb image[WMAX][HMAX];
    unsigned int w;
    unsigned int h;
} Image;
We call a value of this type an image array.
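Because the pixel array is always sized for the maximum dimensions, every image array occupies at least WMAX × HMAX × 3 bytes, about 1.9 MB for the limits of 800 given earlier, whatever its actual w and h are. The sketch below makes that visible. The constants are re-declared here only so that the example stands alone (in the project they come from constants.h), and scratchImage and pixelAt are hypothetical helpers, not part of the starter files.

```cpp
#include <cassert>

// Stand-ins for the values that the project defines in constants.h.
const unsigned int WMAX = 800;
const unsigned int HMAX = 800;

typedef struct {
    unsigned char red;
    unsigned char green;
    unsigned char blue;
} Rgb;

typedef struct {
    Rgb image[WMAX][HMAX];
    unsigned int w;
    unsigned int h;
} Image;

// Every Image reserves the full array, independently of w and h.
static_assert(sizeof(Image) >= WMAX * HMAX * 3,
              "an Image holds the full pixel array");

// Hypothetical helper: an image with static storage duration, so that it
// does not consume call-stack space.
Image &scratchImage() {
    static Image im;  // zero-initialised once, lives for the whole program
    return im;
}

// Hypothetical bounds-checked accessor using the current w and h.
Rgb &pixelAt(Image &im, unsigned int i, unsigned int j) {
    assert(i < im.w && j < im.h);
    return im.image[i][j];
}
```

Whether several such images fit as local variables depends on the stack limit of the platform, which is one reason to think about where each image array lives.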
First, you need to take care of the input and output of your program:
• implement a function that reads an image from an input stream (e.g., file or standard
input). This function takes two arguments: an input stream and an image array. The
image is read from the input stream and is stored in the second argument. The caller
can then access this image from the image array. The signature of this function is:
void readImage(std::istream &is, Image &imI);
Self-Quiz: Why is the image passed by reference? Could we remove & since an array
is inside? How else could we have passed imI?
• implement a function that writes an image array to an output stream (e.g., file or
standard output). This function takes two arguments: an output stream and an image
array. The image array is written in the output stream using the PPM file format
described above. The signature of this function is:
void writeImage(std::ostream &os, const Image &imI);
Self-Quiz: Why is imI passed like this?
Next, you need to implement the different image transformations. Each of them is
performed by its corresponding C++ function. All these functions take at least two
arguments: an input image and an output image. The caller can obtain the resulting
image after the application of an image transformation from the output image. The
signature of an image transformation function fun with only two arguments is:
void fun(const Image &imI, Image &imJ);
Self-Quiz: Could we have returned the resulting image via a return (assuming that we
do not know how to do memory allocation)?
• implement a function verticalFlip that vertically flips an image.
• implement a function rotate90 that rotates an image by 90° (clockwise).
• implement a function intensityInversion that inverts the intensity of the RGB
values.
• implement a function filter that applies the filtering operation. Since it depends
on an aggregating function, its signature is:
void filter(const Image &imI, Image &imJ, Agg f);
where the type of the aggregating function is defined by:
typedef unsigned char (*Agg)(const unsigned char values[2s+1][2s+1]);
which corresponds to a pointer to a function that takes a 2D array of unsigned chars
and returns an aggregated value as an unsigned char. Here 2s+1 is the side length of
the local region, so with s = 1 the argument is a 3 × 3 array. You will implement
three aggregating functions: max, mean, and median.
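The role of the function pointer can be sketched as follows: one filter loop builds the 3 × 3 window for each pixel and colour, and the aggregating function passed as f decides what to do with it. This is a sketch under stated assumptions, not the reference solution. The names S, SIDE, maxAgg, meanAgg, medianAgg, and channel are hypothetical; the mean truncates toward zero; and positions beyond the right or bottom edge are treated as zero in the same way as negative indices, which the text above does not spell out.

```cpp
#include <algorithm>
#include <cassert>

// Stand-ins for constants.h and image.h; S and SIDE stand for s and 2s+1.
const unsigned int WMAX = 800;
const unsigned int HMAX = 800;
const int S = 1;
const int SIDE = 2 * S + 1;

typedef struct {
    unsigned char red;
    unsigned char green;
    unsigned char blue;
} Rgb;

typedef struct {
    Rgb image[WMAX][HMAX];
    unsigned int w;
    unsigned int h;
} Image;

typedef unsigned char (*Agg)(const unsigned char values[SIDE][SIDE]);

unsigned char maxAgg(const unsigned char v[SIDE][SIDE]) {
    unsigned char best = 0;
    for (int k = 0; k < SIDE; ++k)
        for (int l = 0; l < SIDE; ++l)
            best = std::max(best, v[k][l]);
    return best;
}

unsigned char meanAgg(const unsigned char v[SIDE][SIDE]) {
    unsigned int sum = 0;
    for (int k = 0; k < SIDE; ++k)
        for (int l = 0; l < SIDE; ++l)
            sum += v[k][l];
    // Truncating integer division: the rounding rule is an assumption.
    return static_cast<unsigned char>(sum / (SIDE * SIDE));
}

unsigned char medianAgg(const unsigned char v[SIDE][SIDE]) {
    unsigned char flat[SIDE * SIDE];
    for (int k = 0; k < SIDE; ++k)
        for (int l = 0; l < SIDE; ++l)
            flat[k * SIDE + l] = v[k][l];
    std::sort(flat, flat + SIDE * SIDE);
    return flat[(SIDE * SIDE) / 2];  // middle of the sorted values
}

// Picks one colour channel so that a single loop serves all three.
unsigned char channel(const Rgb &p, int c) {
    return c == 0 ? p.red : (c == 1 ? p.green : p.blue);
}

void filter(const Image &imI, Image &imJ, Agg f) {
    assert(f != nullptr && &imI != &imJ);  // needs a distinct output image
    imJ.w = imI.w;
    imJ.h = imI.h;
    const int w = static_cast<int>(imI.w);
    const int h = static_cast<int>(imI.h);
    for (int i = 0; i < w; ++i) {
        for (int j = 0; j < h; ++j) {
            unsigned char out[3];
            for (int c = 0; c < 3; ++c) {
                unsigned char win[SIDE][SIDE] = {};  // zero outside the image
                for (int k = -S; k <= S; ++k)
                    for (int l = -S; l <= S; ++l) {
                        const int x = i + k, y = j + l;
                        if (x >= 0 && y >= 0 && x < w && y < h)
                            win[k + S][l + S] = channel(imI.image[x][y], c);
                    }
                out[c] = f(win);
            }
            imJ.image[i][j].red = out[0];
            imJ.image[i][j].green = out[1];
            imJ.image[i][j].blue = out[2];
        }
    }
}
```

With this arrangement, a call such as filter(imI, imJ, medianAgg) differs from the max and mean variants only in its last argument, so the window-building code is written once.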
Last, you need to code your main function, which will call your previous functions
depending on its program arguments. Your program should work according to the
following syntax:
Usage: mip [-i input file] [-o output file] -t transformation
We will call this previous line the help message.
Recall that in this help message, options in brackets mean that they are optional. The
allowed transformations are verticalFlip, rotate90, intensityInversion,
maxFilter, meanFilter, and medianFilter.
Whenever the program is called with option --help or -h, even if other options are
also given, the help message should be printed and the program should stop without
creating an output image.
If the program is called with incorrect arguments, the program should stop and print
the corresponding error message. There is at most one incorrect argument in each
command.
Incorrect arguments correspond to the following cases:
• The specified input file does not exist. Error message:
Error: The specified input file does not exist.
• The specified input file exists, but is not a PPM file. Error message:
Error: The specified input file exists, but is not a PPM file.
• The specified transformation does not correspond to any accepted transformations.
Error message:
Error: The specified transformation does not correspond to any accepted transformations.
Important: When you print a message, end it with a newline. In any case, do not
print any other messages, or your grade on JOJ will be penalized.
Examples:
Command:
./mip -t invalid_transform -i test.ppm -o test_out.ppm
Output:
Error: The specified transformation does not correspond to any accepted transformations.
Command:
./mip -t verticalFlip -i test.txt -o test_out.ppm
Output:
Error: The specified input file exists, but is not a PPM file.
Command:
./mip --help -abc -def
Output:
Usage: mip [-i input file] [-o output file] -t transformation
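The argument handling described above can be sketched as two passes over the arguments: the first looks only for the help flag, since it takes priority over everything else, and the second collects the values that follow -i, -o, and -t. The struct Options and the functions parseArgs and isKnownTransform are hypothetical names, and silently skipping unrecognised arguments is a choice made for this sketch only.

```cpp
#include <cassert>
#include <string>

// Hypothetical container for the parsed command line.
struct Options {
    bool help = false;
    std::string input;      // empty: read from the standard input
    std::string output;     // empty: write to the standard output
    std::string transform;
};

Options parseArgs(int argc, const char *const argv[]) {
    assert(argc >= 1 && argv != nullptr);
    Options opt;
    // First pass: the help flag wins over every other option.
    for (int k = 1; k < argc; ++k) {
        const std::string arg = argv[k];
        if (arg == "-h" || arg == "--help") {
            opt.help = true;
            return opt;
        }
    }
    // Second pass: each recognised flag consumes the argument after it.
    for (int k = 1; k + 1 < argc; ++k) {
        const std::string arg = argv[k];
        if (arg == "-i") opt.input = argv[++k];
        else if (arg == "-o") opt.output = argv[++k];
        else if (arg == "-t") opt.transform = argv[++k];
    }
    return opt;
}

// True if `name` is one of the six transformations listed in the handout.
bool isKnownTransform(const std::string &name) {
    static const char *const names[] = {
        "verticalFlip", "rotate90",   "intensityInversion",
        "maxFilter",    "meanFilter", "medianFilter"};
    for (const char *candidate : names)
        if (name == candidate) return true;
    return false;
}
```

In main, a caller would print the help message when help is set, print the transformation error when isKnownTransform returns false, and only then open the streams and run the transformation.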
VI. Implementation Requirements and Restrictions
1. When writing your code, you may use the following standard header files:
, , , , , and
. No other header files can be included.
2. All required output should be sent to the standard output stream; none to the
standard error stream.
VII. Source Code Files and Compiling
To compile, you should have constants.h, mip.cpp, and image.h in your directory.
Note that you create mip.cpp yourself.
Use the following Linux command to compile:
g++ --std=c++17 -o mip mip.cpp -Wall -Werror
We use some features of C++17 to implement debug functions. Be sure to use the
option --std=c++17 when compiling your program.
In order to guarantee that the TAs can compile your program successfully, you should
name your source code files exactly as specified above. For this project, as usual,
the penalty for code that does not compile will be severe, regardless of the
reason.
VIII. Submitting and Due Date
You should submit the source code files (in one compressed file) via Online Judge.
The due time is 11:59 pm on June 14th, 2023.
IX. Grading
Your program will be graded along three criteria:
1. Functional Correctness
2. Implementation Constraints
3. General Style
Functional Correctness is determined by running a variety of test cases against your
program, checking against our reference solution. We will grade Implementation
Constraints to see if you have met all of the implementation requirements and
restrictions. General Style refers to the ease with which TAs can read and understand
your program, and the cleanliness and elegance of your code. For example, significant
code duplication will lead to General Style deductions.