COEN 346 – Operating Systems
Assignment 01
Introduction
The key to this experiment is that all worker threads must synchronize their updates to
the shared count value. There are several ways to achieve this; here we use the
AtomicInteger class so that each child thread changes count atomically. In the Java
language, the ++i and i++ operations are not thread-safe, so without an atomic type the
update would have to be guarded with the synchronized keyword. AtomicInteger provides
thread-safe increment and decrement methods. Using AtomicInteger, we can increment
count from multiple child threads.
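As a minimal sketch of this idea (the class name AtomicCountDemo, the method name, and
the thread and iteration counts are illustrative assumptions, not taken from the
assignment code), several threads can increment one shared AtomicInteger like this:

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCountDemo {
    // Starts `threads` threads that each call incrementAndGet() `perThread` times,
    // waits for all of them, and returns the final counter value.
    static int countWithThreads(int threads, int perThread) {
        AtomicInteger count = new AtomicInteger(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    count.incrementAndGet(); // atomic read-modify-write
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // wait until this worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return count.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10_000)); // prints 40000
    }
}
```

Because every increment is atomic, no update is lost regardless of how the threads are
scheduled.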
Classes
Tool class
Worker_thread
Main
Tool class
We store all variables used by the main thread in the Tool class, such as count,
worker_number, and pattern. The Tool object is created in the main thread, and all
threads share it by reference.
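A possible layout of such a shared-state class is sketched below. The field names count,
worker_number, and pattern follow the report; the field types, the initial values, the
outer class ToolSketch, and the helper method are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ToolSketch {
    // Hypothetical shape of the shared state; only the field names come from the report.
    static class Tool {
        final AtomicInteger count = new AtomicInteger(0); // detected vulnerabilities
        int worker_number = 2;                            // threads per round (initially two)
        final String pattern;                             // vulnerability pattern

        Tool(String pattern) {
            this.pattern = pattern;
        }
    }

    // Shows that two references to one Tool object observe the same count.
    static int incrementThroughAlias(Tool tool) {
        Tool sameTool = tool;             // copies the reference, not the object
        sameTool.count.incrementAndGet(); // update through the second reference
        return tool.count.get();          // visible through the first reference
    }
}
```

Passing the same reference to every worker is what lets all threads update one counter.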
Worker_thread
The child thread class, which extends the Thread class. At construction, it receives the
Tool object and the line number of the log to operate on. It overrides the run method,
which calls the LevenshteinDistance function to calculate the difference between the
current log and the pattern. If acceptable_change is true, it increments the count in
the Tool object.
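The structure of such a worker might look like the following sketch. The levenshtein
method here is a standard dynamic-programming edit distance used as a stand-in for the
provided LevenshteinDistance function, and the acceptance rule is passed in as a
predicate on the distance because its exact form belongs to the assignment code. The
names WorkerSketch, WorkerThread, and runOne are hypothetical.

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntPredicate;

class WorkerSketch {
    // Standard edit distance (stand-in for the provided LevenshteinDistance function).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) prev[j] = j;
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev; prev = curr; curr = tmp; // reuse the two rows
        }
        return prev[b.length()];
    }

    static class WorkerThread extends Thread {
        private final AtomicInteger count;
        private final String pattern;
        private final List<String> logs;
        private final int lineNumber;
        private final IntPredicate acceptableChange; // assumed form of the acceptance rule

        WorkerThread(AtomicInteger count, String pattern, List<String> logs,
                     int lineNumber, IntPredicate acceptableChange) {
            this.count = count;
            this.pattern = pattern;
            this.logs = logs;
            this.lineNumber = lineNumber;
            this.acceptableChange = acceptableChange;
        }

        @Override
        public void run() {
            int distance = levenshtein(logs.get(lineNumber), pattern);
            if (acceptableChange.test(distance)) {
                count.incrementAndGet(); // thread-safe update of the shared count
            }
        }
    }

    // Runs a single worker on one line and returns the resulting count (0 or 1).
    static int runOne(String logLine, String pattern, IntPredicate rule) {
        AtomicInteger count = new AtomicInteger(0);
        Thread worker = new WorkerThread(count, pattern,
                Collections.singletonList(logLine), 0, rule);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return count.get();
    }
}
```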
Main
Instantiates the Tool object and reads vm_1.txt, storing all logs in logs (an ArrayList).
It creates worker threads according to worker_number. Each worker thread is assigned an
index, which serves both as the thread ID and as the row number of the log to operate
on. The main thread then calls the join method and, once all child threads of the
current round have finished, calculates the latest approximate_avg. If it is more than
20% larger than the previous value, the number of worker threads increases by 2, and the
next round starts.
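One round of this scheme can be sketched as follows. The names RoundSketch and runRound,
the start offset for later rounds, and the Predicate standing in for the
Levenshtein-based check are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

class RoundSketch {
    // Runs one round: one thread per row in [start, start + workerNumber),
    // clipped to the end of the log list. Returns the matches found in this round.
    static int runRound(List<String> logs, int start, int workerNumber,
                        Predicate<String> matches) {
        AtomicInteger count = new AtomicInteger(0);
        List<Thread> workers = new ArrayList<>();
        int end = Math.min(start + workerNumber, logs.size());
        for (int row = start; row < end; row++) {
            final String line = logs.get(row); // the row index selects the log line
            Thread t = new Thread(() -> {
                if (matches.test(line)) {
                    count.incrementAndGet();
                }
            }, "worker-" + row);               // the row index also names the thread
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // the main thread waits for every worker of this round
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return count.get();
    }
}
```

Joining every worker before reading the count guarantees that the value used for the
average reflects the whole round.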
Program Flow
1. Read the txt file into an array.
2. Each thread (initially two) reads one line for processing.
3. Use the provided LevenshteinDistance algorithm to compare strings, and then update
the global variable count from multiple threads based on the result.
4. After all the threads have finished, the main thread updates approximate_avg
(initially zero) according to the results. If the requirement is met, two more
threads are created for the next round.
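The control flow of the steps above can be sketched as a loop. This is only an outline
under stated assumptions: the per-round work is abstracted into a hypothetical
roundRunner callback that returns the number of matches in a round, temp_avg is taken
as count divided by the total number of lines, the requirement is read as "temp_avg is
strictly more than 20% larger than approximate_avg", and approximate_avg is assumed to
be replaced by temp_avg after every round.

```java
import java.util.function.IntBinaryOperator;

class AdaptiveLoopSketch {
    // roundRunner.applyAsInt(start, workers) stands in for launching `workers` threads
    // on lines [start, start + workers) and returning how many of them matched.
    // Returns the worker count in effect after all lines have been processed.
    static int run(int totalLines, IntBinaryOperator roundRunner) {
        int workerNumber = 2;        // initial two threads
        int count = 0;               // detected vulnerabilities so far
        double approximateAvg = 0.0; // initial zero
        int next = 0;                // next unprocessed line
        while (next < totalLines) {
            int workers = Math.min(workerNumber, totalLines - next);
            count += roundRunner.applyAsInt(next, workers);
            next += workers;
            double tempAvg = (double) count / totalLines;
            if (tempAvg > 1.2 * approximateAvg) {
                workerNumber += 2;   // two more threads for the next round
            }
            approximateAvg = tempAvg; // assumed update rule
        }
        return workerNumber;
    }
}
```

For example, if no line ever matches, temp_avg stays at zero and the sketch keeps the
initial two threads for the whole run.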
Execution Results
The vulnerability pattern in my implementation is:
7K205MBSYBCT49JT8NZK2N137DIWX7WUPNBJGLVIGN0LB6OBM68ONZ1S
9L1VO7DS9D. Since the program needs more than 9,000 iterations to read all 200,000
lines of log statements, I only show the screenshots of the results of the first ten
iterations below.
The worker thread’s task is to search its assigned log statement for similarity with the
vulnerability pattern. If the value of acceptable_change is true (change ratio greater
than or equal to 0.05), the worker thread increases the number of detected
vulnerabilities by one. Whenever temp_avg is more than 20% larger than the previous
value of approximate_avg, the master thread increases the number of worker threads
launched for the next iteration by 2.
As we can see, after the 9th iteration the number of worker threads stops increasing,
since temp_avg = 132/200000 = 6.6E-4, which is exactly 1.2 times 5.5E-4 (the previous
value of approximate_avg) and is therefore not more than 20% larger than it.
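The comparison for the 9th iteration can be checked with exact decimal arithmetic,
reading the rule as "strictly more than 20% larger than the previous value", which is an
interpretation consistent with the numbers above. BigDecimal is used here only to keep
the arithmetic exact; the report does not state which numeric type the program uses, and
the names ThresholdCheck and shouldAddWorkers are hypothetical.

```java
import java.math.BigDecimal;

class ThresholdCheck {
    // True when tempAvg is strictly more than 20% larger than previousAvg.
    static boolean shouldAddWorkers(BigDecimal tempAvg, BigDecimal previousAvg) {
        BigDecimal threshold = previousAvg.multiply(new BigDecimal("1.2"));
        return tempAvg.compareTo(threshold) > 0;
    }

    public static void main(String[] args) {
        // temp_avg = 132 / 200000 = 0.00066
        BigDecimal tempAvg = new BigDecimal(132).divide(new BigDecimal(200000));
        BigDecimal previousAvg = new BigDecimal("5.5E-4");
        // 1.2 * 0.00055 = 0.00066, so temp_avg equals the threshold and does not exceed it.
        System.out.println(shouldAddWorkers(tempAvg, previousAvg)); // prints false
    }
}
```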
Conclusion
For this programming assignment, the goal is to divide the task among multiple threads
so that the sub-threads complete it together (each sub-thread reads one line of log
statement and processes it). When multiple child threads operate on the same variable
or data structure, it must be ensured that only one thread at a time modifies it. If a
thread context switch occurs in the middle of such an operation, the child thread is
suspended with its update half finished, which can lead to erroneous results. The
statement count++, for example, is ostensibly a single operation, but it is executed as
three separate steps (load, add, store). When a thread is forced to pause after the
first step, another thread can change count in the meantime, and the paused thread
later overwrites that change with a stale value.
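To make the lost update concrete, the sketch below replays one unlucky interleaving of
two threads by hand in a single thread, so the outcome is deterministic. The local
variables a and b stand for the private copies that threads A and B would hold; the
class and method names are illustrative.

```java
class LostUpdateSketch {
    // Replays two count++ operations whose load/add/store steps are interleaved.
    static int interleavedIncrements() {
        int count = 0;
        int a = count; // thread A: load count (reads 0), then A is paused
        int b = count; // thread B: load count (also reads 0)
        a = a + 1;     // thread A: add
        b = b + 1;     // thread B: add
        count = a;     // thread A: store 1
        count = b;     // thread B: store 1, overwriting A's update
        return count;  // 1 instead of the expected 2
    }

    public static void main(String[] args) {
        System.out.println(interleavedIncrements()); // prints 1
    }
}
```

An atomic increment rules out this interleaving because the load, add, and store happen
as one indivisible operation.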