CSCI 1100 — Computer Science 1 Homework 7

Dictionaries

Overview

This homework is worth 100 points toward your overall homework grade, and is due Thursday,

November 14, 2019 at 11:59:59 pm. It has two parts, each worth 50 points. Please download

hw7_files.zip and unzip it into the directory for your HW7. You will find multiple data files to

be used in both parts.

The goal of this assignment is to work with dictionaries. In part 1, you will do some simple file

processing. Read the guidelines very carefully there. In part 2, we have done all the file work for

you so you should be able to get the data loaded in just a few lines. For both parts, you will spend

most of your time manipulating dictionaries given to you in the various files.

Please remember to name your files hw7_part1.py and hw7_part2.py.

As always, make sure you follow the program structure guidelines. You will be graded on program

correctness as well as good program structure.

Remember as well that we will be continuing to test homeworks for similarity. So, follow our

guidelines for the acceptable levels of collaboration. You can download the guidelines from the

Course Resources section of Submitty if you need a refresher. Note that this includes using someone

else’s code from a previous semester. Make sure the code you submit is truly your own.

Autocorrect, now improved a bit more...

As promised, this is a simple modification of the HW6 version of autocorrect. You will make a

few changes to make it more realistic. We will describe the whole homework but point out the

differences in bold. Feel free to start from your HW6 code. Do not start from someone else’s HW6

code.

To solve this problem, your program will read the names of three files:

• the first contains a dictionary of words (Note: the format of the file is changed)

• the second contains a list of words to autocorrect (as before)

• the third (a new file) contains potential letter substitutions (described below).

The input word file has a single word per line as before, but the dictionary file has two entries per

line: the first entry on the line is a single valid word in the English language and the second entry

is a float representing the frequency of the word in the lexicon. The two values are separated by a

comma. The inclusion of frequency is a slight change from the dictionary used in the

previous assignment.

Read the English dictionary into a Python dictionary, using words as keys and frequency as values.

You will use the frequency for deciding the most likely correction.
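
As a rough illustration of the dictionary format described above, the following sketch turns "word,frequency" lines into a Python dictionary. The function name parse_word_frequencies and the sample words and frequencies are made up for this example; they are not taken from the provided data files.

```python
def parse_word_frequencies(lines):
    """Build a {word: frequency} dictionary from 'word,frequency' lines."""
    frequencies = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        word, freq = line.split(",")
        frequencies[word.strip()] = float(freq)
    return frequencies


# Hypothetical sample lines; the real file contents will differ.
sample = ["the,0.0523", "cat,0.0001"]
words = parse_word_frequencies(sample)
print(words["cat"])
```

Because iterating over an open file yields its lines, the same function should also accept a file object opened with open().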

The keyboard file has a line for each letter. The first entry on the line is the letter to be replaced

and the remaining letters are possible substitutions for that letter. All the letters on the line are

separated by spaces. These substitutions are calculated based on adjacency on the keyboard, so if

you look down at your keyboard, you will see that the “a” key is surrounded by “q”, “w”, “s”, and

“z”. Other substitutions were calculated similarly, so

b v f g h n

means that a possible replacement for b is any one of v f g h n. Read this keyboard into a

dictionary: the first letter is the key (e.g. b) and the remaining letters are the value, stored as a

list.
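
One possible way to build the keyboard dictionary is sketched below. The function name parse_keyboard is an assumption made for this example; the sample line is the one shown above.

```python
def parse_keyboard(lines):
    """Map each letter to the list of its possible substitutions."""
    keyboard = {}
    for line in lines:
        letters = line.split()  # letters are separated by spaces
        if not letters:
            continue  # skip blank lines
        keyboard[letters[0]] = letters[1:]
    return keyboard


keyboard = parse_keyboard(["b v f g h n"])
print(keyboard["b"])
```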

Your program will then go through every single word in the input file, autocorrect each word and

print the correction. To correct a single word, you will consider the following:

FOUND If the word is in the dictionary, it is correct. There is no need for a change. Print it as

found, and go on to the next word. (No change: you are now checking whether the word is a key

in your dictionary. Please use the in operator on the dictionary to test this. Do not use a loop.)

DROP If the word is not found, consider all possible ways to drop a single letter from the word.

Store any of the words that are in your English dictionary in some container

(list/set/dictionary). Note that you do not stop if a match is found here. Read

below for ranking all matches.
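
A minimal sketch of the DROP step, assuming the English dictionary is already loaded with words as keys. The name drop_candidates and the sample dictionary are hypothetical, and a set is used here only as one of the allowed containers.

```python
def drop_candidates(word, english):
    """Return every dictionary word reachable by dropping one letter."""
    matches = set()
    for i in range(len(word)):
        candidate = word[:i] + word[i + 1:]  # word without the letter at i
        if candidate in english:  # membership test on the dictionary keys
            matches.add(candidate)
    return matches


# Made-up frequencies, for illustration only.
english = {"cat": 0.5, "cart": 0.2, "car": 0.3}
print(sorted(drop_candidates("cart", english)))
```

The loop keeps going after the first match, so every valid word is collected for the later ranking.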

INSERT If the word is not found, consider all possible ways to insert a single letter in the word.

Store any of the words that are in your English dictionary in some container

(list/set/dictionary). Note that you do not stop if a match is found here. Read

below for ranking all matches.
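
A comparable sketch for the INSERT step. It assumes that the letters to try are the lowercase letters a through z, which this section does not state explicitly; the name insert_candidates and the sample dictionary are also made up for illustration.

```python
import string


def insert_candidates(word, english):
    """Return every dictionary word reachable by inserting one letter."""
    matches = set()
    for i in range(len(word) + 1):  # every gap, including both ends
        for letter in string.ascii_lowercase:  # assumed alphabet: a-z
            candidate = word[:i] + letter + word[i:]
            if candidate in english:
                matches.add(candidate)
    return matches


# Made-up frequencies, for illustration only.
english = {"cat": 0.5, "act": 0.1, "dog": 0.2}
print(sorted(insert_candidates("ct", english)))
```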

SWAP Consider all possible ways to swap two consecutive letters in the word. Store any

of the words that are in your English dictionary in some container (list/set/dictionary). Note that you do not stop if a match is found here. Read below for