For this project, you will be implementing both a modified AVL tree and a 2-5 tree (this is a 2-4 tree but you are allowed to have up to 5 children as apposed to 4). The code is to be uploaded to gradescope under Programming Assignment 3, where you may test your code against the autograder. Gradescope should be able to run your program using the command “make” followed by “./prog1.out”. Be sure to include all .cpp/.h files that make up your program, the dataset provided a Makefile, and name the executable prog1.out Gradescope has clang++ (clang 10) and a g++ compiler available. Please note that it is possible that your program could have some unexpected behavior on gradescope’s auto grader compared to whatever machine you wrote the code on, so submit your program early and make sure it is bug free on gradescope. You are to implement your own AVL/2-5 trees, so you may not use any libraries that automatically do this for you.
Each node in the AVL or 2-5 trees is a pair of (word,counter) where counter shows the number of occurrence of the word in the dataset.
The AVL trees we will be implementing in this assignment are identical to the AVL trees you learned in class, except that here the height constraint will allow for a height dierence of the subtrees of a given node to be at most 2 (as apposed to 1, which is the version you learned in class. The tree should NOT balance itself if the height dierence is only 2), So your code should only perform rotations if the height dierence is over 2.
Each class should have at least the following functions:
- A constructor and a destructor.
- A function for searching a word (the word may or may not exist).
- A function for inserting a new word or incrementing the counter if the word is already inserted.
- A function for doing a range search. The function takes as input two words. Given two words, the function should find all the words in between. The resulting words should be sorted.
Additionally, in order to verify your data structures, you will need to implement the following methods for both classes:
- A print function that prints out a pre-order traversal of the tree.
- A function that prints out the height of the tree.
These last two functions will not be apart of gradescope’s autograder test, but will be used by us to ensure that you have properly implemented the data structures using a modified AVL tree and a 2-5 tree. Please comment your code so it is clear how to use your print and height functions.
For this project, you should use PA3 dataset.txt available on piazza and gauchospace to initially build your data structures. You are expected to parse the dataset and insert the words contained in it. Insert each word using the insert method you implemented for both trees.
There should be no output when inserting words from PA3 dataset.txt. For your gradescope submission to be able to access the data set you must upload it and use the path:
Your code should build a modified AVL tree as described before and a 2-5 tree out of the dataset provided. Gradescope will pass a string of commands via argv1. Each command will be separated by a comma. Your code may receive the following commands: search, insert, range search from [word 1] to [word 2]. Your output should follow this structure EXACTLY to ensure full credit.
Gradescope will pass a string of commands via argv1. Each command will be separated by a comma. Your code may receive the following commands:
(a) Search a given word. Both the AVL tree and the 2-5 tree should be searched to find the word. The program should print out “[word] found”, along with the count of the word, once for each data structure. If the word is not found the program should output “[word] not found”, once for each data structure.
Ex: search hello hello found, count = 2 hello found, count = 2 or if hello is not in the data structure hello not found hello not found
(b) Insert a word. If the word is already present then its counter should be increased by one. The program should output “[word] inserted, new count = [count]”, once for each data structure.
Ex: insert goodbye goodbye inserted, new count = 1 goodbye inserted, new count = 1
(c) Do a range search. The program should print out all words alphabetically in between (an including) the two words provided in the range search that are in the data structures. Note that the two words given to perform the range search may or may not be in your data structures. The program should do this once for both data structures.
Ex: range search band to cat band bankers bat cab calling band bankers bat cab calling
(d) Print out a traversal of the tree. The program should print out the values of the tree using a pre-order traversal. For each node, you must do the following: print an open parenthesis, the node’s data, print the nodes’ children from left to right, followed by a close-parenthesis.
For each datum, you must print the key and value separated by a colon. If a node has multiple data elements, you should delimit them using commas. For any children that are null, you should print empty parentheses.
Ex: An AVL tree with root value (hello, 3) and children (ahoy, 4) and (howdy, 1) should be printed as follows: (hello:3(ahoy:4()())(howdy:1()())) A 2-5 tree node with values (cromulent, 2) and (embiggen, 4) should be printed as follows: (cromulent:2,embiggen:2()()())
(e) Print the height of the tree. The code should simply output: “Height = [height of tree]”
Your program receives the commands through argv1. In the following example we have that that the starting count of hello is 2, of yesterday is 3, and goodbye is not present in the data structure. We also have that the only words in between band and cat in the data structure are band, bankers, bat, and cab. gradescope passes the following commands through argv1:
search hello, insert goodbye, insert hello, range search band to cat
We expect the following output:
hello found, count = 2 hello found, count = 2 goodbye inserted, new count = 1 goodbye inserted, new count = 1 hello inserted, new count = 3 hello inserted, new count = 3 band bankers bat cab band bankers bat cab