CSci 161 Lab #9
 

There are 50 reserved words in the Java programming language. You can find a list of them at this web site. They vary in length from 2 to 12 characters. Experiment with different hash table lengths to see if you can find a collision-free hash function (i.e., a perfect hash function) for these 50 items. Use the following strategy to compute hash values:

  1. pad the word with leading blanks, if necessary, to make the length a multiple of 4
  2. break the word up into 4-character (32-bit) chunks, convert these chunks into 32-bit integers, and add them up (this is called "folding")
  3. compute a 64-bit long integer by squaring the result of the previous step
  4. "extract" the middle 32-bits from the 64-bit value above (use shifts to get rid of the first 16 and last 16 bits)
  5. divide the middle_bits value computed in the previous step by the length of your hash table, keep the remainder as your hash value (i.e. hash_value = middle_bits%table_length)
You may find it easiest to get the reserved words into your program by hard-coding an array of strings using a syntax like
	String[] reservedWords = {"abstract", "assert", "boolean", ...};  // this is one long long long line of code
	
Alternatively, you could read these reserved words into your program from a text file.

In a loop, you want to grab each of these reserved words, one at a time, and compute its hash value for a specified table size. After the loop completes execution, you want to see if there were any collisions. If might be easier to make your program check for collisions, rather than having yourself manually check the 50 hash values for duplicates.

Inputting one or more table sizes as command line parameters would be a convenience. You don't want to have to recompile your program each time you change the table size.

 
When you are finished, ask your lab instructor to check your work and record your scores.