Symbol Table Applications

A closer look at symbol table applications.

OMAR ELGABRY
Feb 17, 2017 · 4 min read

The next few topics will show how symbol tables (associative arrays) might be used to solve practical problems.

Symbol Table Applications — algs4.cs.princeton.edu

Sets

They are just collection of distinct keys. There is no values associated; the keys themselves are values at the same time.

The operations are: add a key, ask if the set contains a key, or remove a key.

Sets are commonly implemented in the same way as associative arrays, either using:

Therefore, sets can be implemented using associative arrays, but, we either:

  • Remove all references to “value” field from any symbol table implementation.
  • Use a dummy value as the values, which is ignored.
  • Spell Checker: Store words in a dictionary, identify misspelled words.
  • Browser: Store visited pages, mark the current page if visited not.
  • Spam Filter: Store spam IP addresses, eliminate IP address if it’s a spam.

All of these applications are “Exception filter” application; meaning, they store either a whitelist or blacklist of keys, then check if a key exists or not.

public class WhiteList { 
public static void main(String[] args) {
SET<String> set = new SET<String>();
String[] whitelist = {/*...*/};
String[] input = {/*...*/};

for(String word: whitelist) // store whitelist words
set.add(word);
for(String word: input) // print whitelist words in input
if (set.contains(word))
System.out.println(word);
}
}

Dictionary Clients

It’s an application of symbol tables, where we find the value (if any) that’s mapped to a given key in a list of key-value pairs.

  • DNS Lookup: Store a list of URL and it’s associated IP address in symbol table. The key can be the URL, and value is the IP address, or vice-versa.
  • Student Lookup: Store a list of students data, where key can be for example their id, and first name is the value.

Indexing Clients

It’s same as Dictionary Clients. But, this time, the value is a list of values, instead of only one value.

  • File Indexing: Store words in all files, then tell me which files contain a given query string. The key is the word, and the value is a set of files containing this word.
import java.io.File;
public class FileIndex {
public static void main(String[] args) {
// The value can have type of "Set";
// a collection of distinct items.
ST<String, SET<File>> st = new ST<String, SET<File>>();
String files[] = {/*...*/};

// for each word in file, add to corresponding set
for (String filename : files) {
File file = new File(filename);
String words[] = file.getWords();
for(String word: words) {
if (!st.contains(word))
st.put(word, new SET<File>());
SET<File> set = st.get(word);
set.add(file);
}
}
// get list of files for each word
String input[] = {/*...*/};
for(String word: input)
System.out.println(st.get(word));
}
}
  • Book Index: Store words in an article, then tell me the position(s) where a word appeared in the article. The key is the word, and the value is a set of indexes where the word appeared in the article.

Sparse Vectors

Matrix-vector multiplication — algs4.cs.princeton.edu

The matrix-vector multiplication; multiplying matrix by a vector using the brute force solution takes O(N²).

And, if the matrix(or the vector) is sparse, meaning, most of the elements are zero, which is the case in many applications, then using associative arrays can get better performance.

We assume if matrix dimension is 10⁴, then non-zeros per row ~ 10 on average.

Although using a 1d array takes a constant time to access an element, but, it might consume a lot of wasted space if most of the elements are zeros; Sparse Vector. In addition, iterating over the elements would take a linear time in the worst case.

So, using associative arrays instead can save time and space (both are ~= non-zero keys).

Vector Representations — algs4.cs.princeton.edu

Thus, we can make a data type called “SparseVector”, which uses the symbol table to represent a sparse vector (a vector with a lot of zeros), and provides extra methods for mathematical operations, such as finding dot product.

A matrix can also have two representations; 2d array, or an array of associative arrays, or SparseVectors.

Using associative arrays save time and space since most of the elements are zeros (both are ~= non-zero keys in each row, and plus N for space).

Matrix Representations — algs4.cs.princeton.edu

The number of iterations equals to non-zero keys in each row (constant for each row). Therefore, using associative arrays can get linear running time for sparse matrix.

SparseVector[] matrix = new SparseVector[N]; 
double[] vector = new double[N];
double[] result = new double[N];
// Initialize matrix[] and vector[]for (int i = 0; i < N; i++)
result[i] = matrix[i].dot(vector);

OmarElGabry's Blog

This is my freedom area. Don't underestimate it. The devil is in the detail.

OMAR ELGABRY

Written by

Software Engineer. Going to the moon 🌑. When I die, turn my blog into a story.

OmarElGabry's Blog

This is my freedom area. Don't underestimate it. The devil is in the detail.

Welcome to a place where words matter. On Medium, smart voices and original ideas take center stage - with no ads in sight. Watch
Follow all the topics you care about, and we’ll deliver the best stories for you to your homepage and inbox. Explore
Get unlimited access to the best stories on Medium — and support writers while you’re at it. Just $5/month. Upgrade