CS 062, Lecture 41

Sets

So far we have seen two ways to represent sets:

  1. Bitstrings
  2. Hash tables

Bitstrings are fast and simple, and they support set operations directly through the usual bit operations: & for intersection, | for union, and ~ for complement. Set subtraction, A-B, can also be built from a combination of these operations (left as an exercise for the reader).

Unfortunately we need a discrete linear ordering for this to work (e.g., every element has a unique successor), and in fact the elements all need to be encodable as non-negative integers, since each element corresponds to one bit position. Thus bitstrings work well for subranges of ints, for chars, and for enumerated types. However, they will not work well for strings or for other complex orderings.
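As a rough illustration of the bitstring idea, here is a minimal sketch that stores a set of ints in the range 0..63 in a single long. The class name SmallIntSet and its methods are made up for this example and are not part of the course code. Complement is taken relative to the universe 0..63, and set subtraction is deliberately omitted since it is the exercise.

```java
// A minimal sketch (hypothetical class, not course code):
// a set of integers in the range 0..63 stored in one long.
final class SmallIntSet {
    private final long bits;   // bit i is 1 exactly when i is in the set

    private SmallIntSet(long bits) { this.bits = bits; }

    // Build a set from its elements; each element must be in 0..63.
    static SmallIntSet of(int... elements) {
        long b = 0L;
        for (int e : elements) {
            if (e < 0 || e > 63) {
                throw new IllegalArgumentException("element out of range: " + e);
            }
            b |= 1L << e;      // turn on the bit for element e
        }
        return new SmallIntSet(b);
    }

    boolean contains(int e) {
        return e >= 0 && e <= 63 && (bits & (1L << e)) != 0;
    }

    // Each set operation is a single bit operation on the underlying long.
    SmallIntSet union(SmallIntSet other)        { return new SmallIntSet(bits | other.bits); }
    SmallIntSet intersection(SmallIntSet other) { return new SmallIntSet(bits & other.bits); }
    SmallIntSet complement()                    { return new SmallIntSet(~bits); }

    int size() { return Long.bitCount(bits); }

    public static void main(String[] args) {
        SmallIntSet a = SmallIntSet.of(1, 3, 5);
        SmallIntSet b = SmallIntSet.of(3, 4, 5);
        System.out.println(a.union(b).size());          // 4
        System.out.println(a.intersection(b).size());   // 2
        System.out.println(a.complement().contains(3)); // false
    }
}
```

For universes larger than 64 elements, the same idea extends to an array of longs; Java's java.util.BitSet class packages that up.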

Hash tables work well for representing sets when we have a good hash function, but they don't support the set operations well at all. Taking a union or intersection would involve traversing all of the slots of a hash table (empty and non-empty) to process the individual elements. This takes O(N) time, where N is the size of the table, and N is usually larger than n, the number of elements actually stored (strictly speaking, "O" ignores constant factors, but you know what I mean!).

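A simple alternative is to use an ordered sequence (e.g., an ordered linked list). We can then perform union, intersection, and set subtraction by walking through both sequences in step, much like the merge step of merge sort: compare the two current elements, copy or skip as the operation requires, and advance past the smaller one (or past both if they are equal). If moving to the next element, comparing, and copying are all O(1), then the entire operation is O(n+m), where n and m are the sizes of the two sets involved.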
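The merge-style walk over two ordered sequences can be sketched as follows. This is only an illustration under stated assumptions, not the course's OrderedListSet: the class name SortedListSets is hypothetical, both inputs are assumed to be sorted in ascending order with no duplicates, and the index-based get calls assume an array-backed list (a linked list would use iterators to keep each step O(1)).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// A minimal sketch (hypothetical class): set operations on sorted,
// duplicate-free lists. Each method makes one pass over both inputs.
final class SortedListSets {

    static <T extends Comparable<T>> List<T> union(List<T> a, List<T> b) {
        List<T> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            int c = a.get(i).compareTo(b.get(j));
            if (c < 0)      out.add(a.get(i++));         // only in a
            else if (c > 0) out.add(b.get(j++));         // only in b
            else          { out.add(a.get(i++)); j++; }  // in both: copy once
        }
        while (i < a.size()) out.add(a.get(i++));        // leftovers of a
        while (j < b.size()) out.add(b.get(j++));        // leftovers of b
        return out;
    }

    static <T extends Comparable<T>> List<T> intersection(List<T> a, List<T> b) {
        List<T> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            int c = a.get(i).compareTo(b.get(j));
            if (c < 0)      i++;                         // only in a: skip
            else if (c > 0) j++;                         // only in b: skip
            else          { out.add(a.get(i++)); j++; }  // in both: keep
        }
        return out;
    }

    // Elements of a that are not in b (A-B).
    static <T extends Comparable<T>> List<T> difference(List<T> a, List<T> b) {
        List<T> out = new ArrayList<>();
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            int c = a.get(i).compareTo(b.get(j));
            if (c < 0)      out.add(a.get(i++));         // only in a: keep
            else if (c > 0) j++;                         // only in b: skip
            else          { i++; j++; }                  // in both: drop
        }
        while (i < a.size()) out.add(a.get(i++));        // rest of a is not in b
        return out;
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("ant", "bee", "cat");
        List<String> b = Arrays.asList("bee", "cat", "dog");
        System.out.println(union(a, b));        // [ant, bee, cat, dog]
        System.out.println(intersection(a, b)); // [bee, cat]
        System.out.println(difference(a, b));   // [ant]
    }
}
```

Note that the elements here are strings, which a bitstring representation could not handle; all the merge needs is a way to compare two elements.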

See the text for the Template Method design pattern, and see the OrderedListSet code online.

Visitor Pattern

The Parser Visitor classes illustrate how the Visitor Pattern works. The idea is that rather than building all of the operations into a collection of classes, you build in a hook that allows a "Visitor" object to visit the tree, getting access to whatever information is necessary to perform the operations on the tree. See ParserVisitor5.
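To make the "hook" idea concrete, here is a minimal sketch on a tiny expression tree. It is not the ParserVisitor code from the course; every name in it (Expr, Num, Add, Mul, ExprVisitor, EvalVisitor, ShowVisitor) is invented for this example. The only operation built into the tree classes is accept, and each new operation on the tree is written as a separate visitor class.

```java
// A minimal sketch of the Visitor Pattern (all names hypothetical).

// One visit method per kind of tree node.
interface ExprVisitor<R> {
    R visitNum(Num n);
    R visitAdd(Add a);
    R visitMul(Mul m);
}

// The hook: every node knows how to hand itself to a visitor.
interface Expr {
    <R> R accept(ExprVisitor<R> v);
}

final class Num implements Expr {
    final int value;
    Num(int value) { this.value = value; }
    public <R> R accept(ExprVisitor<R> v) { return v.visitNum(this); }
}

final class Add implements Expr {
    final Expr left, right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
    public <R> R accept(ExprVisitor<R> v) { return v.visitAdd(this); }
}

final class Mul implements Expr {
    final Expr left, right;
    Mul(Expr left, Expr right) { this.left = left; this.right = right; }
    public <R> R accept(ExprVisitor<R> v) { return v.visitMul(this); }
}

// Operation 1: evaluate the tree to an integer.
final class EvalVisitor implements ExprVisitor<Integer> {
    public Integer visitNum(Num n) { return n.value; }
    public Integer visitAdd(Add a) { return a.left.accept(this) + a.right.accept(this); }
    public Integer visitMul(Mul m) { return m.left.accept(this) * m.right.accept(this); }
}

// Operation 2: render the tree as a fully parenthesized string.
final class ShowVisitor implements ExprVisitor<String> {
    public String visitNum(Num n) { return Integer.toString(n.value); }
    public String visitAdd(Add a) {
        return "(" + a.left.accept(this) + " + " + a.right.accept(this) + ")";
    }
    public String visitMul(Mul m) {
        return "(" + m.left.accept(this) + " * " + m.right.accept(this) + ")";
    }
}

final class VisitorDemo {
    public static void main(String[] args) {
        Expr e = new Add(new Num(2), new Mul(new Num(3), new Num(4)));
        System.out.println(e.accept(new ShowVisitor())); // (2 + (3 * 4))
        System.out.println(e.accept(new EvalVisitor())); // 14
    }
}
```

Adding a third operation (say, counting nodes) means writing one more visitor class; Num, Add, and Mul stay untouched. The trade-off is that adding a new kind of node requires a new method in every visitor.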