CS201 - Spring 2014 - Class 25
binary search tree height
- most methods on a binary search tree are bounded by its height
- what is the worst case height?
- O(n): the twig (every node has at most one child)
- when does this happen?
- insert elements in sorted or reverse sorted order
- what is the best case height?
- O(log_2 n)
- when it's a complete tree
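To make the two extremes concrete, here is a minimal sketch (not code from the class). The class name HeightDemoBST and the helper insertBalanced are made up for illustration, and height is counted in edges. It inserts the same 15 values once in sorted order and once median-first, then compares the heights.

```java
// Minimal unbalanced BST of ints, used only to measure height.
// HeightDemoBST and insertBalanced are hypothetical names for this sketch.
class HeightDemoBST {
    private static class Node {
        int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    private Node root;

    // Standard iterative BST insert (duplicates go to the right).
    void insert(int value) {
        Node fresh = new Node(value);
        if (root == null) { root = fresh; return; }
        Node cur = root;
        while (true) {
            if (value < cur.value) {
                if (cur.left == null) { cur.left = fresh; return; }
                cur = cur.left;
            } else {
                if (cur.right == null) { cur.right = fresh; return; }
                cur = cur.right;
            }
        }
    }

    // Height counted in edges: empty tree is -1, a single node is 0.
    int height() { return height(root); }

    private static int height(Node n) {
        if (n == null) return -1;
        return 1 + Math.max(height(n.left), height(n.right));
    }

    // Inserts lo..hi median-first, which yields a complete tree
    // when the number of values is 2^k - 1.
    static void insertBalanced(HeightDemoBST tree, int lo, int hi) {
        if (lo > hi) return;
        int mid = (lo + hi) / 2;
        tree.insert(mid);
        insertBalanced(tree, lo, mid - 1);
        insertBalanced(tree, mid + 1, hi);
    }
}
```

With 15 values, sorted insertion gives height 14 (a twig), while the median-first order gives height 3.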
- Randomized BST: the expected height of a randomly built binary search tree (i.e. a tree where the values are inserted in random order) is O(log n)
- this is only useful if we know beforehand all of the data we'll be inserting
- does this give you an idea for a sorting algorithm?
- randomly insert the data into a binary search tree
- in-order traversal of the tree
- running time
- best-case: O(n log n)
- worst-case: O(n^2) - we could still get unlucky
- average-case: O(n log n)
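The sorting idea above can be sketched as follows. This is a minimal illustration, not the class's code: RandomizedTreeSort and its helpers are hypothetical names, and the recursive insert assumes inputs small enough that an unlucky, twig-shaped tree does not overflow the call stack.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Sketch of "shuffle, insert into a BST, then read it back in order".
class RandomizedTreeSort {
    private static class Node<E> {
        E value;
        Node<E> left, right;
        Node(E value) { this.value = value; }
    }

    // Returns a new sorted list; the input list is left untouched.
    static <E extends Comparable<E>> List<E> sort(List<E> data, Random rng) {
        List<E> shuffled = new ArrayList<>(data);
        Collections.shuffle(shuffled, rng);        // random insertion order
        Node<E> root = null;
        for (E item : shuffled) {
            root = insert(root, item);             // O(height) per insert
        }
        List<E> out = new ArrayList<>(data.size());
        inOrder(root, out);                        // O(n) traversal
        return out;
    }

    // Plain (unbalanced) BST insert; duplicates go to the right.
    private static <E extends Comparable<E>> Node<E> insert(Node<E> node, E item) {
        if (node == null) return new Node<>(item);
        if (item.compareTo(node.value) < 0) {
            node.left = insert(node.left, item);
        } else {
            node.right = insert(node.right, item);
        }
        return node;
    }

    // Visits left subtree, then the node, then the right subtree.
    private static <E> void inOrder(Node<E> node, List<E> out) {
        if (node == null) return;
        inOrder(node.left, out);
        out.add(node.value);
        inOrder(node.right, out);
    }
}
```

The result is sorted regardless of the shuffle; the random order only affects the running time, which is why the worst case is still O(n^2).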
- even randomized trees still don't guarantee O(log n) height for the tree
- however, there are approaches that can guarantee this by making sure the tree doesn't become too "unbalanced"
- AVL trees
- red-black trees
- B-trees (used in databases and for "on-disk" trees)
- red-black trees: a binary search tree with additional constraints
- a binary search tree
- each node is also labeled with a color, red or black
- the root is always black (this isn't technically required, but doesn't hurt us and makes our life easier)
- all red nodes have two children that are colored black
- for a given node, the number of black nodes on any path from that node to any leaf is the same
- how does this guarantee us our height?
- what is the shortest possible path from the root to any leaf?
- all black nodes
- what is the longest possible path from the root to any leaf?
- alternating red and black nodes (since a red node has to have two black children)
- what is the biggest difference between the longest and shortest path?
- since all paths must have the same number of black nodes, the longest path can be at most twice as long
- the tree can be imbalanced by at most a factor of 2, which still guarantees us O(log n) height, since 2 is just a constant multiplier
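The color constraints above can be checked mechanically. The sketch below is only an illustration under some assumptions: RedBlackCheck and its Node class are hypothetical names, missing children (null) are treated as black leaves, and the BST ordering itself is not checked.

```java
// Checks the red-black color constraints on a hand-built tree.
// Assumption: a null child counts as a black leaf.
class RedBlackCheck {
    static class Node {
        int key;
        boolean red;
        Node left, right;
        Node(int key, boolean red, Node left, Node right) {
            this.key = key;
            this.red = red;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isRed(Node n) {
        return n != null && n.red;
    }

    // Returns the number of black nodes on every path from n down to a
    // leaf, or -1 if some constraint is violated in this subtree.
    static int blackHeight(Node n) {
        if (n == null) return 0;
        // a red node must have two black children
        if (n.red && (isRed(n.left) || isRed(n.right))) return -1;
        int leftCount = blackHeight(n.left);
        int rightCount = blackHeight(n.right);
        // every path must see the same number of black nodes
        if (leftCount == -1 || rightCount == -1 || leftCount != rightCount) return -1;
        return leftCount + (n.red ? 0 : 1);
    }

    // Root must be black and all subtrees must pass the checks above.
    static boolean isValid(Node root) {
        return !isRed(root) && blackHeight(root) != -1;
    }
}
```

For example, a black root with two red children passes, while a twig of black, red, red fails because a red node has a red child.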
- insertion into a red-black tree
- we insert as normal into the binary tree at a leaf
- we color the node inserted as red
- then we need to fix up the tree to make it maintain the constraints
- like delete for normal BSTs, there are a number of cases, with some more complicated than others
- beyond the scope of this class, but they utilize "rotations" of the tree to alter the structure
- basic idea is to rotate a child up into its parent's position; the old parent becomes a child of that node and adopts the subtree on the side of the rotation
- left rotation: x with left subtree alpha and right subtree y, where y has left subtree beta and right subtree gamma
- becomes: y with right subtree gamma and left subtree x, where x has left subtree alpha and right subtree beta
- right rotation is in the opposite direction
- how might this help us?
- insert: 1, 2, 3 into the tree
- inserting 1 and 2 is fine
- after inserting 3, we have a twig
- if we rotate left, it looks more like a balanced tree
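The rotations described above can be sketched in a few lines. This is a minimal illustration, not a full red-black insert: RotationSketch is a hypothetical name, the nodes carry no colors or parent pointers, and each rotation assumes the child being rotated up exists.

```java
// Left and right rotations on a bare BST node (no colors, no parent links).
class RotationSketch {
    static class Node {
        int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // Left rotation around x: x's right child y moves up, x becomes y's
    // left child, and y's old left subtree (beta) becomes x's right subtree.
    // Assumes x.right is not null. Returns the new subtree root (y).
    static Node rotateLeft(Node x) {
        Node y = x.right;
        x.right = y.left;   // beta moves under x
        y.left = x;
        return y;
    }

    // Right rotation is the mirror image. Assumes y.left is not null.
    static Node rotateRight(Node y) {
        Node x = y.left;
        y.left = x.right;   // beta moves under y
        x.right = y;
        return x;
    }

    // Height in edges: empty tree is -1, a single node is 0.
    static int height(Node n) {
        if (n == null) return -1;
        return 1 + Math.max(height(n.left), height(n.right));
    }
}
```

Applied to the 1, 2, 3 twig, rotateLeft on the root returns 2 as the new root with 1 on the left and 3 on the right, so the height drops from 2 to 1. The caller is responsible for re-attaching the returned node where the old subtree root was.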
- look at demo:
data structures with a purpose
- as I've mentioned before, there is no one single best data structure
- data structures help us speed up certain operations
- what was the purpose of binary search trees?
- speed up searching for items when we have a dynamically changing set
- balanced BSTs have O(log n) search, insert and delete
- what did queues allow us to do efficiently?
- keep track of a sequential ordering of items
- add to the back and remove from the front in constant time
- Queues work well for operations when everything is equal, but this is often not the case
- A priority queue is a queue where order is determined by an associated priority
- items with the smallest priority value exit the queue before items with a larger priority value
- look at PriorityQueue interface in
- very simple interface (like queue)
- we can add elements
- the only way we can remove elements is via the extractMin method, which removes the smallest element from the set
- when/where might priority queues be useful? common in scheduling tasks:
- process scheduling
- there are many processes running on your computer at any given time
- each application you run has one or more processes associated with it
- the operating system has many processes associated with it
- why do we need priorities associated with processes?
- some processes are just more important than others
- enforce fairness (we can adjust priorities of those processes that aren't getting much processor time)
- the "top" command (on Macs and Linux machines) shows you the processes and their priorities (on Windows this information is in the Task Manager: press ctrl+alt+del, select Task Manager and then select the Processes tab)
- shows a variety of information on the machine about the number of processes, cpu usage, memory usage, etc.
- also shows each individual process and the cpu usage and the priority
- typing 'q' exits top
- network traffic scheduling
- some data traveling over the network may have higher priority than other data
- what might be some examples?
- real-time/streamed data has higher priority over things like e-mail, etc.
- certain customers might have higher priority
- P2P protocol traffic (like bittorrent) often has lower priority
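As a small illustration of the scheduling idea, here is a sketch that uses the standard library's java.util.PriorityQueue (not the course's PriorityQueue interface); its poll method plays the role of extractMin. The Task class, the task names and the priority numbers are made up for the example, with a smaller number meaning the task runs sooner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Toy scheduler: tasks leave the queue in order of increasing priority value.
class SchedulerSketch {
    static class Task implements Comparable<Task> {
        final String name;
        final int priority;   // smaller value = more urgent (assumption)

        Task(String name, int priority) {
            this.name = name;
            this.priority = priority;
        }

        @Override
        public int compareTo(Task other) {
            return Integer.compare(priority, other.priority);
        }
    }

    // Returns the task names in the order they would be served.
    static List<String> runOrder(List<Task> tasks) {
        PriorityQueue<Task> queue = new PriorityQueue<>(tasks);
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            order.add(queue.poll().name);   // poll removes the smallest element
        }
        return order;
    }
}
```

The order in which tasks were added does not matter; only their priority values decide who leaves first.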
implementing a priority queue
- what would be possible approaches?
- use an ArrayList (or similar expandable linear structure)
- two options:
1) add at the back of the array
- add: O(1)
- extractMin: O(n)
2) keep in sorted order with the highest-priority (smallest) element at the back
- add: O(n)
- extractMin: O(1)
- look at SimpleArrayListPriorityQueue class in
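The two ArrayList options can be sketched side by side. This is not the SimpleArrayListPriorityQueue class from the course; MinQueueSketch, UnsortedListPQ and SortedListPQ are hypothetical names, and the interface is only a guess at the minimal operations described above (add and extractMin).

```java
import java.util.ArrayList;
import java.util.NoSuchElementException;

// Hypothetical minimal interface for this sketch.
interface MinQueueSketch<E extends Comparable<E>> {
    void add(E item);
    E extractMin();
    boolean isEmpty();
}

// Option 1: add at the back, scan for the minimum on removal.
class UnsortedListPQ<E extends Comparable<E>> implements MinQueueSketch<E> {
    private final ArrayList<E> items = new ArrayList<>();

    public void add(E item) { items.add(item); }        // amortized O(1)

    public E extractMin() {                             // O(n) scan
        if (items.isEmpty()) throw new NoSuchElementException();
        int minIdx = 0;
        for (int i = 1; i < items.size(); i++) {
            if (items.get(i).compareTo(items.get(minIdx)) < 0) minIdx = i;
        }
        E min = items.get(minIdx);
        int last = items.size() - 1;
        items.set(minIdx, items.get(last));  // fill the hole with the last item
        items.remove(last);                  // removing the last slot is O(1)
        return min;
    }

    public boolean isEmpty() { return items.isEmpty(); }
}

// Option 2: keep the list in descending order so the minimum is at the back.
class SortedListPQ<E extends Comparable<E>> implements MinQueueSketch<E> {
    private final ArrayList<E> items = new ArrayList<>();

    public void add(E item) {                           // O(n) search + shift
        int i = items.size();
        // walk from the back past every element smaller than the new item
        while (i > 0 && items.get(i - 1).compareTo(item) < 0) i--;
        items.add(i, item);
    }

    public E extractMin() {                             // O(1)
        if (items.isEmpty()) throw new NoSuchElementException();
        return items.remove(items.size() - 1);
    }

    public boolean isEmpty() { return items.isEmpty(); }
}
```

Either way one of the two operations costs O(n); which version is preferable depends on whether adds or extractMin calls dominate.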
restricting generic types
- If we declare a generic type variable (e.g. <E>) this can be instantiated with ANY class
- There are situations where we need to restrict the types that can be used to instantiate the type variable
- most often when you need to require that the class have certain attributes, e.g.
- implement a particular interface
- extend a particular class
- you can add restrictions to the type variable
- For example: <E extends Comparable<E>>
- defines a type parameter E
- the classes that can be used to instantiate this type parameter must implement "Comparable<E>"
- in the code, we can then assume that anything of type E has the compareTo method!
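A short sketch of what the restriction buys us. BoundedGenericDemo and smallest are hypothetical names; the point is only that compareTo can be called on values of type E because of the "extends Comparable<E>" bound.

```java
import java.util.List;

class BoundedGenericDemo {
    // E is restricted to types that implement Comparable<E>,
    // so calling compareTo on an E compiles.
    static <E extends Comparable<E>> E smallest(List<E> items) {
        if (items.isEmpty()) throw new IllegalArgumentException("empty list");
        E best = items.get(0);
        for (E item : items) {
            if (item.compareTo(best) < 0) best = item;
        }
        return best;
    }
}
```

This works for Integer and String, since both implement Comparable of their own type. Without the bound (plain <E>), the call to compareTo would not compile, and passing a list of a type that does not implement Comparable is rejected at compile time.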