Data structures

From Wikiversity
Educational level: this is a tertiary (university) resource.
Subject classification: this is a mathematics resource.
Completion status: this resource is ~25% complete.
A portrait of Fibonacci, after whom Fibonacci heaps are named. These heaps enable O(1) insert time, amortized O(lg n) deletemin time and amortized O(1) decreasekey time, making them asymptotically faster than the basic binary heap structure discussed in this course.

Data structures help you organize and process your data. There are many different ways of implementing them depending on available resources and whims of the programmer, but here are the general ideas behind them:

Readings

Choosing a data structure

The data structure you choose is usually determined by which operations (such as insertion, lookup, or deletion) you need to perform on the data, how fast they must be, and how often they occur, with compromises sometimes made for hardware or network restrictions.

Simple data structures

Stacks

A stack is a data structure that supports first-in-last-out access to elements, meaning the most recently added element is the first to be removed. Stacks have two main operations, namely Push() and Pop(). Push() adds an element to the top of the stack, while Pop() removes the element at the top of the stack. You can think of it as a stack of plates: you can 'push' additional items onto the stack of plates or 'pop' plates from the top of the stack.

Stacks are commonly implemented with either an array or a linked list.
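As a rough illustration, here is a minimal sketch of a stack built on a singly linked list. The class names `LinkedStack` and `_Node`, and the plate colours, are invented for this example. The head of the list serves as the top of the stack, so both operations only ever touch one node.

```python
class _Node:
    """One link in the chain: a value plus a reference to the node below it."""

    def __init__(self, value, below=None):
        self.value = value
        self.below = below


class LinkedStack:
    """Stack backed by a singly linked list; the head of the list is the top."""

    def __init__(self):
        self._top = None
        self._size = 0

    def push(self, value):
        # The new node points at the old top and becomes the new top.
        self._top = _Node(value, self._top)
        self._size += 1

    def pop(self):
        if self._top is None:
            raise IndexError("pop from empty stack")
        value = self._top.value
        self._top = self._top.below  # unlink the old top
        self._size -= 1
        return value

    def __len__(self):
        return self._size


# Usage: a stack of plates (hypothetical sample data).
plates = LinkedStack()
for plate in ["blue", "green", "red"]:
    plates.push(plate)

# The most recently pushed plate comes off first.
popped = [plates.pop() for _ in range(3)]
```

Because only the top node is read or replaced, both `push` and `pop` take constant time regardless of how many elements the stack holds.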

/*needs pictures*/

Linked lists

Think of a linked list as a series of boxes (called nodes) in a row. Each piece of information (or set of information) is put into one box, along with a pointer to the next box. A doubly linked list also has pointers that go back the other way, to the previous box. Linked lists keep a head pointer, and often a tail pointer, to track where the list begins and ends, and usually at least one more pointer that moves through the structure to mark your place as you look for things.

/*needs pictures*/

For example, suppose you wanted to keep the names, addresses, and birthdays of your friends in a linked list. Each node would have one friend's name, address, and birthday in it, plus a pointer to the next in the list. If you want, the list can be sorted as you add friends to it, based on their name, address, or birthday in whatever way you want. If you know that some friends are more important to you and you don't want to go through the whole list to look for them every time, you can add in another variable for each person that can be used to set a sorting priority.
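The friends example could be sketched roughly as follows. This is one possible layout, not a definitive implementation: the names `FriendNode`, `FriendList`, and `insert_sorted` are invented here, the sample friends are made up, and the list is kept sorted by name only.

```python
class FriendNode:
    """One box in the list: a friend's details plus a pointer to the next box."""

    def __init__(self, name, address, birthday):
        self.name = name
        self.address = address
        self.birthday = birthday
        self.next = None


class FriendList:
    """Singly linked list of friends, kept sorted by name as friends are added."""

    def __init__(self):
        self.head = None

    def insert_sorted(self, name, address, birthday):
        node = FriendNode(name, address, birthday)
        # Empty list, or the new name sorts before the current head.
        if self.head is None or name < self.head.name:
            node.next = self.head
            self.head = node
            return
        # Walk forward until the next node would sort after the new name.
        current = self.head
        while current.next is not None and current.next.name <= name:
            current = current.next
        node.next = current.next
        current.next = node

    def names(self):
        """Collect the names in list order by following the next pointers."""
        result = []
        current = self.head
        while current is not None:
            result.append(current.name)
            current = current.next
        return result


# Usage with made-up sample data.
friends = FriendList()
friends.insert_sorted("Maya", "12 Oak St", "1990-04-02")
friends.insert_sorted("Alex", "7 Elm Ave", "1988-11-23")
friends.insert_sorted("Zoe", "3 Pine Rd", "1992-01-15")
```

Sorting by address, birthday, or a priority field would only change the comparison used while walking the list.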

Queues

A queue is a data structure that provides first-in-first-out access to elements. The two basic operations are Enqueue() and Dequeue(). Enqueue() adds an element to the back of the queue. Dequeue() removes an element from the front of the queue. Just like a line at the supermarket, a queue only supports adding items to the back and removing them from the front. In addition, some implementations allow you to 'peek' at the item in front without removing it.
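A small sketch of these operations, assuming Python's standard `collections.deque` as the underlying storage. The wrapper class `Queue` and its method names are chosen here to mirror the operations described above, and the shopper names are made up.

```python
from collections import deque


class Queue:
    """First-in-first-out queue: items join at the back and leave from the front."""

    def __init__(self):
        self._items = deque()

    def enqueue(self, item):
        self._items.append(item)  # add to the back

    def dequeue(self):
        if not self._items:
            raise IndexError("dequeue from empty queue")
        return self._items.popleft()  # remove from the front

    def peek(self):
        if not self._items:
            raise IndexError("peek at empty queue")
        return self._items[0]  # look at the front without removing it

    def __len__(self):
        return len(self._items)


# Usage: a supermarket line (hypothetical sample data).
line = Queue()
for shopper in ["Ann", "Bob", "Cy"]:
    line.enqueue(shopper)

front = line.peek()     # Ann is still in line after this call
served = line.dequeue() # Ann leaves; Bob is now at the front
```

A `deque` is used rather than a plain list because removing from the front of a list shifts every remaining element, while `popleft` does not.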

/*needs pictures*/

Dictionaries

Hash tables

Trees

You can think of a tree structure as a linked list with more than one outgoing pointer per node. This way, it branches out, and the ends are called leaves instead of tails. The top node is called the root, and the branches, like those of a real tree, don't merge back together. Thus, every node except the root has exactly one incoming pointer (the root has none), and each node has zero or more outgoing pointers, depending on the type of tree, the node's location within it, and the set of data that is shaping the tree.
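To make the picture concrete, here is a minimal sketch of a general tree in which each node stores a list of outgoing pointers. The names `TreeNode`, `add_child`, and `leaves`, and the node labels, are invented for this example.

```python
class TreeNode:
    """A node with a value and zero or more outgoing pointers to children."""

    def __init__(self, value):
        self.value = value
        self.children = []

    def add_child(self, value):
        child = TreeNode(value)
        self.children.append(child)
        return child


def leaves(node):
    """Return the values of all leaves (nodes without children) under node."""
    if not node.children:
        return [node.value]
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found


# Usage: a root with two branches; only the first branch splits again.
root = TreeNode("root")
a = root.add_child("a")
b = root.add_child("b")
a.add_child("a1")
a.add_child("a2")

leaf_values = leaves(root)
```

Each child is created inside `add_child` and attached to exactly one parent, which is what keeps the branches from merging back together.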

Binary Search Trees

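A binary tree is a specific type of tree in which each node has 0, 1, or 2 child nodes. A binary search tree is a binary tree that also keeps its keys in order: every key in a node's left subtree is smaller than the node's key, and every key in its right subtree is larger.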
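The following is a minimal sketch of a binary search tree, assuming the usual ordering rule (smaller keys go left, larger keys go right) and ignoring duplicate keys. The names `BSTNode`, `insert`, and `contains` are chosen for this example, and no balancing is attempted.

```python
class BSTNode:
    """A binary tree node: one key and at most two children."""

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(node, key):
    """Insert key into the subtree rooted at node and return the subtree's root."""
    if node is None:
        return BSTNode(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    # Equal keys are ignored in this sketch.
    return node


def contains(node, key):
    """Follow one path down from node, going left or right at each step."""
    while node is not None:
        if key == node.key:
            return True
        node = node.left if key < node.key else node.right
    return False


# Usage: build a small tree from sample keys.
root = None
for key in [8, 3, 10, 1, 6]:
    root = insert(root, key)

has_six = contains(root, 6)
has_seven = contains(root, 7)
```

A lookup inspects only one node per level, so its cost is proportional to the height of the tree rather than to the number of keys.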

Traversal techniques

There are three main ways to process the data in a binary tree, which differ only in when the current node is processed relative to its subtrees. Recursion is usually the simplest way to perform such a task, where "traverse left" and "traverse right" below are recursive function calls with the left and right children, respectively.

Preorder: process node, traverse left, traverse right

Inorder: traverse left, process node, traverse right

Postorder: traverse left, traverse right, process node

For example, consider the following recursive function to display the elements in a tree:

Routine DisplayElements( Node )
   if Node = null then return
   DisplayElements( Node's left )           //recursive function call with left child
   DisplayNode( Node's value )             //display value at current node
   DisplayElements( Node's right )        //recursive function call with right child
End Routine

This is a simple inorder traversal.
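The three orders can be compared side by side in a short Python sketch. The `Node` class and the `visit` callback parameter are assumptions made for this example; `visit` stands in for the "process node" step.

```python
class Node:
    """A binary tree node with optional left and right children."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def preorder(node, visit):
    if node is None:
        return
    visit(node.value)            # process node
    preorder(node.left, visit)   # traverse left
    preorder(node.right, visit)  # traverse right


def inorder(node, visit):
    if node is None:
        return
    inorder(node.left, visit)    # traverse left
    visit(node.value)            # process node
    inorder(node.right, visit)   # traverse right


def postorder(node, visit):
    if node is None:
        return
    postorder(node.left, visit)   # traverse left
    postorder(node.right, visit)  # traverse right
    visit(node.value)             # process node


# Usage: a three-node tree with 2 at the root, 1 on the left, 3 on the right.
tree = Node(2, Node(1), Node(3))

pre, ino, post = [], [], []
preorder(tree, pre.append)    # pre  == [2, 1, 3]
inorder(tree, ino.append)     # ino  == [1, 2, 3]
postorder(tree, post.append)  # post == [1, 3, 2]
```

On a binary search tree, the inorder traversal visits the keys in ascending order.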

AVL Trees

2-4 Trees

Red-Black Trees

Disk-based data structures

Pattern matching

Data compression

Priority queues

Sorting

Graphs

Breadth-First Search

Depth-First Search

Minimum Spanning Trees

Shortest Paths

Dynamic allocation

Dynamic allocation requests memory at run time, as it is needed, rather than reserving a fixed amount in advance.

Lecture Notes

Data structures lecture notes from University of Maryland, College Park

Assignments

ADUni:

College of William and Mary:

IIT Delhi:

More:

Exams

MIT, College of William and Mary

Note that these exams will cover material outside the scope of this course.

Supplementary Links

See also